08-29-2018 07:33 AM - edited 03-01-2019 05:38 AM
Hi,
Please find the attached diagram for your reference.
I have 100 Cisco 3560X LAN switches in my environment, all of them access switches. Previously the switches were connected through a 6513, which was the root bridge for all the VLANs. I replaced the 6513 with ACI last month. Since then I frequently see the "Excessive STP TCN Flushes" issue, and the connection between the ACI leaf and the firewall is lost because of the switches connected to the leaf (right side of the diagram). I have confirmed that there are no direct connections between the access switches.
During the issue, I checked the switch ports and found them in err-disabled state, so I have to shut/no shut the port-channel to resolve the issue temporarily. I have found the root switch of my STP topology.
Kindly suggest a permanent solution for this. In the port-channel policy, I did not enable BPDU guard or BPDU filter.
08-29-2018 02:02 PM
Hi Darpan,
Firstly I notice that the message is coming from all leaf switches:
Although not marked, I assume your switches are numbered like so:
You also mentioned that you had "found the root of the spanning-tree topology", but you forgot to tell us which switch is the root. For the moment I have labelled one of the Cisco3560xs as STP root. In any case the way to solve the problem is to find out why the TCNs are being generated.
You also didn't mention which version of spanning tree you are using, but since the messages are related to a VLAN, I'll assume you are running Cisco's proprietary PVST or RPVST.
No doubt you understand that whenever a port transitions on a VLAN, the switch sends a TCN back toward the root, which then sets the TC (topology change) flag in its next configuration BPDUs. We need to find where those transitions are occurring.
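One way to narrow down where the transitions come from is to collect `show spanning-tree detail` from each switch and rank the VLANs by topology-change count, noting the port each last change came from. The Python below is only a rough sketch under assumptions: the sample text is made up for illustration, and the regular expressions assume the usual Catalyst IOS wording ("Number of topology changes ... last change occurred ... ago" followed by a "from <port>" line), which may differ on other platforms or releases.

```python
import re

# Illustrative output only; VLANs, counters and port names are made up.
SAMPLE = """\
 VLAN0010 is executing the ieee compatible Spanning Tree protocol
  Number of topology changes 412 last change occurred 00:00:07 ago
          from Port-channel1
 VLAN0020 is executing the ieee compatible Spanning Tree protocol
  Number of topology changes 3 last change occurred 5d02h ago
          from GigabitEthernet0/12
"""

VLAN_RE = re.compile(r"^\s*VLAN(\d+) is executing")
TC_RE = re.compile(r"Number of topology changes (\d+) last change occurred (\S+) ago")
FROM_RE = re.compile(r"^\s*from (\S+)")


def topology_changes(text):
    """Return one dict per VLAN: vlan, count, last_change, from_port."""
    results = []
    vlan = None
    current = None
    for line in text.splitlines():
        m = VLAN_RE.match(line)
        if m:
            vlan = int(m.group(1))
            current = None
            continue
        m = TC_RE.search(line)
        if m and vlan is not None:
            current = {
                "vlan": vlan,
                "count": int(m.group(1)),
                "last_change": m.group(2),
                "from_port": None,
            }
            results.append(current)
            continue
        m = FROM_RE.match(line)
        if m and current is not None and current["from_port"] is None:
            current["from_port"] = m.group(1)
    return results


def noisiest(text):
    """Sort VLANs by topology-change count, highest first."""
    return sorted(topology_changes(text), key=lambda r: r["count"], reverse=True)


if __name__ == "__main__":
    for row in noisiest(SAMPLE):
        print(row["vlan"], row["count"], row["last_change"], row["from_port"])
```

Following the "from" port on the noisiest VLAN hop by hop should lead toward the switch whose port is actually flapping.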
You will also no doubt understand that in ACI, PVST and RPVST STP BPDUs (including TCNs) are flooded within an EPG - assuming that the EPG is a single VLAN. If the EPG has multiple VLANs, then there is every chance that the BPDUs could leak from one VLAN to another, but that is another whole story. Let's not go there yet.
Of course, if you are running IEEE STP/MST, then there is a whole new approach to containing Spanning Tree BPDUs in ACI (See this post for more info)
My first suspicion is that there is a Layer2 loop somewhere in the network that is causing the STP transitions. If you can capture one of the TCN BPDUs that are being sent from a switch TOWARDS the root bridge, then you might glean some information that might help track it down.
Is there any way that there could be leakage between the switches on the Blade Chassis?
Is there any VLAN traffic on the HA link between the firewalls?
I may not have solved your problem, but I hope I have given you some pointers as to where to look next.
I hope this helps
Don't forget to mark answers as correct if it solves your problem. This helps others find the correct answer if they search for the same problem
08-30-2018 06:25 AM - edited 08-30-2018 06:31 AM
Hello Chris,
You also mentioned that you had "found the root of the spanning-tree topology", but you forgot to tell us which switch is the root. For the moment I have labelled one of the Cisco3560xs as STP root. In any case the way to solve the problem is to find out why the TCNs are being generated.
Last night the same issue occurred, and now I see multiple root bridges for the same VLAN on different access switches. Some VLANs have a single root bridge, but some have multiple root bridges.
You also didn't mention which version of spanning tree you are using, but since the messages are related to a VLAN, I'll assume you are running Cisco's proprietary PVST or RPVST.
STP - IEEE Standard PVST.
You will also no doubt understand that in ACI, PVST and RPVST STP BPDUs (including TCNs) are flooded within an EPG - assuming that the EPG is a single VLAN. If the EPG has multiple VLANs, then there is every chance that the BPDUs could leak from one VLAN to another, but that is another whole story. Let's not go there yet.
Of course, if you are running IEEE STP/MST, then there is a whole new approach to containing Spanning Tree BPDUs in ACI
I have configured a one-to-one mapping:
IP Subnet - VLAN - BD - EPG
10.1.1.X - VLAN 10 - BD 10 - EPG10
10.1.2.X - VLAN 20 - BD 20 - EPG20 and so on.
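Since BPDUs are flooded within an EPG, it may be worth double-checking the one-to-one claim against the full inventory rather than the first two rows. The snippet below is a minimal sketch in Python: the `MAPPINGS` rows and the tuple layout are hypothetical stand-ins for whatever export you have, and the function simply flags any EPG that ends up with more than one VLAN.

```python
from collections import defaultdict

# Hypothetical inventory rows: (subnet, vlan, bridge_domain, epg).
MAPPINGS = [
    ("10.1.1.X", 10, "BD 10", "EPG10"),
    ("10.1.2.X", 20, "BD 20", "EPG20"),
]


def epgs_with_multiple_vlans(mappings):
    """Return {epg: [vlans]} for every EPG that carries more than one VLAN."""
    vlans_by_epg = defaultdict(set)
    for _subnet, vlan, _bd, epg in mappings:
        vlans_by_epg[epg].add(vlan)
    return {epg: sorted(v) for epg, v in vlans_by_epg.items() if len(v) > 1}


if __name__ == "__main__":
    print(epgs_with_multiple_vlans(MAPPINGS))
```

An empty result means every EPG in the list maps to exactly one VLAN; anything else is a candidate for BPDUs leaking between VLANs.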
My first suspicion is that there is a Layer2 loop somewhere in the network that is causing the STP transitions. If you can capture one of the TCN BPDUs that are being sent from a switch TOWARDS the root bridge, then you might glean some information that might help track it down.
I have captured the TCN BPDUs and am analyzing them now.
Is there any way that there could be leakage between the switches on the Blade Chassis?
I have confirmed with the HP/Dell vendors that there is no leakage between the switches.
Is there any VLAN traffic on the HA link between the firewalls?
The HA link uses a different VLAN and IP segment. No traffic is allowed between the firewalls other than the HA IPs.
What I can confirm is that all the switches are connected through the leaf switches and the ACI STP policy is the default: BPDU guard and BPDU filter are not enabled.
08-30-2018 02:30 PM
I have seen multiple root bridges for the same VLAN on different access switches. Some VLANs have a single root bridge, but some have multiple root bridges.
Well, it certainly looks like the issue is related to Spanning Tree BPDU distribution. However, we need to make sure that the spanning tree mode is consistent across all switches and the ACI fabric. In particular, the ACI configuration differs between Cisco's PVST and IEEE STP. If you capture any BPDUs and see that they are carrying VLAN tags, you are running Cisco's PVST or RPVST (PVST+ is the default on most Cisco Catalyst switches; Nexus switches default to Rapid PVST+). It is worth checking that the STP configuration is consistent on all switches.
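Besides the VLAN tag, the destination MAC of a captured BPDU also tells the two families apart: Cisco PVST+/Rapid PVST+ BPDUs go to 01:00:0c:cc:cc:cd, while IEEE STP/RSTP/MST BPDUs go to 01:80:c2:00:00:00. A rough Python sketch for sorting addresses copied out of a capture is below; the function names are made up for illustration, and it only looks at the destination address, not the BPDU contents.

```python
IEEE_STP_MAC = "01:80:c2:00:00:00"    # IEEE STP / RSTP / MST bridge group address
CISCO_PVST_MAC = "01:00:0c:cc:cc:cd"  # Cisco PVST+ / Rapid PVST+ (SSTP) address


def normalise_mac(mac):
    """Accept aa:bb.., aa-bb.. or aabb.ccdd.eeff and return aa:bb:cc:dd:ee:ff."""
    digits = "".join(c for c in mac.lower() if c in "0123456789abcdef")
    if len(digits) != 12:
        raise ValueError(f"not a MAC address: {mac!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def classify_bpdu(dst_mac):
    """Classify a frame by destination MAC only (a hint, not a full decode)."""
    mac = normalise_mac(dst_mac)
    if mac == CISCO_PVST_MAC:
        return "Cisco PVST+/Rapid PVST+"
    if mac == IEEE_STP_MAC:
        return "IEEE STP/RSTP/MST"
    return "not a BPDU destination"


if __name__ == "__main__":
    for mac in ("0100.0ccc.cccd", "01:80:C2:00:00:00"):
        print(mac, "->", classify_bpdu(mac))
```

Keep in mind that a Catalyst trunk running PVST+ typically also sends IEEE-addressed BPDUs for VLAN 1, so seeing both addresses on one link does not by itself mean the modes are mixed.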
Looking forward to seeing what the TCN BPDU shows you!
08-31-2018 06:46 AM - edited 08-31-2018 09:05 AM
Hi Chris,
I have checked all the access switches connected to ACI: spanning tree shows "enabled protocol ieee" and the mode is PVST. But when I analyzed the BPDUs received at the switch in Wireshark, they show as PVST+ with the root bridge as the source. I checked the root bridge MAC address and found the switch. I am sure that no PVST+ is running in my topology.
One other thing I would like to mention: earlier the 6513 was my core switch and VTP server. Now there is no VTP server and all switches are in VTP client mode. All the VLANs are spread across the environment. I am now planning to change the VTP mode to transparent and delete unnecessary VLANs on the access switches. PVST has a 128 STP instance limit, and I found "no spanning tree vlan X,Y,Z... instance" on most of the switches, so it is better to delete the unnecessary VLANs on the access switches so that STP runs efficiently.
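To size the cleanup per switch, something like the following Python sketch could help. It takes the 128-instance figure quoted above as an assumption (check the limit for your platform), and the VLAN lists are hypothetical: `configured` would come from the VLAN database and `in_use` from the access ports and trunks that actually need each VLAN.

```python
MAX_STP_INSTANCES = 128  # figure quoted in the post; platform dependent


def stp_instance_overflow(vlans, limit=MAX_STP_INSTANCES):
    """How many VLANs exceed the per-switch STP instance limit."""
    return max(0, len(set(vlans)) - limit)


def vlans_to_prune(configured, in_use):
    """VLANs present on the switch that nothing on it actually uses."""
    return sorted(set(configured) - set(in_use))


if __name__ == "__main__":
    configured = list(range(1, 151))  # hypothetical: 150 VLANs learned via VTP
    in_use = [1, 10, 20]              # hypothetical: VLANs this switch needs
    print("over limit by:", stp_instance_overflow(configured))
    print("candidates to delete:", len(vlans_to_prune(configured, in_use)))
```

Running it against each switch gives a quick list of which boxes are over the limit and which VLANs are safe candidates for removal once VTP is transparent.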
I am also thinking of turning off VTP on all the switches and implementing MST instead of PVST. Kindly suggest steps for a seamless migration to MST.
Awaiting your response and suggestions.
08-31-2018 03:57 PM
Quick reply for now, about to board a 5hr flight.
Migrating to MST = good idea
Migration plan = bit more complex!
01-23-2019 08:05 PM
Hi Darpan,
Just wanted to know, have you found a solution for the TCN flushes?
01-23-2019 10:01 PM - edited 01-23-2019 10:01 PM
Yes, resolved except for 2 VLANs. I have reconfigured the STP-related parameters on all the switches, and we are in the process of migrating from RSTP to MST.