Spanning Tree Problem

smahrous
Level 1

We have 2 core switches (6513) connected to 50 edge switches (3550) by dual links; I attached a Visio diagram for illustration. The whole network works properly every day. We add new VLANs on the edge switches daily to configure ports (we don't use VTP). The 2 cores are the root bridges for the VLANs.

Yesterday, when we created a VLAN on an edge switch (one already defined on the core switches), the whole network suddenly went down. I noticed that when I created the VLAN on the edge switch, it became the root for that VLAN, both uplinks went into the forwarding state, and the network went down. That means either the edge switch stopped receiving the root BPDUs or the core switch stopped sending them.

I tested other VLANs during the night and they caused the same failure.

When I disconnected the uplinks of the edge switch and reconnected them, everything was OK again and the core became the root of that VLAN.

Why does the network go down when I create this VLAN, or any new one, on the core with priority 4096 and then create it on the edge? The edge becomes the root for that VLAN and I don't know why.

I don't know whether there is a GBIC failure in the edge switches or what the problem is.

All the edge switches have this problem.

Maybe there is a hardware failure in the GBIC, I don't know. What do you think?

I have upgraded all the edge switches to a higher software version, but the problem still exists.

Note that the network works fine every day. Why did this suddenly happen?

Any advice?

regards,

18 Replies

timdeadman
Level 1

I have seen a similar thing where the GBIC cable was bent at an alarming angle. Occasionally the GBIC would transmit but could not receive BPDUs, and a spanning tree loop occurred.

I will check with one of our guys who was closer to the action than I was and get him to post his findings, but as a first step, changing GBICs and repatching cables tonight would be a good move.

I want to tell you that the network is running well now. The failure will recur any time one of our staff creates a new VLAN on any of the 50 edge switches.

Accordingly, we are postponing any configuration changes until we find the cause.

So if there were a GBIC or patch cord failure, the network would not be running fine right now!

You may be right if the GBIC failure only appears when we add the new VLAN, which may be the case?

So let me ask: is there any command I can run on the edge switches to check the hardware and the connectivity of my GBICs and fiber cables?

No command that I know of!

As I said, I wasn't directly involved; I have sent a mail to my colleague who was in the firing line. As I remember, they had to OTDR all of the fibre cables and only found it by accident when they saw the GBIC stack cable. Changing the GBIC worked on this occasion but, as with a lot of intermittent faults, you are never REALLY sure what fixed the problem.

This is a very interesting problem, but I would not be looking for GBIC failures. You say that when you create a VLAN at the edge, then it affects that VLAN at the core, and that the edge switch becomes the root. That suggests to me that the priority of the VLAN at the moment of creation is not 32768 as it should be, but something lower than or equal to 4096, 'cos you say that the root priority is 4096.

When you do this experiment, does the real core switch regain its root status on that VLAN after some seconds? If so, then I think we are looking at a bug, although I couldn't find it in the bug database. If not, why is the edge switch the root? What is its priority?

Kevin Dorrell

Luxembourg

No, the core still has the right priorities for the VLAN. I attached 3 text files: one for the configured edge switch and the other 2 for the two cores.

Hope that helps

Thanks

I see that your core switches are running Rapid PVST+, and I wonder if that has anything to do with it. (Getting out of my depth here.) Is the edge switch running rapid Spanning Tree as well?

Also, do you have bpdu-filter set on any of the ports?

Kevin Dorrell

Luxembourg

I don't see Rapid PVST+ when I show the spanning-tree info on the 3550 switches, but as far as I can tell it has worked fine for the previous 9 months.

No, we don't have BPDU filter on any trunk; all trunks pass all VLANs.

I believe that you have a problem with VLAN Trunking!

Switch        MAC Address     Priority  Says root is  Comment
------------  --------------  --------  ------------  -------
SW-Core1      000C-CF47-2080  4168      Itself        Correct
SW-Core2      000C-CF6D-6280  8264      SW-Core1      Correct
SP-Access231  000D-28B4-8000  32840     Itself        !Error!
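
To make the table easier to reason about, here is a small Python sketch of the usual root election rule (lowest priority wins, lowest MAC breaks ties). It assumes the switches use the extended system ID, so the priority shown is a base priority plus the VLAN ID; the values 4168, 8264 and 32840 are consistent with that for VLAN 72. The names BridgeId, split_priority and believed_root are made up for this illustration and are not taken from any Cisco tool.

```python
from typing import List, NamedTuple, Tuple


class BridgeId(NamedTuple):
    name: str      # label only, not part of the comparison
    priority: int  # value as shown by the switch (base priority + VLAN ID, assumed)
    mac: int       # bridge MAC address as an integer, used as the tie-breaker


def parse_mac(text: str) -> int:
    """Turn '000C-CF47-2080' (or dotted/colon forms) into an integer."""
    return int(text.replace("-", "").replace(".", "").replace(":", ""), 16)


def split_priority(priority: int) -> Tuple[int, int]:
    """Split a priority into (base priority, extended system ID / VLAN ID)."""
    return priority & 0xF000, priority & 0x0FFF


def believed_root(own: BridgeId, heard: List[BridgeId]) -> BridgeId:
    """Root as seen by one switch: the best of itself and every bridge it hears BPDUs from."""
    return min([own, *heard], key=lambda b: (b.priority, b.mac))


core1 = BridgeId("SW-Core1", 4168, parse_mac("000C-CF47-2080"))
core2 = BridgeId("SW-Core2", 8264, parse_mac("000C-CF6D-6280"))
edge = BridgeId("SP-Access231", 32840, parse_mac("000D-28B4-8000"))

print(split_priority(edge.priority))            # base priority and VLAN ID
print(believed_root(edge, [core1, core2]).name)  # edge hears both cores
print(believed_root(edge, []).name)              # edge hears no BPDUs at all
```

With BPDUs from both cores the edge picks SW-Core1; with none it picks itself, which matches the !Error! row. So under these assumptions the table points to the edge switch not receiving BPDUs for that VLAN rather than to a wrong priority.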

Are you sure that you are trunking *all* VLANs between *all* switches?

Why don't we see any ports in "Blocking" state preventing a layer-2 loop?

---Richard

Yes, I am trunking all VLANs between the switches; all VLANs are allowed to pass through the trunks.

Is that a problem? But then how is it working now?

The absence of a blocking port only happens during the failure; in normal operation there are blocking ports, as you can see in the attached files.

I think you ran out of STP instances.

Your edge switch, a 3550, has a maximum of 128 STP instances, and with 55 VLANs and 2 trunk ports, you are already using 110 (55 x 2) unless you prune VLANs off the trunk. In addition, each 3550 access switch probably has another 24 ports running STP, giving a total of 134 STP instances.
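
The arithmetic can be written out as a short Python sketch. It only reproduces the counting used in this post (one instance per VLAN per trunk port, plus one per access port); whether the 3550 really counts instances this way is an assumption of the post, and the function name is made up for illustration.

```python
MAX_STP_INSTANCES = 128  # limit quoted in this post for the 3550


def estimated_instances(vlans: int, trunk_ports: int, access_ports: int) -> int:
    """Counting method from the post: VLANs x trunk ports, plus one per access port."""
    return vlans * trunk_ports + access_ports


total = estimated_instances(vlans=55, trunk_ports=2, access_ports=24)
print(total, total > MAX_STP_INSTANCES)  # 134 True
```

Under this counting, pruning the trunks down to the VLANs the edge switch actually needs is what brings the total back under the limit.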

Check my other posting about pruning VLANs off the trunks.

1) The default spanning tree mode is PVST+. You may have a network running Rapid-PVST+ on the core and PVST+ on the edge. Check for a spanning-tree mode command in the config.

This link describes the differences between rapid STP and STP: http://www.cisco.com/en/US/tech/tk389/tk621/technologies_white_paper09186a0080094cfa.shtml

2) Before a switch gets a BPDU from the root bridge, it thinks that it is itself the root. You could have a situation where it never receives any BPDUs on VLAN72 from the core. If both core switches and the edge switch think they are the root for a VLAN, none of them will block any ports, hence an STP loop.

3) I suggest this procedure in a service window:

Connect to your edge switch by the console cable (so you are sure you don't lose connectivity) and run debug spanning events.

Do a shutdown on both ports to the core, create your VLAN, and do a no shut on both links to the core. Observe what happens for all VLANs. There are also several other debug commands available.

4) Check the trunking of VLANs by using show int switchport. Are all VLANs allowed? Check also the link between the two core switches.

Verify that the native VLAN is the same on both sides.

regards,

harald

Post a "show span root" and "show span summ" for the edge and core IOS-based devices. The outputs look like IOS for the edge and CatOS for the core, so this may not be a valid CatOS command.

The 6513 are running Rapid-PVST and the 3550 is PVST. Enable Rapid-PVST on the 3550 and see if that helps. You will need 12.1(13)EA1c or greater.

All VLANs on the 3550 see the same root, 000c.cf6d.6280, via Gi0/2. You could alternate the root priority between the cores and achieve load balancing on the uplinks. I prefer 8192 for root and 16384 for secondary.
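
As a rough illustration of alternating the root, the Python sketch below builds a per-VLAN priority plan using the 8192/16384 values mentioned above. Splitting odd VLANs to one core and even VLANs to the other is only an assumed scheme for the example, and plan_priorities is a made-up helper; it does not generate or push any switch configuration.

```python
ROOT_PRIORITY = 8192        # preferred root value from the post
SECONDARY_PRIORITY = 16384  # preferred secondary value from the post


def plan_priorities(vlans, cores=("SW-Core1", "SW-Core2")):
    """Return {vlan: {core_name: priority}} with the root alternating between two cores."""
    plan = {}
    for vlan in vlans:
        # Assumed split: odd VLANs root on the first core, even VLANs on the second.
        root_index = 0 if vlan % 2 else 1
        plan[vlan] = {
            core: ROOT_PRIORITY if i == root_index else SECONDARY_PRIORITY
            for i, core in enumerate(cores)
        }
    return plan


for vlan, priorities in plan_priorities([71, 72]).items():
    print(vlan, priorities)
```

Each VLAN still has one root and one backup, but the uplink that forwards differs between the odd and even VLANs, so both links from an edge switch carry traffic.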
