03-07-2012 10:42 AM - edited 03-07-2019 05:24 AM
Hi there,
a link failure today made us realize that we may have a problem with IEEE STP and RSTP interoperation. See our (partial) network below:
After the root port on WS-C3750V2 was unplugged we lost the connection to the switch, although the link to the old WS-C3548 (gi3/0/1) was still up and running. Here is what I saw when I connected to the console port of the WS-C3750V2:
C3750#show int gi3/0/1
GigabitEthernet3/0/1 is up, line protocol is up (connected)
Hardware is Gigabit Ethernet, address is 8cb6.4f46.1101 (bia 8cb6.4f46.1101)
Description: [...]
MTU 1500 bytes, BW 1000000 Kbit, DLY 10 usec,
reliability 255/255, txload 1/255, rxload 1/255
Encapsulation ARPA, loopback not set
Keepalive not set
Full-duplex, 1000Mb/s, link type is auto, media type is 1000BaseSX SFP
input flow-control is off, output flow-control is unsupported
ARP type: ARPA, ARP Timeout 04:00:00
Last input 00:00:00, output 00:00:03, output hang never
Last clearing of "show interface" counters 24w5d
Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
Queueing strategy: fifo
Output queue: 0/0 (size/max)
5 minute input rate 9000 bits/sec, 11 packets/sec
5 minute output rate 0 bits/sec, 0 packets/sec
153099993 packets input, 15094492557 bytes, 0 no buffer
Received 76420545 broadcasts (63805332 multicasts)
0 runts, 0 giants, 0 throttles
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
0 watchdog, 63805375 multicast, 0 pause input
C3750#show int gi3/0/1 trunk
Port Mode Encapsulation Status Native vlan
Gi3/0/1 on 802.1q trunking 1
Port Vlans allowed on trunk
Gi3/0/1 10,20
Port Vlans allowed and active in management domain
Gi3/0/1 10,20
Port Vlans in spanning tree forwarding state and not pruned
Gi3/0/1 none
C3750#show spanning-tree vlan 10
VLAN0010
Spanning tree enabled protocol rstp
Root ID Priority 4106
Address 0011.5d8e.7000
Cost 30004
Port 109 (GigabitEthernet3/0/1)
Hello Time 2 sec Max Age 20 sec Forward Delay 15 sec
Bridge ID Priority 32778 (priority 32768 sys-id-ext 10)
Address 8cb6.4f70.be80
Hello Time 2 sec Max Age 20 sec Forward Delay 15 sec
Aging Time 300 sec
Interface Role Sts Cost Prio.Nbr Type
------------------- ---- --- --------- -------- --------------------------------
Fa2/0/8 Desg FWD 200000 128.64 P2p Edge
Fa2/0/9 Desg FWD 200000 128.65 P2p Edge
Gi3/0/1 Root BLK 20000 128.109 P2p Peer(STP) <-- ???
Fa4/0/11 Desg FWD 100 128.175 Shr Edge Peer(STP)
Fa4/0/12 Desg FWD 100 128.176 Shr Edge Peer(STP)
c3750#show run int gi3/0/1
Building configuration...
Current configuration : 275 bytes
!
interface GigabitEthernet3/0/1
description Host: ...
switchport trunk encapsulation dot1q
switchport trunk allowed vlan 10,20
switchport mode trunk
udld port aggressive
spanning-tree guard loop
end
c3750#show spanning-tree vlan 10 root
Root Hello Max Fwd
Vlan Root ID Cost Time Age Dly Root Port
---------------- -------------------- --------- ----- --- --- ------------
VLAN0010 4106 0011.5d8e.7000 30004 2 20 15 Gi3/0/1
c3750#show spanning-tree vlan 10 inconsistentports
Name Interface Inconsistency
-------------------- ------------------------ ------------------
Number of inconsistent ports (segments) in vlan 10 : 0
c3750#show spanning-tree vlan 10 detail
VLAN0010 is executing the rstp compatible Spanning Tree protocol
Bridge Identifier has priority 32768, sysid 10, address 8cb6.4f70.be80
Configured hello time 2, max age 20, forward delay 15, transmit hold-count 6
Current root has priority 4106, address 0011.5d8e.7000
Root port is 109 (GigabitEthernet3/0/1), cost of root path is 30004
Topology change flag not set, detected flag not set
Number of topology changes 1303 last change occurred 00:12:57 ago
from StackPort1
Times: hold 1, topology change 35, notification 2
hello 2, max age 20, forward delay 15
Timers: hello 0, topology change 0, notification 0, aging 300
[...]
Port 109 (GigabitEthernet3/0/1) of VLAN0010 is root blocking <-- again root blocking ???
Port path cost 20000, Port priority 128, Port Identifier 128.109.
Designated root has priority 4106, address 0011.5d8e.7000
Designated bridge has priority 32768, address 0006.536c.0f01
Designated port id is 128.75, designated path cost 10004
Timers: message age 15, forward delay 0, hold 0
Number of transitions to forwarding state: 1
Link type is point-to-point by default, Peer is STP
Loop guard is enabled on the port
BPDU: sent 14, received 7427448
This was not just a slow STP-convergence, the switch remained in root blocking mode forever, although there was only 1 physically remaining connection to the root bridge.
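The symptom above (a port holding the Root role while staying in BLK) can be spotted mechanically. What follows is only a minimal sketch in Python, assuming the interface rows of "show spanning-tree vlan N" have been captured as text; the names ROW, parse_ports and stuck_root_ports are made up for illustration and are not part of any Cisco tooling. A root port may legitimately show BLK for a moment during convergence, so a single snapshot proves nothing; the check would have to be repeated over time.

```python
import re

# Matches one interface row of "show spanning-tree vlan N",
# e.g. "Gi3/0/1 Root BLK 20000 128.109 P2p Peer(STP)".
ROW = re.compile(
    r"^(?P<port>\S+)\s+(?P<role>Root|Desg|Altn|Back)\s+"
    r"(?P<state>FWD|BLK|LRN|LIS)\s+(?P<cost>\d+)\s+"
    r"(?P<prio>\d+\.\d+)\s+(?P<type>.+)$"
)

def parse_ports(output):
    """Return one dict per interface row found in the CLI output."""
    rows = []
    for line in output.splitlines():
        match = ROW.match(line.strip())
        if match:
            rows.append(match.groupdict())
    return rows

def stuck_root_ports(rows):
    """Names of ports that hold the Root role but are not forwarding."""
    return [r["port"] for r in rows if r["role"] == "Root" and r["state"] != "FWD"]

# Rows copied from the output posted above.
SAMPLE = """\
Fa2/0/8 Desg FWD 200000 128.64 P2p Edge
Fa2/0/9 Desg FWD 200000 128.65 P2p Edge
Gi3/0/1 Root BLK 20000 128.109 P2p Peer(STP)
Fa4/0/11 Desg FWD 100 128.175 Shr Edge Peer(STP)
Fa4/0/12 Desg FWD 100 128.176 Shr Edge Peer(STP)
"""

print(stuck_root_ports(parse_ports(SAMPLE)))
```

With the rows from this post, the only port flagged is Gi3/0/1.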
If you have any ideas what is going on, I'd be happy to hear them.
best regards
Pille
03-07-2012 02:20 PM
Pille,
This is most interesting... I wonder: can you try adding the VLAN 1 into the list of allowed VLANs on the link between the 3750 and 3548 and see if that changes the behavior?
Best regards,
Peter
03-07-2012 03:09 PM
Hello Peter,
we had the same idea and added vlan 1 on both trunks to and from the 3548, however the situation remained the same, at least for vlan 10 and 20. For vlan 1 the 3548 has become root bridge and gi3/0/1 on 3750 is root port and forwarding.
Looking forward to other suggestions
Best regards,
Pille
03-07-2012 10:45 PM
Hello Pille,
At this point, I would personally suggest running debugs to see why the RPVST+ on your 3750 decides to keep the Gi3/0/1 as root blocking.
Do you believe you are able to provide the output of the following debugs after the primary link from 3750 to 6500 is disconnected?
debug spanning-tree events
debug spanning-tree pvst+
I can also try to replicate your situation in our lab. What exact STP version does the 3548 switch run - is it truly a single instance IEEE STP with no per-VLAN behavior?
In addition, however, I am very confused by an output you've provided earlier:
C3750#show spanning-tree vlan 10
VLAN0010
Spanning tree enabled protocol rstp
Root ID Priority 4106
Address 0011.5d8e.7000
Cost 30004
Port 109 (GigabitEthernet3/0/1)
Hello Time 2 sec Max Age 20 sec Forward Delay 15 sec
Bridge ID Priority 32778 (priority 32768 sys-id-ext 10)
Address 8cb6.4f70.be80
Hello Time 2 sec Max Age 20 sec Forward Delay 15 sec
Aging Time 300 sec
Interface Role Sts Cost Prio.Nbr Type
------------------- ---- --- --------- -------- --------------------------------
Fa2/0/8 Desg FWD 200000 128.64 P2p Edge
Fa2/0/9 Desg FWD 200000 128.65 P2p Edge
Gi3/0/1 Root BLK 20000 128.109 P2p Peer(STP) <-- ???
Fa4/0/11 Desg FWD 100 128.175 Shr Edge Peer(STP)
Fa4/0/12 Desg FWD 100 128.176 Shr Edge Peer(STP)
What are the Fa4/0/11 and Fa4/0/12 ports connected to? I am quite surprised to see these ports identified both as Edge ports and as being connected to a legacy STP peer. That is a contradiction in itself: either a port works as an Edge (i.e. PortFast-enabled) port, which means that it never receives any BPDUs whatsoever and therefore cannot know the STP type of the peer, or it receives BPDUs and then must immediately drop its PortFast operational status and become a non-edge port. It should not be possible for a port to be both an Edge port and know the peer's STP type, because there should be no peer at all, and if there is one, the port should not have retained the Edge status.
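To make the contradiction concrete, here is a small Python sketch that flags rows whose Type column carries both "Edge" and a "Peer(...)" marker. It is an illustration only: edge_with_peer is a hypothetical helper, and it assumes the six-column row layout seen in the output above.

```python
def edge_with_peer(lines):
    """Return ports whose Type column shows both 'Edge' and a 'Peer(...)' flag."""
    flagged = []
    for line in lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # not an interface row
        port, type_flags = fields[0], fields[5:]
        has_edge = "Edge" in type_flags
        has_peer = any(flag.startswith("Peer(") for flag in type_flags)
        if has_edge and has_peer:
            flagged.append(port)
    return flagged

# Rows copied from the quoted "show spanning-tree vlan 10" output.
ROWS = [
    "Fa2/0/8 Desg FWD 200000 128.64 P2p Edge",
    "Fa2/0/9 Desg FWD 200000 128.65 P2p Edge",
    "Gi3/0/1 Root BLK 20000 128.109 P2p Peer(STP)",
    "Fa4/0/11 Desg FWD 100 128.175 Shr Edge Peer(STP)",
    "Fa4/0/12 Desg FWD 100 128.176 Shr Edge Peer(STP)",
]

print(edge_with_peer(ROWS))
```

Only Fa4/0/11 and Fa4/0/12 are flagged; Gi3/0/1 shows a peer type but is not marked Edge, which is the expected combination.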
What IOS version are you currently running on the 3750? We should take into consideration that this can be a bug.
Best regards,
Peter
03-08-2012 02:05 AM
Hi Peter,
thanks for your continued interest.
Do you believe you are able to provide the output of the following debugs after the primary link from 3750 to 6500 is disconnected?
debug spanning-tree events
debug spanning-tree pvst+
Here's the debug output:
Mar 8 10:31:07.064 CET: RSTP(10): updt roles, root port Gi1/0/1 going down
Mar 8 10:31:07.064 CET: RSTP(10): we become the root bridge
Mar 8 10:31:07.064 CET: RSTP(20): updt roles, root port Gi1/0/1 going down
Mar 8 10:31:07.064 CET: RSTP(20): we become the root bridge
Mar 8 10:31:07.122 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet1/0/1, changed state to down
Mar 8 10:31:07.131 CET: RSTP(10): updt roles, received superior bpdu on St1
Mar 8 10:31:07.131 CET: RSTP(10): St1 is now root port
Mar 8 10:31:07.131 CET: RSTP(10): St1 received a tc ack
Mar 8 10:31:07.139 CET: RSTP(20): updt roles, received superior bpdu on St1
Mar 8 10:31:07.139 CET: RSTP(20): St1 is now root port
Mar 8 10:31:07.139 CET: RSTP(20): St1 received a tc ack
Mar 8 10:31:07.785 CET: RSTP(10): St1 received a tc ack
Mar 8 10:31:07.785 CET: RSTP(20): St1 received a tc ack
Mar 8 10:31:09.077 CET: %LINK-3-UPDOWN: Interface GigabitEthernet1/0/1, changed state to down
Mar 8 10:31:11.820 CET: RSTP(10): St1 received a tc ack
Mar 8 10:31:11.820 CET: RSTP(20): St1 received a tc ack
Mar 8 10:34:40.595 CET: %LINK-3-UPDOWN: Interface GigabitEthernet1/0/1, changed state to up
Mar 8 10:34:41.576 CET: STP: PVST vlan 64 port Gi1/0/1 created, ext id 3402898
Mar 8 10:34:41.576 CET: RSTP(10): initializing port Gi1/0/1
Mar 8 10:34:41.576 CET: RSTP(10): Gi1/0/1 is now designated
Mar 8 10:34:41.585 CET: STP: PVST vlan 911 port Gi1/0/1 created, ext id 3402898
Mar 8 10:34:41.585 CET: RSTP(20): initializing port Gi1/0/1
Mar 8 10:34:41.585 CET: RSTP(20): Gi1/0/1 is now designated
Mar 8 10:34:41.585 CET: RSTP(10): transmitting a proposal on Gi1/0/1
Mar 8 10:34:41.593 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet1/0/1, changed state to up
Mar 8 10:34:41.593 CET: RSTP(20): transmitting a proposal on Gi1/0/1
Mar 8 10:34:42.868 CET: RSTP(20): transmitting a proposal on Gi1/0/1
Mar 8 10:34:42.868 CET: RSTP(10): transmitting a proposal on Gi1/0/1
Mar 8 10:34:44.428 CET: RSTP(20): updt roles, received superior bpdu on Gi1/0/1
Mar 8 10:34:44.428 CET: RSTP(20): Gi1/0/1 is now root port
Mar 8 10:34:44.428 CET: RSTP(20): St1 blocked by re-root
Mar 8 10:34:44.437 CET: RSTP(20): St1 is now designated
Mar 8 10:34:44.873 CET: RSTP(10): updt roles, received superior bpdu on Gi1/0/1
Mar 8 10:34:44.873 CET: RSTP(10): Gi1/0/1 is now root port
Mar 8 10:34:44.873 CET: RSTP(10): St1 blocked by re-root
Mar 8 10:34:44.873 CET: RSTP(10): St1 is now designated
Mar 8 10:34:46.442 CET: RSTP(20): Gi1/0/1 agree (allSynced)
Mar 8 10:34:46.870 CET: RSTP(10): Gi1/0/1 agree (allSynced)
Mar 8 10:34:48.463 CET: RSTP(20): Gi1/0/1 agree (allSynced)
Mar 8 10:34:48.874 CET: RSTP(10): Gi1/0/1 agree (allSynced)
Mar 8 10:34:50.502 CET: RSTP(20): Gi1/0/1 agree (allSynced)
Mar 8 10:34:50.904 CET: RSTP(10): Gi1/0/1 agree (allSynced)
Mar 8 10:34:52.565 CET: RSTP(20): Gi1/0/1 agree (allSynced)
Mar 8 10:34:52.901 CET: RSTP(10): Gi1/0/1 agree (allSynced)
Mar 8 10:34:54.579 CET: RSTP(20): Gi1/0/1 agree (allSynced)
Mar 8 10:34:54.897 CET: RSTP(10): Gi1/0/1 agree (allSynced)
I can also try to replicate your situation in our lab. What exact STP version does the 3548 switch run - is it truly a single instance IEEE STP with no per-VLAN behavior?
Hard to believe these antiques are still in service, isn't it? It says: "Spanning tree 1 is executing the IEEE compatible Spanning Tree protocol", so my understanding is it is truly the single instance IEEE STP.
If you are willing to invest that much time and replicate the scenario in your lab, I'd be more than happy to hear your results.
What are the Fa4/0/11 and Fa4/0/12 ports connected to? I am quite surprised to see these ports identified both as Edge ports and as being connected to a legacy STP peer. That is a contradiction in itself: either a port works as an Edge (i.e. PortFast-enabled) port, which means that it never receives any BPDUs whatsoever and therefore cannot know the STP type of the peer, or it receives BPDUs and then must immediately drop its PortFast operational status and become a non-edge port. It should not be possible for a port to be both an Edge port and know the peer's STP type, because there should be no peer at all, and if there is one, the port should not have retained the Edge status.
I was confused for a moment as well. These ports are connected to L3 interfaces of old C2600 routers that we abuse as terminal servers. Since there is no STP instance existing on the C2600s there couldn't be any received BPDUs on the C3750. I suppose it is CDP that tells the C3750 that if STP comes alive it will be IEEE.
What IOS version are you currently running on the 3750? We should take into consideration that this can be a bug.
That is our working hypothesis right now. Current IOS is 12.2(50)SE1 IPBase K9.
I hope I answered all your questions. If you have any additional suggestions on what to do, let me know. I have run out of ideas.
Best regards,
Pille
03-08-2012 02:47 AM
Hello Pille,
The debugs are somewhat confusing. They show that at 10:31, you've disconnected your Gi1/0/1 port and St1 (the stacking port) became the new root port. Strange. There is absolutely no info about the Gi3/0/1 port behavior. Were you connected to the switch 3 in the stack? Would it be possible to conduct these debugs from it? Three minutes later, you've connected the Gi1/0/1 port back and the usual proposal/agreement procedure took place, but again no mention of the Gi3/0/1.
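The observation that Gi3/0/1 never shows up in the debugs can be checked mechanically. This is a rough Python sketch, assuming the debug output has been saved as text; the regular expressions and the ports_by_vlan helper are my own illustration and cover only the interface name forms seen in this thread.

```python
import re
from collections import defaultdict

# Interface names as they appear in the debug lines: Gi1/0/1, Fa4/0/11, St1.
PORT = re.compile(r"\b(?:Gi|Fa|Te)\d+/\d+/\d+\b|\bSt\d+\b")
VLAN = re.compile(r"RSTP\((\d+)\)")

def ports_by_vlan(log):
    """Map each VLAN number to the set of ports mentioned in its RSTP debug lines."""
    seen = defaultdict(set)
    for line in log.splitlines():
        vlan = VLAN.search(line)
        if not vlan:
            continue  # not an RSTP debug line
        for port in PORT.findall(line):
            seen[int(vlan.group(1))].add(port)
    return dict(seen)

# A few lines copied from the posted debug output.
LOG = """\
Mar 8 10:31:07.064 CET: RSTP(10): updt roles, root port Gi1/0/1 going down
Mar 8 10:31:07.064 CET: RSTP(10): we become the root bridge
Mar 8 10:31:07.131 CET: RSTP(10): updt roles, received superior bpdu on St1
Mar 8 10:31:07.131 CET: RSTP(10): St1 is now root port
Mar 8 10:34:41.576 CET: RSTP(10): Gi1/0/1 is now designated
Mar 8 10:34:44.428 CET: RSTP(20): Gi1/0/1 is now root port
Mar 8 10:34:44.428 CET: RSTP(20): St1 blocked by re-root
"""

mentioned = ports_by_vlan(LOG)
for vlan, ports in sorted(mentioned.items()):
    print(vlan, sorted(ports), "Gi3/0/1 seen:", "Gi3/0/1" in ports)
```

For these lines, both VLAN 10 and VLAN 20 mention only Gi1/0/1 and St1.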
Hard to believe these antiques are still in service, isn't it? It says: "Spanning tree 1 is executing the IEEE compatible Spanning Tree protocol", so my understanding is it is truly the single instance IEEE STP.
As a matter of fact, I've tried to perform tests in our lab - and, as a part of them, I have determined that the 3548 switch must be running PVST+ and not the plain IEEE 802.1D legacy STP. The reason is that if the 3548 ran the legacy STP, it would emit all its BPDUs untagged, essentially placing them into VLAN1. As the VLAN1 was disallowed on the trunks, the 3548 would be the only switch in the VLAN 1 and it would therefore become the root for VLAN1, having both its ports Designated Forwarding. For the RPVST+ BPDUs, it would be completely transparent. That, however, contradicts the information produced in the show span int gi3/0/1 you've posted:
Port 109 (GigabitEthernet3/0/1) of VLAN0010 is root blocking
Port path cost 20000, Port priority 128, Port Identifier 128.109.
Designated root has priority 4106, address 0011.5d8e.7000
Designated bridge has priority 32768, address 0006.536c.0f01
Note that the designated bridge on this port is different from the root bridge - and they would be identical if the RPVST+ BPDUs were transparently carried from your topmost 6500 through the 3548 down to 3750. Hence, the 3548 must be running PVST+ and I assume that the MAC address 0011.5d8e.7000 is its base MAC address - can you verify that?
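The comparison can be written down as a tiny Python sketch using only the numbers from the posted "show spanning-tree vlan 10 detail" output. The dictionary keys and function names are made up for illustration. The sketch also checks that the root path cost reported by the 3750 equals the designated path cost received on the port plus the port's own path cost; a root bridge advertises a path cost of 0, so a nonzero designated path cost is a second hint that the BPDUs on this port come from a non-root bridge.

```python
# Values copied from the posted output for Port 109 (GigabitEthernet3/0/1).
PORT_109 = {
    "port_path_cost": 20000,
    "designated_root": "0011.5d8e.7000",
    "designated_bridge": "0006.536c.0f01",
    "designated_path_cost": 10004,
}
ROOT_PATH_COST = 30004  # "cost of root path" reported for VLAN0010

def neighbor_is_root(port):
    """True if the bridge sending BPDUs on this port is the root itself."""
    return port["designated_root"] == port["designated_bridge"]

def cost_is_consistent(port, reported_root_cost):
    """Received designated cost plus local port cost should equal the root path cost."""
    return port["designated_path_cost"] + port["port_path_cost"] == reported_root_cost

print("neighbor is root:", neighbor_is_root(PORT_109))
print("cost adds up:", cost_is_consistent(PORT_109, ROOT_PATH_COST))
```

Here the neighbor is not the root, and 10004 + 20000 = 30004 matches the reported root path cost.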
Regarding my lab tests - I have emulated a plain 802.1D switch by using a 1841 configured for IRB bridging and running the legacy 802.1D STP on it, being completely transparent to PVST+ or RPVST+ BPDUs. The results were as explained earlier: my "3750" immediately chose the right leg via the "3548" (in this case, the 1841 router) as its root port for VLAN 10 and 20 because the cost in VLAN10 and VLAN20 was smaller (1 x cost of FastEthernet interface vs. 2 x cost of FastEthernet interface if we went via the left leg). Allowing VLAN1 on all trunks just made the topmost "6500" the root switch for VLAN1 but otherwise, no ill behavior was observed. So I replaced the 1841 with a plain 2960 switch configured for PVST+ - and as I expected, the STP behaved normally. I wasn't able to reproduce the behavior you were experiencing.
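The cost comparison in the lab can be sketched in Python as follows. This is an illustration under stated assumptions, not the lab configuration itself: the per-link cost of 19 is the short-mode default for a 100 Mb/s link (the lab's actual path cost method is not stated), the leg names are hypothetical labels, and the tie-breakers real STP applies on equal cost (bridge ID, port ID) are left out.

```python
FAST_ETHERNET_COST = 19  # assumption: short-mode default cost of a 100 Mb/s link

def root_path_costs(cost_adding_links):
    """cost_adding_links: leg name -> number of links that add cost on the way to the root."""
    return {leg: n * FAST_ETHERNET_COST for leg, n in cost_adding_links.items()}

def pick_root_port(cost_adding_links):
    """Choose the leg with the lowest cumulative root path cost."""
    costs = root_path_costs(cost_adding_links)
    return min(costs, key=costs.get)

# A bridge that passes the per-VLAN BPDUs transparently adds no cost of its own,
# so the right leg counts one link; the left leg crosses two cost-adding links.
LAB = {"right_leg": 1, "left_leg": 2}

print(root_path_costs(LAB))
print(pick_root_port(LAB))
```

With these inputs the right leg wins at a cost of 19 against 38.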
Since there is no STP instance existing on the C2600s there couldn't be any received BPDUs on the C3750. I suppose it is CDP that tells the C3750 that if STP comes alive it will be IEEE.
I do not believe CDP carries that sort of information, and I have never actually seen STP take any information from CDP. I am practically sure that this is not the case.
That is our working hypothesis right now. Current IOS is 12.2(50)SE1 IPBase K9.
Would it be possible to upgrade your switch to the latest 12.2(58)SE2 version? Please try to avoid the 15.x series at the moment. The behavior as you've experienced it is definitely not normal.
Please keep me informed.
Best regards,
Peter
03-08-2012 05:18 AM
You may also want to check the 3548's version and upgrade it to the last release made for it, back in 2007 (c3500xl-c3h2s-mz.120-5.WC17.bin), and see if your results are any different.
03-08-2012 06:46 AM
Glen,
Good point - thanks for that. Although the problem does not appear, so far, to be caused upstream from the 3750, it should not do any harm to have the 3548 upgraded.
I faintly remember Francois Tallet making an offhand remark at Cisco Live! 2012 in London about some STP/RSTP bug in recent IOSes. Perhaps we're dealing with some of it now.
Best regards,
Peter
03-08-2012 02:57 PM
Hi Peter,
The debugs are somewhat confusing. They show that at 10:31, you've disconnected your Gi1/0/1 port and St1 (the stacking port) became the new root port. Strange. There is absolutely no info about the Gi3/0/1 port behavior. Were you connected to the switch 3 in the stack? Would it be possible to conduct these debugs from it? Three minutes later, you've connected the Gi1/0/1 port back and the usual proposal/agreement procedure took place, but again no mention of the Gi3/0/1.
Oh yes, the logs are confusing, and considering what I have witnessed today, my confusion has even increased. To answer your question, I wasn't connected to switch 3; actually I wasn't connected to the switch at all, because I got tired of running down to the rather cold datacenter. What you see is the logging buffer, but I was connected to switch number 3 for half a day yesterday when I first started the debugging process, and I can assure you the debug looked the same. Interface Gi3/0/1 wasn't mentioned at all. I even changed the uplink to member switch number 2 (Gi2/0/1), with the same result. However, I wasn't aware that St1 actually means Stackport 1 until you told me so.
As a matter of fact, I've tried to perform tests in our lab - and, as a part of them, I have determined that the 3548 switch must be running PVST+ and not the plain IEEE 802.1D legacy STP. The reason is that if the 3548 ran the legacy STP, it would emit all its BPDUs untagged, essentially placing them into VLAN1. As the VLAN1 was disallowed on the trunks, the 3548 would be the only switch in the VLAN 1 and it would therefore become the root for VLAN1, having both its ports Designated Forwarding. For the RPVST+ BPDUs, it would be completely transparent. That, however, contradicts the information produced in the show span int gi3/0/1 you've posted:
Port 109 (GigabitEthernet3/0/1) of VLAN0010 is root blocking
Port path cost 20000, Port priority 128, Port Identifier 128.109.
Designated root has priority 4106, address 0011.5d8e.7000
Designated bridge has priority 32768, address 0006.536c.0f01
Note that the designated bridge on this port is different from the root bridge - and they would be identical if the RPVST+ BPDUs were transparently carried from your topmost 6500 through the 3548 down to 3750. Hence, the 3548 must be running PVST+ and I assume that the MAC address 0011.5d8e.7000 is its base MAC address - can you verify that?
While I agree with your explanation, the MAC addresses are the other way around: .7000 is the correct root address (the topmost 6500) and .0f00 is the 3548's base MAC, with .0f01 being the first FastEthernet interface.
I do not believe CDP carries that sort of information, and I have never actually seen STP take any information from CDP. I am practically sure that this is not the case.
You sound very convinced. I am willing to accept your statement, however I don't believe this is somehow related to the current problem.
Would it be possible to upgrade your switch to the latest 12.2(58)SE2 version? Please try to avoid the 15.x series at the moment. The behavior as you've experienced it is definitely not normal.
That will be the next step, maybe sometime late next week I will be able to do an IOS Update.
Today we tried to manipulate the STP path cost on the 3750 to force a transition on gi3/0/1, usually with the result of the switch being unreachable. However after a few shut/noshut and the like we witnessed the following:
c3750#show spanning-tree vlan 10
VLAN0010
Spanning tree enabled protocol rstp
Root ID Priority 4106
Address 0011.5d8e.7000
Cost 30000
Port 1 (GigabitEthernet1/0/1)
Hello Time 2 sec Max Age 20 sec Forward Delay 15 sec
Bridge ID Priority 32778 (priority 32768 sys-id-ext 10)
Address 8cb6.4f70.be80
Hello Time 2 sec Max Age 20 sec Forward Delay 15 sec
Aging Time 300 sec
Interface Role Sts Cost Prio.Nbr Type
------------------- ---- --- --------- -------- --------------------------------
Gi1/0/1 Root FWD 20000 128.1 P2p
St1 Desg FWD 20000 128.872 P2p Peer(STP)
Fa2/0/8 Desg FWD 200000 128.64 P2p Edge
Fa2/0/9 Desg FWD 200000 128.65 P2p Edge
Gi3/0/1 Altn BLK 20000 128.109 P2p Peer(STP)
Fa4/0/11 Desg FWD 100 128.175 Shr Edge Peer(STP)
Fa4/0/12 Desg FWD 100 128.176 Shr Edge Peer(STP)
Temporarily we had St1 and Gi3/0/1 as root port at the same time and the forwarding to the c3548 worked. After a topology change (shut/no shut), however, we were back to the situation above. I suppose it is some kind of software bug related to the stack configuration, because the stack otherwise looks OK:
c3750#show switch
Switch/Stack Mac Address : 8cb6.4f70.be80
H/W Current
Switch# Role Mac Address Priority Version State
----------------------------------------------------------
*1 Master 8cb6.4f70.be80 1 0 Ready
2 Member 8cb6.4f70.9580 1 0 Ready
3 Member 8cb6.4f46.1100 1 0 Ready
4 Member 108c.cfbf.a500 1 0 Ready
c3750#show switch stack-ports sum
Switch#/ Stack Neighbor Cable Link Link Sync # In
Port# Port Length OK Active OK Changes Loopback
Status To LinkOK
-------- ------ -------- -------- ---- ------ ---- --------- --------
1/1 OK 2 50 cm Yes Yes Yes 1 No
1/2 OK 4 50 cm Yes Yes Yes 1 No
2/1 OK 3 50 cm Yes Yes Yes 1 No
2/2 OK 1 50 cm Yes Yes Yes 1 No
3/1 OK 4 50 cm Yes Yes Yes 1 No
3/2 OK 2 50 cm Yes Yes Yes 1 No
4/1 OK 1 50 cm Yes Yes Yes 1 No
4/2 OK 3 50 cm Yes Yes Yes 1 No
That's it for the moment. Just to let you know, I really appreciate your comments.
best regards,
Pille
03-19-2012 01:18 PM
Hi there,
Just to let you know, I updated the c3750 to the latest 12.2 version and the problem disappeared. It seems it was in fact an IOS bug.
Thanks for your input,
best regards,
Pille
06-08-2012 07:51 AM
Hi Pille
Could you advise which C3750 IOS version you were running previously (with the bug), and also which version you updated to?
Great post by the way!
Many thanks
Yves
06-11-2012 12:00 AM
Hi Yves,
the bugged IOS was 12.2(50)SE1 IPBase K9, today we use 12.2(58)SE2 IPBase K9.
Bye
Pille