Rapid STP Cisco and non

lamav · ‎03-06-2011

Quick sanity check...

I have a non-Cisco access layer switch that is dual homed to Cisco distros.

The non-Cisco switch is running 802.1w. Cisco is running Rapid PVST+.

When I disconnect the active uplink, I fail over almost immediately to the other uplink and barely drop a ping. When I reconnect what was the active link, it has to go through the LIS and LRN states, and meanwhile I'm dropping all my pings.

This can't be helped, given that the non-Cisco switch isn't running UplinkFast, correct?

Thanks

Peter Paluch · ‎03-06-2011

Hello Victor,

When I reconnect what was the active link, it has to go through the LIS and LRN states, and meanwhile I'm dropping all my pings.

This can't be helped, given that the non-Cisco switch isn't running UplinkFast, correct?

No, I don't think so. UplinkFast and BackboneFast are proprietary STP enhancements, but they have standardized counterparts in 802.1w RSTP. In fact, even if you activated UplinkFast and BackboneFast on a Cisco switch running RSTP, they would have no effect and would not be used, because RSTP has its own mechanisms that provide similar functionality.

I guess this is something that requires a closer analysis. When you check the output of show spanning-tree on the Catalyst switch that faces the 802.1w region, how is the STP protocol detected and indicated on the boundary ports - is it RSTP or just STP?

Best regards,

Peter

lamav · ‎03-06-2011

Hey, Peter...long time.

So, here goes...

I labbed this up about a month ago and have had so many things on my plate that I haven't had a chance to play with it further. As an FYI, I don't do the day-to-day network engineering stuff anymore. I am a sales engineer and I work for a data center solutions company. Most of my life involves working with exciting, bleeding edge technologies, not this STP nonsense, which should disappear once TRILL is fully evolved. Anyway, I also don't remember everything I did. I will do it again either today or tomorrow and run my show commands again. If I recall right, the output of the show spanning-tree command on the Ciscos indicated RSTP, not STP.

But I, too, was/am under the impression that the open standard version of RSTP has similar functionality (UF and BBF), either built in or requiring configuration. I don't see any counterpart to UF and BBF that I can configure, so if it exists, it must be built into the code. But if it does indeed exist, why does the port have to go into the LIS and LRN states? The port that does it is the DP on the RB.

I did document my lab set up for future reference. Let me post it here.

Hardware:

2 Cisco 3550s at the distro layer (D1 and D2), running 12.2(35)SE5.

1 Dell 6220 access switch (A1), running 2.2.0.3.

Topology:

6220 dual homed to each 3550

3550s connected to each other

All connections are dot1q trunks allowing all vlans (only VLAN 10 configured)

Cisco config includes:

Vlan 10

SVIs for Vlan 10

D1 is root for Vlan 10

D2 is Secondary for Vlan 10

Default RSTP timers

Rapid-PVST+

Dot1q trunk downlinks to Dell 6220

Dot1q trunk between D1 and D2 (deliberately installed loop)

Dell config includes:

Vlan 10

rstp

dot1q uplinks (switchport mode general)

default RSTP timers

Converged STP status:

A1 to D1 link FWD

A1 to D2 link BLK (Cisco port blocked, Dell port in ALT)

D1 to D2 link FWD

D1 root bridge

D2's root port facing D1

A1's root port facing D1

Test 1:

Started continuous PING from A1 to SVI IP of D1

Disconnected A1 to D1 link --> failover to D2 link pretty fast (dropped one PING)

Reconnect A1 to D1 link ---> over 30 seconds to reconverge (about 18 PINGs dropped with a 2 sec timeout/PING)

My off-the-cuff suspicion:

Cisco’s RSTP includes UplinkFast and BackboneFast; Dell does not offer them, of course. So, there is no “agreement” created between the 3550 and the 6224. So, upon reconnecting the A1 to D1 link, the 3550 port facing the 6220 has to transition to LIS and LRN and then FWD, which takes 2 x forward delay intervals.

So, I configured D1’s 6220-facing port for portfast (spanning-tree portfast trunk) and, of course, it transitioned immediately into FWD state and dropped only one PING when I reconnected the link. Not exactly a good practice to keep an inter-switch link set for portfast. :-)

Test 2:

Started continuous PING from A1 to SVI IP of D1

Installed same failure, but all switches now configured for 802.1D

Failover, as expected, was 50 seconds upon removal of A1 to D1 link and 50 seconds upon reconnection of A1 to D1 link.
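As a rough cross-check of the figures reported in Tests 1 and 2, the arithmetic can be sketched in a few lines of Python. This assumes the standard default timers (Max Age 20 s, Forward Delay 15 s); the function names are illustrative only and do not come from any switch software.

```python
# Standard default spanning-tree timers, in seconds (assumed to match the lab).
MAX_AGE = 20
FORWARD_DELAY = 15


def timer_based_transition(forward_delay=FORWARD_DELAY):
    """Listening + Learning: time for a port to reach Forwarding on timers alone."""
    return 2 * forward_delay


def legacy_worst_case(max_age=MAX_AGE, forward_delay=FORWARD_DELAY):
    """Max Age expiry followed by Listening + Learning in 802.1D."""
    return max_age + 2 * forward_delay


def outage_from_pings(lost_pings, ping_timeout=2):
    """Approximate outage when every lost ping costs one full timeout."""
    return lost_pings * ping_timeout


print(timer_based_transition())   # 30
print(legacy_worst_case())        # 50
print(outage_from_pings(18))      # 36
```

About 18 lost pings at 2 seconds each is roughly 36 seconds, which fits a port that waited out two forward delay intervals plus some link-up overhead, and the 50 seconds in Test 2 matches Max Age plus two forward delays.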

Test 3:

Started continuous PING from A1 to SVI IP of D1

Kept 3550s in 802.1D (STP)

Disabled STP altogether from 6220 uplinks to 3550s.

Failover was 50 seconds upon removal of A1 to D1 link and 50 seconds upon reconnection of A1 to D1 link.

Debug feature not available on 6220s.

Debug on the Ciscos reveals bidirectional BPDU exchanges with the 6220 during reconvergence. The content is not informative enough to draw conclusions.

lamav · ‎03-06-2011

Peter, I tried it again and the mode is RSTP when I do a show spanning-tree vlan 10 on all the switches, not just the Ciscos. The only way to speed up convergence is if I configure the DP on the RB for UF. Not going to do that in a production environment, of course.

Peter Paluch · ‎03-06-2011

Hello Victor,

The pleasure of being in touch with you again is all mine.

Regarding this "STP nonsense", well, I'd say that it would all run just fine if all vendors, including Cisco, cared to implement it properly and without doing obscure optimizations or per-VLAN instantiations at a time when MSTP can do its work just fine.

Let me sum up some of my thoughts. First of all, RSTP already intrinsically contains functionality equivalent to Cisco's UplinkFast and BackboneFast, which were developed solely for Cisco's STP/PVST/PVST+, not for RSTP/RPVST/RPVST+. In particular:

The UplinkFast functionality is provided in RSTP by keeping track of Alternate ports, and moving the "best" Alternate port into the Root Forwarding role/state once the current root port fails.
The BackboneFast functionality is provided in RSTP by immediately accepting any BPDU sent by the current designated switch on a segment on our Alternate Discarding port, even if the BPDU is worse than the one previously received. In plain STP, only the best BPDU is stored on a port, and if worse BPDUs start to arrive, they will be processed only after the current BPDU expires (the max_age timer).

Therefore, debating whether Cisco's RSTP supports UplinkFast/BackboneFast and Dell's RSTP does not is not correct from a fundamental point of view: RSTP has its own ways of doing equivalent functions, and they are built in as an integral part of RSTP operation. They cannot even be turned off. To be completely blunt, RSTP, Cisco's or Dell's, must converge rapidly without any additional tinkering. If it does not, something is rotten.
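The two behaviours just described can be sketched in Python. This is a simplified model, not the 802.1w state machines: bridge and port identifiers are plain integers, the priority vector is reduced to a four-field tuple, and every name here (Port, promote_best_alternate, accept_bpdu) is hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Priority vector from the BPDU stored on a port; a lower tuple compares as better:
# (root bridge ID, root path cost, designated bridge ID, designated port ID).
Vector = Tuple[int, int, int, int]


@dataclass
class Port:
    name: str
    role: str          # "root", "alternate", "designated" or "disabled"
    vector: Vector
    link_up: bool = True


def promote_best_alternate(ports: List[Port]) -> Optional[Port]:
    """UplinkFast-like step: when the root port is lost, the best Alternate takes over."""
    candidates = [p for p in ports if p.role == "alternate" and p.link_up]
    if not candidates:
        return None
    best = min(candidates, key=lambda p: p.vector)
    best.role = "root"
    return best


def accept_bpdu(port: Port, received: Vector, rapid: bool) -> bool:
    """BackboneFast-like step: decide whether a received BPDU replaces the stored one."""
    same_designated_bridge = received[2] == port.vector[2]
    if received <= port.vector or (rapid and same_designated_bridge):
        port.vector = received
        return True
    return False  # plain STP keeps the stored BPDU until max_age expires


# Hypothetical IDs: 4097 stands for the root, 32769 for the secondary bridge.
g1 = Port("g1", "root", (4097, 0, 4097, 1))
g2 = Port("g2", "alternate", (4097, 19, 32769, 1))

g1.link_up, g1.role = False, "disabled"        # active uplink is unplugged
new_root_port = promote_best_alternate([g1, g2])
print(new_root_port.name)                      # g2

worse = (32769, 0, 32769, 1)                   # designated bridge now claims to be root
print(accept_bpdu(Port("x", "alternate", g2.vector), worse, rapid=False))  # False
print(accept_bpdu(Port("y", "alternate", g2.vector), worse, rapid=True))   # True
```

In the rapid case the worse BPDU is taken at once because it comes from the same designated bridge, so the bridge can react without waiting for the stored information to age out.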

I have been somewhat surprised by your description of the converged STP state:

A1 to D1 link FWD

A1 to D2 link BLK (Cisco port blocked, Dell port in ALT)

D1 to D2 link FWD

D1 root bridge

D2's root port facing D1

A1's root port facing D1

For which VLAN is this output valid? Note that the A1/D2 link is, according to your description, blocked on both ends, which is impossible in plain RSTP. With Cisco's RPVST+, the situation is more complicated: for all VLANs except VLAN 1, the Dell switch is considered just a shared segment - a hub, if you like to put it that way - because for RPVST+ it indeed behaves just like a hub. D1 and D2 completely ignore A1 for all other VLANs: this is done by encapsulating the RPVST+ BPDUs in SNAP frames so that they are not interpreted by the Dell, and they are sent as multicasts so that they completely traverse the plain RSTP region. VLAN 1 is the only VLAN on the Catalyst that speaks plain RSTP in addition to RPVST+ and thus interacts with the RSTP region in a normal way.
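The "hub" behaviour can be illustrated with a small Python sketch of how a plain 802.1w bridge might sort incoming frames by destination address. The two addresses are the IEEE bridge group address and the multicast Cisco uses for PVST+ BPDUs; the function names are made up for this example, and real switches apply further checks (port state, VLAN membership) that are left out here.

```python
IEEE_STP_DST = "01:80:c2:00:00:00"    # IEEE bridge group address (STP/RSTP BPDUs)
PVST_PLUS_DST = "01:00:0c:cc:cc:cd"   # Cisco multicast used for PVST+ BPDUs


def normalize_mac(mac: str) -> str:
    """Accept 0100.0ccc.cccd or 01:00:0C:CC:CC:CD style and return aa:bb:cc:dd:ee:ff."""
    digits = "".join(ch for ch in mac.lower() if ch in "0123456789abcdef")
    if len(digits) != 12:
        raise ValueError(f"not a MAC address: {mac!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def plain_rstp_bridge_action(dst_mac: str) -> str:
    """What a bridge that only speaks IEEE RSTP does with a frame, by destination."""
    if normalize_mac(dst_mac) == IEEE_STP_DST:
        return "process"   # consumed by the local spanning-tree process
    return "flood"         # PVST+ BPDUs look like ordinary multicast data


print(plain_rstp_bridge_action("0180.c200.0000"))   # process
print(plain_rstp_bridge_action("0100.0ccc.cccd"))   # flood
```

Under this model the per-VLAN BPDUs from D1 and D2 pass through A1 untouched on forwarding ports, while only the IEEE BPDUs take part in A1's own spanning tree.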

Now:

In VLAN 1, D1 is the root and D2 is the secondary root. The A1/D1 link should be forwarding, and on the A1/D2 link, A1 should be Alternate Discarding while D2 should be Designated Forwarding (the priority of D2 should be lower than the priority of A1). Can you verify this for me?
Now, on the A1/D2 link, if A1 is Alternate Discarding, it cannot propagate any RPVST+ BPDUs received from the A1/D1 link towards D2. Thus, I find it illogical for D2 to become Alternate Discarding (you said it is blocked - how blocked? Is it indeed Alternate Discarding or Backup Discarding? I need to be absolutely sure on this!) because it actually does not receive any BPDUs from the blocked port on the Dell.

So let's first try to clean up this particular issue. I do not currently remember the exact debug commands on the Catalyst that would enable us to see the Proposal/Agreement mechanism at work (which is obviously the mechanism that is failing here, and that's why you do not have the rapid convergence - but that seems to be a consequence of another problem which we first have to identify).
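For readers unfamiliar with the Proposal/Agreement mechanism, here is a minimal Python sketch of the exchange on a point-to-point link. It assumes the Proposal carries superior information, ignores timers and BPDU encoding entirely, and uses hypothetical names throughout.

```python
from dataclasses import dataclass
from typing import List

FORWARD_DELAY = 15  # seconds, standard default


@dataclass
class Port:
    name: str
    role: str            # "root", "designated" or "alternate"
    state: str           # "forwarding" or "discarding"
    edge: bool = False


def handle_proposal(ports: List[Port], receiving: Port) -> bool:
    """Downstream bridge: sync its other ports, adopt the new root port, send Agreement."""
    for port in ports:
        if port is receiving:
            continue
        if port.role == "root":
            # The old root port is no longer the best path to the root.
            port.role, port.state = "alternate", "discarding"
        elif port.role == "designated" and not port.edge:
            port.state = "discarding"      # sync: block until re-negotiated
    receiving.role, receiving.state = "root", "forwarding"
    return True                            # stands for the Agreement BPDU


def designated_delay(point_to_point: bool, agreement_received: bool,
                     forward_delay: int = FORWARD_DELAY) -> int:
    """Seconds before the upstream Designated port may forward."""
    if point_to_point and agreement_received:
        return 0                           # handshake completed, no timers needed
    return 2 * forward_delay               # fall back to Listening + Learning


# A1 when the A1/D1 link comes back: g2 is the current root port via D2.
g1 = Port("g1", "designated", "discarding")
g2 = Port("g2", "root", "forwarding")
host = Port("g5", "designated", "forwarding", edge=True)

agreed = handle_proposal([g1, g2, host], receiving=g1)
print(g1.role, g2.role, host.state)        # root alternate forwarding
print(designated_delay(True, agreed))      # 0
print(designated_delay(True, False))       # 30
```

If the Agreement never arrives, or the link is not treated as point-to-point, the upstream port is left with the timer path, which is the roughly 30-second outage seen in the lab.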

Looking forward to reading from you soon

Best regards,

Peter

lamav · ‎03-06-2011

Peter, thanks for that lucid answer and information. Good stuff.

You're right about D2. I fat-fingered my typing.

The D2 port on the A1/D2 link is indeed DESG FWD, not "blocked". I had written one thing and then changed it without editing my sentence. Anyway, the A1 port on the A1/D2 link is ALT DISC. And the VLAN is 2, not 10. That is the only VLAN I have configured. Although I am showing the output of VLAN 2 on the Ciscos, the VLAN 1 port states for both D1 and D2 are exactly the same as they are for VLAN 2. All their ports are FWD - with D1 as the root.

Cisco_3550_1#sh spanning-tree vlan 2

VLAN0002
Spanning tree enabled protocol rstp
Root ID    Priority    4098
             Address     000e.8364.6d80
             This bridge is the root
             Hello Time   2 sec Max Age 20 sec Forward Delay 15 sec

Bridge ID Priority    4098   (priority 4096 sys-id-ext 2)
             Address     000e.8364.6d80
             Hello Time   2 sec Max Age 20 sec Forward Delay 15 sec
             Aging Time 300

Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Fa0/1            Desg FWD 19        128.1    P2p
Fa0/24           Desg FWD 19        128.24   P2p

Cisco_3550_1#
Cisco_3550_1#

Cisco_3550_2#sh spanning-tree vlan 2

VLAN0002
Spanning tree enabled protocol rstp
Root ID    Priority    4098
             Address     000e.8364.6d80
             Cost        19
             Port        24 (FastEthernet0/24)
             Hello Time   2 sec Max Age 20 sec Forward Delay 15 sec

Bridge ID Priority    32770 (priority 32768 sys-id-ext 2)
             Address     000d.bc6e.9300
             Hello Time   2 sec Max Age 20 sec Forward Delay 15 sec
             Aging Time 300

Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Fa0/1            Desg FWD 19        128.1    P2p
Fa0/24           Root FWD 19        128.24   P2p

Cisco_3550_2#

6224P_1#
6224P_1#show spanning-tree deta
Spanning tree Enabled mode rstp

Port 1/g1    Enabled
State: Forwarding                                Role: Root
Port id: 128.1                                   Port Cost: 200000
Port Fast: No (Configured: no )                 Root Protection: No
Designated bridge Priority: 61440                Address: 10:01:00:0E:83:64:6D:80
Designated port id: 128.1                        Designated path cost: 0
CST Regional Root: 10:01:00:0E:83:64:6D:80       CST Port Cost: 0
BPDU: sent 35, received 538

Port 1/g2    Enabled (THIS PORT FACES D2)
State: Discarding                                Role: Alternate
Port id: 128.2                                   Port Cost: 200000
Port Fast: No (Configured: no )                 Root Protection: No
Designated bridge Priority: 61440                Address: 80:01:00:0D:BC:6E:93:00
Designated port id: 128.1                        Designated path cost: 19
CST Regional Root: 80:01:00:0D:BC:6E:93:00       CST Port Cost: 0
BPDU: sent 19, received 220

So, in short, everything else I told you is the way you understand it. I agree that RSTP (IEEE 802.1w) should have the equivalent functionality of Cisco's UF and BBF, which is why I am surprised that this is not converging faster. This is why I titled this thread "sanity check."
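One detail in the Dell output is worth decoding. Assuming the Address fields hold the full 8-byte bridge identifier (an assumption, since the Dell display format is not documented in this thread), the first two bytes split into a 4-bit priority and a 12-bit system ID extension. A hedged Python sketch, with an illustrative function name:

```python
def decode_bridge_id(text: str):
    """Split 'PP:PP:MM:MM:MM:MM:MM:MM' into (priority, sys_id_ext, mac)."""
    octets = [int(part, 16) for part in text.split(":")]
    if len(octets) != 8:
        raise ValueError("expected 8 colon-separated bytes")
    first16 = (octets[0] << 8) | octets[1]
    priority = first16 & 0xF000      # upper 4 bits, in steps of 4096
    sys_id_ext = first16 & 0x0FFF    # lower 12 bits, the VLAN ID on Cisco per-VLAN STP
    mac = ":".join(f"{o:02X}" for o in octets[2:])
    return priority, sys_id_ext, mac


print(decode_bridge_id("10:01:00:0E:83:64:6D:80"))  # (4096, 1, '00:0E:83:64:6D:80')
print(decode_bridge_id("80:01:00:0D:BC:6E:93:00"))  # (32768, 1, '00:0D:BC:6E:93:00')
```

If that reading is right, the Dell is hearing bridge IDs with system ID extension 1 from both 3550s, that is, the VLAN 1 instance rather than VLAN 2, which is consistent with Peter's description of how a plain RSTP switch sees RPVST+ neighbors.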

Peter Paluch · ‎03-07-2011

Hello Victor,

Thanks for the update.

Could you please activate the following two debugs on your D1 and D2 switch, and perform the connectivity tests again while capturing the output of the debugs?

debug span events

debug span switch state

I am especially interested in seeing the debugs from D1 after the disconnected A1/D1 link is connected back.

Thank you!

Best regards,

Peter

Peter Paluch · ‎03-10-2011

Hello Victor,

Any news on this? Please respond

Best regards,

Peter

burleyman · ‎03-10-2011

What about these two debug commands? Would they help?

debug spanning-tree backbonefast

debug spanning-tree uplinkfast

Mike

Peter Paluch · ‎03-10-2011

Hi Mike,

I do not think so. We are hunting problems with RSTP, not with PVST+ extensions. Then again, adding them surely won't hurt.

Best regards,

Peter

lamav · ‎03-12-2011

Peter, I never got a chance to try it.

Peter Paluch · ‎03-13-2011

Hi Victor,

Hmm... and will you have that chance yet? I got very intrigued by this issue.

Best regards,

Peter

lamav · ‎04-02-2011

Hi all,

I'd like to pick up where I left off last.

I was rethinking my methodology for testing failover.

Going back to my setup, which you can understand if you scroll up and read my initial post...

Access layer switch A1 (NON-Cisco) is dual homed to 2 Cisco distribution switches, D1 and D2. And D1 and D2 are connected to each other. Typical triangle topology with all VLANs trunked...

When testing, I was PINGing D1 from A1 and then disconnecting the active link (A1-D1), letting it fail over to the backup (A1-D2), which happened fast, and then reconnecting what was the active link (A1-D1). When I did the reconnect, I had to wait for the Cisco switchport on D1 to go through all the STP states before forwarding traffic, so I would drop PINGs for about 40 seconds.

I'm wondering if the disconnect-and-reconnect method is fair, so to speak. How would failover have occurred if the access layer was a Cisco switch? I think the failback upon reconnect would have been immediate because config BPDUs would be exchanged, informing the D1 switch that it's part of an UplinkFast group... or something like that. But since the access switch is a NON-Cisco switch, they don't "talk to each other" about UplinkFast.

Giuseppe Larosa · ‎04-03-2011

Hello Victor,

Interesting thread, as usual.

My understanding of UplinkFast is the following:

The access layer switch does not send out special BPDUs to announce or negotiate UplinkFast.

Actually, it tries to avoid becoming a transit switch by increasing its bridge priority to 49152 = 12*4096, and the port cost is increased by 3000 on all ports.

The mechanism is very simple:

When the new uplink is used, the switch configured for UplinkFast sends out multicast frames, one for each locally learned MAC address (learned on switch ports other than an uplink), with that MAC address as the source.

So UplinkFast is traffic driven, and you can configure the maximum rate at which these messages are sent.
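The UplinkFast behaviour described above can be put into a short Python sketch. The priority and cost figures are the ones quoted in this post; the destination address for the dummy frames is assumed here to be Cisco's 0100.0ccd.cdcd multicast, and the function names and sample MAC table are invented for the example.

```python
UPLINKFAST_BRIDGE_PRIORITY = 12 * 4096   # 49152, keeps the switch from becoming transit
UPLINKFAST_COST_INCREASE = 3000          # added to every port cost
DUMMY_MCAST_DST = "0100.0ccd.cdcd"       # assumed destination of the dummy frames


def uplinkfast_port_costs(port_costs):
    """Return the port costs after UplinkFast raises each of them."""
    return {port: cost + UPLINKFAST_COST_INCREASE for port, cost in port_costs.items()}


def dummy_frames(mac_table, uplink_ports):
    """One (source, destination) pair per MAC learned on a non-uplink port."""
    return [
        (mac, DUMMY_MCAST_DST)
        for mac, port in sorted(mac_table.items())
        if port not in uplink_ports
    ]


# Hypothetical access switch: two uplinks and two hosts.
costs = {"Fa0/1": 19, "Fa0/2": 19}
mac_table = {
    "0011.2233.4455": "Fa0/10",   # local host
    "0011.2233.4466": "Fa0/11",   # local host
    "000e.8364.6d80": "Fa0/1",    # learned via an uplink, so skipped
}

print(UPLINKFAST_BRIDGE_PRIORITY)              # 49152
print(uplinkfast_port_costs(costs))            # {'Fa0/1': 3019, 'Fa0/2': 3019}
print(dummy_frames(mac_table, {"Fa0/1", "Fa0/2"}))
```

The dummy frames exist only so upstream switches relearn the local MAC addresses on the new path; they are unrelated to the RSTP handshake, which is why a non-Cisco neighbor is not expected to produce them.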

BackboneFast is a different matter and uses the Root Link Query message to detect indirect failures in the path to the root bridge.

This is clearly a subset of the handshaking mechanism in RSTP that happens on p2p links.

So your distribution switches are running PVST+. They correctly detect RSTP BPDUs on links to the Dell switch, but they do not interact with it at a level comparable to what should happen if they were running Rapid PVST.

Edit:

Your switches are running Rapid PVST, so they should be able to interact with the Dell...

If they are running Rapid PVST, they should use the handshake mechanism to quickly negotiate port state without having to wait for STP timers to expire.

You have seen fast convergence when you used spanning-tree portfast on the Cisco switch side because you were able to bypass the timers.

The Dell is not going to emulate the UplinkFast traffic push, but it should be able to negotiate with a rapid STP neighbor.

Edit:

It is probably related to the fact that the Dell is running a single-instance 802.1w STP, so the Cisco switches do not interact with it in VLAN 2.

Hope to help

Giuseppe

lamav · ‎04-03-2011

Giuseppe:

Long time, my friend! I started a thread a few weeks ago, where I was asking if anyone had heard from you or Jon. Hope all is well.

Anyway, let me see if my thinking is correct.

1. UplinkFast (UF) and BackboneFast (BBF) are definitely Cisco proprietary, of course. Moreover, when Rapid-PVST+ is configured on a Cisco switch, UF and BBF are automatically enabled.

2. The open standard of rapid STP (802.1w) includes similar functionality to UF and BBF.

3. So, when you have an all-Cisco environment and every switch is using Rapid-PVST+, they all use UF and BBF. No problems.

4. HOWEVER, when that Cisco switch configured for Rapid-PVST+ has to interact with a non-Cisco switch running open standard rapid STP (802.1w), the Cisco switch will revert to the 802.1w version and try to conduct a "proposal" and "agreement" exchange with that non-Cisco switch using the open standard's semantics, NOT UF and BBF.

Is each point correct, as far as you understand it?