Solved: Re: How to minimise the service disruption -RSTP recalculation

sasanka1912 · ‎04-06-2025

Hi,

Currently, we have the following connectivity in one of our offices, and the CORE-1 switch acts as the RSTP root bridge. Between CORE-1 and CORE-2 switches, we have a PO1 (port-channel) trunk interlink.

Recently, one of my colleagues tried to add a new VLAN across multiple trunk links. When he configured switchport trunk allowed vlan add xxx on the Po1 interlink from the CORE-2 switch, the "add" keyword wasn't copied correctly, so the command overrode the entire allowed-VLAN list on that Po1 trunk link. (This was rectified and resolved immediately.)

During this time, the Operations team noticed that most of the access points disassociated from the wireless controller and reconnected within 30 seconds, due to RSTP recalculation.

My question is: what additional steps can we take to minimise the service impact when a single change goes wrong on one link and affects the entire building?

I am looking for advice on any design/topology/configuration changes we should consider.
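
To illustrate the pitfall described above, here is a minimal sketch in Cisco IOS syntax. VLAN 100 is a placeholder chosen for illustration only; the real VLAN ID is not given in this post.

```
interface Port-channel1
 ! Intended command: appends VLAN 100 to the existing allowed list
 switchport trunk allowed vlan add 100

interface Port-channel1
 ! Same command without "add": replaces the allowed list, so every
 ! other VLAN is removed from the trunk
 switchport trunk allowed vlan 100
```

Comparing "show interfaces trunk" before and after the change is one way to spot this kind of mistake quickly.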

r.heitmann · ‎04-07-2025

The main reasons for RSTP behaving in the same manner as "non-rapid" standard STP are:
1) Single switches in the LAN operating in non-RSTP mode

- They don't participate in the RSTP handshake (proposal/agreement), which is how RSTP improves convergence time, so at least parts of the network have to fall back to timer-based 30 s convergence when topology changes occur.

2) Edge ports not configured as "STP portfast" or "STP port type edge", or server/trunk ports not configured as "STP port type edge trunk"

=> Any edge port (a port with no RSTP bridge behind it) has to be configured as an edge port or edge trunk port.

Otherwise the switch "has to wait" (timer-based, no handshaking for rapid convergence) and the whole STP convergence behaves like non-RSTP, with a 30 s delay.

=> A single edge port in the LAN that is not configured as an edge port can break the whole STP convergence behaviour.

With both points addressed, you can expect "RSTP behaviour" with sub-second failover.

There have been STP bugs in the code, yes, but 99% of all the "30 s" slow-convergence issues I've seen in the field have been fixed by applying both of these config-check rules. In those cases, some part of the network was falling back to "classic STP".
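
As a minimal sketch of both rules in Cisco IOS syntax: the interface names are placeholders, and the exact keyword depends on the platform (newer releases use "spanning-tree portfast edge" and "spanning-tree portfast edge trunk", older ones the plain forms shown here), so check what your software accepts.

```
! Rule 1: every switch in the LAN should run a rapid mode
spanning-tree mode rapid-pvst
!
! Rule 2a: access port facing a host or access point
interface GigabitEthernet1/0/10
 switchport mode access
 spanning-tree portfast
!
! Rule 2b: trunk port facing a server or other non-bridging device
interface GigabitEthernet1/0/20
 switchport mode trunk
 spanning-tree portfast trunk
```

"show spanning-tree summary" on each switch reports the running STP mode, which is a quick way to find a switch that was left in a non-rapid mode.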


Giuseppe Larosa · ‎04-08-2025

Hello @sasanka1912,

For Rapid STP to work as expected, it is really important that all access ports on all access-layer switches are configured as STP edge ports (also known as spanning-tree portfast), as explained by @r.heitmann.

Configuring spanning-tree mode rapid-pvst is not enough on its own to get fast convergence.

Hope this helps

Giuseppe


Joseph W. Doherty · ‎04-09-2025

Yup. What @r.heitmann's case "2)" is trying to avoid is also explained here: https://www.cisco.com/c/en/us/support/docs/ip/hot-standby-router-protocol-hsrp/10583-62.html#toc-hId--925083550


M02@rt37 · ‎04-06-2025

Hello @sasanka1912

Regarding your topology, make sure the STP root bridge and its backup are set explicitly: assign CORE-1 as the root with the lowest priority value, and CORE-2 as the secondary with a slightly higher one.
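
A minimal sketch of that priority setup in Cisco IOS syntax follows. The VLAN range and the priority values are illustrative assumptions; any multiples of 4096 work as long as CORE-1 has the lower value.

```
! On CORE-1 (intended root bridge)
spanning-tree vlan 1-4094 priority 4096
!
! On CORE-2 (intended backup root)
spanning-tree vlan 1-4094 priority 8192
```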

Avoid dynamic VTP modes by setting all switches to VTP transparent mode and managing VLAN propagation manually, to prevent unintended overwrites.

On the "operational side", enforce EtherChannel consistency by using LACP instead of static configuration, and enable features like BPDU guard on access ports and loop guard on trunk links to catch misconfigurations early.

For long-term stability, look at migrating to an L3 core design, which reduces the scope of the L2 domains and the impact of STP events.
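
As a rough sketch of those three features in Cisco IOS syntax: the interface names and channel-group number are placeholders, not taken from your network. Note that converting an existing static bundle to LACP is itself a disruptive change, so it belongs in a maintenance window.

```
! Member links of the core interlink, on both ends: negotiate with LACP
interface range TenGigabitEthernet1/0/1 - 2
 channel-group 1 mode active
!
! Access port: err-disable the port if a BPDU is received on it
interface GigabitEthernet1/0/10
 spanning-tree bpduguard enable
!
! Trunk link: block the port (loop-inconsistent) if BPDUs stop arriving
interface Port-channel1
 spanning-tree guard loop
```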

Best regards
.ı|ı.ı|ı. If This Helps, Please Rate .ı|ı.ı|ı.

sasanka1912 · ‎04-07-2025

M02@rt37 Thanks for your reply. CORE-1 and CORE-2 are currently configured with STP priorities 8192 and 16384.

Port channels are configured with "channel-group xxx mode on" on both ends, instead of active/passive for LACP.

All the switches are configured in VTP transparent mode as well.

Finally, regarding an L3 design instead of L2: do you have an example configuration or topology you could share?

M02@rt37 · ‎04-07-2025

@sasanka1912

"Routed access is an alternative configuration in which Layer 3 is extended all the way to the access layer switches. In this design, access layer switches act as full Layer 3 routed nodes (providing both Layer 2 and Layer 3 switching), and the access-to-distribution Layer 2 uplink trunks are replaced with Layer 3 point-to-point routed links. Consequently, the Layer 2/ Layer 3 demarcation point is moved from the distribution switch to the access switch, as illustrated in Figure 22-10."

source: CCNP ENCOR 350-401_Chap.22 "Enterprise Network Architecture"

Because there are no L2 links to block, this design eliminates the need for STP and both uplinks from access to distribution can be used, increasing the effective bandwidth available.
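
A minimal sketch of what one access switch could look like in a routed access design, in Cisco IOS syntax. Everything specific here is an assumption for illustration: the interface names, the documentation-range addresses, the VLAN number, and the choice of OSPF (the quoted text does not prescribe a routing protocol).

```
! Access switch: enable routing
ip routing
!
! Uplink becomes a Layer 3 point-to-point routed link
interface TenGigabitEthernet1/1/1
 description Uplink to distribution (hypothetical)
 no switchport
 ip address 192.0.2.1 255.255.255.252
 ip ospf network point-to-point
!
! The user VLAN gateway now lives on the access switch
interface Vlan10
 ip address 198.51.100.1 255.255.255.0
!
router ospf 1
 passive-interface Vlan10
 network 192.0.2.0 0.0.0.3 area 0
 network 198.51.100.0 0.0.0.255 area 0
```

A second uplink would be configured the same way with its own subnet, so both paths are used through equal-cost routing.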

Best regards

sasanka1912 · ‎04-07-2025

M02@rt37 Thanks. Will look further into this.

M02@rt37 · ‎04-07-2025

You're welcome @sasanka1912

You mention APs and a wireless controller, so I suppose you are focused on a campus design, not a datacenter, right?

Best regards

sasanka1912 · ‎04-07-2025

M02@rt37 Yes, that's correct. This is a campus setup.

r.heitmann · ‎04-07-2025