Spanning tree - Catalyst and Nexus

geeksy · ‎06-10-2022

Hi,

The setup is two Nexus switches connected to various downstream switches via vPC. Each downstream switch has one connection to each Nexus.
There are no direct connections between the downstream switches.

We had a network outage, which I believe was caused by a user connecting a cable from a hub back to another access port on the same switch. The log message types from the access switch and the Nexus switches are listed below.

Access switch logs
%SW_MATM-4-MACFLAP_NOTIF:

Nexus primary
%EIGRP-3-DUP_ADDR:
%ETH_PORT_CHANNEL-5-PORT_DOWN:
%ETH_PORT_CHANNEL-5-FOP_CHANGED:
%ETHPORT-5-IF_DOWN_PORT_CHANNEL_MEMBERS_DOWN:
%FWM-2-STM_LOOP_DETECT:

Nexus secondary
N3K2 %STP-2-DISPUTE_DETECTED:
%STP-2-DISPUTE_CLEARED:

On the access switches, PortFast was enabled but BPDU guard was not. I have since applied BPDU guard on all access ports. UDLD is enabled on the fibre links. One of the Nexus switches is manually set as root bridge.

Would enabling broadcast storm control help? I would appreciate any other advice and suggestions for best practice.

Thanks in advance

marce1000 · ‎06-10-2022

- I would wait and, if possible, test whether BPDU guard can do its job (though testing has implications of its own). I once had a similar problem on a network with a Nexus (vPC) backbone and 4500 office leaf switches, where even BPDU guard would not kick in. It caused a network meltdown when the first bozo did the same as in your case. Later we saw it multiple times; it can also happen with bad patching by the helpdesk, or when a user sees a cable on the floor and just plugs it into a network outlet. In the end I had to configure the ports with port security, with a maximum of 3 addresses, as a kind of manual BPDU guard against the loop, which worked. Storm control has a lower priority from my viewpoint, because it can also slow down the countermeasures. Storm control should be aimed at preventing what its name says (storms originating from users), not loops.
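
For reference, a minimal sketch of that kind of port-security workaround, assuming Catalyst IOS syntax. The interface name and the shutdown violation action are assumptions for illustration, not the configuration actually used on that network:

```
! Hypothetical access port; adjust the interface name to your switch
interface GigabitEthernet1/0/10
 switchport mode access
 switchport port-security
 ! Allow at most 3 source MAC addresses on this port
 switchport port-security maximum 3
 ! Err-disable the port when the limit is exceeded
 switchport port-security violation shutdown
```

The idea is that a looped port quickly sees flooded frames from more than three source MAC addresses, so the violation takes the port down even if no BPDU is received.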

M.

-- Each morning when I wake up and look into the mirror I always say ' Why am I so brilliant ? '
When the mirror will then always respond to me with ' The only thing that exceeds your brilliance is your beauty! '

geeksy · ‎06-11-2022

Thanks for your reply. Am I right in saying that BPDU guard should help in situations like the ones you mentioned, "bad patching from helpdesk , or user seeing cable on the floor and just plugging it into a network"? Assuming it works, that is.

marce1000 · ‎06-11-2022

- Normally , it should work.

M.


MHM Cisco World · ‎06-10-2022

This is my opinion.
There are two solutions here:
Use a non-vPC VLAN for switches connected by a single link <<< better solution
or
Use peer-switch on the Nexus

geeksy · ‎06-11-2022

Sorry I am not sure if I understood. I am using "peer-switch" feature in the vpc domain

MHM Cisco World · ‎06-11-2022

OK, let me explain what happened here.
With peer-switch, BOTH Nexus switches send BPDUs to the connected access SW.
That is perfect for a vPC VLAN, BECAUSE the access SW is connected via a port channel to both Nexus switches,
meaning that the access SW sees ONLY ONE switch. This is the core idea of vPC: two physical switches are combined to build one virtual switch.

The BPDUs sent from both Nexus peers show the same root and bridge ID because, as I mentioned above, they act as one virtual switch.

YOUR ISSUE, AS I SEE IT
You connected a switch that was elected root.
On a root bridge, all ports are designated and forwarding by default. That is fine on an ordinary switch, but not with the Nexus:
the Nexus always tries to keep the peer-link forwarding <<- this is where the LOOP forms

The access SW sends a frame -> NSK1 -> peer-link -> NSK2 -> access SW.
The MAC address table then shows MAC flapping,
and the Nexus detects the loop: "%FWM-2-STM_LOOP_DETECT".

So I think your idea of enabling BPDU guard solves the issue here, provided the Nexus sends BPDUs.

It puts the one link to the Nexus into err-disable if it detects a BPDU from the Nexus it is connected to.

Please check the access SW:
show spanning-tree <- is it the root for this VLAN?

geeksy · ‎06-12-2022

Root for all VLANs is NXS1

The "%FWM-2-STM_LOOP_DETECT" message specifically appeared at the same time as the bad patching was done in the access layer.

MHM Cisco World · ‎06-12-2022

I know you configured NSK1 as root, but if a connected switch has the same priority and a lower MAC address, it can be elected root.

On the access switch, run

show spanning-tree

and check the root bridge.
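
A short sketch of that check, assuming a Catalyst IOS access switch. VLAN 10 is a placeholder, not a VLAN taken from this thread:

```
! Summary of the root bridge for every VLAN
show spanning-tree root
! Full detail for one VLAN (VLAN 10 is a placeholder)
show spanning-tree vlan 10
```

In the per-VLAN output, compare the Root ID section with the Bridge ID section. The line "This bridge is the root" appears only if the local switch has been elected root for that VLAN.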

geeksy · ‎06-12-2022

right, I see what you mean.

On the access switch, the root bridge is shown as NXS1.

MHM Cisco World · ‎06-12-2022

The access SW is connected directly to NSK1.

Then, by mistake, some user connected this SW to NSK2? Is that what happened?

geeksy · ‎06-12-2022

The access SW is connected directly to both nsk1 and nsk2, via a port channel.

A user mistakenly connected two ports on the same access switch together (via a hub).

MHM Cisco World · ‎06-12-2022

Now it is clear to me why the Nexus detected this LOOP.
What happened is the following:
The Nexus sends a broadcast/multicast -> access SW.
The access SW forwards it out of Port 1,
BUT Port 1 is connected by mistake to Port 2 on the same access SW,
so the broadcast/multicast returns to the access SW.
The access SW forwards it to all ports except Port 2, where it was received.
Here is the LOOP.
The Nexus receives the broadcast/multicast it sent itself and detects the LOOP.

PortFast does not filter BPDUs or disable STP. When a PortFast port receives a BPDU, it reverts to a normal STP port and ends up blocking, BUT only after a short loop, which the Nexus does not tolerate.
This is why Cisco recommends configuring BPDU guard together with PortFast: to prevent that short loop.
So go ahead and configure BPDU guard.
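
A minimal sketch of what that could look like, assuming Catalyst IOS syntax on the access switch. The interface name and the 300-second recovery interval are assumptions for illustration, and exact keywords vary by platform and release:

```
! Option 1: enable BPDU guard globally on every PortFast-enabled port
spanning-tree portfast bpduguard default
!
! Option 2: enable it per interface (hypothetical port)
interface GigabitEthernet1/0/10
 spanning-tree portfast
 spanning-tree bpduguard enable
!
! Optional: bring err-disabled ports back automatically once the loop is removed
errdisable recovery cause bpduguard
errdisable recovery interval 300
```

With BPDU guard active, a port that receives any BPDU is put into err-disable immediately instead of falling back to normal STP operation, so the short loop never has time to form.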

geeksy · ‎06-12-2022

Thanks. I have BPDU guard on all access ports now.

MHM Cisco World · ‎06-12-2022

You are so so welcome friend.
thanks for sharing your issue and your solution