Solved: Can a native VLAN mismatch or duplex mismatch be the cause of a loop?

dolahadad · ‎05-26-2012

One week ago, I got a call from a client who reported a network outage.

The client told me that three switches had crashed. He tried to connect via console, but it just hung.

I asked him, did you change something?

He said he didn't change anything; he had just plugged a Nortel switch into Cisco switch number 9, but that switch didn't crash like the others (3, 4, 8).

I checked the uptime, and yes, the switch had never been powered off.

The topology looks like this:

____ 6500 ____

/ / | \ \

1 2 3 4 5 ...... 9

The VLANs are end-to-end VLANs, so they span all of those switches.

VTP is in transparent mode.

This is a collapsed-core topology; the 6500 itself is both core and distribution.

All of the access switches 1-9 are in the same rack, with no Loop Guard, and BPDU Guard configured, and they are connected to the core using EtherChannel.

The problem is that there are no logs available to start the troubleshooting/investigation.

The client seems to lack understanding of both Cisco and Nortel switch configuration.

The Cisco devices are running RSTP.

The Nortel switch runs its default STP.

My assumption is:

1. The client mistakenly configured something that triggered a loop.

A loop would explain why the switches crashed and needed to be rebooted manually.

Regards,

Dola

Rolf Fischer · ‎05-27-2012

Unfortunately I can't tell you exactly what is going wrong in your customer's network; our scenario was different from yours.

Except for the etherchannels there is no redundancy, right?

So, a misconfigured or malfunctioning channel is one thing which could cause a loop.

Which switches crash first in a layer 2 loop situation can depend on their load, link capacity, hardware platform, etc.

Maybe the crash of one particular switch even cut the loop. This could be helpful in finding the cause.

What I mentioned about the proprietary Cisco BPDU format means:

- they send an untagged BPDU to the IEEE destination MAC 01:80:C2:00:00:00

- they ALSO send an untagged BPDU with the Cisco proprietary destination MAC 01:00:0C:CC:CC:CD

- and for every tagged VLAN they send a tagged BPDU with the Cisco proprietary destination MAC 01:00:0C:CC:CC:CD

Non-Cisco switches won't recognize frames with destination address 01:00:0C:CC:CC:CD as BPDUs, but because the group bit is set they treat them as multicast, which means flooding. I know this doesn't explain what's going wrong in your case, but it's important to know in such scenarios.
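To make the group-bit point concrete, here is a minimal Python sketch of how a bridge that only knows the IEEE address might classify a destination MAC. The function names and the returned labels are my own for illustration, and the model is deliberately simplified (no multicast filtering or snooping assumed); it is not any vendor's actual forwarding logic.

```python
# Simplified model (assumption): a non-Cisco bridge consumes only frames sent
# to the IEEE STP address as BPDUs and floods every other group address.
IEEE_STP_MAC = "01:80:C2:00:00:00"    # IEEE bridge group address
CISCO_PVST_MAC = "01:00:0C:CC:CC:CD"  # Cisco PVST+ destination address


def parse_mac(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' (or with dashes) into 6 raw bytes."""
    parts = mac.replace("-", ":").split(":")
    if len(parts) != 6:
        raise ValueError(f"expected 6 octets, got {len(parts)}: {mac!r}")
    return bytes(int(p, 16) for p in parts)


def is_group_address(mac: str) -> bool:
    """The group (multicast) bit is the least significant bit of the first octet."""
    return bool(parse_mac(mac)[0] & 0x01)


def handle_on_standard_bridge(dst_mac: str) -> str:
    """Hypothetical helper: what the simplified bridge does with a frame."""
    raw = parse_mac(dst_mac)
    if raw == parse_mac(IEEE_STP_MAC):
        return "process as BPDU"
    if raw[0] & 0x01:
        return "flood as multicast"
    return "forward as unicast"


if __name__ == "__main__":
    for mac in (IEEE_STP_MAC, CISCO_PVST_MAC, "00:11:22:33:44:55"):
        print(mac, "->", handle_on_standard_bridge(mac))
```

In this model the PVST+ address lands in the "flood as multicast" branch simply because the first octet, 0x01, has its lowest bit set.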

There are a couple of things to examine:

- the safer ground regarding STP between Cisco and non-Cisco switches is 802.1s MSTP - here they have implemented the standard.

- I think I remember Nortel offers some kind of proprietary layer-2 redundancy. I can't remember the concept, but it might be worth checking this.

- you could give it a try with only a single uplink (no EtherChannel) to see if the problem disappears then

Layer 2 loops are always a challenge, I wish you good luck!

HTH,

Rolf


Rolf Fischer · ‎05-26-2012

A couple of years ago I experienced the same thing when we connected our first Cisco switch in a Nortel environment. I think the Nortel switches do not play well with the Cisco PVST+ BPDUs. They can't recognize them as BPDUs but treat them as multicast (the group bit is set to 1), which means flooding them over all ports.

Depending on the capabilities, I'd recommend the use of MSTP in a multivendor environment with Cisco switches.

dolahadad · ‎05-26-2012

Someone needs to try this in the lab, or maybe Cisco already did that.

Can someone refer me to documentation about Cisco switch interoperability with other vendors?

Well, Fischer, your answer is going to be the correct answer if no one comes up with a better explanation, but how can it explain why only 3 switches crashed in my case, but the other 6 didn't?

I have to come up with a good explanation for the manager of what caused the outage.