
How does a loop form in a misconfigured Etherchannel?

Peter Paluch
Cisco Employee

Dear friends,

It is a commonly seen and practically proven issue that if two switches are interconnected by a number of parallel links which are bundled into an Etherchannel on one switch (obviously using the on mode) while being unbundled on the second switch, a Layer 2 loop may well be created. However, I do not understand the exact mechanism of the formation of this loop.

I am well aware of the basic principles behind this: I know that STP treats the Port-channel interface as a single interface, and thus all member links bundled in this Etherchannel share the same STP state/role. I also understand that a broadcast/multicast/unknown unicast frame sent by a port in the Etherchannel will reach the opposite switch and get flooded over all remaining links, eventually arriving back at the switch with the configured Etherchannel.

And this precise moment is where my understanding ends: The frame arrived back and its destination is still unknown. However, from the viewpoint of the switch, the frame came in through a particular Port-channel interface. If this switch floods the frame, it will flood it through all remaining ports except the port through which the frame came in, meaning that the frame will never be sent back through the Port-channel. How does the loop get created, then?

Thank you very much for helping me out with this!

Best regards,

Peter

4 Accepted Solutions


Edison Ortiz
Hall of Fame

Peter,

Broadcast and unknown unicast packets will be flooded out of all interfaces with the exception of the interface where the flood came from.

In a 2 switch topology, Switch A (which has the bundle) will flood the packet out of the members of the bundle expecting Switch B to receive the packet on its bundle.

However, Switch B will only receive the packet on one of its physical interfaces and can potentially flood it back to Switch A out of its other connected physical interface, causing the STP loop.

Regards,

Edison.


You are correct but the frame that was sent originally came back via the same interface - in essence that's a loop.

As you stated, it won't send the frame back out, but take into account a network with tons of unknown unicast/multicast/broadcast traffic.


paluchpeter wrote:

Jon,

Logic dictates that a frame received by a link in an Etherchannel bundle must not be sent back through any of the links constituting the bundle. That is just an extension of the usual rule that a frame is never going to be sent back out its own ingress port, even if the destination is learned on that same port.
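
The rule above can be sketched as a small forwarding decision in Python. This is only an illustrative model, not how any particular switch implements it: the function name `forward`, the port names and the MAC address `aa.bb` are hypothetical, and the Port-channel is treated as a single logical port.

```python
def forward(mac_table, logical_ports, ingress_port, dst_mac):
    """Return the logical ports a frame is sent out of.

    mac_table maps a destination MAC to the logical port it was learned on.
    A Port-channel counts as ONE logical port, whichever member link the
    frame physically arrived on.
    """
    egress = mac_table.get(dst_mac)
    if egress is None:
        # Unknown destination: flood everywhere except the ingress logical port.
        return [p for p in logical_ports if p != ingress_port]
    if egress == ingress_port:
        # Destination lives behind the ingress port: filter (drop) the frame.
        return []
    return [egress]


ports = ["Po1", "Fa0/10"]                              # hypothetical port names
print(forward({}, ports, "Po1", "aa.bb"))              # flooded, but not back to Po1
print(forward({"aa.bb": "Po1"}, ports, "Po1", "aa.bb"))  # filtered
```

In this model a frame that enters through any member of Po1 can never leave through Po1 again, whether it is flooded or known unicast.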

I may perform a lab experiment tomorrow and get back with the results but I do not expect to learn anything new (although I should not be so biased).

Best regards,

Peter

Peter

Logic dictates that a frame received by a link in an Etherchannel bundle must not be sent back through any of the links constituting the bundle.

That was my thinking as well, but I wondered if it could happen due to some peculiarities of Etherchannel. I am struggling to see how a loop where packets are literally transmitted back and forth would happen because of this Etherchannel setup. Unfortunately, not having any switches, I am unable to verify this sort of thing at the moment.

Jon


Peter Paluch
Cisco Employee

Edison, Jon and everybody,

I think I've got it. Please bear with me while I explain my thoughts.

As we have discussed here, a pure Etherchannel between just two switches, with one switch being configured for Etherchannel and the other not, may result in frames being reflected (bounced) back but not in frames going around in circles. A frame received on a member port of an Etherchannel bundle will not be forwarded back through any member port of the same bundle, so the loop cannot be formed that way. I have tested this in our lab: I connected two switches together with two links, with one switch configured for Etherchannel and the other not. No broadcast storm occurred as a result, although I intentionally flooded the topology with broadcast traffic (a video stream sent to a broadcast IP address). As soon as I stopped transmitting the stream, the topology went silent - no frames got caught in a loop, and no network collapse ensued whatsoever.

What I realized is that the loop must be formed by an additional redundant link in the topology that somehow - when combined with a misconfigured Etherchannel - results in multiple paths between switches, thus forming a loop. The networks we have seen collapse under Etherchannel misconfiguration were always more redundant than just a single Etherchannel between switches. I began to suspect that there must be additional redundancy present in the network so that the loop can be formed.
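
The two-switch, two-link case can be modelled with a short Python sketch. It is a simplified model under stated assumptions, not a reproduction of the lab: the port names (`Fa0/10`, `Fa0/2`, `Fa0/3`), the link labels and the "always pick the first member link" choice are hypothetical, and STP is left out entirely so that only the flooding rule is exercised.

```python
from collections import deque

# Logical port -> physical links. DS bundles both links; AS does not.
SWITCHES = {
    "DS": {"Po1": ["link1", "link2"], "Fa0/10": ["host"]},
    "AS": {"Fa0/2": ["link1"], "Fa0/3": ["link2"]},
}
# Which switch sits at the far end of each inter-switch link.
PEER = {("DS", "link1"): "AS", ("DS", "link2"): "AS",
        ("AS", "link1"): "DS", ("AS", "link2"): "DS"}


def flood(switch, ingress_link):
    """Links a broadcast leaves on: one member per logical port,
    never the logical port the frame came in on."""
    ports = SWITCHES[switch]
    ingress_port = next(p for p, links in ports.items() if ingress_link in links)
    return [links[0] for p, links in ports.items() if p != ingress_port]


def simulate(max_hops=1000):
    """Inject one broadcast at DS from the host; log inter-switch transmissions."""
    log = []
    queue = deque([("DS", "host")])
    while queue and len(log) < max_hops:
        switch, ingress = queue.popleft()
        for link in flood(switch, ingress):
            peer = PEER.get((switch, link))
            if peer is not None:  # the link leads to another switch
                log.append((switch, link))
                queue.append((peer, link))
    return log


transmissions = simulate()
print(transmissions)
```

In this model the broadcast crosses to AS once and is bounced back to DS once, after which DS has no other logical port towards AS to flood it on, so the queue drains and nothing circulates.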

I have therefore analyzed the simplest scenario fulfilling this requirement - adding a third standalone link between the two switches. To better visualize the concept, please have a look at the exhibit Example1.png I have attached. There are two switches - the Distribution Switch (DS) and the Access Switch (AS). The standalone link is the Fa0/1. Furthermore, DS has ports Fa0/2-3 bundled in an Etherchannel while AS has no Etherchannel configured. The DS is configured as STP Root. What will now happen?

  1. Because DS is STP Root, all its ports (Fa0/1 and Po1) are Designated Forwarding.
  2. AS will receive BPDUs via Fa0/1 and via exactly one of the links of the Etherchannel from DS. Let's assume that the link is Fa0/2.
  3. AS will declare Fa0/1 to be its Root port (the lowest sender port ID) and Fa0/2 as Alternate Discarding.
  4. However, AS receives no BPDUs on Fa0/3. Therefore, it declares Fa0/3 as Designated Forwarding.

And voila! - we have a loop here - two links completely unblocked and forwarding: Fa0/1 and Fa0/3. A single broadcast now starts the usual broadcast storm that I have been striving for for so long!

And then it all suddenly began making sense. The true loop with frames endlessly circulating is not actually created by the presence of the misconfigured Etherchannel itself but rather by the modified operation of STP over an Etherchannel bundle - that the BPDUs for a particular VLAN are sent through a single link in the entire bundle. All other bundled ports that do not carry BPDUs can be mistakenly considered as eligible for Designated Forwarding by the switch with a missing/misconfigured Etherchannel, and that forms the basis of the actual loops.
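
The role assignment on AS in the Example1 topology can be sketched in Python as follows. This is a deliberately reduced model, not a full STP implementation: it assumes equal path costs on all AS ports so that the tie-break is the sender port ID, it assumes DS sends the Po1 BPDU on the first member link, and the port ID values `(128, 1)` and `(128, 65)` are hypothetical.

```python
# Physical cabling: (DS physical port, AS physical port).
LINKS = [("Fa0/1", "Fa0/1"), ("Fa0/2", "Fa0/2"), ("Fa0/3", "Fa0/3")]

# DS logical ports: name -> (STP port ID, member physical ports).
# The port ID values are made up for illustration.
DS_LOGICAL = {
    "Fa0/1": ((128, 1), ["Fa0/1"]),
    "Po1": ((128, 65), ["Fa0/2", "Fa0/3"]),
}


def bpdus_received_by_as():
    """DS (the Root) sends each logical port's BPDU on ONE member link only.
    Returns AS port -> sender port ID of the BPDU received there."""
    received = {}
    for port_id, members in DS_LOGICAL.values():
        carrier = members[0]  # assumption: the first member carries the BPDU
        as_port = next(a for d, a in LINKS if d == carrier)
        received[as_port] = port_id
    return received


def as_roles():
    """Roles on the unbundled AS: best BPDU -> Root, other BPDU receivers ->
    Alternate, ports that hear no BPDU at all -> Designated."""
    received = bpdus_received_by_as()
    root_port = min(received, key=received.get)  # lowest sender port ID wins
    roles = {}
    for _, as_port in LINKS:
        if as_port == root_port:
            roles[as_port] = "Root Forwarding"
        elif as_port in received:
            roles[as_port] = "Alternate Discarding"
        else:
            roles[as_port] = "Designated Forwarding"
    return roles


roles = as_roles()
forwarding = [p for p, r in roles.items() if r.endswith("Forwarding")]
print(roles)
print("forwarding ports on AS:", forwarding)
```

With two AS ports forwarding towards the same neighbour, and DS forwarding on both Fa0/1 and Po1, the model ends up with the two unblocked parallel paths described above.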

I have subsequently analyzed a rather common scenario with two distribution layer switches and an access switch connected to both distribution switches. Please first see Example2.png. DS1 is configured as STP Root and DS2 is configured as STP Secondary Root. In this exhibit, the AS has the ports Fa0/1-2 unconfigured by mistake (or not configured yet). Assuming that the bundle on AS towards DS2 becomes the root port (an Etherchannel has a lower cost than individual links, so with enough links of an appropriate speed in the Etherchannel, this may happen by default), one of the ports Fa0/1-2 on AS becomes Alternate Discarding (because it receives BPDUs from DS1) and the other becomes Designated Forwarding as it receives no BPDUs. A loop is thus formed.

Example3.png depicts another common scenario with ports Fa0/5-6 unbundled on AS. DS1 is again STP Root and DS2 is STP Secondary Root. Here, the bundle on AS towards DS1 will be declared Root port, and because DS2 has a lower BID than AS (it is Secondary Root), its bundle towards AS will be declared as Designated Forwarding. Again, AS will declare one of the ports Fa0/5-6 as Alternate Discarding and the other as Designated Forwarding, and here we have the loop again.

I assume this is actually what brought down the networks with misconfigured Etherchannels.

I have experimentally tested all three scenarios in our lab and I have been able to easily create the broadcast storm in each of these cases. Furthermore, I have not deactivated the STP Etherchannel Misconfig Guard. Even with this guard left active, there was absolutely no problem in creating these loops as described earlier. The reason is that this guard basically reacts to the arrival of BPDUs sent from differing MAC addresses on ports bundled in an Etherchannel, which is not expected in a correct configuration. However, in my particular topology, each bundle consisted of two links. Whenever AS declared one of these links as Alternate Discarding and the second as Designated Forwarding, the Etherchannel received BPDUs only via the Forwarding link and so the EC Misconfig Guard had no reason to kick in. It would probably be different if my bundles consisted of at least 3 links, but for two-link bundles this guard is helpless (which is logical - it performs only local decisions and so has very limited information).

Phew! I would like to thank EVERYONE who has joined this thread so far and helped me to finally resolve this mystery!
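
The guard behaviour described above can be modelled in a few lines of Python. This is only a sketch of the logic as stated in this post (react when BPDUs from differing source MAC addresses arrive on one bundle), not the actual implementation; the function names, port names and MAC addresses are hypothetical, and the three-link outcome is the guess made above rather than a tested result.

```python
def bpdu_sources_on_bundle(as_roles, as_port_macs):
    """Source MACs of the BPDUs that reach the bundle on the DS side.
    Only Designated ports on the unbundled AS send BPDUs."""
    return {as_port_macs[port] for port, role in as_roles.items()
            if role == "Designated"}


def guard_trips(source_macs):
    """The modelled guard fires only if more than one distinct sender
    shows up on the same bundle."""
    return len(source_macs) > 1


# Hypothetical per-port MAC addresses on AS.
MACS = {
    "Fa0/5": "00:00:00:00:00:05",
    "Fa0/6": "00:00:00:00:00:06",
    "Fa0/7": "00:00:00:00:00:07",
}

two_links = {"Fa0/5": "Alternate", "Fa0/6": "Designated"}
three_links = {"Fa0/5": "Alternate", "Fa0/6": "Designated", "Fa0/7": "Designated"}

print(guard_trips(bpdu_sources_on_bundle(two_links, MACS)))
print(guard_trips(bpdu_sources_on_bundle(three_links, MACS)))
```

In the two-link case the bundle hears a single sender, so the modelled guard stays quiet; with a third unbundled link there would be two Designated ports and therefore two distinct senders.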

Best regards,

Peter


44 Replies


Hi Edison,

Thanks so much for your reply.

In a 2 switch topology, Switch A (which has the bundle) will flood the packet out of the members of the bundle expecting Switch B to receive the packet on its bundle.

However, Switch B will only receive the packet on one of its physical interfaces and can potentially flood it back to Switch A out of its other connected physical interface, causing the STP loop.

Exactly! And this is precisely what I wrote in my first post. But my doubts go further, and I will repost them here for clarity:

The frame arrived back at switch A, which has the bundle configured, and the frame's destination is still unknown. However, from the viewpoint of the switch, the frame came in through a particular Port-channel interface. If this switch floods the frame, it will flood it through all remaining ports except the port through which the frame came in, meaning that the frame will never be sent back through the Port-channel.

Is this assumption correct?

Best regards,

Peter

You are correct but the frame that was sent originally came back via the same interface - in essence that's a loop.

As you stated, it won't send the frame back out, but take into account a network with tons of unknown unicast/multicast/broadcast traffic.

Hello Edison,

You are correct but the frame that was sent originally came back via the same interface - in essence that's a loop.

Certainly, I agree with this. But this kind of loop does not cause the same broadcast frame to loop forever. Yet, we have seen networks being brought to their knees by misconfigured Etherchannels. A broadcast frame hitting a switch twice and being dropped afterwards, pardon me... I don't believe it can cause such a massive outage.

So what happens in such instances?

Best regards,

Peter

Peter,

Packet duplication in a large network can certainly bring any switch to its knees. The cases may be different for each EC misconfiguration. It depends on the network state. I've seen networks with an EC misconfiguration running for a while until heavy traffic traverses those links.

Say you deploy multicast on that segment after those switches have been misconfigured for a while, and all hell breaks loose. You would attribute the problem to the multicast deployment, while the culprit was a bad design to begin with.

Edison,

I get the point. I am now skating on thin ice, as I am talking about something I believe I have seen in the past but am not entirely sure about: with a misconfigured Etherchannel, an obvious storm ensued, with the switches indicating heavy traffic on all their LEDs (blinking wildly). That would mean the frames received by a Port-channel interface were reflooded back, which is actually the reason behind this entire thread. Have you had a similar experience?

Best regards,

Peter

Peter,

I haven't seen the case where packets are being flooded back via the EC - if that was your intended question. However, I wouldn't rule out this scenario, which could be attributed to a software defect along with a poor design implementation.

You can find a detailed article about this behavior at this link:

http://www.ciscozine.com/static-port-channel-layer2-loop/

 

Cheers,

Fabio

Hi Fabio,

Nice article you have.

I see you were studying from the CCIE R&S v5.0 Official Cert Guide Vol. 1. I hope you liked it! I wrote the Chapter 3 on STP and EtherChannel in that book, and when I was writing about the loop formation on a misconfigured EtherChannel, I reused a lot of knowledge from this very thread.

Best regards,
Peter

Thanks to you, Peter! Chapter 3 is very cool and I recommend it to everyone!

Congrats!!

Best regards,

Fabio

Hi, why is looping not present in an Etherchannel? Please give an explanation.

Edison,

Still one more comment, though:

Packet duplication in a large network can certainly bring any switch to its knees.

Agreed. But the switch having the Etherchannel configured should actually not be bothered so much by the traffic "reflected" back from the opposite unconfigured switch - it would merely drop it, which is performed in hardware as one of the basic switching functions. Even the unconfigured switch should not be loaded significantly more - it is in the same position as a switch receiving broadcast/multicast traffic on a port and replicating the flow on the remaining ports. Neither of these switches is actually going to process and forward any such frame twice.

Best regards,

Peter

We need to take into account where the STP Root may be located as well.

Do either of you guys know what would happen if a switch received a packet on a port-channel interface and the destination MAC was pointing back out the same po1 interface, e.g.

po1 = gi0/1 - 4

The switch receives a packet with dst MAC aa.bb on gi0/2 and has recorded in its CAM table that aa.bb is reachable via po1. Would it simply select one of its links and send the packet back out?

Jon
