09-01-2004 12:57 AM - edited 03-02-2019 06:09 PM
Hi,
I've had a problem with a port channel consisting of 2x10GE on a 7600 with a Sup720-3BXL. After I removed VLANs from one link of a working port channel (from the member port, not the port-channel interface; the VLANs were not configured on either end of the channel), the channel detected a config inconsistency and shut down the affected link. When this happened, the second port in the port channel did not take over the traffic from the downed first port, which left the router unresponsive. The channel is not using PAgP or LACP.
My questions are (in a working environment):
1. Is it normal behaviour for a port channel to stop working when one of its ports is forced down because of a misconfiguration? As far as I know, the second port should take over the load without a hitch. The port-channel interface never went down.
2. What would happen if the port-channel config becomes inconsistent between the two switches? Would the channel go down?
I have found these discussions:
and
To me these indicate that EtherChannel is not suited to a high-volume environment, since a misconfiguration on either end brings down the channel. They do not discuss the behaviour of channels in a production environment, though.
Any help is appreciated.
Thanks,
Niels
09-01-2004 05:18 AM
Hi Niels,
Port channels should normally be very straightforward. For a correct statement it would be better to see the config. Now to your questions:
1: It can be normal behaviour for the port channel to go down when the configuration is wrong. When you configure a port channel, it is always a good idea to shut down the interface before you change the config.
2: When the channel is inconsistent between two switches, the channel goes down.
You should try the configuration with PAgP, because through this protocol the channel members exchange the capabilities of the interfaces. Furthermore, configure the physical interfaces with the interface range command, so you can be sure that the config is the same on all interfaces.
And here is the complete statement from the configuration guide on CCO:
"...When EtherChannel interfaces are configured improperly, they are disabled automatically to avoid network loops and other problems. To avoid configuration problems, observe these guidelines and restrictions:
.....
For Layer 2 EtherChannels:
-Assign all LAN ports in the EtherChannel to the same VLAN or configure them as trunks.
-If you configure an EtherChannel from trunking LAN ports, verify that the trunking mode is the same on all the trunks. LAN ports in an EtherChannel with different trunk modes can operate unpredictably.
-An EtherChannel supports the same allowed range of VLANs on all the LAN ports in a trunking Layer 2 EtherChannel. If the allowed range of VLANs is not the same, the LAN ports do not form an EtherChannel.
-LAN ports with different STP port path costs can form an EtherChannel as long as they are compatibly configured with each other. If you set different STP port path costs, the LAN ports are not incompatible for the formation of an EtherChannel.
-An EtherChannel will not form if protocol filtering is set differently on the LAN ports."
Regards
Peter
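The guideline quoted above, that all member ports must share the same trunk mode and allowed VLAN range, can be checked mechanically before a change is applied to a live channel. The following is only a minimal Python sketch of that comparison. The `MemberPort` record, the `find_mismatches` helper, and the port names and VLAN numbers are all hypothetical illustrations; nothing here reads a real device or reflects how IOS performs its own consistency check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemberPort:
    """Hypothetical record of the trunk settings on one channel member."""
    name: str
    trunk_mode: str
    allowed_vlans: frozenset


def find_mismatches(ports):
    """Return the names of ports whose settings differ from the first port."""
    if not ports:
        return []
    ref = ports[0]
    return [
        p.name
        for p in ports[1:]
        if p.trunk_mode != ref.trunk_mode or p.allowed_vlans != ref.allowed_vlans
    ]


# Illustrative data: VLAN 30 has been removed from the second member only.
ports = [
    MemberPort("Te1/1", "trunk", frozenset({10, 20, 30})),
    MemberPort("Te1/2", "trunk", frozenset({10, 20})),
]
print(find_mismatches(ports))  # prints ['Te1/2']
```

A port reported here is one that, per the quoted guideline, would not form an EtherChannel with the others.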
09-01-2004 06:39 AM
Thanks for your reply Peter.
They used to be straightforward... :-)
By "...When EtherChannel interfaces", do they mean the port-channel interface or each separate interface in the port channel? When I had the issue, the port I removed the VLANs from went down and with it the whole router: it locked up completely and caused quite a bit of havoc (RP usage went to 100%, and it lost all its OSPF adjacencies and BGP sessions). It seems to me that shutting down one port in the channel, with the other failing to assume the traffic, caused the SP and RP to be overloaded with traffic without a destination.
The phrase "If the allowed range of VLANs is not the same, the LAN ports do not form an EtherChannel" can be interpreted in more than one way, and does not address channels already in production. Does it refer to all the ports in the channel or to the two ports in either switch? In my case the port (not the channel) went down, but the effect was that of the whole channel being down, which is not a case of "working as designed". The failing port had a load of no more than 100 Mb.
Thanks,
Niels
09-02-2004 01:33 AM
Hi Niels,
I ran some tests in our lab. When the configuration on switch 1 port 1 and port 2 (for example) is not the same, only one port forms the channel. On switch 2, both ports are still in the channel. This situation is not very good. Before you configure one port, shut the corresponding ports on both switches (switch 1 port 2 and switch 2 port 2, for example). If you misconfigure one port and the load on the RP becomes very high, you should consider a problem with the software.
Regards
Peter
09-03-2004 03:27 AM
Hi Peter,
Thanks for your info. This seems to be what happened: switch 1 downs a port in the channel, and switch 2 doesn't notice and keeps sending some packet flows into a black hole, causing OSPF/LDP/BGP to reconverge indefinitely, hence the high CPU usage.
To me this makes port channel a technology with a BIG problem. The port on switch 2 should logically shut down at the same time as the port on switch 1, so that this "asynchronous" state does not arise.
Cheers,
Niels
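The one-sided state described above can be pictured with a toy model. This is a minimal Python sketch under loud assumptions: the modulo "hash" in `pick_link`, the link names, and the integer flow IDs are invented for illustration and are not the load-balancing algorithm the Sup720 actually uses. It only shows why flows that the sender still maps onto the suspended link are lost while the reverse direction keeps working.

```python
def pick_link(flow_id, links):
    """Toy load balancing: map a flow onto one link the sender believes is bundled."""
    return links[flow_id % len(links)]


def delivered(flow_id, sender_links, receiver_links):
    """A frame arrives only if the receiver still has the chosen link in its bundle."""
    return pick_link(flow_id, sender_links) in receiver_links


switch2_view = ["link1", "link2"]  # switch 2 still bundles both ports
switch1_view = ["link2"]           # switch 1 has suspended its end of link1

flows = range(8)

# Switch 2 -> switch 1: every flow hashed onto link1 is dropped.
lost = [f for f in flows if not delivered(f, switch2_view, switch1_view)]
print(lost)  # prints [0, 2, 4, 6]

# Switch 1 -> switch 2: switch 1 only uses link2, so nothing is lost.
print(all(delivered(f, switch1_view, switch2_view) for f in flows))  # prints True
```

In this model half of the flows from switch 2 disappear while the port-channel interface itself stays up on both sides, which is consistent with routing protocols flapping rather than failing over cleanly.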
09-03-2004 04:17 AM
If you disable one link of a channel manually, then both ends notice, and there is no interruption in service. I suppose, from what you are both saying, that when a link is disabled because of a configuration mismatch, layer 1 stays up, which is why the other end does not detect the failure.
Here is a thought: if you had UDLD (UniDirectional Link Detection) enabled, would it have been more stable? UDLD is specifically designed to detect those rare situations where layer 1 is still up but layer 2 has failed.
Kevin Dorrell,
Luxembourg
09-05-2004 09:58 AM
Hi,
UDLD is a nice feature. But UDLD does not solve the problem; it only reacts to the error.
Regards
Peter
09-05-2004 11:09 PM
Sure, I understand that it does not solve the underlying problem. But by detecting the error and reacting to it, would UDLD avoid the instabilities that he is describing? Would UDLD allow the network to handle the error in a more controlled way?
Kevin Dorrell
Luxembourg.
09-05-2004 11:47 PM
Hi Kevin,
The port with the inconsistent VLAN config (the error was a VLAN mask mismatch) went down, and a message was written to the log indicating that the port went down physically. The first message was "%LINEPROTO-5-UPDOWN: Line protocol on Interface TenGigabitEthernetx/x, changed state to down" and the second was "%LINEPROTO-SP-5-UPDOWN: Line protocol on Interface TenGigabitEthernetx/x changed state to down". So I would assume that the port channel on both sides should see the suspended port as down, not just down on one side. But this assumption has proved wrong according to the test that Peter did in his lab, which also explains the issues I saw in production. The question is whether this should be considered acceptable and normal (and undocumented or vaguely documented) behaviour, or whether it is a design issue with the protocol.
I've been considering the UDLD possibility as well, but I haven't been able to test UDLD with EtherChannel in a lab environment.
Thanks,
Niels
09-06-2004 01:24 AM
Hi Niels,
Unfortunately I am off shift this week, so I can't run any tests myself. But I will ask my colleague; perhaps he can do it. I think this is a design issue, and also that you configured the channel wrong. Now you only need to see whether UDLD can react to the configuration mismatch.
Regards
Peter
09-06-2004 03:34 AM
Thanks Peter,
You're right there. I messed up the wrong port and discovered a single point of failure in the design of a node in the network. Now I'm trying to remedy this and make sure not to make the same mistake again. It would be interesting to see how EtherChannel reacts to UDLD putting a port in errdisable.
Thanks,
Niels
09-06-2004 04:36 AM
Hi Niels,
My colleague ran the test this morning. With UDLD enabled he saw no reaction. He tried both udld enable and udld aggressive. The port still remained suspended. This was a result I did not expect. I think the best way really is not to make the same mistake again ;-))
Regards
Peter
09-06-2004 01:20 AM
You describe the function of UDLD correctly. I would recommend running a test in the lab. Normally I would expect UDLD to avoid the instabilities, but it would do so by setting the port to errdisable, and I am quite sure he doesn't want that. When you have an error in your network, it's always better to find and repair the error than to react to it. Nevertheless, it's good to enable UDLD in your network.
Regards
Peter