08-29-2007 12:04 AM - edited 03-05-2019 06:09 PM
Hello all,
I've experienced a strange problem with a customer of ours. The setup for this customer uses 2 core 3750 L2/L3 switches with several VLANs and HSRP groups (on SVI interfaces), all configured on these two 3750s. Between the two 3750s there is a 4 Gb ether-channel. Everything was working as designed.
The problem started yesterday when a faulty fibre patch cable caused a link flap on one link on one side of the ether-channel. The port got err-disabled. That was not a major priority because the other 3 links were still operating normally. However, around the same time our customer started complaining about inter-VLAN communication problems.
During troubleshooting I noticed that one of the VLANs got split (the HSRP status for that VLAN was active on both 3750s). So one HSRP group wasn't able to communicate across the ether-channel. All the other HSRP groups were operating normally.
At this point I started suspecting it had something to do with the err-disabled link. So after swapping the cable I re-enabled this link. As soon as the link was operational again, the communication problems were gone and the HSRP communication started working again.
Now as far as I can tell this behaviour isn't normal. One err-disabled link within a multilink ether-channel should cause communication problems for 15 to 20 min. It almost seems as if the switch with the err-disabled link was still trying to use this link within the ether-channel. I have already consulted the bug database and release notes but couldn't find anything related to this problem.
Has anyone seen this kind of problem, or does anyone have an explanation for why this happened?
Setup details: 2 x 3750-24TS, IOS 12.2(35)SE1
Ether-channel: 4 Gb dot1q trunk (mode: on / load-balance: src-mac), links through CWDM SFPs. All links may carry all the VLANs.
Many thanks,
Dennis
08-29-2007 12:52 AM
Hi Dennis,
Is this a unidirectional link failure? In that condition this kind of problem can happen, because the port channel load-shares each flow onto one of the member links, and traffic will be lost if that link has failed in one direction.
Regards
Nambi
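To illustrate why a single bad member link could break only one HSRP group, here is a rough Python sketch of per-source-MAC link selection. It is a toy model under stated assumptions, not the 3750's real hashing algorithm: the hash (low 3 bits of the last MAC byte), the link names, and the group numbers 1 to 4 are all made up for illustration. It also assumes HSRP version 1, where the group's virtual MAC is 0000.0c07.acXX, and that the active router sources its hellos from that virtual MAC.

```python
from typing import List, Sequence


def hsrp_v1_virtual_mac(group: int) -> str:
    """HSRP version 1 virtual MAC: 0000.0c07.acXX, XX = group number in hex."""
    if not 0 <= group <= 255:
        raise ValueError("HSRP v1 group must be in the range 0..255")
    return f"0000.0c07.ac{group:02x}"


def pick_link(src_mac: str, links: Sequence[str]) -> str:
    """Pick a member link from the source MAC only (src-mac load balancing).

    ASSUMPTION: illustrative hash, not the switch's actual algorithm.
    """
    last_byte = int(src_mac.replace(".", "")[-2:], 16)
    bucket = last_byte & 0x7          # 8 hash buckets
    return links[bucket % len(links)]


def affected_groups(groups: Sequence[int], links: Sequence[str],
                    dead_link: str) -> List[int]:
    """HSRP groups whose hellos would be hashed onto the dead link."""
    return [g for g in groups
            if pick_link(hsrp_v1_virtual_mac(g), links) == dead_link]


# Hypothetical member links; "link-2" is the one with the faulty cable.
ALL_LINKS = ["link-1", "link-2", "link-3", "link-4"]
DEAD = "link-2"

# Case 1: the sender still hashes over all four links (dead one included).
print(affected_groups([1, 2, 3, 4], ALL_LINKS, DEAD))

# Case 2: the dead link has been removed from the bundle on the sender.
healthy = [link for link in ALL_LINKS if link != DEAD]
print(affected_groups([1, 2, 3, 4], healthy, DEAD))
```

In this model, every frame from a given source MAC always takes the same member link, so if one side keeps a dead link in its hash set, only the sources that hash onto that link lose connectivity while everything else looks healthy. Once the link is properly removed from the bundle, no source is mapped to it anymore.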
08-29-2007 12:59 AM
Hi Nambi,
No, I don't think it's a unidirectional link failure. As one side got err-disabled, the other side had a down status, so this link wouldn't normally be used anymore by the ether-channel.
Regards
Dennis
08-29-2007 01:29 AM
I made a typo, obviously it should be
"err-disabled link within a multilink ether-channel shouldN'T cause communication problems for 15 to 20 min."
08-29-2007 04:24 AM
It would have been interesting to see whether it recovered if you had administratively shut down the other side. That would have told you whether you had a unidirectional link, which it almost sounds like. If you aren't using UDLD you should be, and that would eliminate anything like this.
08-29-2007 04:31 AM
I agree with Glen, and UDLD aggressive mode is the right choice to avoid this kind of problem.
08-29-2007 04:41 AM
Glen/Nambi,
Thanks for your responses so far. Let me do some checking on UDLD aggressive mode to see if it can prevent similar problems in the future.
Dennis
08-29-2007 04:33 AM
Guys,
I'm not convinced that UDLD aggressive mode will prevent this from happening in the future.
I was already using UDLD on all the links. The link got err-disabled because of a link flap, not because of a UDLD detection event. The other side of the link did go down when the err-disable event happened.
Even if that side hadn't gone down directly, UDLD was enabled, and once a UDLD event was detected it would have shut down the port after several seconds.
Dennis