01-05-2018 08:24 AM - edited 03-08-2019 01:19 PM
We recently got two 2960s to replace in-service, late-model 3750s that are directly connected between remote locations over direct fiber. I have not been able to get the link to stay up: the ports flap and go err-disabled immediately. If I use the same fiber and connect a 2960 to the remote-end 3750, it's fine. I swapped in different new optics and still no dice. Both 2960s are running the latest version, 15.2(6)E. It seems like an IOS bug, and I have a support ticket in progress. I have never seen a problem like this. Has anyone come across it?
01-05-2018 11:18 AM
Thank you for this information, Stephan.
Usually, an err-disable condition due to "link-flap" means that either the local or the remote switch port bounced a number of times within a very short period. As a protection mechanism against network instability (in terms of STP, for example), the switch puts the port into the err-disabled state to stop any further flapping.
Most of the time this points to a Layer 1 issue, so the troubleshooting methodology would be as follows:
- Make sure the fiber is clean and if possible clean it/replace it.
- Make sure any media distributor/patch panel in between is clean and healthy too.
- Move the connection from the local site to a different port. See if it comes up.
- Move the connection from the remote site to a different port. See if it comes up.
- Use different transceivers (same model), in the local site and then in the remote site.
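For the Layer 1 checks above, a few show commands can help confirm what the optics and the port are reporting. This is only a sketch: the interface name is a placeholder, and the transceiver readings assume the installed optics support digital optical monitoring (DOM), which not all SFPs do.

```
! Placeholder interface; substitute the uplink port on your switch.
! Optical Tx/Rx power and alarm thresholds (only if the SFP supports DOM)
show interfaces GigabitEthernet0/51 transceiver detail
! Error counters (FCS, alignment, and so on) on the same port
show interfaces GigabitEthernet0/51 counters errors
! Product IDs of the installed transceivers, to confirm both ends match
show inventory
```

Comparing the receive power on each end against the thresholds in the same output is usually a quick way to tell a dirty or marginal fiber from a problem on the switch itself.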
Once the issue is isolated to a switch (which, per the case notes, it looks like you have already done by following the steps above), further troubleshooting can happen at that switch.
Going forward, the case notes show that the issue seems to be related to bug CSCvg04687. Although I am aware the switch was upgraded to 15.2(6)E, it looks like the bug is not fully fixed in that release. The current TAC engineer already appears to be in talks with our internal teams for confirmation, so I would let them continue working on it and tell you whether another release carries the fix.
As an interim workaround, you may want to change the err-disable link-flap timers to be less aggressive, at least until there is feedback from that TAC engineer.
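A minimal sketch of what that could look like, assuming the standard errdisable recovery commands are available on this platform and release. The 60-second interval is only an illustrative value, not a recommendation from TAC. Note that this makes the port recover automatically after a link-flap err-disable; it does not change how quickly the switch detects the flapping, and it does not fix the underlying bug.

```
configure terminal
 ! Let ports that were err-disabled for link-flap come back automatically
 errdisable recovery cause link-flap
 ! Seconds to wait before re-enabling the port (illustrative value)
 errdisable recovery interval 60
end
! Verify which causes have recovery enabled and the timer in effect
show errdisable recovery
! Show the flap count and time window that trigger err-disable
show errdisable flap-values
```

If the link keeps flapping, the port will simply cycle between recovering and being err-disabled again, so keep an eye on the logs while this is in place.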
I apologize for not being able to help further.
Hope this helps.
Eduardo.
01-05-2018 09:35 AM
Good day,
When the link goes into the err-disabled state, can you provide the output of "show interfaces status err-disabled", please?
Also, it would be great to have the output of "show interfaces Gi x/y/z" for each of the ports involved.
Thank you,
Eduardo.
01-05-2018 10:39 AM
01-05-2018 03:11 PM
Post the complete output of the command "sh controll e Gi0/51".
What is the end-to-end distance of the fibre?
If the err-disable is caused by a Layer 1 issue, I don't see evidence of it in the output of "sh interface Gi0/51".
01-16-2018 06:12 AM
TAC pointed to documented bug behavior: CSCvg04687. My solution was to replace one of the new 2960L units with an 11-year-old 3750. Basically, just subscribe to the bug tracker for CSCvg04687 and wait for a fix in 15.2(6)E2. ¯\_(ツ)_/¯