03-08-2023 02:58 AM
I experienced a major network outage last weekend:
The uplink ports of many access switches went into the err-disabled state. Each access switch connects to the core switch through a port-channel.
When I checked the logs, I found:
1. The faulty access switches logged the following messages:
ETHCNTR-3-LOOP_BACK_DETECTED and PM-4-ERR_DISABLE (loopback error).
2. The core switch connects to many access switches, but only the switches running the following versions had their uplink ports err-disabled:
Switch Ports Model              SW Version   SW Image
------ ----- -----              ----------   ----------
*    1 52    WS-C2960X-48LPS-L  15.0(2a)EX5  C2960X-UNIVERSALK9-M

Switch Ports Model              SW Version   SW Image
------ ----- -----              ----------   ----------
*    1 52    WS-C2960X-48FPS-L  15.0(2)EX5   C2960X-UNIVERSALK9-M
The unaffected switches run 15.2(2)E6 or 12.2(55).
Question: I don't know whether this fault is related to bug CSCur05027, because the bug description mentions only 15.0(02)EX01, not 15.0(2a)EX5 or 15.0(2)EX5.
Supplementary fault information:
The failure happened suddenly over the weekend, and no one modified the switch configuration.
The failed switches are spread across different physical locations.
I hope someone can help me find the cause. I'm very confused. Thanks!
03-08-2023 03:04 AM - edited 03-08-2023 03:05 AM
Per the messages, it looks like there was an STP loop somewhere.
What are your core switch model and software version?
Do the err-disabled switches and the unaffected ones use the same Layer 2 EtherChannel configuration to the core switch?
What does the configuration look like? Do the switches on the newer versions have errdisable recovery enabled?
Note: also check for any keepalive configuration issue.
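As a rough sketch of that keepalive check on one of the affected access switches: the interface names GigabitEthernet1/0/49 - 50 below are hypothetical, since the thread does not say which ports are members of the port-channel. Turning keepalives off on uplinks with `no keepalive` is a commonly cited workaround for loopback err-disable, but whether it applies here is an assumption to test, not a confirmed fix.

```
! List the ports that are currently err-disabled and the reason
show interfaces status err-disabled
!
! Check whether keepalives are set on an uplink member port
! (hypothetical interface name)
show interfaces GigabitEthernet1/0/49 | include Keepalive
!
! Possible workaround to test: stop sending keepalives on the uplink members
configure terminal
 interface range GigabitEthernet1/0/49 - 50
  no keepalive
 end
```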
03-08-2023 03:12 AM
The model and version of the core switch are:
Cisco C6807-XL Version 15.3(04r)SYS
On the core switch, the port-channel and its physical ports have keepalive set by default. On the access switch, the port-channel also has keepalive set by default, but the physical ports show "keepalive not set".
I'm not sure whether it's related to CSCur05027, because that bug doesn't mention 15.0(2a)EX5. However, the failure happened suddenly, and only the switches running 15.0(2a)EX5 and 15.0(2)EX5 failed; the other access switches connected to the core switch through port-channels were unaffected.
03-08-2023 03:17 AM
If possible, upgrade the code to the suggested release, or apply the workaround (you may be hitting a bug).
Also compare the configuration of a working switch against a non-working one, make changes as appropriate for your environment, and test.
If this affects only one or two devices, you can apply the quick fix as suggested. If it affects many devices, you need to find the root cause.
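A minimal sketch of what a quick fix could look like on a single affected switch, assuming the loop condition itself is gone. The interface range is hypothetical, and automatic recovery only brings the ports back; it does not address whatever triggered the loopback detection.

```
configure terminal
 ! Bring the err-disabled uplinks back manually (hypothetical interface names)
 interface range GigabitEthernet1/0/49 - 50
  shutdown
  no shutdown
 exit
 ! Optionally let the switch recover loopback err-disabled ports by itself
 errdisable recovery cause loopback
 ! Retry timer in seconds (300 is the IOS default)
 errdisable recovery interval 300
 end
!
! Confirm which causes have automatic recovery enabled
show errdisable recovery
```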