Re: Cisco 9200L link flap

PSPICT · ‎02-04-2025

Hi there,

I have a stack of three C9200L-48P-4G switches. They are on 17.12.02, as I upgraded them after encountering what I thought was bug CSCwc41288.

After the upgrade the switches ran fine for months, but the observed behavior has now come back, yet the counters for the issue haven't incremented.

At present, we observe link down/up events on the fibre uplink on this switch. On the other switch we do not see a link flap, but we do see loop guard block/unblock events, as this particular route is currently a redundant link.

I have changed every fibre patch lead at both ends, replaced the SFP modules and checked the light levels at both ends. I tried a different uplink port, but still no joy, and I even tried a different fibre pair between the two buildings.

Any ideas?

Flavio Miranda · ‎02-04-2025

@PSPICT

Another bug probably.

marce1000 · ‎02-04-2025

- How do you define 'the counters for the issue'? Do you also check all (error) counters?
Also look into the latest advisory release: https://software.cisco.com/download/home/286320038/type/282046477/release/Dublin-17.12.4 . If the issue is that persistent, it becomes worthwhile to take that step (upgrading to the latest advisory release) too.

M.

-- Each morning when I wake up and look into the mirror I always say ' Why am I so brilliant ? '
The mirror will then always respond to me with ' The only thing that exceeds your brilliance is your beauty! '

Joshqun Ismayilov · ‎02-04-2025

Hi @PSPICT
You can inspect the interface statistics for CRC errors, drops, or flaps:
#show interfaces gigabitEthernet x/x/x counters errors
#show interfaces gigabitEthernet x/x/x transceiver detail

If you see high CRC errors, it could still indicate a cabling or SFP issue despite the replacements.
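Since a single reading only shows lifetime totals, it can help to take two readings some time apart and compare them. A minimal Python sketch of that comparison follows. The counter names are the ones mentioned in this thread, the values are made up for illustration, and collecting or parsing the switch output is left out, so `counter_deltas` is a hypothetical helper rather than anything Cisco provides.

```python
from typing import Dict


def counter_deltas(before: Dict[str, int], after: Dict[str, int]) -> Dict[str, int]:
    """Return only the counters that increased between two snapshots.

    A counter missing from the first snapshot is treated as 0.
    A counter that went down (for example after 'clear counters')
    is ignored rather than reported as a negative increase.
    """
    deltas = {}
    for name, new_value in after.items():
        old_value = before.get(name, 0)
        if new_value > old_value:
            deltas[name] = new_value - old_value
    return deltas


# Illustrative values only, not taken from a real switch.
snapshot_1 = {"CRC": 10, "FcsErr": 3, "SymbolErr": 0}
snapshot_2 = {"CRC": 25, "FcsErr": 3, "SymbolErr": 4}

for name, increase in counter_deltas(snapshot_1, snapshot_2).items():
    print(f"{name} increased by {increase}")
```

If nothing is printed between two snapshots that bracket a flap, the flap is probably not being caused by frames arriving damaged on that port.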

Thanks!

vishalbhandari · ‎02-04-2025

@PSPICT It sounds like you've done thorough troubleshooting already. Since the issue persists despite replacing fiber patch leads, SFPs, and using different ports and fiber pairs, it might be worth checking for potential hardware issues on the switch itself, like a faulty uplink ASIC. Also, review the spanning-tree configuration, especially around loop guard settings, as they may be reacting to inconsistent BPDUs. Check for any high CPU usage or process spikes that could affect link stability. Lastly, consider enabling detailed logging (with debugs if safe to do so) to catch any subtle errors not reflected in the usual counters.
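If the logs are exported to a syslog server, a small script can show how often each interface actually goes down. The sketch below is a rough Python example that assumes the usual `%LINK-3-UPDOWN` message text; the timestamps and interface names in the sample lines are invented for illustration and are not from this thread, and `count_link_downs` is a hypothetical helper name.

```python
import re
from collections import Counter
from typing import Iterable

# Matches the interface name and new state in a LINK-3-UPDOWN message.
LINK_EVENT = re.compile(
    r"%LINK-3-UPDOWN: Interface ([^,\s]+), changed state to (up|down)"
)


def count_link_downs(lines: Iterable[str]) -> Counter:
    """Count 'changed state to down' events per interface."""
    downs = Counter()
    for line in lines:
        match = LINK_EVENT.search(line)
        if match and match.group(2) == "down":
            downs[match.group(1)] += 1
    return downs


# Invented sample lines, for illustration only.
sample_log = [
    "Feb  4 10:15:01: %LINK-3-UPDOWN: Interface GigabitEthernet1/1/1, changed state to down",
    "Feb  4 10:15:04: %LINK-3-UPDOWN: Interface GigabitEthernet1/1/1, changed state to up",
    "Feb  4 11:42:37: %LINK-3-UPDOWN: Interface GigabitEthernet1/1/1, changed state to down",
    "Feb  4 11:42:40: %LINK-3-UPDOWN: Interface GigabitEthernet1/1/1, changed state to up",
]

for interface, count in count_link_downs(sample_log).most_common():
    print(f"{interface}: {count} link-down events")
```

Comparing these counts and their timestamps with CPU or spanning-tree events from the same period may show whether the flaps line up with anything else on the switch.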

lunpi · ‎04-28-2025

Hi!
We have a similar issue on 17.12.4 on at least 10 switches with the same image at different locations. It appeared after the image upgrade. We are using the same cables, SFPs and so on as before the upgrade.

HW: 9200L 24/48 -P
SW: 17.12.4 (upgraded from 17.9.4)
SFP: Prooptix SFP-MMD/SMD/GE
Errors observed: Input errors / CRC / LPGviolation / FcsErr / SymbolErr

We are now running some of the affected switches on 17.9.6a and the error counters are no longer increasing. (And of course the end-user experience has improved.)

Leo Laohoo · ‎04-28-2025

@lunpi wrote:
We have a similar issue on 17.12.4 on at least 10 switches with the same image at different locations. It appeared after the image upgrade.
We are now running some of the affected switches on 17.9.6a and the error counters are no longer increasing.

I have never heard of a behaviour like this before. Has TAC been engaged?

lunpi · ‎04-29-2025

We have no support contract on the access switches (due to cost and the number of switches, ~1.5k), so unfortunately we are unable to open a TAC case.

Leo Laohoo · ‎04-29-2025

This issue is uncannily similar to CSCwo13276, which I am hitting now.

MHM Cisco World · ‎04-29-2025

Please make a new post and I will reply.

MHM

Joseph W. Doherty · ‎04-29-2025

Any ideas?

Shooting from the hip: you mentioned that after the IOS upgrade the problem disappeared for months. That makes me wonder, because all kinds of issues can develop after a device has been running for some time, due to a lack of free memory or to the size of the free memory blocks (i.e. fragmentation), either of which can be caused by a very slow memory leak.

Short term fix, reload. Long term fix, elimination of cause.

Free memory analysis might be done, first, to see if this is a possible issue.
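One rough way to do that analysis is to record free memory at regular intervals and look at the trend. The Python sketch below fits a straight line through such samples; the sample values are invented for illustration, `free_memory_slope` is a hypothetical helper, and it only looks at total free memory, so it says nothing about fragmentation.

```python
from typing import List, Tuple


def free_memory_slope(samples: List[Tuple[float, float]]) -> float:
    """Least-squares slope of free memory over time.

    Each sample is (time, free_memory). The result is in
    memory units per time unit; a steadily negative value
    over weeks is consistent with a slow leak.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must cover more than one point in time")
    cov = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    return cov / var_t


# Invented samples: (days since reload, free memory in MB).
samples = [(0, 900.0), (30, 840.0), (60, 780.0), (90, 720.0)]

slope = free_memory_slope(samples)
print(f"Free memory trend: {slope:.1f} MB per day")
```

A flat trend would argue against the memory theory, while a steady decline would make a scheduled reload a reasonable short-term workaround.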