02-13-2019 07:13 AM - edited 02-27-2019 10:51 AM
I have exactly replicated the problem described in CSCuz48487. I have two ASR1001-X routers, with a 3850 in the middle.
[ASR1001-X(L3 te0/0/1.101 WAN-MACSEC)]-----[(trunk) 3850 (trunk)]-----[(L3 te0/0/1.101 WAN-MACSEC)ASR1001-X]
I am using WAN MACSEC with dot1q encapsulation and tag-in-the-clear between the ASRs.
When MACSEC is enabled, I get packet loss on the 3850. I can see IGR_MISC_FATAL_ERROR increase as the packets are lost (show plat fwd drop exceptions), just as described in CSCuz48487.
If I simply turn MACSEC off between the ASRs, the packet loss resolves instantly. If I turn MACSEC back on, the packet loss immediately returns. This problem is very easy to reproduce.
Hardware and software:
cisco ASR1001-X : asr1001x-universalk9.16.06.04.SPA.bin
WS-C3850-48P-S: multiple IOS versions, listed below.
16.09.02 (Fuji - latest): Approximately 2%-8% packet loss occurs, just as described in CSCuz48487
16.06.05 (Everest - latest): Approximately 2%-8% packet loss occurs, just as described in CSCuz48487
16.03.07 (Denali - latest): Approximately 2%-8% packet loss occurs, just as described in CSCuz48487
3.7.5E (Catalyst - latest): Approximately 2%-8% packet loss occurs, just as described in CSCuz48487
3.6.9E (Catalyst - latest): NO PACKET LOSS OCCURS. The release notes specifically say that CSCuz48487 was fixed, and I can confirm that it was.
In my tests, the only thing I changed was the code version. Nothing else.
The bug was identified in the 3.6.x train and fixed there. It is still present in every current 16.x train I tested, as well as in 3.7.x. At the moment the only workaround is to downgrade to 3.6.x. I really don't want to do this, but it is presently my only option.
Please expand the scope of CSCuz48487 and re-open it. My account does not have the ability to report this through TAC, so I am reporting it here and hoping for the best.
03-08-2019 05:12 AM
Can we get some eyes on this? I too have experienced this issue.
03-08-2019 05:39 AM
07-22-2019 11:53 PM
I've observed this same behaviour with MACsec over AToM pseudowires with Cat9500s running IOS XE 16.9.3. I've raised a TAC case and will report back on the fix (presumably a code fix to port the XE 3.6.x fix into 16.9.x).
08-19-2019 09:06 PM
I've just closed the TAC case. The advice I received was that my scenario is not the same as CSCuz48487, although the symptoms seem to be identical. A new bug ID, CSCvq85074, was created for me; it has not yet been updated with the case notes. The conclusion was that Cleartag MACsec (DDTS ID: CSCvg73574), a feature enhancement introduced in 16.10, resolves this behaviour. I requested that the feature be back-ported to 16.9.x, but this was deemed not feasible.
Resolution for me was to upgrade to the recently released 16.12.1 and cross my fingers it's not too buggy :)
02-14-2024 01:37 AM
Did you ever find a solution to this?
My current theory is that this packet loss is caused by the MACsec replay protection window. That is, when packets arrive out of order on the MACsec link and replay protection is enabled (it is on by default, with a window size of 0), they are dropped and the IGR_MISC_FATAL_ERROR counter increases.
There could be other reasons as well, but from my testing, it seems this can be one of the causes.
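To illustrate the theory, here is a minimal Python sketch of how a replay check with a window of 0 treats reordered packets. It is a simplified model of the lower-bound check described in IEEE 802.1AE, not the actual ASR or 3850 implementation; the function name `replay_check` and the packet number sequences are made up for illustration, and it says nothing about which counter a real device increments.

```python
def replay_check(pns, window=0):
    """Return the packet numbers (PNs) a receiver would drop.

    Simplified, assumed model: a frame is discarded when its PN is
    below the lowest acceptable PN, which trails the next expected PN
    by `window`.
    """
    next_pn = 1      # next PN the receiver expects
    lowest_ok = 1    # lowest PN that is still accepted
    dropped = []
    for pn in pns:
        if pn < lowest_ok:
            # Arrived too late: outside the replay window.
            dropped.append(pn)
            continue
        if pn >= next_pn:
            # New highest PN seen: slide the window forward.
            next_pn = pn + 1
            lowest_ok = max(lowest_ok, next_pn - window)
    return dropped


# PN 3 and PN 4 swapped in transit.
arrivals = [1, 2, 4, 3, 5]
print(replay_check(arrivals, window=0))
print(replay_check(arrivals, window=2))
```

With a window of 0, the late packet (PN 3) is dropped because the receiver has already seen PN 4. With a window of 2, the same sequence passes with no drops. If this model matches what the hardware does, raising the replay window on the MACsec endpoints would be one way to test the theory.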