Iulian Vaideanu

TestNonDisruptiveLoopback failures on C6800-xP10G cards

We've been experiencing a really annoying issue with our two 6807s (each with two SUP2-2T and two C6800-32P10G modules, XL variant) ever since we got them some five years ago: every once in a while, ports start failing the TestNonDisruptiveLoopback test and traffic stops flowing properly through them. This gets especially serious when that traffic is PVST for a high-traffic VLAN, or when neighboring devices err-disable their ports because UDLD no longer operates properly.

It has happened on both routers, on all four modules, so I don't think it's a hardware failure. Also, there's a pattern: ports don't just fail randomly. If we divide the 32 ports on the module into quarters (odd 1-15, odd 17-31, even 2-16, even 18-32), each incident always affects either the 1st+2nd+5th+6th or the 3rd+4th+7th+8th ports of one quarter (for example, Te1/17+Te1/19+Te1/25+Te1/27 or Te2/6+Te2/8+Te2/14+Te2/16).
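For anyone who wants to check their own failures against this pattern, here is a minimal Python sketch. It only encodes the quarters and port groups described above; the names `QUARTERS`, `GROUPS` and `classify` are made up for illustration, and nothing here reflects how the module hardware is actually organised.

```python
# Quarters of a 32-port module, as described in the post.
QUARTERS = {
    "odd 1-15": list(range(1, 16, 2)),
    "odd 17-31": list(range(17, 32, 2)),
    "even 2-16": list(range(2, 17, 2)),
    "even 18-32": list(range(18, 33, 2)),
}

# Observed failure groups, as 0-based positions within a quarter.
GROUPS = {
    "1st+2nd+5th+6th": (0, 1, 4, 5),
    "3rd+4th+7th+8th": (2, 3, 6, 7),
}


def classify(failed_ports):
    """Return (quarter, group) if the failed port numbers match the
    observed pattern exactly, otherwise None."""
    failed = set(failed_ports)
    for quarter_name, ports in QUARTERS.items():
        for group_name, positions in GROUPS.items():
            if failed == {ports[i] for i in positions}:
                return quarter_name, group_name
    return None


if __name__ == "__main__":
    # The two examples from the post: Te1/17+19+25+27 and Te2/6+8+14+16.
    print(classify([17, 19, 25, 27]))
    print(classify([6, 8, 14, 16]))
```

Feeding it the port numbers from an incident (without the module prefix) tells you which quarter and which group were hit, or returns `None` if the set of failed ports doesn't fit the pattern.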

I found a couple of bugs that describe similar behaviour (CSCvg70513 for 6880-X-LE, and CSCvn62647 for C68xx Gigatron-based modules under heavy traffic and link flap conditions, which I think is the case here), but their resolutions didn't work for us. On the contrary: we upgraded from 15.2(1)SY* (one or two incidents a year) to 15.5(1)SY4 (one incident every 2 to 4 days). Also, I see that 15.5(1)SY3 (which is supposed to fix CSCvn62647) has a bad review that mentions exactly one of the symptoms we experience (UDLD-caused err-disable).

Has anyone else encountered this behaviour?  Any tips for solving (or at least working around) it?  Unfortunately TAC is not an option anymore...

Thank you.

Iulian Vaideanu

Just a quick update: we gave 15.5(1)SY5 a try and things have been stable for 6 weeks now.  Fingers crossed...

Hello @Iulian Vaideanu,

Thanks for your feedback; it may help someone else with similar issues.

Upgrading or downgrading IOS was the only possible choice, since the frequency of the issue had become excessive.


Best Regards



Iulian Vaideanu

Unfortunately, 25 weeks into the uptime, it happened again :(...  I'm hoping we'll get away with just a card reset.