07-05-2011 06:35 AM - edited 03-07-2019 01:05 AM
The switch failed without any notice, and there is no valid information in the syslogs to indicate what the cause may have been. I have attached the crashdump log, which includes version information. Any assistance will be appreciated.
Regards
Walter
07-05-2011 06:52 AM
I did notice this:
2y1w: %C4K_SWITCHINGENGINEMAN-4-FATALERRORINTERRUPTSEEN: Fatal NFL Error NFL 00010608 FDT1 00000000 FDT2 000048F8 FLD 00000000
Check this out. It may be of help to you.
http://www.cisco.com/en/US/docs/switches/lan/catalyst4500/release/note/OL_11511.html#wp741097
The link above directs you to a 4900 page, but as noted on that page:
Note: Although their Release Notes are unique, the 4 platforms (Catalyst 4500, Catalyst 4900, Catalyst ME 4900, and Catalyst 4900M/4948E) use the same Software Configuration Guide, Command Reference Guide, and System Message Guide.
Message was edited by: Antonio Knox
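For anyone scanning a large syslog for this message, here is a minimal Python sketch that pulls the register values out of the line quoted above. The field names (NFL, FDT1, FDT2, FLD) are taken from the log text itself; the function name `parse_fatal_nfl` is made up for illustration, and the sketch does not try to interpret what the register values mean.

```python
import re

# Pattern for the message quoted in this thread. The trailing fields are
# optional because the line is sometimes quoted in truncated form.
FATAL_NFL_RE = re.compile(
    r"%C4K_SWITCHINGENGINEMAN-4-FATALERRORINTERRUPTSEEN: Fatal NFL Error"
    r"\s+NFL (?P<NFL>[0-9A-Fa-f]{8})"
    r"(?:\s+FDT1 (?P<FDT1>[0-9A-Fa-f]{8}))?"
    r"(?:\s+FDT2 (?P<FDT2>[0-9A-Fa-f]{8}))?"
    r"(?:\s+FLD (?P<FLD>[0-9A-Fa-f]{8}))?"
)


def parse_fatal_nfl(line):
    """Return a dict of register name -> int, or None if the line does not match."""
    m = FATAL_NFL_RE.search(line)
    if m is None:
        return None
    # Each value is an 8-digit hex string in the log; convert to int.
    return {name: int(value, 16) for name, value in m.groupdict().items()
            if value is not None}


line = ("2y1w: %C4K_SWITCHINGENGINEMAN-4-FATALERRORINTERRUPTSEEN: "
        "Fatal NFL Error NFL 00010608 FDT1 00000000 FDT2 000048F8 FLD 00000000")
regs = parse_fatal_nfl(line)
print({name: hex(value) for name, value in regs.items()})
```

Lines that do not contain the message return `None`, so the function can be applied to every line of a log file and the non-matches filtered out.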
07-06-2011 12:46 AM
Hi,
The logs show the following message:
2y1w: %C4K_SWITCHINGENGINEMAN-4-FATALERRORINTERRUPTSEEN: Fatal NFL Error NFL 00010608
Looks like the IOS is hitting this bug:
CSCsl99781 - SUPV 10GE reload from Fatal NFL Error
Please confirm that you are using a Sup V.
Cheers
Sweta
P.S. Please rate the helpful answers
07-06-2011 05:25 AM
Hi
Thank you for the reply. I have a SUP IV installed:
NAME: "Linecard(slot 1)", DESCR: "Supervisor IV with 2 1000BaseX GBIC ports"
PID: WS-X4515 , VID: , SN: JAB071604ER
It may still be that the same error occurs on the supervisor I have installed, so I will look at upgrading the code, even though it has been running on this switch for some time without any issues. That said, it is only recently that any NetFlow config was added.
07-06-2011 09:33 AM
As you said, NetFlow configs were added recently.
The crash is due to data from the NetFlow card. The decodes show data being sent from the NetFlow card that triggers the crash, although this data should not cause one. The fix for bug CSCsl99781 (SUPV 10GE reload from Fatal NFL Error) changes the IOS so that it no longer crashes when it receives this specific error. With your IOS, a single one of these errors causes a crash. With the newer IOS, it needs to receive 10,000 of these errors before it crashes. The reason is that a great many of these errors can occur without indicating a hardware failure; only an extreme number of them suggests that the device is starting to show a hardware failure.
Cheers,
Sweta
P.S. Rate the answer if it helped you.
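To make the difference between the two behaviours concrete, here is a toy Python model of the description above. It is only an illustration of the threshold idea, not how the IOS is implemented; the names `should_reload`, `OLD_THRESHOLD` and `NEW_THRESHOLD` are made up, and the 10,000 figure is the one quoted in the post.

```python
# Toy model only: the real logic lives inside the IOS and is not shown in this thread.
OLD_THRESHOLD = 1        # affected IOS: the first error triggers a reload
NEW_THRESHOLD = 10_000   # fixed IOS: figure quoted in the post above


def should_reload(error_count, threshold):
    """Return True once the number of errors seen reaches the threshold."""
    return error_count >= threshold


for count in (1, 9_999, 10_000):
    print(count,
          "old:", should_reload(count, OLD_THRESHOLD),
          "new:", should_reload(count, NEW_THRESHOLD))
```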
07-06-2011 09:34 AM
After checking and doing some research, I think I can confirm that the same bug applies to the Sup IV as well.
07-07-2011 12:24 AM
Thank you for your assistance. I also have a TAC case open and will let you know if the findings are the same.
07-11-2011 03:15 AM
Root cause provided by Cisco:
Thank you for the information you have provided. The crash you experienced was due to a parity error.
The best thing to do in this scenario is to monitor the device for 72 hours. If the parity error occurs a second time, then you
have a hard parity error and the hardware should be replaced. If not, you are dealing with a soft parity error, in which case no further
action is required.
http://www.cisco.com/en/US/products/hw/routers/ps341/products_tech_note09186a0080094793.shtml
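The 72-hour rule in the reply can be written down as a small Python sketch. It assumes you have the timestamp of the first parity error and of any later ones (for example, collected from syslog); the function name `classify_parity_error` and the example timestamps are hypothetical.

```python
from datetime import datetime, timedelta

MONITOR_WINDOW = timedelta(hours=72)  # monitoring period given in the reply


def classify_parity_error(first_seen, later_events, window=MONITOR_WINDOW):
    """Return "hard" if the error recurs within the window, otherwise "soft"."""
    for ts in later_events:
        if first_seen < ts <= first_seen + window:
            return "hard"   # second occurrence: hardware should be replaced
    return "soft"           # no recurrence in the window: no further action


# Hypothetical timestamps, for illustration only.
first = datetime(2011, 7, 5, 6, 0)
print(classify_parity_error(first, []))
print(classify_parity_error(first, [first + timedelta(hours=10)]))
```

The sketch follows the advice literally, so a recurrence after the window has passed is not counted; whether that is the right call for a given device is a judgment for whoever is monitoring it.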
There is a bug fix that would prevent the crash and cause only an error message to be printed when this parity error was detected:
CSCsl99781
This bug is fixed in the releases listed below, or later:
12.2(25)EWA13
12.2(31)SGA6
12.2(46)SG
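If you need to check several switches against this list, a rough Python sketch follows. It makes a simplifying assumption that is not stated in the thread: a release is only compared with the fixed release of the same train (EWA, SGA or SG), ordered by base version and then rebuild number. For any other train it returns `None` rather than guessing. The names `parse_ios_version` and `has_fix` are made up for illustration.

```python
import re

# Fixed releases for CSCsl99781 as listed above, keyed by train:
# train -> ((major, minor, maintenance), rebuild number)
FIXED_IN = {
    "EWA": ((12, 2, 25), 13),
    "SGA": ((12, 2, 31), 6),
    "SG": ((12, 2, 46), 0),
}

# Matches strings such as 12.2(25)EWA13 or 12.2(46)SG.
VERSION_RE = re.compile(r"^(\d+)\.(\d+)\((\d+)\)([A-Z]+)(\d*)$")


def parse_ios_version(text):
    """Split an IOS version string into (base tuple, train, rebuild)."""
    m = VERSION_RE.match(text.strip())
    if m is None:
        raise ValueError(f"unrecognised IOS version: {text!r}")
    major, minor, maint, train, rebuild = m.groups()
    return (int(major), int(minor), int(maint)), train, int(rebuild or 0)


def has_fix(version):
    """Return True/False for a listed train, or None if the train is not listed."""
    base, train, rebuild = parse_ios_version(version)
    if train not in FIXED_IN:
        return None  # cannot tell from the list above
    return (base, rebuild) >= FIXED_IN[train]


for v in ("12.2(46)SG", "12.2(31)SGA5", "12.2(25)EWA14", "12.2(25)EW"):
    print(v, has_fix(v))
```

Treat a `None` result as "check the bug details or release notes", not as "not fixed".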
The switch has been running for a week now without any further incident. I plan to do the code upgrade as soon as we get some scheduled downtime.