enpingado
Beginner

4900m crash/reboot issue

I have a 4900M running IOS 12.2(53)SG2.

Over the past 6 months, the switch has rebooted itself about 5 times, most of them within the last few weeks, so it seems to be getting worse.

Dumping the log data showed this at the end:

Jawa Crash Data:

Interrupt Mask: 0xE100

Interrupt: 0x2000

Forerunner CRC Error

Is this telling me I may have a hardware failure, such as bad RAM?

13 REPLIES
Reza Sharifi
Hall of Fame Expert

You should open a ticket with TAC and send them the crash file and any other info you have. It may be a memory issue.

HTH

Leo Laohoo
VIP Community Legend

Over the past 6 months, there have been about 5 instances where the switch has rebooted itself.

Sounds like an IOS issue.

Can you attach/post the crashinfo files?

InayathUlla Sharieff
Cisco Employee

Hi,

Please send me the output of show tech, or of show ver and show platform crashdump.

Regards

Inayath

Here is the crash dump. Sorry, I have it only as scanned images.

I have a report from a second crash. It is similar to the previous dump, with a few exceptions:

Machine Check Interrupt Count: 1c9910b

L1 Instruction Cache Parity Errors: 0

L1 Instruction Cache Parity Errors (CPU30): 0

L1 Data Cache Parity Errors: 1c9910b

Jawa Crash Data:

Interrupt Mask: 0xe100

Interrupt: 0x1000

The L1 info, is that related to the CPU's L1 cache? It looks like a hardware issue, not a software issue.

      

I found this in the 12.2(54)SG release notes:

Parity errors in the CPU's cache cause IOS to crash with a crashdump file like the following:
Switch# show platform crashdump
VECTOR 0
*** CRASH DUMP ***
02/09/2009 10:10:30
Last crash: 02/09/2009 10:10:30
Build: 12.2(20090206:234053) IPBASE
buildversion addr: 13115584
MCSR: 40000000 <--- non-zero value!
.
The key pieces of data are "VECTOR 0" and a MCSR value of 40000000, 20000000, or 10000000.

Workaround: Enter the show platform cpu cache command to launch an IOS algorithm that detects and recovers from parity errors in the CPU's cache. You will obtain a running count of the number of CPU cache parity errors that have been successfully detected and corrected on a running system:

Switch# show platform cpu cache
L1 Instruction Cache: ENABLED
L1 Data Cache: ENABLED
L2 Cache: ENABLED
Machine Check Interrupts: 5
L1 Instruction Cache Parity Errors: 3
L1 Instruction Cache Parity Errors (CPU30): 1
L1 Data Cache Parity Errors: 1

CSCsx15372
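The signature quoted from the release note ("VECTOR 0" plus an MCSR value of 40000000, 20000000, or 10000000) can be checked mechanically against saved crashdump text. The following is only a minimal sketch: the function name and the sample text are made up for illustration, and it assumes the dump has been saved as plain text in the layout shown above.

```python
import re

# MCSR values the release note lists as part of the signature.
PARITY_MCSR_VALUES = {"40000000", "20000000", "10000000"}


def matches_cache_parity_signature(crashdump_text):
    """Return True if the text has a "VECTOR 0" line and a listed MCSR value.

    Hypothetical helper; it only pattern-matches the two fields named in
    the release note and does not interpret anything else in the dump.
    """
    has_vector_0 = re.search(
        r"^\s*VECTOR\s+0\s*$", crashdump_text, re.MULTILINE
    ) is not None
    mcsr = re.search(r"^\s*MCSR:\s*([0-9A-Fa-f]+)", crashdump_text, re.MULTILINE)
    has_listed_mcsr = mcsr is not None and mcsr.group(1) in PARITY_MCSR_VALUES
    return has_vector_0 and has_listed_mcsr


# Sample text abbreviated from the release-note excerpt above.
sample = """VECTOR 0
*** CRASH DUMP ***
02/09/2009 10:10:30
MCSR: 40000000 <--- non-zero value!
"""

print(matches_cache_parity_signature(sample))
```

A dump that lacks either field, or carries a different MCSR value, returns False, which says nothing about whether some other fault is present.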

I get similar output, with non-zero parity error counts. Is the workaround saying that running the command "show platform cpu cache" will fix the errors? Or is that a temporary thing related to the IOS?
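To keep a record of those counts between reboots, the output of "show platform cpu cache" can be captured as text and the error counters pulled out. This is a rough sketch under assumptions: the helper names are hypothetical, the field names are taken from the release-note sample above, and values are kept as text because the crash dump earlier in this thread shows a count like 1c9910b that is not a plain decimal number.

```python
def parse_cache_counters(output):
    """Map each "Name: value" line to its value, kept as text."""
    counters = {}
    for line in output.splitlines():
        name, sep, value = line.partition(":")
        if sep and value.strip():
            counters[name.strip()] = value.strip()
    return counters


def nonzero_error_counters(counters):
    """Keep only the error counters whose value is not all zeros."""
    return {
        name: value
        for name, value in counters.items()
        if ("Parity Errors" in name or "Machine Check" in name)
        and value.strip("0") != ""
    }


# Sample output copied from the release-note excerpt above.
sample_output = """Switch# show platform cpu cache
L1 Instruction Cache: ENABLED
L1 Data Cache: ENABLED
L2 Cache: ENABLED
Machine Check Interrupts: 5
L1 Instruction Cache Parity Errors: 3
L1 Instruction Cache Parity Errors (CPU30): 1
L1 Data Cache Parity Errors: 1
"""

for name, value in nonzero_error_counters(parse_cache_counters(sample_output)).items():
    print(f"{name}: {value}")
```

Comparing snapshots taken some days apart would show whether the counts are still climbing, which is useful detail to include in a TAC case.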

I am wondering if I should update the IOS to solve this, or if the issue is really a hardware problem.

Thanks

Hi,

I just finished analyzing your data, and it points to a hardware issue. Kindly go ahead and raise an RMA for it.

By any chance, do you see the following messages in the logs:

%C4K_L3HWFORWARDING-4-PROFILEIDMAPTABLEPARITYERROR: Parity error detected and corrected at profileIdMapTable 

HTH

Regards

Inayath

*Please rate if this info is helpful.
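If the logs are exported to a syslog server or a text file, the message asked about above can be searched for by its mnemonic. A minimal sketch follows; the function name is hypothetical, and the second sample line is an invented placeholder, not real switch output.

```python
# Mnemonic copied from the message quoted in the reply above.
PARITY_MNEMONIC = "%C4K_L3HWFORWARDING-4-PROFILEIDMAPTABLEPARITYERROR"


def find_parity_messages(log_lines):
    """Return the log lines that contain the parity-error mnemonic."""
    return [line.rstrip("\n") for line in log_lines if PARITY_MNEMONIC in line]


# First line is the message text from this thread; second is a placeholder.
sample_log = [
    "%C4K_L3HWFORWARDING-4-PROFILEIDMAPTABLEPARITYERROR: "
    "Parity error detected and corrected at profileIdMapTable\n",
    "placeholder line standing in for unrelated log output\n",
]

for hit in find_parity_messages(sample_log):
    print(hit)
```

The same function works on an open file object, since it only iterates over lines.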

So I ended up replacing the 4900M that had crashed at least 3 times with one from the lab that had never reported this issue. Now the replacement has also crashed in the same manner, rebooting itself.

Now I find it hard to believe it is a hardware issue, given that it has happened to different units.

Could this possibly be an IOS bug? I am thinking of upgrading to 15, but it is next to impossible to reproduce the problem.

Any ideas?

I have seen software parity errors cause 4900Ms to reload, and I have also seen code upgrades help. Unless you're set on moving to the 15 train, you may find that one of the later versions in your current train, like 12.2(53)SG8, runs more stably. I am typically a bit timid when it comes to the latest releases.

In short, parity issues are often software rather than hardware. When that is the case, notably when the release notes and Bug Toolkit indicate as much, a software upgrade may be beneficial.

Good luck!

Matt

Beat.Traber
Beginner

Hi

Did you ever find a solution for this?

I seem to have a similar problem: a 4900M rebooting out of the blue, no syslog entries, and a Forerunner CRC Error in the crash dump.

Beat

Leo Laohoo
VIP Community Legend

Post the output of the following commands:

1.  sh version

2.  dir