
GEIP crashed

shahaij
Level 1

Dear sir,

My router is a Cisco 7507 and the IOS image is rsp-pv-mz.121-18.1.E.

The router had a VIP crash a few days ago.

"

Aug 8 23:10:46.084: %RSP-3-ERROR: MD error 0000008000004000

-Traceback= 40404034 40404750 40404888 40405C54 403CA6D4

Aug 8 23:10:46.088: %RSP-3-ERROR: SRAM parity error (bytes 0:7) 10

-Traceback= 40404110 40404750 40404888 40405C54 403CA6D4

Aug 8 23:10:46.088: %VIP4-80 RM7000-3-MSG: slot1 VIP-3-MVIP_CYBUSERROR_INTERRUPT: A Cybus Error occurred.

Aug 8 23:10:46.476: %VIP4-80 RM7000-3-MSG: slot1 VIP-3-SVIP_RELOAD: SVIP Reload is called.

Aug 8 23:10:46.488: %VIP4-80 RM7000-3-MSG: slot1 VIP-3-SYSTEM_EXCEPTION: VIP System Exception occurred sig=22, code=0x0, context=0x608D46C8

Aug 8 23:10:48.380: %DBUS-3-CXBUSERR: Slot 1, CBus Error

Aug 8 23:10:48.380: %DBUS-3-DBUSINTERRSWSET: Slot 1, Internal Error due to VIP crash

Aug 8 23:10:48.380: %RSP-3-ERROR: End of MEMD error interrupt processing

-Traceback= 40404820 40404888 40405C54 403CA6D4

Aug 8 23:10:48.644: %CBUS-3-CMDTIMEOUT: Cmd timed out, CCB 0xF800FF20, slot 0, cmd code 2

-Traceback= 40458868 40458D54 40450608 4044E0B4 403718B8 40371AE8 4039C8D4 4039C8C0

"
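For readers who want to sort messages like these by the component that reported them, here is a minimal sketch that splits each line into timestamp, facility, severity and mnemonic. It assumes the usual `%FACILITY-SEVERITY-MNEMONIC: text` header layout seen in the log above; the names `MSG_RE`, `parse_line` and `SAMPLE` are made up for this illustration and are not part of any Cisco tool.

```python
import re

# Header layout assumed from the log above: "<timestamp>: %FACILITY-SEVERITY-MNEMONIC: text".
# The facility can itself contain hyphens and spaces (e.g. "VIP4-80 RM7000"),
# so it is matched lazily up to the first "-<digit>-" sequence.
MSG_RE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d+\s+[\d:.]+):\s+"
    r"%(?P<facility>.+?)-(?P<severity>\d)-(?P<mnemonic>[A-Z0-9_]+):\s*"
    r"(?P<text>.*)$"
)


def parse_line(line):
    """Return the message fields as a dict, or None for non-message lines
    such as '-Traceback= ...'."""
    match = MSG_RE.match(line.strip())
    return match.groupdict() if match else None


# Lines copied from the log in the post.
SAMPLE = [
    "Aug 8 23:10:46.088: %RSP-3-ERROR: SRAM parity error (bytes 0:7) 10",
    "-Traceback= 40404110 40404750 40404888 40405C54 403CA6D4",
    "Aug 8 23:10:46.476: %VIP4-80 RM7000-3-MSG: slot1 VIP-3-SVIP_RELOAD: SVIP Reload is called.",
    "Aug 8 23:10:48.380: %DBUS-3-CXBUSERR: Slot 1, CBus Error",
]

if __name__ == "__main__":
    for entry in filter(None, map(parse_line, SAMPLE)):
        print(entry["timestamp"], entry["facility"], entry["mnemonic"], sep=" | ")
```

Read this way, the first messages in the sequence come from the RSP facility, and the VIP and DBUS messages follow within the same second or two.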

My RSP did not crash at that time, so I think the problem was caused by the VIP itself. I know sig=22 indicates a fatal hardware error.

My question is: could there be another reason for the crash, such as an SRAM problem on the RSP, or the VIP having too little memory?

Here is the 'show diag' output for that module:

"

Slot 1:

Physical slot 1, ~physical slot 0xE, logical slot 1, CBus 0

Microcode Status 0x4

Master Enable, LED, WCS Loaded

Board is analyzed

Pending I/O Status: None

EEPROM format version 1

VIP4-80 RM7000 controller, HW rev 2.02, board revision A0

Serial number: 24986590 Part number: 73-3143-04

Test history: 0x00 RMA number: 00-00-00

Flags: cisco 7000 board; 7500 compatible

EEPROM contents (hex):

0x20: 01 22 02 02 01 7D 43 DE 49 0C 47 04 00 00 00 00

0x30: 50 17 EF 00 00 00 00 00 00 00 00 00 00 00 00 00

Slot database information:

Flags: 0x4 Insertion time: 0x3940 (5d01h ago)

Controller Memory Size: 64 MBytes CPU SDRAM, 64 MBytes Packet SDRAM

PA Bay 0 Information:

Gigabit-Ethernet PA(Dual-Wide), 1 ports

EEPROM format version 4

HW rev 0.02, Board revision A0

Serial number: 12689803 Part number: 73-4520-03

3 crashes since restart.

Last crash context (Aug 10 2003 09:25:58):

Nevada Error Interrupt Register = 0x1

CYASIC Error Interrupt Register = 0x0

CYASIC Other Interrupt Register = 0x80

QE HIGH Priority Interrupt

Unknown CYA oisr bit 0x00000080

QE TX HIGH Priority Interrupt

CYBUS Error Register = 0xE02B544, PKT Bus Error Register = 0x0

$0 : 00000000, AT : 607A0000, v0 : 00000000, v1 : 0000000E

a0 : 00000000, a1 : 07581FC0, a2 : 00000000, a3 : 00000000

t0 : 00000000, t1 : 00000000, t2 : 34004000, t3 : FFFF00FF

t4 : 000000F8, t5 : 4E80000F, t6 : 00000400, t7 : 00000000

s0 : 60850000, s1 : 6084E500, s2 : 80007D40, s3 : 00000000

s4 : 00000001, s5 : 00000000, s6 : 60850000, s7 : 00000000

t8 : 600FF6C8, t9 : 00000000, k0 : 60A3AAC0, k1 : 00000200

gp : 607AE560, sp : 80007D28, s8 : 00000000, ra : 6010A0A0

EPC : 600ECF10, ErrorEPC : 80008680, SREG : 3400FF03

MDLO : 812FB715, MDHI : D2C85F27, BadVaddr : 68040422

Cause 00000000 (Code 0x0): Interrupt exception

Traceback= 0x600ECF10 0x6010A0A0

--Boot log begin--

Cisco Internetwork Operating System Software

IOS (tm) VIP Software (SVIP-DW-M), Version 12.1(18.1)E, EARLY DEPLOYMENT MAINTENANCE INTERIM SOFTWARE

TAC Support: http://www.cisco.com/tac

Copyright (c) 1986-2003 by cisco Systems, Inc.

Compiled Tue 28-Jan-03 08:09 by hqluong

Image text-base: 0x60010BF0, data-base: 0x60400000

--Boot log end--

"

Sir, can you give me a hint about this error?

Thank you so much!

Sha, Haijiang

4 Replies

beng
Level 1

Hi, Haijiang,

Your problem is not on the VIP but on the RSP.

The router experienced a Parity Error in the SRAM on the RSP.

This is the explanation for the error message:

An MD error on the RSP can report bad parity from any of these sources:

MC control parity

RP parity

SRAM parity

QA parity

Cybus0 parity

Cybus1 parity

Of these, the first four indicate bad parity found on the RSP, and the last two indicate bad parity from another card.
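The split described above can be written down as a small lookup. This is only a sketch that restates the list in this reply; it is not an official decoder, the names `ON_RSP`, `FROM_OTHER_CARD` and `parity_origin` are hypothetical, and it assumes the source name appears literally in the message text, as "SRAM parity" does in the log from the original post.

```python
# Parity sources as listed in the reply: the first four point at the RSP,
# the last two at a card on the CyBus.
ON_RSP = ("MC control parity", "RP parity", "SRAM parity", "QA parity")
FROM_OTHER_CARD = ("Cybus0 parity", "Cybus1 parity")


def parity_origin(message_text):
    """Classify a parity message by where the bad parity was found.

    Assumption: the source name appears verbatim (case-insensitively)
    in the message text.
    """
    text = message_text.lower()
    if any(name.lower() in text for name in ON_RSP):
        return "RSP"
    if any(name.lower() in text for name in FROM_OTHER_CARD):
        return "another card"
    return "unknown"


if __name__ == "__main__":
    # Message text taken from the log in the original post.
    print(parity_origin("SRAM parity error (bytes 0:7) 10"))
```

For the message in this thread the lookup returns "RSP", which matches the conclusion that the parity error was found on the RSP rather than on the VIP.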

The most probable cause of a parity error is a transient failure in DRAM. As a result, the following approach is recommended for dealing with a crash attributed to a processor memory parity error (PMPE) **when console logs or other error information is unavailable**.

A. Reseat the DRAM. Customer dynamics may preclude this option.

B. Replace the DRAM.

If the error recurs, replace the RSP, since that addresses five of the error sources (DRAM, SRAM, processor, ASICs, manufacturing defect).

Hope this info helps.

Regards,

/Bessie

Cisco Systems Inc.

Bessie,

Thank you for your answer!

Do you mean reseating the DRAM on the VIP module?

And are the five error sources all on the RSP?

I previously opened a case for a similar problem. It was opened on 08-MAY-2003 22:50:13 PST, and the case owner is danilewi@cisco.com. He said that because the RSP had not crashed but the VIP had a fatal hardware error, I should replace the VIP.

I doubted Daniel's conclusion at the time, and lately my customer has had many VIP crashes, so I came here to try my luck.

Your conclusion seems to be different from his. If you want to read the case I mentioned, please email me and I can send you the case number. My email address is shahaij@cn.ibm.com

Thank you so much!

Regards,

Haijiang

Haijiang,

I cannot judge others' troubleshooting. Based on these error messages, at least, the problem is the RSP. It's the DRAM on the RSP.

Regards,

/Bessie

Bessie,

That's OK. Do you have a URL where I can find an article that describes this error clearly?

Thank you!

Regards,

Haijiang
