Cat 6509 reboot

Hi, today our Cat6509-E reloaded. Here is the show version output:

Cisco IOS Software, s72033_rp Software (s72033_rp-IPBASE-M), Version 12.2(33)SXI4a, RELEASE SOFTWARE (fc2)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2010 by Cisco Systems, Inc.
Compiled Fri 16-Jul-10 19:51 by prod_rel_team

ROM: System Bootstrap, Version 12.2(17r)SX7, RELEASE SOFTWARE (fc1)

RUDSCYUZC040 uptime is 11 hours, 0 minutes
Uptime for this control processor is 11 hours, 0 minutes
Time since RUDSCYUZC040 switched to active is 10 hours, 59 minutes
System returned to ROM by s/w reset at 08:20:03 YUS Thu Mar 1 2012 (SP by bus error at PC 0x4112D2B4, address 0x51CE5D44)
System restarted at 08:22:32 YUS Thu Mar 1 2012
System image file is "sup-bootdisk:s72033-ipbase-mz.122-33.SXI4a.bin"
Last reload reason: Unknown reason


cisco WS-C6509-E (R7000) processor (revision 1.5) with 983008K/65536K bytes of memory.
Processor board ID SMC1432000M
SR71000 CPU at 600Mhz, Implementation 0x504, Rev 1.2, 512KB L2 Cache
Last reset from s/w reset
4 Virtual Ethernet interfaces
96 FastEthernet interfaces
75 Gigabit Ethernet interfaces
2 Ten Gigabit Ethernet interfaces
1917K bytes of non-volatile configuration memory.
8192K bytes of packet buffer memory.

65536K bytes of Flash internal SIMM (Sector size 512K).
Configuration register is 0x2102


In the crashinfo file from bootflash I see these messages:

107310: Mar  1 2012 08:19:57.948: %C6K_PLATFORM-2-PEER_RESET: RP is being reset by the SP

%Software-forced reload


08:19:57 YUS Thu Mar 1 2012: Breakpoint exception, CPU signal 23, PC = 0x411C6EDC

-Traceback= 411C6EDC 411C4A30 40A06394 40A07A18 40D92F64 40D930BC 411B9CF4

$0 : 00000000, AT : 438A0000, v0 : 45290000, v1 : 00000000

a0 : 46B4B2F0, a1 : 00008100, a2 : 00000000, a3 : 00000000

t0 : 411B9E38, t1 : 34008101, t2 : 411B9E60, t3 : FFFF00FF

t4 : 411B9E38, t5 : 500122E8, t6 : 00000000, t7 : BCDEEE22

s0 : 00000000, s1 : 435A0000, s2 : 00000000, s3 : 43510000

s4 : 43510000, s5 : 43510000, s6 : 42AF0000, s7 : 00000001

t8 : 5001234C, t9 : 00000000, k0 : 00000000, k1 : 00000000

gp : 4389D5D8, sp : 50012418, s8 : 42AF0000, ra : 411C4A30

EPC  : 411C6EDC, ErrorEPC : 40E24428, SREG     : 34008103

MDLO : 00000000, MDHI     : 00000000, BadVaddr : 00000000

DATA_START : 0x433A3E10

Cause 00000824 (Code 0x9): Breakpoint exception

========= Context ==============================================================

s72033_rp Software (s72033_rp-IPBASE-M), Version 12.2(33)SXI4a, RELEASE SOFTWARE (fc2)
Technical Support: http://www.cisco.com/techsupport
Compiled Fri 16-Jul-10 19:51 by prod_rel_team
Signal = 23, Code = 0x24, Uptime 1y0w
$0 : 00000000, AT : 438A0000, v0 : 45290000, v1 : 00000000
a0 : 46B4B2F0, a1 : 00008100, a2 : 00000000, a3 : 00000000
t0 : 411B9E38, t1 : 34008101, t2 : 411B9E60, t3 : FFFF00FF
t4 : 411B9E38, t5 : 500122E8, t6 : 00000000, t7 : BCDEEE22
s0 : 00000000, s1 : 435A0000, s2 : 00000000, s3 : 43510000
s4 : 43510000, s5 : 43510000, s6 : 42AF0000, s7 : 00000001
t8 : 5001234C, t9 : 00000000, k0 : 00000000, k1 : 00000000
gp : 4389D5D8, sp : 50012418, s8 : 42AF0000, ra : 411C4A30

In the crashinfo file from sup-bootdisk I see:

Cache error detected!
  CPO_ECC     (reg 26/0): 0x00000084
  CPO_CACHERI (reg 27/0): 0x20000000
  CP0_CAUSE   (reg 13/0): 0x00000C00

Real cache error detected.  System will be halted.

Error: Primary instr cache, fields: data,
Actual physical addr 0x00000000,
virtual address is imprecise.

Imprecise Data Parity Error

Imprecise Data Parity Error

08:19:57 YUS Thu Mar 1 2012: Interrupt exception, CPU signal 20, PC = 0x4112D2B4

-Traceback= 4089B5AC 4112D2B4
$0 : 00000000, AT : 443A0000, v0 : 4F4E967D, v1 : D2F914FD
a0 : 51CEA278, a1 : 923EF593, a2 : 43B59168, a3 : 43B60000
t0 : 51CE81AC, t1 : 538C712C, t2 : 538C7128, t3 : 538C7124
t4 : 538C7120, t5 : 538C711C, t6 : 538C7118, t7 : 538C7114
s0 : 000004E0, s1 : FFFFFFFF, s2 : 00000000, s3 : 51CE5CC4
s4 : 51CE5D24, s5 : 0000000D, s6 : 51CE57E4, s7 : 43BB0000
t8 : 538C716C, t9 : 00000000, k0 : 4813998C, k1 : 408EAE50
gp : 4263F918, sp : 51CEA288, s8 : 51CE5844, ra : 4112D2B4
EPC  : 4089B5AC, ErrorEPC : 4112D2B4, SREG     : 3400FF05
MDLO : 51EB851F, MDHI     : 00000000, BadVaddr : 00000000
DATA_START : 0x4231AB60
Cause 00000000 (Code 0x0): Interrupt exception

========= Context ==============================================================

s72033_sp Software (s72033_sp-IPBASE-M), Version 12.2(33)SXI4a, RELEASE SOFTWARE (fc2)
Technical Support: http://www.cisco.com/techsupport
Compiled Fri 16-Jul-10 20:18 by prod_rel_team
Signal = 20, Code = 0x20000000, Uptime 1y0w
$0 : 00000000, AT : 443A0000, v0 : 4F4E967D, v1 : D2F914FD
a0 : 51CEA278, a1 : 923EF593, a2 : 43B59168, a3 : 43B60000
t0 : 51CE81AC, t1 : 538C712C, t2 : 538C7128, t3 : 538C7124
t4 : 538C7120, t5 : 538C711C, t6 : 538C7118, t7 : 538C7114
s0 : 000004E0, s1 : FFFFFFFF, s2 : 00000000, s3 : 51CE5CC4
s4 : 51CE5D24, s5 : 0000000D, s6 : 51CE57E4, s7 : 43BB0000
t8 : 538C716C, t9 : 00000000, k0 : 4813998C, k1 : 408EAE50
gp : 4263F918, sp : 51CEA288, s8 : 51CE5844, ra : 4112D2B4

In the show tech-support output I see:

107310: Mar  1 2012 08:19:57.948: %C6K_PLATFORM-2-PEER_RESET: RP is being reset by the SP

%Software-forced reload


08:19:57 YUS Thu Mar 1 2012: Breakpoint exception, CPU signal 23, PC = 0x411C6EDC

-Traceback= 411C6EDC 411C4A30 40A06394 40A07A18 40D92F64 40D930BC 411B9CF4

$0 : 00000000, AT : 438A0000, v0 : 45290000, v1 : 00000000

a0 : 46B4B2F0, a1 : 00008100, a2 : 00000000, a3 : 00000000

t0 : 411B9E38, t1 : 34008101, t2 : 411B9E60, t3 : FFFF00FF

t4 : 411B9E38, t5 : 500122E8, t6 : 00000000, t7 : BCDEEE22

s0 : 00000000, s1 : 435A0000, s2 : 00000000, s3 : 43510000

s4 : 43510000, s5 : 43510000, s6 : 42AF0000, s7 : 00000001

t8 : 5001234C, t9 : 00000000, k0 : 00000000, k1 : 00000000

gp : 4389D5D8, sp : 50012418, s8 : 42AF0000, ra : 411C4A30

EPC  : 411C6EDC, ErrorEPC : 40E24428, SREG     : 34008103

MDLO : 00000000, MDHI     : 00000000, BadVaddr : 00000000

DATA_START : 0x433A3E10

Cause 00000824 (Code 0x9): Breakpoint exception

========= Context ==============================================================

s72033_rp Software (s72033_rp-IPBASE-M), Version 12.2(33)SXI4a, RELEASE SOFTWARE (fc2)
Technical Support: http://www.cisco.com/techsupport
Compiled Fri 16-Jul-10 19:51 by prod_rel_team
Signal = 23, Code = 0x24, Uptime 1y0w
$0 : 00000000, AT : 438A0000, v0 : 45290000, v1 : 00000000
a0 : 46B4B2F0, a1 : 00008100, a2 : 00000000, a3 : 00000000
t0 : 411B9E38, t1 : 34008101, t2 : 411B9E60, t3 : FFFF00FF
t4 : 411B9E38, t5 : 500122E8, t6 : 00000000, t7 : BCDEEE22
s0 : 00000000, s1 : 435A0000, s2 : 00000000, s3 : 43510000
s4 : 43510000, s5 : 43510000, s6 : 42AF0000, s7 : 00000001
t8 : 5001234C, t9 : 00000000, k0 : 00000000, k1 : 00000000
gp : 4389D5D8, sp : 50012418, s8 : 42AF0000, ra : 411C4A30

***************************************************
****** Information of Last System Crash - SP ******
***************************************************


Using bus error at PC 0x4112D2B4, address 0x51CE5D44.

%Error opening sup-bootdisk:bus error at PC 0x4112D2B4, address 0x51CE5D44 (File not found)

Can you help me answer a couple of questions?

1. Where is the problem: software or hardware?

2. Why are there two crashinfo files?

Accepted Solutions

nkarpysh
Cisco Employee

Hi Konstantin,

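The message

%C6K_PLATFORM-2-PEER_RESET: RP is being reset by the SP

means that the SP crashed first, so the valid crashinfo is the one in sup-bootdisk. The RP was then reset by the SP and wrote its own crashinfo to bootflash, which is why there are two files. The SP crashinfo says: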

Cache error detected!
  CPO_ECC     (reg 26/0): 0x00000084
  CPO_CACHERI (reg 27/0): 0x20000000
  CP0_CAUSE   (reg 13/0): 0x00000C00
Real cache error detected.  System will be halted.
Error: Primary instr cache, fields: data,
Actual physical addr 0x00000000,
virtual address is imprecise.
Imprecise Data Parity Error
Imprecise Data Parity Error

So this is a cache parity error.
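As a rough illustration of the reading above, here is a minimal Python sketch that scans crashinfo text for the two markers quoted in this thread: the PEER_RESET message and the cache error banner. The function name `summarize_crashinfo` and the returned keys are hypothetical, and the matching is plain string search on the lines shown here, not an official crashinfo decoder.

```python
import re

# Marker strings copied from the logs quoted in this thread.
PEER_RESET = re.compile(
    r"%C6K_PLATFORM-2-PEER_RESET: (RP|SP) is being reset by the (RP|SP)"
)
CACHE_ERROR = "Cache error detected!"
PARITY = "Parity Error"


def summarize_crashinfo(text):
    """Return a small dict describing what a crashinfo text shows.

    Hypothetical helper: it only looks for the messages quoted in the thread.
    """
    summary = {"reset_target": None, "reset_by": None, "cache_parity": False}
    match = PEER_RESET.search(text)
    if match:
        # group(1) is the processor that was reset, group(2) the one that reset it.
        summary["reset_target"] = match.group(1)
        summary["reset_by"] = match.group(2)
    if CACHE_ERROR in text and PARITY in text:
        summary["cache_parity"] = True
    return summary
```

Fed the bootflash file, this would report that the RP was reset by the SP; fed the sup-bootdisk file, it would flag the cache parity error, which matches the interpretation that the SP crash came first.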

A parity error can happen in any type of RAM, regardless of the manufacturer. There are two kinds of parity errors:

Soft parity errors
These errors occur when an energy level within the chip (for example, a one or a zero) changes, most often due to radiation. When referenced by the CPU, such errors cause the system to crash. In case of a soft parity error, there is no need to swap the board or any of the components.

Hard parity errors
These errors occur when there is a chip or board failure that corrupts data. In this case, you need to re-seat or replace the affected component, which usually involves a memory chip swap or a board swap.

At the first occurrence it is not possible to distinguish between soft and hard parity errors. From experience, most parity occurrences are soft parity errors, and you can usually dismiss them. Studies have shown that soft parity errors are 10 to 100 times more frequent than hard parity errors. Therefore, Cisco highly recommends that you wait for a second parity error on that particular affected component before you replace anything. This greatly reduces the impact on your network.
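To put the "10 to 100 times" figure in perspective, here is a small worked calculation in Python. It assumes, as a simplification not stated in the thread, that the frequency ratio can be read directly as the share of soft errors among all parity errors; the function name `soft_share` is hypothetical.

```python
def soft_share(ratio):
    """Fraction of parity errors that are soft when soft errors are
    `ratio` times as frequent as hard ones (simplifying assumption)."""
    return ratio / (ratio + 1)


# The two ends of the range quoted above.
for ratio in (10, 100):
    print(f"soft:hard = {ratio}:1 -> {soft_share(ratio):.1%} soft")
```

Under that assumption, roughly 91% to 99% of first-time parity errors would be soft, which is why replacing hardware after a single event is usually premature.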

To learn more about parity errors, please see the following CCO documents:
https://www.cisco.com/en/US/products/hw/routers/ps341/products_tech_note09186a0080094793.shtml
http://www.cisco.com/en/US/products/hw/routers/ps167/products_tech_note09186a0080094340.shtml

So if this is the first occurrence, just monitor, as it is usually transient. If it is seen again within a short period, replace the Supervisor.
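The monitor-then-replace rule can be sketched as a small Python helper. This is only an illustration: the thread does not define "short period", so the 30-day `REPEAT_WINDOW` is a placeholder assumption, and the function name `recommend` and the component label "SP" are hypothetical.

```python
from datetime import datetime, timedelta

# Assumption: "short period" is not defined in the thread; 30 days is a placeholder.
REPEAT_WINDOW = timedelta(days=30)


def recommend(events, component, window=REPEAT_WINDOW):
    """Suggest an action for one component.

    events: list of (datetime, component_name) parity error records.
    Returns "no action", "monitor", or "replace".
    """
    times = sorted(t for t, c in events if c == component)
    if not times:
        return "no action"
    # Two consecutive errors on the same component inside the window
    # suggest a hard error rather than a transient one.
    for earlier, later in zip(times, times[1:]):
        if later - earlier <= window:
            return "replace"
    return "monitor"


# The single crash from this thread: first occurrence, so keep monitoring.
events = [(datetime(2012, 3, 1, 8, 19, 57), "SP")]
print(recommend(events, "SP"))
```

Choose the window to fit your own maintenance policy; the point is only that the decision is made per component and needs at least two events.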

Hope it helps,

Nik

Hi Nikolay,

We are now monitoring the equipment and plan to upgrade to Release 12.2(33)SXI8.

Thank you for your detailed response to my questions.

Konstantin
