
NCS540 Reload

Hi everyone, can someone help me understand these syslog messages on an NCS540 (N540-28Z4C-SYS-A)?

After the messages below were logged, in this sequence, the NCS reloaded automatically without anyone triggering the reload.

shelfmgr[185]: %PLATFORM-CPA_INTF_SHELFMGR-3-HW_FAULT_RECOVERY : node0_RP0_CPU0: SEU Correctable error(intsts:0x18, sem_status:0x145)
processmgr[51]: %OS-SYSMGR-6-INFO : Prepared RMF to reboot

processmgr[51]: %MGBL-SCONBKUP-6-INTERNAL_INFO : Reload debug script successfully spawned

Accepted Solutions

smilstea
Cisco Employee

The device rebooted because a self-recovery mechanism was invoked to rectify SEU correctable errors. An SEU (Single Event Upset) is a "soft" parity error event, which is typically transient or random in nature and usually occurs only once, as a result of an environmental disruption of the memory data.
These errors are not caused by a hardware malfunction.
Research has shown that the majority of single-event (or "soft") errors in memory chips occur as a result of background radiation (chiefly neutrons from cosmic rays), electrostatic discharge (ESD), or electromagnetic interference (EMI), which may randomly change the electrical state of one or more memory cells or interfere with the circuitry used to read and write them.
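To make the idea of a "soft" parity error concrete, here is a minimal Python sketch of how a single parity bit exposes a one-bit upset in a stored word. It is only an illustration of the general principle: the word value, the bit position, and the function names are made up for the example, and this is not a description of the actual error-detection logic inside the NCS 540.

```python
def even_parity(word: int) -> int:
    """Return the parity bit that makes the total count of 1 bits even."""
    return bin(word).count("1") % 2


def flip_bit(word: int, position: int) -> int:
    """Simulate a single event upset by inverting one bit of the word."""
    return word ^ (1 << position)


def parity_ok(word: int, parity: int) -> bool:
    """Check a word against the parity bit that was stored alongside it."""
    return even_parity(word) == parity


# Hypothetical memory word, stored together with its parity bit.
stored_word = 0b10110010
stored_parity = even_parity(stored_word)

# A transient disturbance flips exactly one bit (position 5 is arbitrary).
upset_word = flip_bit(stored_word, 5)

print(parity_ok(stored_word, stored_parity))  # intact word passes the check
print(parity_ok(upset_word, stored_parity))   # upset word fails the check
```

The memory cell itself is undamaged in this scenario: rewriting the correct value clears the error, which is why such events are called "soft".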


If you encounter soft parity errors, analyze recent environmental changes that have occurred at the location of the affected system.
Common sources of ESD and EMI that may cause soft parity errors include:
Power cables and supplies
Power distribution units
Uninterruptible power supplies (UPS)
Lighting systems
Power generators
Nuclear facilities (radiation)
Solar flares (radiation)

 

If the box reloads repeatedly, we can consider an RMA.
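To judge whether the reloads are recurring, one option is to scan collected syslog output for the SEU message and count occurrences per node. The following is a rough Python sketch, assuming the message format matches the line posted in the question; the function names and the threshold of 2 are hypothetical choices for illustration, not a Cisco-defined RMA criterion.

```python
import re
from collections import Counter

# Pattern modelled on the syslog line quoted in the question.
SEU_PATTERN = re.compile(
    r"HW_FAULT_RECOVERY\s*:\s*(?P<node>\S+):\s*SEU Correctable error"
    r"\(intsts:(?P<intsts>0x[0-9a-fA-F]+),\s*"
    r"sem_status:(?P<sem_status>0x[0-9a-fA-F]+)\)"
)


def find_seu_events(log_lines):
    """Return one dict (node, intsts, sem_status) per SEU message found."""
    events = []
    for line in log_lines:
        match = SEU_PATTERN.search(line)
        if match:
            events.append(match.groupdict())
    return events


def nodes_with_repeated_seu(events, threshold=2):
    """List nodes that logged at least `threshold` SEU events (assumed cutoff)."""
    counts = Counter(event["node"] for event in events)
    return sorted(node for node, count in counts.items() if count >= threshold)


sample_log = [
    "shelfmgr[185]: %PLATFORM-CPA_INTF_SHELFMGR-3-HW_FAULT_RECOVERY : "
    "node0_RP0_CPU0: SEU Correctable error(intsts:0x18, sem_status:0x145)",
    "processmgr[51]: %OS-SYSMGR-6-INFO : Prepared RMF to reboot",
]

events = find_seu_events(sample_log)
print(events)
print(nodes_with_repeated_seu(events))
```

A single event, as in the sample above, leaves the list of flagged nodes empty; a node only appears once it has logged the message the assumed number of times.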

 

Sam


Replies

mbarfield
Level 1

I had the same issue happen on an N540X-4Z14G2Q-A last night. Did you ever open a case with Cisco?
