08-20-2021 06:13 AM
Greetings,
I am having problems with an ASR9k line card, A9K-8T-L:
LC/0/0/CPU0:Aug 20 02:41:47.954 : pfm_node_lc[283]: %PLATFORM-NP-0-HW_DOUBLE_ECC_ERROR : Set|prm_server_tr[159827]|0x1008004|NP DOUBLE ECC ERROR, NP=4, memId=18, subMemId=0x2
LC/0/0/CPU0:Aug 20 02:41:47.958 : pfm_node_lc[283]: %PLATFORM-PFM-0-CARD_RESET_REQ : pfm_dev_sm_perform_recovery_action, Card reset requested by: Process ID: 159827 (prm_server_tr), Fault Sev: 0, Target node: 0/0/CPU0, CompId: 0x1f, Device Handle: 0x1008004, CondID: 1001, Fault Reason: NP DOUBLE ECC ERROR, NP=4, memId=18, subMemId=0x2
LC/0/0/CPU0:Aug 20 02:41:47.958 : syslog_dev[88]: pfm_node_lc[283] PID-159820: Request Graceful Reboot via Sysmgr: Reason: pfm_dev_sm_perform_recovery_action, Card reset requested by: Process ID: 159827 (prm_server_tr), Fault Sev: 0, Target node: 0/0/CPU0, CompId: 0x1f, Device Handle: 0x1008004, CondID: 1001, Fault Reason: NP DOUBLE ECC ERROR, NP=4, memId=18, subMemId=0x2
LC/0/0/CPU0:Aug 20 02:41:47.958 : syslog_dev[88]: pfm_node_lc[283] PID-159820:
RP/0/RSP0/CPU0:Aug 20 02:41:48.186 : shelfmgr[401]: %PLATFORM-SHELFMGR-6-NODE_KERNEL_DUMP_EVENT : Node 0/0/CPU0 indicates it is doing a kernel dump.
RP/0/RSP0/CPU0:Aug 20 02:41:48.186 : shelfmgr[401]: %PLATFORM-SHELFMGR-6-NODE_STATE_CHANGE : 0/0/CPU0 A9K-8T-L state:IOS XR FAILURE
RP/0/RSP0/CPU0:Aug 20 02:41:48.192 : shelfmgr[401]: %PLATFORM-SHELFMGR-6-NODE_STATE_CHANGE : 0/0/CPU0 A9K-8T-L state:BRINGDOWN
I would appreciate any information about this.
Thanks
08-20-2021 06:59 AM
Hello,
I found the bug below:
prm_server should not crash/abort upon encountering HW problem
CSCte19077
Description
Symptom:
prm_server crash after certain HW problems are detected on the Network Processor (NP), for example " %PLATFORM-NP-0-HW_DOUBLE_ECC_ERROR ". There will also be another syslog displaying "NP0 fails to setup"
Conditions:
Hardware problem found on the network processor
Workaround:
none
Recovery:
none
Further Problem Description:
08-20-2021 07:34 AM - edited 08-20-2021 07:43 AM
Hello @jedielbarreto ,
>> %PLATFORM-NP-0-HW_DOUBLE_ECC_ERROR :
This looks like a hardware issue. You can try removing the linecard, waiting a few minutes, and then reinserting it to force a hard reset.
However, if this does not fix the problem, I would consider an RMA of the linecard.
Edit:
The bug that @Georg Pauwen found looks similar, but it applies only to IOS XR 3.9.0, which is quite an old version now.
Also, some of the error messages described in the bug, such as "NP0 fails to setup", are missing from your logs.
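If you want to check a longer log capture for both conditions, here is a minimal Python sketch. It pulls the NP, memId and subMemId values out of each HW_DOUBLE_ECC_ERROR line and reports whether a "fails to setup" message appears anywhere. The function name `summarize` is hypothetical, and the exact wording of the setup-failure syslog is an assumption based only on the quote in the bug description, so adjust the pattern to whatever your router actually prints.

```python
import re

# Matches the ECC syslog as it appears in the logs posted in this thread.
ECC_RE = re.compile(
    r"%PLATFORM-NP-0-HW_DOUBLE_ECC_ERROR\s*:.*?"
    r"NP=(?P<np>\d+), memId=(?P<mem_id>\d+), subMemId=(?P<sub_mem_id>0x[0-9a-fA-F]+)"
)

# Assumption: the real syslog contains text like "NP0 fails to setup",
# as quoted in the CSCte19077 description. The full format is not known here.
SETUP_FAIL_RE = re.compile(r"NP\d+ fails to setup")


def summarize(log_lines):
    """Return (list of (np, mem_id, sub_mem_id) tuples, setup_failure_seen)."""
    ecc_hits = []
    setup_failure_seen = False
    for line in log_lines:
        match = ECC_RE.search(line)
        if match:
            ecc_hits.append(
                (int(match["np"]), int(match["mem_id"]), match["sub_mem_id"])
            )
        if SETUP_FAIL_RE.search(line):
            setup_failure_seen = True
    return ecc_hits, setup_failure_seen


sample = [
    "LC/0/0/CPU0:Aug 20 02:41:47.954 : pfm_node_lc[283]: "
    "%PLATFORM-NP-0-HW_DOUBLE_ECC_ERROR : Set|prm_server_tr[159827]|0x1008004|"
    "NP DOUBLE ECC ERROR, NP=4, memId=18, subMemId=0x2",
    "RP/0/RSP0/CPU0:Aug 20 02:41:48.192 : shelfmgr[401]: "
    "%PLATFORM-SHELFMGR-6-NODE_STATE_CHANGE : 0/0/CPU0 A9K-8T-L state:BRINGDOWN",
]

hits, setup_failure = summarize(sample)
print(hits, setup_failure)
```

If the same NP and memId keep showing up after a reseat, that would support the hardware fault theory and the RMA.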
Hope to help
Giuseppe