02-26-2018 11:04 PM - edited 03-01-2019 03:22 PM
My A9K-MOD80-SE line card in slot 0/1 rebooted by itself for an unknown reason. Details are below.
Please help!
RP/0/RSP0/CPU0:PE_AYA# show reboot history location 0/1/cpu0
No Time Cause Code Reason
--------------------------------------------------------------------------------
01 Fri Feb 23 00:37:35 2018 0x2c000015 Cause: Excess Machine Check Condition
Process: p40x0mc
RP/0/RSP0/CPU0:PE_AYA#dir harddisk:/dumper
25229 -rwx 47046 Fri Feb 23 00:37:39 2018 LC3.180222-173739.crashinfo.by.p40x0mc
-----------------------------Log----------------------------------
LC/0/1/CPU0:Feb 23 00:37:33.651 : p40x0mc[74]: PCI3 PEX_ERR_DR : 0x80000020 [ME|LDDE] Count = 68
LC/0/1/CPU0:Feb 23 00:37:33.699 : p40x0mc[74]: %C-MC-CHECKER-3-DEBUG : Excess 128 MC errors
LC/0/1/CPU0:Feb 23 00:37:33.701 : p40x0mc[74]: 29/00/00/000: Vendor/Device ID : ffffffff [-]
LC/0/1/CPU0:Feb 23 00:37:33.800 : p40x0mc[74]: 29/00/00/004: Command : 0000ffff [INTDIS|SERR|PERR|BM|MEM|IO|UNDOCUMENTED]
LC/0/1/CPU0:Feb 23 00:37:33.923 : p40x0mc[74]: 29/00/00/006: Status : 0000ffff [DETPE|SIGSE|RECMA|RECTA|SIGTA|MPEDET|CAPLST|INTST|UNDOCUMENTED]
LC/0/1/CPU0:Feb 23 00:37:33.983 : p40x0mc[74]: 00/02/00/000: Vendor/Device ID : 04081957 [-]
LC/0/1/CPU0:Feb 23 00:37:34.060 : p40x0mc[74]: 00/02/00/10c: Uncorrectable Error Sev. : 00062010 [MTLP|RXO|FCPE|DLPE]
LC/0/1/CPU0:Feb 23 00:37:34.125 : p40x0mc[74]: 00/02/00/114: Correctable Error Mask : 00002000 [ADV_NFE]
LC/0/1/CPU0:Feb 23 00:37:34.214 : p40x0mc[74]: 00/02/00/118: Adv. Error Cap. & Control : 000000a0 [ECRCCC|ECRCGC|FIRST_ERR_PTR=0]
LC/0/1/CPU0:Feb 23 00:37:34.276 : p40x0mc[74]: 00/02/00/12c: Root Error Command : 00000004 [FERE]
LC/0/1/CPU0:Feb 23 00:37:34.336 : p40x0mc[74]: 00/02/00/404: LTSSM State Status Register: 00000004 [-]
LC/0/1/CPU0:Feb 23 00:37:34.444 : p40x0mc[74]: 00/02/00/054: Device Control : 0000281f [NSE|RO|URR|FER|NFER|CER|MAX_READ=2|MAX_PAYLOAD=0]
LC/0/1/CPU0:Feb 23 00:37:34.536 : p40x0mc[74]: 00/02/00/05e: Link Status : 0000c011 [LABS|LBMS|NEG_LINK_W=1|LINK_SP=1]
LC/0/1/CPU0:Feb 23 00:37:34.597 : p40x0mc[74]: 00/02/00/066: Slot Status : 00000040 [PDS]
LC/0/1/CPU0:Feb 23 00:37:34.661 : p40x0mc[74]: 00/02/00/068: Root Command : 00000004 [SEFEE]
LC/0/1/CPU0:Feb 23 00:37:34.746 : p40x0mc[74]: 00/02/00/004: Command : 00000547 [INTDIS|SERR|PERR|BM|MEM|IO]
LC/0/1/CPU0:Feb 23 00:37:34.811 : p40x0mc[74]: 00/02/00/006: Status : 00000010 [CAPLST]
LC/0/1/CPU0:Feb 23 00:37:35.820 : p40x0mc[74]: EISR0: 0x00040000 [PCI3]
LC/0/1/CPU0:Feb 23 00:37:35.874 : p40x0mc[74]: PCI3 PEX_ERR_DR : 0x80000020 [ME|LDDE] Count = 69
LC/0/1/CPU0:Feb 23 00:37:35.949 : p40x0mc[74]: %C-MC-CHECKER-3-DEBUG : Shutting down node due to excess 129 MC errors
LC/0/1/CPU0:Feb 23 00:37:37.958 : p40x0mc[74]: reboot_internal: Incomplete graceful reboot cleanup (Connection timed out)
LC/0/1/CPU0:Feb 23 00:37:37.958 : p40x0mc[74]: Fri Feb 23 00:37:35 2018:sync start
LC/0/1/CPU0:Feb 23 00:37:37.958 : p40x0mc[74]: Fri Feb 23 00:37:35 2018:sync end
LC/0/1/CPU0:Feb 23 00:37:37.958 : p40x0mc[74]: Fri Feb 23 00:37:35 2018:platform_reboot_op start
RP/0/RSP0/CPU0:Feb 23 00:37:58.350 : shelfmgr[383]: %PLATFORM-SHELFMGR-6-NODE_CPU_RESET : Node 0/1/CPU0 CPU reset detected.
RP/0/RSP0/CPU0:Feb 23 00:37:58.351 : shelfmgr[383]: %PLATFORM-SHELFMGR-6-NODE_STATE_CHANGE : 0/1/CPU0 A9K-MOD80-SE state:BRINGDOWN
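For anyone reading the trace above: the two `PEX_ERR_DR` lines show the error counter on `PCI3` going from 68 to 69 just before the node shut down. Below is a rough Python sketch for pulling those counters out of a saved log. It assumes the line format shown in this post; `pex_error_counts` is a name made up for illustration, not a Cisco tool.

```python
import re

# Matches lines such as:
# LC/0/1/CPU0:Feb 23 00:37:33.651 : p40x0mc[74]: PCI3 PEX_ERR_DR : 0x80000020 [ME|LDDE] Count = 68
PEX_RE = re.compile(
    r"(?P<ts>\w{3}\s+\d+ \d\d:\d\d:\d\d\.\d+) : p40x0mc\[\d+\]: "
    r"(?P<port>PCI\d+) PEX_ERR_DR : (?P<value>0x[0-9a-fA-F]+) "
    r"\[(?P<flags>[^\]]*)\] Count = (?P<count>\d+)"
)

def pex_error_counts(log_text):
    """Return (timestamp, port, flags, count) for every PEX_ERR_DR line."""
    results = []
    for m in PEX_RE.finditer(log_text):
        results.append(
            (m.group("ts"), m.group("port"),
             m.group("flags").split("|"), int(m.group("count")))
        )
    return results

# Two lines copied from the log in this post.
sample_log = (
    "LC/0/1/CPU0:Feb 23 00:37:33.651 : p40x0mc[74]: "
    "PCI3 PEX_ERR_DR : 0x80000020 [ME|LDDE] Count = 68\n"
    "LC/0/1/CPU0:Feb 23 00:37:35.874 : p40x0mc[74]: "
    "PCI3 PEX_ERR_DR : 0x80000020 [ME|LDDE] Count = 69\n"
)

pex_rows = pex_error_counts(sample_log)
for ts, port, flags, count in pex_rows:
    print(ts, port, flags, count)
```

Running the same function over logs from several days makes it easier to see whether the counter keeps climbing.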
02-27-2018 04:58 AM
If this was a single occurrence, you don't need to take any action. If the card keeps reporting this failure, it should be replaced through RMA. It would be best to open a TAC SR for this.
/Aleksandar
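To tell a single occurrence from a recurring one, the `show reboot history` output can be checked for repeated entries with the same cause. Here is a minimal Python sketch, assuming the output has been saved as text in the format shown in the original post. The function name `machine_check_reboots` and the sample text are illustrative only.

```python
import re

# Matches entries such as:
# 01 Fri Feb 23 00:37:35 2018 0x2c000015 Cause: Excess Machine Check Condition
ENTRY_RE = re.compile(
    r"^\s*(?P<no>\d+)\s+"
    r"(?P<time>\w{3} \w{3}\s+\d+ \d\d:\d\d:\d\d \d{4})\s+"
    r"(?P<code>0x[0-9a-fA-F]+)\s+Cause:\s*(?P<cause>.+?)\s*$"
)

def machine_check_reboots(text):
    """Return (timestamp, cause_code) for each machine-check reboot entry."""
    hits = []
    for line in text.splitlines():
        m = ENTRY_RE.match(line)
        if m and "Machine Check" in m.group("cause"):
            hits.append((m.group("time"), m.group("code")))
    return hits

# Saved output of: show reboot history location 0/1/cpu0
sample = """\
No Time Cause Code Reason
--------------------------------------------------------------------------------
01 Fri Feb 23 00:37:35 2018 0x2c000015 Cause: Excess Machine Check Condition
Process: p40x0mc
"""

hits = machine_check_reboots(sample)
print(len(hits), hits)
```

More than one entry returned for the same card would point to the recurring case.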
03-04-2018 11:26 PM
Thank you for your suggestion.