
ASR 1004 and restarts problem

Dear group members, I am asking for your help with my Cisco ASR1004. I use the device as a BGP and BRAS router on the network. The network is not very big: about 4,000 users.

 

The problem is that the device stops responding after a few hours of correct operation. The console remains available, but everything else stops working completely: BGP sessions are disconnected, there is no communication on the interfaces, and after a few moments the device restarts.

 

I would appreciate a hint as to what the problem may be.

 

Below is the hardware configuration of my ASR.

NAME: "Chassis", DESCR: "Cisco ASR1004 Chassis"
PID: ASR1004           , VID: V02  , SN: FOX1544G2H9

NAME: "module 0", DESCR: "Cisco ASR1000 SPA Interface Processor 40"
PID: ASR1000-SIP40     , VID: V08  , SN: JAE1LCK08FO

NAME: "module R0", DESCR: "Cisco ASR1000 Route Processor 2"
PID: ASR1000-RP2       , VID: V04  , SN: JAE1781239A

NAME: "module F0", DESCR: "Cisco ASR1000 Embedded Services Processor, 20Gbps"
PID: ASR1000-ESP20     , VID: V04  , SN: JAE2017063A

NAME: "Power Supply Module 0", DESCR: "Cisco ASR1004 AC Power Supply"
PID: ASR1004-PWR-AC    , VID: V03  , SN: ART1721B057

NAME: "Power Supply Module 1", DESCR: "Cisco ASR1004 AC Power Supply"
PID: ASR1004-PWR-AC    , VID: V03  , SN: ART1724A06C

Log file showing the problem described above.

Aug 3 03:45:11.870: DHCPD: AAA id already present , AAA UID = 62947
Aug 3 03:45:11.870: DHCPD: AAA id already present , AAA UID = 62948
Aug 3 03:45:11.870: DHCPD: AAA id already present , AAA UID = 62949
Aug 3 03:45:11.870: DHCPD: AAA id already present , AAA UID = 62950
Aug 3 03:45:11.975: %CPPDRV-3-LOCKDOWN: F0: fman_fp_image: QFP0.0 CPP Driver LOCKDOWN encountered due to previous fatal error (HW: QFP interrupt).
Aug 3 03:45:11.979: %CPPHA-3-FAULT: F0: cpp_ha: CPP:0.0 desc:INFP_INF_SWASSIST_LEAF_INT_INT_EVENT0 det:DRVR(interrupt) class:OTHER sev:FATAL id:2121 cppstate:RUNNING res:UNKNOWN flags:0x7 cdmflags:0x8
Aug 3 03:45:11.980: %CPPOSLIB-3-ERROR_NOTIFY: F0: cpp_ha: cpp_ha encountered an error -Traceback= 1#e3468b513498723f47824eee70a65a93 errmsg:E705000+E90 cpp_common_os:FF58000+B7FC cpp_common_os:FF58000+B28C cpp_common_os:FF58000+197E0 cpp_drv_cmn:F83E000+39CA0 cpp_dmap:FCCC000+353F4 :10000000+26648 :10000000+26A80 :10000000+278A0 :10000000+E530 :10000000+12630 :10000000+C2C8 cpp_common_os:FF58000+109AC cpp_common_os:FF58000+10FA0 evlib:E054000+A214 evlib:E054000+A88C cpp_common_os:FF58000+137C4 :10000000+6378 c:B990000+21EB0 c:B990000+2
Aug 3 03:45:11.980: %CPPHA-3-FAULTCRASH: F0: cpp_ha: CPP 0.0 unresolved fault detected, initiating crash dump.
Aug 3 03:45:12.009: %IOSXE-1-PLATFORM: F0: kernel: QFP0.0: Fatal Fault: HW reported: QFP interrupt
Aug 3 03:45:11.992: %IOSXE-3-PLATFORM: F0: cpp_cdm: CPP crashed, collecting state.
Aug 3 03:45:11.993: %CPPHA-3-FAULTCRASH: F0: cpp_ha: CPP 0.0 unresolved fault detected, initiating crash dump.
Aug 3 03:45:11.993: %CPPDRV-6-INTR: F0: cpp_driver: CPP10(0) Interrupt : 21-Aug-03 05:45:11.965948 UTC+0200:INFP_INF_SWASSIST_LEAF_INT_INT_EVENT0
Aug 3 03:45:12.022: %CPPOSLIB-3-ERROR_NOTIFY: F0: fman_fp_image: fman_fp encountered an error -Traceback= 1#bf47857df3310dafed593c896c489be7 errmsg:B08A6000+E90 cpp_common_os:576000+B7FC cpp_common_os:576000+B28C cpp_common_os:576000+197E0 cpp_plutlu_common:6F4000+CDE8 cpp_plutlu_common:6F4000+102F4 cpp_plutlu_common:6F4000+10CC8 cpp_plutlu_common:6F4000+E6B4 cpp_cef_mpls_common:7CC000+31740 cpp_cef_mpls_common:7CC000+2E25C :9CF000+8BCC78 :9CF000+47DEA4 aobjman:B2EE2000+10B50 :9CF000+64596C evlib:A9A11000+A608 evlib:A9A11000+A88C :9CF000+39C4
Aug 3 03:45:12.610: %IOSXE_OIR-6-OFFLINECARD: Card (fp) offline in slot F0
Aug 3 03:45:12.612: %IOSXE_RP_ALARM-2-ESP: ASSERT CRITICAL module R0 No Working ESP
Aug 3 03:45:12.661: %CPPDRV-3-LOCKDOWN: F0: cpp_cp: QFP0.0 CPP Driver LOCKDOWN encountered due to previous fatal error (HW: QFP interrupt).
Aug 3 03:45:13.876: %IDMGR-3-INVALID_ID: bad id in id_to_ptr (bad id) (id: 0x0)
-Traceback= 1#67534267a135b52247fe4f0266b2715a :55C8359DF000+60E842B :55C8359DF000+6243F41 :55C8359DF000+61EB435 :55C8359DF000+61EB872
Aug 3 03:45:13.876: DHCPD: AAA id already present , AAA UID = 62953
Aug 3 03:45:13.877: DHCPD: AAA id already present , AAA UID = 62954
Aug 3 03:45:13.877: DHCPD: AAA id already present , AAA UID = 62955
Aug 3 03:45:13.877: DHCPD: AAA id already present , AAA UID = 62956
Aug 3 03:45:13.877: DHCPD: AAA id already present , AAA UID = 62957
Aug 3 03:45:13.877: DHCPD: AAA id already present , AAA UID = 62958
Aug 3 03:45:13.877: DHCPD: AAA id already present , AAA UID = 62959
Aug 3 03:45:18.314: %CPPCDM-3-ERROR_NOTIFY: F0: cpp_cdm: QFP 0 thread 68 encountered an error -Traceback= 1#f1bdf7eeb0892846795259791b12836a 806CC94F 806CCAC5 806CCB8D 830325C4 80AD65F4 80AD65FE 80020064 80020055 80000000
Aug 3 03:45:18.365: %IOSXE-3-PLATFORM: F0: cpp_cdm: CPP crashed, generating core file.
Aug 3 03:45:47.566: %OSPF-5-ADJCHG: Process 6, Nbr 172.16.3.96 on TenGigabitEthernet0/1/0.24 from FULL to DOWN, Neighbor Down: Dead timer expired
Aug 3 03:45:49.574: %OSPF-5-ADJCHG: Process 6, Nbr 10.6.110.33 on TenGigabitEthernet0/1/0.24 from FULL to DOWN, Neighbor Down: Dead timer expired
Aug 3 03:46:17.022: %FMANRP-3-PEER_IPC_STUCK: R0/0: fman_rp: IPC to fman-log-bay0-peer0 is stuck for more than 30 seconds
Aug 3 03:46:18.063: %IDMGR-3-INVALID_ID: bad id in id_to_ptr (bad id) (id: 0x0)
-Traceback= 1#67534267a135b52247fe4f0266b2715a :55C8359DF000+60E842B :55C8359DF000+6243F41 :55C8359DF000+61EB435 :55C8359DF000+61EB872
Aug 3 03:46:30.374: %BGP-3-NOTIFICATION: sent to neighbor 80.23.78.113 4/0 (hold time expired) 0 bytes
Aug 3 03:46:30.374: %BGP-5-NBR_RESET: Neighbor 80.23.78.113 reset (BGP Notification sent)
Aug 3 03:46:33.989: %BGP-5-ADJCHANGE: neighbor 80.23.78.113 Down BGP Notification sent
Aug 3 03:46:33.989: %BGP_SESSION-5-ADJCHANGE: neighbor 80.23.78.113 IPv4 Unicast topology base removed from session BGP Notification sent
Aug 3 03:46:38.892: %PIM-5-NBRCHG: neighbor 10.193.221.33 DOWN on interface BDI891 non DR
Aug 3 03:46:42.764: %BGP-3-NOTIFICATION: sent to neighbor 180.11.8.72 4/0 (hold time expired) 0 bytes
Aug 3 03:46:42.764: %BGP-5-NBR_RESET: Neighbor 180.11.8.72 reset (BGP Notification sent)
Aug 3 03:46:47.556: %BGP-5-ADJCHANGE: neighbor 180.11.8.72 Down BGP Notification sent
Aug 3 03:46:47.556: %BGP_SESSION-5-ADJCHANGE: neighbor 180.11.8.72 IPv4 Unicast topology base removed from session BGP Notification sent
Aug 3 03:46:59.431: %SSH-3-RSA_SIGN_FAIL: Signature connection failed, status 3
Aug 3 03:46:59.431: %SSH-3-RSA_SIGN_FAIL: Signature creation failed, status 21
Aug 3 03:46:59.431: %SYS-6-LOGOUT: User admin has exited tty session 2(10.250.0.2)
Aug 3 03:47:11.739: %SSH-3-RSA_SIGN_FAIL: Signature connection failed, status 3
Aug 3 03:47:11.739: %SSH-3-RSA_SIGN_FAIL: Signature creation failed, status 21
Aug 3 03:47:12.239: %SYS-5-CONFIG_P: Configured programmatically by process VTEMPLATE Background Mgr from console as console
Aug 3 03:47:12.342: %SYS-5-CONFIG_P: Configured programmatically by process VTEMPLATE Background Mgr from console as console
Aug 3 03:47:18.036: %SYS-5-CONFIG_P: Configured programmatically by process VTEMPLATE Background Mgr from console as console
Aug 3 03:47:18.672: %SYS-5-CONFIG_P: Configured programmatically by process VTEMPLATE Background Mgr from console as console
Aug 3 03:47:18.819: %IDMGR-3-INVALID_ID: bad id in id_to_ptr (bad id) (id: 0x0)
-Traceback= 1#67534267a135b52247fe4f0266b2715a :55C8359DF000+60E842B :55C8359DF000+6243F41 :55C8359DF000+61EB435 :55C8359DF000+61EB872
Aug 3 03:47:19.405: %SYS-5-CONFIG_P: Configured programmatically by process VTEMPLATE Background Mgr from console as console
Aug 3 03:47:23.456: %SYS-5-CONFIG_P: Configured programmatically by process VTEMPLATE Background Mgr from console as console
Aug 3 03:47:27.168: %SYS-5-CONFIG_P: Configured programmatically by process VTEMPLATE Background Mgr from console as console
Aug 3 03:47:27.414: %SYS-5-CONFIG_P: Configured programmatically by process VTEMPLATE Background Mgr from console as console
Aug 3 03:47:28.142: %SYS-5-CONFIG_P: Configured programmatically by process VTEMPLATE Background Mgr from console as console
Aug 3 03:47:29.205: %SYS-5-CONFIG_P: Configured programmatically by process VTEMPLATE Background Mgr from console as console
Aug 3 03:47:29.452: %SYS-5-CONFIG_P: Configured programmatically by process VTEMPLATE Background Mgr from console as console
Aug 3 03:47:29.200: %CPPHA-3-CDMDONE: F0: cpp_ha: CPP 0 microcode crashdump creation completed.
Aug 3 03:47:29.205: %IOSXE-6-PLATFORM: F0: cpp_cdm: Shutting down CPP MDM while client(s) still connected
Aug 3 03:47:29.207: %IOSXE-6-PLATFORM: F0: cpp_ha: Shutting down CPP MDM while client(s) still connected
Aug 3 03:47:29.213: %IOSXE-6-PLATFORM: F0: cpp_ha: Shutting down CPP CDM while client(s) still connected
Aug 3 03:47:29.749: %SYS-5-CONFIG_P: Configured programmatically by process VTEMPLATE Background Mgr from console as console
Aug 3 03:47:29.533: %PMAN-3-PROCHOLDDOWN: F0: root: The process cpp_cdm_svr has been helddown (rc 69)
Aug 3 03:47:29.541: %PMAN-3-PROCHOLDDOWN: F0: root: The process cpp_ha_top_level_server has been helddown (rc 69)
Aug 3 03:47:31.827: %SYS-5-CONFIG_P: Configured programmatically by process VTEMPLATE Background Mgr from console as console
Aug 3 03:47:32.650: %SYS-5-CONFIG_P: Configured programmatically by process VTEMPLATE Background Mgr from console as console
Aug 3 03:47:47.580: %SCHED-3-THRASHING: Process thrashing on watched message event. -Process= "DHCPD Receive", ipl= 2, pid= 687
-Traceback= 1#67534267a135b52247fe4f0266b2715a :55C8359DF000+955A7E3 :55C8359DF000+955A1F0 :55C8359DF000+61EA735 :55C8359DF000+61EB872
Aug 3 03:47:50.302: %BGP-3-NOTIFICATION: sent to neighbor 89.46.144.11 4/0 (hold time expired) 0 bytes
Aug 3 03:47:50.303: %BGP-5-NBR_RESET: Neighbor 89.46.144.11 reset (BGP Notification sent)
Aug 3 03:47:51.131: %BGP-5-ADJCHANGE: neighbor 89.46.144.11 Down BGP Notification sent
Aug 3 03:47:51.131: %BGP_SESSION-5-ADJCHANGE: neighbor 89.46.144.11 IPv4 Unicast topology base removed from session BGP Notification sent
Aug 3 03:47:52.386: %BGP-3-NOTIFICATION: sent to neighbor 89.46.144.12 4/0 (hold time expired) 0 bytes
Aug 3 03:47:52.386: %BGP-5-NBR_RESET: Neighbor 89.46.144.12 reset (BGP Notification sent)
Aug 3 03:47:53.151: %BGP-5-ADJCHANGE: neighbor 89.46.144.12 Down BGP Notification sent
Aug 3 03:47:53.151: %BGP_SESSION-5-ADJCHANGE: neighbor 89.46.144.12 IPv4 Unicast topology base removed from session BGP Notification sent
Aug 3 03:48:09.819: %BGP-3-NOTIFICATION: sent to neighbor 194.204.233.1 4/0 (hold time expired) 0 bytes
Aug 3 03:48:09.819: %BGP-3-NOTIFICATION: sent to neighbor 194.204.232.1 4/0 (hold time expired) 0 bytes
Aug 3 03:48:09.819: %BGP-5-NBR_RESET: Neighbor 194.204.233.1 reset (BGP Notification sent)
Aug 3 03:48:09.819: %BGP-5-NBR_RESET: Neighbor 194.204.232.1 reset (BGP Notification sent)
Aug 3 03:48:11.309: %BGP-5-ADJCHANGE: neighbor 194.204.232.1 Down BGP Notification sent
Aug 3 03:48:11.309: %BGP_SESSION-5-ADJCHANGE: neighbor 194.204.232.1 IPv4 Unicast topology base removed from session BGP Notification sent
Aug 3 03:48:11.309: %BGP-5-ADJCHANGE: neighbor 194.204.233.1 Down BGP Notification sent
Aug 3 03:48:11.309: %BGP_SESSION-5-ADJCHANGE: neighbor 194.204.233.1 IPv4 Unicast topology base removed from session BGP Notification sent
Aug 3 03:48:11.973: %BGP-3-NOTIFICATION: sent to neighbor 94.246.185.198 4/0 (hold time expired) 0 bytes
Aug 3 03:48:11.974: %BGP-5-NBR_RESET: Neighbor 94.246.185.198 reset (BGP Notification sent)
Aug 3 03:48:15.218: %BGP-5-ADJCHANGE: neighbor 94.246.185.198 Down BGP Notification sent
Aug 3 03:48:15.218: %BGP_SESSION-5-ADJCHANGE: neighbor 94.246.185.198 IPv4 Unicast topology base removed from session BGP Notification sent
Aug 3 03:48:24.324: %IDMGR-3-INVALID_ID: bad id in id_to_ptr (id: 0xDEADBEEF)
-Traceback= 1#67534267a135b52247fe4f0266b2715a :55C8359DF000+60E8460 :55C8359DF000+6243F41 :55C8359DF000+61EB435 :55C8359DF000+61EB872
Aug 3 03:49:02.166: %PMAN-3-PROCHOLDDOWN: F0: root: The process fman_fp_image has been helddown (rc 134)
Aug 3 03:49:02.454: %SCHED-3-THRASHING: Process thrashing on watched message event. -Process= "DHCPD Receive", ipl= 2, pid= 687
-Traceback= 1#67534267a135b52247fe4f0266b2715a :55C8359DF000+955A7E3 :55C8359DF000+955A1F0 :55C8359DF000+61EA735 :55C8359DF000+61EB872
Aug 3 03:49:10.009: %IOSXE_SPA-6-UPDOWN: Interface TenGigabitEthernet0/0/0, link down due to local fault
Aug 3 03:49:12.010: %LINK-3-UPDOWN: Interface TenGigabitEthernet0/0/0, changed state to down
Aug 3 03:49:12.012: %IOSXE_RP_ALARM-6-INFO: ASSERT CRITICAL TenGigabitEthernet0/0/0 Physical Port Link Down
Aug 3 03:49:13.009: %LINEPROTO-5-UPDOWN: Line protocol on Interface TenGigabitEthernet0/0/0, changed state to down
Aug 3 03:49:12.008: %LINK-3-UPDOWN: SIP0/0: Interface TenGigabitEthernet0/0/0, changed state to down
Aug 3 03:49:13.598: %IOSXE_SPA-6-UPDOWN: Interface TenGigabitEthernet0/1/0, link down due to local fault
Aug 3 03:49:15.597: %LINK-3-UPDOWN: Interface TenGigabitEthernet0/1/0, changed state to down
Aug 3 03:49:15.598: %IOSXE_RP_ALARM-6-INFO: ASSERT CRITICAL TenGigabitEthernet0/1/0 Physical Port Link Down
Aug 3 03:49:15.667: %LINK-3-UPDOWN: Interface BDI890, changed state to down
Aug 3 03:49:15.717: %LINK-3-UPDOWN: Interface BDI891, changed state to down
Aug 3 03:49:15.719: %LINEPROTO-5-UPDOWN: Line protocol on Interface Tunnel0, changed state to down
Aug 3 03:49:16.598: %LINEPROTO-5-UPDOWN: Line protocol on Interface TenGigabitEthernet0/1/0, changed state to down
Aug 3 03:49:15.597: %LINK-3-UPDOWN: SIP0/1: Interface TenGigabitEthernet0/1/0, changed state to down
Aug 3 03:49:16.666: %LINEPROTO-5-UPDOWN: Line protocol on Interface BDI890, changed state to down
Aug 3 03:49:16.690: %IOSXE_SPA-6-UPDOWN: Interface TenGigabitEthernet0/2/0, link down due to local fault
Aug 3 03:49:16.716: %LINEPROTO-5-UPDOWN: Line protocol on Interface BDI891, changed state to down
Aug 3 03:49:17.025: %FMANRP-3-PEER_IPC_RESUME: R0/0: fman_rp: IPC to fman-log-bay0-peer0 has returned to normal after previous stuck
Aug 3 03:49:18.690: %LINK-3-UPDOWN: Interface TenGigabitEthernet0/2/0, changed state to down
Aug 3 03:49:18.691: %IOSXE_RP_ALARM-6-INFO: ASSERT CRITICAL TenGigabitEthernet0/2/0 Physical Port Link Down
Aug 3 03:49:18.793: %LINK-3-UPDOWN: Interface BDI20, changed state to down
Aug 3 03:49:18.688: %LINK-3-UPDOWN: SIP0/2: Interface TenGigabitEthernet0/2/0, changed state to down
Aug 3 03:49:19.690: %LINEPROTO-5-UPDOWN: Line protocol on Interface TenGigabitEthernet0/2/0, changed state to down
Aug 3 03:49:19.792: %LINEPROTO-5-UPDOWN: Line protocol on Interface BDI20, changed state to down
Aug 3 03:49:24.472: %IDMGR-3-INVALID_ID: bad id in id_to_ptr (bad id) (id: 0x0)
-Traceback= 1#67534267a135b52247fe4f0266b2715a :55C8359DF000+60E842B :55C8359DF000+6243F41 :55C8359DF000+61EB435 :55C8359DF000+61EB872
Aug 3 03:50:04.844: %SSH-3-RSA_SIGN_FAIL: Signature connection failed, status 3
Aug 3 03:50:04.844: %SSH-3-RSA_SIGN_FAIL: Signature creation failed, status 21
Aug 3 03:50:07.648: %SSH-3-RSA_SIGN_FAIL: Signature connection failed, status 3
Aug 3 03:50:07.648: %SSH-3-RSA_SIGN_FAIL: Signature creation failed, status 21
Aug 3 03:50:08.555: %SCHED-3-THRASHING: Process thrashing on watched message event. -Process= "DHCPD Receive", ipl= 2, pid= 687
-Traceback= 1#67534267a135b52247fe4f0266b2715a :55C8359DF000+955A7E3 :55C8359DF000+955A1F0 :55C8359DF000+61EA735 :55C8359DF000+61EB872
Aug 3 03:50:24.562: %IDMGR-3-INVALID_ID: bad id in id_to_ptr (id: 0x44484350)
-Traceback= 1#67534267a135b52247fe4f0266b2715a :55C8359DF000+60E8460 :55C8359DF000+6243F41 :55C8359DF000+61EB435 :55C8359DF000+61EB872
Aug 3 03:51:24.577: %IDMGR-3-INVALID_ID: bad id in id_to_ptr (id: 0x44484350)
-Traceback= 1#67534267a135b52247fe4f0266b2715a :55C8359DF000+60E8460 :55C8359DF000+6243F41 :55C8359DF000+61EB435 :55C8359DF000+61EB872
Aug 3 03:51:44.980: %PMAN-3-PROCESS_NOTIFICATION: F0: pvp: System report core/Router-ASR_ESP_0-system-report_20210803-054906-CEST.tar.gz (size: 121849 KB) generated and System report info at core/Router-ASR_ESP_0-system-report_20210803-054906-CEST-info.txt
Aug 3 03:51:45.574: %IOSXE-6-PLATFORM: F0: cpp_sp: Shutting down CPP MDM while client(s) still connected
Aug 3 03:51:48.584: %SCHED-3-THRASHING: Process thrashing on watched message event. -Process= "DHCPD Receive", ipl= 2, pid= 687
-Traceback= 1#67534267a135b52247fe4f0266b2715a :55C8359DF000+955A7E3 :55C8359DF000+955A1F0 :55C8359DF000+61EA735 :55C8359DF000+61EB872
Aug 3 03:52:12.134: %IPRT-3-RIB_LOOP: Resolution loop formed by routes in RIB
Aug 3 03:52:24.594: %IDMGR-3-INVALID_ID: bad id in id_to_ptr (id: 0xDEADBEEF)
-Traceback= 1#67534267a135b52247fe4f0266b2715a :55C8359DF000+60E8460 :55C8359DF000+6243F41 :55C8359DF000+61EB435 :55C8359DF000+61EB872
Aug 3 03:53:24.622: %IDMGR-3-INVALID_ID: bad id in id_to_ptr (id: 0x44484350)
-Traceback= 1#67534267a135b52247fe4f0266b2715a :55C8359DF000+60E8460 :55C8359DF000+6243F41 :55C8359DF000+61EB435 :55C8359DF000+61EB872
Aug 3 03:53:28.624: %SCHED-3-THRASHING: Process thrashing on watched message event. -Process= "DHCPD Receive", ipl= 2, pid= 687
-Traceback= 1#67534267a135b52247fe4f0266b2715a :55C8359DF000+955A7E3 :55C8359DF000+955A1F0 :55C8359DF000+61EA735 :55C8359DF000+61EB872
Aug 3 03:53:41.661: %IOSXE_OIR-6-ONLINECARD: Card (fp) online in slot F0
Aug 3 03:53:41.665: %IOSXE_RP_ALARM-2-ESP: CLEAR CRITICAL module R0 No Working ESP
Aug 3 03:54:24.637: %IDMGR-3-INVALID_ID: bad id in id_to_ptr (id: 0x6D656675)
-Traceback= 1#67534267a135b52247fe4f0266b2715a :55C8359DF000+60E8460 :55C8359DF000+6243F41 :55C8359DF000+61EB435 :55C8359DF000+61EB872
Aug 3 03:55:05.104: %SEC_LOGIN-5-LOGIN_SUCCESS: Login Success [user: admin] [Source: 10.250.0.1] [localport: 22] at 05:55:05 CEST Tue Aug 3 2021
Aug 3 03:55:05.773: %SYS-6-LOGOUT: User admin has exited tty session 2(10.250.0.1)
Aug 3 03:55:08.650: %SCHED-3-THRASHING: Process thrashing on watched message event. -Process= "DHCPD Receive", ipl= 2, pid= 687
-Traceback= 1#67534267a135b52247fe4f0266b2715a :55C8359DF000+955A7E3 :55C8359DF000+955A1F0 :55C8359DF000+61EA735 :55C8359DF000+61EB872
Aug 3 03:55:09.333: %SEC_LOGIN-5-LOGIN_SUCCESS: Login Success [user: admin] [Source: 10.250.0.1] [localport: 22] at 05:55:09 CEST Tue Aug 3 2021
Aug 3 03:55:24.654: %IDMGR-3-INVALID_ID: bad id in id_to_ptr (id: 0xBEEFBACC)
-Traceback= 1#67534267a135b52247fe4f0266b2715a :55C8359DF000+60E8460 :55C8359DF000+6243F41 :55C8359DF000+61EB435 :55C8359DF000+61EB872
.Aug 3 03:56:24.687: %IDMGR-3-INVALID_ID: bad id in id_to_ptr (id: 0xFFFFFFFF)
-Traceback= 1#67534267a135b52247fe4f0266b2715a :55C8359DF000+60E8460 :55C8359DF000+6243F41 :55C8359DF000+61EB435 :55C8359DF000+61EB872
.Aug 3 03:56:26.272: %SEC_LOGIN-5-LOGIN_SUCCESS: Login Success [user: admin] [Source: 10.250.0.2] [localport: 22] at 05:56:26 CEST Tue Aug 3 2021
Leo Laohoo
VIP Community Legend

Contact TAC and get them to confirm if this is CSCvf11949.

Hmm, do you think this problem may still exist in this version: asr1000rpx86-universalk9.16.06.08.SPA.bin? From what I have read, this bug affects version 15.X. What do you think?

marce1000
VIP Mentor

 

- FYI: https://bst.cloudapps.cisco.com/bugsearch/bug/CSCtn02462

I can't see any resolved releases mentioned. Check your current software version and use the latest or advisory release; if the problem persists, contact Cisco TAC.

 M.

Yes, thank you for your reply, honestly, what you pointed out best fits my problem, but from what I can see the thread has been abandoned. What do you think I should do now?

Giuseppe Larosa
Hall of Fame Master

Hello @Krzysztof Jablonski ,

 

the first interesting line in the log in your initial post is:

 

Aug 3 03:45:11.980: %CPPHA-3-FAULTCRASH: F0: cpp_ha: CPP 0.0 unresolved fault detected, initiating crash dump.

 

I don't know if the suggested bug IDs apply to your device.

 

Aug 3 03:45:11.980: %CPPHA-3-FAULTCRASH: F0: cpp_ha: CPP 0.0 unresolved fault detected, initiating crash dump.


Aug 3 03:45:12.009: %IOSXE-1-PLATFORM: F0: kernel: QFP0.0: Fatal Fault: HW reported: QFP interrupt


Aug 3 03:45:11.992: %IOSXE-3-PLATFORM: F0: cpp_cdm: CPP crashed, collecting state.


Aug 3 03:45:11.993: %CPPHA-3-FAULTCRASH: F0: cpp_ha: CPP 0.0 unresolved fault detected, initiating crash dump


Aug 3 03:45:11.993: %CPPDRV-6-INTR: F0: cpp_driver: CPP10(0) Interrupt : 21-Aug-03 05:45:11.965948

 

Verify which parts of your configuration could be causing this, for example by removing the BRAS features and checking whether the system is stable without them.

 

Hope to help

Giuseppe

 

Yes, thank you, that's how we are doing it now, but it is time-consuming. The worst thing is that this is a production environment, and we cannot really afford several device reboots during the day. Do you have any other ideas?

mgrabovs
Cisco Employee

Hello Krzysztof,

 

I checked the logs you shared and decoded the tracebacks, but they do not point me to a conclusive cause of what is happening.

Since you say the router restarts after some time of correct operation, this suggests to me that we might be running into some sort of resource exhaustion on the router.

 

From these few log lines that you shared:

Aug 3 03:45:11.980: %CPPHA-3-FAULTCRASH: F0: cpp_ha: CPP 0.0 unresolved fault detected, initiating crash dump.
Aug 3 03:45:12.009: %IOSXE-1-PLATFORM: F0: kernel: QFP0.0: Fatal Fault: HW reported: QFP interrupt
Aug 3 03:45:11.992: %IOSXE-3-PLATFORM: F0: cpp_cdm: CPP crashed, collecting state.
Aug 3 03:45:11.993: %CPPHA-3-FAULTCRASH: F0: cpp_ha: CPP 0.0 unresolved fault detected, initiating crash dump.
Aug 3 03:45:11.993: %CPPDRV-6-INTR: F0: cpp_driver: CPP10(0) Interrupt : 21-Aug-03 05:45:11.965948 UTC+0200:INFP_INF_SWASSIST_LEAF_INT_INT_EVENT0

I can see that there is a QFP (Quantum Flow Processor) interrupt at the hardware level. For reference, the QFP is responsible for packet forwarding. We also have a crash on the CPP (Cisco Packet Processor), which will have generated CPP core files (check your bootflash filesystem).
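To make the order of events on the ESP easier to see in a long log like the one in the first post, the F0 fault messages can be filtered out and sorted by time. The following is only a minimal Python sketch, not a Cisco tool: the names `parse` and `esp_faults`, the regular expression, and the severity threshold are assumptions made for illustration, and the sort assumes all lines are from the same day.

```python
import re

# Matches IOS-XE syslog lines of the form seen in this thread, e.g.
#   Aug 3 03:45:11.975: %CPPDRV-3-LOCKDOWN: F0: fman_fp_image: ...
# A leading "." (as on some lines near the end of the log) is tolerated.
LINE_RE = re.compile(
    r"^\.?(?P<ts>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}\.\d{3}): "
    r"%(?P<facility>[A-Z0-9_]+)-(?P<sev>\d)-(?P<mnemonic>[A-Z0-9_]+): "
    r"(?P<rest>.*)$"
)


def parse(line):
    """Return a dict for a %FACILITY-SEV-MNEMONIC line, or None otherwise."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    fields = m.groupdict()
    fields["sev"] = int(fields["sev"])
    return fields


def esp_faults(lines, max_sev=3):
    """Messages reported by F0 with severity <= max_sev, sorted by time of day.

    Assumption: every line is from the same day, so comparing the
    HH:MM:SS.mmm strings is enough to order them.
    """
    hits = [
        p for p in map(parse, lines)
        if p and p["sev"] <= max_sev and p["rest"].startswith("F0:")
    ]
    return sorted(hits, key=lambda p: p["ts"].split()[-1])


# A few lines copied from the log in the first post.
SAMPLE = [
    "Aug 3 03:45:11.870: DHCPD: AAA id already present , AAA UID = 62947",
    "Aug 3 03:45:12.009: %IOSXE-1-PLATFORM: F0: kernel: QFP0.0: Fatal Fault: HW reported: QFP interrupt",
    "Aug 3 03:45:11.975: %CPPDRV-3-LOCKDOWN: F0: fman_fp_image: QFP0.0 CPP Driver LOCKDOWN encountered due to previous fatal error (HW: QFP interrupt).",
    "Aug 3 03:46:30.374: %BGP-3-NOTIFICATION: sent to neighbor 80.23.78.113 4/0 (hold time expired) 0 bytes",
]

for fault in esp_faults(SAMPLE):
    print(fault["ts"], fault["facility"], fault["mnemonic"])
```

Sorting matters here because messages forwarded from F0 do not always arrive in timestamp order, as the original log shows. The BGP, OSPF and interface messages are left out on purpose, since they appear after the ESP fault.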

 

In that case, if you have a SMARTnet contract for this router, you can open a service request with us so that we can look into this further and analyze the core files.

 

 

Thank you.

Marin Grabovschi

The presence of traceback messages in the logs reinforces my belief that this is a software problem and not a hardware problem (a traceback is always related to software issues, not hardware). The best thing for you is to open a case with Cisco TAC. The second best thing (if it is available to you) is to try a different version of code.

HTH

Rick

Sure, of course, TAC would certainly be helpful here, but in our case it would mean having to buy a support contract, and as you know, circumstances do not always allow for that. That's why I am trying to deal with it myself.

Thank you very much for this information. As I mentioned, I have already tried several different versions of the software and the problem persists. I have a question: do the ESP and SIP modules use separate software, or is it provided together with the software for the other modules?

Hello Krzysztof,

 

If you're asking if there's any additional software needed to be installed besides IOS-XE, the answer is no.

What you do have is firmware for the modules, which can also be upgraded. An example is in this post, where I explain in detail how to upgrade ROMMON, FPGA and CPLD:

Firmware upgrade fails on ASR 1004 

 

Thank you.

Marin Grabovschi