08-09-2021 12:29 AM
Dear group members, I am asking for your help with my Cisco ASR1004. I use the device as a BGP and BRAS router. The network is not very big: about 4,000 users.
The problem is that, after a few hours of correct operation, the device stops responding. The console remains available, but everything else stops working completely: BGP sessions drop, there is no communication on the interfaces, and after a few moments the device restarts.
I would appreciate any hint as to what the problem may be.
Below is the hardware configuration of my ASR.
NAME: "Chassis", DESCR: "Cisco ASR1004 Chassis"
PID: ASR1004 , VID: V02 , SN: FOX1544G2H9

NAME: "module 0", DESCR: "Cisco ASR1000 SPA Interface Processor 40"
PID: ASR1000-SIP40 , VID: V08 , SN: JAE1LCK08FO

NAME: "module R0", DESCR: "Cisco ASR1000 Route Processor 2"
PID: ASR1000-RP2 , VID: V04 , SN: JAE1781239A

NAME: "module F0", DESCR: "Cisco ASR1000 Embedded Services Processor, 20Gbps"
PID: ASR1000-ESP20 , VID: V04 , SN: JAE2017063A

NAME: "Power Supply Module 0", DESCR: "Cisco ASR1004 AC Power Supply"
PID: ASR1004-PWR-AC , VID: V03 , SN: ART1721B057

NAME: "Power Supply Module 1", DESCR: "Cisco ASR1004 AC Power Supply"
PID: ASR1004-PWR-AC , VID: V03 , SN: ART1724A06C
Here is the log file where you can see the problem described.
Aug 3 03:45:11.870: DHCPD: AAA id already present , AAA UID = 62947
Aug 3 03:45:11.870: DHCPD: AAA id already present , AAA UID = 62948
Aug 3 03:45:11.870: DHCPD: AAA id already present , AAA UID = 62949
Aug 3 03:45:11.870: DHCPD: AAA id already present , AAA UID = 62950
Aug 3 03:45:11.975: %CPPDRV-3-LOCKDOWN: F0: fman_fp_image: QFP0.0 CPP Driver LOCKDOWN encountered due to previous fatal error (HW: QFP interrupt).
Aug 3 03:45:11.979: %CPPHA-3-FAULT: F0: cpp_ha: CPP:0.0 desc:INFP_INF_SWASSIST_LEAF_INT_INT_EVENT0 det:DRVR(interrupt) class:OTHER sev:FATAL id:2121 cppstate:RUNNING res:UNKNOWN flags:0x7 cdmflags:0x8
Aug 3 03:45:11.980: %CPPOSLIB-3-ERROR_NOTIFY: F0: cpp_ha: cpp_ha encountered an error -Traceback= 1#e3468b513498723f47824eee70a65a93 errmsg:E705000+E90 cpp_common_os:FF58000+B7FC cpp_common_os:FF58000+B28C cpp_common_os:FF58000+197E0 cpp_drv_cmn:F83E000+39CA0 cpp_dmap:FCCC000+353F4 :10000000+26648 :10000000+26A80 :10000000+278A0 :10000000+E530 :10000000+12630 :10000000+C2C8 cpp_common_os:FF58000+109AC cpp_common_os:FF58000+10FA0 evlib:E054000+A214 evlib:E054000+A88C cpp_common_os:FF58000+137C4 :10000000+6378 c:B990000+21EB0 c:B990000+2
Aug 3 03:45:11.980: %CPPHA-3-FAULTCRASH: F0: cpp_ha: CPP 0.0 unresolved fault detected, initiating crash dump.
Aug 3 03:45:12.009: %IOSXE-1-PLATFORM: F0: kernel: QFP0.0: Fatal Fault: HW reported: QFP interrupt
Aug 3 03:45:11.992: %IOSXE-3-PLATFORM: F0: cpp_cdm: CPP crashed, collecting state.
Aug 3 03:45:11.993: %CPPHA-3-FAULTCRASH: F0: cpp_ha: CPP 0.0 unresolved fault detected, initiating crash dump.
Aug 3 03:45:11.993: %CPPDRV-6-INTR: F0: cpp_driver: CPP10(0) Interrupt : 21-Aug-03 05:45:11.965948 UTC+0200:INFP_INF_SWASSIST_LEAF_INT_INT_EVENT0
Aug 3 03:45:12.022: %CPPOSLIB-3-ERROR_NOTIFY: F0: fman_fp_image: fman_fp encountered an error -Traceback= 1#bf47857df3310dafed593c896c489be7 errmsg:B08A6000+E90 cpp_common_os:576000+B7FC cpp_common_os:576000+B28C cpp_common_os:576000+197E0 cpp_plutlu_common:6F4000+CDE8 cpp_plutlu_common:6F4000+102F4 cpp_plutlu_common:6F4000+10CC8 cpp_plutlu_common:6F4000+E6B4 cpp_cef_mpls_common:7CC000+31740 cpp_cef_mpls_common:7CC000+2E25C :9CF000+8BCC78 :9CF000+47DEA4 aobjman:B2EE2000+10B50 :9CF000+64596C evlib:A9A11000+A608 evlib:A9A11000+A88C :9CF000+39C4
Aug 3 03:45:12.610: %IOSXE_OIR-6-OFFLINECARD: Card (fp) offline in slot F0
Aug 3 03:45:12.612: %IOSXE_RP_ALARM-2-ESP: ASSERT CRITICAL module R0 No Working ESP
Aug 3 03:45:12.661: %CPPDRV-3-LOCKDOWN: F0: cpp_cp: QFP0.0 CPP Driver LOCKDOWN encountered due to previous fatal error (HW: QFP interrupt).
Aug 3 03:45:13.876: %IDMGR-3-INVALID_ID: bad id in id_to_ptr (bad id) (id: 0x0) -Traceback= 1#67534267a135b52247fe4f0266b2715a :55C8359DF000+60E842B :55C8359DF000+6243F41 :55C8359DF000+61EB435 :55C8359DF000+61EB872
Aug 3 03:45:13.876: DHCPD: AAA id already present , AAA UID = 62953
Aug 3 03:45:13.877: DHCPD: AAA id already present , AAA UID = 62954
Aug 3 03:45:13.877: DHCPD: AAA id already present , AAA UID = 62955
Aug 3 03:45:13.877: DHCPD: AAA id already present , AAA UID = 62956
Aug 3 03:45:13.877: DHCPD: AAA id already present , AAA UID = 62957
Aug 3 03:45:13.877: DHCPD: AAA id already present , AAA UID = 62958
Aug 3 03:45:13.877: DHCPD: AAA id already present , AAA UID = 62959
Aug 3 03:45:18.314: %CPPCDM-3-ERROR_NOTIFY: F0: cpp_cdm: QFP 0 thread 68 encountered an error -Traceback= 1#f1bdf7eeb0892846795259791b12836a 806CC94F 806CCAC5 806CCB8D 830325C4 80AD65F4 80AD65FE 80020064 80020055 80000000
Aug 3 03:45:18.365: %IOSXE-3-PLATFORM: F0: cpp_cdm: CPP crashed, generating core file.
Aug 3 03:45:47.566: %OSPF-5-ADJCHG: Process 6, Nbr 172.16.3.96 on TenGigabitEthernet0/1/0.24 from FULL to DOWN, Neighbor Down: Dead timer expired
Aug 3 03:45:49.574: %OSPF-5-ADJCHG: Process 6, Nbr 10.6.110.33 on TenGigabitEthernet0/1/0.24 from FULL to DOWN, Neighbor Down: Dead timer expired
Aug 3 03:46:17.022: %FMANRP-3-PEER_IPC_STUCK: R0/0: fman_rp: IPC to fman-log-bay0-peer0 is stuck for more than 30 seconds
Aug 3 03:46:18.063: %IDMGR-3-INVALID_ID: bad id in id_to_ptr (bad id) (id: 0x0) -Traceback= 1#67534267a135b52247fe4f0266b2715a :55C8359DF000+60E842B :55C8359DF000+6243F41 :55C8359DF000+61EB435 :55C8359DF000+61EB872
Aug 3 03:46:30.374: %BGP-3-NOTIFICATION: sent to neighbor 80.23.78.113 4/0 (hold time expired) 0 bytes
Aug 3 03:46:30.374: %BGP-5-NBR_RESET: Neighbor 80.23.78.113 reset (BGP Notification sent)
Aug 3 03:46:33.989: %BGP-5-ADJCHANGE: neighbor 80.23.78.113 Down BGP Notification sent
Aug 3 03:46:33.989: %BGP_SESSION-5-ADJCHANGE: neighbor 80.23.78.113 IPv4 Unicast topology base removed from session BGP Notification sent
Aug 3 03:46:38.892: %PIM-5-NBRCHG: neighbor 10.193.221.33 DOWN on interface BDI891 non DR
Aug 3 03:46:42.764: %BGP-3-NOTIFICATION: sent to neighbor 180.11.8.72 4/0 (hold time expired) 0 bytes
Aug 3 03:46:42.764: %BGP-5-NBR_RESET: Neighbor 180.11.8.72 reset (BGP Notification sent)
Aug 3 03:46:47.556: %BGP-5-ADJCHANGE: neighbor 180.11.8.72 Down BGP Notification sent
Aug 3 03:46:47.556: %BGP_SESSION-5-ADJCHANGE: neighbor 180.11.8.72 IPv4 Unicast topology base removed from session BGP Notification sent
Aug 3 03:46:59.431: %SSH-3-RSA_SIGN_FAIL: Signature connection failed, status 3
Aug 3 03:46:59.431: %SSH-3-RSA_SIGN_FAIL: Signature creation failed, status 21
Aug 3 03:46:59.431: %SYS-6-LOGOUT: User admin has exited tty session 2(10.250.0.2)
Aug 3 03:47:11.739: %SSH-3-RSA_SIGN_FAIL: Signature connection failed, status 3
Aug 3 03:47:11.739: %SSH-3-RSA_SIGN_FAIL: Signature creation failed, status 21
Aug 3 03:47:12.239: %SYS-5-CONFIG_P: Configured programmatically by process VTEMPLATE Background Mgr from console as console
Aug 3 03:47:12.342: %SYS-5-CONFIG_P: Configured programmatically by process VTEMPLATE Background Mgr from console as console
Aug 3 03:47:18.036: %SYS-5-CONFIG_P: Configured programmatically by process VTEMPLATE Background Mgr from console as console
Aug 3 03:47:18.672: %SYS-5-CONFIG_P: Configured programmatically by process VTEMPLATE Background Mgr from console as console
Aug 3 03:47:18.819: %IDMGR-3-INVALID_ID: bad id in id_to_ptr (bad id) (id: 0x0) -Traceback= 1#67534267a135b52247fe4f0266b2715a :55C8359DF000+60E842B :55C8359DF000+6243F41 :55C8359DF000+61EB435 :55C8359DF000+61EB872
Aug 3 03:47:19.405: %SYS-5-CONFIG_P: Configured programmatically by process VTEMPLATE Background Mgr from console as console
Aug 3 03:47:23.456: %SYS-5-CONFIG_P: Configured programmatically by process VTEMPLATE Background Mgr from console as console
Aug 3 03:47:27.168: %SYS-5-CONFIG_P: Configured programmatically by process VTEMPLATE Background Mgr from console as console
Aug 3 03:47:27.414: %SYS-5-CONFIG_P: Configured programmatically by process VTEMPLATE Background Mgr from console as console
Aug 3 03:47:28.142: %SYS-5-CONFIG_P: Configured programmatically by process VTEMPLATE Background Mgr from console as console
Aug 3 03:47:29.205: %SYS-5-CONFIG_P: Configured programmatically by process VTEMPLATE Background Mgr from console as console
Aug 3 03:47:29.452: %SYS-5-CONFIG_P: Configured programmatically by process VTEMPLATE Background Mgr from console as console
Aug 3 03:47:29.200: %CPPHA-3-CDMDONE: F0: cpp_ha: CPP 0 microcode crashdump creation completed.
Aug 3 03:47:29.205: %IOSXE-6-PLATFORM: F0: cpp_cdm: Shutting down CPP MDM while client(s) still connected
Aug 3 03:47:29.207: %IOSXE-6-PLATFORM: F0: cpp_ha: Shutting down CPP MDM while client(s) still connected
Aug 3 03:47:29.213: %IOSXE-6-PLATFORM: F0: cpp_ha: Shutting down CPP CDM while client(s) still connected
Aug 3 03:47:29.749: %SYS-5-CONFIG_P: Configured programmatically by process VTEMPLATE Background Mgr from console as console
Aug 3 03:47:29.533: %PMAN-3-PROCHOLDDOWN: F0: root: The process cpp_cdm_svr has been helddown (rc 69)
Aug 3 03:47:29.541: %PMAN-3-PROCHOLDDOWN: F0: root: The process cpp_ha_top_level_server has been helddown (rc 69)
Aug 3 03:47:31.827: %SYS-5-CONFIG_P: Configured programmatically by process VTEMPLATE Background Mgr from console as console
Aug 3 03:47:32.650: %SYS-5-CONFIG_P: Configured programmatically by process VTEMPLATE Background Mgr from console as console
Aug 3 03:47:47.580: %SCHED-3-THRASHING: Process thrashing on watched message event. -Process= "DHCPD Receive", ipl= 2, pid= 687 -Traceback= 1#67534267a135b52247fe4f0266b2715a :55C8359DF000+955A7E3 :55C8359DF000+955A1F0 :55C8359DF000+61EA735 :55C8359DF000+61EB872
Aug 3 03:47:50.302: %BGP-3-NOTIFICATION: sent to neighbor 89.46.144.11 4/0 (hold time expired) 0 bytes
Aug 3 03:47:50.303: %BGP-5-NBR_RESET: Neighbor 89.46.144.11 reset (BGP Notification sent)
Aug 3 03:47:51.131: %BGP-5-ADJCHANGE: neighbor 89.46.144.11 Down BGP Notification sent
Aug 3 03:47:51.131: %BGP_SESSION-5-ADJCHANGE: neighbor 89.46.144.11 IPv4 Unicast topology base removed from session BGP Notification sent
Aug 3 03:47:52.386: %BGP-3-NOTIFICATION: sent to neighbor 89.46.144.12 4/0 (hold time expired) 0 bytes
Aug 3 03:47:52.386: %BGP-5-NBR_RESET: Neighbor 89.46.144.12 reset (BGP Notification sent)
Aug 3 03:47:53.151: %BGP-5-ADJCHANGE: neighbor 89.46.144.12 Down BGP Notification sent
Aug 3 03:47:53.151: %BGP_SESSION-5-ADJCHANGE: neighbor 89.46.144.12 IPv4 Unicast topology base removed from session BGP Notification sent
Aug 3 03:48:09.819: %BGP-3-NOTIFICATION: sent to neighbor 194.204.233.1 4/0 (hold time expired) 0 bytes
Aug 3 03:48:09.819: %BGP-3-NOTIFICATION: sent to neighbor 194.204.232.1 4/0 (hold time expired) 0 bytes
Aug 3 03:48:09.819: %BGP-5-NBR_RESET: Neighbor 194.204.233.1 reset (BGP Notification sent)
Aug 3 03:48:09.819: %BGP-5-NBR_RESET: Neighbor 194.204.232.1 reset (BGP Notification sent)
Aug 3 03:48:11.309: %BGP-5-ADJCHANGE: neighbor 194.204.232.1 Down BGP Notification sent
Aug 3 03:48:11.309: %BGP_SESSION-5-ADJCHANGE: neighbor 194.204.232.1 IPv4 Unicast topology base removed from session BGP Notification sent
Aug 3 03:48:11.309: %BGP-5-ADJCHANGE: neighbor 194.204.233.1 Down BGP Notification sent
Aug 3 03:48:11.309: %BGP_SESSION-5-ADJCHANGE: neighbor 194.204.233.1 IPv4 Unicast topology base removed from session BGP Notification sent
Aug 3 03:48:11.973: %BGP-3-NOTIFICATION: sent to neighbor 94.246.185.198 4/0 (hold time expired) 0 bytes
Aug 3 03:48:11.974: %BGP-5-NBR_RESET: Neighbor 94.246.185.198 reset (BGP Notification sent)
Aug 3 03:48:15.218: %BGP-5-ADJCHANGE: neighbor 94.246.185.198 Down BGP Notification sent
Aug 3 03:48:15.218: %BGP_SESSION-5-ADJCHANGE: neighbor 94.246.185.198 IPv4 Unicast topology base removed from session BGP Notification sent
Aug 3 03:48:24.324: %IDMGR-3-INVALID_ID: bad id in id_to_ptr (id: 0xDEADBEEF) -Traceback= 1#67534267a135b52247fe4f0266b2715a :55C8359DF000+60E8460 :55C8359DF000+6243F41 :55C8359DF000+61EB435 :55C8359DF000+61EB872
Aug 3 03:49:02.166: %PMAN-3-PROCHOLDDOWN: F0: root: The process fman_fp_image has been helddown (rc 134)
Aug 3 03:49:02.454: %SCHED-3-THRASHING: Process thrashing on watched message event. -Process= "DHCPD Receive", ipl= 2, pid= 687 -Traceback= 1#67534267a135b52247fe4f0266b2715a :55C8359DF000+955A7E3 :55C8359DF000+955A1F0 :55C8359DF000+61EA735 :55C8359DF000+61EB872
Aug 3 03:49:10.009: %IOSXE_SPA-6-UPDOWN: Interface TenGigabitEthernet0/0/0, link down due to local fault
Aug 3 03:49:12.010: %LINK-3-UPDOWN: Interface TenGigabitEthernet0/0/0, changed state to down
Aug 3 03:49:12.012: %IOSXE_RP_ALARM-6-INFO: ASSERT CRITICAL TenGigabitEthernet0/0/0 Physical Port Link Down
Aug 3 03:49:13.009: %LINEPROTO-5-UPDOWN: Line protocol on Interface TenGigabitEthernet0/0/0, changed state to down
Aug 3 03:49:12.008: %LINK-3-UPDOWN: SIP0/0: Interface TenGigabitEthernet0/0/0, changed state to down
Aug 3 03:49:13.598: %IOSXE_SPA-6-UPDOWN: Interface TenGigabitEthernet0/1/0, link down due to local fault
Aug 3 03:49:15.597: %LINK-3-UPDOWN: Interface TenGigabitEthernet0/1/0, changed state to down
Aug 3 03:49:15.598: %IOSXE_RP_ALARM-6-INFO: ASSERT CRITICAL TenGigabitEthernet0/1/0 Physical Port Link Down
Aug 3 03:49:15.667: %LINK-3-UPDOWN: Interface BDI890, changed state to down
Aug 3 03:49:15.717: %LINK-3-UPDOWN: Interface BDI891, changed state to down
Aug 3 03:49:15.719: %LINEPROTO-5-UPDOWN: Line protocol on Interface Tunnel0, changed state to down
Aug 3 03:49:16.598: %LINEPROTO-5-UPDOWN: Line protocol on Interface TenGigabitEthernet0/1/0, changed state to down
Aug 3 03:49:15.597: %LINK-3-UPDOWN: SIP0/1: Interface TenGigabitEthernet0/1/0, changed state to down
Aug 3 03:49:16.666: %LINEPROTO-5-UPDOWN: Line protocol on Interface BDI890, changed state to down
Aug 3 03:49:16.690: %IOSXE_SPA-6-UPDOWN: Interface TenGigabitEthernet0/2/0, link down due to local fault
Aug 3 03:49:16.716: %LINEPROTO-5-UPDOWN: Line protocol on Interface BDI891, changed state to down
Aug 3 03:49:17.025: %FMANRP-3-PEER_IPC_RESUME: R0/0: fman_rp: IPC to fman-log-bay0-peer0 has returned to normal after previous stuck
Aug 3 03:49:18.690: %LINK-3-UPDOWN: Interface TenGigabitEthernet0/2/0, changed state to down
Aug 3 03:49:18.691: %IOSXE_RP_ALARM-6-INFO: ASSERT CRITICAL TenGigabitEthernet0/2/0 Physical Port Link Down
Aug 3 03:49:18.793: %LINK-3-UPDOWN: Interface BDI20, changed state to down
Aug 3 03:49:18.688: %LINK-3-UPDOWN: SIP0/2: Interface TenGigabitEthernet0/2/0, changed state to down
Aug 3 03:49:19.690: %LINEPROTO-5-UPDOWN: Line protocol on Interface TenGigabitEthernet0/2/0, changed state to down
Aug 3 03:49:19.792: %LINEPROTO-5-UPDOWN: Line protocol on Interface BDI20, changed state to down
Aug 3 03:49:24.472: %IDMGR-3-INVALID_ID: bad id in id_to_ptr (bad id) (id: 0x0) -Traceback= 1#67534267a135b52247fe4f0266b2715a :55C8359DF000+60E842B :55C8359DF000+6243F41 :55C8359DF000+61EB435 :55C8359DF000+61EB872
Aug 3 03:50:04.844: %SSH-3-RSA_SIGN_FAIL: Signature connection failed, status 3
Aug 3 03:50:04.844: %SSH-3-RSA_SIGN_FAIL: Signature creation failed, status 21
Aug 3 03:50:07.648: %SSH-3-RSA_SIGN_FAIL: Signature connection failed, status 3
Aug 3 03:50:07.648: %SSH-3-RSA_SIGN_FAIL: Signature creation failed, status 21
Aug 3 03:50:08.555: %SCHED-3-THRASHING: Process thrashing on watched message event. -Process= "DHCPD Receive", ipl= 2, pid= 687 -Traceback= 1#67534267a135b52247fe4f0266b2715a :55C8359DF000+955A7E3 :55C8359DF000+955A1F0 :55C8359DF000+61EA735 :55C8359DF000+61EB872
Aug 3 03:50:24.562: %IDMGR-3-INVALID_ID: bad id in id_to_ptr (id: 0x44484350) -Traceback= 1#67534267a135b52247fe4f0266b2715a :55C8359DF000+60E8460 :55C8359DF000+6243F41 :55C8359DF000+61EB435 :55C8359DF000+61EB872
Aug 3 03:51:24.577: %IDMGR-3-INVALID_ID: bad id in id_to_ptr (id: 0x44484350) -Traceback= 1#67534267a135b52247fe4f0266b2715a :55C8359DF000+60E8460 :55C8359DF000+6243F41 :55C8359DF000+61EB435 :55C8359DF000+61EB872
Aug 3 03:51:44.980: %PMAN-3-PROCESS_NOTIFICATION: F0: pvp: System report core/Router-ASR_ESP_0-system-report_20210803-054906-CEST.tar.gz (size: 121849 KB) generated and System report info at core/Router-ASR_ESP_0-system-report_20210803-054906-CEST-info.txt
Aug 3 03:51:45.574: %IOSXE-6-PLATFORM: F0: cpp_sp: Shutting down CPP MDM while client(s) still connected
Aug 3 03:51:48.584: %SCHED-3-THRASHING: Process thrashing on watched message event. -Process= "DHCPD Receive", ipl= 2, pid= 687 -Traceback= 1#67534267a135b52247fe4f0266b2715a :55C8359DF000+955A7E3 :55C8359DF000+955A1F0 :55C8359DF000+61EA735 :55C8359DF000+61EB872
Aug 3 03:52:12.134: %IPRT-3-RIB_LOOP: Resolution loop formed by routes in RIB
Aug 3 03:52:24.594: %IDMGR-3-INVALID_ID: bad id in id_to_ptr (id: 0xDEADBEEF) -Traceback= 1#67534267a135b52247fe4f0266b2715a :55C8359DF000+60E8460 :55C8359DF000+6243F41 :55C8359DF000+61EB435 :55C8359DF000+61EB872
Aug 3 03:53:24.622: %IDMGR-3-INVALID_ID: bad id in id_to_ptr (id: 0x44484350) -Traceback= 1#67534267a135b52247fe4f0266b2715a :55C8359DF000+60E8460 :55C8359DF000+6243F41 :55C8359DF000+61EB435 :55C8359DF000+61EB872
Aug 3 03:53:28.624: %SCHED-3-THRASHING: Process thrashing on watched message event. -Process= "DHCPD Receive", ipl= 2, pid= 687 -Traceback= 1#67534267a135b52247fe4f0266b2715a :55C8359DF000+955A7E3 :55C8359DF000+955A1F0 :55C8359DF000+61EA735 :55C8359DF000+61EB872
Aug 3 03:53:41.661: %IOSXE_OIR-6-ONLINECARD: Card (fp) online in slot F0
Aug 3 03:53:41.665: %IOSXE_RP_ALARM-2-ESP: CLEAR CRITICAL module R0 No Working ESP
Aug 3 03:54:24.637: %IDMGR-3-INVALID_ID: bad id in id_to_ptr (id: 0x6D656675) -Traceback= 1#67534267a135b52247fe4f0266b2715a :55C8359DF000+60E8460 :55C8359DF000+6243F41 :55C8359DF000+61EB435 :55C8359DF000+61EB872
Aug 3 03:55:05.104: %SEC_LOGIN-5-LOGIN_SUCCESS: Login Success [user: admin] [Source: 10.250.0.1] [localport: 22] at 05:55:05 CEST Tue Aug 3 2021
Aug 3 03:55:05.773: %SYS-6-LOGOUT: User admin has exited tty session 2(10.250.0.1)
Aug 3 03:55:08.650: %SCHED-3-THRASHING: Process thrashing on watched message event. -Process= "DHCPD Receive", ipl= 2, pid= 687 -Traceback= 1#67534267a135b52247fe4f0266b2715a :55C8359DF000+955A7E3 :55C8359DF000+955A1F0 :55C8359DF000+61EA735 :55C8359DF000+61EB872
Aug 3 03:55:09.333: %SEC_LOGIN-5-LOGIN_SUCCESS: Login Success [user: admin] [Source: 10.250.0.1] [localport: 22] at 05:55:09 CEST Tue Aug 3 2021
Aug 3 03:55:24.654: %IDMGR-3-INVALID_ID: bad id in id_to_ptr (id: 0xBEEFBACC) -Traceback= 1#67534267a135b52247fe4f0266b2715a :55C8359DF000+60E8460 :55C8359DF000+6243F41 :55C8359DF000+61EB435 :55C8359DF000+61EB872
.Aug 3 03:56:24.687: %IDMGR-3-INVALID_ID: bad id in id_to_ptr (id: 0xFFFFFFFF) -Traceback= 1#67534267a135b52247fe4f0266b2715a :55C8359DF000+60E8460 :55C8359DF000+6243F41 :55C8359DF000+61EB435 :55C8359DF000+61EB872
.Aug 3 03:56:26.272: %SEC_LOGIN-5-LOGIN_SUCCESS: Login Success [user: admin] [Source: 10.250.0.2] [localport: 22] at 05:56:26 CEST Tue Aug 3 2021
08-09-2021 01:05 AM
Contact TAC and get them to confirm if this is CSCvf11949.
08-09-2021 10:13 AM
Hmm, do you think this problem may still exist in this version: asr1000rpx86-universalk9.16.06.08.SPA.bin? From what I read, this bug affects version 15.X. What do you think?
08-09-2021 01:33 AM
- FYI : https://bst.cloudapps.cisco.com/bugsearch/bug/CSCtn02462
I can't see any resolved releases mentioned. Check your current software version and use the latest and/or advisory release; if the problem persists, contact Cisco TAC.
M.
08-09-2021 10:19 AM
Yes, thank you for your reply. Honestly, what you pointed out best fits my problem, but from what I can see the thread has been abandoned. What do you think I should do now?
08-09-2021 07:52 AM - edited 08-09-2021 07:57 AM
Hello @Krzysztof Jablonski ,
the first interesting line in the log in your initial post is:
Aug 3 03:45:11.980: %CPPHA-3-FAULTCRASH: F0: cpp_ha: CPP 0.0 unresolved fault detected, initiating crash dump.
I don't know if the suggested bug IDs apply to your device.
Aug 3 03:45:11.980: %CPPHA-3-FAULTCRASH: F0: cpp_ha: CPP 0.0 unresolved fault detected, initiating crash dump.
Aug 3 03:45:12.009: %IOSXE-1-PLATFORM: F0: kernel: QFP0.0: Fatal Fault: HW reported: QFP interrupt
Aug 3 03:45:11.992: %IOSXE-3-PLATFORM: F0: cpp_cdm: CPP crashed, collecting state.
Aug 3 03:45:11.993: %CPPHA-3-FAULTCRASH: F0: cpp_ha: CPP 0.0 unresolved fault detected, initiating crash dump
Aug 3 03:45:11.993: %CPPDRV-6-INTR: F0: cpp_driver: CPP10(0) Interrupt : 21-Aug-03 05:45:11.965948
Verify which parts of your configuration could be causing this, for example by removing the BRAS features and seeing whether the system is stable without them.
Hope to help
Giuseppe
08-09-2021 10:16 AM
Yes thank you, that's how we do it now, but it's time consuming. The worst thing is that it is a production environment, and we cannot really afford several device reboots during the day. Do you have something else on your mind?
08-09-2021 12:01 PM
Hello Krzysztof,
I checked the logs you shared and decoded the tracebacks, but they do not point me to a conclusive cause of what is happening.
Since you say the router restarts after some time of correct operation, this suggests that we might be running into some sort of resource exhaustion on the router.
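One rough way to test the exhaustion idea is to compare the time between crashes across several incidents. The sketch below is only an illustration: the function name is made up, and apart from the first timestamp (the 03:45 crash in the posted log) the crash times are hypothetical placeholders, since the thread contains only one crash. Fairly regular gaps would be consistent with something filling up gradually, although they would not prove it.

```python
from datetime import datetime
from statistics import mean, pstdev


def crash_intervals_hours(crash_times):
    """Return the gaps, in hours, between consecutive crash timestamps."""
    ordered = sorted(crash_times)
    return [(b - a).total_seconds() / 3600 for a, b in zip(ordered, ordered[1:])]


# Only the first entry comes from the posted log (%CPPHA-3-FAULTCRASH at 03:45).
# The other two are HYPOTHETICAL values, used purely to show the calculation.
crashes = [
    datetime(2021, 8, 3, 3, 45),
    datetime(2021, 8, 3, 9, 50),
    datetime(2021, 8, 3, 15, 40),
]

gaps = crash_intervals_hours(crashes)
print("gaps (h):", [round(g, 2) for g in gaps])
print("mean (h):", round(mean(gaps), 2), "spread (h):", round(pstdev(gaps), 2))
```

To use it with real data, replace the list with the timestamps of each %CPPHA-3-FAULTCRASH message collected from the router's logs.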
From these few log lines that you shared:
Aug 3 03:45:11.980: %CPPHA-3-FAULTCRASH: F0: cpp_ha: CPP 0.0 unresolved fault detected, initiating crash dump.
Aug 3 03:45:12.009: %IOSXE-1-PLATFORM: F0: kernel: QFP0.0: Fatal Fault: HW reported: QFP interrupt
Aug 3 03:45:11.992: %IOSXE-3-PLATFORM: F0: cpp_cdm: CPP crashed, collecting state.
Aug 3 03:45:11.993: %CPPHA-3-FAULTCRASH: F0: cpp_ha: CPP 0.0 unresolved fault detected, initiating crash dump.
Aug 3 03:45:11.993: %CPPDRV-6-INTR: F0: cpp_driver: CPP10(0) Interrupt : 21-Aug-03 05:45:11.965948 UTC+0200:INFP_INF_SWASSIST_LEAF_INT_INT_EVENT0
I can see that there's a QFP (Quantum Flow Processor) interrupt at the hardware level. For reference, the QFP is responsible for packet forwarding. We also have an actual crash of the CPP (Cisco Packet Processor), which will have generated CPP core files (check your bootflash filesystem).
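If the log has been saved as text, the ESP-side error messages can be pulled out and put in time order with a short script. This is a minimal sketch, not a Cisco tool: the function name and the regular expressions are my own assumptions based on the message layout in the log posted in this thread. It keeps only messages reported by slot F0 with severity 3 or lower and sorts them by timestamp, because the buffer does not always list them in order (12.009 appears before 11.992 above).

```python
import re
from datetime import datetime

# Timestamp prefix such as "Aug 3 03:45:11.980: " (an optional leading "." or "*"
# marks an unsynchronised clock on IOS devices).
TIMESTAMP = re.compile(r"[.*]?(?P<ts>[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2}\.\d{3}): ")
# Message tag such as "%CPPHA-3-FAULTCRASH: " (facility-severity-mnemonic).
TAG = re.compile(r"%(?P<fac>[A-Z0-9_]+)-(?P<sev>\d)-(?P<mnem>[A-Z0-9_]+): ")


def fatal_esp_events(text, slot="F0", max_severity=3):
    """Return (time, tag, body) tuples for messages from `slot`, sorted by time."""
    stamps = list(TIMESTAMP.finditer(text))
    events = []
    for i, stamp in enumerate(stamps):
        end = stamps[i + 1].start() if i + 1 < len(stamps) else len(text)
        rest = text[stamp.end():end].strip()
        tag = TAG.match(rest)
        if tag is None:
            continue  # debug lines such as "DHCPD: ..." carry no %FAC-SEV-MNEM tag
        body = rest[tag.end():]
        if int(tag["sev"]) <= max_severity and body.startswith(slot + ":"):
            # The log has no year, so strptime defaults to 1900; only ordering matters.
            # English month names are assumed (default C locale).
            when = datetime.strptime(stamp["ts"], "%b %d %H:%M:%S.%f")
            events.append((when, f"{tag['fac']}-{tag['sev']}-{tag['mnem']}", body))
    return sorted(events)


SAMPLE = (
    "Aug 3 03:45:11.980: %CPPHA-3-FAULTCRASH: F0: cpp_ha: CPP 0.0 unresolved fault "
    "detected, initiating crash dump. "
    "Aug 3 03:45:12.009: %IOSXE-1-PLATFORM: F0: kernel: QFP0.0: Fatal Fault: "
    "HW reported: QFP interrupt "
    "Aug 3 03:45:11.992: %IOSXE-3-PLATFORM: F0: cpp_cdm: CPP crashed, collecting state."
)

for when, name, body in fatal_esp_events(SAMPLE):
    print(when.strftime("%H:%M:%S.%f")[:-3], name, body)
```

The same function works on a log where messages run together on one line, because it splits on the timestamp pattern rather than on line breaks.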
In that case, if you have a SMARTnet contract for this router, you can open a service request with us so we can look further into this and analyze the core files.
Thank you.
Marin Grabovschi
08-09-2021 02:01 PM
The presence of traceback messages in the logs reinforces my belief that this is a software problem and not a hardware problem (a traceback is always related to software issues, not hardware). The best thing for you is to open a case with Cisco TAC. The second best thing (if it is available to you) is to try a different version of code.
08-10-2021 04:24 AM
Sure, of course, the TAC would certainly be helpful here, but in our case it means having to pay for it, and as you know, circumstances are not always conducive to meeting such needs.
08-10-2021 04:20 AM
Thank you very much for this information. As I mentioned, I have already tried several different versions of the software and the problem persists. I have a question: do the ESP and SIP engines use separate software, or is it provisioned by other modules?
08-10-2021 04:32 AM
Hello Krzysztof,
If you're asking whether any additional software needs to be installed besides IOS-XE, the answer is no.
What the modules do have is firmware, which can also be upgraded. An example is in this post, where I explain in detail how to upgrade ROMMON, FPGA and CPLD:
Firmware upgrade fails on ASR 1004
Thank you.
Marin Grabovschi