10-15-2024 03:32 AM - edited 10-15-2024 09:51 AM
Hi,
We're trying to get an NC57-36H6D-S line card (LC) working in an NCS5508 chassis. We upgraded the fabric cards and fan trays to the second-generation versions and installed the LC in the chassis alongside the older NC55-18H18F.
The fan trays and fabric cards are working OK, and the 18H18F is working as well, but there seems to be an issue with installing XR onto the new LC:
0/1/ADMIN0:Oct 14 22:38:08.890 CEST: inst_agent[2105]: %INFRA-INSTAGENT-4-XR_PART_PREP_RESP : SDR/XR partition preparation completed successfully
0/1/ADMIN0:Oct 14 22:38:26.214 CEST: vm_manager[2162]: %INFRA-VM_MANAGER-4-INFO : Info: vm_manager started VM default-sdr--1
LC/0/1/CPU0:Oct 14 22:38:59.576 CEST: fia_driver[294]: BCM-DPA: Optics Driver connection not established yet,Allow couple of min to establish
LC/0/1/CPU0:Oct 14 22:39:21.843 CEST: processmgr[51]: %OS-SYSMGR-6-INFO : pm_audit_tier: ack not received for sysmgr shutdown event from :bfd_agent(256)
LC/0/1/CPU0:Oct 14 22:39:21.964 CEST: syslog_dev[111]: sdr_instagt[321] PID-2928: RL: waiting for completion of shutdown delay 30 secs, pending secs 3
LC/0/1/CPU0:Oct 14 22:39:21.964 CEST: sdr_instagt[321]: %INFRA-REBOOT_LIB-6-SHUTDOWN_DELAY : Waiting for completion of shutdown delay 30 secs, pending 3 secs
LC/0/1/CPU0:Oct 14 22:39:22.637 CEST: fia_driver[294]: %PLATFORM-OFA-6-INFO : NPU #0 Initialization Completed
LC/0/1/CPU0:Oct 14 22:39:24.964 CEST: syslog_dev[111]: sdr_instagt[321] PID-2928: RL: Reboot initiated with code 36, cause Reboot triggered by failed install operation reboot_timeout 30 shutdown delay 30
LC/0/1/CPU0:Oct 14 22:39:24.964 CEST: sdr_instagt[321]: %INFRA-REBOOT_LIB-5-REBOOT_INITIATED : Reboot initiated with code:36 cause:'Reboot triggered by failed install operation' reboot timeout:30 shutdown delay: 30
LC/0/1/CPU0:Oct 14 22:39:24.964 CEST: sdr_instagt[321]: %INFRA-REBOOT_LIB-5-REBOOT_STARTED : Started processing reboot request
LC/0/1/CPU0:Oct 14 22:39:24.964 CEST: syslog_dev[111]: sdr_instagt[321] PID-2928: RL: Shutdown initiated
LC/0/1/CPU0:Oct 14 22:39:24.965 CEST: sdr_instagt[321]: %INFRA-REBOOT_LIB-6-INVMGR_CONN_SUCCESS : Connected successfully to invmgr
LC/0/1/CPU0:Oct 14 22:39:24.965 CEST: syslog_dev[111]: sdr_instagt[321] PID-4185: Query the node to be reloaded
LC/0/1/CPU0:Oct 14 22:39:24.966 CEST: syslog_dev[111]: sdr_instagt[321] PID-4185: Invmgr : Obj not found nobjs = 0
LC/0/1/CPU0:Oct 14 22:39:24.968 CEST: syslog_dev[111]: sdr_instagt[321] PID-4185: sending stop hb
LC/0/1/CPU0:Oct 14 22:39:24.968 CEST: syslog_dev[111]: sdr_instagt[321] PID-4185: Cause: Reboot triggered by failed install operation
LC/0/1/CPU0:Oct 14 22:39:24.968 CEST: syslog_dev[111]: sdr_instagt[321] PID-4185: VM IP addr sent for reload 192.0.8.3
LC/0/1/CPU0:Oct 14 22:39:24.969 CEST: sdr_instagt[321]: %INFRA-REBOOT_LIB-6-SDRNM_INVOKED : SDR NM invoked to reload VM 192.0.8.3
0/1/ADMIN0:Oct 14 22:39:31.373 CEST: vm_manager[2162]: %INFRA-VM_MANAGER-4-INFO : Info: vm_manager brought down VM default-sdr--1
In the admin VM, the line card is shown as operational:
admin show plat
Tue Oct 15 12:31:53.018 CEST
Location  Card Type      HW State      SW State      Config State
----------------------------------------------------------------------------
0/0       NC55-18H18F    OPERATIONAL   OPERATIONAL   NSHUT
0/1       NC57-36H6D-S   OPERATIONAL   OPERATIONAL   NSHUT
Looking at the logs, however, we see this output:
0/1/ADMIN0:Oct 15 03:43:38.401 : inst_agent[2105]: %INFRA-INSTAGENT-4-XR_PART_PREP_RESP : SDR/XR partition preparation completed successfully
0/1/ADMIN0:Oct 15 04:02:40.888 : sdr_mgr[2140]: %SM-SDR_MANAGER-3-MSG_VMM_ERROR : VMM returned error (SDR NM : sdr default-sdr vmid 1 start failed with error VM_MANAGER_ERR_UNABLE_TO_START_DOMAIN)
0/1/ADMIN0:Oct 15 04:02:40.888 : vm_manager[2162]: %INFRA-VM_MANAGER-3-MSG_VM_START_ERROR : Unable to start VM default-sdr--1 (virDomainCreate() call returned error -1)
0/RP0/ADMIN0:Oct 15 10:46:35.729 : shelf_mgr[2107]: %INFRA-SHELF_MGR-6-USER_ACTION : User admin(192.0.108.4) requested CLI action 'force card reload' for location 0/1
Sometimes (not always) the output also displays this:
LC/0/1/CPU0:Oct 15 13:51:27.438 CEST: arp[292]: %OS-EVM-3-EVM_CONTEXT : event_async_attach: failed to retrieve the evm_context. error(22)Invalid argument : arp : (PID=4237) : -Traceback= 7fe133212a8f 557f7c029395
The partition is created and the LC appears to be about to start, but it is then brought down by an unspecified "failed install operation".
We've tried to reimage the card with the "hw-module location 0/1/ bootmedia network reload" CLI, but the results are the same. We also tried restarting the chassis with only the NC57-36H6D inserted, and again observed the same behaviour. Right now we're running 7.7.21, which should support this LC.
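In case it helps anyone reproduce this, here is a rough sketch of how the captured logs can be filtered down to the error-level entries. It assumes the usual `%CATEGORY-GROUP-SEVERITY-MNEMONIC` message tag, where a lower severity number is more severe. The function name `errors_only` and the severity cutoff of 3 are my own choices for illustration, not anything provided by XR.

```python
import re

# Message tag as it appears in the logs above: %CATEGORY-GROUP-SEVERITY-MNEMONIC
# (severity is 0-7; lower numbers are more severe).
TAG = re.compile(r"%([A-Z0-9_]+)-([A-Z0-9_]+)-([0-7])-([A-Z0-9_]+)")

def errors_only(lines, max_severity=3):
    """Return (severity, mnemonic, line) for lines at or below max_severity."""
    hits = []
    for line in lines:
        m = TAG.search(line)
        if m and int(m.group(3)) <= max_severity:
            hits.append((int(m.group(3)), m.group(4), line.strip()))
    return hits

# A few lines copied from the output in this post.
sample = [
    "0/1/ADMIN0:Oct 15 03:43:38.401 : inst_agent[2105]: %INFRA-INSTAGENT-4-XR_PART_PREP_RESP : SDR/XR partition preparation completed successfully",
    "0/1/ADMIN0:Oct 15 04:02:40.888 : sdr_mgr[2140]: %SM-SDR_MANAGER-3-MSG_VMM_ERROR : VMM returned error (SDR NM : sdr default-sdr vmid 1 start failed with error VM_MANAGER_ERR_UNABLE_TO_START_DOMAIN)",
    "0/1/ADMIN0:Oct 15 04:02:40.888 : vm_manager[2162]: %INFRA-VM_MANAGER-3-MSG_VM_START_ERROR : Unable to start VM default-sdr--1 (virDomainCreate() call returned error -1)",
    "LC/0/1/CPU0:Oct 14 22:39:24.964 CEST: sdr_instagt[321]: %INFRA-REBOOT_LIB-5-REBOOT_INITIATED : Reboot initiated with code:36 cause:'Reboot triggered by failed install operation' reboot timeout:30 shutdown delay: 30",
]

for severity, mnemonic, line in errors_only(sample):
    print(severity, mnemonic)
```

On the sample lines this leaves only the two severity-3 VM start errors from the admin VM. The reboot message itself is logged at severity 5, so it is filtered out unless `max_severity` is raised.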
I confirmed that the device is working in compatibility mode through the "show hw-module profile npu-operating-mode" CLI.
Any suggestions on what we can try next to get this LC to boot, or to pinpoint exactly what is failing to install?