I need some help from the experts in this forum with a problem I'm facing.
I'm upgrading an ASR9912 with RP2, SFC2, Typhoon and Tomahawk cards (no Trident). The box is currently running 5.3.3 and the target release is 6.7.3 (32-bit), with 6.5.3 as an intermediate release because a direct 5.3.3 -> 6.7.3 upgrade is not supported.
To do this I ran 'install add tar usb:/6.5.3.tar activate' to install 6.5.3.
During the upgrade, some FPD upgrades reported failures while others succeeded, and the box then rebooted as part of the process once activation completed.
After the reload, both RP2s crashed and rebooted in a loop, with the messages shown in the text box further down.
Eventually, after about 5 reboots, one of the RP2s recovered, but the other did not recover at all.
I could see that the recovered RP was indeed running 6.5.3, but many FPDs had not been upgraded and a manual FPD upgrade attempt did not work. Since 6.5.3 was not committed, I reloaded the 9912 to boot the committed 5.3.3 software again, as this was the only option to bring the box back.
Clearly something went tragically wrong.
So the questions I have are:
!!! WARNING !! - Rommon booted from backup flash !!!
pcie_device_get_cnfg: Failed because of 'Subsystem(3290)' detected the 'warning' condition 'Code(4)' venid 0x8086 devid 0x10fc idx 0
pcie_device_get_cnfg: Failed because of 'Subsystem(3290)Failed to rename debug file, 18, src: /nvram:/sysmgr.log.timeout.Z, target: /nvram:/prev.sysmgr.log.timeout.Z
Nov 17 00:36:12.871 : SYSMGR_LITE: Saving init logs in /nvram:/sysmgr.log.timeout.Z ...
' detected the 'Nov 17 00:36:12.988 : SYSMGR_LITE: INIT: respawn 'vkg_dmac_svr' disabled, exit_code 40704, INIT_MAX_SPAWN reached
warning' conditireboot internal : cause code 671088647 cause INIT: respawn 'vkg_dmac_svr' disabled, exit_code 40704, INIT_MAX_SPAWN reached
Failed to rename debug file, 18, src: /nvram:/sysmgr.log.timeout.Z, target: /nvram:/prev.sysmgr.log.timeout.Z
reboot_internal timeout 30 is graceful no
No vR e1b7o 0o0t: o3n6 :1A3SR9912 RP2 (0x100326) in slot 0
By init via REBOOT_CAUSE_SYSMGR (2c000007)
Current time: 2021-11-17 00:36:13.463, Up time: 10s
A kernel core file was explicitly requested by process init
Reboot Reason: Cause code 0x2c000007
Cause: INIT: respawn 'vkg_dmac_svr' disabled, exit_code 40704, INIT_MAX_SPAWN reached
Process: init
Traceback: a892949 a892e62 a892c95 42073ce a7e0070 0
Active process(s):
proc/boot/procnto-smp-instr pid 1 tid 1 on cpu 0, pri 0
proc: fdfe4010, utime = 84838 ms, stime = 2015 ms
thread: fdfc4010, thread sutime = 10603 ms, pc = fe6ee84a
proc/boot/procnto-smp-instr pid 1 tid 2 on cpu 1, pri 0
proc: fdfe4010, utime = 84838 ms, stime = 2015 ms
thread: fdfc4348, thread sutime = 10942 ms, pc = fe6ee84a
proc/boot/procnto-smp-instr pid 1 tid 3 on cpu 2, pri 0
proc: fdfe4010, utime = 84838 ms, stime = 2015 ms
thread: fdfc4680, thread sutime = 10818 ms, pc = fe6ee84a
proc/boot/procnto-smp-instr pid 1 tid 4 on cpu 3, pri 0
proc: fdfe4010, utime = 84838 ms, stime = 2015 ms
thread: fdfc49b8, thread sutime = 10628 ms, pc = fe6ee84a
x86/bin/init pid 8196 tid 6 on cpu 4, pri 10
proc: fdfe4760, utime = 59 ms, stime = 9 ms
thread: fdfcb9b8, thread sutime = 1 ms, pc = a8930d9
eax = aaab7c8, ebx = 28000007, ecx = a8ddf64, edx = 28000007
edi = 1500ba1c, esi = 0, ebp = 40faf6c, exx = fdfcbcd0
cs = f3, efl = 1287, esp = 40faad4, ss = fb
pkg/bin/pciesvr pid 28687 tid 1 on cpu 5, pri 10
proc: fdfe64f0, utime = 361 ms, stime = 101 ms
thread: fdfd5680, thread sutime = 361 ms, pc = aae77b2
eax = e, ebx = 1500bc2c, ecx = 41ff63c, edx = aae77b2
edi = 15007eb8, esi = 423fbb0, ebp = 41ff6c8, exx = fdfd5998
cs = f3, efl = 80003206, esp = 41ff63c, ss = fb
pkg/bin/devc-conaux pid 12299 tid 5 on cpu 6, pri 21
proc: fdfe59d0, utime = 40 ms, stime = 5 ms
thread: fdfd3010, thread sutime = 16 ms, pc = ad1d354
eax = d, ebx = b4, ecx = 1, edx = 2f9
edi = 1, esi = 30, ebp = 411bf9c, exx = fdfd3328
cs = f3, efl = 3202, esp = 411bf64, ss = fb
proc/boot/procnto-smp-instr pid 1 tid 11 on cpu 7, pri 10
proc: fdfe4010, utime = 84838 ms, stime = 2015 ms
thread: fdfde680, thread sutime = 243 ms, pc = fe6bebfd
Release mastership on RP2
Normal reboot
Writing crashinfo
Crash Reason: Cause code 0x2c000007
Cause: INIT: respawn 'vkg_dmac_svr' disabled, exit_code 40704, INIT_MAX_SPAWN reached
Process: init
Traceback: a892949 a892e62 a892c95 42073ce a7e0070 0
Exception at 0xa8930d9 signal 5 c=2 f=0
Active process(s):
proc/boot/procnto-smp-instr Thread ID 0 on cpu 0
proc/boot/procnto-smp-instr Thread ID 1 on cpu 1
proc/boot/procnto-smp-instr Thread ID 2 on cpu 2
proc/boot/procnto-smp-instr Thread ID 3 on cpu 3
x86/bin/init Thread ID 5 on cpu 4
pkg/bin/pciesvr Thread ID 0 on cpu 5
pkg/bin/devc-conaux Thread ID 4 on cpu 6
proc/boot/procnto-smp-instr Thread ID 10 on cpu 7
Reboot reason:
Cause: INIT: respawn 'vkg_dmac_svr' disabled, exit_code 40704, INIT_MAX_SPAWN reached
Process: init
Traceback: a892949 a892e62 a892c95 42073ce a7e0070 0
Dumping local syslog messages
RP/0/RP0/CPU0:Nov 17 00:36:12.516 : pciesvr: %PLATFORM-PCIE-7-GEN_DEBUG : PCI_IoMsg - IOM_PCIE_GET_DEVICE_CONFIG not found venid 0x8086 devid 0x10fc inx 0
RP/0/RP0/CPU0:Nov 17 00:36:12.524 : vkg_dmac_svr: pcie_device_get_cnfg: reply_status fail! venid 0x8086 devid 0x10fc idx 0
RP/0/RP0/CPU0:Nov 17 00:36:12.527 : vkg_dmac_svr: %PLATFORM-DMAC-3-OPERATION_FAIL : dmac_hw_ini failure, error code Invalid argument
RP/0/RP0/CPU0:Nov 17 00:36:12.536 : pciesvr: %PLATFORM-PCIE-7-GEN_DEBUG : PCI_Io
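For anyone comparing several of these reboot dumps, here is a rough Python sketch that pulls the cause code, the process that hit INIT_MAX_SPAWN, and its exit code out of captured console text. The names `SAMPLE`, `CAUSE_RE` and `parse_reboot_cause` are made up for illustration, and the pattern is only matched against the wording seen in this crash output; other releases may format the line differently.

```python
import re

# One line copied from the crash output in this thread.
SAMPLE = (
    "Reboot Reason: Cause code 0x2c000007 Cause: INIT: respawn 'vkg_dmac_svr' "
    "disabled, exit_code 40704, INIT_MAX_SPAWN reached Process: init"
)

# Assumption: the cause line always has this shape. \s+ tolerates the
# line breaks a terminal capture may put between the fields.
CAUSE_RE = re.compile(
    r"Cause code\s+(0x[0-9a-fA-F]+)\s+Cause:\s+INIT:\s+respawn\s+'([^']+)'"
    r"\s+disabled,\s+exit_code\s+(\d+)"
)


def parse_reboot_cause(text):
    """Return the first reboot cause found in text, or None if absent."""
    match = CAUSE_RE.search(text)
    if match is None:
        return None
    return {
        "cause_code": match.group(1),
        "process": match.group(2),
        "exit_code": int(match.group(3)),
    }


result = parse_reboot_cause(SAMPLE)
```

Here `result["process"]` is `vkg_dmac_svr`, which is the process to quote when opening a TAC case or searching for a matching defect.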
I have attached the upgrade MOP.
In your case, with the RP failing, I would recommend booting into ROMMON and turbobooting from USB to see if it recovers.
Thanks for the feedback.
Since my starting point is 5.3.3, not 5.3.4, what is the best approach?
Should I upgrade first to 5.3.4, then 6.5.3, then 6.7.3? That seems like a long stretch.
Or should I consider an earlier 6.x release that is supported directly from 5.3.3?
One other question: should I keep FPD auto-upgrade enabled or not, as perhaps that also caused some issues? I recall needing to do this in an earlier release, but I'm not sure whether it should be on or off for an upgrade from 5.3.3.
Thanks again for your help
On the downloads page there is a docs.tar file that contains the upgrade MOP.
Unfortunately, most of them describe the upgrade path from an extended maintenance release to another release.
In your case 5.3.4 is the extended maintenance release for 5.3.x code. If you were to upgrade to 6.2.x, for example, it would require another upgrade afterwards.
So it seems that 2 upgrades are unavoidable. Would you be able to turboboot an RSP in a spare router to get the base code + configuration on? It might be faster, as it's 1 step.
Unfortunately I don't have a spare 9912, but I can try turboboot to recover the failed RP2, or else RMA it.
What is still not clear to me is whether it is mandatory to go from 5.3.3 to 5.3.4 before going to 6.x.
I see that the MOP for 6.5.3 clearly states that upgrade from 5.3.4 is supported, but I believe 5.3.3 is also an EMR, and I've seen posts where it was confirmed that 5.3.3 -> 6.x is supported. Indeed, it worked fine in my lab on a 9904 with RSP880.
So perhaps 5.3.3 -> 6.x is only OK in some cases and not in others?
I guess if it's 5.3.3 -> 5.3.4 -> 6.5.3 -> 6.7.3 then indeed turboboot may make sense.
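To make the multi-hop question concrete, here is a small Python sketch that searches a support matrix for the shortest chain of supported upgrades. The matrix is an assumption pieced together from this thread (5.3.4 -> 6.5.3 per the 6.5.3 MOP, 6.5.3 -> 6.7.3 as the planned intermediate hop, and 5.3.3 -> 5.3.4 as a presumed maintenance upgrade), not an official Cisco table. `SUPPORTED` and `upgrade_path` are made-up names.

```python
from collections import deque

# Assumed support matrix: release -> releases reachable in one supported step.
# Built only from statements in this thread; verify against the MOP in docs.tar.
SUPPORTED = {
    "5.3.3": ["5.3.4"],  # assumption: maintenance upgrade within 5.3.x
    "5.3.4": ["6.5.3"],  # the 6.5.3 MOP lists 5.3.4 as a supported source
    "6.5.3": ["6.7.3"],  # 6.5.3 is the intermediate release toward 6.7.3
}


def upgrade_path(start, target, supported):
    """Breadth-first search for the shortest chain of supported upgrades."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in supported.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no supported chain exists in this matrix


path = upgrade_path("5.3.3", "6.7.3", SUPPORTED)
print(" -> ".join(path) if path else "no supported path")
```

If a direct 5.3.3 -> 6.5.3 hop turns out to be supported for this hardware, adding "6.5.3" to the "5.3.3" entry makes the search return the shorter two-step chain instead.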
So is turboboot usually considered a last resort because the standard process installs packages and SMUs in one command and preserves the config, while turboboot means starting from scratch (boot mini.vm, add packages, add SMUs, add config)? Or are there other reasons or risks for not doing turboboot, given that we would essentially be jumping directly from 5.3.3 to 6.7.3 anyway?