10-26-2023 12:47 PM
One of my three 9410s did not complete a full ISSU upgrade. The code was copied to the device, the hash was verified, and I used this command to kick off the upgrade:
install add file flash:cat9k_iosxe.17.09.04a.SPA.bin activate issu commit
When the standby sup reloaded, it went into a boot loop where this all repeats:
=================================================================================
Initializing Hardware......
System Bootstrap, Version 17.8.1r[FC1], RELEASE SOFTWARE (P)
Compiled Tue 02/01/2022 13:16:47.55 by rel
Current ROMMON image : Primary
Last reset cause : SoftwareResetTrig
C9400-SUP-1 platform with 16777216 Kbytes of main memory
Preparing to autoboot. [Press Ctrl-C to interrupt] 0
boot: attempting to boot from [bootflash:packages.conf]
boot: reading file packages.conf
#
###############################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################
Oct 26 00:11:38.624: %BOOT-3-SYSD_STARTFAIL: R1/0: Failed to launch boot task mount_packages.service ( exit-code )
Oct 26 00:11:39.289: %BOOT-0-BOOT_COMPLETE_FAIL: R1/0: Critical boot tasks failed: * *
=================================================================================
So, understanding that the standby wouldn't boot, I dropped to ROMMON and tried manually booting via packages.conf, but that had the same result. Luckily I had a USB drive in with the software on it, so I booted the .bin file using "boot usbflash0: <filename>" and that brought the switch up. Once the standby (slot 6) was booted, the ISSU process kept going, and it upgraded slot 5's SUP, and that one successfully auto-booted. At that point I had slot 6 as ACTIVE and slot 5 as Standby Hot. With them both booted, I copied the packages.conf file from the good SUP to the other, just in case that file was bad. Then I did a force-switchover and it still failed to boot.
At this point, the switch works fine, but I know I'm not in a good place yet. The ISSU process hasn't fully completed, with "show install summary" showing slot 6 as "Activated & Committed" but slot 5 (the good one) as "Activated & Uncommitted". My guess is that I could run the commit command on that one and it would show as finished.
Here's some info:
==================================================================================================
MY-9410#sh redund
Redundant System Information :
------------------------------
Available system uptime = 1 year, 35 weeks, 1 hour, 58 minutes
Switchovers system experienced = 10
Standby failures = 1
Last switchover reason = active unit removed
Hardware Mode = Duplex
Configured Redundancy Mode = sso
Operating Redundancy Mode = sso
Maintenance Mode = Disabled
Communications = Up
Current Processor Information :
-------------------------------
Active Location = slot 5
Current Software state = ACTIVE
Uptime in current state = 18 hours, 41 minutes
Image Version = Cisco IOS Software [Cupertino], Catalyst L3 Switch Software (CAT9K_IOSXE), Version 17.9.4a, RELEASE SOFTWARE (fc3)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2023 by Cisco Systems, Inc.
Compiled Fri 20-Oct-23 10:44 by mcpre
BOOT = bootflash:packages.conf;
CONFIG_FILE =
Fast Switchover = Disabled
Initial Garp = Disabled
Peer Processor Information :
----------------------------
Standby Location = slot 6
Current Software state = STANDBY HOT
Uptime in current state = 18 hours, 8 minutes
Image Version = Cisco IOS Software [Cupertino], Catalyst L3 Switch Software (CAT9K_IOSXE), Version 17.9.4a, RELEASE SOFTWARE (fc3)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2023 by Cisco Systems, Inc.
Compiled Fri 20-Oct-23 10:44 by mcpre
BOOT = bootflash:packages.conf;
CONFIG_FILE =
MY-9410#sh issu state det
Current ISSU Status: In Progress
Previous ISSU Operation: Successful
=======================================================
System Check Status
-------------------------------------------------------
Platform ISSU Support Yes
Standby Online Yes
Autoboot Enabled Yes
SSO Mode Yes
Install Boot Yes
Valid Boot Media Yes
Operational Mode HA-STANDALONE
=======================================================
Added Image:
Name Compatible
-------------------------------------------------------
17.09.04a.0.6 Yes
Operation type: One-shot ISSU
Install type : Image installation using ISSU
Current state : Activated state
Last operation: Switchover
Completed operations:
Operation Start time
-------------------------------------------------------
Activate location standby R1 2023-10-25:19:04:16
Activate location active R0 2023-10-25:19:39:31
Switchover 2023-10-25:19:40:43
State transition: Added -> Standby activated -> Active switched-over
Auto abort timer: inactive
Abort Reason: N/A
Running image: bootflash:packages.conf
Operating mode: sso, terminal state reached
MY-9410#show install summ
[ R0 ] Installed Package(s) Information:
State (St): I - Inactive, U - Activated & Uncommitted,
C - Activated & Committed, D - Deactivated & Uncommitted
--------------------------------------------------------------------------------
Type St Filename/Version
--------------------------------------------------------------------------------
IMG U 17.09.04a.0.6
[ R1 ] Installed Package(s) Information:
State (St): I - Inactive, U - Activated & Uncommitted,
C - Activated & Committed, D - Deactivated & Uncommitted
--------------------------------------------------------------------------------
Type St Filename/Version
--------------------------------------------------------------------------------
IMG C 17.09.04a.0.6
--------------------------------------------------------------------------------
Auto abort timer: inactive
--------------------------------------------------------------------------------
==================================================================================================
The question is, where do I go from here? I could commit the upgrade. But obviously the 2 errors "R1/0: Failed to launch boot task mount_packages.service (exit-code)" and "R1/0: Critical boot tasks failed: * *" mean that either some service couldn't mount something, or some file wouldn't boot. I'm running in Installed mode, not bundle (however the standby sup is booted using the .bin file which is like bundled mode).
Do I commit this upgrade, then do "install remove inactive" and make some drive space, and then copy the .bin file back to the bootflash: and try running the upgrade command again? Maybe that would re-run the whole upgrade process and fix what isn't happy? Do I wipe the bootflash and then expand the files back onto it? Do I copy all of the files one by one off the good SUP onto the other SUP and then try to boot? If I had a spare chassis I'd throw the sup in it, wipe it, install the same software on it, and then stick it back in the original chassis and let it pair back up, but I don't have that option right now.
Any good ideas on how to go on from here so that I can be confident that the next upgrade goes well? Or, do I just ride it out until the next inevitable upgrade and cross my fingers during that one?
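For anyone who wants to check the commit state across several chassis without reading the output by eye, here is a minimal Python sketch that parses pasted "show install summary" text like the output above. It assumes the layout shown in this post (a "[ R0 ]" style header followed by "Type St Filename/Version" rows); the function names parse_install_summary and uncommitted are made up for illustration and are not part of any Cisco tool.

```python
import re

# State letters as printed in the legend of "show install summary".
STATES = {"I", "U", "C", "D"}


def parse_install_summary(text):
    """Return {location: [(type, state, version), ...]} from pasted CLI text.

    Assumes the layout shown in this thread: a "[ R0 ] Installed Package(s)"
    header, then rows such as "IMG U 17.09.04a.0.6".
    """
    result = {}
    location = None
    for line in text.splitlines():
        header = re.match(r"\s*\[\s*(\S+)\s*\]\s+Installed Package", line)
        if header:
            location = header.group(1)
            result[location] = []
            continue
        fields = line.split()
        # A package row has exactly three fields and a known state letter.
        if location and len(fields) == 3 and fields[1] in STATES:
            result[location].append(tuple(fields))
    return result


def uncommitted(summary):
    """Return the sorted locations that still have a package in state U."""
    return sorted(
        loc
        for loc, pkgs in summary.items()
        if any(state == "U" for _, state, _ in pkgs)
    )
```

Fed the output from this post, uncommitted() would report R0 only, which matches what the summary shows: R0 is Activated & Uncommitted while R1 is Activated & Committed.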
10-26-2023 03:04 PM
install abort issu
NOTE:
Personally, I do not like nor recommend ISSU, FSU/eFSU/xFSU. My reason is because I have personally seen too many "code brown" moments (where I work). And in this forum, a few of us have been fixing other people's "code brown" too. I have always maintained a position that ISSU, FSU/eFSU/xFSU only works in "corner cases", a lab environment or a demo.
10-26-2023 03:21 PM
I understand. I do feel that way on the 4500 platform. Also, on the 6500 platform I'd done them for years and found that *if* you do some additional steps, the upgrade can go just fine. But you have to manage the upgrade, rather than let it do it itself. I haven't seen it go bad on the 9400's yet. This problem I'm not sure if it was caused by the ISSU process or just a fluke. On these, if you don't do ISSU, what method do you like to use? Do you just remove the "ISSU" from the command and then do a complete reload? If that is the case, then I don't know if I'd have had more success. My gut feeling is that something went wrong when it installed the package, which would seem to happen in installed mode as a possibility anyway. If booted in bundled mode probably not. I've never run a 9300/9400 series in bundled mode though, so I don't have experience with that.
10-26-2023 03:49 PM - edited 10-26-2023 03:51 PM
I developed my own method of upgrading the firmware of the 9500, ASR & ISR routers, 9800 controllers (without using PI or DNAC) because the Cisco "recommended" method does not give me the flexibility to reboot the 9500 on a later date. The method I have developed is called One-Hit-Wonder (NSFW version) (see attachment). NSFW because it is not a Cisco recommended procedure.
Since the development of this process, I have been testing and polishing the process for 4 years and I have not "lost" any appliances because my process is broken. It works. I unpack the packages any time during business hours and, for example, schedule a reboot at 7am the next morning.
10-26-2023 04:35 PM
Leo,
The process looks simple enough. I see the benefit because you can then do "reload at xx:xx" and schedule it. I must admit, after doing this for 20+ years, once Cisco switched to the package files, I never really understood what was going on during the boot process with all those files. So, when you rename the packages.conf file and then obviously you're booting the firmware.conf file, I don't know what ramifications that has. I don't know the difference between the two. What would have happened if you just left it to boot packages.conf? Shouldn't that also reboot the switch, or because you booted that file would it get upset and not continue? Also, are you in effect then booting it similar to booting the .bin file? I did read that booting the .bin file requires more memory because it has to load the whole file into memory. Not sure if that matters a lot though. Anyway, it's nice to see that the process works for you. If I had anything that wasn't in production, I'd try it some day.
10-26-2023 04:52 PM - edited 11-01-2023 05:58 PM
@RVTim wrote:
So, when you rename the packages.conf file and then obviously you're booting the firmware.conf file, I don't know what ramifications that has. I don't know the difference between the 2.
Same reason as people's option/choice to use Bundle Mode. Every time one has to upgrade the firmware in Bundle Mode, one of the steps is to replace the boot variable string to point to the new BIN file. If the boot variable string syntax is wrong, either the platform boots the wrong firmware or, worse, boots into ROMMON (CSCvg37458).
But if the boot variable string always points to "packages.conf", I can just rename the "firmware.conf" into "packages.conf". It is simpler but, most importantly, minimizes the risk of having the wrong boot variable string syntax.
@RVTim wrote:
I did read that booting the .bin file requires more memory because it has to load the whole file into memory. Not sure if that matters a lot though.
IOS-XE memory leaks like no tomorrow. Anything to minimize memory utilization is good in my book.
It is not possible to apply an SMU if the appliance has booted into Bundle Mode. Take, for example, the SMU to fix CSCwh87343 (Software Fix Availability for Cisco IOS XE Software Web UI Privilege Escalation Vulnerability - CVE-2023-20198). If the appliance boots in Bundle Mode, the SMU cannot be applied.
And one last, very important thing: there is a bug ("feature") that I want to share. Notice the "gotcha" sections? It does not matter what method is used (Cisco-recommended or One-Hit-Wonder), there will be occasions where the system will refuse or be unable to rename the existing packages.conf file. The Cisco-recommended method does not have this check, and this causes the appliance to reboot into the current, not the intended, version. The process I have developed incorporates a manual, albeit archaic, method to double-check.
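As a rough illustration of that kind of double-check (this is not the attached procedure, just a sketch), the following Python snippet takes the text of a packages.conf file, for example pasted from "more bootflash:packages.conf", and confirms that every package it references carries the intended version. It assumes package filenames of the form <name>.<version>.SPA.pkg, which matches the .SPA.bin naming used in this thread but should be verified against your own file; pkg_versions and points_to are hypothetical helper names.

```python
import re


def pkg_versions(conf_text):
    """Collect the version string embedded in every *.SPA.pkg filename.

    Assumed filename shape: <name>.<version>.SPA.pkg,
    for example cat9k-rpbase.17.09.04a.SPA.pkg.
    """
    return set(re.findall(r"[\w-]+\.([\w.]+)\.SPA\.pkg", conf_text))


def points_to(conf_text, expected):
    """True only if the file references packages and all match `expected`."""
    return pkg_versions(conf_text) == {expected}
```

If points_to() returns False just before a scheduled reload, the rename most likely did not take effect and the box would come back up on the old version.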
10-28-2023 05:28 PM - edited 11-01-2023 06:07 PM
Although CSCwh76420 only applies to Catalyst 9800 WLC, it is still worth noting.
CSCwe62246 is a bug for 9400 and ISSU.
10-30-2023 06:52 PM
I just wanted to follow up on this that I did get the system back working fine. I did a few things and as I was doing them the sup was able to boot unassisted.
1) I did an "install remove inactive".
2) I copied the .bin file again to the active bootflash: (slot 5)
3) I ran the install command again with the word "force" appended. It didn't seem to do much.
The slot 6 Sup still wouldn't boot on its own.
4) I manually booted slot 6 SUP
5) I did a force-switchover to make slot 6 active
6) I ran the install command again with force. It didn't seem to do anything.
7) I began the CPLD updates by updating the standby (Slot 5)
9) I reloaded the entire chassis with "redundancy reload shelf"
Everything came up just fine and works well now. Just wanted to throw that all out there in case someone reads this.