9800 ISSU behavior

eglinsky2012
Level 4

I'm attempting an ISSU upgrade from 17.9.3 to 17.9.4 on 9800-80s. In the lab, it worked well, as expected. The process installed the image to both active and standby, then predownloaded the image to the APs. The predownload failed on the 2700, but life went on. Then the ISSU process stopped, and in the GUI there was a button to continue the upgrade (I forget exactly what it said). Once I clicked that, the standby rebooted. It took a long time for SSO to reach a terminal status (there was an error in the logs about a software mismatch), but after 15 minutes or so, the active finally rebooted. Then the APs did staggered reboots. Once ISSU was complete, the 2700 that had failed the predownload downloaded its new image and rebooted, which is good.

I was happy with how it went in the lab, so I tried it on a production WLC which currently has no APs associated. After downloading and installing, it ran through the whole process without pausing after the predownload and went straight on to rebooting the standby. It got stuck at the "upgrading standby" stage; the standby kept rebooting every few minutes (I assume due to SSO sync failure, but I didn't think to check), so I did a "reload" on the active WLC, thinking sync would complete after the reboot once both were on matching versions. After the reboot, sync completed, but the ISSU process was stuck on "upgrading active" even after more than an hour of waiting. So I did a "redundancy reload shelf," and both WLCs were on 17.9.4 and in sync, but ISSU was still stuck at the "upgrading active" step and the commit timer was still running. So I did an ISSU terminate, back to 17.9.3.

I tried the process again, this time with a single AP associated, and the process did finish cleanly. However, it again continued straight through the reboots instead of stopping after the predownload. This is undesirable, since I want to do the install and predownload one day and the reboots the next day. The behavior in the lab would make that possible, but the behavior on the production WLC would not.

My question is, what's the normal behavior? Have any of you who have tried the ISSU process had good results, or is it glitchy?
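
For readers who want the install and predownload on one day and the reboots on the next, the ISSU steps can also be run one at a time from the CLI rather than through the GUI workflow. What follows is a minimal sketch, not a verified procedure: it assumes both chassis are in install mode with SSO healthy and that the image is already on bootflash. The file name is a placeholder, and `install abort issu` is assumed here to be the CLI counterpart of the "ISSU terminate" mentioned above.

```
! Day 1: stage the image on both chassis and push it to the APs
9800#install add file bootflash:<image>.bin
9800#ap image predownload
9800#show ap image

! Day 2: confirm SSO is healthy, then activate with ISSU and commit
9800#show redundancy
9800#install activate issu
9800#show issu state detail
9800#install commit

! If the process stalls before the commit, roll back
9800#install abort issu
```

Whether this avoids the stalls described above is untested here; it only keeps each stage under manual control.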

Accepted Solutions

Rich R
VIP

1. I second Leo's comment

2. Practically every time we've ever attempted ISSU in lab it has gone wrong in one way or another - so we have never dared risk it on production.  We just pre-download the APs then take the short hit for a few minutes with a reload after hours.

3. As mentioned on previous threads - having any SMU or APSP installed is a very good way to screw up ISSU and require multiple reloads and/or clear install state.

4. Is it glitchy? YES!
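
The non-ISSU approach in point 2 (pre-download the APs, then reload after hours) might look roughly like the following on the CLI. This is a sketch under assumptions, not the poster's exact procedure: it assumes install mode, the image file name is a placeholder, and `show install summary` is used first to check that no SMU or APSP is still installed, per point 3.

```
! Check what is currently installed (look for leftover SMU/APSP entries)
9800#show install summary

! Stage the new image and pre-download it to the APs
9800#install add file bootflash:<image>.bin
9800#ap image predownload
9800#show ap image

! After hours, once the predownload has finished: reload into the new image
9800#install activate
9800#install commit
```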

23 Replies

ammahend
VIP

I had a similar issue. I had to abort ISSU, and it went well on the second try. It doesn't seem normal.

9800(config)#service internal

9800#clear install state

9800(config)#no service internal

-hope this helps-

Saved my day! Thanks!

It's worth highlighting that "clear install state" triggers a reload, so just be prepared for that. I just needed to use it in the lab because the SMU and APSP were not properly removed after an upgrade, and this was without HA or ISSU - just a regular upgrade!

9800#conf t
Enter configuration commands, one per line. End with CNTL/Z.
9800(config)#service internal
9800(config)#^Z
9800#clear install state
clear_install_state: START Fri Oct 27 09:50:03 BST 2023

This command will remove all the provisioned SMUs, and rollback points. Use this command with caution.
A reload is required for this process. Press y to continue [y/n]y
--- Starting clear_install_state ---
Performing clear_install_state on all members
[1] clear_install_state package(s) on chassis 1/R0
[1] Finished clear_install_state on chassis 1/R0
Checking status of clear_install_state on [1/R0]
clear_install_state: Passed on [1/R0]
Finished clear_install_state

Send model notification for before reload
Install will reload the system now!
Requesting RP pvp reload

Leo Laohoo
Hall of Fame

Never, ever, do ISSU upgrade without TAC on a WebEx call. 

balaji.bandi
Hall of Fame

I did a couple of ISSU upgrades recently, from 17.X to 17.9.3, with Cat 9K APs associated and serving clients, and everything worked as expected.

We don't have any 2700s in our network, so I can't confirm the behavior for that model.

Another thing to check: how is the sync link connected between the two chassis? Back to back, or via another switch?

eglinsky2012
Level 4

Ammahend, at what point do you do that "clear install state" command? Instead of doing an ISSU terminate, could I have used that after both WLCs were on 17.9.4 but ISSU was stuck on "upgrading active"?

Balaji, the chassis are directly connected by fiber (they're located in separate DCs). Each chassis has one uplink to a 6500 VSS in one DC and another uplink to the other VSS in the other DC. Shared L2 VLANs between them.

eglinsky2012
Level 4

Thanks, Rich. I will heed your (and Leo's) advice and just do the regular install and predownload.

We haven't been using predownload since it caused a WLC crash (8540) several years ago due to a bug, and the last time we did it, a couple of years ago, some APs got stuck predownloading, which prevented the reboot. So we've just rebooted them and let the image download happen afterwards. Fingers crossed predownload goes smoothly this time. If not, is there a way to force the reboot if there are APs still downloading?

Rich R
VIP

The only problem we had the first time we did a 9800 upgrade was that FlexConnect Efficient Image Upgrade was enabled by default. That is great if all the APs on a flex profile are at the same site, but ours were spread across multiple sites on a single flex profile (yes, I hear the screams of horror - follow the best practice guidelines lol). So one AP of each model downloaded the image and the rest were trying (without success) to download from that one. So either make sure your FlexConnect profiles are site specific or turn off Efficient Image Upgrade. Apart from that lesson learned, pre-download has worked fine. You can expect the odd AP to fail or need another download, and be aware of https://www.cisco.com/c/en/us/support/docs/wireless/catalyst-9800-series-wireless-controllers/220443-how-to-avoid-boot-loop-due-to-corrupted.html
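
Turning off Efficient Image Upgrade for a flex profile that spans multiple sites could look something like this. It is a sketch only: the profile name is a placeholder, and it assumes that the `predownload` keyword under the flex profile is what controls Efficient Image Upgrade on your release, so check the configuration guide for your version before relying on it.

```
9800#conf t
9800(config)#wireless profile flex <flex-profile-name>
9800(config-wireless-flex-profile)#no predownload
9800(config-wireless-flex-profile)#end
```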

Also note the CAPWAP window size setting (see https://twitter.com/DarchisNicolas/status/1290210991590871045), which TAC also recommended we set to 50 to improve download speed:
ap profile <profile name>
capwap window size 50

eglinsky2012
Level 4

Thanks, Rich.

I just realized mobility was broken between the lab 9800s that (so I thought) upgraded to 17.9.4 successfully with ISSU and an 8540 pair on 8.10.185.3 (control and data path down). Rebooting the 9800 made the mobility tunnel come back up. Could be a coincidence, but I consider that another strike against ISSU.

eglinsky2012
Level 4

Rich, that capwap window size, are you using that for all your APs or just some? Cisco says it should only be increased on teleworker/OfficeExtend APs. Would it benefit some APs and be a detriment to others depending if they're on WAN/LAN? The 17.9.4 GUI says the window size is limited to 20 BTW. (I'm not changing it at this time.)

We have a very small number of APs that are connected back to campus via IPSEC VPN over cable broadband. The rest of our remote sites are on gig fiber or faster. No DSL, satellite, or anything like that.

We actually don't have it configured at the moment - may have only been during the upgrade/pre-download (if at all) - I wasn't doing the upgrades myself - just noticed that TAC had recommended in the emails.  So maybe only use it when pre-download needed - but will require the AP to re-join (reset capwap).  If you use Efficient Image Upgrade then probably not much point using it though because most of the APs will download over TFTP from neighbours locally anyway.

17.9.4 CLI:
9800(config-ap-profile)#capwap window size ?
<1-50> AP CAPWAP control packet transmit queue size

9800(config-ap-profile)#capwap window size 50
This feature is supported only on Office Extended APs and may impact other APs
So it takes the config - just warning about using it on non-OE APs.
That GUI message is obviously just a mistake in the GUI help text (one of many) - it still takes up to 50.

Leo Laohoo
Hall of Fame

Anyone interested in performing software upgrade with ISSU, please note down the following Bug IDs:

* CSCwe62246
* CSCwh29442
* CSCwh36951
* CSCwh76420

NOTE: In the future, I will be using this thread to update any ISSU-related Bug IDs.

JPavonM
VIP

I never use ISSU because of the many problems it creates. I always do a manual code upgrade with pre-download.
