11-09-2015 06:15 AM - edited 03-08-2019 02:37 AM
Hi, We've got a couple of 6509 VSS systems that are running deferred code and have bugs that we are seeing. To correct this, it's time for an IOS upgrade. We're currently running 151-2.SY2 and, based solely on its AssureWave status, we don't plan to go all the way to the current 15.2 but to 151-2.SY6 instead. We're running in a hospital environment.
When we first built the system, we tested the quad sup failover and it went really well, and I've read this whitepaper that describes the eFSU process a bit:
Is there any better documentation that Cisco (or you yourself) have that would fully describe all of the things that we need to do to accomplish this upgrade with no downtime? It seems like the actual process should be rather simple, but I'd like to lay out a command list plan so that when it comes time to execute it, we don't have any gotchas.
And again, we're running Quad sup 2T's.
The way it looks, we should lose maybe 1 ping while we do this, if everything goes as planned.
Thanks for any help!
11-09-2015 12:06 PM
I'm not sure if you've tested upgrading IOS with this kind of setup but I've always been a strong proponent of upgrading VSS using the "old" method: Copy the IOS into the two (or four) supervisor cards, change the boot statement and reboot.
I've never had a failure with the "old" method but we've sure run into problems with FSU/eFSU.
11-19-2015 08:29 AM
I thought I would report back and also ask a follow-up question.
Our first attempt at this upgrade didn't go very well. We have quad Sup, were running 15.1.2(SY3) and there were bugs. The console was scrolling some netflow errors related to a memory leak bug we knew about. When the standby tried to take over, our links dropped out of the port channels. That was expected, but then a couple of things happened. First, the standby sup didn't come online quickly enough, so the software upgrade aborted. Then, to make things really bad, the port-channeled links tried to come back online but in an unbundled state, causing spanning-tree events that whacked out the system. And if that wasn't bad enough, it affected another hospital that is connected to this one via a port channel, where our primary WAN links are, thus disturbing all of our routing for both hospitals. Eventually our only way to recover was to power off chassis 2 and let everything fall back into place. It was either that or yank all the optics. The only good news was that the reboot apparently cleared up the bug/error messages for the time being.
Our 2nd attempt today went flawlessly. We took one additional precaution: we shut down all non-VSL and non-Dual-Active-Hello links on the #2 chassis before we started. That allowed it to upgrade the Sup code and reload all of the line cards without any effect on any links other than the VSL links, which kept the other links from coming up unbundled and causing spanning-tree issues. At that point we could do the "issu runversion", and since we had VSL links it could still manage chassis #1. We then waited for everything to stabilize, re-enabled all port-channeled links, and did the "issu acceptversion". Then we disabled all of the links on chassis #1 before running "issu commitversion", which did the same thing on that chassis as it had on the other one. After it did its rolling upgrade and went stable, we fired up all of those links and all was well again. So the quad sup process was good, and did work well that time.
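For anyone who wants to prepare that "shut everything except VSL and dual-active" step ahead of time, here is a rough Python sketch that builds the config lines from an interface list. It is only an illustration: the interface names and the choice of which ones are VSL or dual-active links are made up, and it assumes the usual VSS switch/module/port naming. Review the output by hand before pasting anything into a production box.

```python
import re

def shutdown_plan(interfaces, chassis, keep_up):
    """Return IOS config lines that shut every interface on `chassis`
    except the ones listed in `keep_up` (VSL and dual-active links)."""
    keep = set(keep_up)
    lines = []
    for name in interfaces:
        # Assumes VSS-style names ending in <switch>/<module>/<port>.
        m = re.search(r"(\d+)/\d+/\d+$", name)
        if m is None or int(m.group(1)) != chassis:
            continue  # not on the chassis we are about to upgrade
        if name in keep:
            continue  # leave VSL / dual-active links alone
        lines.append(f"interface {name}")
        lines.append(" shutdown")
    return lines

# Hypothetical inventory, for illustration only.
interfaces = [
    "TenGigabitEthernet1/1/1",
    "TenGigabitEthernet2/1/1",
    "TenGigabitEthernet2/1/2",
    "TenGigabitEthernet2/5/4",  # assumed to be a VSL member
    "GigabitEthernet2/2/48",    # assumed to be the dual-active link
]
keep_up = ["TenGigabitEthernet2/5/4", "GigabitEthernet2/2/48"]

plan = shutdown_plan(interfaces, 2, keep_up)
print("\n".join(plan))
```

Running it again with chassis 1 (and that chassis's own VSL and dual-active links in `keep_up`) would give the list for the step before "issu commitversion".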
----
My Follow-Up question: I need to get a system from SY2 to SY6, which is outside the 18 month window. Will the upgrade even allow itself to start, or is it hard-coded not to allow it? Feature-wise we aren't changing anything, but I'd rather not have to upgrade to SY4a first and then upgrade the system a second time to SY6, risking another event.
----
Now for help in case there are people looking for info:
This whitepaper is excellent:
http://www.cisco.com/c/en/us/products/collateral/switches/mgx-8800-series-switches/white_paper_c11-729039.html
And here are some commands in order that I used:
show issu state detail (look for "init")
show switch virtual redundancy (look for "sso")
show bootvar (look for it)
sh run | inc boot system (look for boot system flash command)
issu loadversion 1/5 bootdisk:s2t54-advipservicesk9-mz.SPA.151-2.SY6.bin 2/5 slavebootdisk:s2t54-advipservicesk9-mz.SPA.151-2.SY6.bin
show issu state detail (look for "please issue the issu runversion command...")
show switch virtual redundancy
ISSU loadversion is considered complete when you see the 'Bulk sync succeeded' message.
issu runversion (forces the switchover to 2nd chassis)
show issu state detail
show switch virtual redundancy (look for Run Version Completed)
issu acceptversion (stops the rollback timer)
issu commitversion (this starts the other chassis)
show issu state detail
show switch virtual redundancy (look for completed)
ISSU State is now 'No Upgrade Operation in Progress'. After a couple of minutes, this will go back to 'Init' state which means a new eFSU upgrade/downgrade can be initialized.
The process ends with two boot variables in the running config, one for the old image and another for the new image. You want to delete the old image boot.
sh run | i boot
delete the old boot line
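If you capture the output of each step into a change log, the command list above can also be kept as a small checklist. This Python sketch is just one way to do it: the marker strings are the ones quoted in this post, the check is a loose case-insensitive substring match, and the sample outputs are placeholders rather than real device output.

```python
# Each entry: (command, text to look for before moving on).
# Markers are copied from this post; real output may be worded differently.
PLAN = [
    ("show issu state detail", "init"),
    ("show switch virtual redundancy", "sso"),
    ("issu loadversion", "bulk sync succeeded"),
    ("issu runversion", "run version completed"),
    ("issu acceptversion", "rollback timer stopped"),
    ("issu commitversion", "no upgrade operation in progress"),
]

def step_ok(output, marker):
    """Loose check: is the marker anywhere in the captured output?"""
    return marker.lower() in output.lower()

def first_unconfirmed(outputs):
    """Given captured output for the steps run so far (in order),
    return the index of the first step not yet confirmed, or None."""
    for i, (_command, marker) in enumerate(PLAN):
        if i >= len(outputs) or not step_ok(outputs[i], marker):
            return i
    return None

# Placeholder text standing in for captured output of the first two steps.
captured = ["... Init ...", "... sso ..."]
pending = first_unconfirmed(captured)
print("next step:", PLAN[pending][0])
```

With the two placeholder captures above, the next step reported is the loadversion.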
Here are some syslog entries from our system showing the syntax of the messages...maybe helpful for grepping through syslog to see when the events happen.
Nov 19 05:11:57 sjcf-vss-core 7475030: Nov 19 05:11:56: %ISSU_PROCESS-SW1-6-LOADVERSION_INFO: Loadversion has completed. Please issue the 'issu runversion' command after all modules come online.
Nov 19 05:18:12 sjcf-vss-core 1026: Nov 19 05:18:11: %ISSU_PROCESS-SW2-6-RUNVERSION_INFO: Runversion has completed. Please issue the 'issu acceptversion' command
SJCF-VSS#issu acceptversion
% Rollback timer stopped. Please issue the 'issu commitversion' command.
Nov 19 05:42:12 sjcf-vss-core 1709: Nov 19 05:42:11: %ISSU_PROCESS-SW2-6-COMMITVERSION_INFO: Upgrade has completed, updating boot configuration
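For grepping, a small Python sketch like the one below can pull the ISSU events out of a syslog file. The regular expression is based only on the three messages shown above, so treat it as a starting point; other ISSU_PROCESS messages may not fit this pattern, and the timestamp slice assumes the same "Mon DD HH:MM:SS" prefix as in these lines.

```python
import re

# Matches e.g. %ISSU_PROCESS-SW1-6-LOADVERSION_INFO: <text>
ISSU_RE = re.compile(
    r"%ISSU_PROCESS-SW(?P<switch>\d+)-(?P<severity>\d)-(?P<event>[A-Z_]+): (?P<text>.*)"
)

def issu_events(lines):
    """Return (timestamp, switch number, event name, message text) tuples."""
    events = []
    for line in lines:
        m = ISSU_RE.search(line)
        if m:
            # First 15 characters are the syslog server timestamp.
            events.append((line[:15], int(m["switch"]), m["event"], m["text"]))
    return events

sample = [
    "Nov 19 05:11:57 sjcf-vss-core 7475030: Nov 19 05:11:56: %ISSU_PROCESS-SW1-6-LOADVERSION_INFO: Loadversion has completed. Please issue the 'issu runversion' command after all modules come online.",
    "Nov 19 05:18:12 sjcf-vss-core 1026: Nov 19 05:18:11: %ISSU_PROCESS-SW2-6-RUNVERSION_INFO: Runversion has completed. Please issue the 'issu acceptversion' command",
    "Nov 19 05:42:12 sjcf-vss-core 1709: Nov 19 05:42:11: %ISSU_PROCESS-SW2-6-COMMITVERSION_INFO: Upgrade has completed, updating boot configuration",
]

events = issu_events(sample)
for stamp, switch, event, text in events:
    print(stamp, f"SW{switch}", event)
```

Note that the switch number in the message changes from SW1 to SW2 after runversion, which lines up with the switchover to the second chassis.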