Solved: Re: Has anyone upgraded FW from 2.2(6e) to 3.1 (3j) and had any issues?

nikthatte · ‎09-20-2018

I am upgrading FW in UCS for the first time this weekend, and I am really nervous that blades are going to disappear or that we will have a major outage of some sort. I am just doing the infrastructure bundle this Friday night and was wondering whether anyone has had issues going from 2.2(6e) to 3.1(3j) that I should be aware of. I have opened a proactive TAC case and they are looking at the show tech, but I would rather cover all my bases and check with you guys as well.

Kirk J · ‎09-20-2018

Greetings.

One of the biggest issues we see in TAC is that customers don't wait for all related vNICs/vHBAs to come back up on the FI that was just upgraded/rebooted before acknowledging the reboot of the next FI. This results in both paths being down for some or all blades.

Other things to check include:

Make sure you actually have active dual storage paths for FC or Ethernet-based iSCSI/NFS.
vSwitch and ESXi vmnic numbering: do a spot check to make sure your ESXi vmnic numbering actually matches the UCSM vNIC numbering. If they don't match, your whole side A/side B concept can be off. This typically becomes an issue if you have added vNICs or vHBAs to your service profile after ESXi was originally installed.

Thanks,

Kirk...
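The numbering spot check above can be sketched in Python. This is only an illustration under stated assumptions: the function name `find_vmnic_mismatches` is hypothetical, the vNIC names and MAC addresses are made up, the inventories are assumed to have been collected by hand from UCSM and from the ESXi host, and it assumes the first UCSM vNIC should appear as vmnic0, the second as vmnic1, and so on.

```python
from typing import Dict, List, Tuple


def find_vmnic_mismatches(
    ucsm_vnics: List[Tuple[str, str]],
    esxi_vmnics: Dict[str, str],
) -> List[str]:
    """Compare UCSM vNIC order against ESXi vmnic numbering, matching by MAC.

    ucsm_vnics:  (vNIC name, MAC) pairs in the order UCSM presents them.
    esxi_vmnics: vmnic name -> MAC as seen on the ESXi host.
    Returns one human-readable line per mismatch (empty list if all match).
    """
    problems = []
    # Index the ESXi side by MAC so each UCSM vNIC can be looked up directly.
    mac_to_vmnic = {mac.lower(): name for name, mac in esxi_vmnics.items()}
    for index, (vnic_name, mac) in enumerate(ucsm_vnics):
        expected = f"vmnic{index}"  # assumption: position N maps to vmnicN
        actual = mac_to_vmnic.get(mac.lower())
        if actual is None:
            problems.append(f"{vnic_name} ({mac}): not found on ESXi host")
        elif actual != expected:
            problems.append(
                f"{vnic_name} ({mac}): expected {expected}, ESXi shows {actual}"
            )
    return problems


# Made-up sample data: the two vNICs are swapped on the ESXi side.
ucsm = [("eth0-A", "00:25:B5:0A:00:01"), ("eth1-B", "00:25:B5:0B:00:01")]
esxi = {"vmnic0": "00:25:b5:0b:00:01", "vmnic1": "00:25:b5:0a:00:01"}
for line in find_vmnic_mismatches(ucsm, esxi):
    print(line)
```

Any line printed here would suggest that a vSwitch uplink believed to be on side A may actually be on side B, which is worth sorting out before the first FI reboots.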

nikthatte · ‎09-20-2018

How long would you say I should wait before acknowledging the primary FI reboot? From what I understand, that's the last step of the infrastructure upgrade.

Kirk J · ‎09-20-2018

That's just it: you don't want to pin the restart/wait to a specific time, as customers with lots of chassis and blades, and service profiles with a lot of defined vNICs and vHBAs, can take a lot longer for the VICs to come back up. You will want to watch your alert count (the counter is incremented for every vNIC and vHBA that is down) until it goes back to zero, or whatever number it was before you started the upgrade. I would watch for any alerts that say something about a VIF interface being down. It would also be prudent to spot check your servers from the OS perspective. For example, log into ESXi, drill down into a LUN/datastore, check the multipathing, and make sure it shows all the paths back up before rebooting the second (generally cluster lead) FI.

Thanks,

Kirk..
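The "wait for the alert count to return to its pre-upgrade number" advice above can be sketched as a small polling loop. This is a minimal sketch, not a Cisco tool: `wait_for_fault_baseline` and the `get_fault_count` callable are hypothetical names, and how the count is actually obtained (reading it off the UCSM GUI, a script, etc.) is left to the caller. In line with the advice, there is deliberately no timeout, and the `stable_polls` requirement is an added assumption to avoid reacting to a momentary dip.

```python
import time
from typing import Callable


def wait_for_fault_baseline(
    get_fault_count: Callable[[], int],
    baseline: int,
    poll_seconds: float = 30.0,
    stable_polls: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Block until the fault count is at or below baseline for several polls in a row.

    get_fault_count: hypothetical callable returning the current alert count.
    baseline:        the count recorded before the upgrade started.
    Returns the number of polls taken. No timeout: interrupt manually if needed.
    """
    if stable_polls < 1:
        raise ValueError("stable_polls must be at least 1")
    polls = 0
    consecutive_ok = 0
    while consecutive_ok < stable_polls:
        count = get_fault_count()
        polls += 1
        if count <= baseline:
            consecutive_ok += 1
        else:
            consecutive_ok = 0  # count rose again, start over
        if consecutive_ok < stable_polls:
            sleep(poll_seconds)
    return polls


# Simulated readings: the count drains from 12 back to a baseline of 2.
readings = iter([12, 7, 2, 2, 2])
polls = wait_for_fault_baseline(lambda: next(readings), baseline=2,
                                sleep=lambda seconds: None)
print(f"fault count stable at baseline after {polls} polls")
```

A loop like this only covers the alert counter; the OS-side multipathing spot check still needs to be done before acknowledging the second FI reboot.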

nikthatte · ‎09-21-2018

Makes sense. Thanks Kirk. We have about three chassis so it shouldn't take too long, but I'll watch the error count. I don't mind waiting, I'm just worried that something horrible will happen, like the FIs losing all configuration.

Does it usually just either work or fail? Any major catastrophes I should be aware of? I wasn't able to get an actual outage window. The management said previous admins did it without an outage so...
