
UCS not detecting ESXi Host reboot to apply Firmware Updates

eefranzen
Level 1

Not sure if I'm missing something, but it seems that UCS doesn't detect reboots outside of the CI environment.

I'm updating the firmware on the B200-M3 and M4 blades with User Acknowledgement set, and the blade sits waiting for me to acknowledge the reboot. I then go to vCenter Update Manager, apply the updated enic and fnic drivers, and reboot. When I go back to UCS, it still requires its own reboot, so I have to reboot a second time.

So now I have to manually put each host into maintenance mode, apply the updates, switch back to UCS to reboot, then switch back to vCenter, and repeat 100 times.

I miss the days when I could use vCenter Update Manager to patch hundreds of hosts without intervention. It seems as though I have gone back to the days of 5-1/4 inch floppies and sneakernet.

Is this normal behavior or did I miss something that will allow firmware updates to apply on the next reboot?

Thanks,

Ernie

8 Replies

tcerling
Level 4

I ran into the same thing in the Microsoft environment.  UCS goes through a special procedure when you acknowledge the reboot request within UCS to apply UCS specific things.  A host operating system reboot does not trigger the flag with UCS to cause the acknowledgement.

I was told they are looking into enabling the host OS reboot to trigger the update. I'm guessing that it will take a bit of a UI change, because you would not necessarily want changes to apply if the system performed an unscheduled reboot (a.k.a. crash/restart). You will most likely want some sort of scripting to turn on acknowledgement on OS reboot before performing the OS reboot.

Thanks for confirming.

One would think that since Cisco and VMware are partners of this Flexpod thingy, that they would coordinate patching a little better.

I would also be interested to know if UCS performs clean OS shutdowns before it reboots the blade or are we expected to power down the server via the OS, and then reboot via UCS.

Ernie

As I said, it is being investigated for a future release. There are situations where you would not always want an OS reboot to apply the pending change that UCS has made. And since the host OS has no knowledge that it is running under UCS, nothing in the host reboot would automatically alert UCS to the fact that it is rebooting. This is particularly dangerous if the host crashes and restarts. Cisco engineers are aware of the various scenarios and are working on a solution.

Currently, when you acknowledge the pending reboot, I am pretty sure it asks if you want a clean shutdown or not, just like any other time UCS reboots the host.

I stopped trying to use that functionality, since it was always a hard shutdown of the ESXi server and it failed to vMotion the guests off, resulting in a lot of dead VMs and angry owners.

I manually evacuate a host, shut it down and acknowledge it in UCSM. It's not the ideal 'Rolling Upgrade' they say it is.

eefranzen
Level 1

As far as a clean shutdown, not so much. We receive Event Log entries on the Blades running Windows that show the previous shutdown was unexpected.

So, hopefully this will get resolved soon or Cisco will stop releasing new firmware updates monthly.

Ernie

thomsonac
Level 4

Hi Ernest,

I wrote this script over the summer to iterate through all of our ~120 ESXi servers that were pending reboot for firmware updates. The script connects to UCSC and then displays all connected UCSM domains for you to select one. It takes two required parameters (a vCenter instance and a UCSC instance) and one optional parameter (a SCOM server). The script will only update 1 host at a time. I added a bit of error checking, and you can set a mail server and email address so the script emails you if it bombs. I'd say it runs pretty reliably; I would kick it off in the morning and keep an eye on it throughout the day. I suggest running it with the -Verbose flag. Lastly, after the firmware update the script remediates the host through VUM to add the UCS drivers for the new firmware.

Before running this script, you will need to update the Host Firmware policy in UCSC so that the service profiles are set to "Pending Reboot".

As always, use this script at your own discretion; there is no implied warranty or support. Good luck!

Script steps

1) Import needed modules

2) Connect to vCenter

3) Connect to UCSC and display all connected UCSM domains.  Select which domain to update.

4) Connect to that UCSM instance

5) Return all ESXi global service profiles that need updates that are associated in that UCSM domain. See Note #1

6) Sets SCOM maintenance mode for that host

7) Puts the host into VMware maintenance mode.  See Note #2

8) Disables VMware alarms

9) Shuts down the host

10) Acknowledges the UCS pending reboot

11) Waits for the update to finish.  See Note #3

12) Boots the blade if it is finished and off

13) Remediates the ESXi host in VUM

14) Enables alarms

15) Tries to clear the flexflash errors that occur with UCS (very annoying bug).  This usually works, sometimes not, doesn't affect the running of the blade.  See Note #4

16) Takes the host out of VMware and SCOM maintenance mode.
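
To make the ordering of steps 6 through 16 easier to follow, here is a minimal Python sketch of the per-host sequence. It is not the script attached to this post: the step names and the `update_hosts` function are hypothetical, and each action is a placeholder callable you would back with your own vCenter/UCS tooling. It assumes, per the descriptions in this thread, that the SCOM step is optional and that the flexflash cleanup is best effort.

```python
from typing import Callable, Dict, List

# Hypothetical step names mirroring steps 6-16 above, in order.
STEP_ORDER = [
    "set_scom_maintenance",
    "enter_vmware_maintenance",
    "disable_alarms",
    "shut_down_host",
    "acknowledge_ucs_reboot",
    "wait_for_firmware_update",
    "power_on_blade",
    "remediate_with_vum",
    "enable_alarms",
    "clear_flexflash_errors",
    "exit_maintenance",
]

# Steps whose failure should not stop the run (assumption based on the post).
BEST_EFFORT = {"set_scom_maintenance", "clear_flexflash_errors"}


def update_hosts(hosts: List[str],
                 actions: Dict[str, Callable[[str], None]]) -> List[str]:
    """Run every step for one host before moving on to the next host."""
    completed = []
    for host in hosts:
        for step in STEP_ORDER:
            action = actions.get(step)
            if action is None:
                continue  # step not configured, e.g. no SCOM server
            try:
                action(host)
            except Exception:
                if step not in BEST_EFFORT:
                    raise  # abort the whole run on a real failure
        completed.append(host)
    return completed
```

Because the loop finishes one host completely before starting the next, a failure in a required step leaves at most one host in maintenance mode.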

Notes

1) On line 146 I search for all global service profiles that have *ESX* in the name (our naming convention).  You may need to change this to follow your naming convention.

2) The script will wait forever for the ESXi host to go into maintenance mode. I did this on purpose in case there are VMs pinned to a particular host (backup appliances in my case). I would notice that the script had been waiting for a while and manually resolve any DRS issues.

3) Occasionally the script will progress despite the UCS blade status still being "Config". During the firmware update the blade cycles through Config, Discovery, etc., and I was too lazy to write in all the possibilities. It shouldn't affect the script.

4) We run our ESXi servers with local disks and not flash cards.  This might break your host, test it if you run on flash cards!!!

5) Update line 252 with an email address and SMTP server if you want to get errors via email.
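
One way to handle the status cycling described in Note #3 is to wait until the blade reports the same non-transitional status several polls in a row. The Python sketch below is an illustration only and is not taken from the attached script: `wait_until_settled`, its parameters, and the default values are hypothetical, and the transitional set contains only the two states named in the note, so it would need extending for a real UCS domain.

```python
import time
from typing import Callable, FrozenSet

# Only the states named in Note #3; the real list is longer (assumption).
TRANSITIONAL: FrozenSet[str] = frozenset({"config", "discovery"})


def wait_until_settled(get_status: Callable[[], str],
                       transitional: FrozenSet[str] = TRANSITIONAL,
                       stable_polls: int = 3,
                       interval: float = 30.0,
                       timeout: float = 3600.0,
                       sleep: Callable[[float], None] = time.sleep,
                       clock: Callable[[], float] = time.monotonic) -> str:
    """Poll until the same non-transitional status is seen stable_polls times in a row."""
    deadline = clock() + timeout
    streak = 0
    last = None
    while True:
        status = get_status().strip().lower()
        if status in transitional:
            streak = 0          # still cycling, start counting again
        elif status == last:
            streak += 1         # same settled status as the previous poll
        else:
            streak = 1          # first sighting of a new settled status
        last = status
        if streak >= stable_polls:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"blade still reporting {status!r} at timeout")
        sleep(interval)
```

Requiring a few consecutive identical readings guards against moving on when the blade only briefly leaves "Config" before cycling again.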

Thank you. I do not use SCOM, but since you have done a superb job documenting, I might be able to get it working without it.

Ernie

Thanks. If I remember correctly, I use a try/catch when putting the server into SCOM maintenance mode, so as long as you set the variable to anything I would think it would still work. PM me if you have any questions.

-Alex
