08-31-2016 09:28 PM - edited 03-19-2019 11:32 AM
Hi All,
Just performed an in-place upgrade of the CUCM cluster from v8.6.2.22900-9 to v11.0.1.22900-14. The same VMs were used.
The upgrade itself completed with no issues. When we reboot the VMs after the upgrade, they load correctly: they can ping, phones register, RTMT finds the servers, etc. However, the VMware console hangs on the last part of the boot at: "SELinux update in progress..".
The VM eventually reaches the login prompt, but it can take a few hours to get there. During this time the server functions as it should, it just does not present a prompt, and the CPU on the servers stays around the 60% mark.
I had opened a TAC case, and at the end of the case the engineer didn't know what was causing the issue and suggested a rebuild of the server. As a test, I rebuilt and restored one server to the cluster (on version 11), and that fixed the issue for it.
We have 9 servers in the cluster, so I would prefer not to rebuild them all.
When I reboot the affected server, the same issue happens again on the next reboot. It's as if the SELinux update never fully applies after the upgrade and it tries to re-apply over and over on each subsequent reboot.
I have also put the servers into permissive mode and that has made no difference.
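For reference, this is roughly how the SELinux mode is checked and switched from the CUCM admin CLI (a sketch from memory; the exact command names and output vary by release, so verify against the CLI reference for your version):

```
admin: utils os secure status      # show the current mode (Enforcing/Permissive)
admin: utils os secure permissive  # switch SELinux to permissive mode
admin: utils os secure enforce     # return to enforcing mode
```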
So, has anyone come across this issue, or does anyone have suggestions on how to fix it without a server rebuild?
Thank you.
09-01-2016 06:34 AM
It could be the following bug that you are running into:
https://bst.cloudapps.cisco.com/bugsearch/bug/CSCux90747/?reffering_site=dumpcr
Manish
09-25-2016 05:15 PM
Thanks mgogna for the quick advice.
I have followed the instructions in the bug report, but unfortunately it has not helped. The server still boots up OK, then the CPU sits at around 50% and the VMware console remains stuck at the SELinux update message.
The CPU stays pegged at 50-60% while the SELinux prompt is displayed on the VMware console. Once the server finally boots (it took 4-5 hours last time), the CPU drops to around the 25-30% mark.
09-25-2016 05:15 PM
Just an update on this case. We have been able to solve the issue with the help of TAC.
Basically, there were thousands of files left over from the 8.6 installation on the inactive partition. Every time the server booted, the SELinux process would try to re-index (relabel) those files; putting the server into permissive mode did not help. Once the files were located, TAC logged in with root access and removed them. The removal took around 3-4 hours on some of the servers, which was in line with how long the SELinux update froze the boot.
To check whether you have this issue, list the leftover files over SSH. If the list process takes a few hours, you have the issue, so log a TAC case: TAC will log in with root access and remove the files.
Once the files have been removed, reboot the server, and the SELinux update will now take only 1-2 minutes to complete.
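As a rough sketch of the kind of check involved (the path is an assumption on my part: the inactive OS partition on CUCM is commonly mounted at /partB, and root shell access is TAC-only, so treat this as illustrative rather than the exact commands TAC used):

```shell
#!/bin/sh
# Illustrative only: count entries under a partition's mount point without
# crossing into other filesystems (-xdev). The default path /partB is an
# assumption (commonly the inactive OS partition on CUCM); confirm with TAC.
PARTITION="${1:-/partB}"
# If this count runs for hours, or returns millions of entries, the
# leftover-file problem described above is likely present.
find "$PARTITION" -xdev 2>/dev/null | wc -l
```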
I believe an internal bug has been opened, so hopefully this will be fixed in future releases.
11-13-2016 10:40 AM
11-13-2016 01:31 PM
Yes, that's very similar to what I saw, VS. Did you try rebooting the subscriber again? If you do, you will see this process just keeps repeating. The CallManager application itself works and the cluster is fine; the CPU just goes to 60-70% for 3 or so hours after a reboot while SELinux re-indexes the millions of files in the locations I posted above. Did you check how many files you had on the active and inactive partitions, as per my earlier posts? You may want to log a TAC case and have them removed, or you may face the same problem during the next reboot or upgrade.