Solved: Re: Not sure why we aren't seeing

Igor Gaydarov · ‎11-19-2016

I have a very strange problem trying to upgrade CUCM from 11 to 11.5.

The error is Unable to obtain an inactive partition label

The full log file from upgrading is attached.

I've tried different bootable and nonbootable iso files and even other versions - the result is the same.

This is the BE6K installation.

Did anyone have this problem?

Aseem Anand · ‎11-19-2016

Hi,

The error points to a platform issue where some files are corrupted. I suggest taking a DRS backup, reinstalling the server, restoring the backup, and then retrying the upgrade.

Aseem

(Please rate if useful)

Igor Gaydarov · ‎11-20-2016

Aseem, thank you for the reply.

Could you be more specific: is it a database error or an OS core error?

Is there some way to fix it without reinstalling?

Best Regards,

Igor Gaydarov

Aseem Anand · ‎11-20-2016

Hi Igor,

It is an OS issue. In my view there is no way to fix it. You are better off doing a reinstall and then trying the upgrade again; alternatively, you can check with TAC for a second opinion.

Aseem

Fred Nielsen (ePlus) · ‎03-07-2017

Not sure why we aren't seeing this in more places, but I am seeing this on an upgrade from 11.5(1)SU1 to SU2 as well. igorgaydarov, did you ever find a fix or workaround for it?

I wonder if there might be a new bug lurking out there. The upgrade script's "find_inactive_partition_label" function parses the output of "mount -l", looking for partition labels (shown in square brackets at the end of each line, if a label is present), to first identify the proper active partition. There is a clear difference between the output of this command on two freshly installed systems in my lab:

CUCM 10.5 (RHEL 6.2)

# mount -l
/dev/sda1 on / type ext4 (rw,noatime) [/] <-the script looks for the first root FS
proc on /proc type proc (rw,noexec)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/sda6 on /common type ext4 (rw) [/common]
/dev/sda3 on /grub type ext4 (rw) [/grub]
/dev/sda2 on /partB type ext4 (rw) [/partB]
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)

CUCM 11.5(1)SU1 system (RHEL 6.6)

# mount -l
rootfs on / type rootfs (rw) <-this extra root FS entry clearly breaks the parser script
proc on /proc type proc (rw,relatime)
sysfs on /sys type sysfs (rw,relatime)
devtmpfs on /dev type devtmpfs (rw,relatime,size=4017168k,nr_inodes=1004292,mode=755)
devpts on /dev/pts type devpts (rw,relatime,gid=5,mode=620,ptmxmode=000)
tmpfs on /dev/shm type tmpfs (rw,relatime)
/dev/sda1 on / type ext4 (rw,noatime,barrier=1,data=ordered) [/]
/proc/bus/usb on /proc/bus/usb type usbfs (rw,relatime)
/dev/sda6 on /common type ext4 (rw,relatime,barrier=1,data=ordered) [/common]
/dev/sda3 on /grub type ext4 (rw,relatime,barrier=1,data=ordered) [/grub]
/dev/sda2 on /partB type ext4 (rw,relatime,barrier=1,data=ordered) [/partB]
...

Cisco will need to change their upgrade script to resolve this I'm betting.
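
To make the failure mode concrete, here is a minimal sketch of a parser in the style described above. It is not Cisco's actual code: the function name first_root_label and its logic are assumptions made purely for illustration. It reads "mount -l" style output on stdin, takes the first line mounted on /, and prints the label found in square brackets at the end of that line, if any.

```shell
#!/bin/bash
# Hypothetical illustration only: NOT the real find_inactive_partition_label.
# Reads `mount -l` style output on stdin and prints the label of the FIRST
# filesystem mounted on "/" (prints nothing if that line carries no label).
first_root_label() {
    awk '$2 == "on" && $3 == "/" {
        # A label, when present, is the last field and looks like [label].
        if ($NF ~ /^\[.*\]$/) {
            print substr($NF, 2, length($NF) - 2)
        }
        exit
    }'
}

# Sample lines taken from the two listings above.
old_style='/dev/sda1 on / type ext4 (rw,noatime) [/]
/dev/sda2 on /partB type ext4 (rw) [/partB]'

new_style='rootfs on / type rootfs (rw)
/dev/sda1 on / type ext4 (rw,noatime,barrier=1,data=ordered) [/]
/dev/sda2 on /partB type ext4 (rw,relatime,barrier=1,data=ordered) [/partB]'

echo "old: '$(printf '%s\n' "$old_style" | first_root_label)'"
echo "new: '$(printf '%s\n' "$new_style" | first_root_label)'"
```

With the old-style input the sketch finds the label /, while with the new-style input it stops at the unlabeled rootfs line and finds nothing. A parser that behaves this way would plausibly fail to identify the active partition, which would be consistent with the error reported in this thread.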

The root cause of this behavioral change is that RHEL changed /etc/mtab from a regular file in previous versions to a symlink, which in turn causes tools that read mtab (notably disk-related commands such as df from coreutils and mount from util-linux) to produce different output.

CUCM 10.5 (RHEL 6.2)

# ls -l mtab; echo; cat mtab 

-rw-r--r-- 1 root root 295 Mar 7 02:05 mtab

/dev/sda1 / ext4 rw,noatime 0 0
proc /proc proc rw,noexec 0 0
sysfs /sys sysfs rw 0 0
devpts /dev/pts devpts rw,gid=5,mode=620 0 0
/dev/sda6 /common ext4 rw 0 0
/dev/sda3 /grub ext4 rw 0 0
/dev/sda2 /partB ext4 rw 0 0
tmpfs /dev/shm tmpfs rw 0 0
none /proc/sys/fs/binfmt_misc binfmt_misc rw 0 0

CUCM 11.5(1)SU1 system (RHEL 6.6)

# ls -l mtab; echo; cat mtab 

lrwxrwxrwx. 1 root root   17 Mar  1 18:06 mtab -> /proc/self/mounts

rootfs / rootfs rw 0 0
proc /proc proc rw,relatime 0 0
sysfs /sys sysfs rw,relatime 0 0
devtmpfs /dev devtmpfs rw,relatime,size=4017168k,nr_inodes=1004292,mode=755 0 0
devpts /dev/pts devpts rw,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /dev/shm tmpfs rw,relatime 0 0
/dev/sda1 / ext4 rw,noatime,barrier=1,data=ordered 0 0
/proc/bus/usb /proc/bus/usb usbfs rw,relatime 0 0
/dev/sda6 /common ext4 rw,relatime,barrier=1,data=ordered 0 0
/dev/sda3 /grub ext4 rw,relatime,barrier=1,data=ordered 0 0
/dev/sda2 /partB ext4 rw,relatime,barrier=1,data=ordered 0 0
...
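
A quick way to check which of the two situations a given server is in could look like the sketch below. The function names mtab_kind and first_root_device are made up for this example, and the path argument defaults to /etc/mtab. Nothing here modifies the system; it only reports whether the file is a symlink and which device is listed first for /.

```shell
#!/bin/bash
# Sketch: report whether a given mtab path is a symlink or a regular file,
# and which device is listed first for the "/" mount point.
# Function names are illustrative; the path argument defaults to /etc/mtab.
mtab_kind() {
    local path="${1:-/etc/mtab}"
    if [ -L "$path" ]; then
        echo "symlink -> $(readlink "$path")"
    elif [ -f "$path" ]; then
        echo "regular file"
    else
        echo "missing"
    fi
}

first_root_device() {
    local path="${1:-/etc/mtab}"
    # Field 1 is the device, field 2 the mount point (fstab-style layout).
    awk '$2 == "/" { print $1; exit }' "$path"
}

echo "mtab is: $(mtab_kind)"
echo "first entry for /: $(first_root_device)"
```

If the first entry for / comes back as rootfs rather than a /dev/sdaN device, the system matches the 11.5(1)SU1 listing above.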

Igor Gaydarov · ‎03-09-2017

Frederick, the only workaround for me was to reinstall CUCM completely.

As I understand it, there is a bug in the OS and there is no way to fix it.

jdiegmueller · ‎03-12-2019

Fred, I know it has been a few years, but I wanted to both thank you for your diligent work on this and document the workaround I employed in case I or anyone else needs to refer to it again in the future.

I ran into this same scenario on a CUCM 11.5(1)SU4 -> 11.5(1)SU5 upgrade. The work you put into finding exactly what was failing (the parsing of /etc/mtab) saved me hours of troubleshooting, and I was able to implement a workaround within minutes of finding this thread & your analysis.

In my case, what I did was remove the symlink and generate a "normal" /etc/mtab where /dev/sda1 (my active partition, hence mounted on / and labeled /) was listed before rootfs. This resulted in the upgrade script function working as expected, allowing the upgrade to proceed as desired.

In case anyone else runs into this and needs to do the same, log in as root and run this (all on one line):

cp -a /etc/mtab /etc/mtab.ORIG && rm -f /etc/mtab && cat /proc/self/mounts | grep -v rootfs > /etc/mtab && cat /proc/self/mounts | grep rootfs >> /etc/mtab && chmod 644 /etc/mtab

It will:

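* Save the original /etc/mtab as /etc/mtab.ORIG (because cp -a does not follow symlinks, this copies the symlink itself rather than its contents; not really necessary either way, since the original contents are always available in /proc/self/mounts)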

* Remove the symbolic link

* Create an /etc/mtab with the same contents as before, except rootfs is now last
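
For anyone who wants to try the reordering on a scratch copy before touching the live file, the same steps can be sketched as a function over arbitrary paths. This is an illustrative rewrite, not the exact command above: the name reorder_mounts is made up, and it matches only lines that start with "rootfs " where the one-liner matches any line containing rootfs.

```shell
#!/bin/bash
# Sketch of the same reordering, written as a function over arbitrary paths
# so it can be tried on a copy first. Names here are illustrative only.
# Usage: reorder_mounts SOURCE DEST
reorder_mounts() {
    local src="$1" dest="$2" tmp
    tmp="$(mktemp)" || return 1
    # Everything except rootfs first, then the rootfs line(s) last.
    # grep exits non-zero when nothing matches, so its status is not chained.
    grep -v '^rootfs ' "$src" > "$tmp"
    grep '^rootfs ' "$src" >> "$tmp"
    # Replace DEST (removing a symlink if one is there) with a regular file.
    rm -f "$dest"
    cat "$tmp" > "$dest" && chmod 644 "$dest"
    rm -f "$tmp"
}

# Dry run on sample data instead of the live /etc/mtab:
sample="$(mktemp)"
printf '%s\n' \
    'rootfs / rootfs rw 0 0' \
    'proc /proc proc rw,relatime 0 0' \
    '/dev/sda1 / ext4 rw,noatime,barrier=1,data=ordered 0 0' > "$sample"
reorder_mounts "$sample" "$sample.preview"
cat "$sample.preview"
rm -f "$sample" "$sample.preview"
```

On the real system the equivalent call would be reorder_mounts /proc/self/mounts /etc/mtab, run as root. If a rollback is ever needed, recreating the symlink with ln -sfn /proc/self/mounts /etc/mtab should restore the original state, assuming the link target shown in the 11.5(1)SU1 listing earlier in the thread.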

Once this was done, the SU5 upgrade ran without issue. The partition that SU5 installed to was left with a normal /etc/mtab (not a symlink to /proc/self/mounts, as I/we originally had), so the issue should not re-present itself.

I confess I am really confused about what conditions led to this situation. In our case, it was all 5 servers of a CUCM 11.5(1)SU4 cluster that were installed with 11.5(1)SU4 (i.e., there was nothing in the inactive partition). I spun up a 2-node 11.5(1)SU4 cluster in a lab, and indeed both /etc/mtab files are symlinks. So perhaps it's just installation of certain versions? Like you, I'm confused how this thread is basically the only chatter about this behavior out there, but then again this is the first time I've run into this myself, and I touch a LOT of CUCM clusters in a given year. So who knows.

Thanks again,

-jd

jdiegmueller · ‎06-22-2019

Ran into this again on a CUCM IM&P 11.5(1)SU5 -> 11.5(1)SU6 upgrade for a completely different customer today. Wanted to again thank the gentleman who originally pointed me in the right direction on this, and also to thank myself for smartly leaving behind a simple "copy and paste" solution so I didn't waste time on this again. :)

-jd

dlcharville · ‎09-24-2020

jd - thanks for posting this workaround, and thanks to Fred for his work. I've run into this issue twice in the lab while practicing a production upgrade, both times on the same cluster/environment, although on different products. The first was a CUCM Publisher going from 10.5(1) SU2 to 10.5(1) SU10. The second was a UCCX Publisher going from 10.6(1)SU3 ES03 to 11.6(2).

I opened a TAC case; there is no bug filed for this, and I was told to rebuild the Publisher. Sure wish Cisco would take ownership of this issue and figure out how to fix it without rebuilding the Publisher.

Thanks,

Dan

jdiegmueller · ‎07-24-2024

Shout-out to this old thread from 2020; still running into this in 2024. Ran into this again today on an 11.5(1)SU4 cluster, this time only the IMP portion. Same deal: used the /etc/mtab fix I provided above, and it was immediately fixed.

R_Acuti · ‎04-24-2020

Just an FYI that this is STILL happening as late as SU7.

My employer is not allowing me to apply the fix outlined in this discussion because it is not a Cisco-sanctioned workaround. He is concerned that we'll invalidate our Cisco support plan if we make unsupported changes to the system.

I'll open a TAC case and see what Cisco has to say about it.

CUCM Upgrade Failed - Unable to obtain an inactive partition label