Solved: Re: CUCM 12.5 NTP Unsynchronized

bwolf · ‎03-24-2021

Hello All,

CUCM System version: 12.5.1.13900-152 - 1 Publisher and 3 subscribers

I'm having an issue where NTP is unsynchronized. I can sit and use utils ntp restart on the publisher all day long and it never synchronizes.

remote            refid           st t when poll reach    delay   offset   jitter
=================================================================================
+<NTP IP ADDRESS>                    u   15   64     1    1.023  -1823.8  1653.58
+<NTP IP ADDRESS>                    u   14   64     1   40.296  -1865.4  1568.27
+<NTP IP ADDRESS>                    u   13   64     1    0.907  -1910.8  1698.47
*<NTP IP ADDRESS>                    u   45   64     1   42.508  -519.46  287.555

unsynchronised
polling server every 64 s

You can see lots of jitter and offset. This status was captured right after I restarted NTP.
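For anyone who would rather check these numbers in a script than by eye, here is a minimal Python sketch that parses peer rows like the ones above and lists the peers whose offset exceeds a limit. The names `Peer`, `parse_peers`, and `peers_over_offset` are hypothetical, not part of any CUCM or NTP tool. The 128 ms default mirrors ntpd's usual step threshold and is only an illustrative cutoff. Because the refid and st columns are blank in the redacted output, the sketch reads the numeric columns from the right-hand side of each row.

```python
from typing import List, NamedTuple


class Peer(NamedTuple):
    tally: str        # leading status character, e.g. '*' or '+'
    reach: int        # reachability register (printed in octal)
    delay_ms: float
    offset_ms: float
    jitter_ms: float


TALLY_CHARS = "*+-#ox.~"

# Peer table copied from the post above (addresses are redacted there).
SAMPLE = """\
remote refid st t when poll reach delay offset jitter
==============================================================================
+<NTP IP ADDRESS>                u 15 64 1 1.023 -1823.8 1653.58
+<NTP IP ADDRESS>                u 14 64 1 40.296 -1865.4 1568.27
+<NTP IP ADDRESS>                u 13 64 1 0.907 -1910.8 1698.47
*<NTP IP ADDRESS>                u 45 64 1 42.508 -519.46 287.555

unsynchronised
polling server every 64 s
"""


def parse_peers(text: str) -> List[Peer]:
    """Parse peer rows, reading the numeric columns from the right."""
    peers = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 7:
            continue  # separator rule, status lines, blank lines
        try:
            reach = int(fields[-4], 8)
            delay, offset, jitter = (float(f) for f in fields[-3:])
        except ValueError:
            continue  # column header or any other non-data line
        tally = line[0] if line[0] in TALLY_CHARS else " "
        peers.append(Peer(tally, reach, delay, offset, jitter))
    return peers


def peers_over_offset(peers: List[Peer], limit_ms: float = 128.0) -> List[Peer]:
    """Return the peers whose absolute offset exceeds limit_ms (assumed cutoff)."""
    return [p for p in peers if abs(p.offset_ms) > limit_ms]


if __name__ == "__main__":
    for p in peers_over_offset(parse_peers(SAMPLE)):
        print(f"{p.tally} offset={p.offset_ms} ms "
              f"jitter={p.jitter_ms} ms reach={p.reach:o}")
```

With the sample above, all four peers are over the illustrative limit, and each has a reach value of 1, meaning only the most recent poll was answered, which fits a table captured right after an NTP restart.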

We're using known-good internal Linux NTP servers that we have used for years with no issues (CUCM is the only system in our network having the NTP issue). Port 123 is open, so no issues there.

I have a case open with TAC. They recommend unchecking the "Synchronize Time with Host" option on the VM in VMware (ESXi version 7.0), which I haven't been able to do yet because our company is in a change freeze.

I just wanted to check here to see if anyone else has experienced this and if they had any suggestions to get NTP resynchronized. I also wanted to see if anyone who had this issue was able to resolve it by unchecking the "Synchronize Time with Host" option in VMware. Most info I can find says to do traces (we've done that with TAC) and just keep trying to restart NTP, but this does not seem to help with getting NTP synchronized.

This issue is causing a DBReplicationFailure to occur every hour. The issue seems to have started after I upgraded from 11.5 to 12.5 (SU1) and continues to happen after we upgraded to 12.5 (SU4), which is the most current version of CUCM available.

When it first started happening, it seemed like restarting the CUCM cluster would resolve it temporarily. About one week after the restart, the issue would occur again. But the issue came up again two days ago and now restarting the cluster isn't even helping as the issue persists pretty much immediately after the cluster restart.

Any info would be greatly appreciated

Thanks!

bwolf · ‎05-20-2021

Hey RITR,

Thanks for the info. I appreciate the input

Roger Kallberg · ‎05-20-2021

Never had any issues with NTP in CM. If you use a suitable NTP service it runs like clockwork.

bwolf · ‎06-01-2021

Dang, well lucky you, I wish some of that luck would rub off on my system XD. We're using suitable NTP servers. Everything else in our company uses them too, and CUCM is the only system with issues. It's not like NTP is some complicated variable to work out, it just needs to keep time synced. We used our internal NTP servers for years with no issues until we upgraded the ESXi hosts to 7.0. Even known-good "suitable" public NTP servers (ones TAC recommended using) have the same issue. This is a bug Cisco needs to address/patch, because they are claiming CUCM 12.5 SU3 and ESXi 7.0 are compatible when they obviously are not. I'm convinced of that even more now that I've seen how many other people have been dealing with this issue.

Mortaza Rohani · ‎05-31-2021

We have the same issue on CUCM 12.5 SU2 and ESXi 7.0 U2, and we have temporarily solved the problem by rolling back to ESXi 6.7.

The question is: will Cisco offer a patch for this issue, or is the only solution to upgrade to version 14?

Thanks

bwolf · ‎06-01-2021

Hey Mortaza,

Yeah, I'd give my left foot if we could go back to ESXi 6.7, but there is a 0% chance of that happening at our company. I've begged. I'm tempted to upgrade to v14, but Cisco said CUCM 12.5 SU3 and ESXi 7.0 are compatible, which is obviously not true, so I'm having trust issues with Cisco right now lol. TAC has me running in circles doing the same things over and over: collecting logs, changing NTP servers, restarting the cluster, sitting and restarting NTP for hours, and it doesn't seem to be getting me anywhere. It's really frustrating. These actions have helped other people who have been in this situation, so I'm guessing TAC is just as frustrated as I am with the whole thing. I just don't understand why they're able to resolve this issue for some people but not others. I hope they get it figured out and release a patch. I would not look forward to upgrading CUCM again, especially to the first release of a new version.

bwolf · ‎07-13-2021

Hey Mortaza,

Not sure if you saw the solution I posted, but we ended up getting the NTP issue resolved by upgrading the VMware VM hardware to version 19, which was not available at the time we started having this issue.

I hope this helps!

r.romirer1 · ‎06-08-2021

We also had NTP sync issues after upgrading to ESXi 7.0 (CUCM version 12.5 SU4). Check the VM hardware version of the virtual machine in the vSphere Client. We upgraded from version 8 to version 13 (you also have to restart the server!) and now NTP syncs again. Here is the KB article: https://kb.vmware.com/s/article/1003746

by_hamid · ‎06-08-2021

Same problem with CUCM 12.5 SU3.

Please post here if the problem is resolved.

bwolf · ‎07-13-2021

Hey by_hamid,

I posted a solution to this issue, not sure if you saw it, but we were able to resolve it by upgrading the VMware VMs that CUCM runs on to hardware version 19, which was not available at the time this issue began.

I hope this helps!

bwolf · ‎07-13-2021

To all waiting for an update on this:

The issue was resolved by upgrading the VM hardware to version 19, which was not available at the time this issue began. We upgraded the VMware VM hardware 5 days ago and NTP has been synchronized as expected ever since.

Thanks for the input you all have provided.

Mortaza Rohani · ‎08-02-2021

Thank you very much for following up, @bwolf.
I suggest you mark this answer as the "correct answer" as well.

bcon6928 · ‎10-07-2021

Thanks for posting this! I passed this article along to Cisco, and from what TAC told me it sounds like this is a hot topic inside Cisco. The latest ESXi 7.0 U2 build, #18538813, triggered this issue for me. Upgrading all CUCM and IM&P nodes to hardware version 19 fixed it.

bcon6928 · ‎10-20-2021

Update on this issue. It came back for us today. I reopened the ticket with Cisco. I plan to reboot tonight since that tends to fix it.

mn4 · ‎11-08-2021

Hello bcon6928,

I am having the same issue with the CUCM of a customer.

Did rebooting the cluster fix the issue after it showed up again (worrisome!)?

Can you provide further details?

Thank you

MN

bcon6928 · ‎11-08-2021

I have a ticket open with VMware and they suggested moving to 7.0 U3. I'm a bit hesitant to upgrade since it is not supported by Cisco. I'm currently trying to get my Cisco rep to escalate this, since TAC has not been very helpful so far. Rebooting did fix the issue, but I'm not sure for how long. I have been rebooting every week to try to avoid outages. One takeaway is that it appears to affect only the Pubs so far. Once the Pub is rebooted, the Sub naturally fixes itself. I will update when I have a bit more information. If the customer can, I would downgrade ESXi to a build below #18538813.