Solved: CUCM 12.5 NTP Unsynchronized

bwolf · ‎03-24-2021

Hello All,

CUCM System version: 12.5.1.13900-152 - 1 Publisher and 3 subscribers

I'm having an issue where NTP is unsynchronized. I can sit and use utils ntp restart on the publisher all day long and it never synchronizes.

remote             refid  st  t  when  poll  reach   delay    offset    jitter
==============================================================================
+<NTP IP ADDRESS>             u    15    64      1   1.023   -1823.8   1653.58
+<NTP IP ADDRESS>             u    14    64      1  40.296   -1865.4   1568.27
+<NTP IP ADDRESS>             u    13    64      1   0.907   -1910.8   1698.47
*<NTP IP ADDRESS>             u    45    64      1  42.508   -519.46   287.555

unsynchronised
polling server every 64 s

You can see lots of jitter and offset. This status was captured right after I had restarted NTP.
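For anyone comparing their own output against the status lines above, here is a minimal Python sketch (not a Cisco tool) that reads ntpq-style peer lines and lists what looks off. It assumes the last five columns are always poll, reach, delay, offset and jitter, so it parses from the right and tolerates the redacted refid/st columns. The names `Peer`, `parse_peer` and `concerns` are hypothetical, and the 128 ms and 100 ms thresholds are illustrative assumptions, not values CUCM documents.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    tally: str        # first column: "*" = selected peer, "+" = candidate
    poll_s: int       # polling interval in seconds
    reach: int        # 8-bit reachability register (printed in octal)
    delay_ms: float
    offset_ms: float
    jitter_ms: float


def parse_peer(line: str) -> Peer:
    """Parse one peer line, reading the numeric columns from the right."""
    fields = line.split()
    poll, reach, delay, offset, jitter = fields[-5:]
    return Peer(
        tally=line[:1],
        poll_s=int(poll),
        reach=int(reach, 8),   # reach is displayed as an octal number
        delay_ms=float(delay),
        offset_ms=float(offset),
        jitter_ms=float(jitter),
    )


def concerns(peer: Peer, max_offset_ms: float = 128.0,
             max_jitter_ms: float = 100.0) -> list:
    """Return human-readable concerns; thresholds are assumptions."""
    found = []
    if peer.reach != 0o377:
        # 377 (octal) means the last eight polls were all answered.
        found.append("reach register not full (few polls answered so far)")
    if abs(peer.offset_ms) > max_offset_ms:
        found.append("offset %.1f ms exceeds %.0f ms" % (peer.offset_ms, max_offset_ms))
    if peer.jitter_ms > max_jitter_ms:
        found.append("jitter %.1f ms exceeds %.0f ms" % (peer.jitter_ms, max_jitter_ms))
    return found


SAMPLE = """\
+<NTP IP ADDRESS>                u 15 64 1 1.023 -1823.8 1653.58
+<NTP IP ADDRESS>                u 14 64 1 40.296 -1865.4 1568.27
+<NTP IP ADDRESS>                u 13 64 1 0.907 -1910.8 1698.47
*<NTP IP ADDRESS>                u 45 64 1 42.508 -519.46 287.555
"""

if __name__ == "__main__":
    for line in SAMPLE.splitlines():
        peer = parse_peer(line)
        print(peer.tally, peer.offset_ms, concerns(peer))
```

A reach of 1 right after a restart only means one poll has been answered so far, so the offset and jitter columns are the more telling part of this output.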

We're using known-good internal Linux NTP servers (CUCM is the only system in our network having the NTP issue) that we have been using for years with no issues. Port 123 is open, so no issues there.

I have a case open with TAC. They are recommending that we uncheck the "Synchronize Time with Host" option on the VM in VMware (ESXi version 7.0), which I haven't been able to do yet because our company is in a change freeze.

I just wanted to check here to see if anyone else has experienced this and if they had any suggestions to get NTP resynchronized. I also wanted to see if anyone who had this issue was able to resolve it by unchecking the "Synchronize Time with Host" option in VMWare. Most info I can find says to do traces (we've done that with TAC) and just keep trying to restart NTP, but this does not seem to help with getting NTP synchronized.

This issue is causing a DBReplicationFailure to occur every hour. The issue seems to have started after I upgraded from 11.5 to 12.5 (SU1) and continues to happen after we upgraded to 12.5 (SU4), which is the most current version of CUCM available.

When it first started happening, it seemed like restarting the CUCM cluster would resolve it temporarily. About one week after the restart, the issue would occur again. But the issue came up again two days ago and now restarting the cluster isn't even helping as the issue persists pretty much immediately after the cluster restart.

Any info would be greatly appreciated

Thanks!

bwolf · ‎07-13-2021

To all waiting for an update on this:

The issue was resolved by upgrading the VM Hardware to version 19, which was not available at the time this issue began. We upgraded the VMWare VM hardware 5 days ago and NTP has been synchronized as expected ever since.

Thanks for the input you all have provided


TomMar1 · ‎03-24-2021

Try adding an external NTP server (if you are able) just to see if it syncs. I had an issue with NTP a while ago and found some literature about recent call manager versions requiring NTP v4 as opposed to v3.

bwolf · ‎03-24-2021

Hey tmariutto,

Thank you for your reply. I did forget to mention that once the change freeze is over, I will be able to add an external NTP for testing. Do you happen to know how I can check to see which version of NTP I am using? It is possible we are using the wrong version.

Thank you!

TomMar1 · ‎03-24-2021

You would need to check what version of NTP your Linux servers are running.

Check this doc

https://www.cisco.com/c/en/us/support/docs/unified-communications/unified-communications-manager-callmanager/118718-technote-cucm-00.html

"Note: CUCM Version 9.x and later require that the NTPv4 server be configured as the preferred NTP server."

Try the command

utils network capture port 123
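Once you have an NTP packet from the capture open in a packet analyzer, the version is in the very first byte of the NTP payload (after the UDP header): 2 bits of leap indicator, 3 bits of version number, 3 bits of mode. A minimal Python sketch of that decoding follows. The function name `decode_ntp_flags` is hypothetical, and pulling the byte out of the capture file is left out.

```python
# Standard NTP association modes carried in the low 3 bits of the first byte.
MODES = {
    1: "symmetric active",
    2: "symmetric passive",
    3: "client",
    4: "server",
    5: "broadcast",
}


def decode_ntp_flags(first_byte: int) -> dict:
    """Split the first NTP payload byte into leap, version and mode."""
    if not 0 <= first_byte <= 0xFF:
        raise ValueError("expected a single byte (0-255)")
    return {
        "leap": first_byte >> 6,              # top 2 bits
        "version": (first_byte >> 3) & 0b111,  # middle 3 bits
        "mode": first_byte & 0b111,            # low 3 bits
    }


if __name__ == "__main__":
    for byte in (0x23, 0x24, 0x1B):
        flags = decode_ntp_flags(byte)
        print(hex(byte), "version", flags["version"],
              MODES.get(flags["mode"], "other"))
```

For example, a first byte of 0x24 decodes to version 4 in server mode, and 0x1B to version 3 in client mode.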


bwolf · ‎03-24-2021

Turns out it is NTPv4. Thank you for the information though!

Nithin Eluvathingal · ‎03-24-2021

CUCM doesn't work properly with some NTP servers. If I face any problem with NTP sync, I use one of the two methods below.

Use the NTP server time.google.com if the organization allows external NTP sync.
Sync any Cisco router/switch with the internal NTP, then make this NTP-synced router/switch the NTP server for CUCM. Use the command below to make the router/switch an NTP server.
```
ntp master [stratum level]
```

bwolf · ‎03-25-2021

Hey Nithin,

Thank you for the information. My organization is compartmentalized, so I don't really have access to mess with the routers and switches. I'm just the CUCM guy. I will see if we can work this out with the network guys who admin the switches and routers. But it does seem to make sense. Like making another internal NTP specifically just for CUCM to use?

Thanks

TomMar1 · ‎03-25-2021

Other than asking the networking team whether an ACL is required for NTP access from the Call Manager, there is not much else you need from them. You may not even need that: try setting up NTP using the default gateway IP for the voice network, which would normally live on a router running NTP. The router is getting NTP from somewhere (on-net or externally), so you would get it from the router. Hopefully the router is not getting NTP from a stratum 3 or higher source. IIRC CM wants the lowest stratum possible (3 or 4 or below), and going through the router adds one stratum.
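The stratum arithmetic can be sketched in a few lines of Python. A device synchronized to a stratum N source serves time at stratum N+1, and 16 means unsynchronized. The function names are hypothetical, and the `max_stratum` default of 4 is only an assumption taken from the "3 or 4" recollection above, not a documented CUCM limit.

```python
UNSYNCHRONIZED = 16  # NTP uses stratum 16 to mean "not synchronized"


def router_stratum(upstream_stratum: int) -> int:
    """Stratum the router advertises when synced to the given upstream."""
    return min(upstream_stratum + 1, UNSYNCHRONIZED)


def acceptable_for_cucm(upstream_stratum: int, max_stratum: int = 4) -> bool:
    """True if the router's advertised stratum is within the assumed limit."""
    return router_stratum(upstream_stratum) <= max_stratum


if __name__ == "__main__":
    for upstream in (1, 2, 3, 4):
        print("upstream stratum", upstream,
              "-> router serves stratum", router_stratum(upstream),
              "acceptable:", acceptable_for_cucm(upstream))
```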

Honestly, as others have mentioned, the best thing to do would be to get NTP from an external source (an NTP pool), provided CM is allowed access. I use Google time and 2 GPS sources.

bwolf · ‎05-20-2021

Thank you Nithin,

We attempted a public NTP a few days ago (128.138.140.44), it wasn't Google's NTP though. But if you're saying you've had success with Google's I'm going to give it a shot. But I just want to make sure we're in the same situation. Are you on CUCM 12.5 SU3 with ESXi host version 7.0 U1? We have pretty much isolated it to a compatibility issue with ESXi 7. Another person who responded was able to resolve the issue by going back to ESXi 6.7, but unfortunately I don't have that option. So, if you're using the same versions I am and have used google's NTP successfully, I'm interested in giving it a shot.

Thanks

dsitsgroup · ‎05-20-2021

We're working through a similar issue currently. We've unchecked the ESXi time sync settings that Cisco recommended, which didn't resolve the issue. Their next follow-up was to disable the .vmx settings listed here on the virtual machine: https://kb.vmware.com/s/article/1189.

We haven't made those vmx changes yet but will test them soon.

I believe that the issue is related to our recent ESXi host upgrades from 6.7 to 7.0. If we move the servers to a v6.7 host, NTP will resync within a couple of minutes. For testing, we moved our pub from a 7.0 host to a 6.7 host but left the sub on a 7.0 host: the pub synced but the sub didn't. We then moved the sub to 6.7 and it synced within a few minutes. We have multiple clusters on our platform ranging from versions 11.5 to 12.5, and each cluster is experiencing the issue on 7.0 hosts.

We have tested with both internal ntp (linux server & L3 Cisco switch) and external NTP (Google).

Do you happen to have any ESXi hosts running a version older than 7.0 that you could test migrating the VMs to and see if the issue is resolved?

bwolf · ‎05-20-2021

Hey dsitsgroup,

Thanks for the reply. We have gone through this VMWare article before. I'm not 100% sure I did it right though because it says to deselect the "Synchronize Time with Host" checkbox and the "synchronize time periodically" but when we go to the VM in VMWare, there is no "Synchronize Time with Host" checkbox, only "Synchronize guest time with host", which I assume is the same thing, and it has the "synchronize time periodically" check box right under it. So it seems like that is the correct option.

After we made sure those options were unchecked on all of our nodes, I restarted the whole CUCM cluster from CUCM OS Admin. I'm not well seasoned with VMware, but I assume restarting via OS Admin should satisfy the "restart the VM" step. Should I have restarted the VM via VMware instead? Is that somehow different from restarting via CUCM OS Admin?

Unfortunately, we don't have any clusters left with ESXi 6.7. Our Unix team upgraded them all to meet requirements for another application. I did a rush upgrade to CUCM 12.5 in preparation for it because Cisco claims it is compatible with ESXi 7.0, but apparently it is not.

I've been working with TAC on this for a while now. They started off by having me try restarting NTP over and over until it finally syncs up. After probably 100s of restarts, I gave up on that. Then they had us follow the VMware article you linked and get those settings unchecked. Then they had me try a public NTP (128.138.140.44), bypassing our internal NTP. None of this has resolved the issue, and the TAC engineer told me after our last communication that if the last bit of troubleshooting he gave me doesn't fix the problem, I should call in and get a new engineer, like he's giving up on it.

So now I'm stuck with this NTP issue, which is causing DB synchronization to fail. It's becoming a problem with newer phones we're putting into the system: some of them pull the wrong configuration, sometimes a configuration for people who aren't even with the company anymore and whose phones we have deleted from CUCM, but somehow it still finds those configs and loads them. It's really weird and annoying.

Thanks!

Roger Kallberg · ‎05-20-2021

AFAIK ESXi 7 is supported from CM 14. It is not supported for 12.5.

For more information about compatibility see this site https://www.cisco.com/c/dam/en/us/td/docs/voice_ip_comm/uc_system/virtualization/older_pages/virtualization-cisco-unified-communications-manager-older.html

bwolf · ‎05-20-2021

According to the documentation and TAC, CUCM 12.5 SU2 and above support ESXi 7.0 U1. But it seems like quite a few people have had this issue and maybe the documentation and TAC were wrong.

I didn't know 14 was out, it looks like it was just released in March. I just did a panic upgrade to 12.5 SU3 knowing the ESXi upgrade was coming, I wouldn't mind upgrading again if I knew for sure they were compatible. Have you had any experience with CUCM 14 using ESXi 7.0 U1?

https://www.cisco.com/c/dam/en/us/td/docs/voice_ip_comm/uc_system/virtualization/virtualization-cisco-unified-communications-manager.html

Thanks!

Roger Kallberg · ‎05-20-2021

No on both counts. We don’t have ESXi 7 yet on any of the hosts where we run UC services, and we have not yet installed or upgraded any system to 14. Quite likely we will get our sandbox systems updated in the coming weeks and our preproduction systems thereafter. We would wait for the first SU to be released before we update any of our production systems. It should be released in September from what I know.

RITR · ‎05-20-2021

@bwolf wrote:
Have you had any experience with CUCM 14 using ESXi 7.0 U1?

We're in production with v14 on ESXi 7.0 U2 (guests are set to 7.0 U1 compatibility). No issues so far (unique to v14).

I think Cisco needs to add some flexibility with their NTP implementation. We've had issues I think since v9 with NTP. To compound the issue, the Web complains about it but the CLI shows it's working fine. It was flagged by v14 upgrade prechecks and I just ignored it.