There's a client with Cisco Unity Connection 188.8.131.5200-206. After doing a 'utils ntp restart', the following message showed up:
Communication is not functioning correctly between the servers in the Cisco Unity Connection cluster. To review server status for the cluster, go to the Tools > Cluster Management page of Cisco Unity Connection Serviceability.
The client states that there was no service for 5 minutes and wants to know whether performing this task should be disruptive, and whether there is any official Cisco document stating this.
See SrvConnUnity_1.jpg sent by the client after performing the ntp restart.
Right now the service is normal (see SrvConnUnity_2.jpg attached). The client also sent the 'utils ntp status' output:
admin:utils ntp status
ntpd (pid 10899) is running...

     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*127.127.1.0     LOCAL(0)        10 l   16   64  377    0.000    0.000   0.002

synchronised to local net at stratum 11
   time correct to within 12 ms
   polling server every 64 s

Current time in UTC is : Fri Apr 26 16:01:23 UTC 2013
Current time in America/Argentina/Buenos_Aires is : Fri Apr 26 13:01:23 ART 2013
admin:
Could anybody help me with this? What steps should I take? Many thanks in advance.
*127.127.1.0 LOCAL(0) 10 l 16 64 377 0.000 0.000 0.002
First, you are using the IP address 127.127.1.0, which is the reference used for the local system clock. The asterisk means it is the currently selected source, as no other source is available. This is not good practice and is not recommended.
Secondly, the stratum is considered unreliable: at 11 it is too high to be accepted by Unity Connection.
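As a rough illustration of how that peer line reads, here is a minimal Python sketch that splits it into its first columns and flags the local clock. The names `NtpPeer`, `parse_peer_line` and `is_local_clock` are made up for this example and are not part of any Cisco tool. It also assumes the leading tally character is one of `*`, `+`, `-` or `#`, which is a simplification of what ntpd can print.

```python
from dataclasses import dataclass

# Simplifying assumption: only these leading tally characters are handled.
TALLY_CHARS = "*+-#"


@dataclass
class NtpPeer:
    tally: str    # "*" marks the currently selected source
    remote: str   # peer address or name
    refid: str    # where that peer gets its time from
    stratum: int  # the 'st' column


def parse_peer_line(line: str) -> NtpPeer:
    """Parse one peer row of 'utils ntp status' (hypothetical helper)."""
    line = line.strip()
    tally = line[0] if line and line[0] in TALLY_CHARS else ""
    fields = line[len(tally):].split()
    return NtpPeer(tally=tally, remote=fields[0], refid=fields[1],
                   stratum=int(fields[2]))


def is_local_clock(peer: NtpPeer) -> bool:
    """127.127.1.x is the pseudo-address ntpd uses for the local system clock."""
    return peer.remote.startswith("127.127.1.")


peer = parse_peer_line("*127.127.1.0 LOCAL(0) 10 l 16 64 377 0.000 0.000 0.002")
print(peer.remote, peer.stratum, is_local_clock(peer), peer.tally == "*")
```

For the line from your output, this reports the local clock as the selected source at stratum 10, which is the situation described above.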
If you had run 'utils diagnose test', you would probably have seen output like the following example:
skip - disk_files : This module must be run directly and off hours
test - service_manager : Passed
test - tomcat : Passed
test - tomcat_deadlocks : Passed
test - tomcat_keystore : Passed
test - tomcat_connectors : Passed
test - tomcat_threads : Passed
test - tomcat_memory : Passed
test - tomcat_sessions : Passed
test - validate_network : Reverse DNS lookup missmatch
test - raid : Passed
test - system_info : Passed (Collected system information in diagnostic log)
test - ntp_reachability : Passed
test - ntp_clock_drift : Passed
test - ntp_stratum : Failed
The reference NTP server is a stratum 11 clock.
NTP servers with stratum 5 or worse clocks are deemed unreliable.
Please consider using an NTP server with better stratum level.
Please use OS Admin GUI to add/delete NTP servers.
skip - sdl_fragmentation : This module must be run directly and off hours
skip - sdi_fragmentation : This module must be run directly and off hours
test - ipv6_networking : Passed
And on the RTMT (Real Time Monitoring Tool) you would have seen a Critical event:
The best external NTP server, , is stratum , which is unacceptably high. External NTP servers must be <= strata 8 and should be <= strata 5. NTP server strata can be verified using the CLI 'utils ntp status' command ('st' column). Try using different NTP servers.
All specified external NTP server(s) have unacceptably high stratum values. Network issues exist or the designated servers have unreliable stratum values.
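The thresholds in that alert can be written down as a small check. This is only a sketch in Python: the function name `classify_stratum` and the three labels are invented for the example, and the limits (must be <= 8, should be <= 5) are taken from the RTMT alert text as quoted here, not from any Cisco API. Other messages word the boundary slightly differently.

```python
def classify_stratum(stratum: int) -> str:
    """Classify a stratum value using the limits in the RTMT alert text.

    Hypothetical helper: "recommended" means <= 5, "tolerated" means <= 8,
    anything higher is "unacceptable".
    """
    if stratum < 1:
        raise ValueError("stratum must be 1 or greater")
    if stratum <= 5:
        return "recommended"
    if stratum <= 8:
        return "tolerated"
    return "unacceptable"


# Values to compare: a typical external server, a borderline one,
# and the stratum 11 reported in this thread.
for st in (2, 7, 11):
    print(st, classify_stratum(st))
```

By this reading, the stratum 11 reported on this server falls in the unacceptable range.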
The messages are self-explanatory and reinforce the need for an NTP source other than the server itself.
From the snippet you sent we can tell that this is the Publisher server, since the Subscriber gets its time from the Publisher.
Installing the Operating System and Cisco Unity Connection 8.x
"Cisco recommends that you use an external NTP server to ensure accurate system time on the publisher server. Ensure the external NTP server is stratum 9 or higher (meaning stratums 1-9). The subscriber server will get its time from the publisher server"
The documentation also reaffirms that the NTP server must be accessible; otherwise your system can be degraded. Some additional information that would be useful to know:
- Why did they have to restart NTP in the first place?
System Requirements for Cisco Unity Connection Release 8.x
"A network time protocol (NTP) server must be accessible to the Connection server"
On the Cisco Unity Connection Serviceability > Tools > Cluster Management screenshot you sent, I see that the ports were "Not Available", and the customer stated that "there was no service for 5 minutes".
By no service did they mean that over the phone they heard a disconnected tone or a failsafe message?
Additionally, after the servers recovered from SBR, the Subscriber never recovered entirely, as it did not start the Conversation Manager service.
Bottom line: if they are able to reproduce it, it would be worthwhile checking with TAC.