CUCM Clusters stuck in syncing mode...

Forcefield
Level 1

Hello

Has anyone come across a similar issue and if so what was the fix?

Currently I have a cluster (version 10.5.2) that shows the runtime state on all subscribers as syncing...

When I look at the CU reports, I see all the subs as initializing. However, normal functionality is fine and backups run without issues. When I run a status command on dbreplication, I can see everything connected.

I have gone through troubleshooting dbreplication (repair etc.) and have also restarted all servers. I am unsure where to go next in troubleshooting, other than upgrading the cluster. It seems very strange, as it has no impact on daily use.

The only anomaly I see is that NTP is at stratum 5, which produces a message saying that it is not recommended.

Thank you all

FF 

4 Replies

Jitender Bhandari
Cisco Employee

Hi FF,

You would have to fix NTP first; try keeping the stratum below 3. Can you attach the output of the commands below?

utils diagnose test

utils ntp status

JB

Hi JB

We have had a network outage since we last spoke and, interestingly, I am now seeing a change for my local cluster:

                                 PING      DB/RPC/   REPL.    Replication    REPLICATION SETUP
SERVER-NAME    IP ADDRESS        (msec)    DbMon?    QUEUE    Group ID       (RTMT) & Details
-----------    ----------        ------    -------   -----    -----------    ------------------
PUB01          10.PPP.1P3.PPP    0.015     Y/Y/Y     0        (g_2)          (2) Setup Completed
1SUB02         10.PPP.1P3.PPP    0.182     Y/Y/Y     0        (g_3)          (2) Setup Completed
1TFTP03        10.PPP.1P3.PPP    0.140     Y/Y/Y     0        (g_4)          (2) Setup Completed
2SUB01         10.MMM.1M3.MMM    43.499    Y/Y/Y     --       (-)            (0) Syncing...
2SUB02         10.MMM.1M3.MMM    45.028    Y/Y/Y     0        (g_6)          (0) Syncing...
2TFTP03        10.MMM.1M3.MMM    43.610    Y/Y/Y     0        (g_7)          (0) Syncing...

-------

Only the last three members now seem to hang on syncing. These three are remote from where I am.
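
For anyone who wants to check this kind of output without reading it by eye, here is a minimal Python sketch that picks out the nodes whose setup state is not (2). It assumes the row layout shown in the table above; the sample rows are copied from this post (addresses masked as posted), and the names `ROW`, `parse_runtimestate` and `not_in_sync` are made up for illustration, not part of any Cisco tool.

```python
import re

# Rows copied from the runtimestate output in this thread (IPs are masked).
RUNTIMESTATE = """\
PUB01       10.PPP.1P3.PPP    0.015     Y/Y/Y     0        (g_2)          (2) Setup Completed
1SUB02       10.PPP.1P3.PPP    0.182     Y/Y/Y     0        (g_3)          (2) Setup Completed
1TFTP03       10.PPP.1P3.PPP    0.140     Y/Y/Y     0        (g_4)          (2) Setup Completed
2SUB01       10.MMM.1M3.MMM    43.499    Y/Y/Y     --       (-)            (0) Syncing...
2SUB02       10.MMM.1M3.MMM    45.028    Y/Y/Y     0        (g_6)          (0) Syncing...
2TFTP03       10.MMM.1M3.MMM    43.610    Y/Y/Y     0        (g_7)          (0) Syncing...
"""

# Assumed column layout: name, IP, ping, DB/RPC/DbMon flags, queue,
# "(group)", "(state) details".
ROW = re.compile(
    r"^(?P<name>\S+)\s+(?P<ip>\S+)\s+(?P<ping>[\d.]+)\s+(?P<flags>\S+)\s+"
    r"(?P<queue>\S+)\s+\((?P<group>[^)]*)\)\s+\((?P<state>\d+)\)\s+(?P<detail>.+)$"
)


def parse_runtimestate(text):
    """Return one dict per node row; lines that do not match are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            rows.append(match.groupdict())
    return rows


def not_in_sync(rows):
    """Names of nodes whose setup state is anything other than 2."""
    return [row["name"] for row in rows if row["state"] != "2"]


rows = parse_runtimestate(RUNTIMESTATE)
print(not_in_sync(rows))  # ['2SUB01', '2SUB02', '2TFTP03']
```

Run against the rows above, it lists only the three remote nodes, which matches what the table shows.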

When I run a status on the replication I can see one member missing:

SERVER                 ID STATE    STATUS     QUEUE  CONNECTION CHANGED
-----------------------------------------------------------------------
g_2_ccm10_5_2_13900_12   2 Active   Local           0
g_3_ccm10_5_2_13900_12   3 Active   Connected       0 Jun 28 15:31:21
g_4_ccm10_5_2_13900_12   4 Active   Connected       0 Jun 28 15:31:24
g_6_ccm10_5_2_13900_12   6 Active   Connected       0 Jul 12 07:37:45
g_7_ccm10_5_2_13900_12   7 Active   Connected       0 Jul 13 20:34:26
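
The missing member can also be found by cross-checking the two outputs. The sketch below is only an illustration: it assumes the group names in the status list start with the same `g_N` prefix shown in the Group ID column of the runtimestate table, and the mapping `RUNTIME_GROUPS` is typed in by hand from this post. The function names are hypothetical.

```python
import re

# Status list copied from this thread.
STATUS = """\
g_2_ccm10_5_2_13900_12   2 Active   Local           0
g_3_ccm10_5_2_13900_12   3 Active   Connected       0 Jun 28 15:31:21
g_4_ccm10_5_2_13900_12   4 Active   Connected       0 Jun 28 15:31:24
g_6_ccm10_5_2_13900_12   6 Active   Connected       0 Jul 12 07:37:45
g_7_ccm10_5_2_13900_12   7 Active   Connected       0 Jul 13 20:34:26
"""

# Node name -> Group ID, transcribed by hand from the runtimestate table.
RUNTIME_GROUPS = {
    "PUB01": "g_2",
    "1SUB02": "g_3",
    "1TFTP03": "g_4",
    "2SUB01": "-",
    "2SUB02": "g_6",
    "2TFTP03": "g_7",
}


def status_groups(text):
    """Collect the g_N prefixes of every server listed in the status output."""
    groups = set()
    for line in text.splitlines():
        match = re.match(r"^(g_\d+)_\S+\s+\d+\s+\S+", line.strip())
        if match:
            groups.add(match.group(1))
    return groups


def missing_nodes(runtime_groups, groups):
    """Nodes from runtimestate that have no matching entry in the status list."""
    return sorted(name for name, group in runtime_groups.items() if group not in groups)


print(missing_nodes(RUNTIME_GROUPS, status_groups(STATUS)))  # ['2SUB01']
```

With the data from this post it points at 2SUB01, the node that shows "(-)" as its group ID.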

--------

Diagnostics test:

admin:utils diagnose test

Log file: platform/log/diag3.log

Starting diagnostic test(s)
===========================
test - disk_space         : Passed (available: 7099 MB, used: 12529 MB)
skip - disk_files         : This module must be run directly and off hours
test - service_manager    : Passed
test - tomcat             : Passed
test - tomcat_deadlocks   : Passed
test - tomcat_keystore    : Passed
test - tomcat_connectors  : Passed
test - tomcat_threads     : Passed
test - tomcat_memory      : Passed
test - tomcat_sessions    : Passed
skip - tomcat_heapdump    : This module must be run directly and off hours
test - validate_network   : Passed
test - raid               : Passed
test - system_info        : Passed (Collected system information in diagnostic log)
test - ntp_reachability   : Warning
The host 10.2M2.MMM.6 is not reachable, or it's NTP service is down.
The host 10.2P0.1P1.101 is not reachable, or it's NTP service is down.

Some of the configured external NTP servers are not reachable.
It is recommended that for better time synchronization all of
the NTP servers be reachable.

Please use the OS Admin GUI to add/remove NTP servers.

test - ntp_clock_drift    : Passed
test - ntp_stratum        : Failed
The reference NTP server is a stratum 5 clock.
NTP servers with stratum 5 or worse clocks are deemed unreliable.
Please consider using an NTP server with better stratum level.

Please use OS Admin GUI to add/delete NTP servers.

skip - sdl_fragmentation  : This module must be run directly and off hours
skip - sdi_fragmentation  : This module must be run directly and off hours

Diagnostics Completed

--------

NTP Details:

ntpd (pid 27758) is running...

     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
+10.2X2.XXX.1   10.2A0.AAA.8     6 u  281 1024  377   42.281   -2.090   1.852
*10.1X0.XXX.4   130.88.200.6     4 u  283 1024  377  220.381    0.128   3.925
+10.2X0.XX.1     10.1A0.AAA.4     5 u 1015 1024  377    1.290    0.466   0.853
 10.2M2.MMM.6    .XFAC.          16 u    - 1024    0    0.000    0.000   0.000
 10.2P2.1P1.101  .XFAC.          16 u    - 1024    0    0.000    0.000   0.000
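
As a rough aid for reading the peer table above, here is a small Python sketch that finds the selected peer and the peers that never answered. It assumes the standard ntpq-style columns shown here, where a leading `*` marks the peer the daemon is currently synced to and `reach` is an octal register that stays at 0 when no poll has succeeded. The function name `parse_peers` is made up for illustration.

```python
# Peer rows copied from this thread (addresses masked as posted).
PEERS = """\
+10.2X2.XXX.1   10.2A0.AAA.8     6 u  281 1024  377   42.281   -2.090   1.852
*10.1X0.XXX.4   130.88.200.6     4 u  283 1024  377  220.381    0.128   3.925
+10.2X0.XX.1     10.1A0.AAA.4     5 u 1015 1024  377    1.290    0.466   0.853
 10.2M2.MMM.6    .XFAC.          16 u    - 1024    0    0.000    0.000   0.000
 10.2P2.1P1.101  .XFAC.          16 u    - 1024    0    0.000    0.000   0.000
"""


def parse_peers(text):
    """Parse ntpq-style peer rows; header and separator lines are skipped."""
    peers = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # The first character is the tally code ('*', '+', ' ', ...).
        tally, fields = line[0], line[1:].split()
        if len(fields) != 10 or not fields[2].isdigit():
            continue
        peers.append({
            "tally": tally,
            "remote": fields[0],
            "stratum": int(fields[2]),
            "reach": int(fields[6], 8),  # reach is printed in octal
        })
    return peers


peers = parse_peers(PEERS)
selected = next(p for p in peers if p["tally"] == "*")
unreachable = [p["remote"] for p in peers if p["reach"] == 0]

print(selected["remote"], selected["stratum"])  # 10.1X0.XXX.4 4
print(unreachable)  # ['10.2M2.MMM.6', '10.2P2.1P1.101']
```

The two peers with reach 0 are the ones shown at stratum 16, which ntpd reports for a source it has not synchronized with.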

Thanks again

FF

Hi,

You can clearly see an issue with your NTP:

test - ntp_reachability   : Warning
The host 10.2M2.MMM.6 is not reachable, or it's NTP service is down.
The host 10.2P0.1P1.101 is not reachable, or it's NTP service is down.

Some of the configured external NTP servers are not reachable.
It is recommended that for better time synchronization all of
the NTP servers be reachable.

Please use the OS Admin GUI to add/remove NTP servers.

test - ntp_clock_drift    : Passed
test - ntp_stratum        : Failed

Cisco recommends that the stratum stay below 4. Fix NTP first, and then issue "utils dbreplication reset all" on the publisher.

(Rate if it helps)

JB

I have fixed the issue at last.

It looks like stratum level 5 does not cause an impact, as the servers are still at the same stratum level.

What I have done since is remove the legacy NTP servers mentioned above, and I have also rebooted the servers in the cluster. Whether it was one of these or a combination, I'm not sure.

Thanks for the advice.