
CUCM Version 9.X generates ServerDown alerts every day

Hi all,

 

Every day I receive alerts that two nodes in the cluster are down:

 

ServerDown occurred.

Node 10.10.1.1 is unreachable.

 

ServerDown occurred.

Node 10.10.1.2 is unreachable.

 

We have a cluster of four CUCM nodes.

 

The first UCS server is located in Building A and hosts the following CUCM nodes:

10.10.0.1

10.10.0.2

 

The second UCS server is located in Building B and hosts the following CUCM nodes:

10.10.1.1

10.10.1.2

 

Building A is connected to Building B over dark fiber.

 

I checked the interfaces on the Layer 3 devices on both sides and they are clean: no errors and no dropped packets.

 

Please check the replication status output below. Are the ping times (msec) acceptable?

 

admin:utils dbreplication runtimestate

DB and Replication Services: ALL RUNNING

DB CLI Status: No other dbreplication CLI is running...

Cluster Replication State: Replication status command started at: 2014-03-20-09-56
     Replication status command COMPLETED 603 tables checked out of 603
     Errors or Mismatches Were Found!!!

     Use 'file view activelog cm/trace/dbl/sdi/ReplicationStatus.2014_03_20_09_56_34.out' to see the details

DB Version: ccm9_1_1_10000_11
Repltimeout set to: 300s
PROCESS option set to: 1

Cluster Detailed View from CM1 (4 Servers):

                                PING            CDR Server      REPL.   DBver&  REPL.   REPLICATION SETUP
SERVER-NAME     IP ADDRESS      (msec)  RPC?    (ID) & STATUS   QUEUE   TABLES  LOOP?   (RTMT) & details
-----------     ------------    ------  ----    --------------  -----   ------- -----   -----------------
CM1             10.10.0.1       0.032   Yes     (2)  Connected  0       match   Yes     (2) PUB Setup Completed
CM2             10.10.0.2       0.099   Yes     (3)  Connected  0       match   Yes     (2) Setup Completed
CM3             10.10.1.1       1.378   Yes     (4)  Connected  140     match   Yes     (2) Setup Completed
CM4             10.10.1.2       1.232   Yes     (5)  Connected  140     match   Yes     (2) Setup Completed
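
For anyone who wants to track these values over time, here is a minimal Python sketch that pulls the ping time and replication queue depth out of the per-node rows above. The column layout is assumed from this one 9.1.1 output and may differ on other releases. The helper names (`parse_rows`, `nodes_with_backlog`) are made up for illustration and are not part of any Cisco tool. A nonzero queue is not necessarily a fault by itself; the sketch only highlights which nodes report one.

```python
import re

# Rows copied from the runtimestate output above. The column layout is an
# assumption based on this single sample and may vary between CUCM releases.
SAMPLE = """\
CM1     10.10.0.1      0.032   Yes     (2)  Connected   0      match   Yes     (2) PUB Setup Completed
CM2     10.10.0.2      0.099   Yes     (3)  Connected   0      match   Yes     (2) Setup Completed
CM3     10.10.1.1      1.378   Yes     (4)  Connected   140    match   Yes     (2) Setup Completed
CM4     10.10.1.2      1.232   Yes     (5)  Connected   140    match   Yes     (2) Setup Completed
"""

# name, IP, ping (msec), RPC flag, CDR server ID, CDR status, replication queue
ROW = re.compile(
    r"^(?P<name>\S+)\s+(?P<ip>\d+\.\d+\.\d+\.\d+)\s+(?P<ping>[\d.]+)\s+"
    r"(?P<rpc>\S+)\s+\((?P<id>\d+)\)\s+(?P<status>\S+)\s+(?P<queue>\d+)\s+"
)


def parse_rows(text):
    """Return one dict per node row; lines that do not match are skipped."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m:
            rows.append({
                "name": m["name"],
                "ip": m["ip"],
                "ping_ms": float(m["ping"]),
                "status": m["status"],
                "queue": int(m["queue"]),
            })
    return rows


def nodes_with_backlog(rows):
    """Names of nodes whose replication queue is greater than zero."""
    return [r["name"] for r in rows if r["queue"] > 0]


rows = parse_rows(SAMPLE)
print(nodes_with_backlog(rows))  # ['CM3', 'CM4']
```

In this sample only the two Building B nodes (CM3 and CM4) show a queue, and they are the same two nodes named in the ServerDown alerts.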
 


Regards

cc

 

Please rate all useful posts. Regards, Chrysostomos. "The Most Successful People Are Those Who Are Good At Plan B"
3 Replies

Manish Gogna
Cisco Employee

Hi,

Your cluster could be affected by the following bug:

https://tools.cisco.com/bugsearch/bug/CSCtl75789

Symptom:
The RTMT receives a false alarm that the Server is down.

Conditions:
Any error in the network that closes the TCP session on the ports 1090 and 1099 will cause the false alarm.

Workaround:
none.
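
As a rough first check from a management host, the Python sketch below tests whether a new TCP connection can be opened to ports 1090 and 1099 on each node. It is an assumption-laden illustration, not a Cisco tool: the function names are made up, and it only shows that the ports accept a connection at that moment. It cannot detect an already established session being torn down, which is the condition the bug describes; a packet capture is needed for that.

```python
import socket


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def probe(nodes, ports=(1090, 1099), timeout=2.0):
    """Map (host, port) -> reachability for every node/port combination."""
    return {
        (host, port): tcp_port_open(host, port, timeout)
        for host in nodes
        for port in ports
    }
```

For example, `probe(["10.10.1.1", "10.10.1.2"])` would check both ports on the two Building B nodes. Running it on a schedule and comparing the timestamps with the ServerDown alerts could hint at whether the alerts line up with real connectivity loss.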

HTH

Manish

 

Hi Manish

 

I gave 5 points but didn't mark the question as answered, since we are not yet sure this is the issue.

 

The CUCM is running version 9.1.1.10000-11.

Is the bug not fixed in this version?

 

Are there any other steps I can check?

 

Please rate all useful posts. Regards, Chrysostomos. "The Most Successful People Are Those Who Are Good At Plan B"

Hi,

I have seen a few TAC cases where this bug is suspected as the cause of ServerDown alerts, even on CUCM 9.1.X. As the bug is still in the Open state, I would suggest opening a TAC case so that they can verify it by taking packet captures from the CallManager servers.

HTH

Manish
