05-22-2014 11:43 PM - edited 03-16-2019 10:52 PM
Hi all,
Every day I receive alerts that two nodes in the cluster are down:
ServerDown occurred.
Node 10.10.1.1 is unreachable.
ServerDown occurred.
Node 10.10.1.2 is unreachable.
We have a cluster of four CUCM nodes.
The first UCS server is located in Building A and hosts the following CUCM nodes:
10.10.0.1
10.10.0.2
The second UCS server is located in Building B and hosts the following CUCM nodes:
10.10.1.1
10.10.1.2
Building A is connected to Building B over dark fiber.
I checked the interfaces on the Layer 3 devices on both sides and they are clean: no errors and no dropped packets.
Below is the output of the replication status. Are the ping times (msec) acceptable?
admin:utils dbreplication runtimestate
DB and Replication Services: ALL RUNNING
DB CLI Status: No other dbreplication CLI is running...
Cluster Replication State: Replication status command started at: 2014-03-20-09-56
Replication status command COMPLETED 603 tables checked out of 603
Errors or Mismatches Were Found!!!
Use 'file view activelog cm/trace/dbl/sdi/ReplicationStatus.2014_03_20_09_56_34.out' to see the details
DB Version: ccm9_1_1_10000_11
Repltimeout set to: 300s
PROCESS option set to: 1
Cluster Detailed View from CM1 (4 Servers):
                           PING          CDR Server      REPL.  DBver&   REPL.  REPLICATION SETUP
SERVER-NAME  IP ADDRESS    (msec)  RPC?  (ID) & STATUS   QUEUE  TABLES   LOOP?  (RTMT) & details
-----------  ------------  ------  ----  --------------  -----  -------  -----  -----------------
CM1          10.10.0.1     0.032   Yes   (2) Connected   0      match    Yes    (2) PUB Setup Completed
CM2          10.10.0.2     0.099   Yes   (3) Connected   0      match    Yes    (2) Setup Completed
CM3          10.10.1.1     1.378   Yes   (4) Connected   140    match    Yes    (2) Setup Completed
CM4          10.10.1.2     1.232   Yes   (5) Connected   140    match    Yes    (2) Setup Completed
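In the output above, the two nodes reported as unreachable (CM3 and CM4) are also the only ones showing a non-zero REPL. QUEUE (140), while all ping times are under 2 msec. As a minimal sketch, assuming the data rows always follow the whitespace-separated layout seen here, the rows could be parsed and the nodes with a replication backlog listed like this. The names `ROW`, `parse_rows`, and `nodes_with_backlog` are hypothetical helpers, not part of any Cisco tooling, and the pattern assumes the status column is a single word such as "Connected".

```python
import re

# Data rows copied from the runtimestate output above.
SAMPLE = """\
CM1 10.10.0.1 0.032 Yes (2) Connected 0 match Yes (2) PUB Setup Completed
CM2 10.10.0.2 0.099 Yes (3) Connected 0 match Yes (2) Setup Completed
CM3 10.10.1.1 1.378 Yes (4) Connected 140 match Yes (2) Setup Completed
CM4 10.10.1.2 1.232 Yes (5) Connected 140 match Yes (2) Setup Completed
"""

# Assumed row layout (hypothetical pattern, based only on the rows above):
# name, IP, ping msec, RPC?, (node id), status, queue, tables, loop?, (state), details
ROW = re.compile(
    r"^(?P<name>\S+)\s+(?P<ip>\d+\.\d+\.\d+\.\d+)\s+(?P<ping>[\d.]+)\s+"
    r"(?P<rpc>\S+)\s+\((?P<node_id>\d+)\)\s+(?P<status>\S+)\s+"
    r"(?P<queue>\d+)\s+(?P<tables>\S+)\s+(?P<loop>\S+)\s+"
    r"\((?P<state>\d+)\)\s+(?P<details>.+)$"
)


def parse_rows(text):
    """Return one dict per data row; lines that do not match are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            row = match.groupdict()
            row["ping"] = float(row["ping"])
            row["queue"] = int(row["queue"])
            rows.append(row)
    return rows


def nodes_with_backlog(rows):
    """Names of the nodes whose REPL. QUEUE value is greater than zero."""
    return [row["name"] for row in rows if row["queue"] > 0]


if __name__ == "__main__":
    parsed = parse_rows(SAMPLE)
    print("Nodes with a replication queue:", nodes_with_backlog(parsed))
    print("Highest ping (msec):", max(row["ping"] for row in parsed))
```

Running the command several times and comparing the queue values would show whether the backlog is draining or growing; a single snapshot does not say which.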
Regards
cc
05-23-2014 03:56 AM
Hi,
Your cluster could be affected by the following bug:
https://tools.cisco.com/bugsearch/bug/CSCtl75789
Symptom:
The RTMT receives a false alarm that the Server is down.
Conditions:
Any error in the network that closes the TCP session on the ports 1090 and 1099 will cause the false alarm.
Workaround:
none.
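Since the bug description points at TCP sessions on ports 1090 and 1099, one rough check is to probe those ports on the two nodes that raise the alert. The sketch below is only an illustration: `tcp_port_open` and `probe` are hypothetical helper names, and it assumes the script runs on a host that follows the same network path between the buildings. A successful connection shows only that the port is reachable at that moment; it does not confirm or rule out the bug, which concerns established sessions being closed.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def probe(hosts, ports=(1090, 1099), timeout=3.0):
    """Map each (host, port) pair to the result of tcp_port_open."""
    return {
        (host, port): tcp_port_open(host, port, timeout)
        for host in hosts
        for port in ports
    }


if __name__ == "__main__":
    # The two nodes from the ServerDown alerts in the original post.
    for (host, port), is_open in probe(["10.10.1.1", "10.10.1.2"]).items():
        print(f"{host}:{port} -> {'open' if is_open else 'closed or unreachable'}")
```

Scheduling the probe at short intervals and logging the timestamps would make it possible to compare any failures against the times of the ServerDown alerts.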
HTH
Manish
05-23-2014 04:29 AM
Hi Manish,
I have given 5 points but have not marked the question as answered, since we are not sure yet whether this is the issue.
The CUCM cluster is running version 9.1.1.10000-11.
Is the bug not fixed in this version?
Are there any other steps we can take to check?
05-23-2014 04:35 AM
Hi,
I have seen a few TAC cases where this bug was suspected as the cause of the server down alerts, even on CUCM 9.1.x. As the bug is still in the Open state, I would suggest opening a TAC case so that they can verify this by taking packet captures from the CallManager servers.
HTH
Manish