05-20-2011 04:46 PM - edited 03-16-2019 05:04 AM
I have a CUCM 8.0.3.22900-5 publisher that has been live for a few months. I recently added a subscriber but am unable to activate CallManager services on it because replication is failing.
When I run the DB replication report on the publisher I get the following errors. On the subscriber I get the opposite.
Server | Publisher DB Reachable | Local DB Reachable |
---|---|---|
10.244.44.10 | true | true |
10.244.44.11 | Source has failed due to source on 10.244.44.11 timing out | Source has failed due to source on 10.244.44.11 timing out |
I have connectivity between both and that same report also shows this:
RTMT Counter Information:
- All servers have a replication count of 519.
- All servers have a good replication status.
Lastly, I have tried using utils dbreplication:
Ran utils dbreplication stop on the Sub and then the Pub
Ran utils dbreplication dropadmindb on the Sub and then the Pub
Ran utils dbreplication clusterreset
Ran utils dbreplication reset all on the Pub
Rebooted the Subscriber (not the Publisher).
The only other item is that I don't have DNS set on the publisher or subscriber.
Help please!
05-20-2011 08:23 PM
Hi,
First of all, I would suggest not running the replication commands on the cluster; it is better to involve Cisco TAC for DB replication issues.
But here are a few things you can check:
1) Go to the CUCM CLI, run "utils service list", and make sure all key services are running.
Examples: A Cisco DB / A Cisco DB Replicator / Cluster Manager / Tomcat / TFTP, etc.
2) Check the services on all the nodes in the cluster.
3) If all services are good, generate the "Unified Database Report" from Unified Reporting.
4) Check the syscdr status for all the nodes in the report. It must show entries for every node in the cluster.
5) Make sure that the pub's syscdr file has entries for both the pub and the sub, and that the sub's file has entries for both the sub and the pub.
6) If you are missing those files, or you see that those files are empty, let me know and we can devise an action plan from there.
05-20-2011 08:53 PM
Thanks so much for the response.
For 1, utils service list shows all services started.
In those reports from the subscriber, I keep getting: Source has failed due to source on 10.244.44.10 (publisher) timing out
If I run it from the publisher, I get: Source has failed due to source on 10.244.44.11 (subscriber) timing out
I can ping between both servers. Seems like a networking issue?
05-20-2011 09:31 PM
There are two commands to check for connectivity issues:
1) Go to the CUCM CLI.
2) Run "utils network connectivity" on both nodes.
3) Also run "utils diagnose test" to check for any DNS or NTP issues.
It seems to me that the Tomcat service on the pub is not working properly.
- Restart the Tomcat service and the AMC service on the pub, then check the report to see whether you still get the same error.
PS .. Please rate useful posts ..!!
05-20-2011 09:56 PM
utils network connectivity passed on pub and sub
utils diagnose test passed on pub and sub
restarted tomcat and amc on pub and sub
still getting "Source has failed due to source on 10.244.44.11 timing out"
05-21-2011 08:44 PM
Go to the CUCM CLI of the Pub:
Type "utils dbreplication runtimestate".
Paste the output here.
Also, please attach the .xml report from the Unified Database Report so that I can check the errors.
Thanks
Mudit
05-21-2011 08:59 PM
admin:utils dbreplication runtimestate
DB and Replication Services: ALL RUNNING
Cluster Replication State: Replication repair command started at: 2011-05-20-01-13
Replication repair command COMPLETED 519 tables processed out of 519
No Errors or Mismatches found.
Use 'file view activelog cm/trace/dbl/sdi/ReplicationRepair.2011_05_20_01_13_22.out' to see the details
DB Version: ccm8_0_3_22900_5
Number of replicated tables: 519
Cluster Detailed View from PUB (2 Servers):
SERVER-NAME | IP ADDRESS | PING (msec) | RPC? | REPLICATION STATUS | REPL. QUEUE | DBver& TABLES | REPL. LOOP? | REPLICATION SETUP (RTMT) & details |
---|---|---|---|---|---|---|---|---|
Akins-CUCM-Publish | 10.244.44.10 | 0.042 | Yes | Connected | 0 | match | N/A | (2) PUB Setup Completed |
AkinsCUCM02 | 10.244.44.11 | 0.235 | Yes | Connected | 402 | match | N/A | (2) Not Setup |
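For anyone who wants to scan this kind of output automatically, here is a minimal Python sketch that parses the raw "Cluster Detailed View" rows as the CLI prints them (whitespace-separated) and lists nodes that may deserve a closer look. The regular expression is based only on the two rows pasted above, so other releases or rows with a failed ping may not match. The flagging rule (non-empty replication queue, or "Not Setup" in the details column) is an assumption for illustration, not Cisco guidance, and the names `parse_runtimestate` and `nodes_to_review` are hypothetical.

```python
import re
from typing import List, NamedTuple

# Pattern for one node row of "utils dbreplication runtimestate".
# Assumption: derived from the two rows pasted in this thread only.
ROW_RE = re.compile(
    r"^(?P<name>\S+)\s+"
    r"(?P<ip>\d{1,3}(?:\.\d{1,3}){3})\s+"
    r"(?P<ping_ms>\d+(?:\.\d+)?)\s+"
    r"(?P<rpc>\S+)\s+"
    r"(?P<status>\S+)\s+"
    r"(?P<queue>\d+)\s+"
    r"(?P<tables>\S+)\s+"
    r"(?P<loop>\S+)\s+"
    r"\((?P<rtmt>\d+)\)\s+"
    r"(?P<details>.*)$"
)


class NodeState(NamedTuple):
    name: str
    ip: str
    queue: int      # REPL. QUEUE column
    rtmt: int       # number shown in parentheses in the last column
    details: str    # text after the RTMT number


def parse_runtimestate(text: str) -> List[NodeState]:
    """Return one NodeState per line that looks like a node row."""
    nodes = []
    for line in text.splitlines():
        m = ROW_RE.match(line.strip())
        if m:
            nodes.append(
                NodeState(
                    m["name"],
                    m["ip"],
                    int(m["queue"]),
                    int(m["rtmt"]),
                    m["details"].strip(),
                )
            )
    return nodes


def nodes_to_review(nodes: List[NodeState]) -> List[NodeState]:
    """Illustrative rule only: pending queue entries or a 'Not Setup' note."""
    return [n for n in nodes if n.queue > 0 or "Not Setup" in n.details]


SAMPLE = """\
Akins-CUCM-Publish 10.244.44.10 0.042 Yes Connected 0 match N/A (2) PUB Setup Completed
AkinsCUCM02 10.244.44.11 0.235 Yes Connected 402 match N/A (2) Not Setup
"""

if __name__ == "__main__":
    for node in nodes_to_review(parse_runtimestate(SAMPLE)):
        print(node.name, node.ip, "queue:", node.queue, "details:", node.details)
```

With the output pasted above, only the subscriber row (AkinsCUCM02) is listed, because it shows 402 queued entries and "Not Setup".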
05-21-2011 11:15 PM
I have checked the report.
Replication is good and is happening, but it looks like some of the ports in your cluster have been blocked. Can you check whether a firewall is doing it?
Also run "utils firewall list" and check whether you can see any ports blocked.
Replication is GOOD in your cluster, so don't worry about it.
Thanks
Mudit
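Since ping succeeding does not prove that the TCP ports used between nodes are open, a port probe can complement the firewall check suggested above. Below is a minimal Python sketch, assuming it is run from a workstation or server on the same network path (the CUCM CLI itself does not run Python, so this tests the path from wherever the script runs, not from the publisher). The function name `probe_tcp_ports` is hypothetical, and the port numbers in the usage comment are assumptions to verify against Cisco's port usage guide for your release.

```python
import socket
from typing import Dict, Iterable


def probe_tcp_ports(host: str, ports: Iterable[int], timeout: float = 3.0) -> Dict[int, bool]:
    """Try a TCP connect to each port; True means the connection was accepted."""
    results = {}
    for port in ports:
        try:
            # create_connection raises OSError on refusal, timeout, or no route.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            results[port] = False
    return results


# Example usage (port list is an assumption, check Cisco's port usage guide):
#   print(probe_tcp_ports("10.244.44.11", [1500, 1501, 1515]))
```

A `False` result only means the connect attempt failed from the machine running the script; it cannot tell a firewall drop from a service that is simply not listening.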
05-21-2011 11:53 PM
utils firewall list doesn't show anything being blocked.
When I try to activate the CallManager service on the subscriber, I get: Cisco CallManager Service cannot be Activated or Deactivated due to Database Update Failure.
05-22-2011 07:13 PM
If replication is working, why can't I activate the CallManager service? I want the IP phones to register to the subscriber.
05-22-2011 09:00 PM
I recommend opening a ticket with Cisco TAC and having an engineer look into your network.
05-22-2011 09:22 PM
Also, it looks more like a connectivity issue, but we need to dig further to find the cause of the failure.
In the meantime, you can go to the Subscriber's CLI and start the CallManager service from there:
utils service start Cisco CallManager
Then you can point your phones to the Subscriber.
05-22-2011 11:03 PM
It does seem to be a networking issue. The publisher is a physical MCS server. The subscriber is on a UCS VMware server.
I installed another subscriber on the UCS VMware server, and the two VMs can communicate properly in the database status report, but the physical MCS doesn't. If I run the report from the physical box, it communicates but the other 2 VMs don't. All 3 can ping each other.
05-22-2011 11:24 PM
I suggest you open a TAC case.
05-24-2011 08:42 PM
This problem is now resolved, thanks to TAC.
For anyone interested, the problem ended up being caused by our resetting the security password. We did this PRIOR to adding the subscribers, but it still caused dbreplication issues. Cisco TAC had to get root access and run a script to correct the password inconsistencies.