09-28-2015 01:46 PM - edited 03-17-2019 04:24 AM
We are running a cluster of two CUC servers on version 8.6. After a recent reboot of both servers we had replication issues, which I was able to resolve by following some steps from a different forum post about dbreplication resets.
We are still experiencing an issue, however. When logging into the CUC publisher via the web interface we get the following error message: "Communication is not functioning correctly between the servers in the Cisco Unity Connection cluster." When logging into the subscriber we receive the same message along with "The Cisco Unity Connection cluster subscriber server has changed to Primary status (failover has occurred)."
Both the Publisher and Subscriber are currently running in a Primary state, and when I go to use the Cluster Management tool all the buttons are greyed out, so I cannot change anything.
Here is the output from the Publisher:
admin:show cuc cluster status
Server Name  Member ID  Server State  Internal State           Reason
-----------  ---------  ------------  -----------------------  -------
001          0          Primary       Pri Active Disconnected  Normal
002          1          Disconnected  Unknown                  Unknown
Database replication is not active
Output from the Subscriber:
admin:show cuc cluster status
ACE_File_Lock::ACE_File_Lock: Permission denied /dev/shm/CCM_GENstatusLock_0
Server Name  Member ID  Server State  Internal State                Reason
-----------  ---------  ------------  ----------------------------  -------
001          0          Disconnected  Unknown                       Unknown
002          1          Primary       Sec Act Primary Disconnected  Normal
SERVER ID STATE STATUS QUEUE CONNECTION CHANGED
-----------------------------------------------------------------------
g_ciscounity_pub 100 Active Dropped 3913882 Sep 28 16:09:11
g_ciscounity_sub1 101 Active Local 0
SERVERS
Server Peer ID State Status Queue Connection Changed
---------------------------------------------------------------------------
g_ciscounity_sub1 g_ciscounity_pub 100 Active Dropped 3913882 Sep 28 16:09:11
g_ciscounity_sub1 101 Active Local 0
STATE
Source ER Capture Network Apply
State State State State
---------------------------------------------------------------------------
g_ciscounity_pub Shut Down Uninitialized Down Uninitialized
g_ciscounity_sub1 Active Running Running Uninitialized
The runtimestate output looks good to me:
DB and Replication Services: ALL RUNNING
Cluster Replication State: Replication status command started at: 2015-09-28-07-06
Replication status command COMPLETED 541 tables checked out of 541
No Errors or Mismatches found.
Use 'file view activelog cm/trace/dbl/sdi/ReplicationStatus.2015_09_28_07_06_55.out' to see the details
DB Version: ccm8_6_1_20007_2
Number of replicated tables: 541
Cluster Detailed View from PUB (2 Servers):
                             PING          REPLICATION  REPL.  DBver&  REPL.  REPLICATION SETUP
SERVER-NAME  IP ADDRESS      (msec)  RPC?  STATUS       QUEUE  TABLES  LOOP?  (RTMT) & details
-----------  --------------  ------  ----  -----------  -----  ------  -----  -----------------
001          xxx.xxx.xxx.22  0.043   Yes   Connected    0      match   Yes    (2) PUB Setup Completed
002          xxx.xxx.xxx.50  0.293   Yes   Connected    140    match   Yes    (2) Setup Completed
So from what I can see, replication is working and the servers appear to be talking to each other. Why am I receiving these error messages, and how can I fix this issue?
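For anyone who wants to compare the two `show cuc cluster status` outputs above by script rather than by eye, here is a minimal Python sketch. It assumes the row layout seen in this post (server name, numeric member ID, a single-word server state, a possibly multi-word internal state, and a one-word reason); a multi-word server state would need extra handling. The names `parse_status_rows` and `primaries` are made up for this illustration and are not part of any Cisco tool. The sketch only detects the situation described here, where each node reports itself as Primary; it does not repair anything.

```python
# Status rows copied from the two outputs in this post.
PUB_OUTPUT = """\
Server Name Member ID Server State Internal State Reason
------------ --------- ------------ ----------------------- -------
001 0 Primary Pri Active Disconnected Normal
002 1 Disconnected Unknown Unknown
Database replication is not active
"""

SUB_OUTPUT = """\
ACE_File_Lock::ACE_File_Lock: Permission denied /dev/shm/CCM_GENstatusLock_0
Server Name Member ID Server State Internal State Reason
------------ --------- ------------ ---------------------------- -------
001 0 Disconnected Unknown Unknown
002 1 Primary Sec Act Primary Disconnected Normal
"""


def parse_status_rows(output):
    """Return one dict per data row of a cluster status listing.

    Assumption: the server state is a single word, as in the outputs above.
    """
    rows = []
    for line in output.splitlines():
        tokens = line.split()
        # Data rows have at least five fields and a numeric member ID
        # in the second position; headers and messages are skipped.
        if len(tokens) < 5 or not tokens[1].isdigit():
            continue
        rows.append({
            "server": tokens[0],
            "member_id": int(tokens[1]),
            "state": tokens[2],
            "internal": " ".join(tokens[3:-1]),
            "reason": tokens[-1],
        })
    return rows


def primaries(rows):
    """Names of the servers that a given node lists as Primary."""
    return {row["server"] for row in rows if row["state"] == "Primary"}


if __name__ == "__main__":
    claimed = primaries(parse_status_rows(PUB_OUTPUT)) | primaries(
        parse_status_rows(SUB_OUTPUT)
    )
    if len(claimed) > 1:
        print("More than one node reports Primary:", sorted(claimed))
    else:
        print("Single Primary:", sorted(claimed))
```

Run against the two outputs in this post, the publisher's view lists 001 as Primary and the subscriber's view lists 002 as Primary, which matches the symptom that the two nodes disagree about cluster state even though database replication itself reports no errors.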
09-28-2015 01:54 PM
Hi
In my experience, the Unity Connection database is extremely sensitive and tolerates power shutdowns and network issues far less well than its CCM cousins.
Hence, in my humble opinion, it would be far less painful to rebuild the Subscriber.
I have had my share of restless and sleepless nights (with TAC help) trying to rebuild broken database replication or recover from database corruption.
HTH
09-28-2015 04:35 PM
Tend to agree with that Wilson. +5
09-29-2015 04:45 AM
Thanks for the reply.
In my situation a rebuild would be the last option. From what I can see, dbreplication is working fine: I can make changes on the PUB and they show up on the SUB, and I can delete things from the SUB and they are removed from the PUB. So I don't really know where to go from here, or how to get rid of these error messages about the cluster servers not communicating properly.
10-11-2015 05:52 PM
Did you find a solution to your problem?
I'm having the same issue after doing a backup restore on the PUB.
10-13-2015 05:05 AM
No, I still have not been able to solve this issue. I am at the point of taking the advice in this post and just rebuilding.
08-02-2018 12:19 AM
Hi,
We have the same scenario on CUC version 10.5.
NOTE: I checked, and DB replication on the PUB is completing successfully, but the alert below is still seen on the SUB:
10:41:24.768 |17539,,,SRM,3,<CM> Command: /opt/cisco/connection/bin/db-replication-control status icb0003 execution completed abnormally. Error number: 255
10:41:25.441 |17539,,,SRM,3,<CM> Command: /opt/cisco/connection/bin/db-replication-control status icb0006 execution completed abnormally. Error number: 255
Any updates or suggestions would be highly appreciated.
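If it helps anyone collecting these alerts from a saved trace file, below is a rough Python sketch that pulls out each command that "completed abnormally" along with its error number. The regular expression is written only against the two trace lines quoted above, so treat the line format as an assumption; `find_failures` is a hypothetical helper name, and the sketch says nothing about what error 255 means.

```python
import re
from collections import Counter

# Pattern based solely on the two SRM trace lines quoted in this thread.
SRM_FAILURE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d{3})\s+\|.*?Command:\s+(?P<command>.+?)"
    r"\s+execution completed abnormally\.\s+Error number:\s+(?P<code>\d+)"
)

SAMPLE_TRACE = """\
10:41:24.768 |17539,,,SRM,3,<CM> Command: /opt/cisco/connection/bin/db-replication-control status icb0003 execution completed abnormally. Error number: 255
10:41:25.441 |17539,,,SRM,3,<CM> Command: /opt/cisco/connection/bin/db-replication-control status icb0006 execution completed abnormally. Error number: 255
"""


def find_failures(trace_text):
    """Return (time, command, error_number) for each abnormal completion."""
    failures = []
    for line in trace_text.splitlines():
        match = SRM_FAILURE.match(line.strip())
        if match:
            failures.append(
                (match["time"], match["command"], int(match["code"]))
            )
    return failures


if __name__ == "__main__":
    failures = find_failures(SAMPLE_TRACE)
    # Count how often each (command, error number) pair occurs.
    counts = Counter((command, code) for _, command, code in failures)
    for (command, code), count in counts.items():
        print(f"{count}x error {code}: {command}")
```

Grouping by command makes it easier to see whether the failures are tied to one target of `db-replication-control status` or to every target it is run against.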