The Split Brain Recovery condition in Unity Connection is an odd state to find your cluster in, for sure. One thing is certain, though: 'something' happened, and this is the result of whatever that 'something' was or is.
When this condition occurs, what you'll notice (either from logs or by watching the cluster node status) is that the Primary and HA servers take turns being the cluster's primary node every few minutes. This is because both cluster nodes have somehow ended up with slightly different database versions, and the Connection Server Role Manager (SRM) service cannot determine which server should be primary. The cluster will try to resolve this issue on its own, but oftentimes cannot.
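If you are collecting node status over time (from logs or periodic checks), the "taking turns" pattern is easy to spot programmatically. Below is a minimal Python sketch, not anything Cisco ships: it assumes you already have a list of (timestamp, name of the node that was primary) observations. The function name, the 30-minute window, and the threshold of three role changes are all my own illustrative choices.

```python
from datetime import datetime, timedelta


def looks_like_flapping(observations, window=timedelta(minutes=30), threshold=3):
    """Return True if the primary role changed hands at least `threshold`
    times within any span of length `window`.

    `observations` is an iterable of (timestamp, primary_node_name) pairs.
    The data source, window and threshold are hypothetical; tune them to
    whatever you actually collect from your cluster.
    """
    # Record the timestamp of every observation where the primary differs
    # from the previous observation.
    change_times = []
    previous = None
    for timestamp, primary in sorted(observations, key=lambda o: o[0]):
        if previous is not None and primary != previous:
            change_times.append(timestamp)
        previous = primary

    # Look for `threshold` consecutive changes that fit inside the window.
    for i in range(len(change_times) - threshold + 1):
        if change_times[i + threshold - 1] - change_times[i] <= window:
            return True
    return False


if __name__ == "__main__":
    start = datetime(2024, 1, 1, 9, 0)
    # Made-up sample: the primary role alternates every five minutes.
    sample = [
        (start + timedelta(minutes=5 * i), "node-a" if i % 2 == 0 else "node-b")
        for i in range(6)
    ]
    print(looks_like_flapping(sample))
```

A single role change (a normal failover) does not trip the check; only repeated changes in a short span do, which is the behavior described above.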
The good news (maybe not so good) is that in several cases, the cluster will continue to function and answer calls during a split brain. Where I have seen the split brain condition cause a service outage for users is with SCCP integrations that do not have sufficient SCCP ports on both cluster nodes.
A number of things can cause this; often it is the result of the two nodes losing network communication with each other and/or a service failure. The resolution is fairly simple, although you'd be best served to figure out why it happened in the first place, lest it happen again.
Since Unity Connection runs on the same operating system as Unified Communications Manager, you can run the same health checks on Unity Connection that you would run on Unified Communications Manager. You'll want to resolve any issues discovered in those health checks before resolving the split brain issue.
Assuming you have a healthy Unity Connection cluster and/or have discovered and resolved the issue that caused the split brain, you'll want to move on to resolution.
Power off / shut down the true HA node (the node that is not supposed to be the true primary).
In the Cisco Unity Connection Administration section, click on Cluster under System Settings in the left-side vertical navigation menu and verify that both nodes are entered correctly (IP address/hostname/FQDN).
In the Cisco Unity Connection Serviceability section, click on Service Management under the Tools menu and verify that the Connection Message Transfer Agent and the Connection Notifier services are started.
Restart the Unity Connection cluster's primary server. From the CLI, issue the command "utils system restart" and answer "Y" at the prompt that follows.
Once the rebooted primary server is back up, wait till you can access the GUI web page of the server before proceeding (in other words, wait till the Cisco Tomcat Service is operational).
From the Unified Communications Manager (or other type of call control server), place a call into voicemail (all you really need is something that will signal the IVR to pick up), then hang up and wait 5 minutes before proceeding to the next step.
Power on the HA node (which should still be shut down) and wait till you can access the GUI web page of the server before proceeding (in other words, wait till the Cisco Tomcat Service is operational).
Verify that the split brain status no longer appears in the node status on the Cluster Management page under Cisco Unity Connection Serviceability (or issue "show cuc cluster status" from the CLI of the cluster's primary server).
Run health checks on the cluster one last time, to verify everything is healthy and operational.
The cluster should now be back to normal status, with the split brain condition no longer listed on the cluster nodes.
In some instances (usually due to how long the cluster was in this state), even more action is necessary, and you may need to reset the cluster replication (done via the CLI of the primary server with "utils dbreplication reset *") if the cluster's database replication is damaged. The need to do this will be discovered in your final set of health checks.
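The two "wait till you can access the GUI web page" steps in the procedure above are easy to rush, so here is a minimal Python sketch of how the wait could be scripted. It is only an illustration under assumptions: the helper names, the 30-minute timeout, and the 15-second polling interval are mine, and the URL you pass in (including any admin path) is whatever you normally browse to on your own server. Disabling certificate verification is meant for lab boxes with self-signed certificates only.

```python
import ssl
import time
import urllib.error
import urllib.request


def web_gui_responds(url, timeout=5.0, verify_tls=True):
    """Return True if a request to `url` ends in a 2xx/3xx response."""
    context = ssl.create_default_context()
    if not verify_tls:
        # Assumption: lab use only, for servers with self-signed certificates.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=context) as resp:
            return 200 <= resp.status < 400
    except urllib.error.HTTPError:
        # The web server answered, but with an error status; treat as not ready.
        return False
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, TLS failure, name resolution failure, etc.
        return False


def wait_until(probe, timeout_s=1800.0, interval_s=15.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call `probe()` repeatedly until it returns True or `timeout_s` elapses.

    Returns True on success and False on timeout. `sleep` and `clock` are
    injectable so the loop can be tested without real waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

Hypothetical usage, with a placeholder hostname: `wait_until(lambda: web_gui_responds("https://cuc-primary.example.com/", verify_tls=False))` after restarting the primary, and the same call against the HA node after powering it back on. A web page that loads only tells you Tomcat is up, so still confirm the node status afterwards as described above.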