11-18-2010 05:51 AM - edited 03-16-2019 01:59 AM
Hi all,
I am encountering a problem with replication between these two servers.
Not all servers have a replicate count of 348.
All servers have a good replication status.

| Server | Number of Replicates Created | Replicate_State |
|---|---|---|
| 10.182.9.227 | 0 | 2 - good |
| 10.182.9.228 | 348 | 2 - good |
Publisher:
admin:show perf query class "Number of Replicates Created and State of Replication"
==>query class :
- Perf class (Number of Replicates Created and State of Replication) has instances and values:
ReplicateCount -> Number of Replicates Created = 0
ReplicateCount -> Replicate_State = 2
Subscriber:
admin:show perf query class "Number of Replicates Created and State of Replication"
==>query class :
- Perf class (Number of Replicates Created and State of Replication) has instances and values:
ReplicateCount -> Number of Replicates Created = 348
ReplicateCount -> Replicate_State = 2
Can anyone help me out? I am having lots of problems with this cluster and unable to add phones, backup, upgrade. I am suspecting this replication is the main cause of everything.
I tried the 'utils dbreplication repair all' command, but nothing changed.
I just ran 'utils dbreplication stop' and am waiting for it to finish (it takes between 5 and 60 minutes). Then I'll run 'utils dbreplication reset' to see if anything changes.
Thanks.
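For anyone who wants to compare the two nodes without reading the raw output by eye, here is a minimal sketch that parses the `show perf query class` text pasted above. The function names `parse_replication_perf` and `counts_match` are made up for this illustration, and the parsing assumes the output format is exactly as shown in this post.

```python
import re

# Matches lines such as:
#   ReplicateCount -> Number of Replicates Created = 348
COUNTER_RE = re.compile(r"ReplicateCount\s*->\s*(.+?)\s*=\s*(\d+)")


def parse_replication_perf(output: str) -> dict:
    """Return {counter name: integer value} from the CLI text (hypothetical helper)."""
    return {name: int(value) for name, value in COUNTER_RE.findall(output)}


def counts_match(*nodes: dict) -> bool:
    """True when every node reports the same 'Number of Replicates Created'."""
    counts = {node["Number of Replicates Created"] for node in nodes}
    return len(counts) == 1


# Sample text copied from the publisher and subscriber output in this post.
PUBLISHER = """
==>query class :
- Perf class (Number of Replicates Created and State of Replication) has instances and values:
ReplicateCount -> Number of Replicates Created = 0
ReplicateCount -> Replicate_State = 2
"""

SUBSCRIBER = """
==>query class :
- Perf class (Number of Replicates Created and State of Replication) has instances and values:
ReplicateCount -> Number of Replicates Created = 348
ReplicateCount -> Replicate_State = 2
"""

pub = parse_replication_perf(PUBLISHER)
sub = parse_replication_perf(SUBSCRIBER)
print(pub, sub, counts_match(pub, sub))
```

With the values from this post, both nodes report state 2, yet the replicate counts differ (0 versus 348), which is the mismatch being asked about.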
11-18-2010 06:06 AM
Attached is the report generated from Unified CM Database Status.
11-18-2010 11:09 AM
Nothing worked.
I also tried 'utils dbreplication dropadmindb', then stop, then repair, but nothing changed either.
11-18-2010 12:38 PM
You may need to do the dropadmindb followed by a cluster reset.
When you scroll down farther on the dbreplication report, do you have any other errors in the section below the one you pasted in on your first post? If those have issues, they need to be fixed first.
11-19-2010 01:52 AM
Yes there are other problems. I have the report attached in the previous post in XML format. I am not on site now to provide a screenshot.
If you open the XML file and search for 'Dropped', you'll see the following output of 'cdr list serv'
cdr list serv
10.182.9.227 (the publisher)
SERVER ID STATE STATUS QUEUE CONNECTION CHANGED
-----------------------------------------------------------------------
10.182.9.228 (the subscriber)
SERVER ID STATE STATUS QUEUE CONNECTION CHANGED
-----------------------------------------------------------------------
g_cmngpucm1_ccm 2 Active Dropped 0 Nov 18 14:59:49
g_cmngpucm2_ccm 3 Active Local 0
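To pick out the dropped connection programmatically, a rough sketch like the one below could scan the `cdr list serv` rows. It assumes the whitespace-separated column layout shown in the output above (SERVER, ID, STATE, STATUS, QUEUE, then an optional CONNECTION CHANGED timestamp); `parse_cdr_rows` is a hypothetical helper, not part of any Cisco or Informix tooling.

```python
def parse_cdr_rows(output: str) -> list:
    """Parse data rows of 'cdr list serv' text into dicts (hypothetical helper)."""
    rows = []
    for line in output.splitlines():
        fields = line.split()
        # Data rows have at least five fields with numeric ID and QUEUE columns;
        # headers, dashed rules, and the node captions do not.
        if len(fields) >= 5 and fields[1].isdigit() and fields[4].isdigit():
            rows.append({
                "server": fields[0],
                "id": int(fields[1]),
                "state": fields[2],
                "status": fields[3],
                "queue": int(fields[4]),
                "changed": " ".join(fields[5:]),  # empty for the local server
            })
    return rows


# Sample text copied from the subscriber section of the report in this post.
SAMPLE = """
10.182.9.228 (the subscriber)
SERVER ID STATE STATUS QUEUE CONNECTION CHANGED
-----------------------------------------------------------------------
g_cmngpucm1_ccm 2 Active Dropped 0 Nov 18 14:59:49
g_cmngpucm2_ccm 3 Active Local 0
"""

rows = parse_cdr_rows(SAMPLE)
dropped = [row["server"] for row in rows if row["status"] == "Dropped"]
print(dropped)
```

On the sample above, the only server flagged is `g_cmngpucm1_ccm`, the entry the subscriber shows as Dropped.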
11-19-2010 04:04 AM
Hello,
Please try the following in the exact sequence mentioned below:
>> utils dbreplication stop on the subscriber
>> once it is stopped on the subscriber, utils dbreplication stop on the publisher
>> wait for a few minutes for it to finish
>> utils dbreplication dropadmindb on the publisher
>> wait for it to finish
>> utils dbreplication dropadmindb on the subscriber
>> wait for a few minutes for it to finish
>> utils dbreplication reset all on the publisher
Give it some time and then run the following:
show perf query class "Number of Replicates Created and State of Replication"
on both the servers and post the results.
Also post screenshots from Unified Reporting on the Call Manager. Once we have the screenshots as well as the outputs, we can guide you further.
HTH
Kunal
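The order of the steps above matters, so it may help to write the sequence down as data before touching the cluster. The sketch below only prints a checklist and does not connect to or run anything on the servers; the `Step` type and `render` function are made up for this illustration, and the first step (stopping replication on the subscriber) is implied by the post rather than spelled out in it.

```python
from typing import List, NamedTuple


class Step(NamedTuple):
    node: str      # where the command is entered
    command: str   # CLI command as quoted in this thread
    wait: str      # what to wait for before the next step


# Sequence as described in the post above; the first step is implied there.
RUNBOOK: List[Step] = [
    Step("subscriber", "utils dbreplication stop", "wait until it has stopped"),
    Step("publisher", "utils dbreplication stop", "wait a few minutes for it to finish"),
    Step("publisher", "utils dbreplication dropadmindb", "wait for it to finish"),
    Step("subscriber", "utils dbreplication dropadmindb", "wait a few minutes for it to finish"),
    Step("publisher", "utils dbreplication reset all", "give it some time, then check state"),
]


def render(runbook: List[Step]) -> List[str]:
    """Format the runbook as numbered checklist lines."""
    return [
        f"{number}. [{step.node}] {step.command} ({step.wait})"
        for number, step in enumerate(runbook, start=1)
    ]


for line in render(RUNBOOK):
    print(line)
```

After the last step, the `show perf query class` command from this thread is run on both servers to compare the results.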
11-19-2010 12:12 PM
Hi,
Thanks for the hints guys.
I ran the 'utils dbreplication clusterreset' after stopping the dbreplication on both servers.
I also ran dropadmindb and then reset all, as mentioned above, and got the results below.
Right now the Replicate Count is 348 on both servers, but the Replicate State is 4 on the publisher and 3 on the subscriber.
I ran the reset all command 1 hour ago, and (supposedly) it is still working in the background.
Attached are screenshots and the output of 'utils dbreplication status'. Notice that there are 0 processed rows in every part of the output!
Now the problem has moved to Replicate State = 4 on the publisher.
What do you advise the next step to be?
Regards.
11-19-2010 12:20 PM
Don't do anything just yet.
Wait until they both go to 4. If that happens, do a dbreplication stop on both servers, run dropadmindb again on the subscriber, and then do the cluster reset again.
What you are seeing happens often, and then everything snaps to a 2. Wait until they are all 4s or all 2s before you do anything.
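The rule of thumb above (act only when every node shows the same state) can be written as a small decision function. This is just a sketch of the advice in this post, not an official procedure; `next_action` and its return strings are invented for the example, and the only state meaning taken from the thread is that 2 is reported as good.

```python
from typing import Iterable


def next_action(states: Iterable[int]) -> str:
    """Map the Replicate_State of every node to the advice given in this post."""
    unique = set(states)
    if not unique:
        raise ValueError("need the state of at least one node")
    if unique == {2}:
        # The report in this thread labels state 2 as good.
        return "done: replication is good on every node"
    if unique == {4}:
        return ("retry: stop dbreplication on both servers, dropadmindb on the "
                "subscriber, then cluster reset")
    # Mixed states (for example 4 on the publisher, 3 on the subscriber).
    return "wait: states are still changing, do nothing yet"


print(next_action([4, 3]))
```

For the situation described earlier in the thread (4 on the publisher, 3 on the subscriber), this returns the "wait" branch.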
11-19-2010 02:10 PM
Hi Clifford,
Actually they both went to 4 (with 348 replicates created on both; they were 0 and 348 before). Then I restarted the publisher, after which the state stayed at 4 on the publisher and went to 3 on the subscriber. So I tried dropadmindb again and then reset all, and nothing changed.
A question: why would I need to do a dropadmindb and a clusterreset again? Wasn't it supposed to work the first time?
Thanks a lot for your help.
11-19-2010 05:32 PM
It should have, but sometimes things just don't behave correctly.
Usually this happens because the CDR is hosed. Dropping it and resetting it usually straightens it out. If it doesn't work after the second time, call TAC.
11-20-2010 01:35 AM
Hello,
We would need to look at Unified Reporting to check whether the etc files and the sql files have the correct entries; it does look like that could be an issue.
There are a lot of things we can look at, to be honest.
But we would need access or a WebEx session to fix it. Try rebooting the pub and the sub once they are off production to see if that fixes the issue.
If that does not fix it, I would strongly suggest you open a TAC case if this is affecting production, so that we can get access and fix the issue.
Regards
Kunal
11-22-2010 06:53 AM
Thanks a lot, guys. After rebooting the pub, waiting until it was up and running, and then rebooting the sub, the status is now 2 on both servers.