BLOG (No Title)

Cisco Employee

Let me start by telling you what the Community Tech-Talk series is all about. This series is designed to discuss and share insights on specific topics, chosen from the most common conversation themes in the community for our technology area.

I picked the topic "Understanding CUCM dbreplication runtimestate" to address some of the concerns you raise most often in the community discussions.

[Image: dbreplication tech_talk.png]

The dbreplication process can be broken down into 5 steps:

  1. Define Pub - Set it up to start replicating
  2. Define template on pub and realize it (Tells pub which tables to replicate)
  3. Define each Sub
  4. Realize Template on Each Sub (Tells sub which tables they will get/send data for)
  5. Sync data using "cdr check" or "cdr sync" (older systems use sync)

The replication scripts carry out the above process automatically when the system is installed or when you run the "utils dbreplication reset all" command.
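To make the ordering of the five steps concrete, here is a minimal Python sketch that lists them for a publisher and a set of subscribers. It is only an illustration: the function name and host names are hypothetical, it does not run any CUCM or cdr command, and it assumes each step finishes for every subscriber before the next step begins, as the numbered order suggests.

```python
def replication_setup_plan(publisher, subscribers):
    """Return the setup steps in order as (step_number, description) tuples.

    Purely illustrative: this only builds a list, it does not contact a server.
    """
    plan = [
        (1, f"define {publisher} as the publisher"),
        (2, f"define and realize the template on {publisher}"),
    ]
    # Step 3: every subscriber is defined in turn.
    for sub in subscribers:
        plan.append((3, f"define {sub}"))
    # Step 4: the template is realized on each subscriber.
    for sub in subscribers:
        plan.append((4, f"realize the template on {sub}"))
    # Step 5: a single data sync closes the run.
    plan.append((5, "sync data (cdr check, or cdr sync on older systems)"))
    return plan


if __name__ == "__main__":
    for number, description in replication_setup_plan("pub", ["sub1", "sub2"]):
        print(number, description)
```

With two subscribers, steps 3 and 4 each appear twice, once per subscriber, and the data sync in step 5 appears once at the end.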

Here are some of the key points to keep in mind about the various dbreplication commands that I discussed in the Tech-Talk Video:

  • utils dbreplication setrepltimeout -- The default value is 300 seconds. You can validate this by running "show tech repltimeout". This is the timer used to put multiple servers into one run of the data sync. In other words, it is the "batching" timer. It determines when the broadcast realize template and data sync will fire (n seconds after the first server has finished being defined). In Clustering over WAN (CoW) deployments, long delays can make the data sync process take considerably longer. Try to sync the local servers first.

  • utils dbreplication repair -- In CUCM 5.x, this command reset replication, whereas in CUCM 6.x and later versions it repairs the data. It runs a repair process on all tables in the replication for all servers that are included in the command. Run this command when RTMT = 2, not when RTMT = 0 or 3.

  • utils dbreplication repairtable / repairreplicate -- This command essentially does the same thing as the repair command, but runs on only one table / replicate, which makes the process much faster. It fixes the out-of-sync data for that table / replicate. You can verify the result by running "utils dbreplication status" to see if any mismatches or errors are found. It is particularly useful on large CUCM clusters. Run this command when RTMT = 2, not when RTMT = 0 or 3.

  • utils dbreplication stop -- You should only run this if you want to stop replication setup. The only way to recover from a stop is with a reset. This command removes the set-up indicator file, i.e. the dbmonpreflightcheck file, and kills the currently running replication commands. It pauses for the duration of the repltimeout timer, so if you run replication commands soon after running a stop, it could kill those commands again. Run this command when RTMT = 0, not when RTMT = 3.

  • utils dbreplication reset -- This command causes replication to be torn down and then set up again. You should run this command when RTMT = 4 or when you have issued stop. Successful completion of this process results in RTMT = 2.

  • utils dbreplication clusterreset -- Avoid running this command. It is for debugging replication set-up problems. It bypasses the RTMT settings, cluster requirements and normal CUCM set-up. It causes services to go out of sync with the database because it syncs data without change notification. The services need to be restarted when this command is run, no exceptions!

  • utils dbreplication dropadmindb -- Run this command when there is a looping attempt to define a server in replication. Usually the server being defined is not the one at fault; the corruption is either on the pub, as a result of an earlier attempt, or on the sub that attempted set-up just before the current one.

  • utils dbreplication forcedatasyncsub -- This command takes a backup of the publisher, restores it to the subscriber(s), and sets up replication again. It requires a services restart on the subscriber(s) so that they pick up the new values.
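
The RTMT guidance in the bullets above can be collected into a small lookup table. The Python sketch below is only a transcription of those bullets: the names GUIDANCE, commands_for_state, and is_discouraged are hypothetical, the code never talks to a CUCM server, and it is no substitute for reading the actual replication state before acting.

```python
# RTMT replication states in which each command is recommended or discouraged,
# copied from the bullets above. The post also recommends "reset" after a
# "stop" has been issued, which is not a numeric state and is left out here.
GUIDANCE = {
    "utils dbreplication repair": {"run_when": {2}, "avoid_when": {0, 3}},
    "utils dbreplication repairtable": {"run_when": {2}, "avoid_when": {0, 3}},
    "utils dbreplication repairreplicate": {"run_when": {2}, "avoid_when": {0, 3}},
    "utils dbreplication stop": {"run_when": {0}, "avoid_when": {3}},
    "utils dbreplication reset": {"run_when": {4}, "avoid_when": set()},
}


def commands_for_state(state):
    """List the commands the post recommends for a given RTMT state."""
    return [cmd for cmd, rule in GUIDANCE.items() if state in rule["run_when"]]


def is_discouraged(command, state):
    """True if the post explicitly says not to run the command in this state."""
    return state in GUIDANCE[command]["avoid_when"]


if __name__ == "__main__":
    for state in range(5):
        print(state, commands_for_state(state))
```

A state with no recommended command in this table (for example RTMT = 3) simply returns an empty list; the bullets above give no positive recommendation for it.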

New Commands and Database Improvements in CUCM 9.x:

  • Re-engineered CLI forcedatasyncsuball (Lightning fast) -- This command can now restore a larger cluster in a shorter period of time!

  • New CLI rebuild is a stop, drop and reset all in one (and faster) -- Because the architecture of rebuild is multi-threaded, the total operation time is much shorter than executing three different CLI commands (stop / drop / reset). Rebuild is a master command that will stop, delete and trigger the replication setup signal across the cluster automatically and in parallel:
      1. Stop DB Replication – stop the current replication setup process, if one exists
      2. Remove server from database – Remove replication from the network by “cdr delete”, by dropping the syscdr database, or by renaming the syscdr database remotely
      3. Trigger Dbmon on the subscriber to submit a replication setup request to the publisher.

  • New CLI utils replication status table/replicate -- The "utils dbreplication status" command takes a long time to run. Previously, even if only one table was suspect, you had to wait for all the tables to be checked. Being able to check a single table speeds up checking of replication.

  • Better Log Collection -- "utils create report database" collects all the database logs in one go. Also, the ercollect.sh script is embedded into the server for IBM root cause cases. Because the script is already on the server, there is no need to transfer it and change permissions. It is accessible via root access only.

  • Faster and more accurate Runtimestate CLI -- This command is now multithreaded, making it much faster. The output will also be logged for historical RCA. If there are any unreachable servers in the cluster, this command will no longer hang. Some additional information will be included in it such as repltimeout and IDS server number.
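
To illustrate why a multi-threaded rebuild can finish sooner than running stop, drop and reset one after another, here is a toy Python sketch. It is not Cisco's implementation: rebuild_node and rebuild_cluster are hypothetical names, the step strings merely echo the three rebuild steps listed above, and no real server is contacted. Each node goes through its steps in order, while the nodes themselves are processed in parallel.

```python
from concurrent.futures import ThreadPoolExecutor

# The three rebuild steps described in the post, in the order they run on a node.
REBUILD_STEPS = (
    "stop replication setup",
    "remove server from replication",
    "trigger dbmon setup request to publisher",
)


def rebuild_node(node):
    """Pretend to rebuild one node; returns a log of the steps taken in order."""
    # A real system would do work here; this placeholder only records each step.
    return [f"{node}: {step}" for step in REBUILD_STEPS]


def rebuild_cluster(nodes):
    """Run rebuild_node for all nodes in parallel and return {node: log}."""
    with ThreadPoolExecutor(max_workers=max(1, len(nodes))) as pool:
        # Executor.map yields results in the same order as the input nodes.
        logs = list(pool.map(rebuild_node, nodes))
    return dict(zip(nodes, logs))


if __name__ == "__main__":
    for node, log in rebuild_cluster(["sub1", "sub2", "sub3"]).items():
        print(node, log)
```

The design point is that the per-node order of steps is preserved while the wait time across nodes overlaps, which is the benefit the post attributes to rebuild.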

For detailed dbreplication troubleshooting procedures, there is already a comprehensive document available in the community: Troubleshooting CUCM Database Replication in Linux Appliance Model

Additional Reference:

Watch the Tech-Talk and check out the presentation slides

I hope this has been an informative session and proves useful for dbreplication issues. Please do share your feedback and opinions in the comments section below.

Thank you for watching!

5 Comments
Beginner

When we issue utils dbreplication reset all, "services go out of sync with the database because it syncs data without change notification". Are there any logs available to check, or to prove to the end client, that services are not in sync with the database?

Cisco Employee

Hi Chandra,

First of all, apologies for the huge delay in response. I did not get an update that you had posted a query on this page. Please let me know if you are still looking for an answer and we can discuss further.

 

Regards,

Harmit.

Beginner

Hi Harmit

 

It is a great post. I am running into a dbreplication issue: I reset the Security Password and added the subscriber. After the subscriber installed successfully, dbreplication failed and none of the replication commands are working. You were asked the same question in the video and mentioned that you would share more details in the blog (ref. video 29:28). Can you please share the script to get it resolved?

 

Call Manager Version: System version: 8.5.1.10000-26

Error: Source has failed due to source on 10.168.1.10 timing out

Regards 

Cisco Employee

Hi faiqmahdi,

Thank you! I am glad you found it useful. Based on what you've explained, I believe you may have to open a TAC case. This may require some log analysis or root access. It could be related to the following defect: https://tools.cisco.com/bugsearch/bug/CSCtn79868/?reffering_site=dumpcr

Hope this helps.

Regards,

Harmit Singh.

Beginner

Hi Harmit

Thanks for your quick response.  It really helped.

Regards
