01-11-2016 11:46 AM - edited 03-18-2019 11:45 AM
One of the servers in our CUCM 10.5.1.11900-13 cluster died, so we had to re-add it. Everything seems to have gone successfully, except that DRS wants to back up the TFTP service, CEF, and the Directory Number Alias Lookup service even though none of these services is activated. Because of this, the backup fails. How can I get DRS to stop trying to back up services that aren't running?
Solved! Go to Solution.
01-11-2016 01:23 PM
I would first start by restarting the DRS services on the nodes and then test.
If you still get the errors in the backup job, I would restart the Tomcat service on the nodes and then test again.
If you STILL get errors, I would raise a TAC case (you may need to do a reboot, but I would have TAC look at it first).
Thanks,
Ryan
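For anyone following along, the DRS and Tomcat restarts Ryan describes can be done from the platform CLI on each node. The service names below are the usual ones on CUCM 10.x, but it's worth confirming the exact names on your build with utils service list first:

```
admin:utils service list

admin:utils service restart Cisco DRF Master
admin:utils service restart Cisco DRF Local
admin:utils service restart Cisco Tomcat
```

Cisco DRF Master only runs on the publisher; Cisco DRF Local runs on every node, so restart it cluster-wide.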
01-11-2016 12:20 PM
Can you log into the cluster publisher's CLI and provide the output of:
utils dbreplication runtimestate
file view activelog platform/log/diag3.log
After the previous node died, did you also remove its processNode reference in the cluster (System -> Server in the Cisco Unified CM Administration web interface) before reinstalling the node back into the cluster?
Thanks,
Ryan
(: ... Please rate helpful posts ... :)
01-11-2016 12:49 PM
I cannot post actual names or IPs of the devices, so I had to change the output a bit. Suffice it to say the host names and the IPs are all correct in the real output. The server marked CUCM08 is the one we just re-added. CUCM00 is the publisher.
admin:utils dbreplication runtimestate
Server Time: Mon Jan 11 20:29:18 UTC 2016
Cluster Replication State: BROADCAST SYNC Completed on 14 servers at: 2015-12-24-04-27
Last Sync Result: SYNC COMPLETED on 680 tables out of 680
Sync Status: NO ERRORS
Use CLI to see detail: 'file view activelog cm/trace/dbl/20151224_014550_dbl_repl_output_Broadcast.log'
DB Version: ccm10_5_1_11900_13
Repltimeout set to: 1800s
PROCESS option set to: 1
Cluster Detailed View from XXXXXXCUCM00 (15 Servers):
PING DB/RPC/ REPL. Replication REPLICATION SETUP
SERVER-NAME IP ADDRESS (msec) DbMon? QUEUE Group ID (RTMT) & Details
----------- ---------- ------ ------- ----- ----------- ------------------
XXXXXXCUCM12 XX.XX.XX.XX 33.087 Y/Y/Y 0 (g_15) (2) Setup Completed
XXXXXXCUCM04 XX.XX.XX.XX 0.152 Y/Y/Y 0 (g_6) (2) Setup Completed
XXXXXXCUCM06 XX.XX.XX.XX 0.240 Y/Y/Y 0 (g_8) (2) Setup Completed
XXXXXXCUCM05 XX.XX.XX.XX 0.249 Y/Y/Y 0 (g_7) (2) Setup Completed
XXXXXXCUCM03 XX.XX.XX.XX 0.313 Y/Y/Y 0 (g_5) (2) Setup Completed
XXXXXXCUCM09 XX.XX.XX.XX 32.581 Y/Y/Y 0 (g_12) (2) Setup Completed
XXXXXXCUCM01 XX.XX.XX.XX 0.195 Y/Y/Y 0 (g_3) (2) Setup Completed
XXXXXXCUCM02 XX.XX.XX.XX 0.203 Y/Y/Y 0 (g_4) (2) Setup Completed
XXXXXXCUCM10 XX.XX.XX.XX 32.521 Y/Y/Y 0 (g_13) (2) Setup Completed
XXXXXXTFTP01 XX.XX.XX.XX 0.151 Y/Y/Y 0 (g_11) (2) Setup Completed
XXXXXXTFTP02 XX.XX.XX.XX 32.534 Y/Y/Y 0 (g_16) (2) Setup Completed
XXXXXXCUCM00 XX.XX.XX.XX 0.021 Y/Y/Y 0 (g_2) (2) Setup Completed
XXXXXXCUCM07 XX.XX.XX.XX 0.279 Y/Y/Y 0 (g_9) (2) Setup Completed
XXXXXXCUCM11 XX.XX.XX.43 32.588 Y/Y/Y 0 (g_14) (2) Setup Completed
XXXXXXCUCM08 XX.XX.XX.38 0.286 Y/Y/Y 0 (g_26) (2) Setup Completed
admin:file view activelog platform/log/diag3.log
01-11-2016 20:33:18 Diagnostics Version: 1.0.0
01-11-2016 20:33:18 getting hardware model [/usr/local/bin/base_scripts/sd_hwdetect HWModel]
01-11-2016 20:33:18 Hardware Model: VMware
01-11-2016 20:33:18 getting verson number [rpm -q --nodigest --nosignature master | sed -e "s/master-//"]
01-11-2016 20:33:18 Version: 10.5.1
01-11-2016 20:33:18 disk_space: Is valid module: True
01-11-2016 20:33:18 disk_files: Is valid module: True
01-11-2016 20:33:18 service_manager: Is valid module: True
01-11-2016 20:33:18 tomcat: Is valid module: True
01-11-2016 20:33:18 tomcat_deadlocks: Is valid module: True
01-11-2016 20:33:18 tomcat_keystore: Is valid module: True
01-11-2016 20:33:18 tomcat_connectors: Is valid module: True
01-11-2016 20:33:18 tomcat_threads: Is valid module: True
01-11-2016 20:33:18 tomcat_memory: Is valid module: True
01-11-2016 20:33:18 tomcat_sessions: Is valid module: True
01-11-2016 20:33:18 tomcat_heapdump: Is valid module: True
01-11-2016 20:33:18 validate_network: Product specific XML file: /usr/local/platform/conf/cli/cliProduct.xml
01-11-2016 20:33:18 validate_network: val: true
01-11-2016 20:33:18 validate_network: Is valid module: True
01-11-2016 20:33:18 validate_network_adv: Is valid module: False
01-11-2016 20:33:18 raid: getting cpu speed [/usr/local/bin/base_scripts/sd_hwdetect CPUSpeed]
01-11-2016 20:33:18 raid: CPU Speed: 2400
01-11-2016 20:33:18 raid: model = VMware
01-11-2016 20:33:18 raid: Is valid module: True
01-11-2016 20:33:18 system_info: Is valid module: True
01-11-2016 20:33:18 ntp_reachability: Is valid module: True
01-11-2016 20:33:18 ntp_clock_drift: Is valid module: True
01-11-2016 20:33:18 ntp_stratum: Is valid module: True
01-11-2016 20:33:18 sdl_fragmentation: Is valid module: True
01-11-2016 20:33:18 sdi_fragmentation: Is valid module: True
01-11-2016 20:33:18 ipv6_networking: IPV6INIT=no
01-11-2016 20:33:18 ipv6_networking: IPv6 initialized: no
01-11-2016 20:33:18 ipv6_networking: False
01-11-2016 20:33:18 ipv6_networking: Is valid module: False
01-11-2016 20:33:18
01-11-2016 20:33:18 --> executing test [validate_network], fix: fixauto, stop on error: False
01-11-2016 20:33:18
01-11-2016 20:33:18 validate_network: ------------------
01-11-2016 20:33:18 validate_network: Testing networking, but skipping duplicate IP test.
01-11-2016 20:33:18 validate_network: checking network [/usr/local/bin/base_scripts/validateNetworking.sh -n]
01-11-2016 20:33:19 validate_network: retrieving pub name from [/usr/local/platform/conf/platformConfig.xml]
01-11-2016 20:33:19 validate_network: Hostname: [XXXXXXCUCM00]
01-11-2016 20:33:19 validate_network: found pub name [XXXXXXCUCM00]
01-11-2016 20:33:19 validate_network: checking /etc/hosts [grep -q `hostname` /etc/hosts]
01-11-2016 20:33:19 validate_network: Finding cluster nodes [/usr/local/bin/base_scripts/list_cluster.sh]
01-11-2016 20:33:19 validate_network: running [./diag_validate_network_sftp.exp sftpuser@XX.XX.XX.XX>/dev/null]
01-11-2016 20:33:20 validate_network: running [./diag_validate_network_sftp.exp sftpuser@XX.XX.XX.XX>/dev/null]
01-11-2016 20:33:21 validate_network: running [./diag_validate_network_sftp.exp sftpuser@XX.XX.XX.XX>/dev/null]
01-11-2016 20:33:22 validate_network: running [./diag_validate_network_sftp.exp sftpuser@XX.XX.XX.XX>/dev/null]
01-11-2016 20:33:23 validate_network: running [./diag_validate_network_sftp.exp sftpuser@XX.XX.XX.XX>/dev/null]
01-11-2016 20:33:24 validate_network: running [./diag_validate_network_sftp.exp sftpuser@XX.XX.XX.XX>/dev/null]
01-11-2016 20:33:24 validate_network: running [./diag_validate_network_sftp.exp sftpuser@XX.XX.XX.XX>/dev/null]
01-11-2016 20:33:25 validate_network: running [./diag_validate_network_sftp.exp sftpuser@XX.XX.XX.XX>/dev/null]
01-11-2016 20:33:26 validate_network: running [./diag_validate_network_sftp.exp sftpuser@XX.XX.XX.XX>/dev/null]
01-11-2016 20:33:26 validate_network: running [./diag_validate_network_sftp.exp sftpuser@XX.XX.XX.XX>/dev/null]
01-11-2016 20:33:27 validate_network: running [./diag_validate_network_sftp.exp sftpuser@XX.XX.XX.XX>/dev/null]
01-11-2016 20:33:27 validate_network: running [./diag_validate_network_sftp.exp sftpuser@XX.XX.XX.XX>/dev/null]
01-11-2016 20:33:28 validate_network: running [./diag_validate_network_sftp.exp sftpuser@XX.XX.XX.XX>/dev/null]
01-11-2016 20:33:29 validate_network: running [./diag_validate_network_sftp.exp sftpuser@XX.XX.XX.XX>/dev/null]
01-11-2016 20:33:30 validate_network: running [./diag_validate_network_sftp.exp sftpuser@XX.XX.XX.XX>/dev/null]
01-11-2016 20:33:31 validate_network: running [./diag_validate_network_sftp.exp sftpuser@XX.XX.XX.XX>/dev/null]
01-11-2016 20:33:31 validate_network: running [./diag_validate_network_sftp.exp sftpuser@XX.XX.XX.XX>/dev/null]
01-11-2016 20:33:31 validate_network: running [./diag_validate_network_sftp.exp sftpuser@XX.XX.XX.XX>/dev/null]
01-11-2016 20:33:32 validate_network: running [./diag_validate_network_sftp.exp sftpuser@XX.XX.XX.XX>/dev/null]
01-11-2016 20:33:32 validate_network: does test script exist [/usr/local/bin/base_scripts/networkDiagnostic.sh]
01-11-2016 20:33:32 validate_network: test script exists
01-11-2016 20:33:32 validate_network: run network script via expect [./diag_validate_network.exp > /dev/null]
01-11-2016 20:33:33 validate_network: result: 0, message: Passed
end of the file reached
01-11-2016 01:00 PM
After the previous node died, did you also remove its processNode reference in the cluster (System -> Server in the Cisco Unified CM Administration web interface) before reinstalling the node back into the cluster?
01-11-2016 01:12 PM
Sorry for not answering that before.
I'm not the one who did the reinstallation; I was just asked to come in and clean up some issues. I asked the guy who did the reinstall, and he said he did remove it. That's the best I can do for you.
01-11-2016 01:23 PM
I would first start by restarting the DRS services on the nodes and then test.
If you still get the errors in the backup job, I would restart the Tomcat service on the nodes and then test again.
If you STILL get errors, I would raise a TAC case (you may need to do a reboot, but I would have TAC look at it first).
Thanks,
Ryan
01-11-2016 02:30 PM
Restarting those services did it, Ryan. Thanks.
01-11-2016 01:27 PM
If you are not running the TFTP service on that server, then it won't take a backup for that server.
01-11-2016 02:06 PM
That's just the thing, Samuel. The TFTP service is not running on the server, yet DRS is trying to back up the files for the service. This server was never a TFTP server, and the service is currently deactivated. When DRS tries to back it up, there's nothing to back up and the routine fails.
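To confirm which services are activated and running on a given node, you can check from that node's platform CLI (the exact status wording can differ slightly between CUCM versions):

```
admin:utils service list

  Cisco Tftp [Deactivated]
  Cisco DRF Local [STARTED]
  ...
```

A service showing Deactivated here should not be included in the DRS backup manifest, which is why a backup job still referencing it points to stale registration data rather than a genuinely running service.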
01-11-2016 02:11 PM
I think you may need to reboot those servers. I would say open a TAC case and let them check it.
01-11-2016 02:31 PM
As it turned out, I just had to restart the services. Check out Ryan's response. Thanks for working with me, though, Samuel.