03-20-2014 09:45 PM - edited 03-16-2019 10:12 PM
Hi there
Our publisher rebooted itself for some reason around 8:56 AM and came back up successfully around 9 AM.
We would like to know what happened. Can someone help check the logs and give us some idea?
I have collected the Event Viewer logs for reference.
Cheers
Fei
03-20-2014 11:17 PM
Hi Fei,
From the syslogs, I don't see what happened before the reboot, except for one message saying that a certificate has expired on one of the nodes in the cluster.
Mar 20 21:00:00 ndc1cm001 local7 0 : 311: Mar 20 10:00:00.60 UTC : %CCM_UNKNOWN-CERT-0-CertExpiryEmergency: Certificate Expiry EMERGENCY_ALARM Message:Certificate expiration Notification. Certificate name:CAPF-962c702a Unit:CallManager-trust Type:trust-cert Expiration:Mon Oct 29 App ID:Cisco Certificate Monitor Cluster ID: Node ID:ndc1cm001
After this, the only message is that the CallManager server has booted, with Service Manager starting up. This is the first service to start.
Mar 21 08:58:37 ndc1cm001 local7 6 : 0: Mar 20 21:58:37.317 UTC : %CCM_SERVICEMANAGER-GENERIC-6-ServiceStarted: Service started. Service Name:Service Manager Process ID:6959 App ID:Cisco Service Manager Cluster ID: Node ID:ndc1cm001
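To make the silence between those two messages easier to spot in a longer syslog export, here is a minimal Python sketch. It assumes every alarm line carries an embedded "Mon DD HH:MM:SS.fff UTC : %ALARM_NAME:" section like the two lines quoted above, and that the year is supplied by the caller because the log omits it. The names `parse_alarms` and `largest_gap` are hypothetical helpers, not part of any Cisco tool.

```python
import re
from datetime import datetime

# Matches the embedded alarm timestamp and name, e.g.
#   "Mar 20 10:00:00.60 UTC : %CCM_UNKNOWN-CERT-0-CertExpiryEmergency:"
ALARM = re.compile(
    r"(?P<stamp>[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2}\.\d+) UTC : "
    r"%(?P<alarm>[^:\s]+):"
)

def parse_alarms(lines, year):
    """Yield (utc_time, alarm_name) for each line carrying an alarm."""
    for line in lines:
        match = ALARM.search(line)
        if match:
            # The log omits the year, so the caller supplies it (assumption).
            when = datetime.strptime(
                f"{year} {match.group('stamp')}", "%Y %b %d %H:%M:%S.%f"
            )
            yield when, match.group("alarm")

def largest_gap(lines, year):
    """Return (gap, alarm_before, alarm_after) for the longest silence."""
    events = sorted(parse_alarms(lines, year))
    if len(events) < 2:
        return None  # not enough alarms to measure a gap
    pairs = zip(events, events[1:])
    before, after = max(pairs, key=lambda p: p[1][0] - p[0][0])
    return after[0] - before[0], before[1], after[1]
```

Calling `largest_gap(open("CiscoSyslog"), 2014)` would return the longest quiet period and the alarm names on either side of it. A long gap only shows that nothing was logged; it does not by itself say why.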
I would suggest checking the server for any core dumps that have been created. That would indicate a service crash, causing a reboot.
Another thing you can look at is the Tomcat catalina.out log, to see whether that may have been a cause.
Was any change made on this server yesterday?
Thanks
03-23-2014 08:17 PM
Hi Sreekanth
Following the post from Slashdots, we did see an unclean shutdown in system-history.log:
03/21/2014 08:57:59 | root: Boot 7.1.5.10000-12 Start
Can you confirm whether we are hitting this bug?
Our CCM version is 7.1.5.10000-12.
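For anyone checking their own system-history.log, the pattern can be sketched in Python. This is a heuristic under stated assumptions: the entry layout is inferred from the single line quoted above, and it assumes that a graceful shutdown or restart writes a "Shutdown" or "Restart" entry before the next "Boot" entry, so two "Boot" entries in a row suggest an unclean shutdown. The function name `find_unclean_boots` is hypothetical.

```python
import re

# Entry layout inferred from the line quoted above:
#   MM/DD/YYYY HH:MM:SS | user: Action version ...
ENTRY = re.compile(
    r"^(?P<stamp>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}) \| "
    r"(?P<user>\S+): (?P<action>\w+) "
)

# Only these actions are tracked; "Shutdown" and "Restart" are assumed
# to be what a graceful shutdown or reboot records.
TRACKED = ("Boot", "Shutdown", "Restart")

def find_unclean_boots(lines):
    """Return timestamps of Boot entries that directly follow another Boot."""
    unclean = []
    previous = None
    for line in lines:
        match = ENTRY.match(line.strip())
        if not match or match.group("action") not in TRACKED:
            continue  # skip headers, blank lines and untracked entries
        action = match.group("action")
        if action == "Boot" and previous == "Boot":
            unclean.append(match.group("stamp"))
        previous = action
    return unclean
```

Running it over the file, for example `find_unclean_boots(open("system-history.log"))`, lists the boots that had no recorded shutdown or restart before them. The very first boot after an install is not flagged, since nothing was tracked before it.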
03-23-2014 08:41 PM
Hi Fei,
Yes, CUCM version 7.1.5 is affected, and the bug workaround states: "Rebuild the server if there is a non-graceful shutdown regardless if the recovery disk was required. As any non-graceful shutdown can cause file system corruption even if no symptoms are seen immediately."
HTH
Manish
03-23-2014 08:46 PM
Hi Manish
What was the cause of the unexpected reboot in our case?
03-23-2014 09:00 PM
Hi Fei,
I don't see much information in the traces for the period before 8:56 AM. Have you checked the output of 'utils core active list' on the server to see if any core dumps were generated?
HTH
Manish
03-23-2014 11:32 PM
Hi Fei,
Did you see any core dumps generated on the server?
If you don't see any logs for this period of time, it could mean one of two things:
1. Server rebooted and therefore no logs were written.
2. The server went into read-only mode and therefore couldn't write any logs to the database.
Please make sure that the firmware on the servers is at the latest version posted on the cisco.com website. This helps keep them from hitting issues such as read-only mode.
Downloads Home > Products > Unified Communications > Voice Servers > Cisco 7800 Series Media Convergence Servers
03-21-2014 01:24 AM
Hello Fei
You can also try the steps described here:
http://www.cisco.com/c/en/us/support/docs/unified-communications/unified-communications-manager-callmanager/116717-trouble-cucm-shutdown-00.html
This document helps you examine the reasons for a shutdown, reboot, etc.
In my opinion this could also be due to a power outage, since there are no logs of services being shut down.
cheers
slashdots
unrelated:
You might want to check your licenses since there are a lot of License File Errors in your logs (file CiscoSyslog)
03-23-2014 07:53 PM
Hi Slashdots
The URL is extremely helpful.
In our case, it was not a power outage. The Pub just rebooted itself for no apparent reason.
I will approach Cisco to confirm whether we are hitting a bug.
Thx.
Fei