
CUCM 7.1.5 Publisher rebooted unexpectedly

fei he
Level 2

Hi there,

Our publisher rebooted itself for some reason around 8:56 AM and came back up successfully around 9 AM.

We would like to know what happened. Can someone help check the logs and give us some idea of the cause?

I have collected the Event Viewer logs for reference.

Cheers

Fei

8 Replies

Sreekanth Narayanan
Cisco Employee

Hi Fei,

From the syslogs, I don't see what happened before the reboot, except for one message indicating that a certificate has expired on one of the nodes in the cluster.

Mar 20 21:00:00 ndc1cm001 local7 0 : 311: Mar 20 10:00:00.60 UTC :  %CCM_UNKNOWN-CERT-0-CertExpiryEmergency: Certificate Expiry EMERGENCY_ALARM  Message:Certificate expiration Notification. Certificate name:CAPF-962c702a Unit:CallManager-trust Type:trust-cert Expiration:Mon Oct 29  App ID:Cisco Certificate Monitor Cluster ID: Node ID:ndc1cm001

 

After this, the only message shows that the server has booted and the Service Manager is starting up. This is the first service to start.

Mar 21 08:58:37 ndc1cm001 local7 6 : 0: Mar 20 21:58:37.317 UTC :  %CCM_SERVICEMANAGER-GENERIC-6-ServiceStarted: Service started. Service Name:Service Manager Process ID:6959 App ID:Cisco Service Manager Cluster ID: Node ID:ndc1cm001
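
To make the gap between those two messages explicit, here is a minimal Python sketch that pulls the timestamp, severity digit, and alarm name out of each line and computes the silent period before the Service Manager started. The regular expression is inferred only from the two lines quoted above, the function name `parse_alarm` is hypothetical, and the year 2014 is an assumption because the syslog timestamp does not carry one.

```python
import re
from datetime import datetime

# Pattern inferred from the two syslog lines quoted in this thread (an assumption,
# not an official CUCM log grammar): leading local timestamp, host name, then
# %FACILITY-SUBFACILITY-SEVERITY-AlarmName:
LINE_RE = re.compile(
    r"^(?P<stamp>[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) .*?"
    r"%(?P<facility>[A-Z_]+)-(?P<subfacility>[A-Z_]+)-(?P<severity>\d)-(?P<alarm>\w+):"
)

def parse_alarm(line, year=2014):
    """Return a dict describing one alarm line, or None if it does not match.

    The year is passed in because the leading syslog timestamp omits it.
    """
    m = LINE_RE.search(line)
    if m is None:
        return None
    return {
        "time": datetime.strptime(f"{year} {m['stamp']}", "%Y %b %d %H:%M:%S"),
        "host": m["host"],
        "facility": m["facility"],
        "severity": int(m["severity"]),  # syslog convention: 0 = emergency, 6 = informational
        "alarm": m["alarm"],
    }

# Copies of the two lines quoted above, truncated after the start of the message text.
BEFORE_REBOOT = ("Mar 20 21:00:00 ndc1cm001 local7 0 : 311: Mar 20 10:00:00.60 UTC :  "
                 "%CCM_UNKNOWN-CERT-0-CertExpiryEmergency: Certificate Expiry EMERGENCY_ALARM")
AFTER_REBOOT = ("Mar 21 08:58:37 ndc1cm001 local7 6 : 0: Mar 20 21:58:37.317 UTC :  "
                "%CCM_SERVICEMANAGER-GENERIC-6-ServiceStarted: Service started.")

before = parse_alarm(BEFORE_REBOOT)
after = parse_alarm(AFTER_REBOOT)
gap = after["time"] - before["time"]
print(f"Last alarm before reboot: {before['alarm']} (severity {before['severity']})")
print(f"First alarm after reboot: {after['alarm']} (severity {after['severity']})")
print(f"Silent period in syslog: {gap}")
```

A long silent period ending in a ServiceStarted message only shows that nothing was logged before the boot; it does not by itself say why the server went down.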

 

I would suggest checking the server for any core dumps that have been created. A core dump would indicate a service crash, which can cause a reboot.

Another thing you can look at is the Tomcat catalina.out log, to see whether that may have been a cause.

 

Were any changes made on this server yesterday?

 

Thanks

Hi Sreekanth

 

Following the post from slashdots, we did see an unclean shutdown in system-history.log:

03/21/2014 08:57:59 | root: Boot 7.1.5.10000-12 Start
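
A rough way to automate this check is to scan system-history.log for a Boot entry that is not preceded by a Shutdown or Restart entry. The Python sketch below is only an illustration under stated assumptions: the entry layout is inferred from the single line quoted above, the `Shutdown` and `Restart` keywords are assumed, the function name `find_unclean_boots` is hypothetical, and the first two sample entries are made up for the example (only the last one comes from our server).

```python
import re

# Layout inferred from the one entry quoted above (an assumption):
# MM/DD/YYYY HH:MM:SS | root: <Action> <Version> <Result>
ENTRY_RE = re.compile(
    r"^(?P<stamp>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}) \| root: "
    r"(?P<action>\w+) (?P<version>\S+) (?P<result>\w+)$"
)

def find_unclean_boots(lines, clean_actions=("Shutdown", "Restart")):
    """Return timestamps of Boot entries whose previous entry is not a clean stop.

    clean_actions holds the assumed keywords that mark a graceful shutdown or
    restart. A Boot that is the very first entry is skipped, because there is
    nothing to compare it against.
    """
    unclean = []
    previous_action = None
    for line in lines:
        m = ENTRY_RE.match(line.strip())
        if m is None:
            continue  # ignore anything that is not a history entry
        action = m["action"]
        if (action == "Boot"
                and previous_action is not None
                and previous_action not in clean_actions):
            unclean.append(m["stamp"])
        previous_action = action
    return unclean

history = [
    "03/01/2014 22:10:05 | root: Restart 7.1.5.10000-12 Start",  # hypothetical entry
    "03/01/2014 22:13:40 | root: Boot 7.1.5.10000-12 Start",     # hypothetical entry
    "03/21/2014 08:57:59 | root: Boot 7.1.5.10000-12 Start",     # entry quoted in this post
]
print(find_unclean_boots(history))
```

With this sample, only the 03/21/2014 boot is reported, because the boot before it follows a Restart entry.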

 

Can you confirm whether we are hitting this bug?

Our CUCM version is 7.1.5.10000-12.

 

Hi Fei,

Yes, CUCM version 7.1.5 is affected. As per the bug workaround: "Rebuild the server if there is a non-graceful shutdown regardless if the recovery disk was required. As any non-graceful shutdown can cause file system corruption even if no symptoms are seen immediately."

HTH

Manish

Hi Manish

What is the cause of the unexpected reboot in our case?

 

Hi Fei,

I don't see much information in the traces for the period before 8:56 AM. Have you checked the output of 'utils core active list' on the server to see whether any core dumps were generated?

HTH

Manish

Hi Fei,

Did you see any core dumps generated on the server?

If you don't see any logs for this period of time, it could mean one of two things:

1. Server rebooted and therefore no logs were written.

2. The server went into read-only mode and therefore couldn't write any logs to the database.

 

Please make sure that the firmware on the servers is at the latest version posted on cisco.com. This helps keep them from hitting issues such as the read-only mode.

Downloads Home > Products > Unified Communications > Voice Servers > Cisco 7800 Series Media Convergence Servers

slashdots
Level 1

Hello Fei

You can also try the steps described here:
http://www.cisco.com/c/en/us/support/docs/unified-communications/unified-communications-manager-callmanager/116717-trouble-cucm-shutdown-00.html

This document helps you examine the reasons for a shutdown, reboot, etc.

In my opinion, this could also be due to a power outage, since there are no logs of services being shut down.

cheers
slashdots

Unrelated:
You might want to check your licenses, since there are a lot of license file errors in your logs (file CiscoSyslog).

Hi slashdots,

The URL is extremely helpful.

In our case, it was not a power outage. The publisher just rebooted itself for no apparent reason.

I will approach Cisco to confirm whether we are hitting a bug.

Thx.

Fei