Core dump file - CTIManager

juan.hinojosa89
Level 1

Hi

I had an issue with the CTIManager service and the CallManager service: both restarted twice. The first time was last Monday and the second was last Friday, as you can see:

On Subscriber
Cisco CallManager Started Activated Fri Nov 11 17:46:21 2016 0 days 00:51:00

Cisco CTIManager Started Activated Fri Nov 11 17:44:45 2016 0 days 00:52:36

On Publisher
Cisco CTIManager Started Activated Fri Nov 11 17:44:56 2016 0 days 00:53:44

Cisco CallManager Started Activated Fri Nov 11 17:46:16 2016 0 days 00:52:24

We have two servers in the cluster, and the replication is good. The CUCM generated some core dump files:

108384 KB 2016-11-07 12:27:02 core.18216.6.CTIManager.1478542974
335248 KB 2016-11-07 12:28:05 core.18220.6.ccm.1478543003
280852 KB 2016-11-11 17:45:12 core.20714.6.ccm.1478907836
104224 KB 2016-11-11 17:43:57 core.20681.6.CTIManager.1478907836
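The core file names themselves carry some information. A minimal sketch follows, assuming the names use the common Linux layout core.<pid>.<signal>.<process>.<epoch seconds>; that layout is an assumption read off the names above, not something confirmed for CUCM, and the function name `parse_core_name` is hypothetical.

```python
from datetime import datetime, timezone

# Assumed layout: core.<pid>.<signal>.<process>.<epoch seconds>
# Only the Linux signal numbers likely to matter here are mapped.
SIGNAL_NAMES = {6: "SIGABRT", 11: "SIGSEGV"}


def parse_core_name(name):
    """Split a core file name into its (assumed) fields."""
    prefix, pid, sig, rest = name.split(".", 3)
    if prefix != "core":
        raise ValueError("not a core file name: " + name)
    # The process name sits between the signal and the trailing epoch.
    process, epoch = rest.rsplit(".", 1)
    sig_num = int(sig)
    return {
        "pid": int(pid),
        "signal": SIGNAL_NAMES.get(sig_num, str(sig_num)),
        "process": process,
        "created_utc": datetime.fromtimestamp(int(epoch), tz=timezone.utc),
    }


if __name__ == "__main__":
    for core in (
        "core.18216.6.CTIManager.1478542974",
        "core.18220.6.ccm.1478543003",
        "core.20714.6.ccm.1478907836",
        "core.20681.6.CTIManager.1478907836",
    ):
        print(parse_core_name(core))
```

Under that assumption, every core here was produced by signal 6 (SIGABRT), which is consistent with the abort() frames in the backtraces, and the epoch in core.20681.6.CTIManager.1478907836 decodes to 2016-11-11 23:43:56 UTC, which would match the listed 17:43:57 if the server clock is six hours behind UTC.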

The CTIManager core file was created at 2016-11-11 17:43:57, followed by the restart of the CTIManager service, which started at Fri Nov 11 17:44:56 2016 on the publisher and Fri Nov 11 17:44:45 2016 on the subscriber.

The ccm core file was created at 2016-11-11 17:45:12, followed by the restart of the CallManager service, which started at Fri Nov 11 17:46:16 2016 on the publisher and Fri Nov 11 17:46:21 2016 on the subscriber.
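The gap between each core dump and the service start can be computed rather than read by eye. This is only a sketch: the two timestamp formats are taken from the listings in this post, the helper name `restart_delay` is hypothetical, and it assumes both timestamps are in the same time zone and that an English locale is in effect for the day and month names.

```python
from datetime import datetime

# Formats as they appear in this post (assumed to share one time zone).
CORE_FMT = "%Y-%m-%d %H:%M:%S"          # e.g. 2016-11-11 17:43:57
SERVICE_FMT = "%a %b %d %H:%M:%S %Y"    # e.g. Fri Nov 11 17:44:56 2016


def restart_delay(core_created, service_started):
    """Seconds between core file creation and the service start time."""
    core = datetime.strptime(core_created, CORE_FMT)
    start = datetime.strptime(service_started, SERVICE_FMT)
    return int((start - core).total_seconds())


if __name__ == "__main__":
    # CTIManager core vs. publisher and subscriber start times
    print(restart_delay("2016-11-11 17:43:57", "Fri Nov 11 17:44:56 2016"))
    print(restart_delay("2016-11-11 17:43:57", "Fri Nov 11 17:44:45 2016"))
    # ccm core vs. publisher and subscriber start times
    print(restart_delay("2016-11-11 17:45:12", "Fri Nov 11 17:46:16 2016"))
    print(restart_delay("2016-11-11 17:45:12", "Fri Nov 11 17:46:21 2016"))
```

With the values above, each service came back roughly a minute after its core file was written.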

The backtraces are the following:

core.20681.6.CTIManager.1478907836

backtrace - CUCM
===================================
#0 0x00e3e430 in __kernel_vsyscall ()
#1 0x00abb871 in raise () from /lib/libc.so.6
#2 0x00abd14a in abort () from /lib/libc.so.6
#3 0x083eae0c in IntentionalAbort (reason=0x85fae48 "SDL Router Services declared dead. This may be due to high CPU usage or blocked function. Attempting to restart CTIManager.") at ProcessCTIProcMon.cpp:65
#4 0x083eafcb in CMProcMon::verifySdlTimerServices () at ProcessCTIProcMon.cpp:573
#5 0x083ebbbc in CMProcMon::callManagerMonitorThread (cmProcMon=0xf7228d48) at ProcessCTIProcMon.cpp:330
#6 0x00952398 in ACE_OS_Thread_Adapter::invoke() () from /usr/local/platform/lib/libACE.so.6.1.1
#7 0x00912491 in ace_thread_adapter () from /usr/local/platform/lib/libACE.so.6.1.1
#8 0x004adb39 in start_thread () from /lib/libpthread.so.0
#9 0x00b73c2e in clone () from /lib/libc.so.6
====================================

core.20714.6.ccm.1478907836

backtrace - CUCM
===================================
#0 0xf7744430 in __kernel_vsyscall ()
#1 0xf6931871 in raise () from /lib/libc.so.6
#2 0xf693314a in abort () from /lib/libc.so.6
#3 0x0838898e in IntentionalAbort () at ProcessCMProcMon.cpp:88
#4 CMProcMon::verifySdlRouterServices () at ProcessCMProcMon.cpp:748
#5 0x08388bba in CMProcMon::callManagerMonitorThread (cmProcMon=0xe704be0) at ProcessCMProcMon.cpp:429
#6 0xf6c63398 in ACE_OS_Thread_Adapter::invoke (this=0x112e32e0) at OS_Thread_Adapter.cpp:103
#7 0xf6c23491 in ace_thread_adapter (args=0x112e32e0) at Base_Thread_Adapter.cpp:126
#8 0xf68e9b39 in start_thread () from /lib/libpthread.so.0
#9 0xf69e9c2e in clone () from /lib/libc.so.6
====================================

core.18216.6.CTIManager.1478542974

backtrace - CUCM
===================================
#0 0x00751430 in __kernel_vsyscall ()
#1 0x01bca871 in raise () from /lib/libc.so.6
#2 0x01bcc14a in abort () from /lib/libc.so.6
#3 0x083eae0c in IntentionalAbort (reason=0x85fae48 "SDL Router Services declared dead. This may be due to high CPU usage or blocked function. Attempting to restart CTIManager.") at ProcessCTIProcMon.cpp:65
#4 0x083eafcb in CMProcMon::verifySdlTimerServices () at ProcessCTIProcMon.cpp:573
#5 0x083ebbbc in CMProcMon::callManagerMonitorThread (cmProcMon=0xf7332d88) at ProcessCTIProcMon.cpp:330
#6 0x008c8398 in ACE_OS_Thread_Adapter::invoke() () from /usr/local/platform/lib/libACE.so.6.1.1
#7 0x00888491 in ace_thread_adapter () from /usr/local/platform/lib/libACE.so.6.1.1
#8 0x00260b39 in start_thread () from /lib/libpthread.so.0
#9 0x01c82c2e in clone () from /lib/libc.so.6
====================================

core.18220.6.ccm.1478543003

backtrace - CUCM
===================================
#0 0xf77c4430 in __kernel_vsyscall ()
#1 0xf69b1871 in raise () from /lib/libc.so.6
#2 0xf69b314a in abort () from /lib/libc.so.6
#3 0x0838857a in IntentionalAbort () at ProcessCMProcMon.cpp:88
#4 CMProcMon::verifySdlTimerServices () at ProcessCMProcMon.cpp:802
#5 0x08388bcb in CMProcMon::callManagerMonitorThread (cmProcMon=0xeea5e990) at ProcessCMProcMon.cpp:431
#6 0xf6ce3398 in ACE_OS_Thread_Adapter::invoke (this=0xe551b490) at OS_Thread_Adapter.cpp:103
#7 0xf6ca3491 in ace_thread_adapter (args=0xe551b490) at Base_Thread_Adapter.cpp:126
#8 0xf6969b39 in start_thread () from /lib/libpthread.so.0
#9 0xf6a69c2e in clone () from /lib/libc.so.6
====================================
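All four backtraces have the same shape: a monitor thread calls a verify function, which calls IntentionalAbort. A small sketch for pulling out which check triggered the abort is shown here; the function name `abort_caller` and the regular expression are my own, written against the gdb frame lines as pasted in this post, and may need adjusting for other gdb output.

```python
import re

# Matches gdb frame lines such as
#   "#3 0x0838898e in IntentionalAbort () at ProcessCMProcMon.cpp:88"
#   "#4 CMProcMon::verifySdlRouterServices () at ProcessCMProcMon.cpp:748"
# The address and "in" are optional because some frames omit them.
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?([\w:~]+)\s*\(")


def abort_caller(backtrace):
    """Return the function in the frame just above IntentionalAbort, or None."""
    frames = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append(match.group(2))
    for index, function in enumerate(frames[:-1]):
        if function == "IntentionalAbort":
            return frames[index + 1]
    return None


if __name__ == "__main__":
    sample = """\
#2 0xf693314a in abort () from /lib/libc.so.6
#3 0x0838898e in IntentionalAbort () at ProcessCMProcMon.cpp:88
#4 CMProcMon::verifySdlRouterServices () at ProcessCMProcMon.cpp:748
#5 0x08388bba in CMProcMon::callManagerMonitorThread (cmProcMon=0xe704be0) at ProcessCMProcMon.cpp:429
"""
    print(abort_caller(sample))
```

Applied to the traces above, the Nov 11 ccm core goes through CMProcMon::verifySdlRouterServices, while the Nov 7 ccm core and both CTIManager cores go through CMProcMon::verifySdlTimerServices.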

The CPU levels are fine; even the peaks stay below 60% on both servers. We also have one UCCX server and one IM&Presence server, and neither of them generated any core dump files.

I would appreciate any help.

Thank you

5 Replies

Adarsh Chauhan
Level 3

Hi Juan,

The backtraces you have pasted are generic errors.

They do not reflect a defect in the code; they just indicate that there was a resource starvation situation.

You might want to do a quick storage check for a high I/O percentage.

You also might want to refer to:

Troubleshooting Intentional Abort

Please rate if helpful.

Regards,

Adarsh Chauhan


Please rate and mark correct if helpful
Regards,
Adarsh Chauhan

Hi Adarsh

Thank you for your support.

I am looking for a loop, because I found some interesting entries in the CDR reports.

By the way, I tried to view the RisDC perfmon log information in RTMT and I can't. It shows me an error message.

Regards

Juan

Hi Juan,

If you can attach the perfmon file then I can try to have a look at it and comment further.

Regards,

Adarsh Chauhan



Hi

We had the issue again. This time only the CallManager service restarted (CTIManager did not restart).

The core dump file generated was the following:

285324 KB   2016-11-19 00:00:43   core.28161.6.ccm.1479535192

Since CTIManager did not restart, no core dump was generated for it.

I found some logs and have attached the most important ones in this file.

You can see the moment of the error, then the keepalive timeouts, and then some connections being dropped.

I can't determine from these logs why the server failed. It was obviously a resource problem, but I can't see the cause.

I hope you can help me.

Thank you

Look at how the timestamps of the core dump and of the error in the log agree.

Also check reason 4 for the error in the attached image from the System error messages.
