cmoninit High CPU v11.0.1

cameronallison1
Level 1

Hi Everyone,

Every night I'm getting an RTMT alert from my CUCM 11.0.1 publisher (the cluster is the publisher plus 4 subscribers):

Processor load over configured threshold for configured duration of time. Configured high threshold is 90 % cmoninit (58 percent) uses most of the CPU.

Processor_Info:
For processor instance _Total: %CPU= 99, %User= 77, %System= 17, %Nice= 0, %Idle= 0, %IOWait= 5, %softirq= 1, %irq= 0.
For processor instance 0: %CPU= 99, %User= 77, %System= 17, %Nice= 0, %Idle= 0, %IOWait= 5, %softirq= 1, %irq= 0.

The alert is generated on Thu Nov 10 00:02:25 AEDT 2016 on node 10.99.63.51.

Memory_Info: %Mem Used= 65, %VM Used= 49.

Partition_Info:
Swap: %Disk Used=25.
Active: %Disk Used=92.
Common: %Disk Used=48.

Process_Info: processes with D-State: cmoninit#2

Thanks for your help in advance

Cam
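
For anyone comparing these alerts night over night, the Processor_Info lines can be turned into numbers with a few lines of Python. This is only a minimal sketch: the function name `parse_processor_line` is made up for illustration, and it assumes the alert text keeps the `%Name= value` layout shown in the alert above.

```python
import re


def parse_processor_line(line: str) -> dict:
    """Parse one 'For processor instance ...' line from an RTMT CPU alert.

    Assumes the '%Name= value' layout seen in the alert quoted above.
    """
    match = re.search(r"processor instance (\S+):", line)
    if match is None:
        raise ValueError("not a Processor_Info line")
    # Collect every '%Name= value' pair as an integer percentage.
    metrics = {name: int(value) for name, value in re.findall(r"%(\w+)=\s*(\d+)", line)}
    return {"instance": match.group(1), "metrics": metrics}


sample = (
    "For processor instance _Total: %CPU= 99, %User= 77, %System= 17, "
    "%Nice= 0, %Idle= 0, %IOWait= 5, %softirq= 1, %irq= 0."
)
parsed = parse_processor_line(sample)
for name, value in parsed["metrics"].items():
    print(f"{parsed['instance']} {name}: {value}%")
```

In this alert most of the load is user time (77%), with system time at 17% and I/O wait at 5%.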


Aseem Anand
Cisco Employee

Hi,

The system is complaining about high CPU.

Please check the following:

1. Make sure you are not using snapshots on CUCM, as they tend to affect system performance.

2. If it happens every night, do you have a DRS backup, directory sync, or any other scheduled activity on the server?

3. Check the NTP status on the CUCM. The recommended stratum value for NTP is less than 5. You can check it in the output of utils ntp status.

4. Collect the Event Viewer system and application logs and look for any errors or alerts around the time of the high CPU alert.

5. Run system diagnostics during normal hours and again when the system complains about high CPU, then compare the two. You can use the command utils diagnose test.

6. Check the ESXi host for any errors or alerts.

Aseem

Hi Aseem,

Thanks for your response.

- I've confirmed that both the ESXi snapshots and the backups occur only at 3:30am

- The DRS backup runs each night at 1:00am. I even turned off DRS one night and the issue still occurred

- The NTP Stratum is 3

- I'll check the application logs and look for errors

I'll let you know

Thanks

Cam

Hi,

cmoninit is a database process, and there can be a variety of reasons why this could happen.

However, snapshots are not supported on CUCM; you would have to remove them to get to a supported configuration.

Also, how many users do you have on CUCM? Can you paste the output of "show status" from the CLI for the server?

JB

Hi JB,

Thanks for your response

I agree with you on the snapshots, but the VMware team takes them anyway in order to back up the system

See the show status output below

Executed command unsuccessfully
No valid command entered
admin:show status

Host Name          : vicpccm01
Date               : Thu Nov 10, 2016 14:57:15
Time Zone          : Australian Eastern Daylight Time (Australia/Melbourne)
Locale             : en_US.UTF-8
Product Ver        : 11.0.1.20000-2
Unified OS Version : 6.0.0.0-2

Uptime:
 14:57:16 up 106 days, 19:01,  1 user,  load average: 0.16, 0.27, 0.34

CPU Idle:   93.75%  System:   03.12%    User:   03.12%
  IOWAIT:   00.00%     IRQ:   00.00%    Soft:   00.00%

Memory Total:        5994288K
        Free:         222780K
        Used:        5771508K
      Cached:        1761028K
      Shared:         299552K
     Buffers:         150132K

                        Total            Free            Used
Disk/active         14154228K        1125108K       12883992K (92%)
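
If you want to track the disk figures from show status over time, the Disk/ rows are easy to parse. The snippet below is a rough sketch, not a Cisco tool: `parse_disk_rows` is a hypothetical helper, and it assumes the column order Total, Free, Used and the trailing percentage exactly as pasted above. It also assumes the reported percentage is calculated df-style as used / (used + free), which matches the numbers here but is not something the output itself states.

```python
import re

# Header and Disk/active row copied from the show status output above.
STATUS_TEXT = """\
                        Total            Free            Used
Disk/active         14154228K        1125108K       12883992K (92%)
"""

# One row looks like: Disk/<name>  <total>K  <free>K  <used>K (<pct>%)
DISK_ROW = re.compile(
    r"^Disk/(\S+)\s+(\d+)K\s+(\d+)K\s+(\d+)K\s+\((\d+)%\)", re.MULTILINE
)


def parse_disk_rows(text: str) -> dict:
    """Return {partition: sizes in KB, reported percent, recomputed percent}."""
    rows = {}
    for name, total, free, used, pct in DISK_ROW.findall(text):
        used_k, free_k = int(used), int(free)
        rows[name] = {
            "total_k": int(total),
            "free_k": free_k,
            "used_k": used_k,
            "reported_pct": int(pct),
            # Assumption: the percentage is used space over used + free space.
            "used_of_available_pct": 100.0 * used_k / (used_k + free_k),
        }
    return rows


rows = parse_disk_rows(STATUS_TEXT)
for name, row in rows.items():
    print(f"{name}: reported {row['reported_pct']}%, "
          f"recomputed {row['used_of_available_pct']:.1f}%")
```

The recomputed value for the active partition comes out at roughly 92%, which agrees with both the show status output and the Partition_Info section of the RTMT alert.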

Hi,

Can you also paste the output of "show hardware"?

I understand the VMware team's concern, but if you go to TAC for troubleshooting, the first thing they will tell you is to remove the snapshots.

You can set up DRS to back up your configuration, which is a supported way to go.

JB

Hi JB,

See show hardware below


admin:show hard
admin:show hardware

HW Platform       : VMware Virtual Machine
Processors        : 1
Type              : Intel(R) Xeon(R) CPU E5-2690 0 @ 2.90GHz
CPU Speed         : 2900
Memory            : 6144 MBytes
Object ID         : 1.3.6.1.4.1.9.1.1348
OS Version        : UCOS 6.0.0.0-2.i386
Serial Number     : VMware

RAID Version      :
No RAID controller information is available

BIOS Information  :
PhoenixTechnologiesLTD 6.00 10/22/2013

RAID Details      :
No RAID information is available
-----------------------------------------------------------------------
Physical device information
-----------------------------------------------------------------------
Number of Disks   : 1
Hard Disk #1
Size (in GB)      : 80

Partition Details :

Disk /dev/sda: 10443 cylinders, 255 heads, 63 sectors/track
Units = sectors of 512 bytes, counting from 0

   Device Boot    Start       End   #sectors  Id  System
/dev/sda1   *      2048  29028351   29026304  83  Linux

Thanks

Cam

Hi Cam,

Can you tell me the following?

  1. How many nodes are in the CUCM cluster?
  2. How many users/phones are there in total on your CUCM?

JB

Hi JB,

At present there are only 166 phones on this system. We are migrating from another CUCM. Also attached are Unity, IM&P, Attendant Console and UCCX

Thanks

Cameron

Hi Cam,

You are well within the OVA specification for 2500 users

Cisco Unified Communications Manager (CUCM) configuration that supports up to 2500 users per node.
Details:
Red Hat Enterprise Linux 6 (64-bit)
CPU: 1 vCPU with 800 MHz reservation
Memory: 6 GB with 6 GB reservation
Disk: 1 - 80 GB disk

At this point I would first get rid of the snapshots and monitor to see whether they were causing the issue, before completely migrating over from the old CUCM

JB

Hi JB,

There are no snapshots on the VMware box at present

The backup works in such a way that the backup software takes a snapshot and deletes it once the backup has finished

Thanks

Cam

Hi Cam,

Are the alerts generated at the same time that process takes place?

JB

Hi JB,

The alerts take place at 00:00 and the backup occurs at 03:30

Thanks

Cam
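
To double-check the timing, the alert timestamp can be compared against the known nightly jobs. The sketch below uses the times mentioned in this thread (alert at 00:02:25 on Nov 10, DRS at 01:00, snapshot backup at 03:30). The one-hour durations are pure assumptions for illustration, and `jobs_running_at` is a hypothetical helper, not part of any Cisco tooling.

```python
from datetime import datetime, time, timedelta


def jobs_running_at(alert: datetime, jobs: list) -> list:
    """Return the names of jobs whose run window contains the alert time.

    jobs is a list of (name, daily start time, assumed duration).
    """
    hits = []
    for name, start, duration in jobs:
        # Check the run that started on the alert's date and the one from the
        # previous day, so a job that spans midnight is still caught.
        for day_offset in (0, -1):
            begin = datetime.combine(alert.date() + timedelta(days=day_offset), start)
            if begin <= alert < begin + duration:
                hits.append(name)
                break
    return hits


# Times from the thread; durations are assumed, not measured.
jobs = [
    ("DRS backup", time(1, 0), timedelta(hours=1)),
    ("VM snapshot backup", time(3, 30), timedelta(hours=1)),
]
alert = datetime(2016, 11, 10, 0, 2, 25)

overlapping = jobs_running_at(alert, jobs)
print(overlapping or "no known job overlaps the alert")
```

With these inputs nothing overlaps, which fits what Cam reports. Any other scheduled task that starts around midnight would have to be added to the list with its real schedule before it could be ruled in or out.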

Hi Cam,

You can try increasing the vCPU count to 2 to see if that makes a difference; otherwise I would suggest collecting the logs below for that time period to find the root cause.

  1. Cisco Informix Database Service
  2. Event viewer application logs
  3. Event viewer system logs
  4. RisDC perfmon logs

JB

Hi JB,

Thanks for all your help, I'll get those logs and let you know

I'll also see if I can increase the CPU and see how I go

Thanks

Cam
