
Weird CPU usage ?

TrickTrick
Level 3

Hello,

I have an issue with an ISR 4321 used as a voice gateway. When I pick up the IP phone, I can hear the caller only after 7 to 10 seconds, during which neither end hears anything. I thought high CPU usage might be making the router slow to process the call, so I ran "show processes cpu sorted" and "show processes cpu history". In the first, the CPU looks almost unused (0%, 1%). I'm less sure about the second, because there are no stars (*) and no hashes (#), but it looks like the CPU is totally fine. Can you please confirm that my reading is correct?

CPU utilization for five seconds: 1%/0%; one minute: 0%; five minutes: 0%

 

sh processes cpu history

22222
CPU% per second (last 60 seconds)
[graph area empty: nothing plotted on the 0-100 scale]

121111111111111112111121111111111111111111111112111111111111
CPU% per minute (last 60 minutes)
* = maximum CPU% # = average CPU%
[graph area empty: no * or # plotted on the 0-100 scale]

222222222222222222222222222222222222222222222222222222222322222222222222
CPU% per hour (last 72 hours)
* = maximum CPU% # = average CPU%
[graph area empty: no * or # plotted on the 0-100 scale]
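As far as I understand the format, the row of digits printed above each graph carries the actual values, one per column, and a symbol is only drawn once a value reaches the lowest scale row (10 here). That would be consistent with an empty graph area. A minimal Python sketch that decodes the two digit rows from the post; the function name is made up for illustration, and it assumes every value is a single digit (below 10%):

```python
# Digit rows copied from the per-minute and per-hour graphs in the post.
PER_MINUTE = "121111111111111112111121111111111111111111111112111111111111"
PER_HOUR = "222222222222222222222222222222222222222222222222222222222322222222222222"


def decode_peak_row(row: str) -> list[int]:
    """Turn a one-row digit header into one integer per graph column.

    Assumption: all values are below 10%, so each character is a full value.
    Taller headers (two or three stacked rows) are not handled here.
    """
    return [int(ch) for ch in row.strip()]


per_minute = decode_peak_row(PER_MINUTE)
per_hour = decode_peak_row(PER_HOUR)

print("highest per-minute value:", max(per_minute), "%")
print("highest per-hour value:", max(per_hour), "%")
```

Both maxima are far below the first scale row, so an empty graph here points to an idle CPU rather than to missing data.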


marce1000
VIP

 

FYI: https://www.cisco.com/c/en/us/support/docs/routers/4000-series-integrated-services-routers/210760-Monitor-CPU-Usage-On-ISR4300-Series.html

Also check the currently installed software version and consider moving to an advisory release, if applicable.

 M.




Leo Laohoo
Hall of Fame

@TrickTrick wrote:

sh processes cpu history


Please exercise extreme caution when using this command on routers and switches running IOS-XE.  It tells the operator less than half of the story, and mostly the wrong half.

The "sh processes cpu history" command is useful on single-CPU (not single-core) routers and switches.

Cisco IOS-XE runs on multi-core/multi-CPU appliances.  On such an appliance the command reports only the average CPU utilization; it does not break utilization down per CPU.

The ideal command is: 

sh platform software status control-processor brief

This command is a real-time "snapshot" and will provide an overall output of each CPU as well as memory utilization.   
Here is an example of the command when used on a stack of five switches: 

Load Average
 Slot  Status  1-Min  5-Min 15-Min
1-RP0 Healthy   0.43   0.50   0.53
2-RP0 Healthy   0.26   0.30   0.27
3-RP0 Healthy   0.07   0.08   0.09
4-RP0 Healthy   0.33   0.22   0.14
5-RP0 Healthy   0.12   0.11   0.14

Memory (kB)
 Slot  Status    Total     Used (Pct)     Free (Pct) Committed (Pct)
1-RP0 Healthy  3934796  1964356 (50%)  1970440 (50%)   2910124 (74%)
2-RP0 Healthy  3934796  1858256 (47%)  2076540 (53%)   2833236 (72%)
3-RP0 Healthy  3934796  1144996 (29%)  2789800 (71%)   1548428 (39%)
4-RP0 Healthy  3934796  1138060 (29%)  2796736 (71%)   1533936 (39%)
5-RP0 Healthy  3934796  1138992 (29%)  2795804 (71%)   1539844 (39%)

CPU Utilization
 Slot  CPU   User System   Nice   Idle    IRQ   SIRQ IOwait
1-RP0    0   7.90   1.70   0.00  90.19   0.00   0.20   0.00
         1  13.40   2.20   0.00  84.30   0.00   0.10   0.00
         2  10.01   1.60   0.00  88.38   0.00   0.00   0.00
         3   6.60   1.10   0.00  92.29   0.00   0.00   0.00
2-RP0    0   4.29   1.29   0.00  94.40   0.00   0.00   0.00
         1   3.30   1.70   0.00  94.89   0.00   0.10   0.00
         2   8.19   1.99   0.00  89.81   0.00   0.00   0.00
         3   5.60   3.60   0.00  90.80   0.00   0.00   0.00
3-RP0    0   1.60   2.30   0.00  96.10   0.00   0.00   0.00
         1   4.90   1.10   0.00  93.99   0.00   0.00   0.00
         2   1.20   0.70   0.00  97.99   0.00   0.10   0.00
         3   2.80   1.40   0.00  95.80   0.00   0.00   0.00
4-RP0    0   4.90   0.60   0.00  94.50   0.00   0.00   0.00
         1   2.50   0.50   0.00  97.00   0.00   0.00   0.00
         2   0.70   0.30   0.00  98.79   0.00   0.20   0.00
         3   1.40   1.00   0.00  97.60   0.00   0.00   0.00
5-RP0    0   3.20   2.60   0.00  94.20   0.00   0.00   0.00
         1   4.90   1.60   0.00  93.50   0.00   0.00   0.00
         2   3.30   1.40   0.00  95.30   0.00   0.00   0.00
         3   0.30   0.40   0.00  99.30   0.00   0.00   0.00

The "Memory (kB)" section (highlighted in red in the original post) shows memory utilization.

Things to look out for: 

  1. Experience has taught me that the "Used" percentage should stay below 70%.  If memory use goes above 70% (and does not decrease over time), that switch will crash shortly.
  2. Used and Free should add up to 100%.
  3. Committed is 140%
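
The first two checks in the list can be scripted against captured output. The following is only a sketch in Python: the 70% limit is the rule of thumb from this post (not an official threshold), the rows are the five-switch example above, and the function and field names are invented for illustration.

```python
import re

# Memory rows copied from the five-switch example above.
MEMORY_ROWS = """\
1-RP0 Healthy  3934796  1964356 (50%)  1970440 (50%)   2910124 (74%)
2-RP0 Healthy  3934796  1858256 (47%)  2076540 (53%)   2833236 (72%)
3-RP0 Healthy  3934796  1144996 (29%)  2789800 (71%)   1548428 (39%)
4-RP0 Healthy  3934796  1138060 (29%)  2796736 (71%)   1533936 (39%)
5-RP0 Healthy  3934796  1138992 (29%)  2795804 (71%)   1539844 (39%)
"""

# Rule-of-thumb limit taken from the post; adjust to taste.
USED_LIMIT_PCT = 70

# Slot, status, total, used (pct), free (pct); the Committed column is ignored.
ROW = re.compile(
    r"(\S+)\s+(\S+)\s+(\d+)\s+(\d+)\s+\((\d+)%\)\s+(\d+)\s+\((\d+)%\)"
)


def check_memory(text, limit=USED_LIMIT_PCT):
    """Return one finding per slot: used %, over-limit flag, sanity check."""
    findings = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if not m:
            continue  # skip headers and anything that is not a memory row
        slot = m.group(1)
        total, used, used_pct, free, free_pct = (int(g) for g in m.groups()[2:])
        findings.append({
            "slot": slot,
            "used_pct": used_pct,
            "over_limit": used_pct >= limit,
            "adds_up": used + free == total and used_pct + free_pct == 100,
        })
    return findings


findings = check_memory(MEMORY_ROWS)
for f in findings:
    print(f["slot"], f"{f['used_pct']}% used",
          "OVER LIMIT" if f["over_limit"] else "ok",
          "" if f["adds_up"] else "(Used + Free mismatch)")
```

Running it on several captures taken over time shows whether "Used" keeps climbing, which is the pattern the first item warns about.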

 

The "CPU Utilization" section (highlighted in blue in the original post) shows CPU utilization.

Things to look out for: 

  • As with memory utilization, watch how busy each CPU is: here that means the "User" column (the CPU section has no "Used" column).  Because it is CPU, the value should move up and down faster than memory.
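
To spot a single busy CPU in that table, one option is to treat 100 minus the "Idle" column as the load of each CPU. A rough Python sketch using the first two switches from the example above; the function name is hypothetical, and the parser assumes the exact column layout shown in this thread.

```python
# CPU rows for slots 1-RP0 and 2-RP0, copied from the example above.
CPU_ROWS = """\
1-RP0    0   7.90   1.70   0.00  90.19   0.00   0.20   0.00
         1  13.40   2.20   0.00  84.30   0.00   0.10   0.00
         2  10.01   1.60   0.00  88.38   0.00   0.00   0.00
         3   6.60   1.10   0.00  92.29   0.00   0.00   0.00
2-RP0    0   4.29   1.29   0.00  94.40   0.00   0.00   0.00
         1   3.30   1.70   0.00  94.89   0.00   0.10   0.00
         2   8.19   1.99   0.00  89.81   0.00   0.00   0.00
         3   5.60   3.60   0.00  90.80   0.00   0.00   0.00
"""


def per_core_busy(text):
    """Return (slot, cpu, busy_pct) tuples, with busy_pct = 100 - Idle."""
    results = []
    slot = None
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 9:
            # First row of a slot: the slot name precedes the CPU index.
            slot, parts = parts[0], parts[1:]
        if len(parts) != 8 or slot is None:
            continue  # header or unrelated line
        cpu = int(parts[0])
        idle = float(parts[4])  # columns: CPU User System Nice Idle IRQ SIRQ IOwait
        results.append((slot, cpu, round(100.0 - idle, 2)))
    return results


cores = per_core_busy(CPU_ROWS)
busiest = max(cores, key=lambda c: c[2])
print("busiest:", busiest)
```

In this sample the busiest CPU is still mostly idle, so nothing stands out; the same check becomes useful when one CPU is pinned while the average looks low.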

Now let us say something is chewing the Memory.  If you want to know WHAT process(es) are hogging the memory, then use the following command: 

 

 

sh processes memory platform sorted location switch <SWITCH NUMBER> r0

 

 

Under this output, pay close attention to the RSS column and compare the value with the "total system memory" (first line after the command).  

Ok, now here comes the fun part:  How to know which CPU is "hot spinning".  The command is very similar to the command for memory:  

 

 

sh processes cpu platform sorted location switch <SWITCH NUMBER> r0

 

 

Now, here is a sample of an output:

 

 

Switch#sh processes cpu platform sorted location switch 5 r0
CPU utilization for five seconds:  3%, one minute:  3%, five minutes:  3%
Core 0: CPU utilization for five seconds:  1%, one minute:  2%, five minutes:  4%
Core 1: CPU utilization for five seconds:  1%, one minute:  2%, five minutes:  2%
Core 2: CPU utilization for five seconds:  2%, one minute:  2%, five minutes:  2%
Core 3: CPU utilization for five seconds:  5%, one minute:  4%, five minutes:  3%
   Pid    PPid    5Sec    1Min    5Min  Status        Size  Name                  
--------------------------------------------------------------------------------
 10659   10015      5%      5%      5%  R           214452  fed main event        
 19730   18997      3%      3%      3%  S            63436  fman_fp_image         
 10956   10342      1%      1%      1%  R            14196  hman                  
  8487    8154      1%      1%      1%  S            80128  sif_mgr               
 26432       2      0%      0%      0%  S                0  kworker/0:0           
 26328   26323      0%      0%      0%  S            25904  python2.7             
 26323   26209      0%      0%      0%  S             1632  rdope.sh              
 26254       1      0%      0%      0%  S             1904  rotee                 
 26209       1      0%      0%      0%  S             3112  pman.sh               
 24634   24627      0%      0%      0%  S              928  sntp                  
 24627       1      0%      0%      0%  S             1784  stack_sntp.sh         
 21824   20904      0%      0%      0%  S             2720  btelnet               
 20904   20858      0%      0%      0%  S             4504  btelnet.sh            
 20858   20020      0%      0%      0%  S             1828  bexec.sh              
 20702   20642      0%      0%      0%  S             2560  journalctl            
 20642   20037      0%      0%      0%  S             8316  plogd                 
 20320       1      0%      0%      0%  S             1904  rotee                 
 20199   19705      0%      0%      0%  S            29484  repm            

 

 

The output above gives me a more granular view of the behaviour of my IOS-XE router and/or switch.
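
For anyone collecting this output regularly, here is a small Python sketch that pulls the per-core five-second figures and the top process out of the text. It is written against the sample above only (shortened to the first four processes); the regular expression and function name are my own, and other releases may format the output differently.

```python
import re

# Shortened copy of the sample output above.
SAMPLE = """\
CPU utilization for five seconds:  3%, one minute:  3%, five minutes:  3%
Core 0: CPU utilization for five seconds:  1%, one minute:  2%, five minutes:  4%
Core 1: CPU utilization for five seconds:  1%, one minute:  2%, five minutes:  2%
Core 2: CPU utilization for five seconds:  2%, one minute:  2%, five minutes:  2%
Core 3: CPU utilization for five seconds:  5%, one minute:  4%, five minutes:  3%
   Pid    PPid    5Sec    1Min    5Min  Status        Size  Name
--------------------------------------------------------------------------------
 10659   10015      5%      5%      5%  R           214452  fed main event
 19730   18997      3%      3%      3%  S            63436  fman_fp_image
 10956   10342      1%      1%      1%  R            14196  hman
  8487    8154      1%      1%      1%  S            80128  sif_mgr
"""

CORE = re.compile(r"^Core (\d+): CPU utilization for five seconds:\s+(\d+)%")


def parse_cpu_platform(text):
    """Return ({core: five_sec_pct}, [(process_name, five_sec_pct), ...])."""
    cores = {}
    procs = []
    for line in text.splitlines():
        m = CORE.match(line)
        if m:
            cores[int(m.group(1))] = int(m.group(2))
            continue
        # Process rows have 8 columns; the name is last and may contain spaces.
        fields = line.split(None, 7)
        if len(fields) == 8 and fields[0].isdigit():
            procs.append((fields[7].strip(), int(fields[2].rstrip("%"))))
    return cores, procs


cores, procs = parse_cpu_platform(SAMPLE)
hottest_core = max(cores, key=cores.get)
top_process = max(procs, key=lambda p: p[1])
print("hottest core:", hottest_core, f"({cores[hottest_core]}%)")
print("top process:", top_process)
```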

Hope this helps.

 

Anyone want to see a command that behaves very similar to Linux/Unix "top" command? 

sh proc cpu monitor interval <5 to 3600> 

WARNING:  Use this option with extreme caution.  

 

Thank you Leo for the complete answer. I have a question though: what do you mean by "multi-core"/"multi-CPU" appliances? A stack of multiple switches, for example? Can a single ISR router have multiple CPUs? In my case I have only that one router.

But overall, your answer is helpful, thank you.


@TrickTrick wrote:

what do you mean by "Multi-core" "Multi-CPU" appliances


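Strictly speaking, a "core" is one processing unit, and a multi-core CPU packs several of them onto one chip.  In the example output above, each switch reports four CPUs (0 to 3).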

IOS-XE was developed for multi-CPU (two or more CPUs)/multi-core appliances (i.e. routers, switches, firewalls, etc.).  Classic IOS, apparently, does not cope well with a multi-core environment.  Routers such as the ASR, ISR 4k and Catalyst 8k are multi-core and run IOS-XE.

See what output gets generated with the command: 

sh platform software status control-processor brief

The output is self-explanatory.

I issued the command above and found something interesting: memory used is at 99%, yet the status is somehow "Healthy". I have attached a picture of the output for both memory and CPU.

1.jpg

If my read is correct, how can I check which process is eating that much of memory?


@TrickTrick wrote:

how can I check which process is eating that much of memory? 


sh processes memory platform sorted location r0

What firmware is this router running on?

The current firmware is "isr4300-universalk9.16.09.06".

Here's the output of that command showing the process details. The system itself is the top entry. The strange thing is that I checked 4 other routers and they show the same memory usage (all 99%), yet they have no issues and calls are handled perfectly.

2.jpg

2nd screenshot shows 1429068 free memory. 

(1429068 / 3950556) x 100 = 36.17% free memory

Not sure what is going on with the first screenshot. 
You may need to capture the output of #1 and #2 several times to determine whether there really is a discrepancy.

I will do that several times, just out of curiosity, to see whether I get the same result as you. I tried the same calculation you did above, but I got (1429068 / 39050556) x 100 = 3.66. Am I doing something wrong? If my calculation is correct, it almost matches the 1st screenshot.


@TrickTrick wrote:

39050556


I got my values wrong:  It should be 3950556. 
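
The two results differ only because of the extra digit in the total. A quick Python check using the figures quoted in this thread (the helper name is made up):

```python
def free_pct(free_kb: int, total_kb: int) -> float:
    """Free memory as a percentage of total memory."""
    return free_kb / total_kb * 100


FREE_KB = 1429068  # free memory quoted from the 2nd screenshot

correct_total = round(free_pct(FREE_KB, 3950556), 2)   # total as corrected above
mistyped_total = round(free_pct(FREE_KB, 39050556), 2)  # total with the extra digit

print(correct_total)
print(mistyped_total)
```

With the corrected total the free share is about 36%, which does not match the 99% used figure from the first screenshot, so the discrepancy between the two outputs remains open.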

Is it possible there is some delay in the signalling path that is causing the delay? Once two way audio is established, does the conversation act properly? If so, are there any firewalls in play? Perhaps an issue with the interface bindings for either signalling or media. What protocol is the gateway using? H.323, SIP, MGCP, SCCP?

Yes sir, once two-way audio is established the conversation behaves properly.

Well, there's a firewall between the voice gateway and CUCM, since it's a centralized deployment with several remote sites. But then how do I explain that rebooting the ISR fixes the problem temporarily (for 4 days or so) before it starts acting that way again?

"Perhaps an issue with the interface bindings for either signalling or media": Hmm, how can I check that, please?

The gateway is using H.323

thank you
