07-22-2010 08:03 AM
I work on a network performance team, and we were asked to monitor CPU utilization on a Cisco 7613. Management thought that turning on SNMP would cause the CPU to be overutilized. We usually poll devices via SNMP to gather this information, but because of the concern about CPU failures we are logging directly into the router and monitoring CPU levels with the "sh proc cpu" and "sh proc cpu hist" commands. Since the graphs provided are coarse, it is hard to come up with solid percentage numbers. We also saw numbers that didn't match between the hourly (72-hour) graph and the minute graph: the minute graph showed somewhere between 0-10%, while the hourly graph showed a spike of 90-100% within that same hour. My questions are: What are the capture points for these graphs? Is there a way to get this data in a raw format? This would help us come up with more accurate numbers to show that turning on SNMP should not be a problem for CPU utilization.
Thanks,
Jim
08-10-2010 12:49 PM
Jim,
Your management is correct that SNMP can cause high CPU utilization in certain scenarios, depending on the configuration you are running; for example, performing BGP logging on a router with a full table. However, high CPU doesn't mean that packets will always be dropped by a busy router. As a very simplified answer, routing and switching have higher CPU priority than logging. Unless there is some special packet inspection, packets will generally continue to be forwarded even when the processor is at high utilization. * (If you are interested, there is a longer explanation at the end of this post.)
More specifically addressing your question, my understanding of the show processes cpu command is that the utilization percentages are averages over the stated time period. So the 5-minute CPU utilization is the average over the last 5 minutes.
The show processes cpu history command is somewhat different in that it shows both the peak utilization and the average utilization for each time period. I believe this explains the discrepancy between the CPU utilizations that you saw. For any number of reasons, including certain show commands, the CPU can spike for a moment even during light usage. (Again, this doesn't necessarily mean there would be a network outage.) The history output shows you these spikes, while the average is more indicative of the demands the router is seeing over time. In the history output, the # characters denote the average usage and the * characters indicate the peak usage. (See the example below.)
router#show processes cpu history
!--- One minute output omitted
    6665776865756676676666667667677676766666766767767666566667
    6378016198993513709771991443732358689932740858269643922613
100
 90
 80        *  *                     * *     *  * *  *
 70 * * ***** *  ** ***** ***  **** ******  *  *******     * *
 60 #***##*##*#***#####*#*###*****#*###*#*#*##*#*##*#*##*****#
 50 ##########################################################
 40 ##########################################################
 30 ##########################################################
 20 ##########################################################
 10 ##########################################################
   0....5....1....1....2....2....3....3....4....4....5....5....
             0    5    0    5    0    5    0    5    0    5
              CPU% per minute (last 60 minutes)
             * = maximum CPU%   # = average CPU%
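To illustrate why a peak and an average for the same period can be so far apart, here is a toy calculation in Python. The one-second sampling interval and the sample values are assumptions made up for the illustration; they are not how the router is documented to sample, and they are not measurements from any device.

```python
from statistics import mean

# Hypothetical one-second CPU samples for one hour (3600 values):
# the router idles at 5% except for a two-second spike to 95%.
samples = [5] * 3598 + [95] * 2

average = mean(samples)   # what an "average" marker would summarize
peak = max(samples)       # what a "maximum" marker would summarize

print(f"average = {average:.2f}%, peak = {peak}%")
```

A spike lasting a couple of seconds barely moves the hourly average, yet it sets the hourly maximum, which matches the kind of gap described in the question.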
As for getting this data in a raw format, nothing occurs to me at the moment. If I think of something I will add it to this post.
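One partial workaround, offered as a sketch rather than a supported method: the digit rows printed above the history graph are the per-column peak values read top to bottom, so they can be turned into numbers after capturing the output. The function name peaks_from_digit_rows is hypothetical, and the code assumes the captured rows keep their column alignment.

```python
def peaks_from_digit_rows(rows):
    """Return per-column peak CPU% from the digit rows above the graph.

    Each column is read top to bottom, so '6' over '3' means 63.
    Assumes the rows are still column-aligned in the captured text.
    """
    width = max(len(row) for row in rows)
    padded = [row.ljust(width) for row in rows]
    peaks = []
    for column in zip(*padded):
        digits = "".join(column).strip()
        if digits:  # skip columns that are blank in every row
            peaks.append(int(digits))
    return peaks


# The two digit rows from the example output in this post.
tens = "6665776865756676676666667667677676766666766767767666566667"
ones = "6378016198993513709771991443732358689932740858269643922613"

peaks = peaks_from_digit_rows([tens, ones])
print(peaks[:8])               # first eight per-minute peaks
print(max(peaks), min(peaks))  # highest and lowest peak in the window
```

This only recovers the peaks. The averages are still limited to the 10% resolution of the # rows.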
Hope this helps, and you may have already looked at this but for reference:
http://www.cisco.com/en/US/products/sw/iosswrel/ps1828/products_tech_note09186a00800a65d0.shtml
-Brian
* An engineer can see the split between this high-priority utilization and lower-priority utilization by using show processes cpu and looking at the two five-second percentages (8%/4% in the output below). The first percentage is the total CPU utilization; the second is the portion of that total spent at interrupt level, so the difference between them is process-level utilization. If the percentages are nearly equal and high (e.g. 95%/93%), then generally the router is processing more traffic than it is designed for, there is some kind of special packet processing going on, there is a bug or issue that needs to be escalated to TAC, or some combination of those.
router#show processes cpu
CPU utilization for five seconds: 8%/4%; one minute: 6%; five minutes: 5%
 PID  Runtime(uS)  Invoked  uSecs   5Sec   1Min   5Min TTY Process
   1          384    32789     11  0.00%  0.00%  0.00%   0 Load Meter
   2         2752     1179   2334  0.73%  1.06%  0.29%   0 Exec
   3       318592     5273  60419  0.00%  0.15%  0.17%   0 Check heaps
!--- output omitted
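If the output is being captured from a terminal session anyway, the summary line can be split into numbers with a few lines of Python. This is a minimal sketch: the function name parse_cpu_summary and the dictionary keys are hypothetical, and it assumes the summary line has exactly the format shown above, with the first five-second figure being total utilization and the second the interrupt-level share.

```python
import re

# Matches the summary line printed at the top of "show processes cpu".
SUMMARY = re.compile(
    r"CPU utilization for five seconds: (\d+)%/(\d+)%; "
    r"one minute: (\d+)%; five minutes: (\d+)%"
)


def parse_cpu_summary(line):
    """Split the summary line into integer percentages."""
    match = SUMMARY.search(line)
    if match is None:
        raise ValueError("not a 'show processes cpu' summary line")
    total, interrupt, one_min, five_min = map(int, match.groups())
    return {
        "total_5s": total,
        "interrupt_5s": interrupt,
        # Process-level share is what remains after interrupt time.
        "process_5s": total - interrupt,
        "one_min": one_min,
        "five_min": five_min,
    }


line = "CPU utilization for five seconds: 8%/4%; one minute: 6%; five minutes: 5%"
print(parse_cpu_summary(line))
```

Logging these values on a schedule gives a plain numeric series to compare against the graphs.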