04-14-2010 09:23 AM
I need to know which PID is using the CPU when utilization reaches 50%.
All I see on the syslog server is "%HA_EM-2-LOG: highcpu: HIGH CPU".
EEM doesn't tell me which process is consuming the CPU at 50%.
Any ideas?
event manager applet highcpu
event snmp oid 1.3.6.1.4.1.9.9.109.1.1.1.1.10.1 get-type exact entry-op ge entry-val 50 poll-interval 5
action 1.0 cli command "enable"
action 2.0 cli command "show proc cpu sorted"
action 3.0 syslog priority critical msg "HIGH CPU"
Francisco.
04-15-2010 03:28 PM
Joe,
One of our C6500 WAN switches, which has an EBGP peering with our ISP, appears to have its BGP hold timer expiring frequently during the day (really annoying). During those events we see very high CPU. We are not sure at this stage what is causing it: we are logging a lot of ACL hits, which fill the logging buffer quickly, and that is why we are asking for your assistance in capturing the cause of the high CPU.
We will definitely use the Tcl script you provided in production, but is there any chance you could modify it to capture only overall CPU utilization at 50%?
Francisco.
04-15-2010 03:38 PM
Right now it does. That is, the script will not execute unless the OVERALL CPU usage is at or above 50%. It will then email all processes which have a five second CPU utilization above 0%. This way, you can get an idea of ALL CPU consumers that contributed to the 50% overall CPU utilization. I thought this is what you would want. However, if you want to only see processes at or above 50% on their own, I can do that, but I do not think the script will provide useful data at that point.
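For readers without the attachment, the per-process filtering described above can be sketched in plain Tcl. This is only an illustration, not the attached script: the proc name busy_processes, the regular expression, and the sample text are assumptions based on the usual "show proc cpu sorted" column layout.

```tcl
# Hypothetical helper, not taken from the attached script.
# Returns a list of {process-name pct} pairs for every process whose
# five-second CPU value is strictly greater than min_pct.
proc busy_processes {output min_pct} {
    set result [list]
    foreach line [split $output "\n"] {
        # Columns: PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
        if {[regexp {^\s*\d+\s+\d+\s+\d+\s+\d+\s+([0-9.]+)%\s+[0-9.]+%\s+[0-9.]+%\s+\d+\s+(.+)$} $line -> pct name]} {
            if {$pct > $min_pct} {
                lappend result [list [string trim $name] $pct]
            }
        }
    }
    return $result
}

# Made-up sample in the usual "show proc cpu sorted" layout (not real data).
set sample {CPU utilization for five seconds: 52%/10%; one minute: 30%; five minutes: 25%
 PID Runtime(ms)   Invoked      uSecs   5Sec   1Min   5Min TTY Process
 123     1234567   1234567        100 30.15% 10.00%  5.00%   0 BGP Router
  45      123456    123456         50 12.00%  3.00%  2.00%   0 IP Input
   3           0         1          0  0.00%  0.00%  0.00%   0 Init}

# Print every process with a non-zero five-second value.
foreach entry [busy_processes $sample 0] {
    puts "[lindex $entry 0]: [lindex $entry 1]%"
}
```

Passing 0 as min_pct reproduces the "above 0%" behavior described here; passing a higher value narrows the list.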
--
Please support CSC Helps Haiti
https://supportforums.cisco.com/docs/DOC-8895
https://supportforums.cisco.com
04-15-2010 04:00 PM
Joe.
Thank you very much for your assistance.
I will test it on the weekend.
Francisco
04-15-2010 04:24 PM
I made one modification which may help. In this version, the processes will be listed in descending order relative to their CPU consumption. Previously, the processes were listed by hash value.
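As an illustration of the ordering change (a sketch, not the attached script): Tcl arrays are hash tables, so "array names" returns keys in no useful order. One way to list entries by descending CPU is to copy them into a list of pairs and sort on the numeric field. The array name cpu and its values are made up for the example.

```tcl
# Hypothetical data: process name -> five-second CPU percent.
array set cpu {"BGP Router" 30.15 "IP Input" 12.00 "ARP Input" 4.50}

# Copy the array into a list of {name pct} pairs.
set pairs [list]
foreach name [array names cpu] {
    lappend pairs [list $name $cpu($name)]
}

# Sort on the second element of each pair, highest value first.
set sorted [lsort -real -decreasing -index 1 $pairs]

foreach pair $sorted {
    puts [format "%-12s %6.2f%%" [lindex $pair 0] [lindex $pair 1]]
}
```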
04-15-2010 06:18 PM
Sorry, I had a typo in that last version. Try this one instead.
04-16-2010 04:45 AM
Hey Joe,
I don't see much difference in the output I am seeing on the syslog server!
The high CPU threshold "50" definitely means 0.50% and higher, not 50.00% and higher. The problem is that we are receiving far too many syslog messages for processes using little CPU, and we could easily miss what we need to see. That is why I need the Tcl script to trigger when CPU is at 50.00% or above. I am not interested in any process using less than 50.00%.
Any chance you could modify the script to do that? If not, no worries.
Thanks Joe.
Francisco.
04-16-2010 09:40 AM
The threshold is 50%. That is, the script will not run at all unless the five second CPU utilization of the device as a whole is greater than or equal to 50%. When that occurs, the script will parse the output of "show proc cpu sorted". For every process which has a non-zero five second CPU utilization value, that process name will be sent out via a syslog with its five second CPU utilization value. Again, the reason for this is that multiple processes could be contributing to the overall CPU utilization of 50%. There may not be one single process taking 50% or more CPU. In that case, if you only printed processes that had a 50% CPU utilization value, your syslog would have no processes.
Now, if you are only interested in processes which have a 50% utilization value, you need to change the design of your script from one that looks at the overall system CPU usage to one that runs periodically, parses the output of "show proc cpu sorted", and only sends a syslog when one or more processes are taking up at least 50% of the CPU. Is this what you want?
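A rough sketch of that alternative design follows. It is not the script attached later in the thread and runs only as a registered EEM Tcl policy, not in a standalone tclsh. The timer name high_cpu, the 60-second interval, the threshold variable, and the regular expression (which assumes the usual "show proc cpu sorted" column layout) are all assumptions.

```tcl
# Run every 60 seconds; the name and interval are illustrative choices.
::cisco::eem::event_register_timer watchdog name high_cpu time 60

namespace import ::cisco::eem::*
namespace import ::cisco::lib::*

# Per-process five-second threshold, in percent (assumed value).
set threshold 50

# Open a CLI session and collect the process table.
if {[catch {cli_open} result]} {
    error $result $errorInfo
}
array set cli $result

if {[catch {cli_exec $cli(fd) "enable"} result]} {
    error $result $errorInfo
}
if {[catch {cli_exec $cli(fd) "show proc cpu sorted"} output]} {
    error $output $errorInfo
}
catch {cli_close $cli(fd) $cli(tty_id)}

# Send a syslog only for processes at or above the threshold.
foreach line [split $output "\n"] {
    # Columns: PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
    if {[regexp {^\s*\d+\s+\d+\s+\d+\s+\d+\s+([0-9.]+)%\s+[0-9.]+%\s+[0-9.]+%\s+\d+\s+(.+)$} $line -> pct name]
            && $pct >= $threshold} {
        action_syslog priority crit msg "HIGH CPU: [string trim $name] at $pct%"
    }
}
```

With this design nothing is logged when the load is spread across many processes that each stay under the threshold, which is the trade-off described above.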
04-16-2010 09:55 AM
Yes, Joe. I am only interested in processes which have a 50% utilization value.
Francisco.
04-16-2010 10:35 AM
04-19-2010 01:42 PM
Joe,
I have uploaded the script to flash and am trying to register it, but I get the error below.
R1(config)#event manager policy tm_alert_high_cpu.tcl
Compile check and registration failed:Wrong # args, usage is "::cisco::eem::event_register_timer watchdog|countdown|absolute|cron name ? cron_entry ? time ? queue_priority normal|low|high maxrun ? nice ?"
while executing
"::cisco::eem::event_register_timer watchdog time $high_cpu_poll_freq
"
Tcl policy execute failed: Wrong # args, usage is "::cisco::eem::event_register_timer watchdog|countdown|absolute|cron name ? cron_entry ? time ? queue_priority normal|low|high maxrun ? nice ?"
Embedded Event Manager configuration: failed to retrieve intermediate registration result for policy tm_alert_high_cpu.tcl: Unknown error 0
R1(config)#event manager policy tm_alert_high_cpu.tcl
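Reading the usage string in this error, the parser expects a "name" argument that the failing call does not supply. The corrected script is not shown in the thread, so the following is only a guess at the kind of change that would satisfy it; high_cpu_timer is a hypothetical timer name.

```tcl
# Failing call from the error output (no "name" argument):
#   ::cisco::eem::event_register_timer watchdog time $high_cpu_poll_freq
# Possible fix, assuming this IOS release requires a timer name:
::cisco::eem::event_register_timer watchdog name high_cpu_timer time $high_cpu_poll_freq
```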
04-19-2010 08:53 PM
04-20-2010 06:42 AM
PERFECTO. Excellent stuff Joe.
Another rating goes to you.
Thanks
Francisco.
10-11-2010 12:37 PM
Joe,
I'm in a similar position, where it appears the "BGP Router" process may be running so frequently because of our EBGP peer, from whom we receive the full routing table. They constantly run some algorithm that changes their routes for optimal routing, which seems to be causing them to send us updates ranging from 15-30 on average, with spikes as high as 100+. Our other BGP router peers with two ISPs and gets full routing tables from both. It has twice as many routes as the one in question but rarely has the same problem. Its BGP updates are only around 1-5 on average, with maybe a spike up to 20-25. Nothing like the one in question.
I was hoping to use EEM to get a better picture of which process is taking up the majority of the CPU when the CPU spikes to over 75% over the 5-second interval. I want to confirm that it's the BGP Router process that is causing our problem.
What's making me look at BGP is that we use Nagios as our NMS. We defined a service so that every 5 minutes it runs a Perl script to get information on all our devices' interfaces. Periodically it will alert us that this router is not responding, but 5 minutes later it's OK. Now, I know that I can adjust one setting in Nagios that might stop it from alerting on what I feel is a false alarm, but I want to confirm whether it's the CPU spikes from the BGP Router process that stop the router from responding to Nagios.
s72033-advipservicesk9_wan-mz.122-33.SXI1.bin <---- IOS currently running on our 7200 router.
Any help would be appreciated.
Lee
10-11-2010 03:46 PM
Please start a new thread for your issue.