This is an opportunity to learn and ask questions about high CPU conditions that you might be facing in your environment, and about troubleshooting them with the tools and techniques available within the platform, with Cisco expert Vinit Jain.
Ask questions from Monday, May 11th, 2015 to Friday, May 22, 2015
A high CPU condition is a very common problem in production environments and can have a huge impact on services if it is not taken care of in time. High CPU can be classified primarily into two categories: 1) high CPU due to a process and 2) high CPU due to interrupts (traffic). Cisco expert Vinit Jain will cover and answer all of your questions about troubleshooting high CPU on Cisco IOS.
Vinit Jain, 3X CCIE #22854, is a Technical Lead in the HTTS (High Touch Technical Support) team, supporting customers in the areas of routing, MPLS, TE, IPv6, multicast, and a wide variety of platform issues such as high CPU and memory leaks across the IOS, IOS XE, IOS XR, and NX-OS code bases. He has been delivering training within Cisco on various technology and platform troubleshooting topics, and has written a workbook on IOS XR fundamentals on the Cisco Support Community. Vinit holds CCIEs in R&S, SP, and Security, along with multiple certifications in programming and databases.
**Ratings Encourage Participation!**
Please be sure to rate the Answers to Questions
Yes, there are scenarios in which an IGP flap can cause the CPU to spike. The question is how to approach the problem: if we start by troubleshooting the high CPU, it will lead us to look at the IGP flaps.
Suppose BFD is flapping and that is causing OSPF to flap. In that case we can use the script below to troubleshoot the problem:
event manager applet OSPF_Monitor
 event syslog pattern "Neighbor Down: BFD node down"
 action 1.01 syslog priority critical msg "**** BFD Failure Detected - Statistics Logged ****"
 action 1.02 cli command "enable"
 action 1.03 cli command "show clock | append bootdisk:cpu_stats"
 action 1.04 cli command "show proc cpu sort | append bootdisk:cpu_stats"
 action 1.05 cli command "debug netdr cap rx"
 action 1.06 cli command "show netdr cap | append bootdisk:cpu_stats"
 action 1.07 cli command "undebug all"
 action 1.08 cli command "end"
The above applet performs a netdr capture when BFD flaps, to see which packets are hitting the CPU. The capture can then be decoded to further understand what is happening on the router. We can also add commands related to BFD or OSPF to the same applet.
If we don't know which process or protocol is causing the high CPU, or when it happens, we can configure another EEM script on the router that is triggered when the CPU spikes:
event manager applet HIGHCPU
 event snmp oid "1.3.6.1.4.1.9.9.109.1.1.1.1.3.1" get-type exact entry-op gt entry-val "90" exit-op lt exit-val "70" poll-interval 5 maxrun 200
 action 1.0 syslog msg "START of TAC-EEM: High CPU"
 action 1.1 cli command "enable"
 action 1.3 cli command "debug netdr clear-capture"
 action 1.4 cli command "debug netdr capture rx"
 action 2.0 cli command "sh clock | append disk0:proc_CPU"
 action 2.1 cli command "show process cpu sorted | append disk0:proc_CPU"
 action 2.2 cli command "show proc cpu history | append disk0:proc_CPU"
 action 2.3 cli command "show netdr capture | append disk0:proc_CPU"
 action 3.1 cli command "show log | append disk0:proc_CPU"
 action 4.0 syslog msg "END of TAC-EEM: High CPU"
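In the above EEM script, the applet is triggered when high CPU is detected: it fires when the polled value rises above the entry value (90) and re-arms once the value drops below the exit value (70). Both thresholds can be adjusted to set the CPU range on which the trigger occurs.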
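To make the entry/exit behavior concrete, here is a minimal Python sketch of the idea. It is only a simplified model of the thresholds and not how EEM is implemented; the function name `count_triggers` and the sample values are made up for illustration.

```python
def count_triggers(samples, entry_val=90, exit_val=70):
    """Return the sample indexes at which a threshold event would fire.

    Simplified model (an assumption, not EEM internals): the event fires
    when a sample exceeds entry_val, then stays disarmed until a sample
    drops below exit_val.
    """
    armed = True
    fired_at = []
    for i, cpu in enumerate(samples):
        if armed and cpu > entry_val:
            fired_at.append(i)   # entry condition met: run the actions once
            armed = False        # ignore further samples until exit is met
        elif not armed and cpu < exit_val:
            armed = True         # exit condition met: re-arm the event
    return fired_at


# Hypothetical CPU readings, one per poll interval
samples = [50, 92, 95, 85, 91, 65, 93, 60]
print(count_triggers(samples))
```

With the gap between the two values, a CPU that hovers around 90% does not fire the applet on every poll; it fires once and waits until the load has clearly dropped.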
The more important question is why the IGP is flapping. It could be due to packet drops, MTU issues, or a rate limiter dropping legitimate packets, among other causes.
Hope this helps.
PS: Please do rate the replies if you find them useful.
I know it is not 5/11 yet, but I will ask the question and wait for the response until 5/11.
I have purchased several Cisco 2960-X switches. Out of the box, without ANY configuration, the CPU runs at about 40% to 42% all the time. One process in particular (Hulc LED Process) is using 22% of the CPU. We have also opened a ticket with TAC and they told us that this behavior is normal. Can you explain what is causing the high CPU and whether Cisco has anything on the road map to remedy this?
Could you please let me know which IOS version you are running? I have noticed this as a common problem on this platform. Based on my research, the "Hulc LED" process does the following tasks:
- Checks the link status on every port
- If the switch supports PoE, checks whether a Powered Device (PD) is detected
- Checks the status of the transceiver
- Updates the fan status
- Sets the main LED and port LEDs
- Updates both power supplies and the RPS
- Checks the system temperature status
So even if there is no production traffic, you might see this process consuming CPU on the switch. This is documented under CSCtg86211, which describes it as expected behavior rather than a software defect.
Hope this information helps.
I don't think this behavior has changed, and I don't see any clear indication of when it is going to be changed or fixed.
If your query has been answered, please do rate the post and mark the answer as complete.
If you have any further questions, please feel free to ask them as well.
On a quick note, are there any cables connected to the switch? If you run show ip int brief, what status do you see? Are the ports in the up state?
If the device is not in production, can you try unplugging the ports, or maybe shutting them down with the shutdown command, and see whether that changes the CPU utilization? Based on what this process does, it should bring the utilization down, even if only by a few CPU cycles.
We have a 7200 series router which shows high CPU. The problem started after we enabled NetFlow on the router. There has been no change in traffic, though. Is this a known issue? How can we fix it?
RA01#show proc cpu sorted | ex 0.00
CPU utilization for five seconds: 97%/57%; one minute: 97%; five minutes: 96%
 PID Runtime(ms)    Invoked  uSecs   5Sec   1Min   5Min TTY Process
  95   122265420  537588673    227 35.11% 34.27% 34.32%   0 IP Input
 179     3734124   23873626    156  2.23%  2.44%  2.17%   0 LFDp Input Proc
   9   103157352  312479671    330  1.03%  1.14%  1.08%   0 ARP Input
 312      732648 2234277784      0  0.63%  0.57%  0.56%   0 IP SLAs Responde
 170    23292176   13980278   1666  0.23%  0.22%  0.23%   0 CEF: IPv4 proces
 303     9053864  329693755     27  0.15%  0.18%  0.17%   0 PPP Events
 168      166280 2234199103      0  0.15%  0.12%  0.13%   0 HQF Output Shape
 329    39411164  287153755    137  0.15%  0.32%  0.32%   0 BGP Router
I would like to know which IOS version you are running. If you remove the NetFlow configuration, does the CPU return to normal?
We are running 15.1(1)S release.
For the second question: yes, when we remove the configuration, the CPU normalizes. We have a similar configuration on a 7600 router but we don't see high CPU there. Is this something common on the 7200?
I believe this is a known issue, CSCtr92077. The problem here seems to be that NetFlow might be causing the packets to be software switched instead of fast switched.
Could you confirm whether you are using the Advanced IP Services image or the Advanced Enterprise image? If you are using the IP Services image, I would request that you try the Advanced Enterprise (adventerprise) image. That should probably fix the issue.
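As a side note on reading the output posted earlier in this thread: in the line `CPU utilization for five seconds: 97%/57%`, the first figure is the total CPU and the second is the portion spent at interrupt level, so the difference is the process-level load. The short Python sketch below just does that arithmetic on the posted line; the helper name `split_cpu` is made up for illustration.

```python
import re


def split_cpu(line):
    """Split the five-second CPU figures into total, interrupt and process."""
    m = re.search(r"five seconds:\s*(\d+)%/(\d+)%", line)
    if m is None:
        raise ValueError("no five-second CPU figures found")
    total, interrupt = int(m.group(1)), int(m.group(2))
    # Process-level load is whatever is left after interrupt-level load
    return {"total": total, "interrupt": interrupt, "process": total - interrupt}


line = "CPU utilization for five seconds: 97%/57%; one minute: 97%; five minutes: 96%"
print(split_cpu(line))  # {'total': 97, 'interrupt': 57, 'process': 40}
```

Here more than half of the load is at interrupt level, which points at traffic, and most of the remaining process-level load is in IP Input.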
Hope this helps.
Yes, we are using Adv IP services image. Will check internally to see if we can make a change.
Will keep you posted
Thanks for quick response.