05-05-2015 07:59 AM - edited 03-07-2019 11:52 PM
This is an opportunity to learn and ask questions about high CPU conditions that you might be facing in your environment, and about troubleshooting them with the tools and techniques available within the platform, with Cisco expert Vinit Jain.
Ask questions from Monday, May 11th, 2015 to Friday, May 22, 2015
A high CPU condition is a very common problem in production environments and can have a huge impact on services if it is not addressed in time. High CPU falls primarily into two categories: 1) high CPU due to a process and 2) high CPU due to interrupts (traffic). Cisco expert Vinit Jain will answer all of your questions about troubleshooting high CPU on Cisco IOS.
Vinit Jain, 3X CCIE #22854, is a Technical Lead in the HTTS (High Touch Technical Support) team, supporting customers in the areas of routing, MPLS, TE, IPv6, multicast and a wide variety of platform issues such as high CPU and memory leaks across the IOS, IOS XE, IOS XR and NX-OS code bases. He has been delivering training within Cisco on various technologies as well as platform troubleshooting topics. He has also written a workbook on IOS XR fundamentals on the Cisco Support Community. Vinit holds CCIEs in R&S, SP and Security, along with multiple certifications in programming and databases.
Vinit Jain will also be speaking at Cisco Live in June 2015 on Troubleshooting BGP (BRKRST-3320).
Find other events at https://supportforums.cisco.com/expert-corner/events.
**Ratings Encourage Participation!**
Please be sure to rate the answers to questions.
05-16-2015 12:51 PM
Hello,
I actually provided a different command. We need the "show region" output to perform CPU profiling.
Below is the link to perform CPU profiling:
http://www.cisco.com/c/en/us/support/docs/routers/7500-series-routers/41120-highcpu-interrupts.html#cd
Please perform the CPU profiling and share the logs. I am not a voice expert, but I can take a look to see whether anything can be done to mitigate it.
You might also want to check with the design/implementation team whether you need to upgrade to a different router for future needs. If voice traffic, or any other traffic that hits the CPU, increases further and spikes it up to 90%, the router might take a hit.
Regards,
Vinit
PS: Please do rate the posts if you find them useful.
05-16-2015 07:26 PM
05-16-2015 07:37 PM
Could you please share the show version output as well? I will need it to analyze the profiling.
Thanks,
05-16-2015 07:41 PM
05-16-2015 08:06 PM
The top functions consuming CPU cycles are:
========== TOP 20 FUNCTIONS ==========
(2631/46180) 5.70% mv64340_ge_rx_buffer_interrupt
(2380/46180) 5.15% action_regexp_ascii
(1950/46180) 4.22% memset
(1621/46180) 3.51% classify_packet
(1324/46180) 2.87% ipv4fib_les_switch_wrapper
(1204/46180) 2.61% voip_rtp_send_pak
Although the primary cause is interrupt traffic, I think the IP NBAR feature is also constantly consuming some CPU cycles.
Since this is production, I am not sure whether it is possible, but could you try removing the NBAR configuration from the interfaces and see if that reduces the CPU utilization?
Multiple actions being performed on the router at the same time are causing the CPU to sit at around 40-50%.
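For anyone who wants to work with the profiling summary offline, here is a minimal Python sketch that parses lines in the format shown above and recomputes each function's share from the raw counts. The function name `parse_profile` and the regular expression are my own assumptions based only on the six lines quoted in this post; other IOS releases may format the output differently.

```python
import re

# Matches lines such as "(2631/46180) 5.70% mv64340_ge_rx_buffer_interrupt".
# The pattern is an assumption derived from the output quoted in this thread.
LINE_RE = re.compile(r"\((\d+)/(\d+)\)\s+([\d.]+)%\s+(\S+)")

def parse_profile(text):
    """Return a list of (function, hits, total, percent) tuples."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.search(line)
        if m:
            hits, total = int(m.group(1)), int(m.group(2))
            rows.append((m.group(4), hits, total, float(m.group(3))))
    return rows

# The six lines quoted in the post above.
SAMPLE = """\
(2631/46180) 5.70% mv64340_ge_rx_buffer_interrupt
(2380/46180) 5.15% action_regexp_ascii
(1950/46180) 4.22% memset
(1621/46180) 3.51% classify_packet
(1324/46180) 2.87% ipv4fib_les_switch_wrapper
(1204/46180) 2.61% voip_rtp_send_pak
"""

rows = parse_profile(SAMPLE)
for name, hits, total, pct in rows:
    # Recompute the share from the raw counts as a sanity check.
    print(f"{name:32s} {hits:5d} {100 * hits / total:5.2f}%")

# Combined share of the listed functions, from the raw hit counts.
listed_share = 100 * sum(r[1] for r in rows) / rows[0][2]
print(f"listed functions together: {listed_share:.2f}%")
```

The six functions quoted here account for roughly 24% of the samples, so no single function dominates. That fits the picture of several activities (packet reception, classification, switching, RTP) sharing the CPU.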
Please let me know if you can try the above and if that helps reduce the CPU a bit.
Hope this helps.
Vinit
05-16-2015 10:01 PM
Vinit,
Thank you sir. We have removed NBAR protocol-discovery from the G0/0 sub-interfaces and from G0/1. We will monitor and update you shortly.
05-16-2015 10:33 PM
Quick note: the CPU is high not just because of NBAR but because of traffic as well. The top entry, as we can see, is interrupt related.
Did removing NBAR bring down the CPU?
05-18-2015 05:27 AM
Vinit,
It looks like our average CPU usage has gone down since disabling NBAR. We have still experienced a few spikes proportional to our call volume. We should have relatively high volume today in about 14 hours sir. We are planning to monitor and collect another CPU profile during that time.
05-18-2015 07:38 AM
Thanks for the update. Yes, it would be good to take another CPU profile during peak hours. If the CPU is high due to interrupts, then I would request you to open a TAC case to get suggestions from a voice expert.
05-20-2015 08:30 AM
Hello,
Did you get a chance to perform another round of CPU profiling during peak hours? Please let me know if you have any further questions.
Thanks,
Vinit
05-20-2015 06:20 PM
Vinit,
Just wanted to give you a quick follow up sir. We have been monitoring for a few days now and the average CPU usage is much lower when processing calls. Prior to disabling NBAR protocol-discovery on the interfaces, the CPU would sustain at 90% or more. Since disabling, we have had a few spikes but have not seen the processor sustain over 50%. Thanks so much for your help!
05-20-2015 06:41 PM
Really glad that I was able to help. Please feel free to reach out in case you have any further queries.
05-16-2015 04:41 AM
Hi Vinit
Suppose you can exclude high CPU due to interrupt (traffic) by interpreting the logs of the facilities you mentioned or by isolating the node(s), what steps do you recommend to further troubleshoot high CPU due to process? How would you exclude a bug or some conflicting config command if you sometimes don't have the same or similar platform to test on as a reference point, especially on a live network? Any way to trace the process and CPU utilization in more detail?
05-16-2015 06:09 AM
Hello,
High CPU utilization has to be troubleshot differently for different processes. Each process does a particular job, and based on that job we need to understand why it is taking too many CPU cycles. For example, the BGP scanner process runs every 60 seconds. If there are not too many routes, the CPU should be normal. But if, because of some change in the network, the number of routes or neighbors has increased, the CPU will start spiking every 60 seconds.
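To illustrate the BGP scanner example, here is a small Python sketch that checks whether CPU spikes recur at a fixed interval. It assumes you have already collected (timestamp, CPU%) samples by some means. The function name `spike_intervals`, the sample data and the 70% threshold are all hypothetical and chosen only for illustration.

```python
def spike_intervals(samples, threshold):
    """Return gaps (in seconds) between the starts of consecutive spikes.

    samples: list of (timestamp_seconds, cpu_percent) in time order.
    A spike starts when CPU goes from below `threshold` to at or above it.
    """
    starts = []
    above = False
    for ts, cpu in samples:
        if cpu >= threshold and not above:
            starts.append(ts)
        above = cpu >= threshold
    return [b - a for a, b in zip(starts, starts[1:])]

# Hypothetical 5-second samples: CPU idles at 20% and jumps to 85% every 60 s.
samples = [(t, 85 if t % 60 == 0 else 20) for t in range(0, 185, 5)]
intervals = spike_intervals(samples, threshold=70)
print(intervals)
```

If the gaps cluster around 60 seconds, a periodic process such as the BGP scanner is a reasonable first suspect. Irregular gaps point more towards traffic or some other event-driven cause.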
So to begin with, we can start by understanding what a particular process does. If we are not able to narrow down the problem through various commands, we can use CPU profiling on IOS platforms. It gives details of the top functions (the function calls in the code) that are consuming the most CPU cycles. This tool can be used when the CPU is high due to interrupts, and sometimes when it is high due to a process.
One more thing to note here: sometimes a process also indicates a traffic issue. For example, if the CPU is high due to the IP Input process, it still means the CPU is high due to traffic.
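As a first triage step, the summary line of "show processes cpu" already separates the two categories: in "five seconds: X%/Y%", X is the total utilization and Y is the part spent at interrupt level. Below is a minimal Python sketch that splits the two. The function name `classify_cpu`, the sample line and the 80% threshold are my own assumptions for illustration, not Cisco-defined values.

```python
import re

# Summary line format printed at the top of "show processes cpu".
CPU_RE = re.compile(r"CPU utilization for five seconds:\s*(\d+)%/(\d+)%")

def classify_cpu(line, high=80):
    """Split five-second CPU into interrupt and process shares.

    `high` is an arbitrary illustrative threshold, not a Cisco-defined value.
    """
    m = CPU_RE.search(line)
    if m is None:
        raise ValueError("not a CPU utilization summary line")
    total, interrupt = int(m.group(1)), int(m.group(2))
    process = total - interrupt
    if total < high:
        verdict = "normal"
    elif interrupt >= process:
        verdict = "interrupt"
    else:
        verdict = "process"
    return total, interrupt, process, verdict

# Illustrative sample line, not taken from any router in this thread.
sample = "CPU utilization for five seconds: 92%/71%; one minute: 88%; five minutes: 85%"
result = classify_cpu(sample)
print(result)
```

Keep in mind that a "process" verdict does not rule out traffic. As noted above, if the busy process turns out to be IP Input, the root cause is still traffic being punted to the CPU.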
The below link has an example of CPU profiling:
http://www.cisco.com/c/en/us/support/docs/routers/7500-series-routers/41120-highcpu-interrupts.html#cd
Hope this helps.
Please let me know if you have any further questions.
PS: Please do rate the posts if you think they are useful.
05-17-2015 11:48 PM
Hi Vinit,
I have a problem when I configure a GRE tunnel over the WAN (MPLS). I configured a GRE tunnel between a Cisco Catalyst 4500 and an SM-ES3G-16-P on an ISR 3945. At first, there was no problem at all. But when traffic (voice traffic) increases, the link becomes intermittent and the CPU on the SM-ES3G-16-P is high (68%). The top CPU process is IP Input (>60%). Is there any suggestion for this problem? Thanks