04-19-2006 07:14 AM - last edited on 03-25-2019 03:03 PM by ciscomoderator
For the past couple of days I have been seeing a problem where some hosts connected to a 4507R stop responding to ping, and I see huge latency of up to 400 msec. (Normally it is 1 msec because they are directly connected hosts.) I couldn't find anything in the logs or any interface errors, but I ran show proc cpu history and found that the CPU went up to 100% in the last 60 sec. Does anyone know why it is doing this? Do I need a reboot?
Thanks.
04-19-2006 07:19 AM
Hi Nawas,
Where are you pinging from? If you're pinging from the switch, you'll definitely see inconsistent response times, especially if the CPU gets pegged.
If you're pinging from another host connected through the switch, then we shouldn't see any impact on response times unless, for some reason, the ICMP packets are being process switched.
Do you know what processes were spiking?
regards,
Bobby
04-19-2006 07:27 AM
Hi Bobby
I'm pinging from a different host that is directly connected to this switch, and our monitoring system (HP OV, also on the same VLAN) sends pages/email when it loses pings. The highest processes I see in show proc cpu are
26 360995572 2129699394 169 9.03% 9.79% 9.92% 0 Cat4k Mgmt HiPri
27 3007793576 4176720830 720 8.47% 7.57% 7.61% 0 Cat4k Mgmt LoPri
but when I do show proc cpu history I see a spike of about 99%, and that doesn't show which process is responsible. Here is the capture:
xec-4507-25#sh proc cpu history
1111122222222221111111111111111111122222333331111122222111
6666600000000007777799999777777777788888111115555511111666
100
90
80
70
60
50
40
30 **********
20 **********************************************************
10 **********************************************************
0....5....1....1....2....2....3....3....4....4....5....5....
0 5 0 5 0 5 0 5 0 5
CPU% per second (last 60 seconds)
4444554444444455444443435343434375534343434352424342429942
6644889955996688224442328852534394512442231368995366779978
100 **
90 **
80 * **
70 * **
60 ** ** * * * * **
50 ** ************ * * * *** * * * * * ***
40 ********************* * *** * * *** * * * * * * * * * *#*
30 ********************************##********************##**
20 ##########################################################
10 ##########################################################
0....5....1....1....2....2....3....3....4....4....5....5....
0 5 0 5 0 5 0 5 0 5
CPU% per minute (last 60 minutes)
* = maximum CPU% # = average CPU%
9977787677655587777777766675556677877787666555777776777766755586677777
9571002931794143410410487627389561511711952155461009150261467110632133
100 **
90 ** *
80 *** * * * * ** * * *
70 *********** ************* ************ *********** * * ******
60 ************ ************** ************** ***************** ********
50 **********************************************************************
40 **********************************************************************
30
04-19-2006 07:35 AM
The HiPri and LoPri processes are normal. Your average CPU is also in the normal range, while the max CPU is clearly hitting 100% within the last 60 minutes. So we can certainly infer that there are occasional spikes.
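For anyone reading the capture: the two rows of digits above each history graph spell out the per-column maximum when read top to bottom. A minimal Python sketch of that reading is below. The function name `decode_history_rows` is made up for illustration, and it assumes the digit rows were pasted without losing column alignment, which holds for the all-digit rows of the "last 60 minutes" graph in this thread.

```python
from itertools import zip_longest


def decode_history_rows(digit_rows):
    """Read stacked digit rows of a CPU history graph column by column.

    Each column spells one value from top to bottom, e.g. '9' over '9' is 99.
    """
    values = []
    for column in zip_longest(*digit_rows, fillvalue=" "):
        # Drop padding so a blank top digit does not break int().
        digits = "".join(column).replace(" ", "")
        if digits:
            values.append(int(digits))
    return values


# The two digit rows above the "last 60 minutes" graph, copied from the capture.
per_minute_rows = [
    "4444554444444455444443435343434375534343434352424342429942",
    "6644889955996688224442328852534394512442231368995366779978",
]

peaks = decode_history_rows(per_minute_rows)
worst = max(peaks)
print("peak CPU%:", worst)
print("columns at peak:", [i for i, v in enumerate(peaks) if v == worst])
```

For the rows above, the peak decodes to 99 in two adjacent columns, which matches the two-column spike drawn in the graph.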
The question is whether this truly correlates with the host response time. The first thing we need to determine is whether the spikes are caused by a process or by CPU switched traffic.
You'll need to capture the output of "show proc cpu" and "show platform health" and catch it while the CPU is spiked so that we can determine what's causing it. If it is very intermittent, you may have a hard time catching it, so if you have a way of scripting it with a cron job, that may be easier.
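As a rough idea of what such a script could look like, here is a Python sketch that polls "show proc cpu", reads the five-second figure from the summary line, and saves both outputs whenever it crosses a threshold. It is only an outline under assumptions: `run_command` is a hypothetical callable that you would wire to whatever SSH or Telnet session you already use, and the function names, the 90% threshold, and the polling interval are placeholders, not anything prescribed by the platform.

```python
import re
import time

# Summary line at the top of "show proc cpu" on IOS. Illustrative example:
# "CPU utilization for five seconds: 99%/45%; one minute: 40%; five minutes: 35%"
# The first number is total CPU, the second is time spent at interrupt level.
CPU_LINE = re.compile(r"CPU utilization for five seconds: (\d+)%/(\d+)%")


def five_second_cpu(show_proc_cpu_output):
    """Return (total %, interrupt %) from the summary line, or None if absent."""
    match = CPU_LINE.search(show_proc_cpu_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def capture_on_spike(run_command, threshold=90, polls=60, interval=5.0,
                     sleep=time.sleep):
    """Poll the switch and keep full outputs whenever CPU >= threshold.

    run_command is a hypothetical callable: command string -> output string.
    """
    captures = []
    for _ in range(polls):
        proc_cpu = run_command("show proc cpu")
        reading = five_second_cpu(proc_cpu)
        if reading is not None and reading[0] >= threshold:
            # Grab the second command right away, while the spike is live.
            health = run_command("show platform health")
            captures.append({
                "when": time.strftime("%Y-%m-%d %H:%M:%S"),
                "show proc cpu": proc_cpu,
                "show platform health": health,
            })
        sleep(interval)
    return captures
```

The device access is passed in as a function so the polling logic can be tried offline with canned output before pointing it at the real switch.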
Also, here's a good doc on t-shooting CPU issues on this platform.
http://www.cisco.com/en/US/products/hw/switches/ps663/products_tech_note09186a00804cef15.shtml
HTH,
Bobby
*Please rate helpful posts.