09-29-2011 07:31 AM - edited 03-04-2019 01:46 PM
Hi,
I need help with high CPU due to interrupts on a Cisco 1841 router.
# sh proc cpu | ex 0.00
CPU utilization for five seconds: 32%/29%; one minute: 35%; five minutes: 34%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
2 86888 9845513 8 0.08% 0.04% 0.02% 0 Load Meter
3 716 238 3008 0.32% 0.34% 0.11% 194 SSH Process
35 7147916 152455674 46 0.08% 0.11% 0.10% 0 IP SLAs Event Pr
38 265576 52029697 5 0.08% 0.10% 0.10% 0 Net Background
42 21111756 49227561 428 0.40% 0.31% 0.28% 0 Per-Second Jobs
74 345396 1536510634 0 0.16% 0.15% 0.13% 0 ACCT Periodic Pr
78 351720 1536510635 0 0.08% 0.09% 0.12% 0 IP ARP Retry Age
79 116284736 362961119 320 0.49% 0.33% 0.28% 0 IP Input
134 115880 492071349 0 0.08% 0.07% 0.08% 0 RBSCP Background
190 10153108 4794767 2117 0.08% 0.02% 0.03% 0 SNMP ENGINE
197 31480308 71788845 438 0.08% 0.07% 0.08% 0 BGP I/O
The baseline average CPU is 20%, and for the last 4 days it has been reaching 40%. One 20 Mb link peaks at around 17 Mb at times during the day, and it shows Total output drops: 785729 in 22 hours.
I can't find any reason here and am worried about these output drops increasing in such huge numbers. Could this be the cause of the high interrupt CPU? Please share your views on this.
router uptime and image -
--------------------------------------
uptime is 1 year, 29 weeks, 3 days, 14 hours, 42 minutes
System returned to ROM by power-on
System restarted at 03:34:55 IST Sun Mar 7 2010
System image file is "flash:c1841-spservicesk9-mz.124-15.T11.bin"
FastEthernet0/1 is up, line protocol is up
Hardware is Gt96k FE, address is 001c.5867.36e5 (bia 001c.5867.36e5)
Description: " Link to PDM_CORP_RTR2 on fa0/0/0 "
Internet address is 10.90.4.173/30
MTU 1500 bytes, BW 40960 Kbit/sec, DLY 100 usec,
reliability 255/255, txload 79/255, rxload 27/255
Encapsulation ARPA, loopback not set
Keepalive set (10 sec)
Full-duplex, 100Mb/s, 100BaseTX/FX
ARP type: ARPA, ARP Timeout 04:00:00
Last input 00:00:00, output 00:00:00, output hang never
Last clearing of "show interface" counters 22:25:20
Input queue: 2/75/0/0 (size/max/drops/flushes); Total output drops: 785729
Queueing strategy: Class-based queueing
Output queue: 0/1000/64/0 (size/max total/threshold/drops)
Conversations 0/2/256 (active/max active/max total)
Reserved Conversations 1/1 (allocated/max allocated)
Available Bandwidth 18402 kilobits/sec
5 minute input rate 4350000 bits/sec, 1187 packets/sec
5 minute output rate 12690000 bits/sec, 3068 packets/sec
69358747 packets input, 3048299466 bytes
Received 18783 broadcasts, 0 runts, 0 giants, 0 throttles
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
0 watchdog
0 input packets with dribble condition detected
213627271 packets output, 693052780 bytes, 0 underruns
0 output errors, 0 collisions, 0 interface resets
0 unknown protocol drops
0 babbles, 0 late collision, 0 deferred
0 lost carrier, 0 no carrier
0 output buffer failures, 0 output buffers swapped out
--------------
04:58:12 PM Thursday Sep 29 2011 IST
333334444433333333333333333333333333333333333333333333333333
444448888888888888889999966666999998888877777999994444466666
100
90
80
70
60
50 *****
40 ********************************************* *****
30 ************************************************************
20 ************************************************************
10 ************************************************************
0....5....1....1....2....2....3....3....4....4....5....5....6
0 5 0 5 0 5 0 5 0 5 0
CPU% per second (last 60 seconds)
344443333333334333343443334444336443333334343444434454333333
908117684599090365708015986301954256698880806465696126764677
100
90
80
70
60 *
50 * * * * *** * **
40 ###***** *** ** *********#*#********************#*####** ***
30 ############*##########*####################################
20 ############################################################
10 ############################################################
0....5....1....1....2....2....3....3....4....4....5....5....6
0 5 0 5 0 5 0 5 0 5 0
CPU% per minute (last 60 minutes)
* = maximum CPU% # = average CPU%
776677579586776875789867776678658657757889858899776776667669658888667786
287544439608600470923061667605633590857398222381427463562138144965765923
100 * * * *
90 * * * ** ** * ***
80 * * * * ** **** ** * * * ***** **** * * **** ***
70 ****** ** **** ** ************* ** ** ***** ********* *** * *********
60 ****** ********** ************* *********** ***************** **********
50 ************************************************************************
40 ****************#***######************#******#####**************###*#***
30 #######****####################*****####################################
20 ########################################################################
10 ########################################################################
0....5....1....1....2....2....3....3....4....4....5....5....6....6....7..
0 5 0 5 0 5 0 5 0 5 0 5 0
CPU% per hour (last 72 hours)
* = maximum CPU% # = average CPU%
Thanks for your reply.
-------------------------
09-29-2011 08:34 AM
Disclaimer
The Author of this posting offers the information contained within this posting without consideration and with the reader's understanding that there's no implied or expressed suitability or fitness for any purpose. Information provided is for informational purposes only and should not be construed as rendering professional advice of any kind. Usage of this posting's information is solely at reader's own risk.
Liability Disclaimer
In no event shall Author be liable for any damages whatsoever (including, without limitation, damages for loss of use, data or profit) arising out of the use or inability to use the posting's information even if Author has been advised of the possibility of such damage.
Posting
Generally, interrupt CPU is caused by the router forwarding packets, which might be confirmed by your mention of your bandwidth peaks. (NB: many of Cisco's software-based routers' performance capabilities are not suited to sustained high usage of their Ethernet interfaces.)
Regarding your drops, it looks like you're using CBWFQ; it might help if you posted its interface stats.
09-29-2011 10:54 AM
Hi Joseph ,
The CPU utilization has been high for the last one week; it was around 30% before, and no changes have been made here. I'm not sure how the same Ethernet utilization was working fine before, while now the interrupt CPU is high.
I pasted the show interface output earlier; it uses Queueing strategy: Class-based queueing, and the output drops are in huge numbers. I don't understand whether the CPU load is due to this, i.e. the router processing a lot of packets once again at the interface level?
Thanks for your reply.
---------------------
09-29-2011 05:31 PM
Sorry, I should have made myself clearer. It might help if you posted the policy-map's interface stats.
09-29-2011 09:55 PM
Hi Joseph ,
Below is the policy map applied on fa0/1, which carries 17 Mb of traffic at peak time.
!
interface FastEthernet0/1
description " Link to PDM_CORP_RTR2 on fa0/0/0 "
bandwidth 40960
ip address 10.90.4.173 255.255.255.252
ip ospf mtu-ignore
speed 100
full-duplex
service-policy output SANJIV_VDI_Policy_OUT
!
policy-map SANJIV_VDI_Policy_OUT
class SANJIV_VDI_Class_OUT
bandwidth 30
class OPEN_HOUSE_class
priority 12288
class PDM4D_FILESRVR_CLASS
police cir 4608000
conform-action transmit
exceed-action drop
!
!
class-map match-all SANJIV_VDI_Class_OUT
description << SANJIV_VDI>>
match access-group name SANJIV_VDI_ACL_OUTbound
class-map match-all OPEN_HOUSE_class
description << OPEN House Traffic >>
match access-group name OPEN_HOUSE_MATCH
class-map match-all MLD_INT_CLASS_MAP
description << OPEN House from 4D to Interface >>
match access-group name MLD_INT_MAP_ACL
class-map match-all PDM4D_FILESRVR_CLASS
description << FILE SERVER ACCESS BETWEEN 4D and PDM >>
match access-group name fileserver_4dpdm
!
ip access-list extended OPEN_HOUSE_MATCH
permit ip host 172.16.103.8 host 192.168.80.250
permit ip host 172.16.103.8 host 192.168.80.251
permit ip host 172.16.103.8 host 192.168.80.252
permit ip host 172.16.103.15 host 192.168.80.250
permit ip host 172.16.103.15 host 192.168.80.251
permit ip host 172.16.103.15 host 192.168.80.252
permit ip host 172.16.103.16 host 192.168.80.250
permit ip host 172.16.103.16 host 192.168.80.251
permit ip host 172.16.103.16 host 192.168.80.252
permit ip host 172.16.103.8 host 184.73.26.138
permit ip host 172.16.103.8 host 184.73.101.4
permit ip host 172.16.103.8 host 203.201.252.199
permit ip host 172.16.103.8 host 203.201.252.202
permit ip host 172.16.103.8 host 204.236.225.157
permit ip host 172.16.103.15 host 184.73.26.138
permit ip host 172.16.103.15 host 184.73.101.4
permit ip host 172.16.103.15 host 203.201.252.199
permit ip host 172.16.103.15 host 203.201.252.202
permit ip host 172.16.103.15 host 204.236.225.157
permit ip host 172.16.103.16 host 184.73.26.138
permit ip host 172.16.103.16 host 184.73.101.4
permit ip host 172.16.103.16 host 203.201.252.199
permit ip host 172.16.103.16 host 203.201.252.202
permit ip host 172.16.103.16 host 204.236.225.157
permit ip 192.168.24.0 0.0.0.31 host 192.168.80.250
ip access-list extended SANJIV_VDI_ACL_OUTbound
permit ip host 10.60.208.40 any
ip access-list extended fileserver_4dpdm
permit ip any host 192.168.82.129
permit ip host 192.168.82.129 any
Sorry, I'm not getting what you mean here by "policy-map's interface stats".
Thanks for your reply.
-----------------------------
09-30-2011 02:41 AM
What I was asking for was the output of the command "show policy-map interface FastEthernet0/1 out", but your providing the configuration statements was helpful. My guess is your high drop rate is from the policer.
Again, interrupt CPU on software-based platforms like the 1841 is generally driven by traffic volume. An 1841 is rated at 75 Kpps, which for minimum-size Ethernet packets is only about 38 Mbps. With typical packet sizes, an 1841 might manage about 3x that. Your posted interface stats showed a 5-minute average of about 17 Mbps, so I would expect a CPU load somewhere between about 15% and 45%; your 5-minute CPU average was 34%.
Your interface stats also caught a couple of packets in the input queue, which is indicative of a router being unable to keep up with the offered traffic load.
I tried to compute your average packet sizes, but it looks to me like the counters might have wrapped. Clear the counters and, after a day, divide both input and output total bytes by their corresponding total packets. If average packet sizes are small, that would be another confirmation of higher-than-expected CPU loading.
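The arithmetic above can be sketched in a few lines (a back-of-the-envelope estimate only; the 75 Kpps rating and the 5-minute interface rates are the figures quoted in this thread):

```python
# Rough arithmetic behind the estimates above. The 75 Kpps forwarding
# rating and the 5-minute interface rates are taken from this thread;
# treat the results as rough figures, not measurements.

PPS_RATING = 75_000        # claimed 1841 forwarding rate, packets/sec
MIN_FRAME_BYTES = 64       # minimum Ethernet frame size

# Throughput at the rated pps with minimum-size packets, in Mbps:
min_size_mbps = PPS_RATING * MIN_FRAME_BYTES * 8 / 1_000_000
print(f"rated throughput at 64-byte packets: {min_size_mbps:.1f} Mbps")  # 38.4

# Average packet size implied by the posted 5-minute rates:
out_bytes = 12_690_000 / 3_068 / 8   # output bits/sec over packets/sec
in_bytes = 4_350_000 / 1_187 / 8     # input bits/sec over packets/sec
print(f"average output packet: {out_bytes:.0f} bytes")  # ~517
print(f"average input packet:  {in_bytes:.0f} bytes")   # ~458
```

If averages around 500 bytes still hold after clearing the counters, the traffic is not made up of unusually small packets, which would argue against tiny-packet load as the explanation.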
PS:
If this Ethernet link is fractional (about 40 Mbps), I would recommend a shaper rather than a policer, and it should sit in a parent policy, with a child policy controlling prioritization when there's congestion.
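A minimal sketch of that parent/child structure, wrapping the existing policy as the child (the policy name PARENT_SHAPE_40M is hypothetical, and the 40960000 bps shape rate simply mirrors the configured bandwidth statement; verify the rate and burst values for your actual link before using):

!
policy-map PARENT_SHAPE_40M
 class class-default
  shape average 40960000
  service-policy SANJIV_VDI_Policy_OUT
!
interface FastEthernet0/1
 no service-policy output SANJIV_VDI_Policy_OUT
 service-policy output PARENT_SHAPE_40M
!

This way the parent shaper creates the congestion point at the fractional rate, and the child policy's priority and bandwidth classes take effect there rather than at the physical 100 Mbps rate.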
10-01-2011 02:02 PM
Hi Sagar,
What has changed in the last 4 days while you've been observing this issue? It's obvious that no config changes were carried out, but perhaps the traffic rate has increased? Also, are you aware of the traffic pattern flowing over this box, i.e. what sort of traffic is going through it? The problem with software platforms is that we can't trace which packets are being punted to the CPU.
If possible, I'd suggest taking a sniffer trace and having a look at the actual packets.
HTH,
Ivan.
10-03-2011 03:05 AM
Hi Ivan ,
There has been no change in the traffic load either, as per the last 2 months' logs for the 20 Mb interface.
How can you sniff traffic on an 1841? Please let me know.
--------------
thanks
10-03-2011 03:09 AM
Sagar,
we can't sniff traffic on the router itself; I was referring to an external sniffer plugged into a switch, provided the router is connected to one. It is often the case that the CPU goes mad because of 'bad' packets.
HTH,
Ivan.
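For reference, if the router plugs into a Catalyst switch, a SPAN session can mirror the router's port to the PC running the sniffer. A minimal sketch (the interface numbers are hypothetical, chosen for illustration):

monitor session 1 source interface FastEthernet0/1 both
monitor session 1 destination interface FastEthernet0/24

The sniffer PC on Fa0/24 then sees copies of everything the router sends and receives on its uplink.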
10-03-2011 06:35 AM
NB: Later IOS releases do support a router sniffing its own traffic; see:
http://www.cisco.com/en/US/products/ps9913/products_ios_protocol_group_home.html
I believe many 'bad' packets, when they spike the CPU, often do so outside of the fast path, i.e. interrupt CPU would be much lower than total CPU.
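For example, with Embedded Packet Capture (available from roughly 12.4(20)T onward, per the link above), the router can capture its own CEF-switched traffic to a buffer. A sketch from memory, so verify the exact syntax against your release (the buffer and capture-point names are arbitrary):

monitor capture buffer CPUBUF size 512 max-size 128 circular
monitor capture point ip cef CPUCAP FastEthernet0/1 both
monitor capture point associate CPUCAP CPUBUF
monitor capture point start CPUCAP
! let it run while the CPU is high, then:
monitor capture point stop CPUCAP
show monitor capture buffer CPUBUF dump

The buffer dump can also be exported and opened in a protocol analyzer for inspection.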