100% CPU Usage on 887VA when network traffic is heavy

Ian Stephens
Level 1

We are seeing 100% CPU usage and a small amount of packet loss when the router is doing NAT at full line speed (100Mb/s) and can't keep up.

We are not using any inspect commands, so there are no overheads there.

Why is the router slowing down and grinding to a halt?

We are running basic NAT, and our ISP has provided us with a 100Mb/s VDSL connection.  It's when we hit these high speeds that the router's CPU usage hits 100% and we experience packet loss, for example when pinging (intermittent missing replies, etc.).

Below is our running config and process information. 

Your thoughts, fixes, comments and suggestions are greatly appreciated.

show proc cpu sort

r1.xxx.xxxx.com#show proc cpu sort
CPU utilization for five seconds: 96%/96%; one minute: 96%; five minutes: 96%
 PID Runtime(ms)     Invoked      uSecs   5Sec   1Min   5Min TTY Process
  98      340532    13421485         25  1.89%  1.28%  1.21%   0 Ethernet Msec Ti
   2       35928       21401       1678  1.34%  1.04%  1.03%   0 Load Meter
  92     2372784      528420       4490  0.63%  0.83%  0.96%   0 COLLECT STAT COU
 146        1192       16749         71  0.47%  0.06%  0.01%   0 TCP Timer
 289       93284     3293892         28  0.39%  0.25%  0.24%   0 PPP Events
 281       18392      828131         22  0.23%  0.07%  0.06%   0 PPPoE Background
 115      156872      146421       1071  0.23%  0.19%  0.15%   0 IP Input
 288      134128     3293930         40  0.23%  0.41%  0.42%   0 PPP manager
  97       23132      775596         29  0.15%  0.06%  0.07%   0 Ethernet Timer C
 111       74836     3279452         22  0.15%  0.24%  0.23%   0 IPAM Manager
  63       69968      555090        126  0.15%  0.23%  0.23%   0 LED Timers
 283       17560      209076         83  0.07%  0.03%  0.05%   0 IP NAT Ager
 274        7452       21461        347  0.07%  0.03%  0.00%   0 Compute load avg
 188        7740      207739         37  0.07%  0.02%  0.00%   0 Inspect process
  68        4680      106699         43  0.07%  0.01%  0.00%   0 Console redirect
  32        7896      111262         70  0.07%  0.03%  0.00%   0 ARP Background
  17        4212      104127         40  0.07%  0.02%  0.00%   0 IPC Periodic Tim
  25        1380       21372         64  0.07%  0.00%  0.00%   0 IPC Loadometer
  56        6512       54449        119  0.07%  0.02%  0.00%   0 Fast Throttle Ti
 244        2860         189      15132  0.07%  0.14%  0.03%   8 Virtual Exec

AND MORE... but I omitted it because I was getting the message "This message can not be displayed due to its content. Please use the Contact Us link with any questions"...
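
A note on reading the first line of that output: on IOS, the figure before the slash is total CPU utilization over the last five seconds, and the figure after it is the portion spent at interrupt level, which is where CEF-switched packets are forwarded. The sketch below splits that line into its parts. It assumes the exact format shown above (other releases may print it differently), and the function name `parse_cpu_line` is made up for illustration.

```python
import re

def parse_cpu_line(line):
    """Split an IOS 'CPU utilization' summary line into its parts.

    Assumes the format shown in the output above; this is an
    illustrative helper, not part of any Cisco tool.
    """
    m = re.search(
        r"five seconds: (\d+)%/(\d+)%; one minute: (\d+)%; five minutes: (\d+)%",
        line,
    )
    if m is None:
        raise ValueError("unrecognized CPU utilization line")
    total, interrupt, one_min, five_min = (int(g) for g in m.groups())
    return {
        "total_5s": total,                  # all CPU use over five seconds
        "interrupt_5s": interrupt,          # share spent at interrupt level
        "process_5s": total - interrupt,    # what is left for scheduled processes
        "one_min": one_min,
        "five_min": five_min,
    }

line = "CPU utilization for five seconds: 96%/96%; one minute: 96%; five minutes: 96%"
stats = parse_cpu_line(line)
print(stats)
```

With 96%/96%, the process-level share works out to zero, which fits the per-process list, where nothing is above 2%. That pattern suggests the load is packet forwarding itself rather than a misbehaving process.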

Our Running Config

version 15.1
no service pad
service timestamps debug datetime msec
service timestamps log datetime msec
no service password-encryption
!
hostname r1.essex.xxxx.xxx
!
boot-start-marker
boot system flash c880data-universalk9-mz.151-4.M3.bin
boot-end-marker
!
no logging buffered
enable secret 5 xxxxxx
enable password xxxxxx
!
no aaa new-model
memory-size iomem 10
no ip source-route
!
ip dhcp excluded-address 192.168.0.1
ip dhcp excluded-address 192.168.0.50 192.168.0.255
!
ip dhcp pool NET-POOL
 network 192.168.0.0 255.255.255.0
 default-router 192.168.0.1
 dns-server 8.8.8.8 8.8.4.4
!
ip cef
ip name-server 8.8.8.8
ip name-server 8.8.4.4
no ipv6 cef
!
controller VDSL 0
!
no ip ftp passive
!
interface Ethernet0
 no ip address
!
interface Ethernet0.101
 encapsulation dot1Q 101
 pppoe-client dial-pool-number 1
!
interface ATM0
 no ip address
 shutdown
 no atm ilmi-keepalive
!
interface FastEthernet0
 no ip address
!
interface FastEthernet1
 no ip address
 shutdown
!
interface FastEthernet2
 no ip address
 shutdown
!
interface FastEthernet3
 no ip address
 shutdown
!
interface Vlan1
 ip address 192.168.0.1 255.255.255.0
 ip nat inside
 ip virtual-reassembly in
 ip tcp adjust-mss 1452
!
interface Dialer0
 ip address 81.138.131.190 255.255.255.248
 no ip redirects
 no ip unreachables
 no ip proxy-arp
 ip mtu 1492
 ip nat outside
 ip virtual-reassembly in
 encapsulation ppp
 dialer pool 1
 ppp authentication chap callin
 ppp chap hostname xxxxxxxx
 ppp chap password 0 xxxxxxxxx
 ppp ipcp route default
 no cdp enable
!
ip forward-protocol nd
no ip http server
ip http secure-server
!
ip nat inside source list 101 interface Dialer0 overload
ip nat inside source static 192.168.0.250 xxx.xxx.xxx.xxx
!
access-list 101 permit ip any any
!


2 Accepted Solutions

Joseph W. Doherty
Hall of Fame

Disclaimer


The Author of this posting offers the information contained within this posting without consideration and with the reader's understanding that there's no implied or expressed suitability or fitness for any purpose. Information provided is for informational purposes only and should not be construed as rendering professional advice of any kind. Usage of this posting's information is solely at reader's own risk.

Liability Disclaimer

In no event shall Author be liable for any damages whatsoever (including, without limitation, damages for loss of use, data or profit) arising out of the use or inability to use the posting's information even if Author has been advised of the possibility of such damage.

Posting

"Software" based routers use their main CPU for everything, including forwarding packets.  When you push enough traffic through them, the CPU will max out, which seems to be your case judging by both your CPU stats and the volume of traffic you describe.  An 880 series router is rated at 50 Kpps, or about 25 Mbps (unidirectional).  That's for minimum-size packets, so higher throughput is possible with larger packets (often the norm).  Cisco notes a max transfer rate (for 1500-byte packets) of about 200 Mbps.  They also recommend the 880 for WAN links up to 8 Mbps (duplex).

Your configuration looks pretty "clean", so your only real solution would be a "faster" device.
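
For anyone wanting to check the packet-rate arithmetic above, here is a minimal sketch. It assumes "minimum-size" means 64-byte packets (the post does not state the size) and ignores Layer 2 framing overhead; the function names are made up for illustration. It says nothing about the extra per-packet cost of features such as NAT or PPPoE.

```python
def throughput_mbps(pps, packet_bytes):
    """Throughput in Mbps for a given packet rate and packet size."""
    return pps * packet_bytes * 8 / 1_000_000

def required_pps(mbps, packet_bytes):
    """Packets per second needed to carry `mbps` at a given packet size."""
    return mbps * 1_000_000 / (packet_bytes * 8)

# 50 Kpps of 64-byte packets (64 bytes is an assumed minimum size)
small = throughput_mbps(50_000, 64)

# Packet rate needed to fill a 100 Mbps link with 1500-byte packets
large = required_pps(100, 1500)

print(small, round(large))
```

The first figure comes out at roughly 25 Mbps, in line with the rating quoted above. The second shows how strongly the required packet rate depends on packet size, which is why real-world throughput varies so much with the traffic mix.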

>The CPU hits the 100% mark when we are pushing around 90Mb/s inbound from Dialer0 to Vlan1 (downloading a file, for example).  There are not many NAT clients behind the router yet, so it's purely throughput, not bloating of the NAT translation table.  I think even when we hit 100% CPU, the translation table only has 100 entries.

That is normal, and consistent with or exceeding the performance figures tested by Cisco. See attachment, NAT testing.

With such a fast circuit, you will need a faster router.

21 Replies

Thank you for your reply, Joseph.

That's what I thought too - I thought we had reached the limit of what the router could handle. 

However, dropping packets is a bit of an issue, I didn't expect that.  Are there any tweaks we could put in place to stop packet loss?

Thanks again.

>However, dropping packets is a bit of an issue, I didn't expect that.  Are there any tweaks we could put in place to stop packet loss?

Yes and no.  As your configuration is "clean", you're not "wasting" CPU.

One thing you could consider is ingress shaping to selectively drop some traffic.  Right now your drops are "random", but if you're going to have drops, one could argue it's better to be selective about them.  For example, you might drop packets from TCP flows, which will both recover from the drops and slow their transmission rate.  This would help avoid drops on traffic that doesn't deal well with packet loss.

Of course, this does add additional load to your processing, so your overall throughput is likely to be even less, yet it might seem better to your users.

Also, I forgot to mention Joseph - when we disable CEF, traffic slows right down to just 20Mb/s - that's all we can push through it.  However, with CEF enabled we can push almost 100Mb/s - is this normal?

>Also, I forgot to mention Joseph - when we disable CEF, traffic slows right down to just 20Mb/s - that's all we can push through it.  However, with CEF enabled we can push almost 100Mb/s - is this normal?

I can't say whether that much of a delta is normal, but CEF is Cisco's premier forwarding technology, designed to increase L3 forwarding performance.  So a drop in forwarding performance should be expected with CEF disabled; the only open question is the size of the delta (NB: your mileage may vary).

Thanks for the information Joseph.  I have just looked at our Dialer interface and noticed that "Stateful Inspection" is present:

Dialer0 is up (if_number 12)
  Corresponding hwidb fast_if_number 12
  Corresponding hwidb firstsw->if_number 12
  Internet address is xxx.xxx.xxx.xxx/29
  ICMP redirects are never sent
  Per packet load-sharing is disabled
  IP unicast RPF check is disabled
  Input features: Stateful Inspection, Dialer i/f override, Virtual Fragment Reassembly, Virtual Fragment Reassembly After IPSec Decryption, NAT Outside
  Output features: Post-routing NAT Outside, Stateful Inspection, Dialer idle reset, Dialer idle reset
  IP policy routing is disabled
  BGP based policy accounting on input is disabled
  BGP based policy accounting on output is disabled
  Interface is marked as point to point interface
  Hardware idb is Dialer0
  Fast switching type 15, interface type 98
  IP CEF switching enabled
  IP CEF switching turbo vector
  IP Null turbo vector
  IP prefix lookup IPv4 mtrie 8-8-8-8 optimized
  Input fast flags 0x400040, Output fast flags 0x10100
  ifindex 12(12)
  Slot  Slot unit -1 VC -1
  IP MTU 1492

Is this normal?  Is it needed for NAT?  Can it be disabled?

My guess is that it might be related to NAT.

Ryan Barclay wrote:

Also, I forgot to mention Joseph - when we disable CEF, traffic slows right down to just 20Mb/s - that's all we can push through it.  However, with CEF enabled we can push almost 100Mb/s - is this normal?

You need to run with CEF exclusively. The router is not designed to work without it.

Thank you for the information Paolo.

Ian Stephens
Level 1

Do you think that if I configure a single direct Ethernet interface on the inside instead of using the Vlan this would help with the CPU pegging? Just a thought.

I doubt it.

One other possible CPU consumer is the router dealing with fragmentation because of your PPPoE, but I don't see a way to avoid it because you are on PPPoE.  With your configuration, since you also have the MSS adjustment for TCP, it's unlikely there's much of this happening.

Any log messages containing "fragment table has reached its maximum"?

Any change in CPU with "no ip virtual-reassembly"?

http://www.cisco.com/en/US/docs/ios-xml/ios/security/d1/sec-cr-i3.html#GUID-70035BE4-0286-4E4C-8B59-263F64069CA4
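
On the fragmentation point above, the `ip mtu 1492` and `ip tcp adjust-mss 1452` values in the posted config follow from simple header arithmetic. The sketch below assumes a standard 1500-byte Ethernet MTU and IPv4 and TCP headers without options; the constant and function names are made up for illustration.

```python
ETHERNET_MTU = 1500    # standard Ethernet payload size in bytes (assumed)
PPPOE_OVERHEAD = 8     # 6-byte PPPoE header + 2-byte PPP protocol field
IPV4_HEADER = 20       # IPv4 header without options
TCP_HEADER = 20        # TCP header without options

def pppoe_ip_mtu(ethernet_mtu=ETHERNET_MTU):
    """Largest IP packet that fits in one PPPoE frame."""
    return ethernet_mtu - PPPOE_OVERHEAD

def tcp_mss(ip_mtu):
    """Largest TCP payload that fits in one IP packet of size ip_mtu."""
    return ip_mtu - IPV4_HEADER - TCP_HEADER

ip_mtu = pppoe_ip_mtu()
mss = tcp_mss(ip_mtu)
print(ip_mtu, mss)
```

This gives 1492 and 1452, matching the Dialer0 and Vlan1 settings. The MSS clamp only applies to TCP, so large non-TCP packets (big UDP datagrams, for example) could still be fragmented.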

I have applied the following to both the Dialer0 and the Vlan1 interfaces:

no ip virtual-reassembly in

However, CPU is still 99% when traffic is heavy.

How can I check the fragment table?  Sorry, I'm quite new to all of this. 

Thank you very much for your help.

>How can I check the fragment table?

sh ip virtual-reassembly
