09-04-2012 03:10 PM - edited 03-04-2019 05:28 PM
We are seeing 100% CPU usage and some packet loss when the router cannot keep up with NAT at full line speed (100 Mb/s).
We are not using any inspect commands, so there are no overheads there.
Why is the router slowing down and grinding to a halt?
We are running basic NAT, and our ISP has provided us with a 100 Mb/s VDSL connection. When we hit these speeds, the router's CPU usage reaches 100% and we see packet loss, for example intermittent missing replies when pinging.
Below is our running config and process information.
Your thoughts, fixes, comments and suggestions are greatly appreciated.
show proc cpu sort
r1.xxx.xxxx.com#show proc cpu sort
CPU utilization for five seconds: 96%/96%; one minute: 96%; five minutes: 96%
 PID Runtime(ms)   Invoked  uSecs   5Sec   1Min   5Min TTY Process
  98      340532  13421485     25  1.89%  1.28%  1.21%   0 Ethernet Msec Ti
   2       35928     21401   1678  1.34%  1.04%  1.03%   0 Load Meter
  92     2372784    528420   4490  0.63%  0.83%  0.96%   0 COLLECT STAT COU
 146        1192     16749     71  0.47%  0.06%  0.01%   0 TCP Timer
 289       93284   3293892     28  0.39%  0.25%  0.24%   0 PPP Events
 281       18392    828131     22  0.23%  0.07%  0.06%   0 PPPoE Background
 115      156872    146421   1071  0.23%  0.19%  0.15%   0 IP Input
 288      134128   3293930     40  0.23%  0.41%  0.42%   0 PPP manager
  97       23132    775596     29  0.15%  0.06%  0.07%   0 Ethernet Timer C
 111       74836   3279452     22  0.15%  0.24%  0.23%   0 IPAM Manager
  63       69968    555090    126  0.15%  0.23%  0.23%   0 LED Timers
 283       17560    209076     83  0.07%  0.03%  0.05%   0 IP NAT Ager
 274        7452     21461    347  0.07%  0.03%  0.00%   0 Compute load avg
 188        7740    207739     37  0.07%  0.02%  0.00%   0 Inspect process
  68        4680    106699     43  0.07%  0.01%  0.00%   0 Console redirect
  32        7896    111262     70  0.07%  0.03%  0.00%   0 ARP Background
  17        4212    104127     40  0.07%  0.02%  0.00%   0 IPC Periodic Tim
  25        1380     21372     64  0.07%  0.00%  0.00%   0 IPC Loadometer
  56        6512     54449    119  0.07%  0.02%  0.00%   0 Fast Throttle Ti
 244        2860       189  15132  0.07%  0.14%  0.03%   8 Virtual Exec
The rest of the output is omitted because the forum rejected it with the message "This message can not be displayed due to its content. Please use the Contact Us link with any questions".
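A note on reading the header line of this output: in `show processes cpu`, the first figure after "five seconds" is total CPU and the second is the share spent at interrupt level, which is where CEF-switched packet forwarding is accounted. Both figures are 96% here, which would explain why no single process in the list exceeds 2%. The minimal Python sketch below simply splits the two figures; `parse_cpu_line` is a hypothetical helper name used for illustration, not a Cisco tool.

```python
import re


def parse_cpu_line(line: str) -> dict:
    """Split the 'five seconds: total%/interrupt%' field of a CPU header line."""
    m = re.search(r"five seconds:\s*(\d+)%/(\d+)%", line)
    if m is None:
        raise ValueError("not a CPU utilization header line")
    total, interrupt = int(m.group(1)), int(m.group(2))
    # Whatever is not interrupt-level time is left for scheduled processes.
    return {"total": total, "interrupt": interrupt, "process": total - interrupt}


header = "CPU utilization for five seconds: 96%/96%; one minute: 96%; five minutes: 96%"
print(parse_cpu_line(header))  # {'total': 96, 'interrupt': 96, 'process': 0}
```

A process-level share near zero alongside a high total suggests the load is packet forwarding itself rather than a runaway process.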
Our Running Config
version 15.1
no service pad
service timestamps debug datetime msec
service timestamps log datetime msec
no service password-encryption
!
hostname r1.essex.xxxx.xxx
!
boot-start-marker
boot system flash c880data-universalk9-mz.151-4.M3.bin
boot-end-marker
!
!
no logging buffered
enable secret 5 xxxxxx
enable password xxxxxx
!
no aaa new-model
memory-size iomem 10
no ip source-route
!
!
!
ip dhcp excluded-address 192.168.0.1
ip dhcp excluded-address 192.168.0.50 192.168.0.255
!
ip dhcp pool NET-POOL
 network 192.168.0.0 255.255.255.0
 default-router 192.168.0.1
 dns-server 8.8.8.8 8.8.4.4
!
!
ip cef
ip name-server 8.8.8.8
ip name-server 8.8.4.4
no ipv6 cef
!
!
!
!
!
!
!
!
controller VDSL 0
!
no ip ftp passive
!
!
!
!
!
!
!
interface Ethernet0
 no ip address
!
interface Ethernet0.101
 encapsulation dot1Q 101
 pppoe-client dial-pool-number 1
!
interface ATM0
 no ip address
 shutdown
 no atm ilmi-keepalive
!
interface FastEthernet0
 no ip address
!
interface FastEthernet1
 no ip address
 shutdown
!
interface FastEthernet2
 no ip address
 shutdown
!
interface FastEthernet3
 no ip address
 shutdown
!
interface Vlan1
 ip address 192.168.0.1 255.255.255.0
 ip nat inside
 ip virtual-reassembly in
 ip tcp adjust-mss 1452
!
interface Dialer0
 ip address 81.138.131.190 255.255.255.248
 no ip redirects
 no ip unreachables
 no ip proxy-arp
 ip mtu 1492
 ip nat outside
 ip virtual-reassembly in
 encapsulation ppp
 dialer pool 1
 ppp authentication chap callin
 ppp chap hostname xxxxxxxx
 ppp chap password 0 xxxxxxxxx
 ppp ipcp route default
 no cdp enable
!
ip forward-protocol nd
no ip http server
ip http secure-server
!
ip nat inside source list 101 interface Dialer0 overload
ip nat inside source static 192.168.0.250 xxx.xxx.xxx.xxx
!
access-list 101 permit ip any any
!
!
!
!
!
09-04-2012 05:27 PM
Disclaimer
The Author of this posting offers the information contained within this posting without consideration and with the reader's understanding that there's no implied or expressed suitability or fitness for any purpose. Information provided is for informational purposes only and should not be construed as rendering professional advice of any kind. Usage of this posting's information is solely at reader's own risk.
Liability Disclaimer
In no event shall Author be liable for any damages whatsoever (including, without limitation, damages for loss of use, data or profit) arising out of the use or inability to use the posting's information even if Author has been advised of the possibility of such damage.
Posting
"Software" based routers use their main CPU for everything, including forwarding packets. When you push enough traffic through them, the CPU maxes out, which seems to be your case judging by both your CPU stats and the volume of traffic you describe. An 880 series router is rated at 50 Kpps, or about 25 Mbps (unidirectional). That figure is for minimum-sized packets, so higher throughput is possible with larger packets (often the norm). Cisco notes a maximum transfer rate (for 1500-byte packets) of about 200 Mbps. They also recommend the 880 for WAN links up to 8 Mbps (duplex).
Your configuration looks pretty "clean", so your only real solution would be a "faster" device.
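The relationship between a packets-per-second rating and a bit rate can be checked with simple arithmetic. The sketch below assumes a 64-byte minimum packet (an assumption; the post only says "minimal sized") and ignores framing overhead and the extra per-packet cost of NAT, so it is a rough guide rather than a performance model.

```python
def mbps_at(pps: float, packet_bytes: int) -> float:
    """Throughput in Mbit/s for a given packet rate and packet size."""
    return pps * packet_bytes * 8 / 1e6


def pps_needed(mbps: float, packet_bytes: int) -> float:
    """Packets per second required to carry a given bit rate."""
    return mbps * 1e6 / (packet_bytes * 8)


# 50 Kpps of 64-byte packets (assumed minimum size)
print(mbps_at(50_000, 64))             # 25.6

# Packet rate needed to fill a 100 Mb/s link with 1500-byte packets
print(round(pps_needed(100, 1500)))    # 8333
```

Because forwarding cost on a CPU-based router is mostly per packet, the same link speed is far more demanding with small packets than with full-size ones.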
09-09-2012 11:13 AM
The CPU hits the 100% mark when we are pushing around 90 Mb/s inbound from Dialer0 to Vlan1 (downloading a file, for example). There are not many NAT clients on the network yet, so the load is purely throughput, not a bloated NAT translation table. Even when we hit 100% CPU, I think the translation table only has about 100 entries.
That is normal, and consistent with, or even exceeding, the performance figures tested by Cisco. See attachment, NAT testing.
With such a fast circuit, you will need a faster router.
09-05-2012 02:15 AM
Thank you for your reply, Joseph.
That's what I thought too - I thought we had reached the limit of what the router could handle.
However, dropping packets is a bit of an issue, I didn't expect that. Are there any tweaks we could put in place to stop packet loss?
Thanks again.
09-05-2012 02:58 AM
However, dropping packets is a bit of an issue, I didn't expect that. Are there any tweaks we could put in place to stop packet loss?
Yes and no. As your configuration is "clean", you're not "wasting" CPU.
What you could consider is ingress shaping to selectively drop some traffic. Right now your drops are "random", but if you're going to have drops, one could argue it's better to be selective about them. For example, you might drop packets from TCP flows, which will both recover from the drops and slow their transmission rate. This would help preclude drops of traffic that doesn't deal well with packet loss.
Of course, this adds load to your processing, so your overall throughput is likely to be even lower, yet it might seem better to your users.
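To illustrate the idea of selective drops, here is a minimal Python simulation of a token-bucket policer that only ever drops bulk TCP packets and lets everything else through. It is a conceptual sketch, not IOS configuration and not how IOS implements QoS; the class names, the 80 Mbit/s rate, the burst size, and the traffic mix are all assumptions chosen for the example.

```python
class TokenBucket:
    """Single-rate policer: a packet passes only if enough byte tokens remain."""

    def __init__(self, rate_bps: float, burst_bytes: int):
        self.rate_bytes = rate_bps / 8.0   # refill rate in bytes per second
        self.burst = float(burst_bytes)    # bucket depth
        self.tokens = float(burst_bytes)   # start with a full bucket
        self.last = 0.0                    # time of the previous packet

    def allow(self, now: float, size: int) -> bool:
        # Refill for the elapsed time, capped at the bucket depth.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate_bytes)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False


def police(packets, bulk_policer):
    """Drop only bulk-TCP packets that exceed the policer; pass everything else."""
    passed, dropped = [], []
    for now, kind, size in packets:
        if kind == "tcp-bulk" and not bulk_policer.allow(now, size):
            dropped.append((now, kind, size))
        else:
            passed.append((now, kind, size))
    return passed, dropped


# Hypothetical traffic: a 1500-byte bulk TCP packet every 0.1 ms (120 Mbit/s
# offered) plus a 64-byte ICMP packet every 10 ms.
packets = []
for i in range(1000):
    t = i * 0.0001
    packets.append((t, "tcp-bulk", 1500))
    if i % 100 == 0:
        packets.append((t, "icmp", 64))

passed, dropped = police(packets, TokenBucket(rate_bps=80e6, burst_bytes=15000))
print(len(passed), "passed,", len(dropped), "dropped")
```

In this run every dropped packet belongs to the bulk TCP class, while all ICMP packets pass, which is the trade-off described above: the loss is moved onto traffic that can recover from it.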
09-05-2012 02:23 AM
Also, I forgot to mention Joseph - when we disable CEF, traffic slows right down to just 20Mb/s - that's all we can push through it. However, with CEF enabled we can push almost 100Mb/s - is this normal?
09-05-2012 03:02 AM
Also, I forgot to mention Joseph - when we disable CEF, traffic slows right down to just 20Mb/s - that's all we can push through it. However, with CEF enabled we can push almost 100Mb/s - is this normal?
Can't say whether that much of a delta is normal, but CEF is Cisco's premier switching technology, designed to increase L3 forwarding performance. So a drop in forwarding performance with CEF disabled should be expected; the only open question is the size of the delta (NB: your mileage may vary).
09-05-2012 03:25 AM
Thanks for the information Joseph. I have just looked at our Dialer interface and noticed that "Stateful Inspection" is present:
Dialer0 is up (if_number 12)
Corresponding hwidb fast_if_number 12
Corresponding hwidb firstsw->if_number 12
Internet address is xxx.xxx.xxx.xxx/29
ICMP redirects are never sent
Per packet load-sharing is disabled
IP unicast RPF check is disabled
Input features: Stateful Inspection, Dialer i/f override, Virtual Fragment Reassembly, Virtual Fragment Reassembly After IPSec Decryption, NAT Outside
Output features: Post-routing NAT Outside, Stateful Inspection, Dialer idle reset, Dialer idle reset
IP policy routing is disabled
BGP based policy accounting on input is disabled
BGP based policy accounting on output is disabled
Interface is marked as point to point interface
Hardware idb is Dialer0
Fast switching type 15, interface type 98
IP CEF switching enabled
IP CEF switching turbo vector
IP Null turbo vector
IP prefix lookup IPv4 mtrie 8-8-8-8 optimized
Input fast flags 0x400040, Output fast flags 0x10100
ifindex 12(12)
Slot Slot unit -1 VC -1
IP MTU 1492
Is this normal? Is it needed for NAT? Can it be disabled?
09-05-2012 04:42 PM
Ryan Barclay wrote:
Thanks for the information Joseph. I have just looked at our Dialer interface and noticed that "Stateful Inspection" is present:
Dialer0 is up (if_number 12)
Corresponding hwidb fast_if_number 12
Corresponding hwidb firstsw->if_number 12
Internet address is xxx.xxx.xxx.xxx/29
ICMP redirects are never sent
Per packet load-sharing is disabled
IP unicast RPF check is disabled
Input features: Stateful Inspection, Dialer i/f override, Virtual Fragment Reassembly, Virtual Fragment Reassembly After IPSec Decryption, NAT Outside
Output features: Post-routing NAT Outside, Stateful Inspection, Dialer idle reset, Dialer idle reset
IP policy routing is disabled
BGP based policy accounting on input is disabled
BGP based policy accounting on output is disabled
Interface is marked as point to point interface
Hardware idb is Dialer0
Fast switching type 15, interface type 98
IP CEF switching enabled
IP CEF switching turbo vector
IP Null turbo vector
IP prefix lookup IPv4 mtrie 8-8-8-8 optimized
Input fast flags 0x400040, Output fast flags 0x10100
ifindex 12(12)
Slot Slot unit -1 VC -1
IP MTU 1492
Is this normal? Is it needed for NAT? Can it be disabled?
My guess is that it might be related to NAT.
09-05-2012 03:07 AM
Ryan Barclay wrote:
Also, I forgot to mention Joseph - when we disable CEF, traffic slows right down to just 20Mb/s - that's all we can push through it. However, with CEF enabled we can push almost 100Mb/s - is this normal?
You need to run with CEF exclusively. The router is not designed to work without it.
09-05-2012 03:25 AM
Thank you for the information Paolo.
09-05-2012 02:16 PM
Do you think that if I configure a single direct Ethernet interface on the inside instead of using the Vlan this would help with the CPU pegging? Just a thought.
09-05-2012 04:45 PM
Ryan Barclay wrote:
Do you think that if I configure a single direct Ethernet interface on the inside instead of using the Vlan this would help with the CPU pegging? Just a thought.
I doubt it.
One other possible CPU consumer is the router dealing with fragmentation because of your PPPoE link, but I don't see a way to avoid it because you are on PPPoE. With your configuration, since you also have the MSS adjust for TCP, it is unlikely that much of this is happening.
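For reference, the `ip mtu 1492` and `ip tcp adjust-mss 1452` values in the posted configuration follow from the PPPoE overhead. The short Python sketch below reproduces the arithmetic, assuming a standard 1500-byte Ethernet MTU and IPv4 and TCP headers without options.

```python
ETHERNET_MTU = 1500     # assumed standard Ethernet payload size
PPPOE_OVERHEAD = 8      # 6-byte PPPoE header + 2-byte PPP protocol ID
IPV4_HEADER = 20        # without options
TCP_HEADER = 20         # without options


def pppoe_ip_mtu(ethernet_mtu: int = ETHERNET_MTU) -> int:
    """Largest IP packet that fits in one PPPoE frame."""
    return ethernet_mtu - PPPOE_OVERHEAD


def tcp_mss(ip_mtu: int) -> int:
    """Largest TCP payload that fits in one IP packet of the given MTU."""
    return ip_mtu - IPV4_HEADER - TCP_HEADER


print(pppoe_ip_mtu())            # 1492
print(tcp_mss(pppoe_ip_mtu()))   # 1452
```

With the MSS clamped this way, TCP segments already fit the PPPoE link, so fragmentation would mainly come from non-TCP traffic such as large UDP datagrams.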
09-06-2012 03:12 AM
Are there any log messages containing "fragment table has reached its maximum"?
Is there any change in CPU with "no ip virtual-reassembly"?
09-06-2012 05:08 AM
I have applied the following to both the Dialer0 and the Vlan1 interfaces:
no ip virtual-reassembly in
However, CPU is still 99% when traffic is heavy.
How can I check the fragment table? Sorry, I'm quite new to all of this.
Thank you very much for your help.
09-06-2012 05:46 AM
>How can I check the fragment table?
sh ip virtual-reassembly