
2801 high CPU load / low traffic / high interrupts

modulis.ca
Level 1

Hello,

First of all, thanks for your help in advance! I know that this topic has come up quite a bit, but it would be nice to get your opinion/help with this issue.

We installed a solution with two Cisco 2801 routers in a BGP multihomed failover setup.

1) The router that currently carries all the traffic reaches 55% to 60% CPU usage when handling 40 SIP/RTP streams. This equals 10 Mbit/s up and 10 Mbit/s down, and it showed around 5800 pps TX and around 5800 pps RX, the majority of them CEF switched. As those figures are far below the performance figures published by Cisco, we wonder whether we made a mistake in setting up our router, or whether we can do something to improve the router setup.

2) Does it have an impact on router performance if we increase/decrease RTP packet size, thus increasing or decreasing the pps relative to the consumed bandwidth?

3) If it is not possible to improve the router configuration, we also wonder about possible replacement units for those routers. Would a 2901 do a good job? By how much would it raise the capacity? What other models would you recommend if we plan to increase the number of concurrent calls by a factor of 4 or even 8 compared to what we have now (so up to 48000 pps and 80 Mbit/s)?

Here is what we tried:

- ip route-cache same-interface does not seem to improve anything

- ip flow ingress on or off makes no difference

- disabling the inbound ACL on fa0/0 seems to reduce load by 10%, although I don't understand why: a very high percentage of the load is interrupt-level CPU, and ACLs are process switched, aren't they?

- we tried following the Cisco guide for high CPU due to high interrupts, with no success

Here are some usage statistics:

The graphs that we plot via SNMP show that CPU usage grows proportionally with bandwidth (and thus with pps).

At the highest loads, we had a bit more than 55% CPU utilization with more than 50% interrupt CPU.

CPU utilization for five seconds: 36%/30%; one minute: 30%; five minutes: 30%
 PID Runtime(ms)     Invoked      uSecs   5Sec   1Min   5Min TTY Process
 127       13140         954      13773  2.00%  0.29%  0.07% 194 SSH Process
   5    12224616     1351520       9045  1.83%  0.24%  0.14%   0 Check heaps
  96       39576  1126121145          0  0.71%  0.79%  0.77%   0 Ethernet Msec Ti
 123    12640516    76929600        164  0.71%  0.37%  0.31%   0 IP Input
 289       55344   553370656          0  0.55%  0.58%  0.58%   0 HSRP Common
 119       31088   276873591          0  0.15%  0.15%  0.15%   0 IPAM Manager
   2       14348     1773731          8  0.07%  0.04%  0.02%   0 Load Meter
  64        5436     8869897          0  0.07%  0.05%  0.07%   0 Per-Second Jobs
 302       78156      649346        120  0.07%  0.02%  0.00%   0 OSPF-1 Hello
 173       15152     9286599          1  0.07%  0.07%  0.07%   0 CEF: IPv4 proces
 290     4956252    42408070        116  0.07%  0.07%  0.07%   0 HSRP IPv4
  29     4650380    13370758        347  0.07%  0.07%  0.07%   0 ARP Input
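
For readers unfamiliar with the header line of this output: in the pair `36%/30%`, the first figure is total CPU and the second is the share spent at interrupt level (packet switching), so the difference is what the scheduled processes in the table consume. A minimal Python sketch that splits the line posted above (the function name is only illustrative):

```python
import re

def split_cpu_line(line):
    """Return (total, interrupt, process) percentages from the five-second figure."""
    match = re.search(r"five seconds:\s*(\d+)%/(\d+)%", line)
    if match is None:
        raise ValueError("no five-second CPU figure found")
    total = int(match.group(1))      # first number: total CPU
    interrupt = int(match.group(2))  # second number: interrupt-level CPU
    return total, interrupt, total - interrupt

line = "CPU utilization for five seconds: 36%/30%; one minute: 30%; five minutes: 30%"
total, interrupt, process = split_cpu_line(line)
print(f"total={total}% interrupt={interrupt}% process={process}%")
```

With the line above, most of the load sits at interrupt level, which fits the observation that CPU tracks the packet rate.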

show cpu processes history

     222222222333333333333333333333333333333333333333333333222222

      888888888111113333322222555552222200000111110000000000999999

  100                                                            

   90                                                            

   80                                                            

   70                                                            

   60                                                            

   50                                                            

   40                         *****                              

   30 ************************************************************

   20 ************************************************************

   10 ************************************************************

     0....5....1....1....2....2....3....3....4....4....5....5....6

               0    5    0    5    0    5    0    5    0    5    0

               CPU% per second (last 60 seconds)

      333333233333344453323334444444333343344545334444444444499333

      542120935157502208492594271243829746863160630021028354299973

  100                                                        **  

   90                                                        **  

   80                                                        **  

   70                                                        **  

   60                                                        **  

   50                 *        *           * #*#        * *  ##  

   40 *       * ****###*   *######### *****#####* *############**

   30 ############################################################

   20 ############################################################

   10 ############################################################

     0....5....1....1....2....2....3....3....4....4....5....5....6

               0    5    0    5    0    5    0    5    0    5    0

               CPU% per minute (last 60 minutes)

              * = maximum CPU%   # = average CPU%

      94443311111125957994229121111    1  1  11  11 112 1 1 1  11111111211  11

      965102634235109779931399125108979088099218901912192809099500312483309922

  100 *             *  **   *                                                

   90 *             *  **   *                                                

   80 *             * ***   *                                                

   70 *             * ***   *                                                

   60 *             *****   *                                                

   50 ***          ***##*   *                                                

   40 #***         **####*  *                                                

   30 ####**       #######  *                                                

   20 #####**    **#######***** *                     *        *      **     

   10 ########***################***********************************##********

     0....5....1....1....2....2....3....3....4....4....5....5....6....6....7..

               0    5    0    5    0    5    0    5    0    5    0    5    0 

                   CPU% per hour (last 72 hours)

                  * = maximum CPU%   # = average CPU%

show int fa0/0

FastEthernet0/0 is up, line protocol is up

  Hardware is Gt96k FE, address is 0017.95c1.0e0e (bia 0017.95c1.0e0e)

  Description: link to public switch

  Internet address is xxx/30

  MTU 1500 bytes, BW 100000 Kbit/sec, DLY 100 usec,

     reliability 255/255, txload 10/255, rxload 10/255

  Encapsulation 802.1Q Virtual LAN, Vlan ID  1., loopback not set

  Keepalive set (10 sec)

  Full-duplex, 100Mb/s, 100BaseTX/FX

  ARP type: ARPA, ARP Timeout 04:00:00

  Last input 00:00:00, output 00:00:00, output hang never

  Last clearing of "show interface" counters 16:49:43

  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0

  Queueing strategy: fifo

  Output queue: 0/40 (size/max)

  5 minute input rate 4065000 bits/sec, 2261 packets/sec

  5 minute output rate 4055000 bits/sec, 2259 packets/sec

     91018375 packets input, 2710300101 bytes

     Received 318549 broadcasts (0 IP multicasts)

     0 runts, 0 giants, 36 throttles

     55 input errors, 0 CRC, 0 frame, 0 overrun, 55 ignored

     0 watchdog

     0 input packets with dribble condition detected

     90962419 packets output, 2668263330 bytes, 0 underruns

     0 output errors, 0 collisions, 0 interface resets

     2984 unknown protocol drops

     0 babbles, 0 late collision, 0 deferred

     0 lost carrier, 0 no carrier

     0 output buffer failures, 0 output buffers swapped out

show int fa0/0 switching

FastEthernet0/0 link to public switch

          Throttle count        631

                   Drops         RP          0         SP          0

             SPD Flushes       Fast          0        SSE          0

             SPD Aggress       Fast          0

            SPD Priority     Inputs   48039822      Drops          0

    Protocol  IP                 

          Switching path    Pkts In   Chars In   Pkts Out  Chars Out

                 Process   40750060 3474058032   33082509 2096128623

            Cache misses          0          -          -          -

                    Fast  648511240 1245356047  639188320  143520056

               Auton/SSE          0          0          0          0

    Protocol  DEC MOP            

          Switching path    Pkts In   Chars In   Pkts Out  Chars Out

                 Process          0          0       6544     503888

            Cache misses          0          -          -          -

                    Fast          0          0          0          0

               Auton/SSE          0          0          0          0

    Protocol  ARP                

          Switching path    Pkts In   Chars In   Pkts Out  Chars Out

                 Process   10518633  629538628   11718898  703280564

            Cache misses          0          -          -          -

                    Fast          0          0          0          0

               Auton/SSE          0          0          0          0

    Protocol  Other              

          Switching path    Pkts In   Chars In   Pkts Out  Chars Out

                 Process     445965   62435242     886372   53182320

            Cache misses          0          -          -          -

                    Fast          0          0          0          0

               Auton/SSE          0          0          0          0

    NOTE: all counts are cumulative and reset only after a reload.

show cef int fa0/0

FastEthernet0/0 is up (if_number 2)

  Corresponding hwidb fast_if_number 2

  Corresponding hwidb firstsw->if_number 2

  Internet address is xxx/30

  Secondary address xxx/24

  Secondary address xxx.132.2/24

  Secondary address xxx.135.2/24

  ICMP redirects are never sent

  Per packet load-sharing is disabled

  IP unicast RPF check is disabled

  Input features: Stateful Inspection, Ingress-NetFlow, Virtual Fragment Reassembly, Access List, Virtual Fragment Reassembly After IPSec Decryption, NAT Outside

  Output features: Post-routing NAT Outside, Stateful Inspection, Post-Ingress-NetFlow

  IP policy routing is disabled

  BGP based policy accounting on input is disabled

  BGP based policy accounting on output is disabled

  Hardware idb is FastEthernet0/0

  Fast switching type 1, interface type 18

  IP CEF switching enabled

  IP CEF switching turbo vector

  IP prefix lookup IPv4 mtrie 8-8-8-8 optimized

  Input fast flags 0x400061, Output fast flags 0x10100

  ifindex 2(2)

  Slot  Slot unit 0 VC -1

  IP MTU 1500

show ip cef switching statistics

       Reason                          Drop       Punt  Punt2Host
RP LES Packet destined for us             0      45594         52
RP LES No adjacency                    8007          0      52251
RP LES TTL expired                        0          0     397099
RP LES Features                        1028          0       3948
RP LES Neighbor resolution req         5681          0          0
RP LES Total                          14716      45594     453350
All    Total                          14716      45594     453350

Here is our router configuration:

version 15.1

no service pad

service tcp-keepalives-in

service tcp-keepalives-out

service timestamps debug datetime msec localtime show-timezone

service timestamps log datetime msec localtime show-timezone

service password-encryption

service sequence-numbers

!

hostname xxx

!

boot-start-marker

boot-end-marker

!

!

security authentication failure rate 10 log

security passwords min-length 6

no logging buffered

no logging console

no logging monitor

enable secret xxx

!

aaa new-model

!

!

aaa authentication login default local

aaa authorization exec default local

!

!

!

!

!

aaa session-id common

!

memory-size iomem 25

clock timezone EST -4 0

dot11 syslog

no ip source-route

!

!

!

!

!

no ip cef optimize neighbor resolution

ip cef

no ip bootp server

ip domain name xxx

ip name-server 8.8.8.8

ip name-server 8.8.4.4

login block-for 10 attempts 3 within 60

!

multilink bundle-name authenticated

!

!

!

!

!

license xxx

username xxx

username xxx

!

redundancy

!

!

ip ssh time-out 60

ip ssh authentication-retries 2

ip ssh version 2

ip ssh pubkey-chain

  username root

   key-hash ssh-rsa xxx

  quit

!

track 1 interface FastEthernet0/0 line-protocol

!

track 2 interface Tunnel0 line-protocol

!

!

crypto isakmp policy 10

authentication pre-share

crypto isakmp key xxx address xxx

!

!

crypto ipsec transform-set ipsecset esp-3des esp-md5-hmac

!

crypto ipsec profile P1

set transform-set ipsecset

!

!

!

!

!

!

interface Loopback10

description Provider1 supplied loopback address for the B peer

ip address xxx 255.255.255.255

!

interface Tunnel0

ip address 10.254.0.2 255.255.255.252

keepalive 10 3

tunnel source xxx

tunnel mode ipsec ipv4

tunnel destination xxx

tunnel protection ipsec profile P1

!

interface FastEthernet0/0

description link to public switch

ip address xxx 255.255.255.0 secondary

ip address xxx.132.2 255.255.255.0 secondary

ip address xxx.135.2 255.255.255.0 secondary

ip address xxx 255.255.255.252

ip access-group block-invalid-from-outside in

no ip redirects

no ip unreachables

no ip proxy-arp

ip flow ingress

ip nat outside

ip virtual-reassembly in

standby 5 ip xxx.132.1

standby 5 timers 1 3

standby 5 priority 110

standby 5 preempt

standby 5 track 1 decrement 20

standby 10 ip xxx.135.1

standby 10 timers 1 3

standby 10 priority 110

standby 10 preempt

standby 10 track 1 decrement 20

standby 15 ip xxx

standby 15 timers 1 3

standby 15 priority 110

standby 15 preempt

standby 15 track 1 decrement 20

ip route-cache same-interface

duplex auto

speed auto

no cdp enable

no mop enabled

!

interface FastEthernet0/0.1

ip access-group block-invalid-from-outside in

ip flow ingress

no cdp enable

!

interface FastEthernet0/0.101

encapsulation dot1Q 101

ip address xxx 255.255.255.248

ip access-group block-invalid-from-outside in

no ip redirects

no ip unreachables

no ip proxy-arp

ip flow ingress

no cdp enable

!

interface FastEthernet0/1

description link to private switch

no ip address

no ip redirects

no ip unreachables

no ip proxy-arp

duplex auto

speed auto

no mop enabled

!

interface FastEthernet0/1.4

description private network

encapsulation dot1Q 4

ip address 10.0.0.13 255.255.255.0

no ip redirects

no ip unreachables

no ip proxy-arp

ip nat inside

ip virtual-reassembly in

standby 25 ip 10.0.0.1

standby 25 timers 1 3

standby 25 priority 110

standby 25 preempt

standby 25 track 1 decrement 20

standby 25 track 2 decrement 20

ip ospf priority 255

!

interface FastEthernet0/1.15

encapsulation dot1Q 15

ip address 10.0.15.2 255.255.255.0

no ip redirects

no ip unreachables

no ip proxy-arp

standby 20 ip 10.0.15.1

standby 20 timers 1 3

standby 20 priority 110

standby 20 preempt

standby 20 track 1 decrement 20

no cdp enable

!

interface Serial0/1/0

no ip address

shutdown

no fair-queue

!

router ospf 1

router-id 10.0.0.13

network 10.0.0.0 0.0.0.255 area 0

network 10.1.0.0 0.0.0.255 area 0

network 10.254.0.0 0.0.0.255 area 0

!

router bgp xxx

bgp default local-preference 110

bgp log-neighbor-changes

network xxx mask 255.255.255.255

network xxx mask 255.255.255.0

network xxx.132.0

network xxx.135.0

neighbor xxx remote-as xxx

neighbor xxx description Provider's B Peer to Core Router *Loopback*

neighbor xxx ebgp-multihop 5

neighbor xxx update-source Loopback10

neighbor xxx prefix-list only-default-route in

neighbor xxx remote-as xxx

neighbor xxx description Provider's A Peer to BA router

neighbor xxx.132.3 remote-as xxx

neighbor xxx.132.3 next-hop-self

maximum-paths 6

no auto-summary

!

ip forward-protocol nd

!

ip as-path access-list 85 permit ^xxx_[0-9]*$

!

no ip http server

no ip http secure-server

ip nat inside source list privatenat interface FastEthernet0/0 overload

ip route 0.0.0.0 0.0.0.0 xxx 250

ip route 10.1.0.0 255.255.255.0 Tunnel0 250

ip route 10.10.0.0 255.255.255.0 10.0.0.72 250

ip route 10.36.0.0 255.255.0.0 10.0.0.10 250

ip route 10.37.0.0 255.255.0.0 10.0.0.23 250

ip route xxx 255.255.255.0 Null0 250

ip route xxx 255.255.254.0 xxx

ip route 172.31.1.0 255.255.255.0 10.0.0.72 250

ip route xxx 255.255.255.255 xx.132.3

ip route xxx 255.255.255.0 xxx

ip route xxx.132.0 255.255.255.0 Null0 250

ip route xxx.135.0 255.255.255.0 Null0 250

!

ip access-list standard privatenat

permit 10.0.0.0 0.255.255.255

permit 172.16.0.0 0.15.255.255

permit 192.168.0.0 0.0.255.255

ip access-list standard snmp-cacti-only

permit 10.0.0.72

remark restrict to cacti server

permit 10.1.0.69

deny   any log

!

ip access-list extended block-invalid-from-outside

deny   ip 10.0.0.0 0.255.255.255 any

deny   ip 172.16.0.0 0.15.255.255 any

deny   ip 192.168.0.0 0.0.255.255 any

deny   ip 169.254.0.0 0.0.255.255 any

deny   ip 192.0.2.0 0.0.0.255 any

deny   ip any 10.0.0.0 0.255.255.255

deny   ip any 172.16.0.0 0.15.255.255

deny   ip any 192.0.2.0 0.0.0.255

permit ip any any

deny   ip any 192.168.0.0 0.0.255.255

!

!

ip prefix-list 86 seq 5 permit 0.0.0.0/0

!

ip prefix-list only-default-route seq 5 permit 0.0.0.0/0

!

ip prefix-list only-modulis-bgp-range seq 5 permit xxx.132.0/22

ip prefix-list only-modulis-bgp-range seq 10 permit xxx.132.0/24

ip prefix-list only-modulis-bgp-range seq 15 permit xxx.133.0/24

ip prefix-list only-modulis-bgp-range seq 20 permit xxx.132.0/23

ip prefix-list only-modulis-bgp-range seq 25 permit xxx.134.0/24

ip prefix-list only-modulis-bgp-range seq 30 permit xxx.135.0/24

logging esm config

no logging trap

logging facility local2

access-list 99 permit 10.0.0.0 0.0.0.255

access-list 99 permit 10.1.0.0 0.0.0.255

access-list 99 permit 172.31.1.0 0.0.0.255

access-list 101 permit ip 10.0.0.0 0.0.0.255 10.1.0.0 0.0.0.255

no cdp run

!

!

!

snmp-server group modulisgroup v3 auth access snmp-cacti-only

snmp-server group modulisgroup v3 priv access snmp-cacti-only

!

!

!

control-plane

!

!

scheduler allocate 20000 1000

end


8 Replies

paolo bevilacqua
Hall of Fame

CPU usage seems normal.

You can check attached document to decide on a router upgrade.

Thanks.

How come the figures in the document and the real capacity of the router are so far apart (90000 vs 20000 pps)?

Thanks again!

Disclaimer

The Author of this posting offers the information contained within this posting without consideration and with the reader's understanding that there's no implied or expressed suitability or fitness for any purpose. Information provided is for informational purposes only and should not be construed as rendering professional advice of any kind. Usage of this posting's information is solely at reader's own risk.

Liability Disclaimer

In no event shall Author be liable for any damages whatsoever (including, without limitation, damages for loss of use, data or profit) arising out of the use or inability to use the posting's information even if Author has been advised of the possibility of such damage.

Posting

Your mileage might vary.

If you read the performance document's header notes, it explains the performance numbers are for theoretical best case.

In your first question, you mention most of your traffic is CEF switched, but keep in mind that process switching on a 2801 is, according to that document, 30x slower. So even a few process-switched packets will add to your load.
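
To put rough numbers on that point, here is a small Python sketch using the IP "Pkts In" counters from the `show int fa0/0 switching` output in the original post. It is only a back-of-the-envelope model under stated assumptions: the counters are cumulative since reload, the process-switched count includes traffic addressed to the router itself, and the 30x cost factor is simply the ratio quoted above, not a measured value.

```python
# IP "Pkts In" counters from the posted "show int fa0/0 switching" output
PROCESS_PKTS_IN = 40750060
FAST_PKTS_IN = 648511240

# Assumption: a process-switched packet costs 30x a fast-switched one,
# taken from the ratio quoted in this reply.
PROCESS_COST_FACTOR = 30.0

def process_share(process_pkts, fast_pkts):
    """Fraction of packets that took the process-switching path."""
    return process_pkts / (process_pkts + fast_pkts)

def process_cost_share(process_pkts, fast_pkts, cost_factor):
    """Fraction of forwarding work spent on process-switched packets
    if each one costs cost_factor times a fast-switched packet."""
    process_cost = process_pkts * cost_factor
    return process_cost / (process_cost + fast_pkts)

pkt_share = process_share(PROCESS_PKTS_IN, FAST_PKTS_IN)
cost_share = process_cost_share(PROCESS_PKTS_IN, FAST_PKTS_IN, PROCESS_COST_FACTOR)
print(f"process-switched packets: {pkt_share:.1%}")
print(f"share of forwarding work under the 30x assumption: {cost_share:.1%}")
```

Under this simple model a small minority of packets can account for a large part of the forwarding work, which is why the process-switched counters are worth a look even when most traffic is CEF switched.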

To reduce the CPU load, realize that pretty much everything a software router does consumes CPU, so where possible reduce the work needed to accomplish the same results. For instance, can ACEs be resequenced by hit frequency (most frequent first)? Can ACEs be combined?

e.g.:

access-list 99 permit 10.0.0.0 0.0.0.255

access-list 99 permit 10.1.0.0 0.0.0.255

vs.

access-list 99 permit 10.0.0.0 0.1.0.255
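
If you want to convince yourself that the combined entry matches exactly the same addresses as the two original ones, the wildcard logic can be checked with a short Python sketch. The helper name `acl_matches` is made up for illustration; it only models the "wildcard bits are don't-care bits" rule, not a full ACL.

```python
import ipaddress

def acl_matches(address, base, wildcard):
    """True if address matches base under an IOS-style wildcard mask.
    Bits set to 1 in the wildcard are ignored in the comparison."""
    addr = int(ipaddress.IPv4Address(address))
    base_int = int(ipaddress.IPv4Address(base))
    care_bits = ~int(ipaddress.IPv4Address(wildcard)) & 0xFFFFFFFF
    return (addr & care_bits) == (base_int & care_bits)

# Combined entry: access-list 99 permit 10.0.0.0 0.1.0.255
for candidate in ("10.0.0.5", "10.1.0.200", "10.2.0.1", "10.0.1.1"):
    print(candidate, acl_matches(candidate, "10.0.0.0", "0.1.0.255"))
```

The wildcard 0.1.0.255 ignores the lowest bit of the second octet and all of the last octet, so only 10.0.0.x and 10.1.0.x match.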

Perhaps Unicast RPF can be used in lieu of your block-invalid-from-outside.

Since you're using BGP, perhaps you want to enable ip tcp path-mtu-discovery so that the BGP TCP sessions can use the largest MSS the path allows.

On your tunnel, you might want to enable ip tcp adjust-mss to minimize the need to fragment TCP packets.

Etc.

For #2, yes, on software based routers, packet size matters.  Generally increasing packet size will increase effective throughput.
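
As a rough illustration of the trade-off, the Python sketch below computes packets per second and IP-layer bandwidth for one RTP stream direction at different packetization intervals. The codec is an assumption (a 64 kbit/s codec such as G.711, which the thread does not name), and Layer 2 overhead such as the 802.1Q Ethernet header is left out.

```python
# IPv4 (20) + UDP (8) + RTP (12) header bytes carried by every packet
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12

def rtp_stream_load(codec_bps, packet_ms):
    """Return (pps, payload_bytes, ip_layer_bps) for one stream direction."""
    pps = 1000 / packet_ms
    payload_bytes = codec_bps / 8 * packet_ms / 1000
    ip_bps = (payload_bytes + IP_UDP_RTP_HEADER_BYTES) * 8 * pps
    return pps, payload_bytes, ip_bps

# Assumption: 64 kbit/s codec; the intervals are common packetization choices.
for packet_ms in (10, 20, 30, 40):
    pps, payload, ip_bps = rtp_stream_load(64000, packet_ms)
    print(f"{packet_ms} ms: {pps:.1f} pps, {payload:.0f} B payload, {ip_bps / 1000:.1f} kbit/s")
```

Longer packetization intervals cut the packet rate (and with it the per-packet CPU work) and also trim header overhead, at the price of more delay and a bigger loss per dropped packet.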

For #3, you can roughly scale your existing performance based on reference sheet Paolo provided.
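
One way to do that rough scaling is a straight linear extrapolation, sketched below in Python. It assumes CPU really does grow proportionally with pps (as the original poster's SNMP graphs suggest), uses the per-direction figures from the first post (about 5800 pps at about 55% CPU, target 48000 pps), and treats the 75% ceiling as an arbitrary headroom choice, not a Cisco recommendation.

```python
def estimated_max_pps(observed_pps, observed_cpu_percent, cpu_ceiling_percent=100.0):
    """Linear extrapolation of the packet rate reachable at the CPU ceiling."""
    return observed_pps * cpu_ceiling_percent / observed_cpu_percent

def capacity_factor_needed(target_pps, observed_pps, observed_cpu_percent,
                           cpu_ceiling_percent=100.0):
    """How many times the current box's capacity the target rate represents."""
    ceiling_pps = estimated_max_pps(observed_pps, observed_cpu_percent, cpu_ceiling_percent)
    return target_pps / ceiling_pps

# Figures from the original post, per direction
observed_pps = 5800
observed_cpu = 55.0
target_pps = 48000

print(f"estimated pps at 100% CPU: {estimated_max_pps(observed_pps, observed_cpu):.0f}")
print(f"factor needed at 100% CPU: {capacity_factor_needed(target_pps, observed_pps, observed_cpu):.2f}")
print(f"factor needed at 75% CPU:  {capacity_factor_needed(target_pps, observed_pps, observed_cpu, 75.0):.2f}")
```

The resulting factor can then be compared against the ratio between platforms in the performance sheet, keeping in mind that those published numbers are best case.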

Nothing indicates that your router has exhausted capacity.

To the contrary, 65 or even 95% CPU usage indicates it can take more traffic.

Thanks to both of you for the info!

Thank you for the nice rating and good luck!


Paolo Bevilacqua wrote:

Nothing indicates that your router has exhausted capacity.

To the contrary, 65 or even 95% CPU usage indicates it can take more traffic.

I agree with Paolo: your posted stats don't show that your router has exhausted its total capacity. Perhaps a more important question is whether it has adequate capacity for your current traffic and for any more of the same.

Your posted CPU history stats generally show a consistent load with few, narrow 99% spikes. For traffic like your SIP/RTP, you would want sufficient "reserve" CPU capacity so that the CPU can keep up with your traffic in close to real time. Another indication that you have sufficient CPU is the absence of ingress queue drops; the Ethernet interface stats you've posted show none.

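Based on the ordering guide, here are the UBE max capacities for ISR Gen 1:

Series                                          Model   UBE max capacity
Cisco 2800 Series Integrated Services Routers   2801    55
Cisco 2800 Series Integrated Services Routers   2811    110
Cisco 2800 Series Integrated Services Routers   2821    200
Cisco 2800 Series Integrated Services Routers   2851    225
Cisco 3800 Series Integrated Services Routers   3825    400
Cisco 3800 Series Integrated Services Routers   3845    500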


So regardless of BGP tuning, you have hit SIP capacity for Cisco gateways. I have found when doing a lot on the UBE normalization side, you want to stay under 75% of those figures for high volume gateways. Remember, these figures were based on UBE gateways only doing voice. Add in BGP and other functions....

Cheers,

Pete
