11-27-2012 09:54 AM - edited 03-04-2019 06:15 PM
Hello,
First of all, thanks for your help in advance! I know that this topic has come up quite a bit, but it would be nice to get your opinion/help with this issue.
We installed a solution with 2 Cisco 2801, BGP multihomed failover.
1) The router that currently carries all the traffic reaches 55% to 60% CPU usage when handling 40 SIP/RTP streams. This equals 10 Mbit/s up and 10 Mbit/s down, and it showed around 5800 packets TX and around 5800 packets RX per second, the majority of them CEF switched. As those figures are far below the performance figures published by Cisco, we wonder whether we made a mistake in setting up our router, or whether we can do something to improve the router setup.
2) Does it have an impact on router performance if we increase/decrease RTP packet size, thus increasing or decreasing the pps relative to the consumed bandwidth?
3) If it is not possible to improve the router configuration, we also wonder about possible replacement units for those routers. Would a 2901 do a good job? By how much would it raise the capacity? What other models would you recommend if we plan to raise the number of concurrent calls by a factor of 4 or even 8 compared with what we have now (so up to 48000 pps and 80 Mbit/s)?
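To put the figures from questions 1 and 3 in per-stream terms, here is a rough sketch in Python. It uses only numbers from the post (40 streams, about 5800 packets per second and 10 Mbit/s in each direction). The variable names are illustrative, and the average packet size is derived from those rounded figures, so treat it as an estimate rather than a measurement.

```python
# Rough per-stream arithmetic from the figures quoted in the post.
streams = 40
pps_per_direction = 5800          # packets per second, TX or RX
bits_per_direction = 10_000_000   # 10 Mbit/s, TX or RX

pps_per_stream = pps_per_direction / streams
avg_packet_bytes = bits_per_direction / 8 / pps_per_direction

def scaled(factor):
    """Return (pps, Mbit/s) per direction if the load grows by `factor`."""
    return pps_per_direction * factor, bits_per_direction * factor / 1e6

print(f"{pps_per_stream:.0f} pps per stream, ~{avg_packet_bytes:.0f} bytes per packet")
for factor in (4, 8):
    pps, mbit = scaled(factor)
    print(f"x{factor}: {pps} pps and {mbit:.0f} Mbit/s per direction")
```

At a factor of 8 this gives 46400 pps per direction, which is in line with the "up to 48000 pps and 80Mbit" estimate in question 3.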
Here is what we tried:
- ip route-cache same-interface does not seem to improve anything
- ip flow ingress on or off makes no difference
- disabling the inbound ACL on fa0/0 seems to reduce load by 10%, although I don't understand why: a very high percentage is CPU interrupts, and ACLs are process switched, aren't they?
- we tried following the Cisco guide for high CPU due to high interrupts, with no success
Here are some usage statistics:
The graphs that we plot via SNMP show a proportional increase of CPU and bandwidth (and thus pps).
At the highest loads, we had a bit more than 55% CPU utilization with more than 50% interrupt CPU.
CPU utilization for five seconds: 36%/30%; one minute: 30%; five minutes: 30%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
127 13140 954 13773 2.00% 0.29% 0.07% 194 SSH Process
5 12224616 1351520 9045 1.83% 0.24% 0.14% 0 Check heaps
96 39576 1126121145 0 0.71% 0.79% 0.77% 0 Ethernet Msec Ti
123 12640516 76929600 164 0.71% 0.37% 0.31% 0 IP Input
289 55344 553370656 0 0.55% 0.58% 0.58% 0 HSRP Common
119 31088 276873591 0 0.15% 0.15% 0.15% 0 IPAM Manager
2 14348 1773731 8 0.07% 0.04% 0.02% 0 Load Meter
64 5436 8869897 0 0.07% 0.05% 0.07% 0 Per-Second Jobs
302 78156 649346 120 0.07% 0.02% 0.00% 0 OSPF-1 Hello
173 15152 9286599 1 0.07% 0.07% 0.07% 0 CEF: IPv4 proces
290 4956252 42408070 116 0.07% 0.07% 0.07% 0 HSRP IPv4
29 4650380 13370758 347 0.07% 0.07% 0.07% 0 ARP Input
show cpu processes history
222222222333333333333333333333333333333333333333333333222222
888888888111113333322222555552222200000111110000000000999999
100
90
80
70
60
50
40 *****
30 ************************************************************
20 ************************************************************
10 ************************************************************
0....5....1....1....2....2....3....3....4....4....5....5....6
0 5 0 5 0 5 0 5 0 5 0
CPU% per second (last 60 seconds)
333333233333344453323334444444333343344545334444444444499333
542120935157502208492594271243829746863160630021028354299973
100 **
90 **
80 **
70 **
60 **
50 * * * #*# * * ##
40 * * ****###* *######### *****#####* *############**
30 ############################################################
20 ############################################################
10 ############################################################
0....5....1....1....2....2....3....3....4....4....5....5....6
0 5 0 5 0 5 0 5 0 5 0
CPU% per minute (last 60 minutes)
* = maximum CPU% # = average CPU%
94443311111125957994229121111 1 1 11 11 112 1 1 1 11111111211 11
965102634235109779931399125108979088099218901912192809099500312483309922
100 * * ** *
90 * * ** *
80 * * *** *
70 * * *** *
60 * ***** *
50 *** ***##* *
40 #*** **####* *
30 ####** ####### *
20 #####** **#######***** * * * **
10 ########***################***********************************##********
0....5....1....1....2....2....3....3....4....4....5....5....6....6....7..
0 5 0 5 0 5 0 5 0 5 0 5 0
CPU% per hour (last 72 hours)
* = maximum CPU% # = average CPU%
show int fa0/0
FastEthernet0/0 is up, line protocol is up
Hardware is Gt96k FE, address is 0017.95c1.0e0e (bia 0017.95c1.0e0e)
Description: link to public switch
Internet address is xxx/30
MTU 1500 bytes, BW 100000 Kbit/sec, DLY 100 usec,
reliability 255/255, txload 10/255, rxload 10/255
Encapsulation 802.1Q Virtual LAN, Vlan ID 1., loopback not set
Keepalive set (10 sec)
Full-duplex, 100Mb/s, 100BaseTX/FX
ARP type: ARPA, ARP Timeout 04:00:00
Last input 00:00:00, output 00:00:00, output hang never
Last clearing of "show interface" counters 16:49:43
Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
Queueing strategy: fifo
Output queue: 0/40 (size/max)
5 minute input rate 4065000 bits/sec, 2261 packets/sec
5 minute output rate 4055000 bits/sec, 2259 packets/sec
91018375 packets input, 2710300101 bytes
Received 318549 broadcasts (0 IP multicasts)
0 runts, 0 giants, 36 throttles
55 input errors, 0 CRC, 0 frame, 0 overrun, 55 ignored
0 watchdog
0 input packets with dribble condition detected
90962419 packets output, 2668263330 bytes, 0 underruns
0 output errors, 0 collisions, 0 interface resets
2984 unknown protocol drops
0 babbles, 0 late collision, 0 deferred
0 lost carrier, 0 no carrier
0 output buffer failures, 0 output buffers swapped out
show int fa0/0 switching
FastEthernet0/0 link to public switch
Throttle count 631
Drops RP 0 SP 0
SPD Flushes Fast 0 SSE 0
SPD Aggress Fast 0
SPD Priority Inputs 48039822 Drops 0
Protocol IP
Switching path Pkts In Chars In Pkts Out Chars Out
Process 40750060 3474058032 33082509 2096128623
Cache misses 0 - - -
Fast 648511240 1245356047 639188320 143520056
Auton/SSE 0 0 0 0
Protocol DEC MOP
Switching path Pkts In Chars In Pkts Out Chars Out
Process 0 0 6544 503888
Cache misses 0 - - -
Fast 0 0 0 0
Auton/SSE 0 0 0 0
Protocol ARP
Switching path Pkts In Chars In Pkts Out Chars Out
Process 10518633 629538628 11718898 703280564
Cache misses 0 - - -
Fast 0 0 0 0
Auton/SSE 0 0 0 0
Protocol Other
Switching path Pkts In Chars In Pkts Out Chars Out
Process 445965 62435242 886372 53182320
Cache misses 0 - - -
Fast 0 0 0 0
Auton/SSE 0 0 0 0
NOTE: all counts are cumulative and reset only after a reload.
show cef int fa0/0
FastEthernet0/0 is up (if_number 2)
Corresponding hwidb fast_if_number 2
Corresponding hwidb firstsw->if_number 2
Internet address is xxx/30
Secondary address xxx/24
Secondary address xxx.132.2/24
Secondary address xxx.135.2/24
ICMP redirects are never sent
Per packet load-sharing is disabled
IP unicast RPF check is disabled
Input features: Stateful Inspection, Ingress-NetFlow, Virtual Fragment Reassembly, Access List, Virtual Fragment Reassembly After IPSec Decryption, NAT Outside
Output features: Post-routing NAT Outside, Stateful Inspection, Post-Ingress-NetFlow
IP policy routing is disabled
BGP based policy accounting on input is disabled
BGP based policy accounting on output is disabled
Hardware idb is FastEthernet0/0
Fast switching type 1, interface type 18
IP CEF switching enabled
IP CEF switching turbo vector
IP prefix lookup IPv4 mtrie 8-8-8-8 optimized
Input fast flags 0x400061, Output fast flags 0x10100
ifindex 2(2)
Slot Slot unit 0 VC -1
IP MTU 1500
show ip cef switching statistics
Reason Drop Punt Punt2Host
RP LES Packet destined for us 0 45594 52
RP LES No adjacency 8007 0 52251
RP LES TTL expired 0 0 397099
RP LES Features 1028 0 3948
RP LES Neighbor resolution req 5681 0 0
RP LES Total 14716 45594 453350
All Total 14716 45594 453350
Here is our router configuration:
version 15.1
no service pad
service tcp-keepalives-in
service tcp-keepalives-out
service timestamps debug datetime msec localtime show-timezone
service timestamps log datetime msec localtime show-timezone
service password-encryption
service sequence-numbers
!
hostname xxx
!
boot-start-marker
boot-end-marker
!
!
security authentication failure rate 10 log
security passwords min-length 6
no logging buffered
no logging console
no logging monitor
enable secret xxx
!
aaa new-model
!
!
aaa authentication login default local
aaa authorization exec default local
!
!
!
!
!
aaa session-id common
!
memory-size iomem 25
clock timezone EST -4 0
dot11 syslog
no ip source-route
!
!
!
!
!
no ip cef optimize neighbor resolution
ip cef
no ip bootp server
ip domain name xxx
ip name-server 8.8.8.8
ip name-server 8.8.4.4
login block-for 10 attempts 3 within 60
!
multilink bundle-name authenticated
!
!
!
!
!
license xxx
username xxx
username xxx
!
redundancy
!
!
ip ssh time-out 60
ip ssh authentication-retries 2
ip ssh version 2
ip ssh pubkey-chain
username root
key-hash ssh-rsa xxx
quit
!
track 1 interface FastEthernet0/0 line-protocol
!
track 2 interface Tunnel0 line-protocol
!
!
crypto isakmp policy 10
authentication pre-share
crypto isakmp key xxx address xxx
!
!
crypto ipsec transform-set ipsecset esp-3des esp-md5-hmac
!
crypto ipsec profile P1
set transform-set ipsecset
!
!
!
!
!
!
interface Loopback10
description Provider1 supplied loopback address for the B peer
ip address xxx 255.255.255.255
!
interface Tunnel0
ip address 10.254.0.2 255.255.255.252
keepalive 10 3
tunnel source xxx
tunnel mode ipsec ipv4
tunnel destination xxx
tunnel protection ipsec profile P1
!
interface FastEthernet0/0
description link to public switch
ip address xxx 255.255.255.0 secondary
ip address xxx.132.2 255.255.255.0 secondary
ip address xxx.135.2 255.255.255.0 secondary
ip address xxx 255.255.255.252
ip access-group block-invalid-from-outside in
no ip redirects
no ip unreachables
no ip proxy-arp
ip flow ingress
ip nat outside
ip virtual-reassembly in
standby 5 ip xxx.132.1
standby 5 timers 1 3
standby 5 priority 110
standby 5 preempt
standby 5 track 1 decrement 20
standby 10 ip xxx.135.1
standby 10 timers 1 3
standby 10 priority 110
standby 10 preempt
standby 10 track 1 decrement 20
standby 15 ip xxx
standby 15 timers 1 3
standby 15 priority 110
standby 15 preempt
standby 15 track 1 decrement 20
ip route-cache same-interface
duplex auto
speed auto
no cdp enable
no mop enabled
!
interface FastEthernet0/0.1
ip access-group block-invalid-from-outside in
ip flow ingress
no cdp enable
!
interface FastEthernet0/0.101
encapsulation dot1Q 101
ip address xxx 255.255.255.248
ip access-group block-invalid-from-outside in
no ip redirects
no ip unreachables
no ip proxy-arp
ip flow ingress
no cdp enable
!
interface FastEthernet0/1
description link to private switch
no ip address
no ip redirects
no ip unreachables
no ip proxy-arp
duplex auto
speed auto
no mop enabled
!
interface FastEthernet0/1.4
description private network
encapsulation dot1Q 4
ip address 10.0.0.13 255.255.255.0
no ip redirects
no ip unreachables
no ip proxy-arp
ip nat inside
ip virtual-reassembly in
standby 25 ip 10.0.0.1
standby 25 timers 1 3
standby 25 priority 110
standby 25 preempt
standby 25 track 1 decrement 20
standby 25 track 2 decrement 20
ip ospf priority 255
!
interface FastEthernet0/1.15
encapsulation dot1Q 15
ip address 10.0.15.2 255.255.255.0
no ip redirects
no ip unreachables
no ip proxy-arp
standby 20 ip 10.0.15.1
standby 20 timers 1 3
standby 20 priority 110
standby 20 preempt
standby 20 track 1 decrement 20
no cdp enable
!
interface Serial0/1/0
no ip address
shutdown
no fair-queue
!
router ospf 1
router-id 10.0.0.13
network 10.0.0.0 0.0.0.255 area 0
network 10.1.0.0 0.0.0.255 area 0
network 10.254.0.0 0.0.0.255 area 0
!
router bgp xxx
bgp default local-preference 110
bgp log-neighbor-changes
network xxx mask 255.255.255.255
network xxx mask 255.255.255.0
network xxx.132.0
network xxx.135.0
neighbor xxx remote-as xxx
neighbor xxx description Provider's B Peer to Core Router *Loopback*
neighbor xxx ebgp-multihop 5
neighbor xxx update-source Loopback10
neighbor xxx prefix-list only-default-route in
neighbor xxx remote-as xxx
neighbor xxx description Provider's A Peer to BA router
neighbor xxx.132.3 remote-as xxx
neighbor xxx.132.3 next-hop-self
maximum-paths 6
no auto-summary
!
ip forward-protocol nd
!
ip as-path access-list 85 permit ^xxx_[0-9]*$
!
no ip http server
no ip http secure-server
ip nat inside source list privatenat interface FastEthernet0/0 overload
ip route 0.0.0.0 0.0.0.0 xxx 250
ip route 10.1.0.0 255.255.255.0 Tunnel0 250
ip route 10.10.0.0 255.255.255.0 10.0.0.72 250
ip route 10.36.0.0 255.255.0.0 10.0.0.10 250
ip route 10.37.0.0 255.255.0.0 10.0.0.23 250
ip route xxx 255.255.255.0 Null0 250
ip route xxx 255.255.254.0 xxx
ip route 172.31.1.0 255.255.255.0 10.0.0.72 250
ip route xxx 255.255.255.255 xx.132.3
ip route xxx 255.255.255.0 xxx
ip route xxx.132.0 255.255.255.0 Null0 250
ip route xxx.135.0 255.255.255.0 Null0 250
!
ip access-list standard privatenat
permit 10.0.0.0 0.255.255.255
permit 172.16.0.0 0.15.255.255
permit 192.168.0.0 0.0.255.255
ip access-list standard snmp-cacti-only
permit 10.0.0.72
remark restrict to cacti server
permit 10.1.0.69
deny any log
!
ip access-list extended block-invalid-from-outside
deny ip 10.0.0.0 0.255.255.255 any
deny ip 172.16.0.0 0.15.255.255 any
deny ip 192.168.0.0 0.0.255.255 any
deny ip 169.254.0.0 0.0.255.255 any
deny ip 192.0.2.0 0.0.0.255 any
deny ip any 10.0.0.0 0.255.255.255
deny ip any 172.16.0.0 0.15.255.255
deny ip any 192.0.2.0 0.0.0.255
permit ip any any
deny ip any 192.168.0.0 0.0.255.255
!
!
ip prefix-list 86 seq 5 permit 0.0.0.0/0
!
ip prefix-list only-default-route seq 5 permit 0.0.0.0/0
!
ip prefix-list only-modulis-bgp-range seq 5 permit xxx.132.0/22
ip prefix-list only-modulis-bgp-range seq 10 permit xxx.132.0/24
ip prefix-list only-modulis-bgp-range seq 15 permit xxx.133.0/24
ip prefix-list only-modulis-bgp-range seq 20 permit xxx.132.0/23
ip prefix-list only-modulis-bgp-range seq 25 permit xxx.134.0/24
ip prefix-list only-modulis-bgp-range seq 30 permit xxx.135.0/24
logging esm config
no logging trap
logging facility local2
access-list 99 permit 10.0.0.0 0.0.0.255
access-list 99 permit 10.1.0.0 0.0.0.255
access-list 99 permit 172.31.1.0 0.0.0.255
access-list 101 permit ip 10.0.0.0 0.0.0.255 10.1.0.0 0.0.0.255
no cdp run
!
!
!
snmp-server group modulisgroup v3 auth access snmp-cacti-only
snmp-server group modulisgroup v3 priv access snmp-cacti-only
!
!
!
control-plane
!
!
scheduler allocate 20000 1000
end
11-28-2012 04:11 AM
CPU usage seems normal.
You can check attached document to decide on a router upgrade.
11-28-2012 10:27 AM
Disclaimer
The Author of this posting offers the information contained within this posting without consideration and with the reader's understanding that there's no implied or expressed suitability or fitness for any purpose. Information provided is for informational purposes only and should not be construed as rendering professional advice of any kind. Usage of this posting's information is solely at reader's own risk.
Liability Disclaimer
In no event shall Author be liable for any damages whatsoever (including, without limitation, damages for loss of use, data or profit) arising out of the use or inability to use the posting's information even if Author has been advised of the possibility of such damage.
Posting
Your mileage might vary.
If you read the performance document's header notes, it explains the performance numbers are for theoretical best case.
In your first question, you mention most of your traffic is CEF switched, but keep in mind process switching, on a 2801, according to that document, is 30x slower. So even a few process switched packets will add to your load.
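As a rough illustration of that point, the sketch below takes the IP "Pkts In" counters from the "show int fa0/0 switching" output in the question and weights process-switched packets by the 30x figure mentioned above. The cost model is an assumption made for illustration, and the counters are cumulative since the last reload and include traffic addressed to the router itself, so the result shows the reasoning and is not a measurement of the current load.

```python
# Counters copied from "show int fa0/0 switching" (Protocol IP, Pkts In).
process_pkts = 40_750_060
fast_pkts = 648_511_240

SLOWDOWN = 30  # assumption: per-packet cost ratio taken from the reply above

process_fraction = process_pkts / (process_pkts + fast_pkts)

def process_cost_share(fraction, slowdown=SLOWDOWN):
    """Share of forwarding cost spent on process-switched packets,
    assuming each one costs `slowdown` times a fast-switched packet."""
    slow = fraction * slowdown
    return slow / (slow + (1 - fraction))

print(f"process switched: {process_fraction:.1%}")
print(f"estimated share of forwarding cost: {process_cost_share(process_fraction):.0%}")
```

Under this simple model, a process-switched share of roughly 6% of packets would account for well over half of the forwarding cost, which is why reducing punts can matter even when most traffic is CEF switched.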
To reduce the load, realize that pretty much everything a software router does consumes CPU, so where possible decrease the work needed to accomplish the same results. For instance, can ACEs be resequenced in order of hit occurrence (most frequent first)? Can ACEs be combined?
e.g.:
access-list 99 permit 10.0.0.0 0.0.0.255
access-list 99 permit 10.1.0.0 0.0.0.255
vs.
access-list 99 permit 10.0.0.0 0.1.0.255
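To check that the combined entry really permits the same addresses as the two separate ones, here is a small Python sketch of wildcard-mask matching. The helper names are made up for this example; the matching rule (bits set in the wildcard are ignored) is the usual IOS wildcard semantics.

```python
import ipaddress

def ip(text):
    """Convert dotted-quad text to an integer."""
    return int(ipaddress.IPv4Address(text))

def wildcard_match(addr, base, wildcard):
    """True if addr equals base on every bit NOT set in the wildcard."""
    return ((addr ^ base) & ~wildcard & 0xFFFFFFFF) == 0

TWO_ENTRIES = [(ip("10.0.0.0"), ip("0.0.0.255")), (ip("10.1.0.0"), ip("0.0.0.255"))]
COMBINED = (ip("10.0.0.0"), ip("0.1.0.255"))

def permitted_by_two(addr):
    return any(wildcard_match(addr, b, w) for b, w in TWO_ENTRIES)

def permitted_by_combined(addr):
    return wildcard_match(addr, *COMBINED)

# Compare both forms over 10.0.0.0/14, which covers both ranges and their neighbours.
net = ipaddress.ip_network("10.0.0.0/14")
first = int(net.network_address)
equivalent = all(
    permitted_by_two(a) == permitted_by_combined(a)
    for a in range(first, first + net.num_addresses)
)
print(equivalent)
```

The combined wildcard 0.1.0.255 ignores the lowest bit of the second octet and the whole last octet, so it matches 10.0.0.x and 10.1.0.x with a single entry.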
Perhaps Unicast RPF can be used in lieu of your block-invalid-from-outside.
Since you're using BGP, perhaps you want to enable ip tcp path-mtu-discovery to enable max MTU between BGP peers.
On your tunnel, you might want to enable TCP MSS adjustment (ip tcp adjust-mss) to minimize the need to fragment TCP packets.
Etc.
For #2, yes, on software based routers, packet size matters. Generally increasing packet size will increase effective throughput.
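To make the packet size point concrete, the sketch below shows how the RTP packetization interval changes packets per second and bandwidth for one direction of one stream. The codec (G.711 at 64 kbit/s) and the 40 bytes of IPv4 + UDP + RTP headers are assumptions, since the thread does not say which codec is in use; layer 2 overhead is ignored.

```python
# Assumptions (not from the thread): G.711 at 64 kbit/s, and 40 bytes of
# IPv4 + UDP + RTP headers per packet; layer 2 overhead is ignored.
CODEC_BITRATE = 64_000      # bits per second of voice payload
HEADER_BYTES = 20 + 8 + 12  # IPv4 + UDP + RTP

def rtp_stream(ptime_ms):
    """Return (pps, packet_bytes, kbit/s) for one direction of one stream."""
    pps = 1000 / ptime_ms
    payload_bytes = CODEC_BITRATE / 8 * ptime_ms / 1000
    packet_bytes = payload_bytes + HEADER_BYTES
    kbps = pps * packet_bytes * 8 / 1000
    return pps, packet_bytes, kbps

for ptime in (10, 20, 30, 40):
    pps, size, kbps = rtp_stream(ptime)
    print(f"{ptime} ms: {pps:.1f} pps, {size:.0f} bytes, {kbps:.1f} kbit/s")
```

Under these assumptions, going from 20 ms to 40 ms halves the packet rate (50 to 25 pps) while the bandwidth only drops from 80 to 72 kbit/s. The trade-off is more delay per packet and more audio lost with each dropped packet.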
For #3, you can roughly scale your existing performance based on the reference sheet Paolo provided.
11-28-2012 07:31 AM
Thanks.
How come the figures in the document and the real capacity of the router are so far apart (90000 vs 20000 pps)?
Thanks again!
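The gap asked about here can be put in numbers with a simple linear extrapolation. The sketch below assumes that CPU grows in proportion to the packet rate (the SNMP graphs in the original post suggest this) and that the 5800 RX and 5800 TX packets per second are separate packets through the router. Both are assumptions, and the function name is made up for this example.

```python
# Figures from the thread; the linear model is an assumption.
published_pps = 90_000          # document figure quoted above
observed_pps = 5800 + 5800      # RX + TX packets per second at peak
observed_cpu = 0.55             # lower end of the 55% to 60% range

def pps_at_full_cpu(pps, cpu):
    """Linear extrapolation of packet rate to 100% CPU."""
    return pps / cpu

estimate = pps_at_full_cpu(observed_pps, observed_cpu)
print(f"extrapolated: {estimate:.0f} pps, {estimate / published_pps:.0%} of published")
```

This lands at roughly 21000 pps, the same range as the 20000 pps mentioned in the question, or a bit under a quarter of the published figure.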
11-28-2012 02:29 PM
Nothing indicates that your router has exhausted capacity.
To the contrary, 65 or even 95% CPU usage indicates it can take more traffic.
11-28-2012 04:54 PM
Thanks to both of you for the info!
11-29-2012 03:42 AM
Thank you for the nice rating and good luck!
11-29-2012 10:12 AM
Paolo Bevilacqua wrote:
Nothing indicates that your router has exhausted capacity.
To the contrary, 65 or even 95% CPU usage indicates it can take more traffic.
I agree with Paolo: your posted stats don't show that your router has exhausted its total capacity. Perhaps a more important question is whether it has adequate capacity for your current traffic and for any more of the same.
Your posted CPU history stats generally show a consistent load with few and narrow 99% spikes. For traffic like your SIP/RTP, you want sufficient "reserve" CPU capacity so that the CPU can keep up with your traffic in close to real time. Another indication that you have sufficient CPU is the lack of any ingress queue drops; there are none in the stats you posted for the Ethernet interface.
12-04-2012 04:32 PM
Based on the ordering guide, here are the UBE max capacities for ISR Gen 1:
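Series | Model | UBE max capacity
Cisco 2800 Series Integrated Services Routers | 2801 | 55
Cisco 2800 Series Integrated Services Routers | 2811 | 110
Cisco 2800 Series Integrated Services Routers | 2821 | 200
Cisco 2800 Series Integrated Services Routers | 2851 | 225
Cisco 3800 Series Integrated Services Routers | 3825 | 400
Cisco 3800 Series Integrated Services Routers | 3845 | 500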
So regardless of BGP tuning, you have hit SIP capacity for Cisco gateways. I have found when doing a lot on the UBE normalization side, you want to stay under 75% of those figures for high volume gateways. Remember, these figures were based on UBE gateways only doing voice. Add in BGP and other functions....
Cheers,
Pete
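Applying the 75% guideline to the capacities listed above can be sketched as follows. The sketch assumes that the 40 SIP/RTP streams from the original question count as 40 sessions in the sense of the ordering guide, which the thread does not confirm; the function name is made up for this example.

```python
# Capacities from the ordering-guide figures quoted above.
UBE_MAX = {"2801": 55, "2811": 110, "2821": 200, "2851": 225,
           "3825": 400, "3845": 500}
HEADROOM = 0.75  # stay under 75% of the maximum, per the post

def smallest_model(sessions):
    """Smallest listed model whose 75% budget covers `sessions`, or None."""
    for model, cap in sorted(UBE_MAX.items(), key=lambda kv: kv[1]):
        if sessions <= cap * HEADROOM:
            return model
    return None

current = 40  # assumption: the 40 SIP/RTP streams count as 40 sessions
for factor in (1, 4, 8):
    print(factor, current * factor, smallest_model(current * factor))
```

With these assumptions the current load sits just inside the 2801's 75% budget (41.25), four times the load points to a 2851, and eight times the load points to a 3845. This only considers the listed voice capacities, not BGP or the other functions on the router.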