10-25-2015 06:23 AM - edited 03-05-2019 02:35 AM
Hello,
recently I installed a 2811 running 15.1(4)M10. Its primary function is to provide static IPv4 addresses to dynamic endpoints; I'm using a stock DMVPN config without crypto for that purpose. With approx. 5 Mbit/s outgoing over the DMVPN tunnel interface, the CPU is busy 98% of the time and IP Input is the top CPU hog. As you can see, only 34% of the CPU is spent on interrupt-switching packets.
show proc cpu sorted 5min
CPU utilization for five seconds: 98%/34%; one minute: 98%; five minutes: 97%
 PID Runtime(ms)     Invoked      uSecs   5Sec   1Min   5Min TTY Process
 138  1562991564   404043710       3868 55.71% 55.40% 55.63%   0 IP Input
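For reference, the hub tunnel follows the usual crypto-less DMVPN pattern, roughly like this (addresses and the NHRP network-id are placeholders, not my exact config):

interface Tunnel6
 description Tunnel-Wolke 2
 ip address <hub-tunnel-ip> <mask>
 no ip redirects
 ip nhrp map multicast dynamic
 ip nhrp network-id <id>
 tunnel source FastEthernet0/0
 tunnel mode gre multipoint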
I tried to debug the cause but I'm running out of ideas.
show interface stats

FastEthernet0/0
          Switching path    Pkts In   Chars In   Pkts Out  Chars Out
               Processor   82854343 2835637802   41595604 3625150655
             Route cache  108673117 4258033954  115701696  306401071
                   Total  191527460 2798704460  157297300 3931551726

Tunnel6
          Switching path    Pkts In   Chars In   Pkts Out  Chars Out
               Processor   35423466 1297452269     164814   21452440
             Route cache   26199025 3188431356   58412283 2399557371
                   Total   61622491  190916329   58577097 2421009811
Fa0/0 is the main interface where all traffic runs in and out. As you can see, many input packets are process switched.
Tunnel6 is the DMVPN hub interface. Please note the really high share of process-switched Pkts In compared to Pkts Out.
show cef int

FastEthernet0/0 is up (if_number 3)
  Corresponding hwidb fast_if_number 3
  Corresponding hwidb firstsw->if_number 3
  Internet address is <cut>
  Secondary address <cut>
  ICMP redirects are always sent
  Per packet load-sharing is disabled
  IP unicast RPF check is disabled
  Input features: Stateful Inspection, Virtual Fragment Reassembly,
                  Virtual Fragment Reassembly After IPSec Decryption, NAT Outside
  Output features: Post-routing NAT Outside, Stateful Inspection
  IP policy routing is disabled
  BGP based policy accounting on input is disabled
  BGP based policy accounting on output is disabled
  IPv6 CEF switching enabled
  Hardware idb is FastEthernet0/0
  Fast switching type 1, interface type 18
  IP CEF switching enabled
  IP CEF switching turbo vector
  IP prefix lookup IPv4 mtrie 8-8-8-8 optimized
  Input fast flags 0x400040, Output fast flags 0x10100
  ifindex 3(3)
  Slot  Slot unit 0  VC -1
  IP MTU 1500

Tunnel6 is up (if_number 8)
  Corresponding hwidb fast_if_number 8
  Corresponding hwidb firstsw->if_number 8
  Internet address is <cut>
  ICMP redirects are never sent
  Per packet load-sharing is disabled
  IP unicast RPF check is disabled
  IP policy routing is disabled
  BGP based policy accounting on input is disabled
  BGP based policy accounting on output is disabled
  Interface is marked as tunnel interface
  Hardware idb is Tunnel6
  Fast switching type 14, interface type 0
  IP CEF switching enabled
  IP CEF switching turbo vector
  IP Null turbo vector
  IP prefix lookup IPv4 mtrie 8-8-8-8 optimized
  Input fast flags 0x0, Output fast flags 0x0
  ifindex 8(8)
  Slot  Slot unit -1  VC -1
  IP MTU 1476
The only notable difference between the two is the fast flags, but I don't know whether they are relevant here.
Note that the following stats are only reset by a reload; uptime is 22 weeks, 2 days, 11 hours, 19 minutes.
show int switching

FastEthernet0/0
          Throttle count       5692
        Drops         RP      26171         SP          0
  SPD Flushes       Fast          0        SSE          0
  SPD Aggress       Fast          0
 SPD Priority     Inputs   78081963      Drops          0

     Protocol IP
      Switching path    Pkts In   Chars In   Pkts Out  Chars Out
             Process 1679494616 4243398782  818616164 4214848478
        Cache misses          0          -          -          -
                Fast 3414455563  435028419 3669552799 3141577463
           Auton/SSE          0          0          0          0

     Protocol DEC MOP
      Switching path    Pkts In   Chars In   Pkts Out  Chars Out
             Process          0          0      22461    1729497
        Cache misses          0          -          -          -
                Fast          0          0          0          0
           Auton/SSE          0          0          0          0

     Protocol ARP
      Switching path    Pkts In   Chars In   Pkts Out  Chars Out
             Process   62261029 3735661740     119167    7150020
        Cache misses          0          -          -          -
                Fast          0          0          0          0
           Auton/SSE          0          0          0          0

     Protocol CDP
      Switching path    Pkts In   Chars In   Pkts Out  Chars Out
             Process     224812  106560888     250320  110391120
        Cache misses          0          -          -          -
                Fast          0          0          0          0
           Auton/SSE          0          0          0          0

     Protocol IPv6
      Switching path    Pkts In   Chars In   Pkts Out  Chars Out
             Process    7814309 1563957300   65623186 1184106846
        Cache misses          0          -          -          -
                Fast    71908574 3189895435   30888012 2718155351
           Auton/SSE          0          0          0          0

     Protocol Other
      Switching path    Pkts In   Chars In   Pkts Out  Chars Out
             Process      10899     653940    1350462   81027720
        Cache misses          0          -          -          -
                Fast          0          0          0          0
           Auton/SSE          0          0          0          0

Tunnel6 Tunnel-Wolke 2
          Throttle count          0
        Drops         RP    6251973         SP          0
  SPD Flushes       Fast          0        SSE          0
  SPD Aggress       Fast          0
 SPD Priority     Inputs          0      Drops          0

     Protocol IP
      Switching path    Pkts In   Chars In   Pkts Out  Chars Out
             Process  755542282  996645000    2921849  374627008
        Cache misses          0          -          -          -
                Fast   582966433 3881995597 1430454894 3257835338
           Auton/SSE          0          0          0          0

     Protocol Other
      Switching path    Pkts In   Chars In   Pkts Out  Chars Out
             Process          0          0    7019105  920205716
        Cache misses          0          -          -          -
                Fast          0          0          0          0
           Auton/SSE          0          0          0          0
Again, there is a high rate of input packets that don't get fast switched. Debugging at the packet level, as recommended in the Cisco documents on troubleshooting high CPU load, isn't really an option since the CPU load is already very high; I once had a very hard time stopping a debug and waited about 10 minutes for the EXEC to get access to the CPU.
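If I do have to go down that road after all, my plan would be to keep the debug off the console and scope it to the tunnel endpoints only, roughly like this (the addresses are placeholders):

no logging console
logging buffered 128000 debugging
access-list 199 permit gre host <spoke-public-ip> host <hub-public-ip>
debug ip packet 199 detail

Since debug ip packet only sees process-switched packets anyway, that would at least show exactly the traffic in question; but as said, even that feels risky at 98% CPU.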
So, the basic question: why are so many input packets process switched instead of fast switched on a GRE tunnel hub interface? (At least that's what I can dig out of the runtime guts.) Checking a 2901 running 15.5(2)T, I see similar counts on the hub tunnel. On a 2951 running 15.0(1)M9, the fast-switched packet counters on the tunnel interfaces are one or two orders of magnitude higher than the process-switched ones.
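In case it helps, these are the low-impact checks I can still run; as far as I know, show cef not-cef-switched was replaced by show ip cef switching statistics on newer releases, so probably only one of the two will be accepted on this image:

show ip interface Tunnel6 | include switching
show cef not-cef-switched
show ip cef switching statistics
show processes cpu history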
Thanks for any hint regarding this topic!
10-25-2015 04:30 PM
Hello,
I would personally suggest trying to find out whether it is the tunneled traffic that is driving the IP Input process up. I am not sure whether you can actually shut down the Tunnel interface for a couple of minutes (it will obviously cause a network outage), but if you could, observing the CPU load would be extremely helpful.
If the IP Input load is related to the traffic passing through the tunnel, then the most probable cause is an MTU issue and the resulting need to fragment packets. According to the outputs you have posted, the Tunnel interface is already using an MTU of 1476, which is just right for plain GRE tunnels (20 extra bytes for the new IP header, 4 bytes for the GRE header, 20+4+1476=1500). However, if the packets passing through this tunnel are longer than 1476 bytes, the router has to fragment them, hence the high IP Input load.
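A fairly low-impact way of checking this theory would be to watch the global IP fragmentation counters while the tunnel is loaded, for example:

show ip traffic

and see whether the "fragmented" and "couldn't fragment" values in the Frags: section keep increasing. The counters are cumulative since the last reload, so it is the delta that matters.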
You have not posted the configuration of the tunnel, but please make sure you are using the ip tcp adjust-mss 1436 (1476-40) command on that interface - this makes all TCP sessions carried by the tunnel negotiate segments no larger than 1436 bytes, which is just right for the tunnel's MTU (TCP header is 20 bytes, IP header is 20 bytes, 20+20+1436=1476).
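On the tunnel interface this would look roughly as follows (the ip mtu line is the GRE default anyway and is shown only for clarity):

interface Tunnel6
 ip mtu 1476
 ip tcp adjust-mss 1436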
This command is unfortunately unable to influence UDP streams. If a lot of heavy UDP traffic is carried by the tunnel (say, video streams or NFS mounts), those UDP datagrams may well be oversized, and a router simply cannot adjust them. If it turns out that your high CPU is indeed caused by excessive IP fragmentation and the oversized packets carry UDP, you will need to tweak the applications that send them (if that is possible at all).
Best regards,
Peter
10-25-2015 11:39 PM
Hello Peter,
thanks for your suggestions. I didn't shut down the tunnel itself, but I shut down the other side, i.e. the tunnel of the peer that is generating/experiencing the traffic. The load immediately drops to nearly zero, as I expected.
I have also thought about MTU issues; thanks for your math on MTU and MSS sizes. I always calculate and set these values carefully. I don't have proof that the high CPU comes from fragmentation, but we are running two 2821s as PPPoE endpoints with a sustained rate of up to 80 Mbit/s from upstream into the PPPoE sessions. Those boxes have to fragment packets regularly and don't suffer from high CPU. For that reason I don't believe fragmentation processing is the root cause.
I know about ip tcp adjust-mss, but I try to omit it in favour of PMTUD. The reason: I read somewhere that adding this parameter forces packets onto the process-switching path. Unfortunately, I didn't save the source of this claim. :-( In any case, adding the parameter and watching the CPU usage history over the last few hours didn't change anything.
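If I want to test that claim properly, I suppose the cleanest way is to compare the switching-path counters on the tunnel over two equal intervals, with and without the command, something like:

show interfaces Tunnel6 stats
(wait a few minutes under comparable load)
show interfaces Tunnel6 stats

and see how much the Processor row moves compared to the Route cache row in each case. Just an idea, not something I have done yet.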
Regarding your recommendation on UDP traffic: I configured ip flow top-talkers, and it shows that mostly TCP traffic is flowing over the tunnel.
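For completeness, the top-talkers setup is nothing fancy, roughly (the numbers are just what I happen to use):

ip flow-top-talkers
 top 10
 sort-by bytes
!
interface Tunnel6
 ip flow ingress

and then show ip flow top-talkers to display the result.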
Please also see my comment about punt adjacencies in the reply to my initial post.