08-01-2023 04:54 AM - edited 08-02-2023 01:54 AM
Hi everyone
I have a 1921 router with a WAN interface where I see a lot of packet drops. I can see that the drops are throttles and overruns, but I do not know how to find out why they happen.
I don't have any CPU issues or anything of the like on this router.
R1#show int gi 0/0
GigabitEthernet0/0 is up, line protocol is up
Hardware is CN Gigabit Ethernet, address is c067.af50.8a40 (bia c067.af50.8a40)
Description: Yderside interface
Internet address is 195.xx.xx.xx/25
MTU 1500 bytes, BW 100000 Kbit/sec, DLY 100 usec,
reliability 255/255, txload 1/255, rxload 1/255
Encapsulation ARPA, loopback not set
Keepalive set (10 sec)
Full Duplex, 100Mbps, media type is RJ45
output flow-control is unsupported, input flow-control is unsupported
ARP type: ARPA, ARP Timeout 04:00:00
Last input 00:00:00, output 00:00:00, output hang never
Last clearing of "show interface" counters never
Input queue: 0/75/2498545/1023776 (size/max/drops/flushes); Total output drops: 74
Queueing strategy: fifo
Output queue: 0/40 (size/max)
5 minute input rate 252000 bits/sec, 147 packets/sec
5 minute output rate 87000 bits/sec, 21 packets/sec
2230357044 packets input, 3275392715 bytes, 1093734 no buffer
Received 530570600 broadcasts (0 IP multicasts)
0 runts, 0 giants, 223188 throttles
633215588 input errors, 0 CRC, 0 frame, 633215588 overrun, 0 ignored
0 watchdog, 0 multicast, 0 pause input
3125809638 packets output, 1809711886 bytes, 0 underruns
0 output errors, 0 collisions, 3 interface resets
13700099 unknown protocol drops
0 babbles, 0 late collision, 0 deferred
8 lost carrier, 0 no carrier, 0 pause output
0 output buffer failures, 0 output buffers swapped out
08-01-2023 05:06 AM
MTU mismatch
08-01-2023 05:53 AM
Hi Leo, is there any way I can see that the MTU sizes are not as expected?
08-01-2023 05:00 PM
Look at the interface-level or global config of the downstream device connected to Gi0/0.
Make sure the MTU matches.
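For what it's worth, a minimal sketch of how this could be checked from the 1921 side. The next-hop address is a placeholder, not something from this thread. `df-bit` sets Don't Fragment, so a full-size ping failing while a smaller one succeeds would hint at a smaller MTU somewhere on the path. Note that the posted output already shows MTU 1500 bytes and 0 giants on Gi0/0.

```
! Interface MTU and IP MTU as configured on the router side
R1#show interfaces GigabitEthernet0/0 | include MTU
R1#show ip interface GigabitEthernet0/0 | include MTU
! <next-hop-ip> is a placeholder for the provider-side address
R1#ping <next-hop-ip> size 1500 df-bit
```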
08-01-2023 07:40 AM
8 lost carrier,
Check cable or duplex mismatch
08-01-2023 06:17 PM - edited 08-01-2023 06:23 PM
Input queue: 0/75/2498545/1023776 (size/max/drops/flushes); Total output drops: 74
Queueing strategy: fifo
Output queue: 0/40 (size/max)
5 minute input rate 252000 bits/sec, 147 packets/sec
5 minute output rate 87000 bits/sec, 21 packets/sec
2230357044 packets input, 3275392715 bytes, 1093734 no buffer
Received 530570600 broadcasts (0 IP multicasts)
0 runts, 0 giants, 223188 throttles
633215588 input errors, 0 CRC, 0 frame, 633215588 overrun, 0 ignored
0 watchdog, 0 multicast, 0 pause input
3125809638 packets output, 1809711886 bytes, 0 underruns
0 output errors, 0 collisions, 3 interface resets
13700099 unknown protocol drops
Most of the stats bolded above show your router cannot keep up with the ingress rate. The stat I did not expect is the high number of unknown protocol drops. What's on the other end of that link (a switch)?
Although Cisco recommends a 1921 for WAN circuits of only 15 Mbps, that figure allows for pretty much the worst possible case. Depending on what your traffic is comprised of, and on your router config, up to about 100 Mbps (aggregate) throughput isn't impossible. So, ideally, I would like to better understand your traffic mix and your router's configuration. Without knowing those, it's difficult to propose things to do.
Possibly, increasing the input queue depth might help.
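As a rough sketch of what that could look like: the posted output shows an input queue maximum of 75 on Gi0/0, and the value 150 below is only an illustrative assumption, not a tested recommendation. A deeper queue only helps with short bursts; it does not fix a router that is persistently overloaded.

```
R1#configure terminal
R1(config)#interface GigabitEthernet0/0
! 150 is an example value (assumption); the output above shows a max of 75
R1(config-if)#hold-queue 150 in
R1(config-if)#end
! Confirm the new maximum and watch whether the drop counter keeps climbing
R1#show interfaces GigabitEthernet0/0 | include Input queue
```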
A possible immediate work-around "fix" would be to run that interface at 10 Mbps, or to have the other side "shape" its egress to less than 100 Mbps; both are only possible if the other side supports it.
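A sketch of the 10 Mbps variant on the router side, assuming the provider port can be set to match. Changing speed on a live WAN port will bounce the link and can leave it down if the far end does not follow, so this is best done with out-of-band access and agreement from the other side.

```
R1#configure terminal
R1(config)#interface GigabitEthernet0/0
! Assumes the far end is configured (or negotiates) to match
R1(config-if)#speed 10
R1(config-if)#duplex full
R1(config-if)#end
R1#show interfaces GigabitEthernet0/0 | include Duplex
```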
Reviewing your buffer stats might show that buffer tuning would help too (or, if you are not already using it and your IOS supports it, enable auto buffer tuning).
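A minimal sketch of both steps, assuming the IOS release on this router supports automatic buffer tuning. In the `show buffers` output, the counters to look at are the misses and failures per pool; the posted interface output already shows 1093734 "no buffer" events.

```
! Review per-pool hits, misses and failures
R1#show buffers
R1#configure terminal
! Only available if the IOS release supports automatic tuning
R1(config)#buffers tune automatic
R1(config)#end
```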
Again, though, analysis of your traffic and your config, would be the ideal next step.
BTW, what do your CPU history stats look like? (Edit: I do see you mention not having CPU issues, but I wonder about 100% usage spikes.)
08-01-2023 06:21 PM
Oops, meant to list Cisco's definitions of ingress overruns and throttles:
overruns:
Gives the number of times that the serial receiver hardware was incapable of handing received data to a hardware buffer because the input rate exceeded the receiver's capability to handle the data.
throttles:
Throttles are a good indication of an overloaded router. They show the number of times the receiver on the port has been disabled, possibly due to buffer or processor overload. Together with high CPU utilization on an interrupt level, throttles indicate that the router is overloaded with traffic.
08-02-2023 12:11 AM
Thank you for the input. I have been looking into the overruns and throttling as well, but I cannot see any CPU or internal bandwidth related issues on the device. We have other locations where a 1921 is running 100 Mbit without issues with our configuration.
The connection is a WAN connection with a public IP, and the errors do not occur at the same times as the peaks in usage.
CPU:
08-02-2023 06:39 AM
I suspect that graph is showing CPU average utilization. I'm as much interested in, if not more so, spike utilization. What does the router's CPU history stats show?
08-02-2023 06:53 AM
They show some spikes at times
R1#show proc cpu history
R1 03:52:09 PM Wednesday Aug 2 2023 CEST
111111111111111112222222222222222222222222111111111111111
667777777777333332222222222000000000011111777777777744444777
100
90
80
70
60
50
40
30
20 ************ ***********************************
10 ************************************************************
0....5....1....1....2....2....3....3....4....4....5....5....6
0 5 0 5 0 5 0 5 0 5 0
CPU% per second (last 60 seconds)
2211121111111211 111 1211 111 1121122325442336445143
280011424785329689889967751383342956084363471554747370640629
100
90
80
70
60 *
50 ** ** *
40 * *** *#*** **
30 * * * **###*###### #*
20 #* * *** *** * * ** * **#####*######*##
10 ####*#*#########*#**#****###*########*######################
0....5....1....1....2....2....3....3....4....4....5....5....6
0 5 0 5 0 5 0 5 0 5 0
CPU% per minute (last 60 minutes)
* = maximum CPU% # = average CPU%
9997695573 11111162269995866676741111 1 31 5935535456762 113311 5981
186235874289023418823299979180167610009929629599224473905682387109093099
100 ** * *** * *
90 *** * *** * * *
80 *** * *** * * * **
70 **** * * * *** * * *** * *** **
60 ********* * ************ ** *** **
50 ********* * ************* ** ** ****** ***
40 ********* * ************* * ***** ****** ** ***
30 *##******* * ************* * ************* ** ***
20 #######*** ***************** * *********#*** ## ***
10 ########***************#*#######**************##########*****##****#****
0....5....1....1....2....2....3....3....4....4....5....5....6....6....7..
0 5 0 5 0 5 0 5 0 5 0 5 0
CPU% per hour (last 72 hours)
* = maximum CPU% # = average CPU%
08-02-2023 07:16 AM
Those stats include average CPU utilization, much like your prior CPU usage graph, and the averages look pretty good. But the overruns are likely happening during the 99% spikes (especially interesting are the hours showing an average of less than 10% at the same time). So, the question is: what's causing those?
08-02-2023 01:54 AM
Could this just be random unsupported internet traffic?
R1#show int switching
Interface Embedded-Service-Engine0/0 is disabled
GigabitEthernet0/0 WAN interface
Throttle count 223756
Drops RP 2513506 SP 0
SPD Flushes Fast 1023884 SSE 0
SPD Aggress Fast 0
SPD Priority Inputs 1735138172 Drops 2422928
Protocol IP
Switching path Pkts In Chars In Pkts Out Chars Out
Process 3134352928 1966075900 227533911 2725135319
Cache misses 0 - - -
Fast 1680377463 260863647 2902278276 175208507
Auton/SSE 0 0 0 0
Protocol DEC MOP
Switching path Pkts In Chars In Pkts Out Chars Out
Process 0 0 56778 4371906
Cache misses 0 - - -
Fast 0 0 0 0
Auton/SSE 0 0 0 0
Protocol ARP
Switching path Pkts In Chars In Pkts Out Chars Out
Process 1713533747 4030667218 1211833 72709980
Cache misses 0 - - -
Fast 0 0 0 0
Auton/SSE 0 0 0 0
Protocol CDP
Switching path Pkts In Chars In Pkts Out Chars Out
Process 0 0 632093 245251121
Cache misses 0 - - -
Fast 0 0 0 0
Auton/SSE 0 0 0 0
Protocol Other
Switching path Pkts In Chars In Pkts Out Chars Out
Process 13724473 850771164 3407996 204479760
Cache misses 0 - - -
Fast 0 0 0 0
Auton/SSE 0 0 0 0
NOTE: all counts are cumulative and reset only after a reload.
08-02-2023 07:01 AM
Hmm, although I hadn't mentioned it, I was suspecting process switching (as it often has about 1/10 the capacity of fast switching), and those stats appear to show a lot of process switching.
The overflowing ingress queue happens when the router just cannot process the ingress stream as quickly as it arrives. Over the decades, about the only time I've seen an input queue overflow was (I recall ???) on an Internet connected router, when it was taking in the full Internet route table. In this case, the router couldn't process the BGP packets as quickly as they were arriving. I also recall (???) the "fix" was to enable PMTUD on the router, which got the BGP peer to send packets sized at 1500 rather than 576.
What do your short term CPU stats look like? (Using the sorted option, the first five or so processes should be sufficient.)
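Something along these lines, ideally run while a spike is in progress. The first line of the output reports five-second utilization as total/interrupt, and the interrupt figure is the one that ties in with the throttles definition quoted earlier.

```
! Busiest processes over the last 5 seconds, top of the list is enough
R1#show processes cpu sorted 5sec
! Same list sorted by the 1-minute average, for comparison
R1#show processes cpu sorted 1min
```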
"Could this just be random unsupported internet traffic?"
Random, I wonder (some form of DoS)? Are your other similar routers, the ones you don't have an issue with, also LAN<>Internet?
Lots of ARPs (?). Is this router the gateway for hosts on the LAN side?
What's the IOS version and features running on this router? (I'm also wondering whether it has the embedded packet capture feature).
Not a huge count, but surprised to see any DEC MOP packets. Probably a good idea to disable it.
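A sketch of how that is typically done per interface; MOP is enabled by default on Ethernet interfaces, and the periodic system ID messages are probably what account for the outbound DEC MOP count above (an assumption on my part).

```
R1#configure terminal
R1(config)#interface GigabitEthernet0/0
! Disable MOP and its periodic system ID messages on the WAN interface
R1(config-if)#no mop enabled
R1(config-if)#no mop sysid
R1(config-if)#end
```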
Unsure whether all the "protocol other" packets, being process switched, are a potential issue.
As this is an Internet connected router, what are you doing for "security" dealing with the Internet side?
08-02-2023 04:38 PM
Hello
Is the router connected via an L2 handoff switchport? If so:
13700099 unknown protocol drops = disable DTP on that switchport if it is enabled, as the router doesn't understand DTP packets.
sh int x/x switchport
int x/x
description wan rtr
switchport mode access
switchport nonegotiate
08-02-2023 11:31 PM
hi @paul driver
The port with the issues/drops is not a switchport but a routed port with IPsec termination.