07-25-2018 12:16 PM - edited 03-08-2019 03:44 PM
I have a switch that shows a lot of drops on just about every interface that has a device connected to it (VM hosts running lots of servers). The main interface in question is the uplink to an ASA, essentially the gateway for the devices on the switch, and it shows far more drops than the others (which could be expected). That interface is full duplex, 1 Gbps.
If I sit there and run show int every few seconds, the drop counter increments almost continuously, but the input rate (the load interval is set to 30 seconds, so I'm kind of just guessing) doesn't add up. It's never anywhere near a gig. I never see anything in the output queue either. I use PRTG to monitor traffic and that interface never shows more than around 150 Mbps. I understand the polling interval may prevent me from seeing bursts, or maybe not. Are bursts really creating all of these drops? They are nearly constant, so I would expect to see something somewhere explaining them.
It's also perplexing that the other interfaces show many drops as well. Can this be a hardware issue with the switch? There are no errors, nothing... I find the burst theory hard to believe because all of these servers would not be generating that much traffic at any given time. People pull files through these links, so a large amount of data can make sense, but why would TCP allow it if it was literally one connection (one client to one server)? Any insight is appreciated. Thanks!
07-25-2018 01:07 PM
Hello Brettp :-)
There are two options:
1) The counters are wrong. It may be a bug. Use the Cisco Bug Search Tool https://bst.cloudapps.cisco.com/bugsearch/?referring_site=mm and enter something like "3650 output drops" (or whatever kind of switch you have).
2) The counters are correct. Since you stated that you normally don't have much traffic, there are two possibilities: 2a) The switch has trouble with its buffers (what type is it?). 2b) Since you stated that output drops are seen on many interfaces, it may be an intermittent traffic storm. This can be diagnosed by capturing a sample of traffic on these ports (a notebook with Wireshark) and checking whether it is all legitimate traffic. There may be some parasitic broadcast, multicast or unknown unicast.
Hope that helps :-)
Stepan
07-26-2018 09:03 AM
Thank you for the response. Queuing is just FIFO. I checked for possible bugs regarding inaccurate output drops, but there don't seem to be any. I ran Wireshark and observed the multicast/broadcast traffic, and I saw nothing out of the ordinary. I suppose the only option left is bursts, but I really find that hard to believe... hmm...
07-26-2018 09:18 AM
Can you please copy/paste the output of the following commands:
show interface "interface" summary
show queueing interface "interface"
Thanks,
Calin
07-26-2018 09:26 AM - edited 07-26-2018 09:28 AM
Thank you for the reply. I have included the show int output as well as the output you requested. I cleared the counters just before running the commands to illustrate that there were already 18 drops soon after I cleared them... Based on the output, it looks like the link is saturated, since the drops are from the output queue... so it would have to be a burst, since the link isn't generally saturated?
SWITCH#show queueing int g1/0/1
Interface GigabitEthernet1/0/1 queueing strategy: none
SWITCH#show int g1/0/1
GigabitEthernet1/0/1 is up, line protocol is up (connected)
  Hardware is Gigabit Ethernet, address is b4a4.e3ab.4181 (bia b4a4.e3ab.4181)
  Description: Uplink to Primary XOASA
  MTU 1500 bytes, BW 1000000 Kbit/sec, DLY 10 usec,
     reliability 255/255, txload 8/255, rxload 6/255
  Encapsulation ARPA, loopback not set
  Keepalive set (10 sec)
  Full-duplex, 1000Mb/s, media type is 10/100/1000BaseTX
  input flow-control is off, output flow-control is unsupported
  ARP type: ARPA, ARP Timeout 04:00:00
  Last input never, output 00:00:01, output hang never
  Last clearing of "show interface" counters 00:01:43
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 18
  Queueing strategy: fifo
  Output queue: 0/40 (size/max)
  30 second input rate 26380000 bits/sec, 3564 packets/sec
  30 second output rate 34062000 bits/sec, 5059 packets/sec
     389531 packets input, 360361208 bytes, 0 no buffer
     Received 65 broadcasts (0 multicasts)
     0 runts, 0 giants, 0 throttles
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
     0 watchdog, 0 multicast, 0 pause input
     0 input packets with dribble condition detected
     535531 packets output, 442964566 bytes, 0 underruns
     0 output errors, 0 collisions, 0 interface resets
     0 unknown protocol drops
     0 babbles, 0 late collision, 0 deferred
     0 lost carrier, 0 no carrier, 0 pause output
     0 output buffer failures, 0 output buffers swapped out
SWITCH#show int g1/0/1 summary
 *: interface is up
 IHQ: pkts in input hold queue     IQD: pkts dropped from input queue
 OHQ: pkts in output hold queue    OQD: pkts dropped from output queue
 RXBS: rx rate (bits/sec)          RXPS: rx rate (pkts/sec)
 TXBS: tx rate (bits/sec)          TXPS: tx rate (pkts/sec)
 TRTL: throttle count
  Interface                   IHQ       IQD       OHQ       OQD      RXBS      RXPS      TXBS      TXPS      TRTL
-----------------------------------------------------------------------------------------------------------------
* GigabitEthernet1/0/1          0         0         0        18  26210000      3567  35061000      5133         0
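To put the counters above in perspective, here is a small sketch that turns them into utilization figures. The numbers are copied from the show int output; the helper names are made up for illustration, and the drop ratio simply divides drops by packets output, which is an assumption about how the two counters relate.

```python
# Values copied from the "show int g1/0/1" output in this post.
LINK_BPS = 1_000_000 * 1000      # "BW 1000000 Kbit/sec"
OUTPUT_RATE_BPS = 34_062_000     # 30 second output rate
INPUT_RATE_BPS = 26_380_000      # 30 second input rate
TXLOAD = 8                       # "txload 8/255"
PACKETS_OUTPUT = 535_531
OUTPUT_DROPS = 18


def utilization(rate_bps, link_bps=LINK_BPS):
    """Average rate as a fraction of line rate."""
    return rate_bps / link_bps


def load_fraction(load, scale=255):
    """Convert an IOS x/255 load value to a fraction."""
    return load / scale


tx_util = utilization(OUTPUT_RATE_BPS)
rx_util = utilization(INPUT_RATE_BPS)
# Assumption: treat "packets output" as the denominator for the drop ratio.
drops_per_packet = OUTPUT_DROPS / PACKETS_OUTPUT

print(f"30 s average tx utilization: {tx_util:.1%}")
print(f"30 s average rx utilization: {rx_util:.1%}")
print(f"txload as a fraction:        {load_fraction(TXLOAD):.1%}")
print(f"drops per packet sent:       {drops_per_packet:.2e}")
```

The averages come out at a few percent of line rate, which is why a 30 second counter cannot confirm or rule out short bursts on its own.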
07-26-2018 09:33 AM
I replied but it seems like it disappeared? Here is the output you requested (as well as show int, to illustrate 18 drops shortly after I cleared the counters). Based on what I'm seeing, the packets are just dropping from the output queue, but based on the numbers the link is nowhere near saturated... thus indicating a burst... which in my opinion is not possible based on the traffic traversing the link. I have also included a screenshot showing the traffic graph...
07-26-2018 10:31 AM
All looks OK...
One idea that comes up is microbursts, but how much microbursting can you have on an interface that is using so little of its capacity?
How's the load in general per switch?
Calin
07-26-2018 11:08 AM
The switch "appears" fine except for the drops. CPU, memory, load... it all checks out. I'm thinking there are only two possible answers: 1. bursty traffic, which I will have to investigate (unfortunately, I can't get stats down to the millisecond, but I see there's a way to possibly detect it with Wireshark, so I'll have to read about that), or 2. the switch is defective in some way, which is possible, but the behavior doesn't typically point to a hardware issue from what I can tell. It's not the cabling or the actual physical switch port. As hard as it is for me to believe, I am leaning towards bursts... just very strange in my opinion.
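For anyone wanting to check for bursts at millisecond resolution from a capture, here is a rough sketch of the idea: bucket the frames into 1 ms windows and flag any window that approaches or exceeds what the link can carry in that time. It assumes the capture has already been exported as (timestamp in seconds, frame length in bytes) pairs; the function name, the 90% threshold and the sample data are all made up for illustration, and frame length here ignores preamble and inter-frame gap.

```python
from collections import defaultdict


def find_microbursts(packets, link_bps=1_000_000_000, window_s=0.001, threshold=0.9):
    """Return (window_start_s, fraction_of_line_rate) for busy windows.

    packets: iterable of (timestamp_seconds, frame_bytes) pairs.
    """
    bits_per_window = defaultdict(int)
    for ts, size in packets:
        bits_per_window[int(ts / window_s)] += size * 8

    capacity_bits = link_bps * window_s  # bits the link can carry per window
    return sorted(
        (idx * window_s, bits / capacity_bits)
        for idx, bits in bits_per_window.items()
        if bits >= threshold * capacity_bits
    )


# Hypothetical sample: 100 full-size frames inside one millisecond,
# then one frame per millisecond of background traffic.
burst = [(0.0101 + i * 0.000005, 1500) for i in range(100)]
background = [(0.0205 + i * 0.001, 1500) for i in range(30)]
bursts = find_microbursts(burst + background)

for start, fraction in bursts:
    print(f"window at {start:.3f} s carried {fraction:.0%} of line rate")
```

A window can only exceed 100% if the capture sees the traffic offered to the port (for example from several ingress sources) rather than what actually left on the wire, so where the capture is taken matters.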
07-26-2018 11:30 AM
If you have the possibility to open a TAC case, do it. I'm curious about the outcome.
07-26-2018 05:07 PM - edited 07-26-2018 05:08 PM
". . . but why would TCP allow it if it was literally one connection (one client to one server?) Any insight is appreciated."
Are you familiar with TCP "slow start"?
Of course, understand that if you only have one flow, and its ingress has the same bandwidth as the egress, then although the source host will transmit in bursts, those bursts will not queue on egress. However, if you have multiple ingress ports feeding one egress port, or an ingress with greater bandwidth than the egress, the above can "micro burst". Interestingly, the micro burst drops can keep the link from being saturated, as TCP senders will slow their transmission rate when the drops are detected.
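The multiple-ingress case can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions: time is divided into ticks, the egress port sends one frame per tick, every ingress port runs at the same speed as the egress, and the queue is a plain tail-drop FIFO of 40 frames (the 40 mirrors the "Output queue: 0/40" shown earlier, but the real hardware queue on the switch may behave differently).

```python
def simulate_egress(arrivals_per_tick, queue_limit):
    """Tail-drop FIFO that drains one frame per tick.

    arrivals_per_tick: number of frames arriving from all ingress
    ports during each tick. Returns (frames_sent, frames_dropped).
    """
    queue = 0
    sent = 0
    dropped = 0
    for arriving in arrivals_per_tick:
        accepted = min(arriving, queue_limit - queue)
        dropped += arriving - accepted
        queue += accepted
        if queue > 0:  # egress transmits one frame per tick
            queue -= 1
            sent += 1
    return sent, dropped


# Four ingress ports each send 20 back-to-back frames at the same
# moment, then the link is idle for the rest of a 1000-tick interval.
TICKS = 1000
arrivals = [4] * 20 + [0] * (TICKS - 20)

sent, dropped = simulate_egress(arrivals, queue_limit=40)
average_utilization = sent / TICKS

print(f"sent={sent} dropped={dropped} average utilization={average_utilization:.1%}")
```

With these made-up numbers the egress averages well under 10% utilization over the interval, yet frames are still dropped during the 20 ticks when four sources arrive at once.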
07-30-2018 10:46 AM
So I read up on the Cisco site on how to detect microbursts with Wireshark, and sure enough, that seems to be the culprit. It's amazing what you can dig up -- I was almost positive that wasn't the case. I'd like to mitigate these drops if possible... I'm currently using a C2960S... Apart from the drops, the switch is performing fine. I know one suggestion could be to fine-tune the buffers... but I'm wondering if getting another, similar Catalyst switch might be a better option. Increasing bandwidth is not an option. According to the site, the 2960S has 2MB egress buffers and the 2960X has 4MB. If my calculations are correct, during these bursts I'm seeing about 100,000 bits per millisecond above the limit. I'd imagine doubling the buffers would be adequate to correct the issue... maybe... Do you have any input? I'm new to this...
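As a back-of-the-envelope check on the buffer question, the sketch below estimates how long a buffer could absorb an excess of 100,000 bits per millisecond before overflowing. It assumes "2MB" and "4MB" mean 2 and 4 MiB, and it exposes a usable_share parameter because the fraction of the total buffer that a single port's queue can actually use is not stated in this thread; the 0.05 value is purely hypothetical.

```python
MIB = 1024 * 1024
EXCESS_BPS = 100_000 * 1000  # 100,000 bits per millisecond above line rate


def absorb_time_ms(buffer_bytes, excess_bps, usable_share=1.0):
    """Milliseconds a buffer can soak up traffic arriving faster than it drains.

    usable_share: fraction of the buffer available to this port's queue
    (an assumption; the real allocation depends on the platform and config).
    """
    if excess_bps <= 0:
        raise ValueError("excess_bps must be positive")
    usable_bits = buffer_bytes * 8 * usable_share
    return usable_bits / excess_bps * 1000


for name, size in (("2960S, 2 MiB", 2 * MIB), ("2960X, 4 MiB", 4 * MIB)):
    whole = absorb_time_ms(size, EXCESS_BPS)
    part = absorb_time_ms(size, EXCESS_BPS, usable_share=0.05)  # hypothetical share
    print(f"{name}: {whole:.1f} ms with the whole buffer, {part:.1f} ms with a 5% share")
```

Doubling the buffer doubles the time a burst can be absorbed, but the absolute figure depends heavily on how much of the buffer one port is allowed to use, so the per-port allocation is worth checking alongside the total size.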
07-31-2018 05:46 AM
07-31-2018 06:56 AM
Thanks for the input. I think I'll grab a 2960-X switch and hope for the best. Obviously, it's a much better switch than the 2960-S, so I can't imagine it would make the issue any worse and I'll have a new switch!
08-01-2018 04:39 AM
08-01-2018 06:11 AM