02-14-2018 07:10 AM - edited 03-05-2019 09:55 AM
On one of the ports on our recently upgraded switch we are getting a high number of discards. The switch is a WS-C3650-24PD-E. We use SolarWinds for switch/port monitoring, and the discards happen at random intervals. For the last few days there were none, and then yesterday and today there were over a million discards on this port.
Port Configuration:
interface GigabitEthernet1/0/19
switchport access vlan 2
switchport mode access
speed 1000
duplex full
end
Show Interface:
GigabitEthernet1/0/19 is up, line protocol is up (connected)
MTU 1500 bytes, BW 1000000 Kbit/sec, DLY 10 usec,
reliability 255/255, txload 1/255, rxload 1/255
Encapsulation ARPA, loopback not set
Keepalive set (10 sec)
Full-duplex, 1000Mb/s, media type is 10/100/1000BaseTX
input flow-control is off, output flow-control is unsupported
ARP type: ARPA, ARP Timeout 04:00:00
Last input never, output 00:00:00, output hang never
Last clearing of "show interface" counters 2w4d
Input queue: 0/2000/0/0 (size/max/drops/flushes); Total output drops: 2279088
Queueing strategy: fifo
Output queue: 0/40 (size/max)
5 minute input rate 700000 bits/sec, 286 packets/sec
5 minute output rate 3388000 bits/sec, 480 packets/sec
136094155 packets input, 51417445035 bytes, 0 no buffer
Received 190438 broadcasts (4216 multicasts)
0 runts, 0 giants, 0 throttles
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
0 watchdog, 4216 multicast, 0 pause input
0 input packets with dribble condition detected
218477298 packets output, 186197856830 bytes, 0 underruns
0 output errors, 0 collisions, 0 interface resets
0 unknown protocol drops
0 babbles, 0 late collision, 0 deferred
0 lost carrier, 0 no carrier, 0 pause output
0 output buffer failures, 0 output buffers swapped out
As you can see, total output drops is very high. Before I set up a port monitor and Wireshark, does anyone have any idea what might be causing this?
02-14-2018 08:18 AM
02-14-2018 08:55 AM
To add to J Doherty's comment: particularly bursty data could cause this, especially with higher-rate incoming uplinks (10 Gig or more). For bursty data, the Gig port becomes the bottleneck during the initial phase of the transmission.
One thing to do is to change the load interval on the port to 30 seconds for a more accurate utilization reading and see if the drops increment slowly or come in bursts.
02-14-2018 01:04 PM
02-15-2018 07:24 AM
02-15-2018 09:11 AM
Understood, but to elaborate on J Doherty's earlier comment: these could be sub-second microbursts. I doubt you would see them in SolarWinds or with the 30-second load interval setting. For example, just before a 28 Mbps reading there may have been a much higher-rate burst that caused the drops.
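To illustrate why an averaged reading can hide this, here is a minimal arithmetic sketch. The burst rate, burst length, and burst count are hypothetical numbers chosen for illustration, not measurements from this switch:

```python
def average_rate_mbps(burst_rate_mbps, burst_ms, bursts_per_window, window_s):
    """Average rate seen over a window that contains only short bursts."""
    bits_per_burst = burst_rate_mbps * 1_000_000 * (burst_ms / 1000)
    total_bits = bits_per_burst * bursts_per_window
    return total_bits / window_s / 1_000_000

# Hypothetical: 17 bursts at 10 Gbps, each 5 ms long, inside a 30 s window.
avg = average_rate_mbps(burst_rate_mbps=10_000, burst_ms=5,
                        bursts_per_window=17, window_s=30)
print(f"{avg:.1f} Mbps average over the window")
```

Under these assumed numbers the 30-second average comes out at roughly 28 Mbps, even though each burst arrives at ten times what a Gig port can send.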
The key is to identify whether the drops happen at a constant rate (which may indicate a problem) or only occasionally, coinciding with increases in traffic.
One other thing, I noticed you have the port hard coded. With Gig you really shouldn't have to do that. Unless it is necessary, you may want to set it to auto (port will bounce when you do). I doubt it will have an effect on the drops, but it certainly couldn't hurt.
Regards
02-16-2018 10:41 AM
Hi,
This is likely a microburst issue, as others have mentioned. You say you don't see more than 28 Mbps, but over what timeframe is that measured? 60 seconds? Microbursts are short-lived and quite difficult to detect unless you measure transfer statistics over small periods (e.g. 1 second).
What shows if you issue this:
show interface IF_NAME counters errors
If you have a lot of xmit errors, this could confirm that it's a microburst problem or that the port is being overrun because of a speed conversion.
Cisco's documentation https://www.cisco.com/c/en/us/support/docs/switches/catalyst-6500-series-switches/12027-53.html
"""
errors. This is an indication that the internal send (Tx) buffer is full.
Common Causes: A common cause of Xmit-Err can be traffic from a high bandwidth link that is switched to a lower bandwidth link, or traffic from multiple inbound links that are switched to a single outbound link. For example, if a large amount of bursty traffic comes in on a gigabit interface and is switched out to a 100Mbps interface, this can cause Xmit-Err to increment on the 100Mbps interface. This is because the output buffer of the interface is overwhelmed by the excess traffic due to the speed mismatch between the inbound and outbound bandwidths.
"""
It's for the 6500, but I believe the same applies across the whole Catalyst line.
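As a rough illustration of the mechanism the quoted text describes, here is a toy tail-drop FIFO simulation. It is only a sketch: the tick length, packet counts, and the 40-packet buffer are hypothetical values, and real switch buffering (shared pools, per-queue thresholds) is more involved.

```python
def simulate_tail_drop(arrivals, drain_per_tick, buffer_size):
    """Return (sent, dropped) for a FIFO egress queue with tail drop.

    arrivals: packets arriving in each tick
    drain_per_tick: packets the egress port can send per tick
    buffer_size: maximum packets the queue can hold
    """
    queued = sent = dropped = 0
    for arriving in arrivals:
        # Enqueue what fits in the buffer; the rest is tail-dropped.
        accepted = min(arriving, buffer_size - queued)
        dropped += arriving - accepted
        queued += accepted
        # The egress port drains at its own fixed rate.
        out = min(queued, drain_per_tick)
        queued -= out
        sent += out
    return sent, dropped

# Hypothetical: idle, then a 3-tick burst at 10x the egress rate, then idle.
arrivals = [0] * 20 + [100] * 3 + [0] * 20
sent, dropped = simulate_tail_drop(arrivals, drain_per_tick=10, buffer_size=40)
average_offered = sum(arrivals) / len(arrivals)
print(f"average offered load {average_offered:.1f} pkts/tick, "
      f"sent {sent}, dropped {dropped}")
```

The average offered load stays below what the egress port can drain, yet most of the burst is dropped because the buffer fills within the first tick.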
I'd also start watching the switch closely for output drops, first resetting the counters:
clear counters IF_NAME
And then check every second whether the counter increases. If you see something like:
1.- No drops at all
2.- Sudden drops (a lot)
3.- No drops at all
4.- No drops at all
5.- Sudden drops (a lot)
That would also point to a microburst issue. If it is, you can either increase the capacity of that link or increase the buffer size.
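The per-second check above could be scripted along these lines. This is a sketch only: it assumes you have already collected the "show interface" text yourself (how you collect it is not shown), the function names are made up for illustration, and the "half of the intervals" threshold is an arbitrary choice.

```python
import re

def parse_output_drops(show_interface_text):
    """Extract the 'Total output drops' counter from 'show interface' output."""
    match = re.search(r"Total output drops:\s*(\d+)", show_interface_text)
    if match is None:
        raise ValueError("no 'Total output drops' field found")
    return int(match.group(1))

def classify_drops(samples):
    """Classify counter samples taken at a fixed interval.

    Returns 'none', 'bursty' (drops in at most half of the intervals),
    or 'steady' (drops in more than half of the intervals).
    """
    deltas = [b - a for a, b in zip(samples, samples[1:])]
    if not deltas:
        raise ValueError("need at least two samples")
    active = sum(1 for d in deltas if d > 0)
    if active == 0:
        return "none"
    # Illustrative threshold, not a Cisco-defined rule.
    return "bursty" if active <= len(deltas) / 2 else "steady"

line = "Input queue: 0/2000/0/0 (size/max/drops/flushes); Total output drops: 2279088"
print(parse_output_drops(line))
print(classify_drops([0, 0, 5000, 5000, 5000, 12000]))
```

A 'bursty' result matches the pattern in the list above, while a 'steady' result would suggest looking beyond microbursts.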
I had to troubleshoot an issue like this on a 3850 where the peak rate on a single 1GE interface never went over 300 Mbps, but a lot of drops were seen. It turned out this interface was the uplink for 4 other devices, each using a 1GE port, and on further inspection we saw that the drops were due to microbursting.
We ended up using this command to increase the buffer allocation, and the drops stopped:
qos queue-softmax-multiplier 1200
Here's the documentation for this: https://www.cisco.com/c/en/us/td/docs/switches/lan/catalyst3850/software/release/3e/qos/command_reference/b_qos_3e_3850_cr/b_qos_3e_3850_cr_chapter_010.html#wp4043987550
I'm not sure whether this applies to the 3650 as well, though.
Now, if the issue isn't microburst related, then this could be a hardware fault, in which case you should open a TAC case.
HTH
Please remember to rate useful posts
02-15-2018 07:53 AM
Hello,
On a side note, what is connected to this access port?
02-16-2018 07:51 AM
It's our SolarWinds server.
02-16-2018 12:19 PM
Running on what? Windows Server (X)?