
Help with Output Drops

Red Taco
Level 1

Hi, I'm looking for some help with output drops.  Two stacked 2960X switches were installed in Feb 2016, and in the two months since, some ports are showing tens of thousands of output drops.  The 2-link port-channel trunk is gigabit and has 188k output drops.  Access ports run through Cisco 7960/7942 phones and negotiate 100 Mb/s.

Drops are sporadic; they do not increment steadily.  I first tried raising the hold-queue (256, then 512) on a single port, which seemed to slow the occurrence of drops but did not stop them.
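For reference, that was just the interface-level hold-queue command, e.g. (Gi2/0/3 here is only an example port):

interface GigabitEthernet2/0/3
 hold-queue 512 out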

I spanned one of the ports and captured some bursty traffic from one of the file servers.  Going by packet counts and timestamps in Wireshark, one flow showed 6146 packets in 0.891 seconds and another showed 2641 packets in 0.953 seconds.  For this user I put in a 5-port gigabit switch to bypass the phone, and since negotiating 1000 Mb/s the interface has had zero drops in 21 hours.  It is not feasible at this time to upgrade phones or install additional switches for every user just to bring all ports up to gigabit.
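The SPAN session itself was a basic local one, roughly along these lines (both port numbers are placeholders for the monitored port and the port the capture PC was plugged into):

monitor session 1 source interface GigabitEthernet2/0/3 both
monitor session 1 destination interface GigabitEthernet2/0/10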

Going by this document I pulled some stats on the interface but I'm not entirely sure what to do with the information:

PV-2960X-48-OFFICE_STACK#show mls qos int gi 2/0/3 stat
GigabitEthernet2/0/3 (All statistics are in packets)

<output omitted>

output queues dropped:
queue:     threshold1  threshold2  threshold3
-----------------------------------------------
queue 0:            0           0           0
queue 1:            0           0           0
queue 2:            0           0        9238
queue 3:         2097           0           0

Does this imply that the drops for int 2/0/3 are occurring in Queueset1 queues 3 and 4?  

PV-2960X-48-OFFICE_STACK#show mls qos queue-set
Queueset: 1
Queue     :       1       2       3       4
----------------------------------------------
buffers   :      15      25      40      20
threshold1:     100     125     100      60
threshold2:     100     125     100     150
reserved  :      50     100     100      50
maximum   :     200     400     400     200

The buffers were hard-coded by the 3rd party who set up the original switches that were replaced in Feb.  The configs were, I believe, just copied over to the new switches.

output queues dropped:
queue:     threshold1  threshold2  threshold3
-----------------------------------------------
queue 0:            0           0           0
queue 1:            0           0           0
queue 2:            0           0       79342
queue 3:         9777           0           0

This is from another interface with 89k drops in 2 months.  All of the interfaces I've checked this stat on so far appear to use the same queues.
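For what it's worth, which queue-set a given port is mapped to (along with its buffer allocation) can be checked per interface, e.g.:

show mls qos interface gigabitEthernet 2/0/3 buffers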

output queues dropped:
queue:     threshold1  threshold2  threshold3
-----------------------------------------------
queue 0:            0           0           0
queue 1:            0           0           0
queue 2:            0           0           1
queue 3:       163247           0           0

output queues dropped:
queue:     threshold1  threshold2  threshold3
-----------------------------------------------
queue 0:            0           0           0
queue 1:            0           0           0
queue 2:            0           0          26
queue 3:        25672           0           0

These are the 2 interfaces in the port-channel trunk.  Would moving these interfaces to queue-set 2 free up buffer space for the rest of the ports?
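If it helps clarify what I'm asking, the change would look something like the following; the uplink port numbers and the queue-set 2 buffer/threshold values here are placeholders for illustration, not settings I've applied:

mls qos queue-set output 2 buffers 10 10 10 70
mls qos queue-set output 2 threshold 4 100 100 100 400
!
interface range GigabitEthernet1/0/49 , GigabitEthernet2/0/49
 queue-set 2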

PV-2960X-48-OFFICE_STACK#show run | b mls qos
mls qos map policed-dscp 0 10 18 24 46 to 8
mls qos map cos-dscp 0 8 16 24 32 46 48 56
mls qos srr-queue output cos-map queue 1 threshold 3 4 5
mls qos srr-queue output cos-map queue 2 threshold 1 2
mls qos srr-queue output cos-map queue 2 threshold 2 3
mls qos srr-queue output cos-map queue 2 threshold 3 6 7
mls qos srr-queue output cos-map queue 3 threshold 3 0
mls qos srr-queue output cos-map queue 4 threshold 3 1
mls qos srr-queue output dscp-map queue 1 threshold 3 32 33 40 41 42 43 44 45
mls qos srr-queue output dscp-map queue 1 threshold 3 46 47
mls qos srr-queue output dscp-map queue 2 threshold 1 16 17 18 19 20 21 22 23
mls qos srr-queue output dscp-map queue 2 threshold 1 26 27 28 29 30 31 34 35
mls qos srr-queue output dscp-map queue 2 threshold 1 36 37 38 39
mls qos srr-queue output dscp-map queue 2 threshold 2 24
mls qos srr-queue output dscp-map queue 2 threshold 3 48 49 50 51 52 53 54 55
mls qos srr-queue output dscp-map queue 2 threshold 3 56 57 58 59 60 61 62 63
mls qos srr-queue output dscp-map queue 3 threshold 3 0 1 2 3 4 5 6 7
mls qos srr-queue output dscp-map queue 4 threshold 1 8 9 11 13 15
mls qos srr-queue output dscp-map queue 4 threshold 2 10 12 14
mls qos queue-set output 1 threshold 1 100 100 50 200
mls qos queue-set output 1 threshold 2 125 125 100 400
mls qos queue-set output 1 threshold 3 100 100 100 400
mls qos queue-set output 1 threshold 4 60 150 50 200
mls qos queue-set output 1 buffers 15 25 40 20
mls qos

Queue configs

Thanks in advance for any assistance.  

zp. 

7 Replies

Philip D'Ath
VIP Alumni

What percentage of packets are being dropped?  If it is less than 0.01% you probably don't have an issue.

Thanks, Philip.  

I've seen one port as high as 0.17% dropped; the others have been far lower.  My only reason for concern is that we've had some users getting disconnected from remote apps.
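A percentage like that can be estimated straight from the interface counters, e.g.:

show interfaces gigabitEthernet 2/0/3 | include output drops|packets output

and then taking Total output drops divided by packets output (so, for illustration only, 1,700 drops against 1,000,000 packets output would be roughly 0.17%).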

Those rates are pretty low.  I don't think I would worry about them.

Thanks, Philip.  I would agree 100% except that we do have users with intermittent disconnection from the respective databases of two different applications.  I would like to try something before handing it off to a 3rd party.  

Even a short burst of packet drops won't kill a connection.  TCP will just retransmit the lost packets.

Are you able to get a packet capture of the issue happening?

I've gotten a few captures of bursty traffic but I haven't been able to capture a disconnection from the database yet. This is occurring at another location that I travel to every couple weeks. 

Philip D'Ath
VIP Alumni

Basically most output drops happen when the offered load exceeds what the port can handle.

So if you have gigabit-connected servers and 100 Mb/s connected hosts, and a host pulls a big file, you are going to get output drops.

No amount of configuration will resolve that issue.