
3650 Output Discards

Mark Bowyer
Level 1

We have issues on all of our 3650s; we are getting a lot of output discards. I believe there is a bug that can produce discards that aren't actually there.

Port      Align-Err  FCS-Err  Xmit-Err  Rcv-Err  UnderSize  OutDiscards
Gi1/0/1           0        0         0        0          0   3669768454
Gi1/0/2           0        0         0        0          0            0
Gi1/0/3           0        0         0        6          0     16161480
Gi1/0/4           0        0         0        4          0     36054474
Gi1/0/5           0        0         0        0          0   3613814062
Gi1/0/6           0        0         0        0          0   2163258834
Gi1/0/7           0        0         0        1          0   3540927907
Gi1/0/8           0        0         0        0          0     15315352

I have used the following command, which has made some difference, but doesn't seem to have eliminated the issue completely:

qos queue-softmax-multiplier 1200
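For reference, that is a global configuration command; a minimal sketch of applying and verifying it follows (the switch number and interface below are examples only, adjust them to your stack):

conf t
 qos queue-softmax-multiplier 1200
end
! Check the resulting per-queue hardmax/softmax buffer values:
show platform hardware fed switch 1 qos queue config interface GigabitEthernet1/0/6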

The switches are running IOS 16.09.05

Does anyone have any ideas?


14 Replies

Joseph W. Doherty
Hall of Fame

Okay, you believe your discards might be due to a possible bug.  Have you checked for any known bugs?  Are these switches under a maintenance contract, and if so, have you contacted TAC?

It's interesting that you mention setting queue-softmax-multiplier 1200 appears to have made some difference.  I'm wondering whether, in fact, the discards are accurate.  What do the egress QoS policy and the individual interface queue stats look like?
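For example, a sketch of the commands that would show both (the switch number and interface are placeholders):

show run | section policy-map
show platform hardware fed switch 1 qos queue stats interface Gi1/0/1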

Mark Bowyer
Level 1

We don't have any support on these switches, unfortunately, so I don't have access to any tools to check for bugs (unless there is a free tool?). Below are the stats:


policy-map AutoQos-4.0-Output-Policy
 class AutoQos-4.0-Output-Priority-Queue
  priority level 1 percent 30
 class AutoQos-4.0-Output-Control-Mgmt-Queue
  bandwidth remaining percent 10
  queue-limit dscp cs2 percent 80
  queue-limit dscp cs3 percent 90
  queue-limit dscp cs6 percent 100
  queue-limit dscp cs7 percent 100
  queue-buffers ratio 10
 class AutoQos-4.0-Output-Multimedia-Conf-Queue
  bandwidth remaining percent 10
  queue-buffers ratio 10
 class AutoQos-4.0-Output-Trans-Data-Queue
  bandwidth remaining percent 10
  queue-buffers ratio 10
 class AutoQos-4.0-Output-Bulk-Data-Queue
  bandwidth remaining percent 4
  queue-buffers ratio 10
 class AutoQos-4.0-Output-Scavenger-Queue
  bandwidth remaining percent 1
  queue-buffers ratio 10
 class AutoQos-4.0-Output-Multimedia-Strm-Queue
  bandwidth remaining percent 10
  queue-buffers ratio 10
 class class-default
  bandwidth remaining percent 25
  queue-buffers ratio 25

GigabitEthernet1/0/6 is up, line protocol is up (connected)
Hardware is Gigabit Ethernet, address is 005d.7371.4306 (bia 005d.7371.4306)
MTU 1500 bytes, BW 100000 Kbit/sec, DLY 100 usec,
reliability 255/255, txload 1/255, rxload 1/255
Encapsulation ARPA, loopback not set
Keepalive set (10 sec)
Full-duplex, 100Mb/s, media type is 10/100/1000BaseTX
input flow-control is on, output flow-control is unsupported
ARP type: ARPA, ARP Timeout 04:00:00
Last input 00:00:42, output 00:00:01, output hang never
Last clearing of "show interface" counters 1d23h
Input queue: 0/2000/0/0 (size/max/drops/flushes); Total output drops: 2641591
Queueing strategy: Class-based queueing
Output queue: 0/40 (size/max)
5 minute input rate 28000 bits/sec, 4 packets/sec
5 minute output rate 82000 bits/sec, 19 packets/sec
1118459 packets input, 345202637 bytes, 0 no buffer
Received 7660 broadcasts (5241 multicasts)
0 runts, 0 giants, 0 throttles
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
0 watchdog, 5241 multicast, 0 pause input
0 input packets with dribble condition detected
3196957 packets output, 1440747056 bytes, 0 underruns
0 output errors, 0 collisions, 0 interface resets
0 unknown protocol drops
0 babbles, 0 late collision, 0 deferred
0 lost carrier, 0 no carrier, 0 pause output
0 output buffer failures, 0 output buffers swapped out


31MPH_Floor0_3650# show platform hardware fed switch 1 qos queue stats interface Gi1/0/6
DATA Port:8 Enqueue Counters
---------------------------------------------------------------------------
Q  Buffers   Enqueue-TH0   Enqueue-TH1   Enqueue-TH2   Qpolicer
   (Count)       (Bytes)       (Bytes)       (Bytes)    (Bytes)
-  -------  ------------  ------------  ------------  ---------
0        0             0      14973190      48934334          0
1        0             0       6111449         12272          0
2        0             0             0             0          0
3        0             0             0             0          0
4        0             0             0             0          0
5        0             0             0             0          0
6        0             0             0             0          0
7        0             0             0    4115415085          0
DATA Port:8 Drop Counters
---------------------------------------------------------------------------
Q    Drop-TH0    Drop-TH1    Drop-TH2   SBufDrop    QebDrop   QpolicerDrop
      (Bytes)     (Bytes)     (Bytes)    (Bytes)    (Bytes)        (Bytes)
-  ----------  ----------  ----------  ---------  ---------  -------------
0           0           0           0          0          0              0
1           0           0           0          0          0              0
2           0           0           0          0          0              0
3           0           0           0          0          0              0
4           0           0           0          0          0              0
5           0           0           0          0          0              0
6           0           0           0          0          0              0
7           0           0     7294100          0          0              0

Mark Bowyer
Level 1

Anybody have any ideas?

"Anybody have any ideas?"

Yup, what you're seeing might be totally "legitimate", i.e. not a bug, but I cannot say for sure, as I don't "know" your traffic mix or how suitable AutoQoS is for your QoS goals (if any).

I do see the interface you've posted is only running at 100 Mbps; assuming you are also using gig or 10g interfaces, it's not at all difficult to overrun such a "slower" interface.

Your interface's enqueue stats show queuing only for queues zero, one, and seven.  The interface shows all the drops, though, in just queue seven, which appears to completely overflow its buffer.  "Tweaking" buffer allocations may mitigate the issue.

The reason I write "may mitigate" is because, again, there is insufficient information to know for sure, plus "tweaking" QoS is often a bit of trial-and-error.

Yes, I think you are onto something there. Almost all of the workstations on our network are connected via Cisco phones, which can only run at 100 Mbps because we bought the cheaper phones, and we do have 1g and 10g uplinks. I have checked another 3650 switch stack that only has servers connected to it, all at 1 gig, and there are absolutely no discards. So other than tweaking the QoS to allow for more buffers, and maybe reducing the speed of the uplinks, there is not much we can do?

Ah, yes, with servers connected at gig, and gig and 10g uplinks, it is again very easy for a burst of data from a server to overrun the port's buffer space (NB: lack of buffers has been a common problem with smaller Catalyst switches, like the 2K and 3K series).

Having VoIP phones on those ports helps explain the high volume of queue 7 traffic too, I guess.

Reducing the speed of the uplinks would only move the congestion there, and, that being just a single port, would probably increase the overall drop rate.

If you have spare ports, splitting your 100 Mbps VoIP phones onto their own ports and allowing your workstations to run at gig would likely be the "best" solution.  Second "best", if you are short of ports, is to use a "cheap" 4-port switch that supports gig: connect it to the Cisco port at gig, and connect the VoIP phone and workstation to it (at 100 Mbps and gig, respectively).

The least expensive fix might be trying QoS buffer tweaking.  If you don't have experience with that, you will either need to learn as you go and/or retain a network consultant with such experience.
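For example, a minimal sketch of one such tweak, assuming the drops really are in the queue serving class-default (verify the class-to-queue mapping first, e.g. with "show platform hardware fed switch 1 qos queue config interface ..."), and noting that the explicit queue-buffers ratios across all classes must stay at or below 100 (here 10x6 + 35 = 95, leaving some buffer for the priority queue):

policy-map AutoQos-4.0-Output-Policy
 class class-default
  queue-buffers ratio 35

Combined with the queue-softmax-multiplier already set, this gives the default queue a larger share of the port's buffer pool; again, trial-and-error, not a guaranteed fix.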

BTW, @MHM Cisco World mentions checking whether you have a half-duplex issue.  I don't think that's the problem in this case, but it is worthwhile double-checking.  If it is an issue, rather than hard-coding the duplex setting, do ensure each side is configured for auto (as, generally, all network hardware vendors recommend using auto).
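A quick way to check, and to return a hard-coded port to autonegotiation (the interface below is an example):

show interfaces status
! or, per interface:
show interfaces Gi1/0/6 | include duplex
! to revert a hard-coded port to auto:
interface Gi1/0/6
 speed auto
 duplex auto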

see below comment


Finally, I found a bug.

Nice find, but I don't think it applies here, for a couple of reasons.

Symptom: Output drops and Output errors increment simultaneously in show interfaces when only output drops are expected.

From the posted interface stats:

Input queue: 0/2000/0/0 (size/max/drops/flushes); Total output drops: 2641591
.
.
0 output errors, 0 collisions, 0 interface resets

Output errors not incrementing.

Conditions: To confirm the output drops are because of egress buffer drops use "sh pl qos queue stats gigabitEthernet x/y/z" and look for "Drop-TH" counters. This counter should increment the same amount as the output drops counter in show interface.

Total output drops: 2641591
7 0 0 7294100 0 0 0

Although the counts are unequal, the interface also notes "Last clearing of "show interface" counters 1d23h".  I cannot say for a 3650, but I recall that on the 3560 and 3750, clearing the interface stats did not also clear the ASIC stats.  As the queue drops are higher, if the behavior is alike, this would be expected after clearing interface stats.

Yes, he must monitor the drops, both total output drops and Drop-TH; if they increase by the same, or nearly the same, count, then this bug is what he is facing.

Mark Bowyer
Level 1

show platform hardware fed switch 1 qos queue stats interface Gi1/0/14
DATA Port:12 Enqueue Counters
---------------------------------------------------------------------------
Q  Buffers   Enqueue-TH0   Enqueue-TH1   Enqueue-TH2   Qpolicer
   (Count)       (Bytes)       (Bytes)       (Bytes)    (Bytes)
-  -------  ------------  ------------  ------------  ---------
0        0             0       6098720      12534896          0
1        0             0       1556847            68          0
2        0             0             0             0          0
3        0             0             0             0          0
4        0             0             0             0          0
5        0             0             0             0          0
6        0             0             0             0          0
7        0             0             0    1499845081          0
DATA Port:12 Drop Counters
---------------------------------------------------------------------------
Q    Drop-TH0    Drop-TH1    Drop-TH2   SBufDrop    QebDrop   QpolicerDrop
      (Bytes)     (Bytes)     (Bytes)    (Bytes)    (Bytes)        (Bytes)
-  ----------  ----------  ----------  ---------  ---------  -------------
0           0           0           0          0          0              0
1           0           0           0          0          0              0
2           0           0           0          0          0              0
3           0           0           0          0          0              0
4           0           0           0          0          0              0
5           0           0           0          0          0              0
6           0           0           0          0          0              0
7           0           0     2366767          0          0              0

sh int Gi1/0/14
GigabitEthernet1/0/14 is up, line protocol is up (connected)
Hardware is Gigabit Ethernet, address is 005d.73c5.ba8e (bia 005d.73c5.ba8e)
MTU 1500 bytes, BW 100000 Kbit/sec, DLY 100 usec,
reliability 255/255, txload 1/255, rxload 1/255
Encapsulation ARPA, loopback not set
Keepalive set (10 sec)
Full-duplex, 100Mb/s, media type is 10/100/1000BaseTX
input flow-control is on, output flow-control is unsupported
ARP type: ARPA, ARP Timeout 04:00:00
Last input 00:00:47, output 00:00:00, output hang never
Last clearing of "show interface" counters 1d17h
Input queue: 0/2000/0/0 (size/max/drops/flushes); Total output drops: 2366767
Queueing strategy: Class-based queueing
Output queue: 0/40 (size/max)
5 minute input rate 37000 bits/sec, 7 packets/sec
5 minute output rate 108000 bits/sec, 15 packets/sec
1228152 packets input, 293381417 bytes, 0 no buffer
Received 5037 broadcasts (4924 multicasts)
0 runts, 0 giants, 0 throttles
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
0 watchdog, 4924 multicast, 0 pause input
0 input packets with dribble condition detected
2811398 packets output, 1512496847 bytes, 0 underruns
0 output errors, 0 collisions, 0 interface resets
0 unknown protocol drops
0 babbles, 0 late collision, 0 deferred
0 lost carrier, 0 no carrier, 0 pause output
0 output buffer failures, 0 output buffers swapped out

The numbers seem to be exactly the same...

So it is the bug I shared before; check its workaround.

"so it is bug I share before, check it workaround."

If I understand:

Conditions: To confirm the output drops are because of egress buffer drops use "sh pl qos queue stats gigabitEthernet x/y/z" and look for "Drop-TH" counters. This counter should increment the same amount as the output drops counter in show interface.

correctly, and with:

7 0 0 2366767 0 0 0
Total output drops: 2366767

in Mark's last posting, then if the counts agree, that's expected, i.e. no indication of a bug.
