04-02-2017 02:09 AM - edited 03-08-2019 10:01 AM
Hello, everyone.
Can anyone help with troubleshooting output packet drops on a 3850 switch? The problem is that I see a lot of output drops on some interfaces even at low traffic rates (the maximum is about 100 Mbit/s on the 1G ports).
GigabitEthernet1/0/26 is up, line protocol is up (connected)
Input queue: 0/2000/0/0 (size/max/drops/flushes); Total output drops: 165898
Output queue: 0/40 (size/max)
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
165898 output errors, 0 collisions, 0 interface resets
0 unknown protocol drops
GigabitEthernet2/0/26 is up, line protocol is up (connected)
Input queue: 0/2000/0/0 (size/max/drops/flushes); Total output drops: 18264
Output queue: 0/40 (size/max)
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
18264 output errors, 0 collisions, 0 interface resets
0 unknown protocol drops
interface Port-channel36
switchport
switchport access vlan 1
switchport trunk native vlan 1
switchport private-vlan trunk encapsulation dot1q
switchport private-vlan trunk native vlan tag
switchport mode trunk
no switchport nonegotiate
no switchport protected
no switchport block multicast
no switchport block unicast
no ip arp inspection trust
ip arp inspection limit rate 15 burst interval 1
ip arp inspection limit rate 15
load-interval 300
carrier-delay 2
no shutdown
ipv6 mld snooping tcn flood
ipv6 mfib forwarding input
ipv6 mfib forwarding output
ipv6 mfib cef input
ipv6 mfib cef output
snmp trap mac-notification change added
snmp trap mac-notification change removed
snmp trap link-status
arp arpa
arp timeout 14400
spanning-tree guard root
spanning-tree port-priority 128
spanning-tree cost 0
hold-queue 2000 in
hold-queue 40 out
ip igmp snooping tcn flood
no ip dhcp snooping information option allow-untrusted
no bgp-policy accounting input
no bgp-policy accounting output
no bgp-policy accounting input source
no bgp-policy accounting output source
no bgp-policy source ip-prec-map
no bgp-policy source ip-qos-map
no bgp-policy destination ip-prec-map
no bgp-policy destination ip-qos-map
interface GigabitEthernet1/0/26
switchport
switchport access vlan 1
switchport private-vlan trunk encapsulation dot1q
switchport private-vlan trunk native vlan tag
switchport mode trunk
no switchport nonegotiate
no switchport protected
no switchport block multicast
no switchport block unicast
no ip arp inspection trust
ip arp inspection limit rate 15 burst interval 1
ip arp inspection limit rate 15
load-interval 300
carrier-delay 2
no shutdown
ipv6 mld snooping tcn flood
ipv6 mfib forwarding input
ipv6 mfib forwarding output
ipv6 mfib cef input
ipv6 mfib cef output
snmp trap mac-notification change added
snmp trap mac-notification change removed
snmp trap link-status
cts role-based enforcement
cdp tlv location
cdp tlv server-location
cdp tlv app
arp arpa
arp timeout 14400
channel-group 36 mode active
spanning-tree guard root
spanning-tree port-priority 128
spanning-tree cost 0
hold-queue 2000 in
hold-queue 40 out
ip igmp snooping tcn flood
no ip dhcp snooping information option allow-untrusted
no bgp-policy accounting input
no bgp-policy accounting output
no bgp-policy accounting input source
no bgp-policy accounting output source
no bgp-policy source ip-prec-map
no bgp-policy source ip-qos-map
no bgp-policy destination ip-prec-map
no bgp-policy destination ip-qos-map
interface GigabitEthernet2/0/26
switchport
switchport access vlan 1
switchport private-vlan trunk encapsulation dot1q
switchport private-vlan trunk native vlan tag
switchport mode trunk
no switchport nonegotiate
no switchport protected
no switchport block multicast
no switchport block unicast
no ip arp inspection trust
ip arp inspection limit rate 15 burst interval 1
ip arp inspection limit rate 15
load-interval 300
carrier-delay 2
no shutdown
ipv6 mld snooping tcn flood
ipv6 mfib forwarding input
ipv6 mfib forwarding output
ipv6 mfib cef input
ipv6 mfib cef output
snmp trap mac-notification change added
snmp trap mac-notification change removed
snmp trap link-status
cts role-based enforcement
cdp tlv location
cdp tlv server-location
cdp tlv app
arp arpa
arp timeout 14400
channel-group 36 mode active
spanning-tree guard root
spanning-tree port-priority 128
spanning-tree cost 0
hold-queue 2000 in
hold-queue 40 out
ip igmp snooping tcn flood
no ip dhcp snooping information option allow-untrusted
no bgp-policy accounting input
no bgp-policy accounting output
no bgp-policy accounting input source
no bgp-policy accounting output source
no bgp-policy source ip-prec-map
no bgp-policy source ip-qos-map
no bgp-policy destination ip-prec-map
no bgp-policy destination ip-qos-map
The queue size on the other side is 75, so there shouldn't be any problem there. Also, I didn't have this problem before the IOS update (the previous version was 03.02.03 and the current one is 03.06.06).
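For reference, this is roughly how I'm watching the counters while testing (standard IOS commands; the interface is just the one from the output above):
clear counters GigabitEthernet1/0/26
! let it run under normal traffic for a few minutes, then
show interfaces GigabitEthernet1/0/26 | include Total output drops
show interfaces GigabitEthernet1/0/26 counters errors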
Best Regards.
04-02-2017 03:46 AM
The drop rate is around 200k packets while the traffic is only about 10 Mbit/s. That's a lot, and I don't know why it is so high.
Could it be related to VLANs? These ports work in trunk mode, but the switch on the other side does not carry some of the VLANs that are present on this device.
04-03-2017 12:52 AM
It looks like the problem is not linked to VLANs. I see the same problem on ports in access mode too.
So, any ideas what it could be?
04-03-2017 02:22 AM
OK. My results so far:
The problem looks related to QoS. It seems Cisco changed the default queue settings in the newer IOS versions. As far as I can see, the second queue (data traffic) on the port is dropping traffic:
#show platform qos queue stats gi 1/0/26
DATA Port:20 Drop Counters
-------------------------------
Queue Drop-TH0 Drop-TH1 Drop-TH2 SBufDrop QebDrop
----- ----------- ----------- ----------- ----------- -----------
0 0 0 0 0 0
1 0 0 31764522 0 0
As a solution I found these lines:
qos queue-softmax-multiplier 1200
ip access-list extended allTraffic
remark --- ACL for matching all traffic ---
permit ip any any
exit
class-map match-any cDefQos
match access-group name allTraffic
exit
policy-map pDesQos
class class-default
bandwidth percent 100
exit
exit
int [name]
service-policy output pDesQos
exit
But so far the first line,
qos queue-softmax-multiplier 1200
was enough for me. After applying it I don't see errors on the interfaces anymore.
I'm still monitoring the ports, but there are no problems so far.
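To double-check the effect I'm comparing the per-queue buffer allocation and the drop counters before and after the change; a small sketch, assuming the "config" variant of the platform command is available on this release alongside the "stats" one used above (the softmax values should come out larger once the multiplier is applied):
show platform qos queue config gigabitEthernet 1/0/26
show platform qos queue stats gigabitEthernet 1/0/26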
Best Regards.
04-04-2017 12:01 AM
I solved the problem on all interfaces except Gi1/0/26 and Gi2/0/26. Counters keep rising only on these two interfaces. I applied the service-policy to these interfaces, but it didn't change anything.
Does anyone have any ideas? I'm not very good with QoS...
Best Regards.
04-04-2017 02:36 PM
Hi,
Seeing output drops is a known issue on Catalyst 3850 switches. Cisco published a document where the issue is described:
http://www.cisco.com/c/en/us/support/docs/switches/catalyst-3850-series-switches/200594-Catalyst-3850-Troubleshooting-Output-dr.html
The drops are caused by micro-bursts that the switch port cannot buffer (the soft buffer for queue 1 is too small).
QoS can help to reduce the drops but will not fix the problem completely; you may still see drops, just in another queue.
To increase the soft buffer of the ports, you can use "qos queue-softmax-multiplier".
Since all ports of the Catalyst 3850 use a shared buffer on an as-needed basis, I thought it could be a problem to change the soft-buffer value to the maximum of 1200.
A few days ago I got the answer from TAC that this would have no negative impact; a value of 1200 is OK.
If you are using a release prior to 3.6.6 or 3.7.5, you have to attach a service-policy for the soft-buffer increase to take effect. Since 3.6.6 / 3.7.5, increasing the buffer takes effect without a service-policy.
regards
~chris
04-05-2017 12:41 AM
Hi, Chris.
Thanks for your reply.
I did see this document, but it says this:
Output drops are generally a result of interface oversubscription caused by many to one or a 10gig to 1gig transfer.
In my case, however, I see a drop rate of about 150k packets at only 50 Mbit/s of traffic (on 1G ports in full duplex). At such a rate the buffer shouldn't be stressed at all.
And, as you said, I now see drops in another queue. Before attaching the service-policy to the interface I saw drops in queue 1; after attaching the policy (with only the default class at 100% bandwidth) I see drops in queue 0.
So should I add one more class with "priority level 0 percent ..."?
Best Regards.
04-12-2017 12:33 AM
Hi,
With that service-policy you effectively disable queue 1 and all traffic flows into queue 0; that's why you now see the dropped traffic in that queue.
Did you also change the soft buffer to a higher value?
Which version of IOS-XE are you running on the box, 3.6.6 / 3.7.5?
regards
~chris
04-12-2017 06:04 AM
Hi, Chris.
The IOS version is 03.06.06.
My current QoS changes are:
qos queue-softmax-multiplier 1200
ip access-list extended allTraffic
remark --- ACL for matching all traffic ---
permit ip any any
class-map match-any cDefQos
match access-group name allTraffic
policy-map pDesQos
class class-default
bandwidth percent 100
int [name]
service-policy output pDesQos
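To confirm the policy is actually attached to the interface, I check it with the standard MQC command (interface name as above):
show policy-map interface GigabitEthernet1/0/26 output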
Best Regards.
04-12-2017 07:44 AM
Hi,
Yes, that looks good.
But on 3.6.6 I would prefer not to use the service-policy at all and only increase the soft buffer to the maximum value of 1200.
Why? As I wrote above, with this service-policy queue 1 is disabled and all traffic (control traffic and normal traffic) is combined in queue 0, so there is no default "classification" anymore.
Whichever you use, do you still see drops?
regards
~chris
04-19-2017 05:37 AM
Hi, c.edel.
Sorry for the late response. I see drops in both cases: when I'm using the service-policy and when I'm not. With the policy attached I see drops in queue 0, and without it I see drops in queue 1.
Best Regards.
01-19-2018 06:05 AM
3. If you define only 1 class-default, in order to tweak the buffer, all the traffic falls under the single queue (including control packets). Be advised that when all traffic is put in one queue, there is no classification between control and data traffic and during time of congestion, control traffic could get dropped. So, it is recommended to create at least 1 other class for control traffic. CPU generated control-packets will always go to the first priority queue even if not matched in the class-map. If there is no priority queue configured, it would go to the first queue of the interface, which is queue-0.
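A minimal sketch of that recommendation in the naming style used earlier in this thread; the class name and the DSCP values used to identify control traffic are illustrative assumptions, not taken from the document:
class-map match-any cControlQos
 match dscp cs6 cs7
policy-map pDesQos
 class cControlQos
  priority level 1
 class class-default
  bandwidth remaining percent 100
interface [name]
 service-policy output pDesQos
With the control traffic split out like this, the data traffic still lands in class-default (and benefits from the softmax increase), while CPU-generated control packets keep a priority queue of their own during congestion.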
10-02-2018 02:34 AM
Did you add this to both ends of the EtherChannel or just on the 3850 end?
11-03-2022 07:26 AM
I'm curious whether anyone has run into this issue on more modern IOS such as 16.x. We were seeing heavy drops on a 4-port 1G EtherChannel with less than 30 Mbit/s out per member. We applied qos queue-softmax-multiplier 1200 to the global config, and the drops decreased dramatically: the port-channel interface went from 24 million drops every half hour down to about a million per day. The lack of any "show platform qos ..." commands makes this difficult on that IOS; I've yet to find the replacement syntax, if there is one.
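If it exists, my best guess is the "fed" form of the platform commands, something along the lines of the line below, but I haven't been able to confirm it on our box, so treat the exact syntax (and the example interface) as an assumption:
show platform hardware fed switch active qos queue stats interface gigabitEthernet 1/0/1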
Thanks,
Chuck
11-03-2022 08:20 AM
Chuck,
Did you find this information by reading this document?
Thanks,
Dallas