
Output drops on a 1Gb interface

pwilliams86
Level 1

Hi,

We have a C3850 in our data centre with a fibre 1 Gbps layer-2 link to a remote site. We are seeing output drops on this interface at random times of the day. In the past 24 hours there have been nearly 2,000,000 drops, and users are reporting a slow network. Their actual utilization is very low, around 5Mbps with occasional spikes at 20Mbps, so not even touching the sides.

Input queue: 0/2000/0/0 (size/max/drops/flushes); Total output drops: 2412072
Queueing strategy: fifo
Output queue: 0/40 (size/max)
5 minute input rate 680000 bits/sec, 330 packets/sec
5 minute output rate 2598000 bits/sec, 438 packets/sec
34581067 packets input, 10213669927 bytes, 0 no buffer
Received 569985 broadcasts (272695 multicasts)
0 runts, 0 giants, 0 throttles
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
0 watchdog, 272695 multicast, 0 pause input
0 input packets with dribble condition detected
48843768 packets output, 38999739203 bytes, 0 underruns
2412072 output errors, 0 collisions, 0 interface resets
0 unknown protocol drops
0 babbles, 0 late collision, 0 deferred
0 lost carrier, 0 no carrier, 0 pause output
0 output buffer failures, 0 output buffers swapped out

Should I see about increasing the output buffer from the default of 40? If so, should I double it or make it much larger, e.g. 1000? And will I be able to do it without bringing the link down? Are there any debug commands I could run on that interface to capture when these drops are happening?

 

Many thanks for any help,

P
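
To put the counters in the post above in proportion, here is a minimal Python sketch that turns them into a drop ratio. It assumes that "packets output" counts only packets actually transmitted, so the offered load is transmitted plus dropped; the function name `drop_ratio` is purely illustrative. The thread does not say when the counters were last cleared, so this is a ratio over an unknown period, not a rate.

```python
def drop_ratio(output_packets: int, output_drops: int) -> float:
    """Fraction of packets offered to the interface that were dropped.

    Assumption: 'packets output' excludes dropped packets, so the
    offered load is transmitted + dropped.
    """
    offered = output_packets + output_drops
    if offered == 0:
        return 0.0
    return output_drops / offered


# Counters copied from the interface output in the original post.
PACKETS_OUTPUT = 48_843_768
TOTAL_OUTPUT_DROPS = 2_412_072

ratio = drop_ratio(PACKETS_OUTPUT, TOTAL_OUTPUT_DROPS)
print(f"Output drop ratio: {ratio:.2%}")
```

Under that assumption the ratio comes out at roughly 4.7% of offered packets, which is high enough to hurt TCP throughput noticeably.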

8 Replies

Deepak Kumar
VIP Alumni

Hello,

I can see the errors here:

2412072 output errors, 0 collisions, 0 interface resets

 

This is due to high link utilization, high CPU usage, or a software bug. Please provide the software details.

 

Regards,

Deepak Kumar

Don't forget to vote and accept the solution if this comment will help you!

Base Ethernet MAC Address : 54:7c:69:df:26:00
Motherboard Assembly Number : 73-14442-10
Motherboard Serial Number : FOC184917BU
Model Revision Number : R0
Motherboard Revision Number : A0
Model Number : WS-C3850-48P
System Serial Number : FCW1849C0ZN


Switch Ports Model         SW Version  SW Image               Mode
------ ----- -----         ----------  ----------             ----
*    1 56    WS-C3850-48P  03.06.05E   cat3k_caa-universalk9  INSTALL

This link was moved from another switch a couple of months ago, and the old interface shows the same output drops. That switch is running the same software. I'm not convinced it is software: we have hundreds of these uplinks running on identical switches, but this is the only site that has these errors, and the errors carried on after we moved the link to another switch.

 

Thanks,

P

Hi,
Please see Cisco bug ID CSCvb65304:
https://bst.cloudapps.cisco.com/bugsearch/bug/CSCvb65304/?referring_site=bugquickviewredir

Also check the interface reliability.
Regards,
Deepak Kumar

Thanks for the link. Will that bug be affecting the end users? I have seen these drops before on our ethernet access ports, where they don't seem to affect the end user, but this is the first time I have seen them on a fibre uplink.

 

The interface shows: reliability 255/255, txload 2/255, rxload 1/255

 

 

Reliability 255/255 means the link is 100% up and reliable, so your fibre seems OK.

Please upgrade IOS and check again.

Regarding the users: they may not be bandwidth-hungry, so they may not notice the drops.

 

Regards,

Deepak Kumar 


Joseph W. Doherty
Hall of Fame

"Their actual utilization is very low, around 5Mbps with occasional spikes at 20Mbps . . ."

Measured over what time interval? A 40 packet queue can fill (or drain) very quickly at gig.
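
As a rough illustration of both points, the sketch below works out how long a full 40-packet queue takes to drain at 1 Gbps, and how a short line-rate burst looks once it is averaged over a 5-minute window. The 1500-byte packet size and the one-second burst are assumptions chosen for the arithmetic, not measurements from this link, and the function names are illustrative.

```python
def queue_time_s(queue_packets: int, packet_bytes: int, link_bps: float) -> float:
    """Time to serialise a full queue of equal-sized packets onto the link."""
    return queue_packets * packet_bytes * 8 / link_bps


def average_rate_bps(burst_bps: float, burst_s: float, window_s: float) -> float:
    """Average rate over a window containing one burst and otherwise idle."""
    return burst_bps * burst_s / window_s


LINK_BPS = 1e9        # 1 Gbps link from the thread
QUEUE_PACKETS = 40    # "Output queue: 0/40" in the interface output
PACKET_BYTES = 1500   # assumed packet size (not stated in the thread)

drain_s = queue_time_s(QUEUE_PACKETS, PACKET_BYTES, LINK_BPS)
print(f"40-packet queue drains in {drain_s * 1e3:.2f} ms")

# Assumed example: one second at line rate inside a 5-minute (300 s) window.
avg_bps = average_rate_bps(LINK_BPS, burst_s=1.0, window_s=300.0)
print(f"5-minute average for that burst: {avg_bps / 1e6:.2f} Mbps")
```

With these assumptions the queue holds under half a millisecond of traffic, while a full second at line rate still averages out to only a few Mbps over five minutes, so a low 5-minute rate does not rule out congestion.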

If the egress queue is really only 40, that's probably rather shallow for gig. Often I've found you want a network device's egress queue to be about half of the BDP (bandwidth delay product).

I know little about the 3850, so don't know whether the egress queue is really 40 or whether increasing its size will matter, as actual egress queue(s) might be supported by an ASIC.

[Attached graphs: barking error.PNG, barking bps.PNG]

Contrary to my original post, I think there obviously is some correlation between the two graphs. All of those errors are discards. Unfortunately we only added the interface to SolarWinds two days ago, so there is not a huge amount of data.

Our 10 Gbps uplinks elsewhere are also all set to 40, and we have never had an issue with them. Wikipedia gives this example:

High-speed terrestrial network: 1 Gbit/s, 1 ms RTT

B×D = 10^9 b/s × 10^−3 s = 10^6 b, or 1 Mb, or 125 kB.
which fits our link very well, so what would you set the hold-queue to? And would changing that in the interface settings cause the interface to go down temporarily?
 
Many thanks,
P
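
The bandwidth-delay product quoted above can be converted into a packet count to compare it with the queue depth of 40. This is a minimal Python sketch, assuming the 1 ms RTT from the Wikipedia example (the thread gives no measured RTT for this link) and 1500-byte packets; the helper names are illustrative.

```python
def bdp_bits(bandwidth_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product in bits."""
    return bandwidth_bps * rtt_s


def bits_to_packets(bits: float, packet_bytes: int) -> float:
    """Number of equal-sized packets needed to carry the given bits."""
    return bits / (packet_bytes * 8)


BANDWIDTH_BPS = 1e9   # 1 Gbit/s, as in the quoted example
RTT_S = 1e-3          # 1 ms RTT, as in the quoted example (assumed for this link)
PACKET_BYTES = 1500   # assumed packet size

bdp = bdp_bits(BANDWIDTH_BPS, RTT_S)
bdp_kbytes = bdp / 8 / 1000
half_bdp_packets = bits_to_packets(bdp / 2, PACKET_BYTES)

print(f"BDP: {bdp:.0f} bits ({bdp_kbytes:.0f} kB)")
print(f"Half BDP: about {half_bdp_packets:.1f} packets of {PACKET_BYTES} bytes")
```

Under these assumptions, half the BDP is about 42 full-size packets, which is close to the default of 40. A longer RTT or smaller packets would push the packet count up.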

Yes, it does look like there may be a correlation between the two graphs. It could be microbursts. BTW, a high drop rate will throttle throughput, which in turn makes the link look not too busy.

Your BDP is fine for a single gig LAN flow if you are not using jumbo frames, but can this egress interface be fed by multiple gig interfaces and/or 10g? Multiple concurrent gig or 10g flows may burst past the default egress queue depth.
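
To give a feel for how quickly that can happen, the sketch below estimates how long an initially empty 40-packet queue lasts when traffic arrives faster than the 1 Gbps egress can drain it. The ingress rates and the 1500-byte packet size are assumed examples, and the model treats the queue as a simple FIFO of equal-sized packets, which may not match how the switch ASIC actually buffers.

```python
from typing import Optional


def time_to_overflow_s(queue_packets: int, packet_bytes: int,
                       ingress_bps: float, egress_bps: float) -> Optional[float]:
    """Time for an empty queue to fill when arrivals exceed the drain rate.

    Returns None when ingress does not exceed egress (the queue never fills).
    """
    excess_bps = ingress_bps - egress_bps
    if excess_bps <= 0:
        return None
    return queue_packets * packet_bytes * 8 / excess_bps


QUEUE_PACKETS = 40    # default egress queue depth from the thread
PACKET_BYTES = 1500   # assumed packet size
EGRESS_BPS = 1e9      # the 1 Gbps uplink

# Assumed scenarios: two 1G senders bursting together, or one 10G sender.
two_gig_s = time_to_overflow_s(QUEUE_PACKETS, PACKET_BYTES, 2e9, EGRESS_BPS)
ten_gig_s = time_to_overflow_s(QUEUE_PACKETS, PACKET_BYTES, 10e9, EGRESS_BPS)

print(f"2 x 1G into 1G: queue full after {two_gig_s * 1e6:.0f} microseconds")
print(f"10G into 1G:    queue full after {ten_gig_s * 1e6:.0f} microseconds")
```

In this simplified model, bursts lasting well under a millisecond are enough to start dropping, which is far shorter than any polling interval a monitoring tool would normally use.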

Changing the egress queue size may very briefly interrupt traffic, but likely no worse than what your current drops might be doing.

Again, I'm not familiar with a 3850, so unsure what impact changing the hold queue size will have. On the earlier 3750, you had to adjust queues with the SRR commands.