01-20-2019 10:56 PM - edited 03-08-2019 05:05 PM
Hello everybody
We have a couple of 3850s (stacked) that we use in our aggregation layer.
Ports on these switches are bundled into port-channels to downstream switches; more than half of the ports are free (not connected, reserved for future use).
As you know, the 3850 series is really powerful, and stacking them increases their capabilities further.
But here is the output of one of the port-channel interfaces, which looks really strange:
Hardware is EtherChannel, address is ---------- (bia -------------)
MTU 1500 bytes, BW 2000000 Kbit/sec, DLY 10 usec,
reliability 255/255, txload 1/255, rxload 1/255
Encapsulation ARPA, loopback not set
Keepalive set (10 sec)
Full-duplex, 1000Mb/s, link type is auto, media type is
input flow-control is off, output flow-control is unsupported
Members in this channel: Gi1/0/24 Gi2/0/24
ARP type: ARPA, ARP Timeout 04:00:00
Last input 00:13:28, output never, output hang never
Last clearing of "show interface" counters never
Input queue: 0/2000/0/0 (size/max/drops/flushes); Total output drops: 562303015
Queueing strategy: fifo
Output queue: 0/40 (size/max)
5 minute input rate 153000 bits/sec, 134 packets/sec
5 minute output rate 2330000 bits/sec, 325 packets/sec
127201005 packets input, 44246365237 bytes, 0 no buffer
Received 1266947 broadcasts (903818 multicasts)
0 runts, 0 giants, 0 throttles
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
0 watchdog, 903818 multicast, 0 pause input
0 input packets with dribble condition detected
380610888 packets output, 410883103982 bytes, 0 underruns
0 output errors, 0 collisions, 1 interface resets
0 unknown protocol drops
0 babbles, 0 late collision, 0 deferred
0 lost carrier, 0 no carrier, 0 pause output
0 output buffer failures, 0 output buffers swapped out
Its IOS XE is almost up to date (3.6). No buffer failures, no errors, nothing else abnormal, yet there are output drops. These interfaces are 1G, but their utilization is at most 30% and often lower; some of them are even under 10%. The switches are stacked and most of their interfaces are not connected, so there shouldn't be any buffer limitation.
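Since the counters in the output above were never cleared, the drop counter is cumulative; one way to tell whether drops are still actively occurring is to take two snapshots of the `show interfaces` output a few minutes apart and diff the counter. A minimal Python sketch (the regex matches the "Total output drops" line shown above; the second snapshot value is hypothetical):

```python
import re

def output_drops(show_interface_text: str) -> int:
    """Extract the cumulative 'Total output drops' counter from
    'show interfaces' output (format as shown in the post above)."""
    m = re.search(r"Total output drops:\s*(\d+)", show_interface_text)
    if m is None:
        raise ValueError("no 'Total output drops' counter found")
    return int(m.group(1))

# Two snapshots taken some minutes apart (second value is made up):
snap1 = "Input queue: 0/2000/0/0 (size/max/drops/flushes); Total output drops: 562303015"
snap2 = "Input queue: 0/2000/0/0 (size/max/drops/flushes); Total output drops: 562310000"

delta = output_drops(snap2) - output_drops(snap1)
print(f"drops since first snapshot: {delta}")
```

If the delta stays at zero over a busy period, the large absolute number is historical rather than an ongoing problem.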
01-23-2019 02:35 AM
Hi Salem,
Did you implement all the advised solutions? If not, please go ahead and implement them, then update us with the output.
However, if you need additional help with your case, I suggest you open a TAC case and share the "show tech-support" output with them.
BR,
Tayyab - www.tayyabmunir.com
***Please rate if response was helpful***
01-26-2019 12:22 AM
Hello @MUHAMMAD TAYYAB MUNIR
May I ask you to answer my previous post, please?
I want to make sure before implementing the solution.
01-26-2019 12:58 AM
Thank you @MUHAMMAD TAYYAB MUNIR
I mean my last reply on page 1 (I replied to your post).
Could you please check it and confirm whether it's okay? I asked you some questions there.
01-26-2019 02:43 AM - edited 01-26-2019 06:45 AM
1) The output drop rate is (562303015 / 410883103982) × 100 ≈ 0.14%, which seems reasonable (less than 1 percent), or (562303015 / 410883103982) × 1000 ≈ 1.4 drops per 1000 bytes. Note the divisor here is total output bytes, not packets.
So the actual drop rate is less than 1 percent, which is quite low, and these drops happened on a non-priority queue.
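The arithmetic above can be checked quickly. One caveat worth flagging: the ratio as computed divides drops by output *bytes*, so it is drops per byte; a per-packet view, dividing by packets instead, paints a very different picture with these counters:

```python
drops = 562_303_015          # Total output drops (from the show output above)
out_bytes = 410_883_103_982  # bytes output
out_packets = 380_610_888    # packets output

# The ratio computed in the post above: drops per output byte.
per_byte_pct = drops / out_bytes * 100
print(f"{per_byte_pct:.2f}% of output bytes")

# A per-packet view instead: since the counters were never cleared,
# the two ratios tell quite different stories.
per_packet_pct = drops / (out_packets + drops) * 100
print(f"{per_packet_pct:.1f}% of attempted output packets")
```

This is one more reason to clear the counters and measure over a known interval before judging how serious the drops are.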
2) By applying the command qos queue-softmax-multiplier 600 we increase the buffer allocation sixfold, which means 300 × 6 × 256 = 450 KB for each interface. Right? Does it affect TenGig interfaces too? I have some TenGig interfaces that are working fine with their buffer size, without any drops.
Since all ports of the Catalyst 3850 draw from a shared buffer on an as-needed basis, I thought it could be a problem to raise the softmax value to the maximum of 1200, and a maximum-size buffer often adds latency to packets. So I'll initially increase it to 600.
conf t
qos queue-softmax-multiplier 600
Does this command require a reload?
A reload is required on releases 3.6.6E and 3.7.5E. In your case you're running 3.6.8, so the increased soft buffer takes effect immediately.
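The buffer arithmetic above can be sketched as a small helper. The defaults assumed here (a 300-buffer softmax per 1G port, 256-byte buffer units, and the multiplier expressed as a percentage so 600 means 6×) follow the figures quoted in this thread and are an approximation, not an exact platform model:

```python
def softmax_bytes(multiplier: int, base_buffers: int = 300, unit: int = 256) -> int:
    """Approximate per-port softmax buffer in bytes for
    'qos queue-softmax-multiplier <multiplier>' (valid range 100-1200)."""
    if not 100 <= multiplier <= 1200:
        raise ValueError("multiplier must be between 100 and 1200")
    return base_buffers * unit * multiplier // 100

print(softmax_bytes(600) // 1024, "KB")   # the 450 KB figure from the post
print(softmax_bytes(1200) // 1024, "KB")  # 900 KB at the maximum setting
```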
3) When we analyze the graphs, no interface comes close to its maximum bandwidth. So why does a 1G interface fail to transfer 30-40 Mb of data without output drops? Only because we are using them as uplinks (one to many)? But the 3850 series is supposed to be used at the aggregation layer, or even at the core in environments with lower data rates!
It's a bug, and a fixed release is already available.
By the way, I want to share another thing with you. Our 3850s are connected to downstream switches whose uplink ports are SFP+, but we use GLC-T (1G) transceivers to convert fiber to copper. I don't think that affects the output drops, though; I have had similar setups in other projects without any output drops. What is your opinion?
The bug was triggered on this device; the other devices simply haven't hit it yet.
01-27-2019 02:19 AM
Hello @MUHAMMAD TAYYAB MUNIR
I changed the softmax buffer and cleared all counters.
It will take 1-2 days to monitor the switch; I'll send the results as soon as possible.
Thank you so much
01-27-2019 02:22 AM
Happy to hear that you have applied the solution.
Please don't forget
01-27-2019 11:49 PM
Hello @MUHAMMAD TAYYAB MUNIR
Unfortunately there are still OUTPUT DROPS after increasing the softmax buffer.
SHAME ON CISCO !!!!!!!
Don't buy 3850 and 2960X ! Buggy platforms !!!!!!!!!!!!!!!
01-28-2019 02:04 AM
Hi @salemmahara
There is a solution to every problem; don't get discouraged.
Let's Make an Action Plan:
1) Increase the softmax buffer to the maximum of 1200, clear the counters, and check after 24 hours.
2) If you are still observing the drops, the only remaining solution is to upgrade the IOS.
No worries I will support you until you get rid of this issue.
@Leo Laohoo @Georg Pauwen I'm new to this community; I have no idea how we can engage Cisco officially or the TAC team to help this gentleman.
BR
Tayyab
01-28-2019 02:05 AM
@MUHAMMAD TAYYAB MUNIR wrote:
I have no idea How we can engage Cisco official or TAC team to help this gentleman.
The OP needs to contact Cisco TAC (and not you).
01-28-2019 02:07 AM
Looks like he doesn't have a valid support contract; otherwise he wouldn't be waiting for our feedback.
01-29-2019 01:01 AM
Hi @salemmahara
Have you had a chance to work on this case? Please share any updates.
BR
Tayyab
02-01-2019 09:10 PM
How is it going?
I'm so sorry for a week of absence.
Actually, I'm a little confused and have no idea how to solve the problem. Now I'm going to manually set the speed of my 2960s' uplinks to 1G and check the result. My assumption is that they may be part of this dark puzzle: although they're running at 1G, they're intrinsically SFP+!
There is another problem with upgrading to a new version of IOS XE! Cisco releases plenty of versions, and it's a nightmare to pick one: Denali, Everest, 16.X, 3.X... Which one should I go for? Of course Denali looks good, but Cisco's documentation is terrible. People have to search the internet and share their knowledge, which is not always trustworthy.
01-23-2019 12:39 AM
@salemmahara wrote:
We're using 3.6.8 so it should be solved in this release
Raise a TAC Case. Details in Bug IDs are seldom accurate and rarely updated.
01-26-2019 08:00 AM
I had very similar issues recently. If you are using fiber, you might want to check the Tx/Rx levels and temperature of your transceivers with the
sh transceiver detail command on the interface.
In my case, when the transceiver got hot, I would start dropping packets, and I had some erratic light-level issues as well.
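If you want to act on those readings programmatically, one option is to parse the columnar table that the transceiver show command prints. This is only a sketch: the sample text, its column layout, and the 45 °C threshold are all assumptions here, since the exact output format varies by platform and release:

```python
# Hypothetical sample in a typical columnar transceiver-table layout:
sample = """\
Port       Temperature  Voltage  Current  Tx Power  Rx Power
           (Celsius)    (Volts)  (mA)     (dBm)     (dBm)
---------  -----------  -------  -------  --------  --------
Gi1/0/24   48.2         3.30     6.1      -2.5      -3.1
Gi2/0/24   36.4         3.29     5.9      -2.7      -3.4
"""

def hot_ports(text: str, limit_c: float = 45.0) -> list[str]:
    """Return ports whose reported temperature exceeds limit_c
    (the threshold is an assumption, not a Cisco alarm value)."""
    ports = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows start with a port name like Gi1/0/24; header and
        # separator rows contain no '/' in the first column.
        if len(fields) >= 2 and "/" in fields[0]:
            port, temp = fields[0], float(fields[1])
            if temp > limit_c:
                ports.append(port)
    return ports

print(hot_ports(sample))
```

Run periodically, this would flag a transceiver that only misbehaves once it heats up, matching the symptom described above.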