
paul amaral · ‎04-22-2016

Hi all, I have an interface that constantly has output drops. Looking at the drops, I can confirm they are all output discards. The problem I'm having is that I can’t figure out what is causing the output packet drops.

The interface is part of a routed VLAN and is set for 100Mb full duplex, connected to an L2 switch set at 100/full as well. I already replaced the L2 switch and that didn’t make a difference. Also, looking at the L2 switch's interface, there are no input drops or errors at all.

I have read a lot of documents on discards and have done as much troubleshooting as I can and have not been able to stop this from happening or even determine what packets are being dropped.

I have looked for microbursts using Wireshark and there are none; even at 2Mb you will see discards sometimes. I have increased the output queue to match the input queue and that didn’t help. The interface looks clean with no CRC/runts etc. and I’m barely touching the 100Mb throughput. Can someone recommend what else I can do and look at to determine what the issue might be?

Mod Ports Card Type                              Model              Serial No.
--- ----- -------------------------------------- ------------------ -----------
1   48 SFM-capable 48-port 10/100 Mbps RJ45   WS-X6548-RJ-45     SAL0710A54G
2   48 SFM-capable 48-port 10/100 Mbps RJ45   WS-X6548-RJ-45     SAL09444KVM

7    2 Supervisor Engine 720 (Active)         WS-SUP720-3BXL     SAD084202LK
8    2 Supervisor Engine 720 (Hot)            WS-SUP720-3BXL     SAL1015JPRZ

TIA, Paul

Vlan615 is up, line protocol is up
Hardware is EtherSVI, address is 0015.c7c7.0880 (bia 0015.c7c7.0880)
Internet address is xxxx
MTU 1500 bytes, BW 1000000 Kbit, DLY 10 usec,
     reliability 255/255, txload 1/255, rxload 1/255
Encapsulation ARPA, loopback not set
Keepalive not supported
ARP type: ARPA, ARP Timeout 04:00:00
Last input 00:00:22, output 00:00:22, output hang never
Last clearing of "show interface" counters 01:49:53
Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
Queueing strategy: fifo
Output queue: 0/40 (size/max)
5 minute input rate 1337000 bits/sec, 349 packets/sec
5 minute output rate 1938000 bits/sec, 344 packets/sec
L2 Switched: ucast: 214 pkt, 14696 bytes - mcast: 12 pkt, 768 bytes
L3 in Switched: ucast: 1582724 pkt, 567827172 bytes - mcast: 0 pkt, 0 bytes mcast
L3 out Switched: ucast: 1625654 pkt, 1115273430 bytes mcast: 0 pkt, 0 bytes
     1584871 packets input, 568106584 bytes, 0 no buffer
     Received 12 broadcasts (0 IP multicasts)
     0 runts, 0 giants, 0 throttles
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
     1623842 packets output, 1114306193 bytes, 0 underruns
     0 output errors, 0 interface resets
     0 output buffer failures, 0 output buffers swapped out

FastEthernet1/36 is up, line protocol is up (connected)
Hardware is C6k 100Mb 802.3, address is 0009.11f6.35b3 (bia 0009.11f6.35b3)
Description: Spamcan new port - pa testing
MTU 1500 bytes, BW 100000 Kbit, DLY 100 usec,
     reliability 255/255, txload 2/255, rxload 2/255
Encapsulation ARPA, loopback not set
Keepalive set (10 sec)
Full-duplex, 100Mb/s, media type is 10/100BaseTX
input flow-control is off, output flow-control is unsupported
ARP type: ARPA, ARP Timeout 04:00:00
Last input 00:00:25, output never, output hang never
Last clearing of "show interface" counters 01:50:32
Input queue: 0/2000/0/0 (size/max/drops/flushes); Total output drops: 2468
Queueing strategy: fifo
Output queue: 0/40 (size/max)
30 second input rate 942000 bits/sec, 289 packets/sec
30 second output rate 1097000 bits/sec, 271 packets/sec
     1592429 packets input, 569790617 bytes, 0 no buffer
     Received 3434 broadcasts (3422 multicasts)
     0 runts, 0 giants, 0 throttles
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
     0 watchdog, 0 multicast, 0 pause input
     0 input packets with dribble condition detected
     1626638 packets output, 1105460469 bytes, 0 underruns
     0 output errors, 0 collisions, 0 interface resets
     0 babbles, 0 late collision, 0 deferred
     0 lost carrier, 0 no carrier, 0 PAUSE output
     0 output buffer failures, 0 output buffers swapped out

sh int fast1/36 counters error

Port      Align-Err  FCS-Err  Xmit-Err  Rcv-Err  UnderSize  OutDiscards
Fa1/36            0        0         0        0          0         2468

Port      Single-Col  Multi-Col  Late-Col  Excess-Col  Carri-Sen  Runts  Giants
Fa1/36             0          0         0           0          0      0       0

Port      SQETest-Err  Deferred-Tx  IntMacTx-Err  IntMacRx-Err  Symbol-Err
Fa1/36              0            0             0             0           0

Interface FastEthernet1/36 queueing strategy: Weighted Round-Robin
Port QoS is enabled
Trust boundary disabled
Port is untrusted
Extend trust state: not trusted [COS = 0]
Default COS is 0
Queueing Mode In Tx direction: mode-cos
Transmit queues [type = 1p3q1t]:

Queue Id    Scheduling    Num of thresholds
-----------------------------------------
   1        WRR           1
   2        WRR           1
   3        WRR           1
   4        Priority      1

WRR bandwidth ratios: 100[queue 1] 150[queue 2] 200[queue 3]

queue random-detect-min-thresholds
----------------------------------
  1    70[1]
  2    70[1]
  3    70[1]

queue random-detect-max-thresholds
----------------------------------
  1    100[1]
  2    100[1]
  3    100[1]

WRED disabled queues:

queue thresh cos-map
---------------------------------------
  1     1      0 1
  2     1      2 3 4
  3     1      6 7
  4     1      5
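To make the queueing output above easier to read, here is a minimal Python sketch that turns the WRR ratios into relative shares and looks up which transmit queue a given CoS value maps to. It assumes the ratios are relative weights among the three WRR queues when all of them are backlogged, with the priority queue served separately. It also assumes, without confirmation from the thread, that the traffic leaving this port is marked CoS 0 (the port shows as untrusted with default CoS 0, but the marking is decided where traffic enters the switch). If that holds, all of it would land in queue 1.

```python
# Values copied from the "show queueing" output for FastEthernet1/36.
WRR_RATIOS = {1: 100, 2: 150, 3: 200}
COS_MAP = {1: [0, 1], 2: [2, 3, 4], 3: [6, 7], 4: [5]}  # queue -> CoS values


def wrr_shares(ratios):
    """Relative share of each WRR queue, assuming every queue is backlogged."""
    total = sum(ratios.values())
    return {queue: weight / total for queue, weight in ratios.items()}


def queue_for_cos(cos, cos_map):
    """Return the transmit queue that a CoS value maps to."""
    for queue, cos_values in cos_map.items():
        if cos in cos_values:
            return queue
    raise ValueError(f"CoS {cos} is not present in the cos-map")


for queue, share in wrr_shares(WRR_RATIOS).items():
    print(f"queue {queue}: {share:.1%} of WRR bandwidth")

# Assumption: egress traffic is CoS 0 (not confirmed in the thread).
print("CoS 0 maps to queue", queue_for_cos(0, COS_MAP))
```

If the assumption is right, the drops would be concentrated in a single queue rather than spread across the port, which is something the per-queue counters in the troubleshooting guide linked later in the thread could confirm or rule out.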

paul amaral · ‎05-09-2016

Joseph, thanks for trying to help me out on this.

Behind the port with the output errors I have 4 servers doing SMTP, 3 spam filtering and one DB server, so I'm assuming that even though I saw no microbursts, there are enough packets hitting the output of that interface that they need to be queued, and that maybe there's not enough output buffer memory.

I'm assuming that each packet gets a buffer on the TX ring every time before the packet gets sent and once the ring is full it drops new packets based on random-detect, correct?

So could it be that the packets are small, so I'm not seeing bandwidth spikes, but the RX ring is still getting full because there is not enough memory?

I'm thinking of taking each server behind the 2900 L2 switch and plugging it directly into the 6900's ports, and I'm almost positive this should fix the issue. We will see.

Also, because all the servers are connected to the L2 switch and need to communicate with each other, I think this is done via L2 MAC and they never actually go back out to the 6900 L3 VLAN but rather just pass through the 2900 ports. My thinking was that somehow traffic was passing the 6900 once when the initial internet packet reached the SMTP filter server and then again when the filter server checks with the DB server, but thinking about it, this should all be L2 and stay on the 2900.

thanks for your feedback

paul

Joseph W. Doherty · ‎05-09-2016

Disclaimer

The Author of this posting offers the information contained within this posting without consideration and with the reader's understanding that there's no implied or expressed suitability or fitness for any purpose. Information provided is for informational purposes only and should not be construed as rendering professional advice of any kind. Usage of this posting's information is solely at reader's own risk.

Liability Disclaimer

In no event shall Author be liable for any damages whatsoever (including, without limitation, damages for loss of use, data or profit) arising out of the use or inability to use the posting's information even if Author has been advised of the possibility of such damage.

Posting

I'm assuming that each packet gets a buffer on the TX ring every time before the packet gets sent and once the ring is full it drops new packets based on random-detect, correct?

I'm unsure with a 6500 line card, but on, for example, ISRs, the TX ring overflows into the software queue(s).

Also, don't recall RED being an option on most 6500 line cards.

So could it be that the packets are small, so I'm not seeing bandwidth spikes, but the RX ring is still getting full because there is not enough memory?

Depending on how buffers are allocated, yes you could run out of buffer space, for less traffic volume, if the packets are small.
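A rough Python sketch of that point, assuming buffers are handed out per packet rather than per byte. The 40-slot figure is hypothetical, borrowed from the "Output queue: 0/40" line only as an illustrative number; it is not a statement about how the WS-X6548 hardware actually allocates buffers. Preamble and inter-frame gap are ignored.

```python
LINK_BPS = 100_000_000   # 100 Mb/s port speed, from the thread
QUEUE_SLOTS = 40         # hypothetical per-packet slot count (assumption)


def bytes_to_fill(slots, packet_bytes):
    """Bytes of back-to-back traffic needed to occupy every slot."""
    return slots * packet_bytes


def drain_time_us(total_bytes, link_bps=LINK_BPS):
    """Microseconds needed to transmit total_bytes at the link speed."""
    return total_bytes * 8 * 1_000_000 / link_bps


for size in (64, 1500):
    filled = bytes_to_fill(QUEUE_SLOTS, size)
    print(f"{size}-byte packets: queue full after {filled} bytes, "
          f"drained in {drain_time_us(filled):.1f} us")
```

Under these assumptions, a queue counted in packets fills after only 2,560 bytes of 64-byte packets, versus 60,000 bytes of 1500-byte packets, so a burst of small packets can overflow it while barely registering on a bandwidth graph.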

I'm thinking of taking each server behind the 2900 L2 switch and plugging it directly into the 6900's ports, and I'm almost positive this should fix the issue. We will see.

If you have the available ports, it's worth trying. It will be interesting to see if drops show on any of the server edge ports.

Also, because all the servers are connected to the L2 switch and need to communicate with each other, I think this is done via L2 MAC and they never actually go back out to the 6900 L3 VLAN but rather just pass through the 2900 ports. My thinking was that somehow traffic was passing the 6900 once when the initial internet packet reached the SMTP filter server and then again when the filter server checks with the DB server, but thinking about it, this should all be L2 and stay on the 2900.

If all the hosts are in the same VLAN, then normally yes, inter-host traffic should all stay on the 2900.

Joseph W. Doherty · ‎05-09-2016

"lurks" - laugh

Pawan Raut · ‎04-25-2016

Output drops are expected behaviour when you have QoS enabled: if any class of the QoS policy-map exceeds its allocated bandwidth, QoS drops the excess packets, and these are counted as output drops. Could you please provide the QoS config and interface config?

paul amaral · ‎04-25-2016

There is no QoS config and it's not exceeding the interface throughput; that's why I can't figure this out.

The 6500 line card itself has built-in QoS queues, but again it's not under congestion and there are no microbursts or runts/CRC errors etc.

Is there a way I can nail down which type of packets are being dropped?

Pawan Raut · ‎04-25-2016

If it has AutoQoS, then you should verify which queue the packets are being dropped from, using the troubleshooting guide below:

http://www.cisco.com/c/en/us/support/docs/switches/catalyst-6500-series-switches/71600-tsqos-6k.html

Carlos Villagran · ‎04-25-2016

Hi!

Can you send the configuration of the interface and the output of show policy-map interface gigx/x?

Best regards!

JC

paul amaral · ‎04-25-2016

There is no policy on the interface, so there is no output for that.

FastEthernet1/36 is up, line protocol is up (connected)
Hardware is C6k 100Mb 802.3, address is 0009.11f6.35b3 (bia 0009.11f6.35b3)
Description: Spamcan new port - pa testing
MTU 1500 bytes, BW 100000 Kbit, DLY 100 usec,
     reliability 255/255, txload 1/255, rxload 1/255
Encapsulation ARPA, loopback not set
Keepalive set (10 sec)
Full-duplex, 100Mb/s, media type is 10/100BaseTX
input flow-control is off, output flow-control is unsupported
ARP type: ARPA, ARP Timeout 04:00:00
Last input 00:00:03, output never, output hang never
Last clearing of "show interface" counters 3d02h
Input queue: 0/2000/0/0 (size/max/drops/flushes); Total output drops: 49199
Queueing strategy: fifo
Output queue: 0/40 (size/max)
30 second input rate 353000 bits/sec, 132 packets/sec
30 second output rate 614000 bits/sec, 134 packets/sec
     40471382 packets input, 9804150733 bytes, 0 no buffer
     Received 139057 broadcasts (138560 multicasts)
     0 runts, 0 giants, 0 throttles
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
     0 watchdog, 0 multicast, 0 pause input
     0 input packets with dribble condition detected
     43463962 packets output, 29110336306 bytes, 0 underruns
     0 output errors, 0 collisions, 0 interface resets
     0 babbles, 0 late collision, 0 deferred
     0 lost carrier, 0 no carrier, 0 PAUSE output
     0 output buffer failures, 0 output buffers swapped out
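For a sense of scale, the two FastEthernet1/36 snapshots posted in this thread can be reduced to a drop percentage and a drop rate. This is only arithmetic on the counters shown; the elapsed time for the second snapshot is taken as 74 hours because "3d02h" is only given to the nearest hour.

```python
# Counters copied from the two FastEthernet1/36 "show interface" outputs.
# elapsed_s comes from the "Last clearing" line of each output.
SNAPSHOTS = [
    {"label": "01:50:32 after clear", "elapsed_s": 1 * 3600 + 50 * 60 + 32,
     "output_drops": 2468, "packets_output": 1626638},
    {"label": "3d02h after clear", "elapsed_s": (3 * 24 + 2) * 3600,
     "output_drops": 49199, "packets_output": 43463962},
]


def drop_percent(snapshot):
    """Output drops as a percentage of packets output."""
    return 100.0 * snapshot["output_drops"] / snapshot["packets_output"]


def drops_per_hour(snapshot):
    """Average drop rate over the time since the counters were cleared."""
    return snapshot["output_drops"] * 3600 / snapshot["elapsed_s"]


for snap in SNAPSHOTS:
    print(f"{snap['label']}: {drop_percent(snap):.3f}% of packets output, "
          f"{drops_per_hour(snap):.0f} drops/hour")
```

Both snapshots work out to roughly 0.1 to 0.2 percent of packets output, which fits a pattern of short, occasional overflows rather than sustained congestion.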

Carlos Villagran · ‎04-25-2016

Hi!

Can you please post the show interfaces stats, show buffers and show interface x/x switching?

Regards!

JC

Carlos Villagran · ‎04-26-2016

Hi Paul, can you try tuning the output queue with switch(config-if)#hold-queue 1000 out?

I think this is an oversubscription issue.

Input queue: 0/2000/0/0 (size/max/drops/flushes); Total output drops: 2468

Output queue: 0/40 (size/max)

Hope it helps, best regards!

JC

paul amaral · ‎04-26-2016

I have done this and set it to 2000, since the input queue is 2000. I have watched the output errors accumulate while looking at Wireshark, and the bandwidth spike was only around 3 megs. It's a 100Mb interface.
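One thing worth checking with a figure like "3 megs" is the averaging interval behind it. The Python sketch below bins packet timestamps into windows of a chosen size and reports the peak rate. The capture data is entirely hypothetical (250 packets of 1500 bytes arriving 10 microseconds apart, then silence), chosen so that it averages to 3 Mb/s over one second; the thread does not say what interval the Wireshark graph used or what the real traffic looked like.

```python
from collections import defaultdict


def peak_rate_bps(packets, window_us):
    """Peak bits/sec over fixed windows.

    packets: iterable of (timestamp_in_microseconds, length_in_bytes).
    window_us: window size in microseconds.
    """
    bytes_per_window = defaultdict(int)
    for timestamp_us, length in packets:
        bytes_per_window[timestamp_us // window_us] += length
    if not bytes_per_window:
        return 0.0
    return max(bytes_per_window.values()) * 8 * 1_000_000 / window_us


# Hypothetical burst: 250 x 1500-byte packets, one every 10 microseconds.
burst = [(i * 10, 1500) for i in range(250)]

print(f"1 s windows:  {peak_rate_bps(burst, 1_000_000) / 1e6:.1f} Mb/s")
print(f"1 ms windows: {peak_rate_bps(burst, 1_000) / 1e6:.1f} Mb/s")
```

With this made-up burst, the one-second view shows 3 Mb/s while the one-millisecond view shows a peak far above what a 100Mb port can transmit, so the same traffic can look idle or oversubscribed depending on the window.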

Can't figure out output discards