OutDiscard errors 3750x

Tech21 · 04-04-2018

Hi

I'm trying to get to the bottom of some issues experienced by my users. They are seeing network slowness on occasion, and the only issue I can find from a network perspective is OutDiscards. Most stations are unaffected - this seems to be about data transfer/processes on specific PCs, but I'd like to see if I can get to the bottom of the OutDiscard issue too.

Scenario - Users are transferring data (high volumes of images of 20Mb+ per file, gigabytes of data throughout the day) between PCs and SAN storage. The PCs are connected to a Catalyst 3750X switch, which connects up to a Catalyst 4500 aggregate router; from there, the servers and SAN are connected to the aggregate router via EtherChanneled 10Gb fibre ports. (Note: the original setup was agg router > server switch (a 3750 again) > server/SAN, with 3x 1Gb interfaces EtherChanneled between the switch and the server.)

We were previously seeing OutDiscard errors on the server switch, so we upgraded the server NICs to 10Gb and moved the uplinks directly to the agg router, as that had greater 10Gb capacity - and the OutDiscards have gone away on that side of the setup. However, on the client side we are still seeing OutDiscards for specific stations. From my understanding of OutDiscard errors and from the output below, these discards are happening as traffic exits the interfaces towards the client PCs. We can replicate this with multiple file transfers to and from the servers (we initially thought it was a scripting issue, but it also happens using Windows Explorer). Of course this is potentially just an issue relating to high traffic volumes, but it would be really useful if we could eliminate these errors and move root cause resolution along to something else (or fix the issue :) )

I have manually set speed to 1000 and duplex to full on these ports to see if this would help, but sadly not.

Hopefully someone can suggest some further troubleshooting or resolutions - possibly relating to buffering, thanks

Please note that QoS is not configured on the switch

Model number : WS-C3750X-48T-S, running 15.0(2)SE7

Example errors:

Port      Align-Err  FCS-Err  Xmit-Err  Rcv-Err  UnderSize  OutDiscards
Gi1/0/44          0        0         0        0          0       234338

3750-x#sh interface summary

*: interface is up
IHQ: pkts in input hold queue     IQD: pkts dropped from input queue
OHQ: pkts in output hold queue    OQD: pkts dropped from output queue
RXBS: rx rate (bits/sec)          RXPS: rx rate (pkts/sec)
TXBS: tx rate (bits/sec)          TXPS: tx rate (pkts/sec)
TRTL: throttle count

Interface                   IHQ       IQD       OHQ       OQD      RXBS      RXPS      TXBS      TXPS      TRTL
------------------------------------------------------------------------------------------
* GigabitEthernet1/0/37         0         0         0       163    122000        26   2220000       177         0
* GigabitEthernet1/0/38         0         0         0         0 20466000      1680    274000       436         0
* GigabitEthernet1/0/39         0         0         0       814    152000       178 20560000      1686         0
* GigabitEthernet1/0/40         0         0         0       238   1384000       159   7049000       625         0
* GigabitEthernet1/0/41         0         0         0         0   2089000       164     52000        58         0
* GigabitEthernet1/0/42         0         0         0      1803    615000       919 172827000     14252         0
* GigabitEthernet1/0/43         0         0         0         0 44873000      3704    904000      1127         0
* GigabitEthernet1/0/44         0         0         0      3748   6987000      1486 53013000      4584         0
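As a rough cross-check of the summary above, the sketch below parses the same rows and lists the ports with a non-zero OQD next to their average transmit utilisation. The function names (`parse_summary`, `dropping_ports`) are made up for illustration, and the 1 Gb/s link speed is an assumption based on the ports having been set manually to speed 1000.

```python
# Rows copied verbatim from the `sh interface summary` output in the post.
SUMMARY = """\
* GigabitEthernet1/0/37         0         0         0       163    122000        26   2220000       177         0
* GigabitEthernet1/0/38         0         0         0         0 20466000      1680    274000       436         0
* GigabitEthernet1/0/39         0         0         0       814    152000       178 20560000      1686         0
* GigabitEthernet1/0/40         0         0         0       238   1384000       159   7049000       625         0
* GigabitEthernet1/0/41         0         0         0         0   2089000       164     52000        58         0
* GigabitEthernet1/0/42         0         0         0      1803    615000       919 172827000     14252         0
* GigabitEthernet1/0/43         0         0         0         0 44873000      3704    904000      1127         0
* GigabitEthernet1/0/44         0         0         0      3748   6987000      1486 53013000      4584         0
"""

LINK_BPS = 1_000_000_000  # assumption: every port runs at 1 Gb/s


def parse_summary(text):
    """Return (interface, OQD, TXBS) for each data row of the summary."""
    rows = []
    for line in text.splitlines():
        parts = line.lstrip("* ").split()
        # A data row is an interface name followed by nine numeric columns.
        if len(parts) != 10 or not all(p.isdigit() for p in parts[1:]):
            continue
        ihq, iqd, ohq, oqd, rxbs, rxps, txbs, txps, trtl = map(int, parts[1:])
        rows.append((parts[0], oqd, txbs))
    return rows


def dropping_ports(rows, link_bps=LINK_BPS):
    """Return (interface, OQD, average TX utilisation in %) where OQD > 0."""
    return [(name, oqd, 100.0 * txbs / link_bps)
            for name, oqd, txbs in rows if oqd > 0]


if __name__ == "__main__":
    for name, oqd, util in dropping_ports(parse_summary(SUMMARY)):
        print(f"{name}: OQD={oqd}, avg TX utilisation={util:.1f}%")
```

Every port with output drops is transmitting towards the PC at well under its line rate on average, so the drops would have to come from short bursts rather than from sustained saturation of the port.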

3750-x#$show platform port-asic stats drop gigabitEthernet 1/0/44

Interface Gi1/0/44 TxQueue Drop Statistics
    Queue 0
      Weight 0 Frames 0
      Weight 1 Frames 0
      Weight 2 Frames 0
    Queue 1
      Weight 0 Frames 0
      Weight 1 Frames 0
      Weight 2 Frames 0
    Queue 2
      Weight 0 Frames 0
      Weight 1 Frames 0
      Weight 2 Frames 0
    Queue 3
      Weight 0 Frames 0
      Weight 1 Frames 0
      Weight 2 Frames 347769
    Queue 4
      Weight 0 Frames 0
      Weight 1 Frames 0
      Weight 2 Frames 0
    Queue 5
      Weight 0 Frames 0
      Weight 1 Frames 0
      Weight 2 Frames 0
    Queue 6
      Weight 0 Frames 0
      Weight 1 Frames 0
      Weight 2 Frames 0
    Queue 7
      Weight 0 Frames 0
      Weight 1 Frames 0
      Weight 2 Frames 0
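To make it easier to spot which counters are non-zero in output like the above, here is a minimal parsing sketch. The function name `queue_drops` is hypothetical, and the sample is a shortened copy of the listing with only two of the eight queues included.

```python
import re


def queue_drops(text):
    """Map (queue, weight) -> dropped frames, keeping only non-zero counters."""
    drops = {}
    queue = None
    for line in text.splitlines():
        m = re.match(r"\s*Queue\s+(\d+)\s*$", line)
        if m:
            queue = int(m.group(1))
            continue
        m = re.match(r"\s*Weight\s+(\d+)\s+Frames\s+(\d+)\s*$", line)
        if m and queue is not None:
            frames = int(m.group(2))
            if frames:
                drops[(queue, int(m.group(1)))] = frames
    return drops


# Shortened copy of the Gi1/0/44 listing (queues 2 and 3 only).
SAMPLE = """\
Interface Gi1/0/44 TxQueue Drop Statistics
    Queue 2
      Weight 0 Frames 0
      Weight 1 Frames 0
      Weight 2 Frames 0
    Queue 3
      Weight 0 Frames 0
      Weight 1 Frames 0
      Weight 2 Frames 347769
"""

if __name__ == "__main__":
    for (queue, weight), frames in queue_drops(SAMPLE).items():
        print(f"Queue {queue}, weight {weight}: {frames} frames dropped")
```

All of the drops sit in a single counter (queue 3, weight 2). That would be consistent with all traffic sharing one egress queue when QoS is not configured, although that reading is an assumption rather than something the output itself states.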

Any suggestions on the cause or a fix would be appreciated.

Thanks

Leo Laohoo · 04-04-2018

@Tech21 wrote:

I'm trying to get to the bottom of some issues experienced by my users. They are seeing network slowness on occasion, and the only issue I can find from a network perspective is OutDiscards. Most stations are unaffected - this seems to be about data transfer/processes on specific PCs, but I'd like to see if I can get to the bottom of the OutDiscard issue too.

This is typical behaviour when high-speed data is pushed through a low-speed/high-latency switch, such as a Catalyst switch.

This is the reason why Cisco has the Nexus family of switches.

The only way to get around this is to configure QoS on the switches so traffic can be shaped properly.

Tech21 · 04-04-2018

Thanks. So if queue 3 is dropping packets, how would I go about identifying the traffic that needs to be shaped, and how would I spread it over a number of queues? I figure QoS, but I have no experience in applying it. I thought that with no QoS applied the buffers/queues would automatically manage the data flows across all queues, but perhaps I'm wrong.

Joseph W. Doherty · 04-04-2018

The 3750 series is a bit infamous for egress packet drops, especially if QoS has been enabled using its defaults.

This is because the 3750 has limited RAM for egress buffers (according to one Cisco document, it's 2 MB per 24 copper ports or for its uplink ports).
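A back-of-the-envelope calculation shows how little time a buffer of that size buys when a 10Gb source feeds a 1Gb client port. This is only a sketch under optimistic assumptions: it treats the whole 2 MB (taken here as 2,000,000 bytes) as available to a single port, and it assumes the sender bursts at the full 10 Gb/s line rate, which real TCP transfers will only do for short periods. The function name `time_to_fill_ms` is made up for illustration.

```python
def time_to_fill_ms(buffer_bytes, ingress_bps, egress_bps):
    """Milliseconds until a buffer overflows when data arrives faster than it drains."""
    if ingress_bps <= egress_bps:
        return float("inf")  # the buffer never fills
    surplus_bps = ingress_bps - egress_bps
    return buffer_bytes * 8 / surplus_bps * 1000


BUFFER_BYTES = 2_000_000        # assumption: the entire 2 MB serves one port
INGRESS_BPS = 10_000_000_000    # 10Gb server/SAN side, per the original post
EGRESS_BPS = 1_000_000_000      # 1Gb client port

if __name__ == "__main__":
    ms = time_to_fill_ms(BUFFER_BYTES, INGRESS_BPS, EGRESS_BPS)
    print(f"Buffer fills after about {ms:.2f} ms of line-rate burst")
```

Under these assumptions the buffer is exhausted in under 2 ms, and in practice the same memory is shared with the other ports, so a single port gets less. This would explain drops on ports whose average utilisation looks low.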

I have had some great success reducing output discards (i.e. from several drops a second to a couple drops per day) with buffer "tuning". (NB: if you're interested, "how" is described in some of my past posts on this issue.)

Tech21 · 04-04-2018

Thanks, I'll take a look