04-04-2018 02:53 AM - edited 03-08-2019 02:31 PM
Hi
I'm trying to get to the bottom of some issues experienced by my users. They are seeing network slowness on occasion, and the only issue that I can find from a network perspective is output discards. Most stations are unaffected, so this seems to be about data transfer/processes on specific PCs, but I'd like to see if I can get to the bottom of the outDiscard issues too.
Scenario: users are transferring data (high volumes of images, 20Mb+ per file, gigabytes of data throughout the day) between PCs and SAN storage. The PCs are connected to a Catalyst 3750-X switch, which connects up to a Catalyst 4500 aggregate router; the servers and SAN connect to the aggregate router via EtherChanneled 10Gb fibre ports. (Note: the original setup was agg router > server switch (3750 again) > server/SAN, with 3x 1Gb interfaces EtherChanneled between the switch and the server.)
We were previously seeing outDiscard errors on the server switch, so we upgraded the server NICs to 10Gb and moved the uplinks directly to the agg router, as that had greater 10Gb capacity, and the outDiscards have gone away from that side of this setup. However, on the client side we are still seeing outDiscards for specific stations. From my understanding of outDiscard errors and from the output results, these discards seem to be happening as the traffic exits the interfaces towards the client PCs. We can replicate this error with multiple file transfers to and from the servers (we initially thought it was a scripting issue, but it also happens using Windows Explorer). Of course this is potentially just an issue relating to high traffic volumes, but it would be really useful if we could eliminate these errors and move root cause resolution along to something else (or fix the issue :) )
I have manually set speed to 1000 and duplex to full on these ports to see if this would help, but sadly not.
Hopefully someone can suggest some further troubleshooting or resolutions - possibly relating to buffering, thanks
Please note that QoS is not configured on the switch
Model number : WS-C3750X-48T-S, running 15.0(2)SE7
Example errors:
Port      Align-Err  FCS-Err  Xmit-Err  Rcv-Err  UnderSize  OutDiscards
Gi1/0/44          0        0         0        0          0       234338
3750-x#sh interface summary
*: interface is up
IHQ: pkts in input hold queue IQD: pkts dropped from input queue
OHQ: pkts in output hold queue OQD: pkts dropped from output queue
RXBS: rx rate (bits/sec) RXPS: rx rate (pkts/sec)
TXBS: tx rate (bits/sec) TXPS: tx rate (pkts/sec)
TRTL: throttle count
  Interface              IHQ  IQD  OHQ   OQD      RXBS  RXPS       TXBS   TXPS  TRTL
------------------------------------------------------------------------------------------
* GigabitEthernet1/0/37    0    0    0   163    122000    26    2220000    177     0
* GigabitEthernet1/0/38    0    0    0     0  20466000  1680     274000    436     0
* GigabitEthernet1/0/39    0    0    0   814    152000   178   20560000   1686     0
* GigabitEthernet1/0/40    0    0    0   238   1384000   159    7049000    625     0
* GigabitEthernet1/0/41    0    0    0     0   2089000   164      52000     58     0
* GigabitEthernet1/0/42    0    0    0  1803    615000   919  172827000  14252     0
* GigabitEthernet1/0/43    0    0    0     0  44873000  3704     904000   1127     0
* GigabitEthernet1/0/44    0    0    0  3748   6987000  1486   53013000   4584     0
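For anyone wanting to sanity-check the summary above, here is a minimal Python sketch that compares the output drops (OQD) with the average transmit rate on each port. The function names are my own, and the 1 Gb/s link speed is an assumption based on the ports being set to speed 1000; the rows are copied from the output above.

```python
# Rows copied from the "sh interface summary" output in this post.
SUMMARY = """\
* GigabitEthernet1/0/37 0 0 0 163 122000 26 2220000 177 0
* GigabitEthernet1/0/38 0 0 0 0 20466000 1680 274000 436 0
* GigabitEthernet1/0/39 0 0 0 814 152000 178 20560000 1686 0
* GigabitEthernet1/0/40 0 0 0 238 1384000 159 7049000 625 0
* GigabitEthernet1/0/41 0 0 0 0 2089000 164 52000 58 0
* GigabitEthernet1/0/42 0 0 0 1803 615000 919 172827000 14252 0
* GigabitEthernet1/0/43 0 0 0 0 44873000 3704 904000 1127 0
* GigabitEthernet1/0/44 0 0 0 3748 6987000 1486 53013000 4584 0
"""

# Assumption: every access port runs at 1 Gb/s (speed 1000, full duplex).
LINK_BPS = 1_000_000_000


def parse_summary(text):
    """Return one dict per interface row; skip anything that is not a data row."""
    rows = []
    for line in text.splitlines():
        parts = line.lstrip("* ").split()
        # A data row is an interface name followed by nine integer counters.
        if len(parts) != 10 or not all(p.isdigit() for p in parts[1:]):
            continue
        ihq, iqd, ohq, oqd, rxbs, rxps, txbs, txps, trtl = map(int, parts[1:])
        rows.append({"port": parts[0], "oqd": oqd, "rxbs": rxbs, "txbs": txbs})
    return rows


def tx_utilization(row, link_bps=LINK_BPS):
    """Average transmit rate as a fraction of the assumed link speed."""
    return row["txbs"] / link_bps


rows = parse_summary(SUMMARY)
for row in rows:
    if row["oqd"] > 0:
        print(f'{row["port"]}: OQD={row["oqd"]} avg TX={tx_utilization(row):.1%}')
```

In this sample, every port with a non-zero OQD is transmitting more than it receives, and the busiest one (Gi1/0/42) averages roughly 17% of the assumed line rate. That would be consistent with short bursts rather than sustained congestion, although averaged rate counters cannot confirm that on their own.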
3750-x#$show platform port-asic stats drop gigabitEthernet 1/0/44
Interface Gi1/0/44 TxQueue Drop Statistics
Queue 0
Weight 0 Frames 0
Weight 1 Frames 0
Weight 2 Frames 0
Queue 1
Weight 0 Frames 0
Weight 1 Frames 0
Weight 2 Frames 0
Queue 2
Weight 0 Frames 0
Weight 1 Frames 0
Weight 2 Frames 0
Queue 3
Weight 0 Frames 0
Weight 1 Frames 0
Weight 2 Frames 347769
Queue 4
Weight 0 Frames 0
Weight 1 Frames 0
Weight 2 Frames 0
Queue 5
Weight 0 Frames 0
Weight 1 Frames 0
Weight 2 Frames 0
Queue 6
Weight 0 Frames 0
Weight 1 Frames 0
Weight 2 Frames 0
Queue 7
Weight 0 Frames 0
Weight 1 Frames 0
Weight 2 Frames 0
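As a rough aid for reading the drop statistics above, this Python sketch pulls out only the non-zero counters. The parser and its names are hypothetical, and the sample is an excerpt (Queues 2 to 4) of the output above rather than the full listing.

```python
import re

# Excerpt of the "show platform port-asic stats drop" output in this post.
DROP_OUTPUT = """\
Queue 2
Weight 0 Frames 0
Weight 1 Frames 0
Weight 2 Frames 0
Queue 3
Weight 0 Frames 0
Weight 1 Frames 0
Weight 2 Frames 347769
Queue 4
Weight 0 Frames 0
Weight 1 Frames 0
Weight 2 Frames 0
"""


def parse_drops(text):
    """Map (queue index, weight index) to the dropped-frame count."""
    drops = {}
    queue = None
    for line in text.splitlines():
        m = re.match(r"\s*Queue\s+(\d+)", line)
        if m:
            queue = int(m.group(1))
            continue
        m = re.match(r"\s*Weight\s+(\d+)\s+Frames\s+(\d+)", line)
        if m and queue is not None:
            drops[(queue, int(m.group(1)))] = int(m.group(2))
    return drops


drops = parse_drops(DROP_OUTPUT)
nonzero = {key: count for key, count in drops.items() if count}
print(nonzero)
```

All of the drops on Gi1/0/44 land in a single counter (Queue 3, Weight 2). How these zero-based indices map onto queue and threshold numbers in the QoS configuration is not stated in this thread, so treat any mapping as something to verify against the platform documentation.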
Any suggestions on the cause or a fix would be appreciated.
Thanks
04-04-2018 02:57 AM
@Tech21 wrote:
I'm trying to get to the bottom of some issues experienced by my users, they are seeing network slowness on occasion and the only issue that I can find from a network perspective is out discards. Most stations are unaffected - this seems to be about data transfer/processes on specific PCs but I'd like to see if I can get to the bottom of the outDiscard issues too.
This is typical behaviour when high-speed data is pushed down a low-speed/high-latency switch, like a Catalyst switch.
This is the reason why Cisco has the Nexus family of switches.
The only way to get around this is to configure QoS on the switches so traffic can be shaped properly.
04-04-2018 04:56 AM
04-04-2018 04:32 AM - edited 04-04-2018 04:33 AM
The 3750 series is a bit infamous for egress packet drops, especially if QoS has been enabled using its defaults.
This is because the 3750 has limited RAM for egress buffers (according to one Cisco document, it's 2 MB per 24 copper ports or for its uplink ports).
I have had some great success reducing output discards (i.e. from several drops a second to a couple drops per day) with buffer "tuning". (NB: if you're interested, "how" is described in some of my past posts on this issue.)
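To put the 2 MB figure in perspective, here is a back-of-the-envelope Python sketch of how quickly a buffer of that size fills when a burst arrives faster than the port can drain it. The rates are assumptions taken from the original post (10 Gb/s server side, 1 Gb/s client port), and giving the whole 2 MB to one port is a best case, since the post above describes it as shared across 24 ports.

```python
# Best-case assumption: the full 2 MB is available to a single egress port.
BUFFER_BYTES = 2 * 1024 * 1024
# Assumption: a burst arrives at the 10 Gb/s server-side rate.
INGRESS_BPS = 10_000_000_000
# Assumption: the client port drains at 1 Gb/s.
EGRESS_BPS = 1_000_000_000


def time_to_fill(buffer_bytes, ingress_bps, egress_bps):
    """Seconds until the buffer is full while ingress exceeds egress."""
    net_bps = ingress_bps - egress_bps
    if net_bps <= 0:
        return float("inf")  # the port keeps up, so the buffer never fills
    return buffer_bytes * 8 / net_bps


fill_seconds = time_to_fill(BUFFER_BYTES, INGRESS_BPS, EGRESS_BPS)
print(f"Buffer fills in about {fill_seconds * 1e3:.2f} ms")
```

Under these assumptions the buffer fills in a couple of milliseconds, and a real port gets only a share of that memory. The sketch ignores TCP windowing and pacing, so it is only meant to show the order of magnitude involved, not to predict actual drop counts.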
04-04-2018 04:57 AM