
2960 buffers (drops+packet loss due to micro-bursts)

johnelliot6
Level 2

We have a pair of 2960s in a stack. One port (a trunk) connects to another DC, and we are seeing ~5% packet loss and large output drops towards this DC.



#sh interfaces gigabitEthernet 1/0/17 counters errors

Port        Align-Err     FCS-Err    Xmit-Err     Rcv-Err  UnderSize  OutDiscards
Gi1/0/17            0           0           0           0          0       182867



GigabitEthernet1/0/17 is up, line protocol is up (connected)

  Hardware is Gigabit Ethernet, address is a0cf.5b87.ec11 (bia a0cf.5b87.ec11)

  Description: QinQ_to_DC2

  MTU 1998 bytes, BW 100000 Kbit, DLY 100 usec,

     reliability 255/255, txload 41/255, rxload 23/255

  Encapsulation ARPA, loopback not set

  Keepalive set (10 sec)

  Full-duplex, 100Mb/s, media type is 10/100/1000BaseTX

  input flow-control is off, output flow-control is unsupported

  ARP type: ARPA, ARP Timeout 04:00:00

  Last input 6d13h, output 00:00:00, output hang never

  Last clearing of "show interface" counters 04:02:15

  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 183592

  Queueing strategy: fifo

  Output queue: 0/40 (size/max)

  30 second input rate 9047000 bits/sec, 2075 packets/sec

  30 second output rate 16324000 bits/sec, 2309 packets/sec




As you can see, the 30-second rate isn't excessive, but as the drops are OutDiscards it would appear we are getting hit by the small-buffers/micro-burst issue.
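As a rough sanity check on the reasoning above, the averaged rate can be compared with the line rate. This is only an illustrative sketch: the function name is hypothetical and the figures are copied from the interface output in this post.

```python
def utilisation(rate_bps: float, link_bps: float) -> float:
    """Fraction of the link consumed by an averaged rate."""
    return rate_bps / link_bps


LINK_BPS = 100_000_000        # "BW 100000 Kbit" from the interface output
OUTPUT_RATE_BPS = 16_324_000  # "30 second output rate"
TXLOAD = 41                   # "txload 41/255"

avg_util = utilisation(OUTPUT_RATE_BPS, LINK_BPS)
txload_util = TXLOAD / 255

print(f"30 s average: {avg_util:.1%} of line rate")
print(f"txload:       {txload_util:.1%} of line rate")
```

Both figures come out at roughly 16% of the link, which is consistent with the point being made: a 30-second average says nothing about sub-second bursts that can fill an egress queue.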

Gi1/0/17 is mapped to asic 0/20



Gi1/0/17  17   17   17   0/20 1    17   17   local     Yes     Yes



Port-asic Port Drop Statistics - Summary


========================================



Port 20 TxQueue Drop Stats: 308277833



And majority appear to be in Queue 1:



Port 20 TxQueue Drop Statistics
    Queue 0
      Weight 0 Frames 3
      Weight 1 Frames 0
      Weight 2 Frames 0
    Queue 1
      Weight 0 Frames 308240408
      Weight 1 Frames 458
      Weight 2 Frames 0
    Queue 2
      Weight 0 Frames 37898
      Weight 1 Frames 0
      Weight 2 Frames 0
    Queue 3
      Weight 0 Frames 91
      Weight 1 Frames 0
      Weight 2 Frames 0
    Queue 4
      Weight 0 Frames 0
      Weight 1 Frames 0
      Weight 2 Frames 0
    Queue 5
      Weight 0 Frames 0
      Weight 1 Frames 0
      Weight 2 Frames 0
    Queue 6
      Weight 0 Frames 0
      Weight 1 Frames 0
      Weight 2 Frames 0
    Queue 7
      Weight 0 Frames 0
      Weight 1 Frames 0
      Weight 2 Frames 0
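To put a number on "majority", the per-queue counters above can be summed and expressed as shares of all dropped frames. A minimal sketch with the values copied from the output; the variable names are illustrative only.

```python
# Frames dropped per queue at weights 0, 1 and 2, copied from the
# "Port 20 TxQueue Drop Statistics" output. Queues 4-7 are all zero
# and are left out.
drops = {
    0: [3, 0, 0],
    1: [308240408, 458, 0],
    2: [37898, 0, 0],
    3: [91, 0, 0],
}

per_queue = {queue: sum(weights) for queue, weights in drops.items()}
total = sum(per_queue.values())
shares = {queue: count / total for queue, count in per_queue.items()}

for queue, share in shares.items():
    print(f"Queue {queue}: {per_queue[queue]:>11} frames ({share:.4%})")
```

On these counters well over 99.9% of the dropped frames belong to Queue 1, so any tuning effort is best spent on that queue first.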




I have done a bit of research, and as we have mls qos configured (we have some ssh/rdp policies in place on access ports), we need to look at "tweaking" the buffer allocations on the switch to hopefully reduce these drops.



There appears to be a range of recommendations when it comes to these tweaks. I am hoping someone has suggestions on what to set with "mls qos queue-set output" to alleviate the drops (start conservative, then apply more aggressive settings if needed). Also, does adjusting the buffers require an outage window?



Our traffic is primarily backup(replication which is very bursty), and Internet



Thanks in advance.


nkarpysh
Cisco Employee

Hello,

We can see that most of the traffic is being dropped within queue 2 (listed as Queue 1 in your output, where the queues are numbered from 0).

You can check the following commands to see which DSCP-QOS are matching that queue and possibly then adjust it:

sho mls qos int gi1/0/17 stat
show mls qos maps dscp-cos

sh mls qos maps cos-output-q

also check the current queue allocation with "sh mls qos int gi1/0/17"

and possibly increase queue 2 with:

mls qos queue-set output 1 buffers 10 40 25 25

and then possibly tune it differently based on the new outputs.

You can also change thresholds to:

mls qos queue-set output 1 threshold 2  500 500 50 500

Hope this helps,

Nik

HTH,
Niko

Thanks Nik - results of requested sh:

#sho mls qos int gi1/0/17 stat
GigabitEthernet1/0/17 (All statistics are in packets)

  dscp: incoming
-------------------------------

  0 -  4 :    55800843            0            1            0           38
  5 -  9 :           0            0            0   1071079724            0
10 - 14 :           1            0            0            0            0
15 - 19 :           0           10            0            0            0
20 - 24 :           0            0            0            0            2
25 - 29 :           0   1816074149            0            0            0
30 - 34 :           0            0            0            0            0
35 - 39 :           0            0            0            0            0
40 - 44 :           3            0            0            0            0
45 - 49 :           0            0            0       108865            0
50 - 54 :           0            0            0            0            0
55 - 59 :           0            0            0            0            0
60 - 64 :           0            0            0            0
  dscp: outgoing
-------------------------------

  0 -  4 :  2161609365         4528   2468576816          861     31334116
  5 -  9 :         143          500          815    330708481          450
10 - 14 :       15721            0        13672            0         2285
15 - 19 :           2     10955553          128         2739            2
20 - 24 :          35            3          201           27        62641
25 - 29 :          26     37037176            1           46            2
30 - 34 :           3            2          945            1          405
35 - 39 :           1            5            1           34            2
40 - 44 :         194           11            2            2        10140
45 - 49 :           0       222214            1       522172            6
50 - 54 :          17            2            1            1            5
55 - 59 :           2           56            1        36540            3
60 - 64 :           2            3            0         6116
  cos: incoming
-------------------------------

  0 -  4 :  3748217471   1072163134           10   1816076438            0
  5 -  7 :           3       108865          528
  cos: outgoing
-------------------------------

  0 -  4 :  2422389883    330740734     10958766     37099949         1394
  5 -  7 :      232564       522207       840938
  output queues enqueued:
queue:    threshold1   threshold2   threshold3
-----------------------------------------------
queue 0:      232564           0           0
queue 1:  2756394379      108784      800685
queue 2:    48058748           0           0
queue 3:      566322           0           0

  output queues dropped:
queue:    threshold1   threshold2   threshold3
-----------------------------------------------
queue 0:           3           0           0
queue 1:   309661435         465           0
queue 2:       37912           0           0
queue 3:          91           0           0

Policer: Inprofile:            0 OutofProfile:            0

#show mls qos maps dscp-cos

   Dscp-cos map:

     d1 :  d2 0  1  2  3  4  5  6  7  8  9

     ---------------------------------------

      0 :    00 00 00 00 00 00 00 00 01 01

      1 :    01 01 01 01 01 01 02 02 02 02

      2 :    02 02 02 02 03 03 03 03 03 03

      3 :    03 03 04 04 04 04 04 04 04 04

      4 :    05 05 05 05 05 05 05 05 06 06

      5 :    06 06 06 06 06 06 07 07 07 07

      6 :    07 07 07 07

#sh mls qos maps cos-output-q

   Cos-outputq-threshold map:

              cos:  0   1   2   3   4   5   6   7

              ------------------------------------

  queue-threshold: 2-1 2-1 3-1 3-1 4-1 1-1 4-1 4-1

#sh mls qos int gi1/0/17

GigabitEthernet1/0/17

trust state: not trusted

trust mode: not trusted

trust enabled flag: ena

COS override: dis

default COS: 0

DSCP Mutation Map: Default DSCP Mutation Map

Trust device: none

qos mode: port-based
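To make the two maps above easier to read together, here is a small sketch that chains them: DSCP to CoS, then CoS to egress queue and threshold. The table and function names are hypothetical. The DSCP-to-CoS table shown above is equivalent to integer division by 8, and the CoS table is copied from the cos-output-q output.

```python
# CoS -> (queue, threshold), copied from "sh mls qos maps cos-output-q".
COS_TO_QUEUE_THRESHOLD = {
    0: (2, 1), 1: (2, 1), 2: (3, 1), 3: (3, 1),
    4: (4, 1), 5: (1, 1), 6: (4, 1), 7: (4, 1),
}


def dscp_to_cos(dscp: int) -> int:
    """Mirror the dscp-cos map shown above (DSCP 0-7 -> 0, 8-15 -> 1, ...)."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0-63")
    return dscp // 8


def egress_queue(dscp: int) -> tuple:
    """Return the (queue, threshold) pair a DSCP value ends up in."""
    return COS_TO_QUEUE_THRESHOLD[dscp_to_cos(dscp)]


for name, dscp in (("default", 0), ("af31", 26), ("ef", 46)):
    queue, threshold = egress_queue(dscp)
    print(f"{name:>7} (DSCP {dscp:2}) -> queue {queue}, threshold {threshold}")
```

With these maps, unmarked traffic (DSCP 0) lands in queue 2, which the statistics output above lists as "queue 1" because it counts queues from 0. That matches where almost all of the drops are recorded.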

based on the above output, do you still recommend:

"mls qos queue-set output 1 buffers 10 40 25 25"

as a starting point? And will adjusting the buffers require an outage window?

Thanks again for your assistance.

Hey,

So we have this:

output queues enqueued:

queue:    threshold1   threshold2   threshold3

-----------------------------------------------

queue 0:      232564           0           0

queue 1:  2756394379      108784      800685

queue 2:    48058748           0           0

queue 3:      566322           0           0

Thus I would use the following buffers:

mls qos queue-set output 1 buffers 10 50 30 10
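For comparing the buffer splits proposed so far, a small helper can check that the four values add up to 100 and print the command line. This is a sketch under assumptions: the helper name is hypothetical, the sum check simply reflects that every proposal in this thread totals 100, and the 25/25/25/25 baseline is the commonly documented default, which should be confirmed with "show mls qos queue-set" on the switch itself.

```python
def buffers_command(queue_set: int, allocations: tuple) -> str:
    """Render an 'mls qos queue-set output ... buffers' line after a sanity check."""
    if len(allocations) != 4:
        raise ValueError("expected one percentage per egress queue (4 values)")
    if sum(allocations) != 100:
        raise ValueError(f"allocations sum to {sum(allocations)}, expected 100")
    values = " ".join(str(a) for a in allocations)
    return f"mls qos queue-set output {queue_set} buffers {values}"


ASSUMED_DEFAULT = (25, 25, 25, 25)   # assumption, verify on the switch
proposals = [(10, 40, 25, 25), (10, 50, 30, 10)]

for allocation in proposals:
    gain = allocation[1] - ASSUMED_DEFAULT[1]
    print(buffers_command(1, allocation), f"  # queue 2: +{gain} points")
```

Each step moves more of the shared buffer pool towards queue 2 at the expense of the other queues, so it is worth re-checking the per-queue drop counters after each change.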

Regards,

HTH,
Niko

Thanks Nik - I had adjusted as per your previous post, and packet loss actually increased! Then I checked link utilisation, and after your recommended changes utilisation had increased to 90Mbit/sec! I rate-limited the offending backup traffic to 30Mbit, and packet loss is gone and drops have reduced dramatically. (We have tried this previously, even rate-limiting the backup traffic to 10Mbit, but the packet loss/drops only reduced slightly.) So a very good outcome! I will apply your suggestion above, and hopefully it improves even more!

Thanks again!

Glad it gave some good results on the first try. QoS is always subject to fine tuning.

Regards

HTH,
Niko

Hi Nik - Just an update: if I rate-limit the backup (replication) traffic to 20Mb, I see no packet loss, but if I increase it to 40Mb, I start to see packet loss across the link again. Do you have any further suggestions on fine tuning the buffers to allow the backup/replication traffic to run at faster speeds?

Cheers

Hi John,

Can you please collect the outputs we captured before, this time with the 40 Mb rate-limiter, so we can see which queue is dropping?

BTW, Paolo's suggestion below can also be valid: if you don't have traffic of a particular priority, sometimes disabling QoS can improve the situation. Anyway, as we still have some room for buffer tuning, we can try that. So I would appreciate it if you could attach the outputs of the commands requested before once again.

Nik

HTH,
Niko

Thanks Nik - As requested:

#sho mls qos int gi1/0/17 stat
GigabitEthernet1/0/17 (All statistics are in packets)

  dscp: incoming 
-------------------------------

  0 -  4 :   187717358            0            1            0           38 
  5 -  9 :           0            0            0   1147706354            0 
10 - 14 :           1            0            0            0            0 
15 - 19 :           0           11            0            0            0 
20 - 24 :           0            0            0            0            2 
25 - 29 :           0   1831834072            0            0            0 
30 - 34 :           0            0            0            0            0 
35 - 39 :           0            0            0            0            0 
40 - 44 :           3            0            0            0            0 
45 - 49 :           0            0            0       111307            0 
50 - 54 :           0            0            0            0            0 
55 - 59 :           0            0            0            0            0 
60 - 64 :           0            0            0            0 
  dscp: outgoing
-------------------------------

  0 -  4 :   993879429         4583   2486230550          862     32074437 
  5 -  9 :         146          504          816    338168174          459 
10 - 14 :       15764            0        13701            0         2319 
15 - 19 :           2     10979560          128         4103            2 
20 - 24 :          80            3          201           27        62901 
25 - 29 :          26     38571868            1           46            2 
30 - 34 :           3            2          946            1         3541 
35 - 39 :           1            5            1           34            2 
40 - 44 :         194           11            2            2        10141 
45 - 49 :           0       222216            1       576619            6 
50 - 54 :          17            2            1            1            5 
55 - 59 :           2           56            1        36612            3 
60 - 64 :           2            3            0         6126 
  cos: incoming 
-------------------------------

  0 -  4 :   396051087   1148789637           11   1831834074            0 
  5 -  7 :           3       111307          550 
  cos: outgoing
-------------------------------

  0 -  4 :  3811637267    338200419     10984104     38634849         4531 
  5 -  7 :      232567       576653       873865 
  output queues enqueued:
queue:    threshold1   threshold2   threshold3
-----------------------------------------------
queue 0:      232567           0           0
queue 1:  4153234309      114000      833566
queue 2:    49618986           0           0
queue 3:      623987           0           0

  output queues dropped:
queue:    threshold1   threshold2   threshold3
-----------------------------------------------
queue 0:           3           0           0
queue 1:   317860776         465          53
queue 2:       37913           0           0
queue 3:         124           0           0

Policer: Inprofile:            0 OutofProfile:            0

#show mls qos maps dscp-cos

   Dscp-cos map:

     d1 :  d2 0  1  2  3  4  5  6  7  8  9

     ---------------------------------------

      0 :    00 00 00 00 00 00 00 00 01 01

      1 :    01 01 01 01 01 01 02 02 02 02

      2 :    02 02 02 02 03 03 03 03 03 03

      3 :    03 03 04 04 04 04 04 04 04 04

      4 :    05 05 05 05 05 05 05 05 06 06

      5 :    06 06 06 06 06 06 07 07 07 07

      6 :    07 07 07 07

#sh mls qos maps cos-output-q

   Cos-outputq-threshold map:

              cos:  0   1   2   3   4   5   6   7 

              ------------------------------------

  queue-threshold: 2-1 2-1 3-1 3-1 4-1 1-1 4-1 4-1

#sh mls qos int gi1/0/17

GigabitEthernet1/0/17

trust state: not trusted

trust mode: not trusted

trust enabled flag: ena

COS override: dis

default COS: 0

DSCP Mutation Map: Default DSCP Mutation Map

Trust device: none

qos mode: port-based


We do have qos applied to access ports giving management (ssh+rdp) higher priority...so I would prefer not to remove all as Paolo suggested if possible.

Cheers.

Ok

So we have more traffic in the same queue too. I would try the following, in this order:

1. Check new bandwidth with following threshold config

mls qos queue-set output 1 threshold 1 3200 3200 100 3200

mls qos queue-set output 1 threshold 2 3200 3200 100 3200

mls qos queue-set output 1 threshold 3 3200 3200 100 3200


2. Then you can further adjust the queues:

mls qos queue-set output 1 buffers 5 60 25 10

or even

mls qos queue-set output 1 buffers 5 65 25 5

and see which queues are dropping with command

sho mls qos int gi1/0/17 stat

3. If you still see drops, you can consider switching off QoS globally to see whether FIFO on output improves the situation
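The threshold lines in step 1 are easier to reason about with the positional values named. The sketch below assumes the usual argument order for this command on the 2960 (queue ID, drop threshold 1, drop threshold 2, reserved threshold, maximum threshold); the class and function names are hypothetical.

```python
from typing import NamedTuple


class QueueThreshold(NamedTuple):
    queue_set: int
    queue_id: int
    drop1: int      # drop threshold 1, percent of the queue's allocated buffers
    drop2: int      # drop threshold 2, percent
    reserved: int   # reserved threshold, percent
    maximum: int    # maximum threshold, percent


def parse_threshold(line: str) -> QueueThreshold:
    """Split an 'mls qos queue-set output N threshold ...' line into named fields."""
    parts = line.split()
    prefix_ok = parts[:4] == ["mls", "qos", "queue-set", "output"]
    if len(parts) != 11 or not prefix_ok or parts[5] != "threshold":
        raise ValueError(f"not a queue-set threshold line: {line!r}")
    queue_set = int(parts[4])
    queue_id, drop1, drop2, reserved, maximum = (int(p) for p in parts[6:])
    return QueueThreshold(queue_set, queue_id, drop1, drop2, reserved, maximum)


cfg = parse_threshold("mls qos queue-set output 1 threshold 2 3200 3200 100 3200")
print(cfg)
```

Read this way, the suggested lines let each configured queue borrow far beyond its own allocation from the common pool (3200%) while keeping its full reserved share (100%).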

Nik

HTH,
Niko

Thanks Nik - Adjusted as suggested:

#sh run  | include mls

mls qos queue-set output 1 threshold 1 3200 3200 100 3200

mls qos queue-set output 1 threshold 2 3200 3200 100 3200

mls qos queue-set output 1 threshold 3 3200 3200 100 3200

mls qos queue-set output 1 buffers 5 65 25 5

and with 40Mb rate-limit:

#sho mls qos int gi1/0/17 stat
GigabitEthernet1/0/17 (All statistics are in packets)

  dscp: incoming 
-------------------------------

  0 -  4 :   262871790            0            1            0           38 
  5 -  9 :           0            0            0   1149948698            0 
10 - 14 :           1            0            0            0            0 
15 - 19 :           0           11            0            0            0 
20 - 24 :           0            0            0            0            2 
25 - 29 :           0   1832252709            0            0            0 
30 - 34 :           0            0            0            0            0 
35 - 39 :           0            0            0            0            0 
40 - 44 :           3            0            0            0            0 
45 - 49 :           0            0            0       111380            0 
50 - 54 :           0            0            0            0            0 
55 - 59 :           0            0            0            0            0 
60 - 64 :           0            0            0            0 
  dscp: outgoing
-------------------------------

  0 -  4 :  1117251944         4583   2486570792          862     32110564 
  5 -  9 :         146          504          816    338229400          459 
10 - 14 :       15764            0        13701            0         2319 
15 - 19 :           2     10981358          128         4173            2 
20 - 24 :          80            3          201           27        62901 
25 - 29 :          26     38640706            1           46            2 
30 - 34 :           3            2          946            1         3541 
35 - 39 :           1            5            1           34            2 
40 - 44 :         194           11            2            2        10141 
45 - 49 :           0       222216            1       577210            6 
50 - 54 :          17            2            1            1            5 
55 - 59 :           2           56            1        36615            3 
60 - 64 :           2            3            0         6126 
  cos: incoming 
-------------------------------

  0 -  4 :   471244235   1151031981           11   1832252711            0 
  5 -  7 :           3       111380          551 
  cos: outgoing
-------------------------------

  0 -  4 :  3935400094    338261645     10985972     38703687         4531 
  5 -  7 :      232567       577244       875689 
  output queues enqueued:
queue:    threshold1   threshold2   threshold3
-----------------------------------------------
queue 0:      232567           0           0
queue 1:  4277066131      114047      835389
queue 2:    49689692           0           0
queue 3:      624581           0           0

  output queues dropped:
queue:    threshold1   threshold2   threshold3
-----------------------------------------------
queue 0:           3           0           0
queue 1:   319047026         466          61
queue 2:       37915           0           0
queue 3:         124           0           0

Policer: Inprofile:            0 OutofProfile:            0
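Because these counters are cumulative, the raw totals mostly reflect history. A rough way to judge the current state is to take the difference between the two most recent outputs in this thread. The sketch below does that for queue 1; it assumes the counters were not cleared in between and that "enqueued" counts only frames that were accepted, so offered load is enqueued plus dropped. The names are illustrative.

```python
# Queue 1 counters (threshold1, threshold2, threshold3) from the two most
# recent "sho mls qos int gi1/0/17 stat" outputs in this thread.
ENQUEUED_BEFORE = (4153234309, 114000, 833566)
DROPPED_BEFORE = (317860776, 465, 53)
ENQUEUED_AFTER = (4277066131, 114047, 835389)
DROPPED_AFTER = (319047026, 466, 61)


def interval_drop_rate(enq_before, drop_before, enq_after, drop_after):
    """Return (enqueued, dropped, drop fraction) for the interval between snapshots."""
    enqueued = sum(enq_after) - sum(enq_before)
    dropped = sum(drop_after) - sum(drop_before)
    offered = enqueued + dropped
    fraction = dropped / offered if offered else 0.0
    return enqueued, dropped, fraction


enqueued, dropped, fraction = interval_drop_rate(
    ENQUEUED_BEFORE, DROPPED_BEFORE, ENQUEUED_AFTER, DROPPED_AFTER
)
print(f"enqueued {enqueued}, dropped {dropped}, drop rate {fraction:.2%}")
```

On these numbers queue 1 dropped a little under 1% of what it was offered during the interval, which is a more useful figure to track between tuning steps than the lifetime totals.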

Cheers

So it is still dropping in queue 2. We can increase that queue at the cost of decreasing queue 3, but it will not improve much and I guess it will start dropping soon.

What is the speed of your interface? Can you do "show int gi1/0/17"? Can you please also explain how you do the rate-limiting?

P.S. Did you try a test with QoS disabled completely?

Nik

HTH,
Niko

Hi Nik,

Int speed is 100Mb, and sh int below:

#sh int gigabitEthernet 1/0/17

GigabitEthernet1/0/17 is up, line protocol is up (connected)

  Hardware is Gigabit Ethernet, address is a0cf.5b87.ec11 (bia a0cf.5b87.ec11)

  Description: QinQ_to_DC2

  MTU 1998 bytes, BW 100000 Kbit, DLY 100 usec,

     reliability 255/255, txload 119/255, rxload 19/255

  Encapsulation ARPA, loopback not set

  Keepalive set (10 sec)

  Full-duplex, 100Mb/s, media type is 10/100/1000BaseTX

  input flow-control is off, output flow-control is unsupported

  ARP type: ARPA, ARP Timeout 04:00:00

  Last input 1w5d, output 00:00:03, output hang never

  Last clearing of "show interface" counters 17:46:12

  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 171602

  Queueing strategy: fifo

  Output queue: 0/40 (size/max)

  30 second input rate 7589000 bits/sec, 3497 packets/sec

  30 second output rate 46906000 bits/sec, 4617 packets/sec

     169405700 packets input, 53082999025 bytes, 0 no buffer

     Received 194378 broadcasts (38990 multicasts)

     0 runts, 0 giants, 0 throttles

     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored

     0 watchdog, 38990 multicast, 0 pause input

     0 input packets with dribble condition detected

     226258242 packets output, 285168510077 bytes, 0 underruns

     0 output errors, 0 collisions, 0 interface resets

     0 babbles, 0 late collision, 0 deferred

     0 lost carrier, 0 no carrier, 0 PAUSE output

     0 output buffer failures, 0 output buffers swapped out
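For context, the interface counters above can be turned into an overall drop ratio and an average packet size since the counters were last cleared. This is only a back-of-the-envelope sketch with values copied from the output; whether "packets output" includes the dropped packets is not stated, so the ratio is approximate.

```python
TOTAL_OUTPUT_DROPS = 171_602      # "Total output drops"
PACKETS_OUTPUT = 226_258_242      # "packets output"
BYTES_OUTPUT = 285_168_510_077    # bytes output

drop_ratio = TOTAL_OUTPUT_DROPS / PACKETS_OUTPUT
avg_packet_bytes = BYTES_OUTPUT / PACKETS_OUTPUT

print(f"output drops vs packets sent: {drop_ratio:.3%}")
print(f"average output packet size:   {avg_packet_bytes:.0f} bytes")
```

The large average packet size is what one would expect from bulk replication traffic, and a long-run ratio like this can hide much higher loss during the short periods when the bursts actually occur.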

We have multiple vlans running over this link. The backup/replication traffic is one of the vlans and is in a vrf. I am rate-limiting it on the L3 interface, i.e.

rate-limit input 40960000 7680000 15360000 conform-action transmit exceed-action drop
rate-limit output 40960000 7680000 15360000 conform-action transmit exceed-action drop

(This is done on a 7200)
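The burst values in those rate-limit lines match the commonly cited CAR sizing rule: normal burst = rate / 8 x 1.5 seconds, extended burst = 2 x normal burst. The sketch below reproduces them and, purely as an illustration, shows how long a normal-burst worth of data would occupy a 100 Mb/s port. The function name is hypothetical.

```python
def car_bursts(rate_bps: int, seconds: float = 1.5) -> tuple:
    """Return (normal_burst_bytes, extended_burst_bytes) for a CAR rate."""
    normal = int(rate_bps / 8 * seconds)
    return normal, 2 * normal


RATE_BPS = 40_960_000        # first value in the rate-limit lines
LINK_BPS = 100_000_000       # speed of Gi1/0/17

normal, extended = car_bursts(RATE_BPS)
line_time_s = normal * 8 / LINK_BPS

print(f"normal burst:   {normal} bytes")
print(f"extended burst: {extended} bytes")
print(f"a normal burst takes {line_time_s:.4f} s to send at 100 Mb/s")
```

A policer of this kind drops traffic above the contract but does not smooth what it lets through, so replication bursts within the allowance may still arrive at the 2960 faster than the 100 Mb/s port can drain them.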

OK, so at 100 Mb we can see bursts causing those output drops. We can tune queue 2 further, but that will possibly cause more drops in other classes, which may be acceptable for your traffic.

You can also try disabling QoS to see how that works.

Nik

HTH,
Niko

Hi Nik (Apologies for the delay in responding - Have been out of office)

If I disable qos (no mls qos), will that be adequate? What happens to the existing buffer and output thresholds (are they still active, or will they be removed)? We also have "trust dscp" on some access ports (will this be retained, or are qos markings simply passed through with mls qos disabled)?

We also have a service policy on access ports giving rdp+ssh higher priority (marks it as af31).

Cheers
