
QoS flowdrops?

jmcgrady1
Level 1

On an ISR4431 router running 16.12, I am seeing flow drops on my class-default. The service is presented at 1000 Mb/s and policed to 100 Mb/s by the provider. Queue depth doesn't appear to be an issue. What typically causes flow drops on an uncongested link? Even short bursty traffic should be smoothed by the buffer?

 


class-map match-any CM-BULK-DATA
match ip dscp af11 af12
match protocol cifs
match protocol exchange
match protocol capwap-data

class-map match-any CM-CRITICAL
match ip dscp cs2 af21 af22 af31 cs6 cs7
match protocol active-directory
match protocol sap
match protocol sqlserver
match protocol snmp

class-map match-any CM-REALTIME
match ip dscp cs3 af41 af42 cs5 ef
match protocol rtp
match protocol sip
match protocol skype
match application cisco-phone
match application h323
match application ip-camera
match application rtp
match application sip
match application webex-meeting

class-map match-any CM-SCAVENGER
match ip dscp cs1


policy-map PM-QOS-CATEGORIES
class CM-REALTIME
priority percent 18
class CM-CRITICAL
bandwidth remaining percent 27
class CM-BULK-DATA
bandwidth remaining percent 5
queue-limit 4000 packets
class CM-SCAVENGER
bandwidth remaining percent 1
class class-default
queue-limit 4000 packets
fair-queue

 

policy-map PM-WAN-QOS
class class-default
shape average 85000000
service-policy PM-QOS-CATEGORIES

 

interface GigabitEthernet0/0/2
service-policy output PM-WAN-QOS

 

6 Replies

Robert C
Level 1

I have experimented a bit with bursts. To test this I used wanperf and set the sending timeslot (Tc) to different values. When I send with a Tc of 100 ms, I get a big microburst at line rate (1 Gbit/s) at the beginning of the sending timeslot and then nothing for the rest of the timeslot.

The result (with small queue-limits) is that the queue looks empty (the queue length only reflects a sustained shortage of bandwidth) while you still get drops due to bursts. You end up with less throughput than the router would deliver without a QoS policy, which is a very bad user experience.
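To put rough numbers on that, here is a minimal sketch of a single burst hitting a shaper. It is a simple fluid approximation, not a model of how the ISR4K schedules packets. The 100 ms Tc comes from the test above and the 85 Mbit/s shaper from the original post; the 50 Mbit/s average load and the 500-byte packet size are assumptions chosen only for illustration.

```python
def burst_drops(burst_bytes, line_rate_bps, drain_rate_bps,
                queue_limit_pkts, pkt_bytes=500):
    """Estimate tail drops for one burst that arrives at line rate
    and is drained at the shaped rate (fluid approximation)."""
    burst_pkts = burst_bytes // pkt_bytes
    # Packets the shaper can send while the burst is still arriving.
    # Integer form of: drain_rate * (burst_bits / line_rate) / pkt_bits
    drained_pkts = (drain_rate_bps * burst_bytes) // (line_rate_bps * pkt_bytes)
    # Anything that neither fits in the queue nor was already sent is dropped.
    return max(0, burst_pkts - drained_pkts - queue_limit_pkts)


# Assumed load: 50 Mbit/s average sent as one burst per 100 ms timeslot,
# i.e. 5 Mbit = 625,000 bytes arriving at 1 Gbit/s.
BURST_BYTES = 625_000

for limit in (64, 4000):
    drops = burst_drops(BURST_BYTES, 1_000_000_000, 85_000_000, limit)
    print(f"queue-limit {limit} packets: about {drops} drops per burst")
```

Under these assumptions the average load is well below the shaper rate, yet a 64-packet queue loses about 1080 of the 1250 packets in each burst, while a 4000-packet queue absorbs the whole burst.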

 

Your queue-limit of 4000 packets is only ca. 16 ms at 1 Gbit/s with 500-byte packets. My experience with the ISR4K is that this queue limit is better than the default of 64 packets, but it does not handle long microbursts well (the question is whether we can still call it a microburst when it lasts 100 ms...). I prefer using queue lengths in ms with IOS XE, as they are easier to understand and give a predictable delay/jitter. The Cisco Live session BRKARC-2031 (QoS Config Migrations From Classic IOS to IOS XE) is a great resource for this.
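The conversion between a packet-based queue limit and time is simple arithmetic, sketched below. The 500-byte packet size is the same assumption used above; the function name is made up for this example. The same queue looks very different depending on whether you measure it against the 1 Gbit/s arrival rate or the 85 Mbit/s shaped rate from the original post.

```python
def queue_depth_ms(queue_limit_pkts, pkt_bytes, rate_bps):
    """Time in milliseconds that a full queue represents at a given rate."""
    queue_bits = queue_limit_pkts * pkt_bytes * 8
    return queue_bits * 1000 / rate_bps


# How much line-rate burst a 4000-packet queue can absorb.
print(queue_depth_ms(4000, 500, 1_000_000_000))

# How long the same full queue takes to drain at the shaped rate,
# which is also the worst-case queuing delay it can add.
print(queue_depth_ms(4000, 500, 85_000_000))
```

With these inputs the queue holds 16 ms of traffic at line rate but needs roughly 188 ms to drain at the shaped rate, which is the delay/jitter side of the trade-off.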

 

 

By the way, I always use a "bandwidth remaining" configuration for Best Effort as well. Does anyone have an opinion on the best practice here?

Something like this?

 

class-map match-any voip
match dscp ef
match dscp cs5
class-map match-any pdata
match dscp af21
match dscp cs2
match access-group name NMS
class-map match-any control
match dscp af31
match dscp cs3
class-map match-any video
match dscp af41
match dscp cs4
!
policy-map remark-dscp
class class-default
set dscp default
policy-map police-2M
class class-default
police 2000000
policy-map shape-2M
class class-default
shape average 2000000
policy-map QOS-5-50M
class voip
police cir percent 25
conform-action transmit
exceed-action drop
priority
class video
bandwidth remaining percent 30
class control
bandwidth remaining percent 10
queue-limit 256 packets
class pdata
bandwidth remaining percent 20
class class-default
bandwidth remaining percent 40
random-detect
fair-queue
fair-queue queue-limit 1024 pre-classify
queue-limit 4096 packets
policy-map shape-20M-QOS
class class-default
shape average 19000000
service-policy QOS-5-50M
!

Joseph W. Doherty
Hall of Fame

"Queue depth doesn't appear to be an issue."

If the stats are showing output drops, unless you're dealing with some bug, queue or buffer overflow is the cause of drops.

"What typically causes flow drops on an uncongested link?"

How do you define an uncongested link?  Perhaps low average utilization?  If so, it's interface congestion which causes drops, which can also exist in conjunction with links having overall low average utilization.  (BTW, I consider an interface congested any time a frame cannot be transmitted immediately.)

"Even short bursty traffic should be smoothed by the buffer?"

Assuming the buffer is big enough.  Also assuming, the buffer is not too large.  (Sometimes a too large buffer leads to a huge number of drops.)

 

 

 

When looking at the policy-map stats for my 100 Mbps interface, I saw two concerning things under class-default: the 5-minute drop rate was over 50 Kbps, and flow drops were over 700K. I've investigated what kind of traffic is matching class-default (is there a way the router can give me a breakdown like it does for the other class maps?). The largest, although not particularly large, is some SCADA traffic self-marked as cs0. Being SCADA, it should be classed as critical. I've added an access list to match it and included it in my critical class map. That has brought the drop rate down to zero. However the flowdrops are still high.

 

The 'shape average' is 85% of the policed rate. I'll now vary the queue-limit to see if that helps. Would a time-based queue limit rather than a packet-based one be of benefit?

In your case, packet-based queue-limits are fine. Time-based queue limits are nice when you have one policy for many sites with different bandwidths.
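As a rough illustration of why a time-based limit scales across sites, the sketch below converts one target queuing delay into an equivalent packet count for several shaped rates. The 50 ms target and the 500-byte packet size are assumptions; the rates are the shaper values that appear in the configs in this thread. This is plain arithmetic and does not claim to reproduce how IOS XE converts a time-based limit internally.

```python
import math


def queue_limit_packets(target_ms, rate_bps, pkt_bytes=500):
    """Packets needed to hold target_ms worth of traffic at rate_bps."""
    bits_in_window = target_ms * rate_bps / 1000
    return math.ceil(bits_in_window / (pkt_bytes * 8))


TARGET_MS = 50  # assumed delay budget, not taken from the thread

for rate in (85_000_000, 19_000_000, 2_000_000):
    pkts = queue_limit_packets(TARGET_MS, rate)
    print(f"{rate / 1_000_000:g} Mbit/s: about {pkts} packets for {TARGET_MS} ms")
```

One delay target turns into very different packet counts per site, so a single packet-based number would be too deep for the slow links or too shallow for the fast ones.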

 

Be aware that large queue-limits add delay and jitter. If you are carrying SCADA traffic, you might want to keep the delay low. It's a trade-off between delay/jitter and drops.

 

"However the flowdrops are still high."

Sorry, I overlooked that earlier you also did explicitly mention FLOW drops!

Since you have FQ enabled in class-default, the issue might be as simple as your FQ individual flow queues are too small.

Some platforms will note their size, when showing policy-map interface stats.

Some platforms will also allow you to adjust their sizes too.  (Don't recall the command.  See what options you have, in class default, when you use a question mark.)

Do note (I recall?) FQ flow queue sizes are not impacted by the class "standard" queue-limit command.
