Solved: Re: Need help understanding QoS and police command - Page 2

philldogger · ‎12-02-2010

Let me first state that I am not a WAN admin, but I do architect Exchange and SharePoint, and I have some basic questions that I can't seem to get answered by our own internal IT. Here goes:

Our WAN infrastructure comprises many MPLS sites, and we use priority QoS in this order: VoIP, Oracle apps, RDP/intranet, default. What happens almost every day is that Outlook cached-mode profiles are set up for a new machine, a wiped machine, etc., and these large OST files begin to congest the WAN lines. While they don't affect VoIP, Oracle, or RDP/intranet because of priority, they basically kill everything else in default, like internet, file transfer, etc. We don't want to move internet and file transfer traffic out of default, because we don't want the reverse effect where someone starts downloading large files from the internet and the next thing we know we have email issues because email is now a lower priority than internet. What I'd like to see is Exchange traffic bumped up out of default into its own priority, and that priority cannot exceed a certain maximum, EVEN when there is no congestion on the WAN line. For example, the Exchange priority is set to max out at 30% of a line, 5 users begin to cache large Outlook profiles, and the users browsing the internet from the default queue don't see a performance hit, since the default queue continues to use whatever is available. And it just so happens that 60-70% is available, because Exchange can only max out at 30% even though it's a higher priority than the default queue.

I read this is possible by associating a police command with a priority, but I don't fully understand. Thanks for the help!

philldogger · ‎12-02-2010

Thanks Cadetalain. Now, when you say traffic is dropped, would that mean, for my scenario, that Exchange would just drop back to 30% after going over that during non-congestion? I thought police meant it couldn't exceed a maximum even if it wanted to, so what is there to drop? So I want to create an Exchange class with a policy-map policed to never exceed 30%. Do I have that right?

niro · ‎12-02-2010

Chad, if I were in your shoes I would ask the networking team to shape your Exchange traffic with a relatively low value, maybe 10% or 20%. It shouldn't cause issues with other traffic, since it sounds like they have more business-critical applications already in different queues. What you want to do is prevent Exchange from fighting other types of traffic in the default queue, especially while a new Outlook client is caching a large mailbox for the first time.

If you police this queue, then you may run into issues where, during this first-time caching, users complain of delayed mail and other problems even if there is plenty of bandwidth available. If it's available, why not use it, right?

I tend to only police traffic I consider malicious or just wasting our resources.

philldogger · ‎12-02-2010

I'm pretty sure the default queue has no allocated percentage and just gets what is available. Since Exchange and internet traffic are both in default currently, even if I move Exchange to its own queue, it will crush the default queue until some percentage is allocated to default. Or I can just create an Exchange queue with police. Now, I see your concern with police, but what really happens if 100 people have Exchange requests and 3 of them are receiving large amounts of data? Doesn't it just allow all 100 requests to come in divided equally, yet at 30% of the line? For example, Bob is downloading a 2 GB OST from Exchange for his Outlook profile, which will no doubt take up its 30%; meanwhile, Sally gets a 2 MB email sent to her Outlook. Sally's email doesn't have to wait for Bob's download from Exchange to finish before showing up, does it?

I know today it doesn't appear to work that way. We have a site in Boise that has its WAN 100% full every other day because they are always reloading machines out there and downloading cached-mode Outlook profiles. However, we don't get email slowness complaints, just internet complaints, because they all share the default queue.

thanks for helping me understand...this forum community is great and responsive.

lapinmort · ‎12-02-2010

QoS definitely requires a lot of planning any way you approach it.

I would advise you to get familiar with the behavior of the different traffic flows going through your router, and identify them first. Also find out which queueing strategies your router supports, and their pluses and minuses. Only then can you start planning a QoS strategy that works for you.

As for your statement on Exchange. Let's say you have five classes of traffic, classified by DSCP value:

EF

AF31

AF21

AF11

BE (Best effort)

Then you decide that you will tag only Exchange packets with DSCP value AF31. Then you create a policy stating how AF31 tagged packets are to be treated in terms of the bandwidth they will take (in percentage or bitrate), and what priority queue that traffic will take on your exit interface's transmit (TX) ring.
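To make the tagging concrete, here is a minimal Python sketch of how those DSCP names map to the decimal codepoints carried in the IP header. It relies on the standard definitions (AFxy is 8x + 2y per RFC 2597, EF is 46 per RFC 3246); the function name `dscp_value` is just an illustrative choice, not something from any router CLI.

```python
def dscp_value(name: str) -> int:
    """Return the decimal DSCP codepoint for names like 'EF', 'AF31' or 'BE'."""
    name = name.upper()
    if name == "EF":                      # Expedited Forwarding (RFC 3246)
        return 46
    if name in ("BE", "DEFAULT"):         # best effort / default PHB
        return 0
    # Assured Forwarding (RFC 2597): AFxy, class x in 1..4, drop precedence y in 1..3
    if (len(name) == 4 and name.startswith("AF")
            and name[2] in "1234" and name[3] in "123"):
        af_class, drop_precedence = int(name[2]), int(name[3])
        return 8 * af_class + 2 * drop_precedence
    raise ValueError(f"unknown DSCP name: {name}")


if __name__ == "__main__":
    for label in ["EF", "AF31", "AF21", "AF11", "BE"]:
        print(label, dscp_value(label))
```

The second digit of an AF name is the drop precedence, so AF32 shares a class with AF31 but is dropped earlier under congestion. Names outside the four AF classes raise `ValueError`.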

I don't know if it will make the concept of queuing in QoS easier to grasp, but think of your router's interface as a Gatling gun (TX/RX rings) fed by multiple ammunition belts (queues). Each queue contains a different class of ammunition (one will contain EF-tagged traffic, the next AF31-tagged traffic, etc.). Your queuing strategy is the algorithm (LLQ, WFQ, CBWFQ, WRED, etc.) on the gun that decides which ammunition belt the gun feeds off first for firing, and in which order the ammo will be fired. Your Gatling gun's maximum firing rate is your WAN bandwidth. If you use policing, your gun sets a strict maximum feed rate for each ammo belt, and will either drop excess bullets from belts violating the feed rate without firing them, or queue them in a lower-priority ammo belt for re-feeding, depending on your specifications. When you use traffic shaping, your ammunition feeding mechanism creates a buffer that takes in excess-rate ammo from the belts until the buffer is full, while the gun keeps firing. By classifying only the Exchange traffic as, say, AF31, you effectively load it as ammunition in its own belt; it will have its own queue.
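The difference between policing and shaping can be sketched in a few lines of Python. This is a toy, tick-based model under stated assumptions, not how any particular router implements it: traffic is counted in abstract units per tick, the policer is a simple token bucket that drops anything over the limit, and the shaper is a rate limiter with a finite buffer. The names `police`, `shape`, `rate`, `burst` and `buffer_size` are all hypothetical.

```python
def police(arrivals, rate, burst):
    """Token-bucket policer: traffic beyond the available tokens is dropped."""
    tokens = burst
    sent, dropped = [], []
    for offered in arrivals:
        tokens = min(burst, tokens + rate)   # refill, capped at the burst size
        out = min(offered, tokens)           # forward only what tokens allow
        tokens -= out
        sent.append(out)
        dropped.append(offered - out)        # the excess is discarded immediately
    return sent, dropped


def shape(arrivals, rate, buffer_size):
    """Shaper: excess traffic waits in a buffer; only buffer overflow is dropped."""
    backlog = 0
    sent, dropped = [], []
    for offered in arrivals:
        backlog += offered
        out = min(backlog, rate)             # drain at the configured rate
        backlog -= out
        overflow = max(0, backlog - buffer_size)
        backlog -= overflow                  # whatever does not fit is dropped
        sent.append(out)
        dropped.append(overflow)
    return sent, dropped


if __name__ == "__main__":
    burst_of_traffic = [50, 50, 0, 0, 0]     # two busy ticks, then silence
    print("police:", police(burst_of_traffic, rate=30, burst=30))
    print("shape: ", shape(burst_of_traffic, rate=30, buffer_size=100))
```

With these numbers both mechanisms cap the class at 30 units per tick, but the policer forwards 60 units and drops 40, while the shaper delays the excess and delivers all 100 units over four ticks.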

So yes, you will have to spend some time thinking and planning your strategy. No other way around QoS.

Rado

philldogger · ‎12-02-2010

Here's the current breakdown of our QoS:

AF5 - VoIP

AF41 - 40%

AF31 - 20%

AF32 - 20%

AF21 - 8%

AF22 - 8%

Default - 0%

I realize the math doesn't add up to 100%, and I'm not sure if that's normal, but this is what the network team has given me. Additionally, 60% of the QoS entries are for old and retired servers, and it has never been cleaned up or updated. Let's not even start down the path of why this is...
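For reference, the percentages listed above can be totalled with a quick Python check. The dictionary name `current` is only illustrative, and the VoIP class is left out because no percentage is given for it.

```python
# Percentages exactly as listed in the current breakdown (VoIP has no figure).
current = {"AF41": 40, "AF31": 20, "AF32": 20, "AF21": 8, "AF22": 8, "Default": 0}

total = sum(current.values())
print(f"allocated: {total}%, unallocated: {100 - total}%")
```

That comes to 96% allocated, with 4% not assigned to any class.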

Here's what I'd like to recommend, let me know your comments:

AF41 - 40% - Oracle, Ceridian, Kronos

AF31 - 20% - Tandberg, OCS

AF32 - 20% - Intranet, RDP

AF21 - 15% - Exchange - POLICE (Cannot exceed maximum)

Default - 5% - Everything else

thanks,

niro · ‎12-02-2010

Looks good but you're forgetting VoIP:

EF - 10% VoIP - this heavily depends on how many simultaneous calls are expected to go over the line.

AF41 - 30% - Oracle, Ceridian, Kronos

AF31 - 20% - Tandberg, OCS

AF32 - 20% - Intranet, RDP

AF21 - 15% - Exchange - POLICE (Cannot exceed maximum)

Default - 5% - Everything else

You can always change AF21 to be shaped instead of policed and play around with the percentages later if you run into issues.

lapinmort · ‎12-02-2010

I find it hard to believe that your network people allocated 0% bandwidth to the best effort traffic. You usually want to allocate 25% of your bandwidth to best effort traffic, because it accounts for all the traffic you don't mark with a DSCP value.

Your priority queue traffic should not take more than 33% of your bandwidth. Since here you are using 10% for your VoIP traffic (I assume it's in the priority queue), that should leave 65% of your bandwidth to allocate to the AF41, AF31, AF32, and AF21 traffic.
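As a rough illustration, those two rules of thumb (priority queue at most 33%, about 25% for best effort) can be applied to the plan posted earlier in the thread. This is only a sketch: the function name `review_plan` and its parameters are made up for the example, and the thresholds are the guideline figures quoted here, not hard limits.

```python
def review_plan(plan, priority=("EF",), default="Default",
                priority_cap=33, default_target=25):
    """Check a {class: percent} plan against the rule-of-thumb figures.

    Returns (findings, remaining), where remaining is the percentage left
    for the other classes once priority and best effort are set aside.
    """
    findings = []
    total = sum(plan.values())
    if total != 100:
        findings.append(f"total is {total}%, not 100%")
    priority_share = sum(plan.get(name, 0) for name in priority)
    if priority_share > priority_cap:
        findings.append(f"priority queue {priority_share}% exceeds {priority_cap}%")
    default_share = plan.get(default, 0)
    if default_share < default_target:
        findings.append(f"{default} gets {default_share}%, below {default_target}%")
    remaining = 100 - priority_share - default_target
    return findings, remaining


if __name__ == "__main__":
    # The plan with VoIP added, as posted earlier in the thread.
    proposed = {"EF": 10, "AF41": 30, "AF31": 20, "AF32": 20, "AF21": 15, "Default": 5}
    findings, remaining = review_plan(proposed)
    print(findings)
    print(f"{remaining}% left for the AF classes")
```

Run against that plan, the only finding is the 5% default class, and the remainder works out to the same 65% mentioned above.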

This is where knowing your traffic flows comes in. It allows you to decide which applications are bandwidth hungry and allocate bandwidth accordingly. You also get to see which ones are more sensitive to delay, and you prioritize accordingly. Say, if the Tandberg traffic gobbles more bandwidth than the Oracle traffic but has a lesser priority, you may assign more bandwidth to the AF31 class than to the AF41 class. I would put Exchange in the AF32 class, since that one has a higher drop probability than AF31 anyway (the 2 in AF32), and Intranet and RDP in the AF21 class, unless you use RDP for network management, in which case I would give it a higher priority. You want to be able to manage your appliances even when there's congestion.

philldogger · ‎12-03-2010

Maybe this will help. Here is the current breakdown with percentages and apps/ports:

AF5 - VoIP - Not used yet

AF41 - 40% - Oracle

AF31 - 20% - Handheld telnet, Kronos, RDP

AF32 - 20% - Intranet, Citrix, DCs, Ceridian

AF21 - 8% - Any 172 to Any 172 (essentially all private traffic)

AF22 - 8% - Oracle printing

Default - 0% - Any non-private traffic, so internet, public FTP, etc...

All MPLS branch sites access the internet out of the datacenter firewalls across the WAN.

Knowing this, what would you guys recommend? We used to have the old Exchange servers under AF32, and that was still a crappy solution. Now you can see that the new Exchange servers (which have been online for a year now) fall under AF21, the internal catch-all class.

Thanks for the help, guys. I really appreciate the feedback I've been seeing.