
Bandwidth Planning Query

paulstone80
Level 3

Hello,

I have a query regarding bandwidth capacity planning.

If you monitored a 4 Mbps WAN link over a number of months, and saw that the bandwidth utilisation never peaked above 1.8 Mbps, would it be safe to assume that you could reduce the bandwidth on the link to 2 Mbps without having an impact on performance?

Obviously burst traffic that pushes the utilisation over 2 Mbps will be affected, but I wondered if there are any other factors that come into play that would affect performance.

Thanks,

Paul

HTH Paul ****Please rate useful posts****

Joseph W. Doherty
Hall of Fame

Disclaimer

The Author of this posting offers the information contained within this posting without consideration and with the reader's understanding that there's no implied or expressed suitability or fitness for any purpose. Information provided is for informational purposes only and should not be construed as rendering professional advice of any kind. Usage of this posting's information is solely at reader's own risk.

Liability Disclaimer

In no event shall Author be liable for any damages whatsoever (including, without limitation, damages for loss of use, data or profit) arising out of the use or inability to use the posting's information even if Author has been advised of the possibility of such damage.

Posting

If you monitored a 4 Mbps WAN link over a number of months, and saw that the bandwidth utilisation never peaked above 1.8 Mbps, would it be safe to assume that you could reduce the bandwidth on the link to 2 Mbps without having an impact on performance?

No, I don't believe that's a safe assumption.

Obviously burst traffic that pushes the utilisation over 2 Mbps will be affected, but I wondered if there are any other factors that come into play that would affect performance.

Yes, many.

Additional comments . . .

Basically, all application traffic would like zero delay, no frame/packet loss and infinite bandwidth from the network.  Of course, providing that is impossible.

You can work to provide the least delay, least frame/packet loss and most bandwidth possible, which, if provided, generally makes application traffic "happy", but this can be prohibitively expensive.  So what we really work toward is providing at least the minimum delay, minimum frame/packet loss and least amount of bandwidth that a network application will consider acceptable or good for its service requirements.  Service requirements tend to vary greatly based on the nature of the network application being supported.

For example, given the stats you've provided, if all your traffic were e-mail server-to-server data, not only might reducing the bandwidth from 4 Mbps to 2 Mbps be fine; if you further told me e-mail only needed to be successfully transferred within 24 hours, perhaps even much less than 2 Mbps would be fine too.  The real question would be: how much e-mail volume is transferred across 24 hours, and do you have enough bandwidth to support that volume in 24 hours?
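
As a rough, back-of-the-envelope illustration of that volume question (the daily volume and overhead factor below are made-up assumptions, not your numbers), the check is only a couple of lines:

# Rough feasibility check: can an assumed daily e-mail volume cross the link
# within a 24-hour window?  All figures here are hypothetical.
daily_volume_gb = 10        # assumed server-to-server mail volume per day
link_mbps = 2               # proposed link size
overhead = 1.15             # rough allowance for protocol overhead/retransmits

bits_to_send = daily_volume_gb * 8e9 * overhead
hours_needed = bits_to_send / (link_mbps * 1e6) / 3600
print(f"~{hours_needed:.1f} hours to move {daily_volume_gb} GB at {link_mbps} Mbps")
# ~12.8 hours here, comfortably inside a 24-hour window; a larger volume or
# a smaller link changes the answer quickly.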

Conversely, if you told me all your traffic was (just) VoIP using G.729, dropping to 2 Mbps might be very damaging to the quality of your VoIP calls.  Further, assuming your stats are from typical 5 minute polls, the 4 Mbps you have now might actually be insufficient.
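
To make the VoIP point concrete (a sketch only; the per-call figure below is a commonly quoted planning number for G.729 with IP/UDP/RTP plus Layer 2 overhead, and the exact value depends on the encapsulation and whether cRTP is used):

# Rough G.729 call-count check.  G.729 is an 8 kbps codec, but headers push a
# single call direction to roughly 24-32 kbps on the wire; 26.4 kbps is used
# here as an assumed planning figure.
per_call_kbps = 26.4
link_kbps = 2000            # a 2 Mbps link

max_calls = int(link_kbps // per_call_kbps)
print(f"~{max_calls} concurrent G.729 calls fit in a {link_kbps/1000:.0f} Mbps link")
# Remember also that a 5-minute polling average smooths away short peaks, so
# "never above 1.8 Mbps" on a 5-minute poll does not rule out brief saturation.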

If you have mixed traffic types crossing this link, each needs analysis against the aggregate usage.

Doing capacity planning (or monitoring) "right" can be rather involved; i.e. many factors.  Because of this, most seem to take the practical approach: if users complain, add bandwidth until users stop complaining.  (No reason why the inverse wouldn't work too, i.e. reduce bandwidth until users complain.  Intentionally breaking the network can be damaging to one's reputation, though.)

BTW, an empirical approach doesn't work too badly, although many don't realize how even basic QoS can skew the bandwidth "needed".  For example, if your 4 Mbps link is using FIFO queuing, user-perceived performance might be just as good at 2 Mbps with fair queuing.  (Note: without knowing much, much more about your traffic, I'm not saying this would actually be true for you.)
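
As a simplified illustration of that skew (an idealised sketch, not a model of IOS WFQ; the packet sizes and backlog are assumptions):

# Compare how long a small interactive packet waits when it arrives behind a
# burst of bulk packets, under FIFO vs. idealised per-flow fair queuing.
LINK_BPS = 4_000_000            # assumed 4 Mbps link
BULK_PKT_BITS = 1500 * 8        # full-size bulk packets
BULK_BACKLOG = 40               # bulk packets already queued ahead

# FIFO: the interactive packet sits behind the whole bulk backlog.
fifo_wait_ms = BULK_BACKLOG * BULK_PKT_BITS / LINK_BPS * 1000

# Fair queuing (idealised): flows are served alternately, so the interactive
# packet waits for roughly one bulk packet instead of the whole backlog.
fq_wait_ms = BULK_PKT_BITS / LINK_BPS * 1000

print(f"FIFO wait: {fifo_wait_ms:.0f} ms, fair-queuing wait: {fq_wait_ms:.0f} ms")
# 120 ms vs. 3 ms with these assumptions -- the same link "feels" much faster.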

PS:

Also, BTW, some Cisco IOS images have a very interesting bandwidth analysis feature; look for Corvil Bandwidth on Cisco's main site.

Hi Joseph,

Thanks for your reply; I thought there was more to it than my original assumption.

The reason I ask is that we had a site (let's call it Site A) running on a 4 Mbps MPLS link, with 15 users predominantly using Terminal Services, and one user with a laptop also on Terminal Services but occasionally using Outlook and files locally. We monitored this site at 1 minute intervals, and in the consolidated weekly or monthly reports the bandwidth utilisation was running at around 1.5 Mbps.

In another part of our company we also have Site B, where there are 5 users using a Citrix environment. Site B was monitored at 5 minute intervals, and the consolidated weekly/monthly report shows bandwidth utilisation at 0.5 Mbps.

Users from Site A were relocating to Site B, so someone in our business looked at the two bandwidth reports from Site A and Site B, put two and two together, and decided that there was no requirement to upgrade the bandwidth at Site B because 1.5 + 0.5 = 2, and therefore there was sufficient bandwidth.

The users from Site A have moved to Site B, and are now reporting that their Terminal Services sessions are running painfully slow.

Blame has been pointed at our monitoring system at Site A, on the grounds that its data was incorrect, but I don't think that's the issue. What factors affect an application's performance at different bandwidths? Is it to do with the TCP window size being able to be much larger, or is it because higher bandwidth usually means lower latency?

Thanks,

Paul

HTH Paul ****Please rate useful posts****


Joseph W. Doherty

The users from Site A have moved to Site B, and are now reporting that their Terminal Services sessions are running painfully slow.

I did write about users complaining ...

"Screen scraping" (e.g. terminal services, Citrix) applications tend to be very sensitive to latency.  Not too keen on drops, either.  Usually not too bandwidth intensive, although remote printing or file copying can be an issue.

To ensure low latency, you often need lots of excess bandwidth.  Queuing theory generally shows that at utilization under about 1/3 there is very little queuing; between 1/3 and 2/3 you may queue, but it's normally shallow; once past about 2/3, queuing delay starts to increase exponentially.

The original 1.5 Mbps usage on 4 Mbps is under 40%, which is generally "safe" and avoids most queuing latency.

0.5 Mbps on 2 Mbps (25%) is even better.

0.5 + 1.5 Mbps on 2 Mbps is 100%, and likely to see massive queuing delay.  I'm not surprised the Terminal Services users are now complaining.
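
To put rough numbers on that (a sketch using the simple M/M/1 result that average time in system is the transmission time multiplied by 1/(1 - utilisation); real traffic is burstier, so treat these as best-case multipliers):

# Utilisation and a rough queuing-delay multiplier for each scenario.
scenarios = {
    "Site A before: 1.5 of 4 Mbps": 1.5 / 4,
    "Site B before: 0.5 of 2 Mbps": 0.5 / 2,
    "Combined:      2.0 of 2 Mbps": 2.0 / 2,
}
for name, rho in scenarios.items():
    if rho < 1:
        print(f"{name} -> {rho:.0%} utilised, delay multiplier ~{1 / (1 - rho):.1f}x")
    else:
        print(f"{name} -> {rho:.0%} utilised, queue grows without bound")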

So it appears it's not so much that your data was wrong; it's the interpretation of the significance of that data against your type of traffic.

For something like Terminal Services, you should try to keep utilization below 2/3, i.e. about a 3 Mbps link for the combined 2 Mbps load.  That will likely provide acceptable performance.  For good performance, try to keep utilization below 1/2, i.e. about 4 Mbps.  For ideal performance, try to keep it below 1/3, i.e. about 6 Mbps.
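
The same sizing arithmetic, spelled out (assuming the combined offered load really is about 2 Mbps):

# Link size needed to keep a 2 Mbps offered load under each utilisation target.
offered_mbps = 2.0
for label, target in (("acceptable", 2 / 3), ("good", 1 / 2), ("ideal", 1 / 3)):
    print(f"{label:10s}: keep under {target:.0%} -> ~{offered_mbps / target:.0f} Mbps link")
# -> roughly 3, 4 and 6 Mbps, matching the figures above.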

PS:

What factors affect an application's performance at different bandwidths? Is it to do with the TCP window size being able to be much larger, or is it because higher bandwidth usually means lower latency?

Interactive-type applications normally don't need to muck about with TCP window size (or perhaps even ensure a full MTU is being used).  (NB: TCP's receive window does need to cover the path's bandwidth-delay product to sustain full transmission rate, but again, it's doubtful this is an issue for Terminal Services.)
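
For reference, the bandwidth-delay product the receive window would need to cover is small at these speeds (the 40 ms round-trip time below is an assumption):

# TCP throughput is window-limited only when RWIN < bandwidth x RTT (the BDP).
link_bps = 4_000_000        # 4 Mbps link
rtt_s = 0.040               # assumed 40 ms round trip

bdp_bytes = link_bps * rtt_s / 8
print(f"BDP: {bdp_bytes / 1024:.1f} KB")
# ~19.5 KB here, well inside a default 64 KB receive window, which is why
# window tuning rarely matters for interactive traffic like Terminal Services.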

Higher bandwidth can reduce serialization delay, which is good, but it can be prohibitively expensive across a WAN.  Less expensive alternatives might exist, such as compression.
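
Serialization delay itself is easy to quantify (assuming full-size 1500-byte packets):

# Time to clock one 1500-byte packet onto the wire at various link speeds.
packet_bits = 1500 * 8
for mbps in (2, 4, 10):
    ms = packet_bits / (mbps * 1e6) * 1000
    print(f"{mbps:>2} Mbps: {ms:.1f} ms per 1500-byte packet")
# 6.0 ms at 2 Mbps vs. 3.0 ms at 4 Mbps vs. 1.2 ms at 10 Mbps.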

BTW, when dealing with MPLS, you can run into shared-port congestion issues.  For example, a branch might have only a megabit or two of bandwidth while "HQ" has multi-megabit, but the HQ port might actually be oversubscribed.  If so, you can have transient congestion, "invisible" to you, on the MPLS provider's interface.  This doesn't show up in typical polled stats, but if it's happening it can be very adverse to something like Terminal Services traffic.
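
A quick illustration of why that congestion can be invisible to polled stats (hypothetical numbers):

# A link saturated for 30 seconds, then idle, still reports a low average
# over a 5-minute poll interval.
poll_interval_s = 300
burst_s = 30
link_mbps = 2

bits_sent = burst_s * link_mbps * 1e6        # link at 100% for 30 s, idle after
avg_util = bits_sent / (poll_interval_s * link_mbps * 1e6)
print(f"Reported average utilisation: {avg_util:.0%}")
# 10%, despite half a minute of outright saturation within the interval.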

Hi Joseph,

Thanks again for your reply, that's really helpful and explains a lot.

I find your point about the safe percentages of bandwidth utilisation for queuing interesting, because we have a few sites on our WAN that complain of poor network performance, and the bandwidth stats for these sites show utilisation at 75% - 80%. These are predominantly Citrix users experiencing the slowdowns. The feedback I get from our network manager is that the bandwidth isn't being maxed out, therefore the WAN isn't the issue. I thought this didn't sound right!

Can you recommend any further reading that will give me a better understanding of this subject?

Thanks,

Paul

HTH Paul ****Please rate useful posts****


Joseph W. Doherty

Paul, sorry, I don't know of a good single source of information that would explain this further.

There's much information you can find on the Internet about queuing theory; in particular, look for information on M/M/1.  There are also free queuing "calculators" available on the Internet.

Understand that simple M/M/1 queuing theory assumes random (Poisson) arrivals and a simple (exponential) service-time distribution.  With most network traffic, the amount of work per arrival often varies (variable packet sizes and per-sender bursts), which makes the processing time variable too.  More advanced queuing models can account for this, but as long as you realize actual queuing delay is likely to be even more variable than a simple M/M/1 calculator will predict, such a calculator can still provide a sort of "best case" estimate.
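
For the curious, a minimal M/M/1 "calculator" is only a few lines (standard textbook formulas; the example arrival and service rates are assumptions, roughly 1500-byte packets on a 4 Mbps link):

# M/M/1: utilisation rho = lambda/mu, mean wait in queue Wq = rho/(mu*(1-rho)),
# mean time in system W = 1/(mu - lambda).
def mm1(arrival_rate, service_rate):
    rho = arrival_rate / service_rate
    if rho >= 1:
        raise ValueError("unstable queue: arrival rate >= service rate")
    wq = rho / (service_rate * (1 - rho))
    w = 1 / (service_rate - arrival_rate)
    return rho, wq, w

# Example: 150 packets/s arriving on a link that can serve ~333 packets/s.
rho, wq, w = mm1(150, 333)
print(f"utilisation {rho:.0%}, mean queue wait {wq * 1000:.1f} ms, "
      f"time in system {w * 1000:.1f} ms")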

I'm not sure whether Citrix documents recommended performance requirements, but again, since it's effectively screen scraping, you want to avoid any additional latency, such as might be added by queuing during transient bursts.

PS:

If you do file copying or printing through Citrix, NBAR can recognize Citrix traffic types and allow QoS to treat such traffic differently than actual "screen scraping".

Fair queuing of even "screen scraping" Citrix traffic can avoid one particularly high-volume flow being especially detrimental to other Citrix flows.  (This is an example where QoS may sometimes allow the use of less bandwidth, because you use your bandwidth more efficiently.)