
Metro Ethernet QoS

de1denta
Level 3

Hi,

We have a couple of sites that are connected to a metro ethernet VPLS service. Our main site is connected to the service using a port speed of 1Gbps with an access rate of 40Mbps and our remote sites are connected using a port speed of 100Mbps with an access rate of 20Mbps. All of the circuits are connected to Cisco 3560 switches.

I have noticed an issue where the remote site links are getting congested at 20Mbps inbound, presumably because the main site is sending at a rate of 20Mbps, which is overwhelming the remote sites' 10Mbps connections. We are also seeing packet loss when transmitting data from all sites because the service provider is rate limiting the speeds.

I can look at configuring the remote site switches with the srr-queue bandwidth limit 10 command to shape traffic to 10Mbps, but this won't work for the main site, as the minimum percentage is 10, which means the slowest I can shape the 1Gbps port to is 100Mbps.
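For reference, this is roughly what I had in mind on a remote-site switch (the port number is just an example, and I believe QoS needs to be enabled globally for the SRR settings to take effect):

mls qos
!
interface FastEthernet0/24
 description Metro-E hand-off (100Mbps port)
 ! limit egress to roughly 10% of the 100Mbps port speed (about 10Mbps)
 srr-queue bandwidth limit 10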


What is the best option here? Will I need to install an ISR or Metro-E switch at the main site to terminate the VPLS service and then apply proper MQC shaping and queuing?

Thank you


9 Replies

Joseph W. Doherty
Hall of Fame

Disclaimer

The Author of this posting offers the information contained within this posting without consideration and with the reader's understanding that there's no implied or expressed suitability or fitness for any purpose. Information provided is for informational purposes only and should not be construed as rendering professional advice of any kind. Usage of this posting's information is solely at reader's own risk.

Liability Disclaimer

In no event shall Author be liable for any damages whatsoever (including, without limitation, damages for loss of use, data or profit) arising out of the use or inability to use the posting's information even if Author has been advised of the possibility of such damage.

Posting

If your provider supports 100 Mbps on your gig connection, you could run the hub's hand-off at 100 Mbps and then rate limit the port to 40%.

Or, if you configure QoS, you could direct all traffic to a single queue and shape it for 40 Mbps.  (NB: buffer tuning might be required if you activate QoS.)

That said, ideally you may want to shape the aggregate egress for 40 Mbps, and traffic to each spoke to 20 Mbps (if the provider also polices inbound).

BTW, 3560/3750 port shaping isn't precise, so you might need to set the value lower than desired to avoid hitting the MetroE vendor's bandwidth policer limits.
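Roughly, those two options might look something like the following on the hub 3560 (interface numbers, queue maps, and weights are only illustrative; verify the rates you actually get, and if you trust DSCP rather than CoS, the output dscp-map would need the same treatment):

! Option 1: run the hand-off at 100 Mbps and limit the port to ~40%
mls qos
interface GigabitEthernet0/1
 speed 100
 srr-queue bandwidth limit 40
!
! Option 2: keep the gig hand-off, steer everything into egress queue 1,
! and shape that queue; a weight of 25 on a gig port is 1000/25 = ~40 Mbps
mls qos
mls qos srr-queue output cos-map queue 1 threshold 3 0 1 2 3 4 5 6 7
interface GigabitEthernet0/1
 srr-queue bandwidth shape 25 0 0 0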

Hi Joseph,

Unfortunately the provider can't do this, so I'm stuck with 1 Gbps.

OK, that makes sense. So on the main site switch I would need to map all traffic classes to one of the 4 egress queues and then shape that queue to 40Mbps. Is that correct?

I think the challenge that I'm going to have is limiting the traffic to the remote sites, as the service has been set up as an any-to-any VPLS network and not hub and spoke, so I can't shape individual logical interfaces; I only have the one interface on the main site switch. If I shape out to 20 Mbps on the main site then that will affect all traffic.

What is the best approach here? I could look at reconfiguring the VPLS to use VLANs and sub-interfaces on the main site switch to emulate a hub and spoke network and then shape to each spoke?

Thanks


OK, that makes sense. So on the main site switch I would need to map all traffic classes to one of the 4 egress queues and then shape that queue to 40Mbps. Is that correct?

Correct.

I think the challenge that I'm going to have is limiting the traffic to the remote sites, as the service has been set up as an any-to-any VPLS network and not hub and spoke, so I can't shape individual logical interfaces; I only have the one interface on the main site switch. If I shape out to 20 Mbps on the main site then that will affect all traffic.

How many remote locations?  If four or fewer, you could shape the different egress queues.

If more than four, and if you have spare ports, you might be able to "self-loop" different remotes through dedicated ports using VRFs, similar to what you could do with a 2nd 3560.  (A real Rube Goldberg setup.)

What is the best approach here? I could look at reconfiguring the VPLS to use VLANs and sub-interfaces on the main site switch to emulate a hub and spoke network and then shape to each spoke?

Best would be a more suitable device to manage your bandwidth.

Unsure a low-end MetroE switch would really allow the ideal two-tier shaping (again, only if the MetroE provider also polices bandwidth egress from their cloud to site).  Believe a 29xx router might work well.  For 40 Mbps, a 2911 (35 Mbps) or 2921 (50 Mbps) should do.

You don't need different interfaces as long as you can identify traffic going to different sites.
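As a very rough sketch of that (the subnets, DSCP values, and port numbers below are invented purely for illustration, and it assumes four or fewer spokes), you could mark traffic by destination on ingress and then shape each egress queue separately:

! mark LAN traffic per destination spoke on ingress (only two spokes shown)
access-list 101 permit ip any 10.1.1.0 0.0.0.255
access-list 102 permit ip any 10.1.2.0 0.0.0.255
class-map match-all SPOKE1
 match access-group 101
class-map match-all SPOKE2
 match access-group 102
policy-map MARK-BY-SPOKE
 class SPOKE1
  set dscp 8
 class SPOKE2
  set dscp 16
!
interface GigabitEthernet0/10
 description LAN-facing port
 service-policy input MARK-BY-SPOKE
!
! steer each marking to its own egress queue (DSCP 8 -> queue 1, DSCP 16 -> queue 2)
mls qos srr-queue output dscp-map queue 1 threshold 3 8
mls qos srr-queue output dscp-map queue 2 threshold 3 16
!
interface GigabitEthernet0/1
 description Metro-E hand-off
 ! a weight of 100 per queue on a gig port shapes each queue to roughly 10 Mbps;
 ! pick whatever value matches each spoke's policer
 srr-queue bandwidth shape 100 100 0 0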

Thanks Joseph, that's very helpful.

Hi,

I tested this today and I just wanted to post my findings.

I enabled QoS on the main site 3560 switch and then mapped all CoS values to output queue 1. I then enabled shaping on the 1Gbps interface connecting to the Metro-E provider using the command 'srr-queue bandwidth shape 110 0 0 0'. Using 110 results in a shaped rate of 9 Mbps, which is less than the provider's policer on the remote sites.
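For completeness, the relevant configuration looked roughly like this (the interface number here is just a placeholder):

mls qos
! send every CoS value to output queue 1
mls qos srr-queue output cos-map queue 1 threshold 3 0 1 2 3 4 5 6 7
!
interface GigabitEthernet0/1
 description Metro-E VPLS hand-off
 ! a weight of 110 on a gig port shapes queue 1 to roughly 1000/110 = 9 Mbps
 srr-queue bandwidth shape 110 0 0 0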

After enabling the above, packet loss was worse, although latency had improved during peak utilization. I can see that we are no longer congesting the remote site links, but we are now seeing a high level of output drops on the main site 3560 Metro-E interface, which I'm presuming is due to tail drop because there are not enough buffers allocated to the queue.

I attempted to tweak the buffers by allocating more of the buffer space to queue 2 with a full reservation, and I also ensured that the traffic was using threshold 3 so that packets were not being dropped by congestion avoidance. Unfortunately this didn't improve packet loss.

Am I on the right track here, and is there anything else that I can do to reduce the output drops?

Thank you


There are different things you can try.

You could use a different queue-set to allocate all your buffer resources to the single queue you're using.  (Don't forget to zero out the 3 other queue buffer reservations.)
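For example, something like the following (the values are only illustrative, and the allowed buffer and threshold ranges vary by platform and software release, so check what your switch accepts):

! queue-set 2: give nearly all the buffer allocation to queue 1
mls qos queue-set output 2 buffers 97 1 1 1
! relax queue 1's drop/reserved/maximum thresholds so it can draw on the common pool
mls qos queue-set output 2 threshold 1 400 400 100 400
!
interface GigabitEthernet0/1
 queue-set 2

You can then watch 'show mls qos interface gigabitEthernet 0/1 statistics' to see whether the output drops come down.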

If you're not using an uplink port for your Metro hand-off, you might try that.  Cisco documents that the 3750-X has 2 MB of buffer RAM for each set of 24 downlink ports, and 2 MB of buffer RAM for the uplink ports.  I don't know what the buffer RAM resources are on the different 3560/3750 models, but I suspect similar allocations (i.e. uplink ports get more buffer space per port).

If not already doing so, you might shut down ports that aren't being used.  This might release those ports' buffer space.

As your hub has 20 Mbps available, you could split your sites across two shaped queues, each getting 10 (or so) Mbps.  This would take advantage of your additional hub bandwidth without risking overrunning any one spoke.  It will complicate buffer sharing, though, as you'll now need to split buffers across both queues.  You can try 50% reserved for each, or nothing reserved for each, or something in between; it's hard to say which will work best for you.

Ideally, if you split your spokes across two shaped queues, you'll round-robin them based on their average bandwidth consumption (this is to ensure the best average usage of your bandwidth).
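A rough sketch of that two-queue variant (the weights and the buffer split are just starting points to tune from):

! two shaped queues on the gig hand-off; weight 100 each is roughly 10 Mbps per queue
interface GigabitEthernet0/1
 srr-queue bandwidth shape 100 100 0 0
 queue-set 2
! split the buffer roughly evenly between the two queues in use
mls qos queue-set output 2 buffers 49 49 1 1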

Lastly, there's my Rube Goldberg suggestion.  If your 3560 has 48 copper ports and 4 uplinks, you take LAN input on a gig uplink; then you can have 24 copper ports connected to the other 24 copper ports, each set in a different VRF.  This would allow you to run each port at 10 Mbps and support 4 classes of service.  The combined aggregate uses another uplink port connected to another uplink port, in another VRF, both running at 100 Mbps (requires a multispeed copper SFP) but port limited to 20%.  Again, you can support 4 classes of service.  The last port, the remaining uplink, is used to connect to your MetroE at gig.

What the foregoing would allow is up to 24 logical pipes running at 10 Mbps, each supporting 4 classes of service.  You determine what maps to each logical pipe.  The aggregate of the 24 logical pipes maps to another 20 Mbps pipe, also supporting 4 classes of service.

Basically, your 10 Mbps logical pipes keep you from overrunning spokes, but if there's more than 10 Mbps of traffic, you can provide up to 4 different class treatments.  The aggregate is limited to your hub's 20 Mbps, again allowing you to provide 4 different class treatments.
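To make the self-loop idea a little more concrete, here's one looped pair sketched out (the VRF name, addressing, and routes are entirely invented, only one spoke is shown, and VRF-lite needs a 3560 software image that supports it):

ip vrf AGG
!
! Fa0/1 (global table, 10 Mbps, toward spoke 1) is physically cabled to Fa0/2 (VRF AGG)
interface FastEthernet0/1
 no switchport
 ip address 192.168.101.1 255.255.255.252
 speed 10
!
interface FastEthernet0/2
 no switchport
 ip vrf forwarding AGG
 ip address 192.168.101.2 255.255.255.252
 speed 10
!
! the global table sends spoke 1's subnet into the 10 Mbps loop...
ip route 10.1.1.0 255.255.255.0 192.168.101.2
! ...and VRF AGG would then need its own route onward toward the Metro-E-facing port,
! which also sits in VRF AGG and is rate limited to the hub's aggregate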

An ISR, with its much more advanced QoS, does the above in software, and you can have more than 4 different classes.

Hi Joseph,

Thank you for the detailed response.

I moved the VPLS to one of the uplink ports and created queue-set 2 with the buffer reservations for the unused queues zeroed out, and I'm no longer seeing output drops on the interface, so this worked. However, traffic is now suffering from major delays because of the long queues, and the delay seems to be causing more issues for TCP than the packet loss did.

I think my only solution is to connect an ISR to the VPLS and then use hierarchical QoS to tackle this.
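Something along these lines is what I'm thinking of for the ISR (the class names, subnets, and rates are placeholders for the real spoke networks):

ip access-list extended TO-SPOKE1
 permit ip any 10.1.1.0 0.0.0.255
!
class-map match-all SPOKE1
 match access-group name TO-SPOKE1
!
! child policy: per-spoke shaping (one class per remote site)
policy-map PER-SPOKE
 class SPOKE1
  shape average 20000000
!
! parent policy: shape the aggregate to the hub's access rate, then apply the child
policy-map VPLS-AGGREGATE
 class class-default
  shape average 40000000
  service-policy PER-SPOKE
!
interface GigabitEthernet0/0
 description Metro-E VPLS hand-off
 service-policy output VPLS-AGGREGATE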

Also, will using the ISP's QoS be of any benefit here?

Thank you


Also, will using the ISP's QoS be of any benefit here?

That depends on what the SP offers.

I will do some investigation with the ISP as well then.

Appreciate your time
