QoS questions - choppy voice across WAN

wilson_1234_2 · ‎05-05-2011

Experiencing choppy voice across an MPLS network.

Assuming the carrier is not the problem, I would like to verify the existing QoS policy is
ok, along with some questions to ensure I am understanding QoS.

Currently the existing set up has a hub site with an Avaya phone system and end users
uplinking the phones on dedicated 3560 PoE access switches. The 3560s have only the Avaya
phones connected. They are configured as access ports, with a default auto QoS config enabled
per Cisco markings.

The 3560 access switches uplink to a 6509 with QoS DISABLED, no QoS. The 6509 is set up
as a collapsed core/distribution layer and links to a 3845 router that connects to several
branches across an MPLS network. The 3845 has a standard QoS config per standard DSCP
markings.

The branch is configured much the same way, but has dedicated 3750s as the voice
switches, configured as access ports. At both the hub site and the branch, the workstations
are on different switches; the phones are uplinked to access ports on dedicated switches.

First of all, I have some general questions regarding QoS:

1. Is my understanding correct that QoS at the Access layer, would be configured as layer
2 COS and the mapping on the switch (shown below), sets the layer 2 to layer 3
associations?:

mls qos map cos-dscp 0 8 16 24 32 46 48 54

For example, the above line maps
COS 0, 1, 2, 3, 4, 5, 6, 7
to
dscp 0, 8, 16, 24, 32, 46, 48, 54

so that (for example) COS 5 is equal to dscp 46?

2. Setting the switchport that uplinks the WAN router to "trust dscp", ensures the COS to
dscp mapping from layer 2 to layer 3 stays intact?

3. As mentioned in the above, the hub distribution/core 6509 does not have QoS enabled. Is
this a problem, or does the fact that the packets are marked in the access switch where
the phones uplink (QoS enabled) via the COS map, satisfy the QoS requirement for the voice
traffic, and this marking is carried through to the router?

4. Do the phone ports being configured as access ports and not access/voice ports affect
QoS at all? The Avaya phones are uplinked directly to the 3560 as access ports, with no
voice vlan configured. The access port vlan carries the voice traffic on dedicated 3560s.
The PCs are in a different vlan and on a different switch, so the voice vlan is not being utilized on the 3560s.

5. If Avaya sets voice to COS 5 and Cisco sets it to COS 6, how would I configure this to
match up?
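
On question 1, the eight values in the cos-dscp map line are positional: the first is the DSCP for COS 0, the last the DSCP for COS 7. A minimal Python sketch of that lookup, using the line exactly as quoted above (the helper name `parse_cos_dscp_map` is made up for illustration and is not part of any Cisco tool):

```python
def parse_cos_dscp_map(config_line):
    """Return {cos: dscp} from an 'mls qos map cos-dscp ...' line."""
    prefix = "mls qos map cos-dscp"
    line = config_line.strip()
    if not line.startswith(prefix):
        raise ValueError("not a cos-dscp map line")
    values = [int(v) for v in line[len(prefix):].split()]
    if len(values) != 8:
        raise ValueError("expected 8 DSCP values, one per CoS 0-7")
    # Position in the list is the CoS value.
    return dict(enumerate(values))


cos_to_dscp = parse_cos_dscp_map("mls qos map cos-dscp 0 8 16 24 32 46 48 54")
print(cos_to_dscp[5])  # 46
```

Note that the class selector matching COS 7 (CS7) is DSCP 56, so the final 54 in the quoted line may be worth double-checking against the running config.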

Roman Rodichev · ‎05-05-2011

If I were you, I would switch all 3560 ports from "mls qos trust cos" to "mls qos trust dscp", including the uplink port to the 6509. This ensures that the DSCP markings set by the Avaya phones (more important) and the DSCP markings on traffic coming in from the WAN (less important) are left unchanged. If the 6509 has QoS disabled (no global "mls qos" command), it's not a big deal, except that you don't get queueing on the 6509, which is probably OK for your LAN. When QoS is disabled, the switch doesn't touch markings, so the "mls qos trust" command on the 6509 will not do anything.

Either way, I would set up RITE on the router, capture traffic into a file, copy the file to your PC, and look at it with Wireshark (http://www.cisco.com/en/US/docs/ios/12_4t/12_4t11/ht_rawip.html) to make sure that phone traffic coming from the LAN is marked appropriately.

More importantly, check exactly how Avaya marks voice packets. I've seen many installs with non-Cisco voice systems where the markings are screwed up and don't get matched by the WAN QoS policy. Make sure voice packets are DSCP EF (46) and voice control packets are either DSCP CS3 (24) or AF31 (26). Cisco CallManager used AF31 for voice control in the past and switched to CS3 on newer systems. Avaya might have it set to CS3, AF31, CS4, AF41, or even EF. Check it. Some phone systems adjust marking dynamically; I would disable that feature if that's the case.
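
For reference, the decimal values behind those names follow two simple rules: CSn is 8 × n, and AFxy is 8 × x + 2 × y. A small Python sketch (function names are illustrative only) that reproduces the values quoted above:

```python
def class_selector(n):
    """CSn: precedence n in the top three bits of the DSCP, i.e. 8 * n."""
    return n << 3


def assured_forwarding(af_class, drop_precedence):
    """AFxy = 8 * x + 2 * y (x = class 1-4, y = drop precedence 1-3)."""
    return 8 * af_class + 2 * drop_precedence


EF = 46  # Expedited Forwarding is a fixed codepoint

markings = {
    "EF": EF,
    "CS3": class_selector(3),
    "AF31": assured_forwarding(3, 1),
    "CS4": class_selector(4),
    "AF41": assured_forwarding(4, 1),
}

for name, dscp in markings.items():
    print(f"{name:5} DSCP {dscp}")
```

Matching these numbers against the values seen in a capture is a quick way to tell which marking the phone system is really using.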

sean_evershed · ‎05-05-2011

Hi,

I would not make the assumption that there is no problem with the carrier.

It's important that QoS markings are retained end to end from source to destination.

You need to check what QoS SLA you have purchased from your carrier. Your router's QoS markings need to match these.

If they don't, the carrier will remark them to a lower value or drop them altogether. Then, when the link becomes congested, packets will be dropped.

A good way to check whether a packet marked as EF at the branch still retains this marking when it reaches the head office is to use Netflow.

I suggest you use the command show policy-map interface XXX to check where packets are being dropped on the router.

Also review how much bandwidth you have carved out on your link for voice traffic. This needs to match how many calls are expected to be made during the busy hour. I have seen some recommendations where this needs to be as high as 33% for voice and up to 10% for voice signalling.
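
As a rough sizing aid, the arithmetic can be sketched in Python. Everything numeric here except the 33% figure is an assumption for illustration: G.711 with 20 ms packetization (160-byte payload, 50 packets per second, 40 bytes of IP/UDP/RTP headers, so 80 kbps per call at layer 3), layer 2 overhead ignored, and a hypothetical 1536 kbps link. Substitute your own codec and circuit size.

```python
def l3_call_kbps(payload_bytes=160, packets_per_second=50, header_bytes=40):
    """Per-call bandwidth at layer 3 in kbps (defaults assume G.711, 20 ms)."""
    return (payload_bytes + header_bytes) * 8 * packets_per_second / 1000


def max_calls(link_kbps, voice_fraction, per_call_kbps):
    """Whole calls that fit in the share of the link reserved for voice."""
    return int(link_kbps * voice_fraction // per_call_kbps)


per_call = l3_call_kbps()                # 80.0 kbps under the assumptions above
print(max_calls(1536, 0.33, per_call))   # 6
```

If the busy-hour call count is higher than the result, the voice class is undersized and calls over the limit will compete with, or be dropped ahead of, other traffic.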

Kishore Chennupati · ‎05-05-2011

Hi ,

I completely agree with Sean that you need to ensure that your ISP is honoring your markings and not tampering with them.

You can run some tests by sending pings with different TOS values and checking whether they reach the destination unchanged (check in NetFlow). Do the same test in the other direction as well.
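
If a host is handier than a router for this test, marked probes can also be generated with a few lines of Python. This is only a sketch: it sends UDP rather than ICMP, the function names are made up, and it relies on the operating system honoring the IP_TOS socket option (Linux and most Unix-like systems do; some others ignore it).

```python
import socket


def tos_for_dscp(dscp):
    """The ToS byte carries the DSCP in its upper six bits."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be between 0 and 63")
    return dscp << 2


def send_marked_probe(dest_ip, dest_port, dscp, payload=b"qos-probe"):
    """Send one UDP datagram with the requested DSCP set in the IP header."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, tos_for_dscp(dscp))
        sock.sendto(payload, (dest_ip, dest_port))
    finally:
        sock.close()
```

For example, `send_marked_probe("192.0.2.10", 50000, 46)` (a placeholder address and port) sends an EF-marked packet; capture at the far end and compare the DSCP you receive with the one you sent.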

Choppy voice can also result from bad interfaces, speed or duplex mismatches, interface errors, and so on. It doesn't necessarily have to be misconfigured QoS.

For example, you can have all your QoS configured properly, but if the physical layer is bad you can still have issues.

If you are facing any particular issue, post it here and we can try to help you.

HTH

Regards

Kishore

please rate if helpful

wilson_1234_2 · ‎05-18-2011

Thanks for the reply.

I have been able to verify the Avaya phone is marking the voice as dscp 46.

According to Avaya:

"The Avaya IP Phones uses a p-bit value of 6 and DSCP value of 46 for Voice Media. For Voice Signaling, the Avaya IP Phone uses a p-bit value of 5 and DSCP value of 40."

I do not see any cos marking when capturing packets with Wireshark. If the phone was marking cos, I should be able to see it with Wireshark, correct?

Since I see the dscp marking, your suggestion to configure the ports to "trust dscp" makes sense.

Also, can you assist with clearing up some misunderstanding about what exactly the switch is doing with the cos map and the different trust options?

I have seen several different explanations.

I have seen that if the switch is set to "trust cos" on a port, this would be the reason for using a cos-dscp map. The reason is that the switch will prioritize based on the dscp value, so ingress traffic marked with a cos value needs to be translated to something the switch can understand.

Is that correct?

lgijssel · ‎05-18-2011

As you state the phones are connected to access ports, you cannot use the 'trust cos' setting.

Expl: The switch considers Untagged packets as having a cos of zero. DSCP is then rewritten accordingly.

This explains your trace with all packets marked as dscp 0.

When using 'trust dscp', the situation is reversed i.e. cos is rewritten based on dscp. This should deliver better results.

regards,

Leo
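
The behavior Leo describes can be written down as a small decision function. This is a deliberately simplified model, not a description of the switch internals: it uses the cos-dscp map quoted earlier in the thread, assumes a port default CoS of 0, and treats any port with no trust configured as resetting the marking to 0.

```python
# cos-dscp map as quoted earlier in the thread (index = CoS)
THREAD_COS_DSCP = [0, 8, 16, 24, 32, 46, 48, 54]


def resulting_dscp(trust, packet_dscp, frame_cos=None,
                   cos_dscp=THREAD_COS_DSCP, port_default_cos=0):
    """DSCP the packet leaves with, under a simplified trust model.

    frame_cos is None for an untagged frame (no 802.1Q header, so no CoS).
    """
    if trust == "dscp":
        return packet_dscp                    # marking kept as received
    if trust == "cos":
        cos = port_default_cos if frame_cos is None else frame_cos
        return cos_dscp[cos]                  # DSCP rewritten from CoS
    return 0                                  # untrusted: marking reset


print(resulting_dscp("cos", 46))               # untagged frame -> 0
print(resulting_dscp("cos", 46, frame_cos=5))  # tagged with CoS 5 -> 46
print(resulting_dscp("dscp", 46))              # -> 46
```

The first case is the access-port situation in this thread: an untagged frame from the phone carries no CoS, so trusting CoS ends up discarding the phone's EF marking.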

wilson_1234_2 · ‎05-18-2011

But I do see dscp marked as 46 in the trace, I do not see anything for the cos value in the captured packet.

This is coming from an Avaya phone and wireshark

Differentiated Services Field: 0xb8 (DSCP 0x2e: Expedited Forwarding; ECN: 0x00)
1011 10.. = Differentiated Services Codepoint: Expedited Forwarding (0x2e)
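
The 0xb8 in that capture line is the whole DS byte: the upper six bits are the DSCP and the lower two are ECN. A short Python sketch of the same decoding Wireshark performs (the function name is illustrative):

```python
def decode_ds_field(ds_byte):
    """Split the 8-bit DS field into (dscp, ecn)."""
    return ds_byte >> 2, ds_byte & 0x03


dscp, ecn = decode_ds_field(0xB8)
print(dscp, hex(dscp), ecn)  # 46 0x2e 0
```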

According to the Avaya documentation:

"The Avaya IP Phones uses a p-bit value of 6 and DSCP value of 46 for Voice Media."

Are you referring to "untagged" as VLAN tag? If so, then I should see a DSCP value of 0, correct?

wilson_1234_2 · ‎06-01-2011

Ok,

I have ensured that DSCP is trusted from endpoint to endpoint. I have captured traffic on the router on the branch end (where I am) and I have verified the rtp stream is getting marked on the source and destination packets as ef (46).

DSCP is trusted on the access ports and the uplinks.

One thing that is different from what I have seen before is that the uplinks are layer 3 from:

Access > Distribution > Core1 and Core 2 (via port group)

The distribution switch has a port group of two ports, one to core 1 and the other to core 2. I am not sure how the aggregate port would be affecting the voice traffic.

I still hear the skipping in a voice conversation. In the wireshark capture, it is showing a lot of jitter.

If I am seeing on my end that the traffic sourced on the remote side is marked as ef, then that would mean the traffic is getting marked correctly and staying intact all the way to my end, correct?

Unless it is getting re-marked somewhere along the way?

Also, is it possible that the traffic is getting marked and prioritized correctly with a good policy, but the queueing is incorrect or doesn't exist somewhere?

Or are they not two separate entities?

Joseph W. Doherty · ‎06-02-2011

Disclaimer

The Author of this posting offers the information contained within this posting without consideration and with the reader's understanding that there's no implied or expressed suitability or fitness for any purpose. Information provided is for informational purposes only and should not be construed as rendering professional advice of any kind. Usage of this posting's information is solely at reader's own risk

Liability Disclaimer

In no event shall Author be liable for any damages whatsoever (including, without limitation, damages for loss of use, data or profit) arising out of the use or inability to use the posting's information even if Author has been advised of the possibility of such damage.

Posting

If I am seeing on my end that the traffic sourced on the remote side is marked as ef, then that would mean the traffic is getting marked correctly and staying intact all the way to my end, correct?

Unless it is getting re-marked somewhere along the way?

Traffic that arrives with the same DSCP it was sent with has usually not been remarked along the way. Remarking and then restoring the original value is possible, but very unlikely.

Also, is it possible that the traffic is getting marked and prioritized correctly with a good policy, but the queueing is incorrect or doesn't exist somewhere?

Very possible, especially when working with a WAN cloud beyond your control. Again, if you're using a MPLS cloud, have you verified what treatment the MPLS vendor provides for your marked/tagged packets?

Or are they not two separate entities?

They are different. ToS markings (e.g. DSCP) alone don't force different treatment, and distinct treatment doesn't require ToS markings.
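
A toy illustration of that separation, in Python. It assumes five packets are already waiting when the link becomes free and compares a plain FIFO queue with a strict-priority scheduler keyed on DSCP 46; the packet names and the two-queue model are invented for the example and do not represent any particular device.

```python
def fifo(packets):
    """No QoS: packets leave in arrival order, whatever their marking."""
    return list(packets)


def strict_priority(packets, priority_dscp=46):
    """Packets with the priority DSCP are always sent before the rest."""
    high = [p for p in packets if p[1] == priority_dscp]
    low = [p for p in packets if p[1] != priority_dscp]
    return high + low


# (name, dscp) in arrival order
arrivals = [("data1", 0), ("data2", 0), ("voice1", 46),
            ("data3", 0), ("voice2", 46)]

fifo_order = [name for name, _ in fifo(arrivals)]
priority_order = [name for name, _ in strict_priority(arrivals)]
print(fifo_order)      # ['data1', 'data2', 'voice1', 'data3', 'voice2']
print(priority_order)  # ['voice1', 'voice2', 'data1', 'data2', 'data3']
```

The voice packets carry the same EF marking in both runs; only the scheduler decides whether they wait behind data, which is where the extra delay and jitter come from.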

Joseph W. Doherty · ‎05-18-2011

When there are VoIP issues across a WAN cloud, most often (but not always) they are caused by congestion events at WAN cloud ingress or egress since they are often common bandwidth bottlenecks.

When dealing with an MPLS cloud, especially if you have multipoint any-to-any communications, working with the MPLS vendor's supported QoS models is generally a necessity.

PS:

Some notes:

CoS is a L2 QoS marking which requires VLAN tagged frames.

DSCP is a L3 QoS (ToS) marking.

On many (all?) Cisco switches, having QoS support globally disabled means the switch just ignores QoS. If QoS support is globally enabled, usually (always?) on Cisco switches, by default, the ingress QoS marking is not trusted and is reset! CoS and/or ToS will generally be used for egress queue selection.

On routers, L3 ToS is usually left alone and ignored, unless the device is configured to do otherwise.

Both CoS and ToS can be modified along the way.

When possible, try to use just L3 ToS, since it is not tied to VLAN-tagged frames; however, pure L2 switches can often only work with CoS.
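
The first note above is also why no CoS shows up in a capture taken on an access port: the priority bits live inside the 802.1Q tag, and an untagged frame has no such tag. A minimal Python sketch that reads the priority (PCP) and VLAN ID from a raw Ethernet frame; the function name and the all-zero MAC addresses are placeholders for illustration:

```python
import struct


def vlan_priority(frame):
    """Return (pcp, vlan_id) for an 802.1Q-tagged frame, None if untagged."""
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != 0x8100:          # no 802.1Q tag, so nowhere to carry CoS
        return None
    (tci,) = struct.unpack_from("!H", frame, 14)
    return tci >> 13, tci & 0x0FFF   # top 3 bits = PCP, low 12 bits = VLAN ID


macs = bytes(12)  # placeholder destination + source MAC addresses
untagged = macs + struct.pack("!H", 0x0800)
tagged = macs + struct.pack("!HHH", 0x8100, (5 << 13) | 100, 0x0800)

print(vlan_priority(untagged))  # None
print(vlan_priority(tagged))    # (5, 100)
```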

wilson_1234_2 · ‎06-06-2011

Ok,

I have a little more information:

as mentioned above, the endpoints (branch and hub) are configured (from phone to switch to router) to trust DSCP.

I have the circuit provider information, which is an AT&T MPLS PVC and it is showing a CoS table.

There is an Ingress and Egress CoS profile.

If AT&T is showing their QoS in a CoS format, would their CoS be recognized by our DSCP config?

And vice versa, would they be providing their CoS to us in DSCP format?

I can see some calls from the remote end are sourcing the ef as 46, but if they are only using CoS, then my thinking is that the marking we applied on the remote end could stay intact, but not get prioritized while going through the AT&T CoS priority queues.

Am I anywhere near correct on this, or all wet?

Joseph W. Doherty · ‎06-06-2011

When you describe AT&T CoS, you're discussing their class of service template(s), not L2 CoS, correct? If you are, I recall what generally happens: they preserve your L3 ToS (DSCP) marking, but if it's not one they map into one of their dedicated classes, I believe it's treated as Best Effort. One exception might be for EF; some vendors will drop it if over contract, but I don't recall AT&T policy. They might have a document they can provide explaining how their QoS/CoS model works.
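
The "preserved but possibly not prioritized" behavior can be pictured as a lookup with a default. The class table below is entirely hypothetical and is NOT AT&T's actual profile; the real DSCP-to-class mapping has to come from the provider's CoS documentation.

```python
# Hypothetical provider class table, for illustration only.
HYPOTHETICAL_PROVIDER_CLASSES = {
    "real-time": {46},
    "business-critical": {24, 26},
}


def provider_class(dscp, classes=HYPOTHETICAL_PROVIDER_CLASSES):
    """Class a packet lands in; unmapped markings fall to best effort."""
    for name, members in classes.items():
        if dscp in members:
            return name
    return "best-effort"


print(provider_class(46))  # real-time
print(provider_class(40))  # best-effort (marking intact, but no priority)
```

The second call uses DSCP 40, the signalling value quoted from the Avaya documentation earlier in the thread: under this made-up table it would cross the cloud unchanged yet receive no special treatment, which is the situation described above.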