02-04-2017 04:44 PM - edited 03-08-2019 09:11 AM
Hello Experts,
I have a question regarding packet loss and Cisco QOS. I am having a packet loss issue. I have a Cisco VOIP system running with no issues. I have a 1Gb fibre link to a remote office across town. We have had this setup for several years without any problems. Recently we have installed a Netapp FAS in the remote office as a backup NAS.
The backups are scheduled to take place after hours at night. When I run the backup I am noticing quite a bit of packet loss which is to be expected. The backup session is opening multiple streams to connect to the backup device and the QOS is countering it. (As to be expected).
Looking at both the core and remote switch configurations I can see that mls qos is enabled. It was installed by the phone company. I understand that it is enabled globally and is required for the VoIP system to operate. Is there any way to bypass it on a specific port? I am also investigating a way of splitting the fibre between the sites to isolate the traffic. I was just wondering if anyone had any suggestions?
Thanks
02-05-2017 03:58 AM
Would it be possible to post sanitized configs of the core and remote switch?
02-06-2017 09:08 AM
Do you need the whole config or any specific areas? I have copied and pasted the mls qos section and the interface config for the port to the remote office. (Core switch config section)
!
interface GigabitEthernet1/1/4
 description Uplink_To_RemoteOffice
 switchport trunk encapsulation dot1q
 switchport trunk native vlan 4
 switchport trunk allowed vlan 2,4,15,26,128,172
 switchport mode trunk
 srr-queue bandwidth share 1 70 25 5
 srr-queue bandwidth shape 33 0 0 0
 queue-set 2
 priority-queue out
 mls qos trust dscp
 ip igmp filter 2
!
!
mls qos map policed-dscp 0 10 18 to 8
mls qos map cos-dscp 0 8 16 24 32 46 46 56
mls qos srr-queue output cos-map queue 1 threshold 3 4 5
mls qos srr-queue output cos-map queue 2 threshold 1 2
mls qos srr-queue output cos-map queue 2 threshold 2 3
mls qos srr-queue output cos-map queue 2 threshold 3 6 7
mls qos srr-queue output cos-map queue 3 threshold 3 0
mls qos srr-queue output cos-map queue 4 threshold 3 1
mls qos srr-queue output dscp-map queue 1 threshold 3 32 33 40 41 42 43 44 45
mls qos srr-queue output dscp-map queue 1 threshold 3 46 47
mls qos srr-queue output dscp-map queue 2 threshold 1 16 17 18 19 20 21 22 23
mls qos srr-queue output dscp-map queue 2 threshold 1 26 27 28 29 30 31 34 35
mls qos srr-queue output dscp-map queue 2 threshold 1 36 37 38 39
mls qos srr-queue output dscp-map queue 2 threshold 2 24
mls qos srr-queue output dscp-map queue 2 threshold 3 48 49 50 51 52 53 54 55
mls qos srr-queue output dscp-map queue 2 threshold 3 56 57 58 59 60 61 62 63
mls qos srr-queue output dscp-map queue 3 threshold 3 0 1 2 3 4 5 6 7
mls qos srr-queue output dscp-map queue 4 threshold 1 8 9 11 13 15
mls qos srr-queue output dscp-map queue 4 threshold 2 10 12 14
mls qos queue-set output 1 threshold 1 100 100 50 200
mls qos queue-set output 1 threshold 2 125 125 100 400
mls qos queue-set output 1 threshold 3 100 100 100 400
mls qos queue-set output 1 threshold 4 60 150 50 200
mls qos queue-set output 1 buffers 15 25 40 20
mls qos
!
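For anyone reading the config above, here is a minimal Python sketch (not part of the original post) that turns the posted dscp-map lines into a lookup table and works out the relative SRR shares from "srr-queue bandwidth share 1 70 25 5". The names build_dscp_map and srr_shares are made up for illustration. The share calculation assumes that, with "priority-queue out" enabled, queue 1's weight drops out of the ratio, which is my understanding of the platform documentation rather than something stated in this thread.

```python
import re

# dscp-map lines copied from the core switch config posted above
DSCP_MAP_LINES = """\
mls qos srr-queue output dscp-map queue 1 threshold 3 32 33 40 41 42 43 44 45
mls qos srr-queue output dscp-map queue 1 threshold 3 46 47
mls qos srr-queue output dscp-map queue 2 threshold 1 16 17 18 19 20 21 22 23
mls qos srr-queue output dscp-map queue 2 threshold 1 26 27 28 29 30 31 34 35
mls qos srr-queue output dscp-map queue 2 threshold 1 36 37 38 39
mls qos srr-queue output dscp-map queue 2 threshold 2 24
mls qos srr-queue output dscp-map queue 2 threshold 3 48 49 50 51 52 53 54 55
mls qos srr-queue output dscp-map queue 2 threshold 3 56 57 58 59 60 61 62 63
mls qos srr-queue output dscp-map queue 3 threshold 3 0 1 2 3 4 5 6 7
mls qos srr-queue output dscp-map queue 4 threshold 1 8 9 11 13 15
mls qos srr-queue output dscp-map queue 4 threshold 2 10 12 14
""".splitlines()

PATTERN = re.compile(r"dscp-map queue (\d) threshold (\d) ([\d ]+)$")


def build_dscp_map(lines):
    """Return {dscp: (queue, threshold)} using the config's 1-4 queue numbers."""
    table = {}
    for line in lines:
        match = PATTERN.search(line.strip())
        if match:
            queue, threshold = int(match.group(1)), int(match.group(2))
            for dscp in match.group(3).split():
                table[int(dscp)] = (queue, threshold)
    return table


def srr_shares(weights, priority_queue_out=True):
    """Relative shares of the shared-mode queues, as fractions summing to 1.

    Assumption: when the priority queue is enabled, queue 1 is served first
    and its weight is left out of the ratio.
    """
    queues = dict(enumerate(weights, start=1))
    if priority_queue_out:
        queues.pop(1)
    total = sum(queues.values())
    return {queue: weight / total for queue, weight in queues.items()}


DSCP_TO_QUEUE = build_dscp_map(DSCP_MAP_LINES)
SHARES = srr_shares([1, 70, 25, 5])

print("DSCP 0  ->", DSCP_TO_QUEUE[0])
print("DSCP 46 ->", DSCP_TO_QUEUE[46])
print("shares  ->", SHARES)
```

Under those assumptions, unmarked (DSCP 0) traffic such as a backup stream lands in queue 3 at threshold 3, and queue 3 is weighted at 25 out of the 100 shared between queues 2, 3 and 4. DSCP values that are absent from the posted lines (for example 25) are simply missing from the table here; on the switch they would follow whatever mapping is otherwise in effect.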
02-07-2017 04:45 AM
I agree with Mr. Doherty that QoS may not be the problem. However, QoS in this form can be confusing, to say the least. I like the idea of removing the QoS just to see how the backups run. Here's a link that may help explain the drops you are seeing:
http://www.cisco.com/c/en/us/support/docs/switches/catalyst-3750-series-switches/116089-technote-switches-output-drops-qos-00.html
Aside from that, I know you do the backups after hours, but is there anything else going on while the backups are running? Are the backups using DSCP 0? Could you post the output of "sh mls qos int g1/1/4 statistics"? It may provide a clearer picture of what traffic is being dropped.
If you want to start fresh, you can also clear these counters with the "clear mls qos int g1/1/4 statistics" command.
02-07-2017 11:37 AM
No, there isn't anything else running after hours. From what I can see, it looks like the backups are using DSCP 0.
sh mls qos int g1/0/5
GigabitEthernet1/0/5
trust state: not trusted
trust mode: not trusted
trust enabled flag: ena
COS override: dis
default COS: 0
DSCP Mutation Map: Default DSCP Mutation Map
Trust device: none
qos mode: port-based
I reset the port statistics on the remote office switch and ran a backup. Here is a copy of the mls qos statistics:
sh mls qos int g1/0/5 statistics
GigabitEthernet1/0/5 (All statistics are in packets)
dscp: incoming
-------------------------------
0 - 4 : 325797757 0 0 0 15
5 - 9 : 0 0 0 0 0
10 - 14 : 0 0 0 0 0
15 - 19 : 0 0 0 0 0
20 - 24 : 0 0 0 0 0
25 - 29 : 0 0 0 0 0
30 - 34 : 0 0 0 0 0
35 - 39 : 0 0 0 0 0
40 - 44 : 0 0 0 0 0
45 - 49 : 0 0 0 0 0
50 - 54 : 0 0 0 0 0
55 - 59 : 0 0 0 0 0
60 - 64 : 0 0 0 0
dscp: outgoing
-------------------------------
0 - 4 : 606794600 0 0 0 15
5 - 9 : 0 0 0 0 0
10 - 14 : 0 0 0 0 0
15 - 19 : 0 0 0 0 0
20 - 24 : 0 0 0 0 0
25 - 29 : 0 0 0 0 0
30 - 34 : 0 0 0 0 0
35 - 39 : 0 0 0 0 0
40 - 44 : 0 0 0 0 0
45 - 49 : 0 0 0 0 0
50 - 54 : 0 0 0 0 0
55 - 59 : 0 0 0 0 0
60 - 64 : 0 0 0 0
cos: incoming
-------------------------------
0 - 4 : 325797910 0 0 0 0
5 - 7 : 0 0 0
cos: outgoing
-------------------------------
0 - 4 : 606794626 0 0 0 0
5 - 7 : 0 0 5015
output queues enqueued:
queue: threshold1 threshold2 threshold3
-----------------------------------------------
queue 0: 0 0 0
queue 1: 0 269 5385
queue 2: 0 0 606794626
queue 3: 0 0 0
output queues dropped:
queue: threshold1 threshold2 threshold3
-----------------------------------------------
queue 0: 0 0 0
queue 1: 0 0 0
queue 2: 0 0 0
queue 3: 0 0 0
Policer: Inprofile: 0 OutofProfile:
It looks like quite a bit of deferment. I don't have in-depth knowledge of mls qos. As a next step, I'm planning to disable the QoS and run a backup to see what that looks like. Thanks for your help.
02-07-2017 11:47 AM
No problem.
By the way, the output you posted isn't from the G1/1/4 uplink to the remote office, and it doesn't show any packets being dropped. Are the interfaces between the two locations indicating that packets are being dropped?
02-07-2017 12:05 PM
I checked both interfaces (port status) and there are no packet drops, CRC errors, or anything else showing on either interface. Here are the mls qos statistics for G1/1/4:
sh mls qos int g1/1/4 st
GigabitEthernet1/1/4 (All statistics are in packets)
dscp: incoming
-------------------------------
0 - 4 : 81400189 0 0 0 5
5 - 9 : 0 0 0 0 0
10 - 14 : 0 0 0 0 0
15 - 19 : 0 0 0 0 0
20 - 24 : 0 0 0 0 52014937
25 - 29 : 0 2083370 0 0 0
30 - 34 : 0 0 0 0 0
35 - 39 : 0 0 0 0 0
40 - 44 : 0 0 0 0 0
45 - 49 : 0 122442725 0 597191 0
50 - 54 : 0 0 0 0 0
55 - 59 : 0 0 0 0 0
60 - 64 : 0 0 0 0
dscp: outgoing
-------------------------------
0 - 4 : 501942793 0 0 0 734
5 - 9 : 0 0 0 0 0
10 - 14 : 0 0 0 0 0
15 - 19 : 0 0 0 0 0
20 - 24 : 0 0 0 0 38903382
25 - 29 : 0 379168 0 0 0
30 - 34 : 0 0 0 0 0
35 - 39 : 0 0 0 0 0
40 - 44 : 8 0 0 0 0
45 - 49 : 0 122395063 0 5240041 0
50 - 54 : 11 0 0 0 109
55 - 59 : 0 29 0 0 0
60 - 64 : 0 0 0 0
cos: incoming
-------------------------------
0 - 4 : 908709001 0 0 54098323 0
5 - 7 : 122442699 8940 47033688
cos: outgoing
-------------------------------
0 - 4 : 593288785 0 0 39282550 0
5 - 7 : 122395071 5240161 22278770
output queues enqueued:
queue: threshold1 threshold2 threshold3
-----------------------------------------------
queue 0: 0 0 122395022
queue 1: 65053894 40585904 33865208
queue 2: 0 0 590889754
queue 3: 0 0 27448723
output queues dropped:
queue: threshold1 threshold2 threshold3
-----------------------------------------------
queue 0: 0 0 0
queue 1: 0 0 0
queue 2: 0 0 17702776
queue 3: 0 0 0
Policer: Inprofile: 0 OutofProfile:
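To put the G1/1/4 counters above in proportion, here is a small Python sketch (not from the original post) that compares dropped packets against enqueued packets per queue. The counters are copied from the output above; the function name drop_ratios is made up for illustration. Two assumptions are worth stating: the ratio is simply dropped divided by enqueued, since the output does not say whether the enqueued counter includes dropped packets, and the printed queue numbers 0-3 are taken to correspond to config queues 1-4. The second assumption fits the data, because printed queue 0 carries roughly the same packet count as outgoing DSCP 46.

```python
# Counters copied from the "sh mls qos int g1/1/4 st" output above.
# Keys are the queue numbers as printed (0-3); tuples are thresholds 1-3.
ENQUEUED = {
    0: (0, 0, 122395022),
    1: (65053894, 40585904, 33865208),
    2: (0, 0, 590889754),
    3: (0, 0, 27448723),
}
DROPPED = {
    0: (0, 0, 0),
    1: (0, 0, 0),
    2: (0, 0, 17702776),
    3: (0, 0, 0),
}


def drop_ratios(enqueued, dropped):
    """Return {queue: dropped / enqueued}, summed over all thresholds."""
    ratios = {}
    for queue, enq in enqueued.items():
        total_enqueued = sum(enq)
        total_dropped = sum(dropped[queue])
        ratios[queue] = total_dropped / total_enqueued if total_enqueued else 0.0
    return ratios


RATIOS = drop_ratios(ENQUEUED, DROPPED)
for queue, ratio in sorted(RATIOS.items()):
    # Assumed offset: printed queue N corresponds to config queue N + 1.
    print(f"printed queue {queue} (config queue {queue + 1}): {ratio:.2%}")
```

Only printed queue 2 shows drops, at roughly 3% of what was enqueued. If the numbering assumption holds, that is config queue 3, the queue that DSCP 0 traffic is mapped to in the posted config.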
02-14-2017 09:37 AM
Update - We removed the QoS settings temporarily and let the backup run over the weekend. It completed in 10 hours without any snags. Previously it would choke and die halfway through the process. The next step is to work with our Cisco expert on tweaking the QoS.
Thanks for your help and suggestions on this.
02-06-2017 10:10 AM
I'm unsure what problem you're actually trying to solve.
Is the problem that you think QoS is causing more packet loss for your backup than you would otherwise see without it? Or is the problem that your NAS backups are now causing drops for the non-NAS traffic?
Generally speaking, correctly implemented QoS should be of some benefit. When there's no benefit, either QoS isn't implemented correctly or it's an issue that QoS cannot address.
02-06-2017 10:47 AM
Yes, I believe that the QoS is causing the packet loss. I am noticing large amounts of packet loss when the backups are running. Otherwise everything else is working nicely. I am working on setting up an isolated interface that doesn't have QoS on it to confirm it.
02-06-2017 05:12 PM
If you're running NAS backups, some drops are likely.
However, QoS correctly configured often can minimize drops, but unsuitable QoS can increase drops. I.e. if you see drops decrease when you remove QoS, it's more likely your QoS isn't optimal for your situation, not that QoS, itself, is the problem.