11-05-2021 03:59 AM
Hello,
I have an issue with some packet loss between two switches: a 6509 and a 3600. The loss is small, around ~0.2%, but it is causing issues with some of our equipment connected to these switches.
The switches are connected via a 2x 1 Gbps LACP link with CWDM SFPs.
The loss appears only during peak time, not at night. The link is not saturated; traffic peaks at about 600 Mbps across both ports.
I see some output packet drops on the interfaces. What could be causing this? Any pointers on things I should check further?
Thank you in advance for your help.
Kind regards
11-05-2021 09:23 AM
"The traffic on the link is not saturated, it is max at 600mbps on both ports."
Ah, with dual gig links, the peak capacity is 2 Gbps. Something like 600 Mbps is a usage average over some time period (often 5 minutes).
Drops happen down at the millisecond level, so "normal" peak usage averages don't show "burst" loading.
What might help you to understand a common cause is reading about "microbursts".
Basically, if the aggregate traffic offered to an interface can exceed the interface's bandwidth, you can have drops, even with "low" overall utilization.
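One low-effort way to make bursts more visible (assuming your IOS supports it) is to shorten the interface load-averaging window from its 5-minute default. A sketch, using the interface name from your later output:

```
! Shorten the averaging window so the txload/rate figures in
! "show interface" react to bursts faster (still an average -
! millisecond microbursts will remain invisible)
interface GigabitEthernet3/24
 load-interval 30
```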
If you're wondering whether there is anything you can do to mitigate this, beyond providing more bandwidth, the answer is yes - sometimes. For example, you mention having a dual-link EtherChannel: is your hashing choice optimal for your traffic, assuming your device offers an optimal choice for it?
You might (if supported on your device) increase interface buffer/queue resources (which generally increases occasional latency while decreasing drops).
You might use QoS techniques to "protect" important/fragile traffic, and/or use early or tiered dropping to better manage traffic rates and (sometimes) diminish (not eliminate) overall drops.
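As a sketch of the early-dropping idea (assuming a 6500 LAN port with WRR transmit queues; the thresholds here are illustrative, not recommendations, and supported commands depend on the line card - verify with "show queueing interface" first):

```
! Illustrative only: enable WRED-style early dropping on transmit
! queue 1, with example min/max thresholds per drop priority
interface GigabitEthernet3/24
 wrr-queue random-detect 1
 wrr-queue random-detect min-threshold 1 40 70
 wrr-queue random-detect max-threshold 1 70 100
```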
11-05-2021 04:29 AM
A couple of things to check:
Physical:
1. Check the patch leads.
2. Try reseating or replacing the SFP modules (both sides, one at a time).
Configuration side:
1. Look at the configured LACP load-balancing method.
2. See which interfaces are showing drops (input or output):
show interface gi 0/X (both) - if possible, post the output here to help understand the issue.
Simple test:
1. Shut down one of the interfaces in the port-channel (since your traffic is less than 1 Gbps, you should not see any issue). If the packet loss is gone, you know which port is the culprit; if not, bring that interface back up and shut down the other one.
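The one-member-at-a-time test above might look something like this (interface names are placeholders - substitute your port-channel members):

```
configure terminal
 interface GigabitEthernet0/1
  shutdown
! ...watch "Total output drops" on the remaining member for a while...
 interface GigabitEthernet0/1
  no shutdown
 interface GigabitEthernet0/2
  shutdown
```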
Finally, post the code version you are running along with the output below:
1. show version
2. show interface Gi x/x (from both switches)
3. How are you pinging from switch to switch? What are the source and destination IPs? Are they directly connected to the switches?
11-05-2021 09:46 AM
Hello,
Thank you for your reply. I already tried shutting down one port to force all the traffic onto a single port, but the issue remains the same on either port.
Here is the output of one of the port:
s01gal#sh int g3/24
GigabitEthernet3/24 is up, line protocol is up (connected)
  Hardware is C6k 1000Mb 802.3, address is 0027.0dcb.2287 (bia 0027.0dcb.2287)
  Description: --to-S01REG-CWDM-gray
  MTU 9216 bytes, BW 1000000 Kbit, DLY 10 usec,
     reliability 255/255, txload 178/255, rxload 10/255
  Encapsulation ARPA, loopback not set
  Keepalive set (10 sec)
  Full-duplex, 1000Mb/s, media type is CWDM-1470
  input flow-control is off, output flow-control is off
  Clock mode is auto
  ARP type: ARPA, ARP Timeout 04:00:00
  Last input 00:00:18, output 00:00:12, output hang never
  Last clearing of "show interface" counters 05:54:36
  Input queue: 0/2000/0/0 (size/max/drops/flushes); Total output drops: 2463168
  Queueing strategy: fifo
  Output queue: 0/40 (size/max)
I don't have any input or output errors but I see the Total output drops increasing quite a lot at peak time. Is it related to some queues?
Here are the versions of the switches:
3600#sh ver
Cisco IOS Software, ME360x Software (ME360x-UNIVERSALK9-M), Version 15.4(3)S5, RELEASE SOFTWARE (fc1)

6509#sh ver
Cisco IOS Software, s72033_rp Software (s72033_rp-ADVIPSERVICESK9-M), Version 15.1(2)SY10, RELEASE SOFTWARE (fc4)
Best
11-05-2021 10:12 AM - edited 11-05-2021 10:25 AM
Looking at your output :
s01gal#sh int g3/24
GigabitEthernet3/24 is up, line protocol is up (connected)
  Hardware is C6k 1000Mb 802.3, address is 0027.0dcb.2287 (bia 0027.0dcb.2287)
  Description: --to-S01REG-CWDM-gray
  MTU 9216 bytes, BW 1000000 Kbit, DLY 10 usec,
     reliability 255/255, txload 178/255, rxload 10/255
  Encapsulation ARPA, loopback not set
  Keepalive set (10 sec)
  Full-duplex, 1000Mb/s, media type is CWDM-1470
  input flow-control is off, output flow-control is off
  Clock mode is auto
  ARP type: ARPA, ARP Timeout 04:00:00
  Last input 00:00:18, output 00:00:12, output hang never
  Last clearing of "show interface" counters 05:54:36
  Input queue: 0/2000/0/0 (size/max/drops/flushes); Total output drops: 2463168
As mentioned by other posters, you likely have a good amount of traffic bursting over short periods of time.
Troubleshooting:
https://www.cisco.com/c/en/us/support/docs/routers/10000-series-routers/6343-queue-drops.html
QoS check:
https://www.cisco.com/c/en/us/support/docs/switches/catalyst-6000-series-switches/10587-73.html
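On the 6500 side, the per-queue drop counters (rather than the aggregate "Total output drops" in "show interface") can be inspected with the following, assuming the command is supported on your line card:

```
show queueing interface GigabitEthernet3/24
! Look at the per-queue drop counters to see which transmit
! queue/threshold is actually discarding packets
```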
11-05-2021 12:25 PM
Thank you for the links, I will have a look. Also, looking at the graph, I see that one of the LACP ports is loaded at around 700 Mbps while the other one is only at 300 Mbps. Is it possible to even out the load across the two ports somehow?
11-06-2021 01:05 AM
"Also looking at the graph I see that one of the port of the LACP is loaded at around 700mbps while the other one is only at 300mbps. Is it possible to even the load across the two ports somehow?"
Configuration side:
1. Look at the configured LACP load-balancing method.
2. See which interfaces are showing drops (input or output):
show interface gi 0/X (both) - if possible, post the output here to help understand the issue.
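To see and (if the platform supports other hash options) change the hash method, something like the following - "src-dst-ip" here is only an example; pick whichever option mixes your real traffic best:

```
show etherchannel load-balance
configure terminal
 port-channel load-balance src-dst-ip
```

Note that the hash is per-flow: traffic will spread better across the two members only if there are many distinct flows, and a single large flow still cannot be split across links.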
11-05-2021 10:01 AM
Thank you for your reply. Indeed, it could be microbursts, as the link is carrying internet traffic for many users. What I am seeing on the interfaces is Total output drops increasing; is this counter directly related to the queues?
GigabitEthernet3/24 is up, line protocol is up (connected)
  Hardware is C6k 1000Mb 802.3, address is 0027.0dcb.2287 (bia 0027.0dcb.2287)
  Description: --to-S01REG-CWDM-gray
  MTU 9216 bytes, BW 1000000 Kbit, DLY 10 usec,
     reliability 255/255, txload 188/255, rxload 9/255
  Encapsulation ARPA, loopback not set
  Keepalive set (10 sec)
  Full-duplex, 1000Mb/s, media type is CWDM-1470
  input flow-control is off, output flow-control is off
  Clock mode is auto
  ARP type: ARPA, ARP Timeout 04:00:00
  Last input 00:00:44, output 00:00:36, output hang never
  Last clearing of "show interface" counters 06:15:46
  Input queue: 0/2000/0/0 (size/max/drops/flushes); Total output drops: 2863062
  Queueing strategy: fifo
  Output queue: 0/40 (size/max)
What is the command to increase the output queue? Is the 6509 limited in its queue sizes? Would upgrading the hardware improve the problem? I was looking at Nexus switches as a replacement, as the 6509 is getting old.
Thank you for your help.
Best
11-05-2021 05:16 PM
"What I am seeing on the interfaces is a total output drops increasing, is this a counter related directly to the queues?"
Generally, yes.
"What is the command to increase output queue? Is the 6509 limited on queue sizes?"
Hmm, I don't recall whether 6500 LAN-type ports allow you to modify queue sizes. On those, as I recall, much is bound up in the line-card hardware, and capabilities and capacities depend heavily on the line card being used and even, on some line cards, on the actual ports being used (groups of Ethernet ports on a card are generally served by different ASICs).
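One way to see what the hardware on a given port actually offers before trying to tune anything (command availability varies by platform and line card):

```
! Shows, among other things, the tx/rx queue structure
! (e.g. tx-(1p3q8t)) the line-card hardware provides for this port
show interface GigabitEthernet3/24 capabilities
```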
"Would upgrading the hardware could improve the pb?"
Possibly, as might also happen with using different ports (on your existing line cards) or a different type of line card (this, and/or the ports being used, can make a big difference!). For dual classic-bus/fabric cards, the primary usage mode selected (i.e., bus or fabric) can sometimes make a difference. Also, with 6500s, whether the card has a DFC might impact drops (though DFCs generally have more to do with forwarding capacity).
"I was looking at the nexus switches for replacement as the 6509 is getting old."
Generally, a Nexus has much better hardware for high capacity than a Catalyst, but often with a reduced feature set.
11-05-2021 10:22 AM
Hello,
do you have any sort of QoS configured on either switch? Post the full running configs of both devices...