Cat 9300L ver 17.06.04, issues with laggy user experience

mseanmiller
Level 1

Good afternoon,

Looking for any advice on my issues. We have a pair of C9300L-48P-4X-A switches at our core, plus three compact 9200CX switches and one 9200L-48. They replaced a pair of 3750v2s and some older 3750 and 2960 switches, and we had no issues prior to the swap.

We have a 40 Mbps MPLS circuit connected to a 4321 ISR router with QoS enabled on the edge for internal and VoIP systems. We have a 100 Mbps circuit for internet traffic through a Fortigate firewall. Both the router and firewall are using OSPF, with the site's device gateways pointed to the core switch and the core switch pointed to the firewall.

The site has a flat /24 network using VLAN 1, pretty basic. We are about to roll out NAC and 12 new VLANs, but only have the L2 VLANs and L3 SVIs up and routing at this time.

Our issue is that after the swap, our customers started to notice applications lagging or losing O365 authentication, and voice traffic has also been adversely affected, seemingly by lost packets. (Both internet and intranet are laggy.)

The show interface command doesn't show many output drops compared to the issues being reported, but we were seeing a lot of Drop-TH2 errors on some interfaces, though not all.

Per TAC, we enabled "qos queue-softmax-multiplier 1200" and had a little bit of relief. We also found the uplink to the firewall was running half duplex and fixed that.

We haven't had to enable QoS on switches before and wouldn't think we would need to, since everything was working well on the 3750s.

TAC doesn't seem to have much advice other than enabling QoS on the switch.

I did run the Cisco CLI Analyzer and it found the punts below, but I couldn't get anything back from TAC on the cause or remedy yet.

Cat9K_Punt_cause_drops_analyzer

This C9300L-48P-4X device running 17.6.4 experienced the following problem:

Following drops are found. Please check 'show platform software fed switch active punt cause summary' command
Drop found 103964 , cause info Glean adjacency.


Following non zero values found. Please check 'show platform software fed switch active punt cpuq all' command
Non zero value 2758 found for RX spurious interrupt - CPU Q ID 1.
Non zero value 32008 found for RX spurious interrupt - CPU Q ID 2.
Non zero value 130 found for RX spurious interrupt - CPU Q ID 4.
Non zero value 457868 found for RX spurious interrupt - CPU Q ID 5.
Non zero value 1760 found for RX spurious interrupt - CPU Q ID 12.
Non zero value 103964 found for RX dropped count - CPU Q ID 14.
Non zero value 103964 found for RX conversion failure dropped - CPU Q ID 14.
Non zero value 12102 found for RX spurious interrupt - CPU Q ID 14.
Non zero value 55 found for RX spurious interrupt - CPU Q ID 15.
Non zero value 591 found for RX spurious interrupt - CPU Q ID 20.
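
One way to read the analyzer lines above is to group the counters by name, so the actual drop counters stand out from the spurious-interrupt counters. The following is a minimal Python sketch, not part of any Cisco tool; the function names are hypothetical and the sample lines are copied from the output above.

```python
import re
from collections import defaultdict

# Lines copied from the CLI Analyzer output in this post.
ANALYZER_LINES = """\
Non zero value 2758 found for RX spurious interrupt - CPU Q ID 1.
Non zero value 32008 found for RX spurious interrupt - CPU Q ID 2.
Non zero value 130 found for RX spurious interrupt - CPU Q ID 4.
Non zero value 457868 found for RX spurious interrupt - CPU Q ID 5.
Non zero value 1760 found for RX spurious interrupt - CPU Q ID 12.
Non zero value 103964 found for RX dropped count - CPU Q ID 14.
Non zero value 103964 found for RX conversion failure dropped - CPU Q ID 14.
Non zero value 12102 found for RX spurious interrupt - CPU Q ID 14.
Non zero value 55 found for RX spurious interrupt - CPU Q ID 15.
Non zero value 591 found for RX spurious interrupt - CPU Q ID 20.
""".splitlines()

LINE_RE = re.compile(r"Non zero value (\d+) found for (.+?) - CPU Q ID (\d+)\.")


def group_by_counter(lines):
    """Return {counter name: {queue id: value}} for every matching line."""
    grouped = defaultdict(dict)
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            value, counter, queue = match.groups()
            grouped[counter][int(queue)] = int(value)
    return dict(grouped)


def queues_with_drops(grouped):
    """Return sorted queue IDs that report any counter with 'dropped' in its name."""
    return sorted(
        {q for name, queues in grouped.items() if "dropped" in name for q in queues}
    )


grouped = group_by_counter(ANALYZER_LINES)
print(queues_with_drops(grouped))
print(grouped["RX dropped count"])
```

Run against these lines, the only queue with drop counters is CPU Q ID 14, and its value (103964) matches the Glean adjacency drop count reported earlier in the same analyzer output.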

Any help from this forum would be awesome.

 

6 Replies

Leo Laohoo
Hall of Fame

During normal business hours, post the complete output of the following commands:

sh platform resource
sh platform software status con brief
sh controllers utilization

Let's start with that and see what rabbit hole appears.

Thank you Leo, will do.

Good morning Leo,

I was on a call with one user at the site and had some packet loss during the call. He said he was still seeing some lag in his email application this morning but didn't notice the voice packet loss. The office has a fairly early start, so here is the data requested, but it all looks normal.

LTIATAC-MOCC-C9300SK-DSWC#sh platform resource
**State Acronym: H - Healthy, W - Warning, C - Critical
Resource             Usage          Max       Warning   Critical   State
----------------------------------------------------------------------------------------------------
Control Processor    11.30%         100%      90%       95%        H
DRAM                 2952MB(38%)    7752MB    85%       90%        H
TMPFS                209MB(2%)      7752MB    40%       50%        H

LTIATAC-MOCC-C9300SK-DSWC#sh platform software status con brief
Load Average
Slot   Status   1-Min  5-Min  15-Min
1-RP0  Healthy  0.26   0.42   0.50
2-RP0  Healthy  0.57   0.46   0.41

Memory (kB)
Slot   Status   Total    Used (Pct)     Free (Pct)     Committed (Pct)
1-RP0  Healthy  7938716  3023012 (38%)  4915704 (62%)  3371968 (42%)
2-RP0  Healthy  7938724  2932040 (37%)  5006684 (63%)  3319200 (42%)

CPU Utilization
Slot   CPU  User  System  Nice  Idle   IRQ   SIRQ  IOwait
1-RP0  0    6.49  2.99    0.00  90.50  0.00  0.00  0.00
       1    8.80  4.20    0.00  86.88  0.00  0.10  0.00
       2    6.90  4.10    0.00  89.00  0.00  0.00  0.00
       3    7.79  3.59    0.00  88.41  0.00  0.19  0.00
2-RP0  0    5.99  2.09    0.00  91.90  0.00  0.00  0.00
       1    4.79  2.09    0.00  93.00  0.00  0.09  0.00
       2    5.79  2.29    0.00  91.90  0.00  0.00  0.00
       3    4.29  1.59    0.00  94.10  0.00  0.00  0.00

LTIATAC-MOCC-C9300SK-DSWC#sh controllers utilization
Port Receive Utilization Transmit Utilization
Gi1/0/1 0 0
Gi1/0/2 0 0
Gi1/0/3 0 0
Gi1/0/4 0 0
Gi1/0/5 0 0
Gi1/0/6 0 0
Gi1/0/7 0 0
Gi1/0/8 0 0
Gi1/0/9 0 0
Gi1/0/10 0 0
Gi1/0/11 0 0
Gi1/0/12 0 0
Gi1/0/13 0 0
Gi1/0/14 0 0
Gi1/0/15 0 0
Gi1/0/16 0 0
Gi1/0/17 0 0
Gi1/0/18 0 0
Gi1/0/19 0 0
Gi1/0/20 0 0
Gi1/0/21 0 0
Gi1/0/22 0 0
Gi1/0/23 0 0
Gi1/0/24 0 0
Gi1/0/25 0 0
Gi1/0/26 0 0
Gi1/0/27 0 0
Gi1/0/28 0 0
Gi1/0/29 0 0
Gi1/0/30 0 5
Gi1/0/31 0 0
Gi1/0/32 0 0
Gi1/0/33 0 0
Gi1/0/34 0 0
Gi1/0/35 0 0
Gi1/0/36 0 0
Gi1/0/37 0 0
Gi1/0/38 0 2
Gi1/0/39 0 0
Gi1/0/40 0 0
Gi1/0/41 0 0
Gi1/0/42 0 0
Gi1/0/43 0 0
Gi1/0/44 0 0
Gi1/0/45 0 0
Gi1/0/46 0 0
Gi1/0/47 0 0
Gi1/0/48 0 0
Te1/1/1 1 4
Te1/1/2 0 13
Te1/1/3 0 0
Te1/1/4 0 0
Ap1/0/1 0 0
Gi2/0/1 0 0
Gi2/0/2 0 0
Gi2/0/3 0 0
Gi2/0/4 0 0
Gi2/0/5 0 0
Gi2/0/6 0 0
Gi2/0/7 0 0
Gi2/0/8 0 0
Gi2/0/9 0 0
Gi2/0/10 0 0
Gi2/0/11 0 0
Gi2/0/12 1 0
Gi2/0/13 0 0
Gi2/0/14 0 0
Gi2/0/15 0 0
Gi2/0/16 0 0
Gi2/0/17 0 0
Gi2/0/18 0 0
Gi2/0/19 0 0
Gi2/0/20 0 0
Gi2/0/21 0 1
Gi2/0/22 0 0
Gi2/0/23 0 0
Gi2/0/24 0 0
Gi2/0/25 0 0
Gi2/0/26 0 0
Gi2/0/27 0 0
Gi2/0/28 0 0
Gi2/0/29 0 0
Gi2/0/30 0 0
Gi2/0/31 0 0
Gi2/0/32 0 0
Gi2/0/33 0 0
Gi2/0/34 0 0
Gi2/0/35 0 0
Gi2/0/36 3 0
Gi2/0/37 0 0
Gi2/0/38 0 0
Gi2/0/39 0 0
Gi2/0/40 0 0
Gi2/0/41 0 0
Gi2/0/42 0 0
Gi2/0/43 2 2
Gi2/0/44 0 0
Gi2/0/45 0 0
Gi2/0/46 0 0
Gi2/0/47 0 0
Gi2/0/48 0 0
Te2/1/1 0 0
Te2/1/2 0 0
Te2/1/3 1 0
Te2/1/4 0 0
Ap2/0/1 0 0

Total Ports : 106
Total Ports Receive Bandwidth Percentage Utilization : 0
Total Ports Transmit Bandwidth Percentage Utilization : 0
Switch 1 Stack Ring Max Percentage Utilization : 0
Switch 2 Stack Ring Max Percentage Utilization : 0
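
Since the listing above is long and mostly zeros, a short filter makes the busy ports easier to spot. This is a minimal Python sketch with a hypothetical function name; the sample rows are a subset copied from the output above, and the parser assumes each data row has the form "port rx tx".

```python
# A subset of rows copied from the "sh controllers utilization" output above.
SAMPLE = """\
Port Receive Utilization Transmit Utilization
Gi1/0/29 0 0
Gi1/0/30 0 5
Gi1/0/38 0 2
Te1/1/1 1 4
Te1/1/2 0 13
Gi2/0/36 3 0
Gi2/0/43 2 2
Te2/1/3 1 0
Total Ports : 106
"""


def busy_ports(output, threshold=1):
    """Return (port, rx_pct, tx_pct) rows where either direction meets the threshold."""
    rows = []
    for line in output.splitlines():
        parts = line.split()
        # Data rows are exactly "<port> <rx> <tx>"; headers and totals are skipped.
        if len(parts) == 3 and parts[1].isdigit() and parts[2].isdigit():
            rx, tx = int(parts[1]), int(parts[2])
            if max(rx, tx) >= threshold:
                rows.append((parts[0], rx, tx))
    return rows


for port, rx, tx in busy_ports(SAMPLE):
    print(f"{port}: rx {rx}%, tx {tx}%")
```

In this capture the busiest port is Te1/1/2 at 13% transmit, which fits the observation that link utilization looks normal.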

 

We ended up finding the issue.

We had set the arp timeout to 5 on the SVI.

We had been setting this on 3750s for years without issues, but it certainly was an issue on these 9Ks.

We swapped over to the Juniper EX line 8 years ago and just dusted off the old templates for the 9Ks.

This was in a template we had been using for years on 3750s, 3850s, and 2960s. I believe the original engineer (likely me) thought the timeout was 5 minutes, but it's 5 seconds.
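
To put the 5-second timeout in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes a fully populated /24 (254 hosts) and that every ARP entry ages out once per timeout interval, which is only an upper bound since real traffic and ARP refresh behavior vary. The function name is hypothetical; 14400 seconds (4 hours) is the IOS default ARP timeout.

```python
def arp_expirations_per_hour(hosts, timeout_seconds):
    """Upper-bound estimate: each host's ARP entry ages out once per timeout."""
    return hosts * 3600 / timeout_seconds


HOSTS = 254  # assumption: every usable address in the /24 is in use

SCENARIOS = [
    ("configured (5 seconds)", 5),
    ("intended (5 minutes)", 300),
    ("IOS default (4 hours)", 14400),
]

for label, timeout in SCENARIOS:
    per_hour = arp_expirations_per_hour(HOSTS, timeout)
    print(f"{label}: {per_hour:,.1f} expirations/hour")
```

Under these assumptions, a 5-second timeout works out to 182,880 expirations per hour, versus 3,048 at 5 minutes and about 64 at the default. Each expiry can force the next routed packet for that host to be punted to the CPU for ARP resolution, which would be consistent with the Glean adjacency punt drops the CLI Analyzer flagged.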

Sorry for wasting your time Leo and anyone else for that matter.

 

If possible, mark your last reply as a solution.

Will do. Thank you
