10-22-2015 06:04 AM - edited 03-08-2019 02:20 AM
Dear Community members,
I need your advice on troubleshooting a high CPU usage problem on a Catalyst 3750X L3 switch, which is the core switch in my network. The problem is causing high latency and data loss, which makes my network slow and unstable.
So, the evidence. This is the output of the "show proc cpu sorted" command:
#show proc cpu sorted
CPU utilization for five seconds: 98%/28%; one minute: 99%; five minutes: 99%
 PID Runtime(ms)   Invoked  uSecs   5Sec   1Min   5Min TTY Process
 214    41003600  13944778   2940 24.15% 25.21% 25.35%   0 IP Input
 169    34067372   3928624   8671 16.31% 14.55% 14.37%   0 Hulc LED Process
 232    20500311   2892288   7087  8.95%  8.43%  8.43%   0 Spanning Tree
 125    19816534   4237944   4675  5.59%  7.24%  7.74%   0 hpm main process
  12     3744640   2199415   1702  2.87%  2.56%  2.45%   0 ARP Input
  85     1858596    630601   2947  1.27%  0.95%  0.89%   0 RedEarth Tx Mana
 129     2207888    176060  12540  0.95%  1.04%  1.03%   0 hpm counter proc
 212     2836209    795249   3566  0.79%  1.46%  1.32%   0 IP ARP Adjacency
  53      137483      4075  33738  0.79%  0.09%  0.06%   0 Per-minute Jobs
 245      608854    166106   3665  0.79%  0.64%  0.54%   0 PI MATM Aging Pr
 121       68235     11682   5841  0.63%  0.07%  0.01%   0 Strider Tcam Mem
  91     1099121     68040  16154  0.47%  0.52%  0.48%   0 Adjust Regions
 340      471153    649411    725  0.31%  0.27%  0.23%   0 VLAN Manager
  84      686262    849745    807  0.31%  0.34%  0.36%   0 RedEarth I2C dri
  51      278373   3518118     79  0.31%  0.30%  0.28%   0 Net Input
 170      398886    133716   2983  0.15%  0.14%  0.15%   0 HL3U bkgrd proce
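A note on reading the first line of that output: in the "98%/28%" pair, the first number is the total CPU utilization and the second is the part spent at interrupt level, so the difference is what the scheduled processes (IP Input and the rest) consume. The following is only a small sketch for pulling those numbers out of a pasted header line; the function name split_cpu_line is made up for illustration.

```python
import re


def split_cpu_line(line: str) -> dict:
    """Split an IOS 'CPU utilization' header into total, interrupt and process load.

    The 'five seconds: X%/Y%' field holds total (X) and interrupt-level (Y)
    utilization; process-level load is the difference between the two.
    """
    match = re.search(r"five seconds:\s*(\d+)%/(\d+)%", line)
    if match is None:
        raise ValueError("no 'five seconds: X%/Y%' field found")
    total, interrupt = int(match.group(1)), int(match.group(2))
    return {"total": total, "interrupt": interrupt, "process": total - interrupt}


# Header line copied from the output above.
header = "CPU utilization for five seconds: 98%/28%; one minute: 99%; five minutes: 99%"
cpu = split_cpu_line(header)
print(cpu)
```

With the numbers from this post, roughly 70 of the 98 percentage points are process-level load, which is consistent with IP Input being the place to look first.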
As you can see, the CPU is completely utilized and the top process is IP Input. From searching around, I learned that the idea is to use interrupt-level switching (fast switching, CEF, among others) instead of process-level switching, so that switching does not burden the CPU. Checking this...
#show cef interface brief
Interface                 IP-Address      Status  Switching
Vlan1                     10.100.10.1     up      no dCEF
FastEthernet0             unassigned      down    no dCEF
GigabitEthernet1/0/1      unassigned      up      CEF
GigabitEthernet1/0/2      unassigned      down    CEF
GigabitEthernet1/0/3      unassigned      up      CEF
GigabitEthernet1/0/4      unassigned      up      CEF
GigabitEthernet1/0/5      unassigned      up      CEF
GigabitEthernet1/0/6      unassigned      up      CEF
GigabitEthernet1/0/7      unassigned      up      CEF
GigabitEthernet1/0/8      unassigned      down    CEF
GigabitEthernet1/0/9      unassigned      up      CEF
GigabitEthernet1/0/10     unassigned      down    CEF
GigabitEthernet1/0/11     unassigned      up      CEF
GigabitEthernet1/0/12     unassigned      up      CEF
GigabitEthernet1/0/13     unassigned      up      CEF
GigabitEthernet1/0/14     unassigned      up      CEF
GigabitEthernet1/0/15     unassigned      up      CEF
GigabitEthernet1/0/16     unassigned      up      CEF
GigabitEthernet1/0/17     unassigned      up      CEF
GigabitEthernet1/0/18     unassigned      up      CEF
GigabitEthernet1/0/19     unassigned      up      CEF
GigabitEthernet1/0/20     unassigned      up      CEF
GigabitEthernet1/0/21     unassigned      up      CEF
GigabitEthernet1/0/22     unassigned      up      CEF
GigabitEthernet1/0/23     unassigned      up      CEF
GigabitEthernet1/0/24     unassigned      up      CEF
GigabitEthernet1/1/1      unassigned      down    CEF
GigabitEthernet1/1/2      unassigned      down    CEF
GigabitEthernet1/1/3      unassigned      down    CEF
GigabitEthernet1/1/4      unassigned      down    CEF
TenGigabitEthernet1/1/1   unassigned      down    CEF
TenGigabitEthernet1/1/2   unassigned      down    CEF
Null0                     unassigned      up      no CEF
Vlan2                     X.X.171.126     up      CEF
Vlan3                     unassigned      up      CEF
Vlan4                     unassigned      up      CEF
Vlan6                     X.X.165.65      up      CEF
Vlan7                     X.X.166.190     up      CEF
Vlan8                     X.X.152.2       up      CEF
Vlan9                     X.X.166.62      up      CEF
Vlan10                    unassigned      up      CEF
Vlan12                    X.X.161.126     up      CEF
Vlan13                    X.X.171.254     up      CEF
Vlan14                    X.X.167.254     up      CEF
Vlan15                    X.X.166.65      up      CEF
Vlan16                    unassigned      up      CEF
Vlan17                    X.X.167.62      up      CEF
Vlan19                    X.X.161.190     up      CEF
Vlan21                    X.X.167.126     up      CEF
Vlan22                    X.X.171.158     up      CEF
Vlan27                    X.X.162.226     up      CEF
Vlan28                    X.X.162.242     up      CEF
Vlan30                    10.201.6.67     up      CEF
Vlan60                    192.168.60.1    up      CEF
Vlan61                    unassigned      up      CEF
Vlan100                   192.168.100.1   up      CEF
Vlan200                   unassigned      up      CEF
Vlan201                   unassigned      up      CEF
Vlan240                   X.X.168.253     up      no dCEF
Vlan250                   X.X.171.190     up      CEF
Vlan251                   192.168.101.1   up      CEF
Vlan252                   192.168.102.1   up      CEF
Vlan253                   192.168.103.30  up      CEF
Vlan260                   X.X.162.190     up      CEF
Vlan280                   unassigned      up      CEF
Vlan281                   unassigned      down    CEF
Vlan298                   unassigned      down    CEF
Vlan301                   unassigned      down    CEF
StackPort1                unassigned      down    CEF
Virtual1                  unassigned      up      -
Virtual2                  unassigned      up      -
I can see that CEF is enabled on almost every interface, but look at the output of "show interface switching" (cropped for readability):
#show interface switching
Vlan1
          Throttle count          0
        Drops         RP       1139         SP          0
  SPD Flushes       Fast          0        SSE          0
  SPD Aggress       Fast          0
 SPD Priority     Inputs          0      Drops          0
 Protocol IP
   Switching path     Pkts In    Chars In    Pkts Out   Chars Out
   Process             339540    75121303       57097     4611829
   Cache misses             0           -           -           -
   Fast                     0           0          11        1053
   Auton/SSE                0           0           0           0
 Protocol ARP
   Switching path     Pkts In    Chars In    Pkts Out   Chars Out
   Process            1596666    95800784       72413     4344780
   Cache misses             0           -           -           -
   Fast                     0           0           0           0
   Auton/SSE                0           0           0           0
 NOTE: all counts are cumulative and reset only after a reload.
Vlan2
 Protocol IP
   Switching path     Pkts In    Chars In    Pkts Out   Chars Out
   Process            2816137   201283125       10400     1470747
   Cache misses             0           -           -           -
   Fast                  1205      183277          10        2542
   Auton/SSE                0           0           0           0
 Protocol ARP
   Switching path     Pkts In    Chars In    Pkts Out   Chars Out
   Process             132743     7964580       86383     5182980
   Cache misses             0           -           -           -
   Fast                     0           0           0           0
   Auton/SSE                0           0           0           0
 NOTE: all counts are cumulative and reset only after a reload.
Vlan3
 Protocol ARP
   Switching path     Pkts In    Chars In    Pkts Out   Chars Out
   Process              68260     4095600           0           0
   Cache misses             0           -           -           -
   Fast                     0           0           0           0
   Auton/SSE                0           0           0           0
 Protocol Other
   Switching path     Pkts In    Chars In    Pkts Out   Chars Out
   Process                  2         120           0           0
   Cache misses             0           -           -           -
   Fast                     0           0           0           0
   Auton/SSE                0           0           0           0
 NOTE: all counts are cumulative and reset only after a reload.
Vlan4
 All statistics for this interface are zero.
Vlan6
          Throttle count          0
        Drops         RP          9         SP          0
  SPD Flushes       Fast          0        SSE          0
  SPD Aggress       Fast          0
 SPD Priority     Inputs          0      Drops          0
 Protocol IP
   Switching path     Pkts In    Chars In    Pkts Out   Chars Out
   Process            5140914   356471282       23015     2993665
   Cache misses             0           -           -           -
   Fast                   428       39212           6        1135
   Auton/SSE                0           0           0           0
 Protocol ARP
   Switching path     Pkts In    Chars In    Pkts Out   Chars Out
   Process              13803      828180       16550      993000
   Cache misses             0           -           -           -
   Fast                     0           0           0           0
   Auton/SSE                0           0           0           0
 NOTE: all counts are cumulative and reset only after a reload.
Vlan7
 Protocol IP
   Switching path     Pkts In    Chars In    Pkts Out   Chars Out
   Process             151934    14734706        7539     1527065
   Cache misses             0           -           -           -
   Fast                   156       14866           6        1315
   Auton/SSE                0           0           0           0
 Protocol ARP
   Switching path     Pkts In    Chars In    Pkts Out   Chars Out
   Process              10696      641760        9251      555060
   Cache misses             0           -           -           -
   Fast                     0           0           0           0
   Auton/SSE                0           0           0           0
 NOTE: all counts are cumulative and reset only after a reload.
Vlan8
          Throttle count          0
        Drops         RP      53150         SP          0
  SPD Flushes       Fast          0        SSE          0
  SPD Aggress       Fast          0
 SPD Priority     Inputs          0      Drops          0
 Protocol IP
   Switching path     Pkts In    Chars In    Pkts Out   Chars Out
   Process           76358777  5367493319      759413    66953170
   Cache misses             0           -           -           -
   Fast                110151    10095563          74       13479
   Auton/SSE                0           0           0           0
 Protocol ARP
   Switching path     Pkts In    Chars In    Pkts Out   Chars Out
   Process             779028    46741694      459708    27582480
   Cache misses             0           -           -           -
   Fast                     0           0           0           0
   Auton/SSE                0           0           0           0
 Protocol Other
   Switching path     Pkts In    Chars In    Pkts Out   Chars Out
   Process                127        7620           0           0
   Cache misses             0           -           -           -
   Fast                     0           0           0           0
   Auton/SSE                0           0           0           0
 NOTE: all counts are cumulative and reset only after a reload.
... I notice that almost all the switching is done by process, so I don't know how to work around this.
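To put a number on "almost all", the share of process-switched packets can be computed from the "Pkts In" counters for Protocol IP shown above. This is only a rough sketch: the dictionary is typed in by hand from this post, the helper name process_share is hypothetical, and as far as I know these counters only cover packets that actually reached the CPU, so traffic forwarded in hardware would not show up in either row.

```python
# IP "Pkts In" per SVI, copied from the output above: (Process, Fast).
IP_PKTS_IN = {
    "Vlan1": (339540, 0),
    "Vlan2": (2816137, 1205),
    "Vlan6": (5140914, 428),
    "Vlan7": (151934, 156),
    "Vlan8": (76358777, 110151),
}


def process_share(process_pkts: int, fast_pkts: int) -> float:
    """Fraction of counted inbound packets that took the process-switching path."""
    total = process_pkts + fast_pkts
    return process_pkts / total if total else 0.0


shares = {name: process_share(p, f) for name, (p, f) in IP_PKTS_IN.items()}

# Print the SVIs with the largest process-switched share first.
for name, share in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {share:.2%} of counted IP packets in were process-switched")
```

Vlan8 stands out less for its ratio than for its absolute volume (about 76 million process-switched packets), so sorting by raw packet count may be the more useful view when hunting for the source.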
To put this into context: this switch connects, directly or indirectly, almost 190 other devices, including switches (2960s, 2950s, 3560, and 3750) and access points. It is the default gateway for a lot of VLANs and connects directly to our WAN access.
The switch is running IOS 12.2(58)SE2. I have read in another post (like this one https://supportforums.cisco.com/discussion/11628666/cisco-3750x-24se-12258se2-cpu-utilization-high) that it is advisable to downgrade to 12.2(55)SE8 for stability reasons, but that post is 3 years old, so I'd like to know whether this is still a valid solution or whether another IOS version is preferable to work around this problem.
From the "show proc cpu" output I can see two other processes eating CPU resources, Hulc LED Process and Spanning Tree, but I'd like to troubleshoot IP Input first, because it is the top one.
Please advise on where I should look to solve this problem.
Thanks very much.
10-22-2015 07:08 AM
Hello
Can you also post the output of show platform tcam utilization ?
Is it possible that you've exceeded your TCAM limitations, or that you use an incorrect SDM template on your switch (show sdm prefer) ?
It's possible that you have a lot of traffic that needs to be process switched (check what traffic is destined to the switch itself and if you see any excessive communication - e.g. ARP). Check your routing (especially static routes) configuration.
Best would be if you could share the entire running-config.
Also this link might be helpful: http://www.cisco.com/c/en/us/support/docs/switches/catalyst-3750-series-switches/68461-high-cpu-utilization-cat3750.html
Best regards,
Martin
10-22-2015 08:59 AM
Hello Martin. Sorry ... forgot to post tcam utilization. I checked that, and there is no limit problem.
#show platform tcam utilization
CAM Utilization for ASIC# 0                      Max            Used
                                             Masks/Values    Masks/values
 Unicast mac addresses:                       6364/6364       2212/2212
 IPv4 IGMP groups + multicast routes:         1120/1120          1/1
 IPv4 unicast directly-connected routes:      6144/6144        772/772
 IPv4 unicast indirectly-connected routes:    2048/2048        180/180
 IPv4 policy based routing aces:               452/452          12/12
 IPv4 qos aces:                                512/512          21/21
 IPv4 security aces:                           964/964          36/36
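For anyone who wants to check the "no limit problem" claim quickly, the Used/Max pairs can be turned into percentages. The sketch below parses rows of the shape "name: max/max used/used" as they appear in this post; the regular expression and the function name tcam_usage are my own and have only been matched against this one output.

```python
import re

# Rows copied from the output above (header lines left out).
TCAM_OUTPUT = """\
Unicast mac addresses: 6364/6364 2212/2212
IPv4 IGMP groups + multicast routes: 1120/1120 1/1
IPv4 unicast directly-connected routes: 6144/6144 772/772
IPv4 unicast indirectly-connected routes: 2048/2048 180/180
IPv4 policy based routing aces: 452/452 12/12
IPv4 qos aces: 512/512 21/21
IPv4 security aces: 964/964 36/36
"""

# name: <max masks>/<max values> <used masks>/<used values>
ROW = re.compile(r"^\s*(.+?):\s+(\d+)/(\d+)\s+(\d+)/(\d+)\s*$")


def tcam_usage(text: str) -> dict:
    """Return {region name: percent of TCAM values in use}."""
    usage = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            name, _max_masks, max_values, _used_masks, used_values = match.groups()
            usage[name] = 100.0 * int(used_values) / int(max_values)
    return usage


usage = tcam_usage(TCAM_OUTPUT)
for name, percent in usage.items():
    print(f"{name}: {percent:.1f}% used")
```

On these numbers the fullest region is the unicast MAC table at roughly a third of its capacity, so TCAM exhaustion does not look like the cause here.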
Running "show sdm prefer" shows me that the SDM template is set to "desktop default":
#show sdm prefer
 The current template is "desktop default" template.
 The selected template optimizes the resources in
 the switch to support this level of features for
 8 routed interfaces and 1024 VLANs.

  number of unicast mac addresses:                  6K
  number of IPv4 IGMP groups + multicast routes:    1K
  number of IPv4 unicast routes:                    8K
    number of directly-connected IPv4 hosts:        6K
    number of indirect IPv4 routes:                 2K
  number of IPv4 policy based routing aces:         0
  number of IPv4/MAC qos aces:                      0.5K
  number of IPv4/MAC security aces:                 1K
I had already checked that link for troubleshooting high CPU on the Catalyst 3750 and paid attention to the static route ARP problem, but it covers the case where a static route points to a broadcast interface, which is not my case. Anyway, this is my running config (some sensitive data has been omitted or modified):
#sh run
Building configuration...

Current configuration : 11031 bytes
!
! Last configuration change at 16:55:22 UTC Wed Oct 21 2015
!
version 12.2
no service pad
service timestamps debug datetime msec
service timestamps log datetime msec
no service password-encryption
service sequence-numbers
!
!
boot-start-marker
boot-end-marker
!
!
no aaa new-model
clock timezone UTC -3 0
switch 1 provision ws-c3750x-24s
system mtu routing 1500
ip routing
!
!
mls qos
!
crypto pki trustpoint TP-self-signed-298484864
 enrollment selfsigned
 subject-name cn=IOS-Self-Signed-Certificate-298484864
 revocation-check none
 rsakeypair TP-self-signed-298484864
!
!
crypto pki certificate chain TP-self-signed-298484864
 certificate self-signed 01
  /* omited */
  quit
spanning-tree mode pvst
spanning-tree extend system-id
!
!
!
!
vlan internal allocation policy ascending
!
!
!
!
!
!
!
interface FastEthernet0
 no ip address
 no ip route-cache cef
 no ip route-cache
 shutdown
!
interface GigabitEthernet1/0/1
 switchport trunk encapsulation dot1q
 switchport trunk allowed vlan 1-19,21-4094
 switchport mode trunk
!
interface GigabitEthernet1/0/2
 switchport trunk encapsulation dot1q
 switchport trunk allowed vlan 1-19,21-280,282-4094
 switchport mode trunk
 shutdown
!
interface GigabitEthernet1/0/3
 switchport trunk encapsulation dot1q
 switchport mode trunk
!
interface GigabitEthernet1/0/4
 switchport trunk encapsulation dot1q
 switchport mode trunk
!
interface GigabitEthernet1/0/5
 switchport trunk encapsulation dot1q
 switchport mode trunk
!
interface GigabitEthernet1/0/6
 switchport trunk encapsulation dot1q
 switchport mode trunk
!
interface GigabitEthernet1/0/7
 switchport trunk encapsulation dot1q
 switchport mode trunk
!
interface GigabitEthernet1/0/8
 switchport mode access
 shutdown
!
interface GigabitEthernet1/0/9
 switchport access vlan 28
 switchport mode access
 spanning-tree portfast
!
interface GigabitEthernet1/0/10
!
interface GigabitEthernet1/0/11
 switchport trunk encapsulation dot1q
 switchport trunk allowed vlan 1,16,200-203
 switchport mode trunk
!
interface GigabitEthernet1/0/12
 switchport trunk encapsulation dot1q
 switchport trunk allowed vlan 1,16,200-203
 switchport mode trunk
!
interface GigabitEthernet1/0/13
 switchport trunk encapsulation dot1q
 switchport mode trunk
!
interface GigabitEthernet1/0/14
 switchport trunk encapsulation dot1q
 switchport mode trunk
!
interface GigabitEthernet1/0/15
 switchport trunk encapsulation dot1q
 switchport mode trunk
!
interface GigabitEthernet1/0/16
 switchport trunk encapsulation dot1q
 switchport mode trunk
!
interface GigabitEthernet1/0/17
 switchport trunk encapsulation dot1q
 switchport mode trunk
!
interface GigabitEthernet1/0/18
 switchport trunk encapsulation dot1q
 switchport mode trunk
!
interface GigabitEthernet1/0/19
 switchport access vlan 3
 switchport mode access
!
interface GigabitEthernet1/0/20
 switchport access vlan 211
 switchport trunk encapsulation dot1q
 switchport mode access
 speed 100
!
interface GigabitEthernet1/0/21
 switchport access vlan 3
 switchport trunk encapsulation dot1q
 switchport mode access
 speed 100
!
interface GigabitEthernet1/0/22
 switchport access vlan 230
 switchport trunk encapsulation dot1q
 switchport mode access
!
interface GigabitEthernet1/0/23
 switchport trunk encapsulation dot1q
 switchport mode trunk
!
interface GigabitEthernet1/0/24
 switchport trunk encapsulation dot1q
 switchport trunk native vlan 8
 switchport mode trunk
 duplex full
!
interface GigabitEthernet1/1/1
 shutdown
!
interface GigabitEthernet1/1/2
 shutdown
!
interface GigabitEthernet1/1/3
 shutdown
!
interface GigabitEthernet1/1/4
 shutdown
!
interface TenGigabitEthernet1/1/1
 shutdown
!
interface TenGigabitEthernet1/1/2
 shutdown
!
interface Vlan1
 ip address 10.100.10.1 255.255.255.0
 no ip route-cache cef
 no ip route-cache
 no ip mroute-cache
!
interface Vlan2
 ip address X.X.171.126 255.255.255.128
!
interface Vlan3
 no ip address
!
interface Vlan4
 no ip address
!
interface Vlan6
 ip address X.X.165.65 255.255.255.192
!
interface Vlan7
 ip address X.X.166.190 255.255.255.224
!
interface Vlan8
 ip address X.X.161.254 255.255.255.224 secondary
 ip address X.X.163.2 255.255.255.0 secondary
 ip address X.X.152.2 255.255.255.0
!
interface Vlan9
 ip address X.X.166.62 255.255.255.224
!
interface Vlan10
 no ip address
!
interface Vlan12
 ip address X.X.161.126 255.255.255.128
!
interface Vlan13
 description *** Red Pabellon C ***
 ip address X.X.171.254 255.255.255.192
!
interface Vlan14
 ip address X.X.167.254 255.255.255.240
!
interface Vlan15
 ip address X.X.166.65 255.255.255.192
!
interface Vlan16
 no ip address
!
interface Vlan17
 ip address X.X.167.62 255.255.255.192
!
interface Vlan19
 ip address X.X.161.190 255.255.255.192
!
interface Vlan21
 ip address X.X.167.126 255.255.255.192
!
interface Vlan22
 ip address X.X.171.158 255.255.255.224
!
interface Vlan27
 ip address X.X.162.226 255.255.255.252
!
interface Vlan28
 description *** Red Borde Filtrada ***
 ip address X.X.162.242 255.255.255.240
!
interface Vlan30
 ip address 10.201.6.67 255.255.255.248
!
interface Vlan60
 ip address 192.168.60.1 255.255.255.0
!
interface Vlan61
 no ip address
!
interface Vlan100
 ip address 192.168.100.1 255.255.255.0
!
interface Vlan200
 no ip address
!
interface Vlan201
 no ip address
!
interface Vlan240
 ip address X.X.168.253 255.255.255.0
 no ip route-cache cef
 no ip route-cache
!
interface Vlan250
 ip address X.X.171.190 255.255.255.240
!
interface Vlan251
 ip address 192.168.101.1 255.255.255.0
!
interface Vlan252
 ip address 192.168.102.1 255.255.255.0
!
interface Vlan253
 ip address 192.168.103.30 255.255.255.224
!
interface Vlan260
 ip address X.X.162.190 255.255.255.192
!
interface Vlan280
 description *** VDI PB ***
 no ip address
!
interface Vlan281
 description *** VDI AyF ***
 no ip address
!
interface Vlan298
 no ip address
!
interface Vlan301
 description *** Equipos de Borde ***
 no ip address
!
router rip
 redistribute connected
 network 10.0.0.0
 network X.X.0.0
 no auto-summary
!
ip default-gateway 10.100.10.1
!
ip http server
ip http secure-server
!
ip route 0.0.0.0 0.0.0.0 X.X.162.225
ip route 10.83.14.0 255.255.255.192 X.X.162.225
ip route 10.201.6.0 255.255.255.0 X.X.162.225
ip route X.X.160.0 255.255.255.128 X.X.162.225
ip route X.X.160.128 255.255.255.240 X.X.162.225
ip route X.X.160.144 255.255.255.240 X.X.162.225
ip route X.X.160.160 255.255.255.224 X.X.162.225
ip route X.X.160.192 255.255.255.192 X.X.162.225
ip route X.X.161.192 255.255.255.224 X.X.162.225
ip route X.X.162.0 255.255.255.128 X.X.162.225
ip route X.X.162.128 255.255.255.192 X.X.162.225
ip route X.X.162.232 255.255.255.248 X.X.162.225
ip route X.X.162.240 255.255.255.240 X.X.162.241
ip route X.X.165.0 255.255.255.192 X.X.162.225
ip route X.X.165.128 255.255.255.128 X.X.162.225
ip route X.X.166.0 255.255.255.224 X.X.162.225
ip route X.X.166.128 255.255.255.224 X.X.162.225
ip route X.X.166.192 255.255.255.192 X.X.162.225
ip route X.X.167.128 255.255.255.192 X.X.162.225
ip route X.X.168.0 255.255.255.0 X.X.162.225
ip route X.X.169.0 255.255.255.0 X.X.162.225
ip route X.X.170.0 255.255.255.0 X.X.162.225
ip route X.X.171.160 255.255.255.240 X.X.162.225
ip route X.X.172.0 255.255.255.128 X.X.162.225
ip route X.X.172.128 255.255.255.240 X.X.162.225
ip route X.X.172.144 255.255.255.240 X.X.162.225
ip route X.X.172.160 255.255.255.224 X.X.162.225
ip route X.X.172.192 255.255.255.240 X.X.162.225
ip route X.X.172.208 255.255.255.240 X.X.162.225
ip route X.X.172.224 255.255.255.224 X.X.162.225
ip route 172.29.57.0 255.255.255.0 X.X.162.225
ip route 192.168.61.0 255.255.255.0 192.168.60.3
ip route 192.168.62.0 255.255.255.0 192.168.60.3
!
ip sla enable reaction-alerts
logging esm config
logging X.X.161.237
access-list 10 permit 10.83.14.18
access-list 120 deny ip host 10.83.14.18 any
access-list 120 permit tcp host X.X.168.46 any eq www
access-list 120 deny ip any any
!
snmp-server community <comunity> RO
snmp-server host X.X.162.152 <comunity>
!
!
!
!
monitor session 1 source interface Gi1/0/24
monitor session 1 destination interface Gi1/0/10
ntp server X.X.162.152
end
10-22-2015 02:25 PM
Hello Dago,
Last time I experienced a similar issue it was due to a loop somewhere in the LAN..
A couple of questions : Is the CPU at 99% most of the time ? 'show proc cpu history'..
Are there any specific logs on your core and access switches ?
Please find some ideas :
- Try to check on which interface(s) the high traffic is coming in.
- I know that it is your central core switch, but I would advise shutting the interfaces going to each access switch one at a time, then seeing if the CPU decreases. If not, re-enable it and test another interface.
Thank you.
Karim
10-23-2015 07:09 AM
Hello krahmani323,
#sh proc cpu history
[Graph: CPU% per second (last 60 seconds). Every sample reads 99%.]
[Graph: CPU% per minute (last 60 minutes). * = maximum CPU%, # = average CPU%. The maximum reads 99% or 100% in every minute, and the average bars fill the scale.]
[Graph: CPU% per hour (last 72 hours). * = maximum CPU%, # = average CPU%. The hourly maximum reads between 86% and 100%; the hourly average stays at or above the 60% row and reaches the 100% row for several multi-hour stretches.]
001505: Oct 23 13:13:08.168: %SW_MATM-4-MACFLAP_NOTIF: Host b84f.d527.bbd3 in vlan 200 is flapping between port Gi1/0/12 and port Gi1/0/1
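When there are many of these messages, it can help to tally them per VLAN and port pair to see where the flapping concentrates. The sketch below does that for log lines in the format shown above; the pattern was written against this single message, and count_flaps is a hypothetical helper name, not part of any Cisco tool.

```python
import re
from collections import Counter

# Pattern for the MACFLAP_NOTIF message format seen in the log line above.
MACFLAP = re.compile(
    r"%SW_MATM-4-MACFLAP_NOTIF: Host (?P<mac>\S+) in vlan (?P<vlan>\d+) "
    r"is flapping between port (?P<a>\S+) and port (?P<b>\S+)"
)


def count_flaps(log_lines):
    """Count MAC flap notifications per (vlan, sorted port pair)."""
    counts = Counter()
    for line in log_lines:
        match = MACFLAP.search(line)
        if match:
            ports = tuple(sorted((match.group("a"), match.group("b"))))
            counts[(int(match.group("vlan")), ports)] += 1
    return counts


LOG = [
    "001505: Oct 23 13:13:08.168: %SW_MATM-4-MACFLAP_NOTIF: Host b84f.d527.bbd3 "
    "in vlan 200 is flapping between port Gi1/0/12 and port Gi1/0/1",
]

flaps = count_flaps(LOG)
for (vlan, ports), hits in flaps.most_common():
    print(f"vlan {vlan}: {ports[0]} <-> {ports[1]}: {hits} notification(s)")
```

Feeding it the full "show logging" output instead of the one-line LOG list would show whether the same pair of uplinks is involved every time.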
I'll keep on troubleshooting this issue.
Thanks
10-23-2015 07:23 AM
The HULC LED process is a cosmetic bug, so you can deduct it from your total. Not that it really helps in your case, but best not to waste time tracking it. It is a known issue on 3750/3560/2960s etc.; just put "3750 hulc bug" into Google and you will see that most platforms/versions are affected.
You're process switching VLAN 1 and 240, so everything from those subnets is being punted through the CPU. You should turn CEF back on to prevent that: remove the "no ip route-cache cef" from under the SVIs.
10-23-2015 11:31 AM
Hello Mark,
Thanks for the advice on the HULC process.
About VLAN 1 and 240: I removed the "no ip route-cache cef" command and at least some packets started to be CEF-switched, but not all of them. In the end the CPU problem persists.
As you can see, almost all packets are process-switched:
#show interfaces vlan 240 switching
Vlan240
          Throttle count          0
        Drops         RP      12209         SP          0
  SPD Flushes       Fast          0        SSE          0
  SPD Aggress       Fast          0
 SPD Priority     Inputs          0      Drops          0
 Protocol IP
   Switching path     Pkts In    Chars In    Pkts Out   Chars Out
   Process           71295805  5034710305       86603     7877115
   Cache misses             0           -           -           -
   Fast                   633       45994         351      101781
   Auton/SSE                0           0           0           0
 Protocol ARP
   Switching path     Pkts In    Chars In    Pkts Out   Chars Out
   Process             771524    46291440      195426    11725560
   Cache misses             0           -           -           -
   Fast                     0           0           0           0
   Auton/SSE                0           0           0           0
 Protocol Other
   Switching path     Pkts In    Chars In    Pkts Out   Chars Out
   Process                  7         420           0           0
   Cache misses             0           -           -           -
   Fast                     0           0           0           0
   Auton/SSE                0           0           0           0
 NOTE: all counts are cumulative and reset only after a reload.
This behavior is present on all SVIs.
More evidence that a lot of packets aren't CEF-switched:
#show ip cef switching statistics
       Reason                          Drop       Punt  Punt2Host
RP LES No route                           7          0          3
RP LES No adjacency                  261761          0          0
RP LES Incomplete adjacency            8241          0          0
RP LES TTL expired                        0          0          4
RP LES IP options set                     0          0       1669
RP LES IP redirects                       0          0          1
RP LES Neighbor resolution req       673552        553          0
RP LES Total                         943561        553       1677

All    Total                         943561        553       1677
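Ranking those reasons by drop count makes the picture easier to read. The sketch below parses the rows as pasted in this post and sorts them; the regular expression and the name parse_cef_stats are assumptions of mine. As far as I understand these counters, "Neighbor resolution req" and "No adjacency" refer to packets whose next hop still had to be resolved, which would point toward ARP rather than toward routing, but treat that reading as a hint only.

```python
import re

# Rows copied from the output above.
CEF_STATS = """\
RP LES No route 7 0 3
RP LES No adjacency 261761 0 0
RP LES Incomplete adjacency 8241 0 0
RP LES TTL expired 0 0 4
RP LES IP options set 0 0 1669
RP LES IP redirects 0 0 1
RP LES Neighbor resolution req 673552 553 0
RP LES Total 943561 553 1677
All Total 943561 553 1677
"""

ROW = re.compile(r"^(?P<reason>.*\S)\s+(?P<drop>\d+)\s+(?P<punt>\d+)\s+(?P<punt2host>\d+)\s*$")


def parse_cef_stats(text: str) -> list:
    """Return (reason, drop, punt, punt2host) tuples, skipping the Total rows."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match and not match.group("reason").endswith("Total"):
            rows.append((
                match.group("reason"),
                int(match.group("drop")),
                int(match.group("punt")),
                int(match.group("punt2host")),
            ))
    return rows


rows = parse_cef_stats(CEF_STATS)
by_drop = sorted(rows, key=lambda row: row[1], reverse=True)
for reason, drop, punt, punt2host in by_drop:
    print(f"{reason}: drop={drop} punt={punt} punt2host={punt2host}")
```

The per-reason drops add up to the "Total" row, which is a cheap sanity check that no row was lost when pasting.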
What should I look for now?
Thanks
10-23-2015 05:09 PM
What IOS is the stack running on?
IP Input normally means the switch (stack) is being hammered by the client, like a SAN server pushing so much data. If this is the case, then a command like "sh interface counter errors" will show which ports are incrementing "Total output drops".
10-26-2015 06:32 AM
Hi Leo,
The switch (it is just one switch, not a stack) is running IOS 12.2(58)SE2. As seen in an old post (3 years ago), you encouraged downgrading to 12.2(55)SE8. Is it still a valid solution in your opinion? Is there another stable IOS version at this time?
Running the "sh int counter errors" command made me notice a lot of errors on the port toward another Catalyst 3750 switch (L2), which connects a couple of hosts.
#sh int count err

Port        Align-Err    FCS-Err   Xmit-Err    Rcv-Err  UnderSize  OutDiscards
Gi1/0/1             0          0          0          0          0            0
Gi1/0/2             0          0          0          0          0            0
Gi1/0/3             0          0          0          0          0            0
Gi1/0/4             0          0          0          0          0            0
Gi1/0/5             0          0          0          0          0            0
Gi1/0/6             0          0          0          0          0            0
Gi1/0/7             0          0          0          0          0            0
Gi1/0/8             0          0          0          0          0            0
Gi1/0/9             0          0          0          0          0            0
Gi1/0/10            0          0          0          0          0            0
Gi1/0/11            0          0          0          0          0            0
Gi1/0/12            0          0          0          0          0            0
Gi1/0/13            0          0          0          0          0            0
Gi1/0/14            0          3          0          4          0        26587
Gi1/0/15            0          0          0          0          0            0
Gi1/0/16            0          0          0          0          0            0
Gi1/0/17            0          0          0          0          0            0
Gi1/0/18            0          0          0          0          0            0
Gi1/0/19            0          0          0          0          0            0
Gi1/0/20            0          0          0          0          0            0
Gi1/0/21            0          0          0          0          0         4411
Gi1/0/22            0          0          0          0          0            0
Gi1/0/23            0          0          0          0          0           43
Gi1/0/24            0          0          0          0          0            0

Port       Single-Col  Multi-Col   Late-Col Excess-Col  Carri-Sen      Runts     Giants
Gi1/0/1             0          0          0          0          0          0          0
Gi1/0/2             0          0          0          0          0          0          0
Gi1/0/3             0          0          0          0          0          0          0
Gi1/0/4             0          0          0          0          0          0          0
Gi1/0/5             0          0          0          0          0          0          0
Gi1/0/6             0          0          0          0          0          0          0
Gi1/0/7             0          0          0          0          0          0          0
Gi1/0/8             0          0          0          0          0          0          0
Gi1/0/9             0          0          0          0          0          0          0
Gi1/0/10            0          0          0          0          0          0          0
Gi1/0/11            0          0          0          0          0          0          0
Gi1/0/12            0          0          0          0          0          0          5
Gi1/0/13            0          0          0          0          0          0          0
Gi1/0/14            0          0          0          0          0          0          1
Gi1/0/15            0          0          0          0          0          0          0
Gi1/0/16            0          0          0          0          0          0          0
Gi1/0/17            0          0          0          0          0          0          0
Gi1/0/18            0          0          0          0          0          0          0
Gi1/0/19            0          0          0          0          0          0          0
Gi1/0/20            0          0          0          0          0          0          0
Gi1/0/21            0          0          0          0          0          0          0
Gi1/0/22            0          0          0          0          0          0          0
Gi1/0/23            0          0          0          0          0          0          0
Gi1/0/24            0          0          0          0          0          0          0
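With a table this wide, the interesting ports are easy to miss, so here is a small sketch that lists only the ports with a non-zero OutDiscards counter. It assumes whitespace-separated columns with a header row, as in the first table above; the excerpt keeps just a few rows from this post and the function name ports_with_discards is made up for the example.

```python
# Excerpt of the first table above: header plus a handful of rows.
COUNTERS = """\
Port Align-Err FCS-Err Xmit-Err Rcv-Err UnderSize OutDiscards
Gi1/0/13 0 0 0 0 0 0
Gi1/0/14 0 3 0 4 0 26587
Gi1/0/21 0 0 0 0 0 4411
Gi1/0/23 0 0 0 0 0 43
Gi1/0/24 0 0 0 0 0 0
"""


def ports_with_discards(text: str) -> list:
    """Return (port, OutDiscards) pairs with a non-zero count, largest first."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    column = header.index("OutDiscards")
    hits = []
    for line in lines[1:]:
        fields = line.split()
        if len(fields) != len(header):
            continue  # skip blank or malformed rows
        count = int(fields[column])
        if count > 0:
            hits.append((fields[0], count))
    return sorted(hits, key=lambda item: item[1], reverse=True)


discards = ports_with_discards(COUNTERS)
for port, count in discards:
    print(f"{port}: {count} output discards")
```

Running the command twice a few minutes apart and comparing the two results would show which of these counters are still incrementing, which matters more than the absolute values.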
As a test, I shut down this interface, but the high CPU usage remains. I'm aware that I must work around this OutDiscards problem, but I'll leave it until after solving the IP Input one.
Still troubleshooting. Any new ideas?
Thanks
10-26-2015 08:05 PM
As seen in an old post (3 years ago), you encouraged downgrading to 12.2(55)SE8. Is it still a valid solution in your opinion? Is there another stable IOS version at this time?
Yes. Upgrade to the latest 12.2(55)SE-train, which is 12.2(55)SE10.
I tested 12.2(58)SE2 about four years ago and I wasn't impressed with this version at all. It all has something to do with CPU spiking.
10-26-2015 11:49 PM
Hi Dago,
You could upgrade to 15.0(2)SE8 too, instead of a downgrade to 55SE10.
Also, the MAC flap you mentioned above might not be expected. Are you sure the MAC addresses are those of the wireless clients? The wireless client MAC addresses are not learned on the 3750X; they would be learned on the controller. I feel there could be a loop in the network causing the flap.
Regards,
Roopa
10-27-2015 08:10 AM
Hi roor,
About changing the IOS version: is there any problem with the license level? Right now it is "ipservice", which is permanent. Do I need another license to upgrade or downgrade? This switch was bought from a service provider, and we don't have a Cisco account with permission to download this IOS software. Do we have to ask the service provider for the new IOS software? Would a purchase be involved?
About the MAC flaps associated with wifi clients: I am pretty sure that at least the MACFLAP messages come from wireless clients, because they always mention our wifi VLANs (200, 201, 202) flapping between interfaces that indirectly connect access points.
As an overview: at first we had a series of standalone APs (some Cisco and some Ubiquiti) broadcasting 3 SSIDs, which were caught by a captive portal running on top of a pfSense box, which also provides NAT, routing and firewall capabilities. Attached is a simplified diagram of the wireless network (wifi_simplificado1.png).
Now we have added a Cisco 5508 WLC and changed most of the APs to controller-based AIRCAP 2700 series (wifi_simplificado2.png attached). Right now we are in the process of replacing all the APs, so the WLC is only managing AP functionality, while wifi client traffic still passes through the pfSense box and, eventually, through the 3750X switch. The pfSense-based captive portal is still there for backward compatibility.
In this scenario, do you still think that the MAC flap messages could somehow be related to the high CPU usage?
Thanks very much,
Dago
10-28-2015 12:29 PM
Hi Dago,
IPServices license is fine. You should be able to just download the new software.
I still feel the MAC flap has to do with the high CPU. Since a 5508 is the controller used, the MACs flapping on the 3750X might not be those of the wireless clients, as the wireless client MACs should be learned over CAPWAP between the APs and the controller.
HTH,
Roopa
11-06-2015 04:40 AM
So ... continuing to troubleshoot this problem, I was able to determine that most of the CPU workload was coming from an interface which directly connects another 3750X (L2) and indirectly almost a third of our network devices. Shutting down that interface lowers CPU usage to approx. 50%.
So I did a packet capture with Wireshark on that interface. With just about 100 seconds of capture I got a 2GB file which had a great amount of ARP broadcast packets (check the attached wireshark_iograph.pn file).
I will work on this issue. Any suggestions on how I should proceed?
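To get a feel for the scale of that capture, the file size and duration translate into an average rate as sketched below. Reading "2GB" as 2 x 10^9 bytes is an assumption, and so is the 64-byte average record size used for the packets-per-second guess: it is a hypothetical placeholder, not something measured from the capture, and the file contains more than ARP.

```python
def capture_rate(size_bytes: float, seconds: float) -> tuple:
    """Return (bytes per second, megabits per second) for a capture file."""
    bytes_per_second = size_bytes / seconds
    return bytes_per_second, bytes_per_second * 8 / 1e6


def packets_per_second(bytes_per_second: float, avg_record_bytes: float) -> float:
    """Rough packet rate for an ASSUMED average size per captured packet."""
    return bytes_per_second / avg_record_bytes


CAPTURE_BYTES = 2 * 10**9   # "2GB", read as decimal gigabytes (assumption)
CAPTURE_SECONDS = 100       # "about 100 seconds"
ASSUMED_RECORD_BYTES = 64   # hypothetical average size, minimum Ethernet frame

rate_bytes, rate_mbit = capture_rate(CAPTURE_BYTES, CAPTURE_SECONDS)
pps = packets_per_second(rate_bytes, ASSUMED_RECORD_BYTES)
print(f"{rate_bytes / 1e6:.0f} MB/s, {rate_mbit:.0f} Mbit/s, ~{pps:,.0f} packets/s if all frames were {ASSUMED_RECORD_BYTES} bytes")
```

The real packet rate can be read from Wireshark's capture statistics; the point of the estimate is only that broadcasts at anything near this rate on VLANs where the switch has an SVI would keep the CPU busy.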
Thanks
11-06-2015 06:36 AM
You could turn off gratuitous ARPs:
no ip gratuitous-arps
Sometimes, though, when you see way too many of these, you could have an infected host.