08-23-2020 05:03 AM
Hello,
We have a 3560X switch that is running at high CPU (around 80-90%), of which 25-30% is interrupt-driven.
Top 3 processes:
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
175 1060646641 3649393632 0 16.13% 14.93% 14.93% 0 Hulc LED Process
224 3956630454 3320689710 0 10.86% 10.97% 11.01% 0 IP Input
13 2497812332 3353714124 0 3.35% 3.77% 3.72% 0 ARP Input
I think it's caused by ICMP traffic flooding the CPU because "show controllers cpu-interface" shows me this output (I monitored it for 24 hours):
cpu-queue-frames, icmp: increasing with 5800 packets/sec
cpu-queue-frames, sw forwarding: increasing with 485 packets/sec
cpu-queue-frames, routing: increasing with 297 packets/sec
show buffers output:
RxQ11 buffers, 2040 bytes (total 16, permanent 16):
1 in free list (0 min, 16 max allowed)
2225720443 hits, 2500324642 misses
How can I find what is causing this? I found that I can use "debug platform cpu-queues icmp-q", but that will probably crash the switch, which is not an option because of the amount of traffic going through it.
Thanks,
Erik
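A less disruptive way to find out which hosts are generating the ICMP that reaches the CPU would be a local SPAN session on the suspect uplink, with a packet capture running on the destination port and a display filter on ICMP. This is only a sketch; GigabitEthernet0/1 (suspect uplink) and GigabitEthernet0/24 (spare capture port) are placeholders, not taken from this thread:
! Mirror received traffic from the suspect uplink to a spare port
monitor session 1 source interface GigabitEthernet0/1 rx
monitor session 1 destination interface GigabitEthernet0/24
! Verify the session, then capture on the PC attached to Gi0/24 and filter on icmp
show monitor session 1
! Remove the session when finished
no monitor session 1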
08-23-2020 05:18 AM
Post the complete output of the following commands:
sh version
sh log
sh proc cpu sort | ex 0.00
08-23-2020 05:40 AM
#sh proc cpu sort | ex 0.00
CPU utilization for five seconds: 71%/24%; one minute: 76%; five minutes: 82%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
175 1061116087 3649447988 0 14.23% 15.15% 15.65% 0 Hulc LED Process
224 3956956216 3320891212 0 12.15% 11.19% 11.13% 0 IP Input
13 2497921404 3353822980 0 4.15% 3.65% 3.67% 0 ARP Input
85 2415659242 571532433 4226 2.23% 1.76% 1.80% 0 RedEarth Tx Mana
178 1022797110 107440483 9519 1.75% 1.11% 1.10% 0 HL3U bkgrd proce
14 1188472142 164323555 7232 1.59% 1.16% 1.12% 0 ARP Background
91 1448298093 58537854 24741 1.59% 1.33% 1.41% 0 Adjust Regions
212 371093820 224918596 1649 0.95% 0.64% 0.64% 0 CEF: IPv4 proces
84 784236347 789874155 992 0.63% 0.97% 0.95% 0 RedEarth I2C dri
129 707665518 147989673 4781 0.63% 0.78% 0.79% 0 hpm counter proc
242 227879339 539345236 422 0.31% 0.10% 0.12% 0 Spanning Tree
223 116344653 4149362332 0 0.31% 0.06% 0.05% 0 IP ARP Retry Age
56 100671811 147991845 680 0.15% 0.04% 0.02% 0 Per-Second Jobs
190 278277891 29473918 9441 0.15% 0.19% 0.20% 0 HQM Stack Proces
290 32244963 184700818 174 0.15% 0.14% 0.15% 0 TCP Protocols
420 207838851 1220917627 170 0.15% 0.11% 0.14% 0 HSRP IPv4
86 19754962 394937080 50 0.15% 0.04% 0.04% 0 RedEarth Rx Mana
126 288853092 2015114174 143 0.15% 0.18% 0.16% 0 hpm main process
107 483417410 4149037853 0 0.15% 0.10% 0.06% 0 HLFM address lea
#show version
Cisco IOS Software, C3560E Software (C3560E-UNIVERSALK9-M), Version 15.2(4)E10, RELEASE SOFTWARE (fc2)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2020 by Cisco Systems, Inc.
Compiled Tue 31-Mar-20 21:44 by prod_rel_team
ROM: Bootstrap program is C3560E boot loader
BOOTLDR: C3560E Boot Loader (C3560X-HBOOT-M) Version 12.2(58r)SE1, RELEASE SOFTWARE (fc1)
License Level: ipservices
License Type: Permanent
Next reload license Level: ipservices
cisco WS-C3560X-24 (PowerPC405) processor (revision M0) with 262144K bytes of memory.
Processor board ID FDO1852F1J3
Last reset from power-on
7 Virtual Ethernet interfaces
1 FastEthernet interface
28 Gigabit Ethernet interfaces
2 Ten Gigabit Ethernet interfaces
The password-recovery mechanism is enabled.
512K bytes of flash-simulated non-volatile configuration memory.
Base ethernet MAC Address : 74:A2:E6:67:D0:80
Motherboard assembly number : 73-12554-12
Motherboard serial number : FDO185302LK
Model revision number : M0
Motherboard revision number : A0
Model number : WS-C3560X-24T-E
Daughterboard assembly number : 800-32786-02
Daughterboard serial number : FDO18520QZ6
System serial number : FDO1852F1J3
Top Assembly Part Number : 800-31331-09
Top Assembly Revision Number : F0
Version ID : V06
CLEI Code Number : CMMPW00DRA
Hardware Board Revision Number : 0x05
Switch Ports Model SW Version SW Image
------ ----- ----- ---------- ----------
* 1 30 WS-C3560X-24 15.2(4)E10 C3560E-UNIVERSALK9-M
show log only shows rows of:
%SEC_LOGIN-5-LOGIN_SUCCESS: Login Success
Nothing else.
08-23-2020 06:00 AM
You're already running the latest IOS. Did this high CPU appear after a recent upgrade?
08-23-2020 06:04 AM
We upgraded last week from 15.0(2)SE7 to this version to see if that would fix the issue. We've had this high CPU problem for a few weeks now.
08-23-2020 06:00 AM
No logs? That's strange.
Try this:
08-23-2020 06:12 AM
04-15-2025 05:53 AM
I am having the same issue after upgrading to 15.2(4)E10. My switches' CPU is at 93-98%. Was this ever resolved?
04-15-2025 06:37 AM
"My switches cpu is 93-98%."
Doing what, exactly?
04-15-2025 06:50 AM
I don't know; this is just what I am seeing in the CPU processes. Our monitoring software always shows high CPU utilization for these switches, and I have seen it even on switches with hardly any devices connected. I do have one switch that only has 2 VLANs, and its utilization is a lot lower, so I don't know if this is traffic-related or something else. The other switches where this is happening have about 10 VLANs. It's exactly what Wowzie is describing above.
CPU utilization for five seconds: 91%/30%; one minute: 94%; five minutes: 95%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
175 122864891 11695364 10505 18.71% 16.54% 16.42% 0 Hulc LED Process
337 94543646 47980026 1970 13.59% 13.89% 14.42% 0 IGMPSN
336 71752206 45542865 1575 7.19% 9.79% 10.34% 0 IGMPSN MRD
84 35128976 4849807 7243 5.27% 4.65% 4.61% 0 RedEarth Tx Mana
83 34099663 7842599 4348 5.59% 4.51% 4.47% 0 RedEarth I2C dri
4 16189838 659116 24562 0.00% 1.71% 2.05% 0 Check heaps
129 8181575 472255 17324 1.11% 1.07% 1.11% 0 hpm counter proc
126 4735154 6201911 763 0.63% 0.55% 0.59% 0 hpm main process
190 3493717 93551 37345 0.47% 0.47% 0.47% 0 HQM Stack Proces
105 1702 126 13507 0.47% 1.66% 0.41% 1 SSH Process
240 1709813 1917425 891 0.31% 0.23% 0.23% 0 Spanning Tree
11 1638126 7843 208864 0.00% 0.19% 0.21% 0 Licensing Auto U
117 1423997 10708648 132 0.15% 0.23% 0.19% 0 HLFM address lea
85 1300877 3735185 348 0.31% 0.25% 0.18% 0 RedEarth Rx Mana
57 1290083 472661 2729 0.00% 0.16% 0.17% 0 Per-Second Jobs
332 1240785 1013532 1224 0.31% 0.16% 0.16% 0 Marvell wk-a Pow
222 1152807 6672356 172 0.00% 0.14% 0.15% 0 VRRS Main thread
355 1108000 6672177 166 0.00% 0.11% 0.13% 0 MMA DB TIMER
382 1122830 6672279 168 0.15% 0.10% 0.13% 0 MMA DP TIMER
164 863659 2245806 384 0.00% 0.09% 0.10% 0 Hulc Storm Contr
225 917102 10708396 85 0.15% 0.10% 0.10% 0 IP ARP Retry Age
455 620597 730518 849 0.00% 0.08% 0.09% 0 LLDP Protocol
383 1029793 13213914 77 0.00% 0.06% 0.09% 0 MMON MENG
191 624047 187176 3334 0.15% 0.10% 0.09% 0 HRPC qos request
95 680613 2353751 289 0.00% 0.09% 0.08% 0 yeti2_emac_proce
--More--
04-15-2025 08:22 AM
Reading https://www.cisco.com/c/en/us/td/docs/switches/lan/catalyst3750/software/troubleshooting/cpu_util.html, it's hard to say whether you have a real problem or whether this is normal, and further, whether there's anything you could do beyond trying another IOS variant.
Two things to keep in mind.
First, most data forwarding on a switch is done in dedicated hardware, so a busy CPU often matters little.
Second, CPU processes are usually prioritized, so a low-priority background process consuming much, or even all, of the available CPU is not detrimental to normal operations. The biggest "issue" is often just that general monitoring shows a very busy CPU.
The one thing that is possibly of concern is the high utilization of the two IGMP snooping processes.
What's your multicast environment like?
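To get a sense of that from the switch itself, a few read-only IGMP snooping commands can be compared over time; a sketch only, with VLAN 10 as a placeholder:
show ip igmp snooping
show ip igmp snooping groups
show ip igmp snooping querier
show ip igmp snooping vlan 10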
04-16-2025 06:34 AM
Reasonably low. I have a total of 4 VLANs running multicast, and most of the switches with high CPU don't carry those VLANs.
On one of the switches:
show ip mroute active
Active IP Multicast Sources - sending >= 4 kbps
On our whole network:
Active IP Multicast Sources - sending >= 4 kbps
Group: 239.192.4.226, (?)
Source: 10.0.60.32 (?)
Rate: 83 pps/64 kbps(1sec), 64 kbps(last 20 secs), 63 kbps(life avg)
Group: 239.192.4.227, (?)
Source: 10.0.60.32 (?)
Rate: 20 pps/18 kbps(1sec), 18 kbps(last 20 secs), 18 kbps(life avg)
Group: 239.192.4.224, (?)
Source: 10.0.60.32 (?)
Rate: 20 pps/18 kbps(1sec), 18 kbps(last 20 secs), 18 kbps(life avg)
Group: 239.192.4.225, (?)
Source: 10.0.60.32 (?)
Rate: 85 pps/65 kbps(1sec), 65 kbps(last 20 secs), 65 kbps(life avg)
Group: 239.192.4.228, (?)
Source: 10.0.60.32 (?)
Rate: 20 pps/18 kbps(1sec), 18 kbps(last 20 secs), 18 kbps(life avg)
Group: 239.192.4.229, (?)
Source: 10.0.60.32 (?)
Rate: 19 pps/18 kbps(1sec), 18 kbps(last 20 secs), 18 kbps(life avg)
Group: 239.192.4.192, (?)
Source: 10.0.60.31 (?)
Rate: 20 pps/18 kbps(1sec), 18 kbps(last 30 secs), 18 kbps(life avg)
Group: 239.192.4.193, (?)
Source: 10.0.60.31 (?)
Rate: 85 pps/65 kbps(1sec), 65 kbps(last 30 secs), 65 kbps(life avg)
Group: 239.192.4.194, (?)
Source: 10.0.60.31 (?)
Rate: 83 pps/63 kbps(1sec), 63 kbps(last 30 secs), 63 kbps(life avg)
Group: 239.192.4.195, (?)
Source: 10.0.60.31 (?)
Rate: 20 pps/18 kbps(1sec), 18 kbps(last 30 secs), 18 kbps(life avg)
Group: 239.192.4.196, (?)
Source: 10.0.60.31 (?)
Rate: 20 pps/18 kbps(1sec), 18 kbps(last 30 secs), 18 kbps(life avg)
Group: 239.192.4.197, (?)
Source: 10.0.60.31 (?)
Rate: 20 pps/18 kbps(1sec), 18 kbps(last 30 secs), 18 kbps(life avg)
Group: 239.192.109.160, (?)
Source: 10.0.75.102 (?)
Rate: 4 pps/6 kbps(1sec), 6 kbps(last 40 secs), 6 kbps(life avg)
Group: 239.192.109.128, (?)
Source: 10.0.75.101 (?)
Rate: 4 pps/6 kbps(1sec), 6 kbps(last 30 secs), 32 kbps(life avg)
Group: 239.192.109.129, (?)
Source: 10.0.75.101 (?)
Rate: 197 pps/173 kbps(1sec), 173 kbps(last 20 secs), 169 kbps(life avg)
04-16-2025 07:12 AM
I can't say for sure, but if it's due to your multicast environment, it may not have anything to do with the bandwidth of the multicast flows, or the number of such flows, but rather, perhaps, host IGMP activity.
Unless someone else replies with useful information, I only see three options. First, if there are no operational issues due to the high CPU, you just accept it. Second, if you have Cisco support, open a case with TAC. Third, retain a network consultant to come on-site and analyze the issue.
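As a rough check of the host-IGMP-activity idea, the same CPU queue counters the original poster used can be sampled a few minutes apart to see which queues are actually growing; a sketch, assuming the IGMP snooping and sw forwarding queues are the ones of interest:
show controllers cpu-interface
! wait a few minutes, then repeat and compare the per-queue frame counters
show controllers cpu-interface
show processes cpu sorted | exclude 0.00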
04-16-2025 07:35 AM
Thank you for your input on this. I'm seeing my only option is to accept it and move on. These switches are end of life and out of support, so I can't open a TAC case on them. I know they need to be replaced, but the budget doesn't allow me to replace all of them, which is why I am trying to nurse these things along as long as I can.
04-16-2025 07:49 AM
Hi,
I think it's still the same old well-known bug related to the "Hulc LED Process". There are dozens of discussions on the Cisco forums about the same problem for these models of switches (3560X/3750X). Back when I was working with 3750X switches I had exactly the same issue.
https://bst.cisco.com/bugsearch/bug/CSCtn42790?rfs=qvlogin
Best regards,
Jan
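If it is that bug, one way to confirm that the load is dominated by that single process (and is therefore largely cosmetic) is to watch the sorted process list and the CPU history over a day or so; this is only a sketch of the checks, not a workaround for the bug:
show processes cpu sorted | exclude 0.00
show processes cpu history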