07-12-2018 03:30 PM - edited 03-08-2019 03:39 PM
I have a stack of 5 WS-C3850-48P running IOS 16.3.6.
A newer IOS release has not yet been authorized by my company, so upgrading is not an option.
I saw a bug in the older IOS 16.3.3 where the IOSD ipc task would max out the CPU. I have multiple 3850s running 16.3.6 with no issue, but in this one case the IOSD ipc task is driving the CPU to a constant 70 to 90 percent.
I am monitoring this device's CPU with Prime Infrastructure 3.4, and below is the output of "show processes cpu | ex 0.00".
Any help would be much appreciated.
c3850-a01-zamjp-101-CMDCTR-0101-1s1#sho processes cpu | ex 0.00
CPU utilization for five seconds: 70%/27%; one minute: 77%; five minutes: 76%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
67 12234013 7303339 1675 0.39% 0.39% 0.39% 0 Net Background
74 1157800167 180749246 6405 39.95% 41.08% 41.07% 0 IOSD ipc task
91 2874211 6767303 424 0.07% 0.09% 0.08% 0 cpf_process_tpQ
111 1626020 29843562 54 0.07% 0.05% 0.05% 0 100ms check
118 1055734 5208071 202 0.07% 0.13% 0.07% 0 IOSXE-RP Punt Se
136 35330646 12607273 2802 1.42% 1.13% 1.14% 0 PLFM-MGR IPC pro
137 4545331 762258 5962 0.23% 0.15% 0.15% 0 FEP background p
138 1061137 304905 3480 0.07% 0.04% 0.02% 0 XPS background p
183 2569911 4473606 574 0.07% 0.07% 0.07% 0 Inline Power
190 3707017 41130020 90 0.15% 0.10% 0.09% 0 VRRS Main thread
196 12212408 7366089 1657 0.39% 0.38% 0.39% 0 CDP Protocol
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
204 2858665 55339339 51 0.08% 0.05% 0.07% 0 IPAM Manager
233 12771167 30322707 421 0.48% 0.39% 0.39% 0 UDLD
377 2400464 19488478 123 0.08% 0.09% 0.08% 0 PM Callback
507 10914107 13146518 830 0.24% 0.31% 0.33% 0 LLDP Protocol
520 443 164 2701 0.32% 0.36% 0.12% 2 SSH Process
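As a side note for anyone watching CPU the same way: besides polling from Prime, IOS can notify you itself when CPU crosses a threshold, via the CPU thresholding notification feature. A minimal sketch; the 80%/60s rising and 50%/60s falling values are illustrative examples, not recommendations:

```
! illustrative thresholds -- tune against your own CPU baseline
snmp-server enable traps cpu threshold
process cpu threshold type total rising 80 interval 60 falling 50 interval 60
```

With this in place the switch raises a trap/syslog when total CPU stays above 80% for 60 seconds, rather than waiting for the next poll cycle.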
07-13-2018 04:32 AM
show ipc hog-info
https://www.cisco.com/c/en/us/td/docs/ios/12_0s/feature/guide/s_hogipc.html
07-15-2018 12:12 PM
Can you please provide the output of the following commands?
show platform software fed switch active punt cause summary
show platform software fed switch active inject cause summary
show platform hardware fed switch active qos queue stats internal cpu policer
show processes cpu platform sorted
show proc cpu sorted | exc una
show memory debug leaks
Regards,
07-15-2018 08:45 PM
sho platform software fed switch active punt cause summary
Statistics for all causes
Cause Cause Info Rcvd Dropped
------------------------------------------------------------------------------
7 ARP request or response 86060307 0
11 For-us data 6235673 0
21 RP<->QFP keepalive 1648636 0
24 Glean adjacency 333228 0
29 RP handled ICMP 95718 0
45 BFD control 21 0
55 For-us control 1961679 0
58 Layer2 bridge domain data packet 244265 0
60 IP subnet or broadcast packet 798750 0
74 ACL log 50453 0
96 Layer2 control protocols 676555 0
97 Packets to LFTS 81050 0
------------------------------------------------------------------------------
show platform software fed switch active inject cause summary
Statistics for all causes
Cause Cause Info Rcvd Dropped
------------------------------------------------------------------------------
1 L2 control/legacy 71419555 0
2 QFP destination lookup 10044994 0
5 QFP <->RP keepalive 1648700 0
7 QFP adjacency-id lookup 1480683 0
12 ARP request or response 4173162 0
43 Applications Injecting Pkts usin 356651 0
------------------------------------------------------------------------------
show platform hardware fed switch active qos queue stats internal cpu policer
QId PlcIdx Queue Name Enabled Rate Rate Drop
------------------------------------------------------------------------
0 11 DOT1X Auth No 1000 1000 0
1 1 L2 Control No 500 500 0
2 14 Forus traffic No 1000 1000 0
3 0 ICMP GEN Yes 200 200 0
4 2 Routing Control Yes 1800 1800 0
5 14 Forus Address resolution No 1000 1000 0
6 3 Punt Copy to ICMP Redirect No 500 500 0
7 6 WLESS PRI-5 No 1000 1000 0
8 4 WLESS PRI-1 No 1000 1000 0
9 5 WLESS PRI-2 No 1000 1000 0
10 6 WLESS PRI-3 No 1000 1000 0
11 6 WLESS PRI-4 No 1000 1000 0
12 0 BROADCAST Yes 200 200 0
13 10 Learning cache ovfl Yes 100 100 0
14 13 Sw forwarding Yes 1000 1000 0
15 8 Topology Control No 13000 13000 0
16 12 Proto Snooping No 500 500 0
17 16 DHCP Snooping No 1000 1000 0
18 9 Transit Traffic Yes 500 500 0
19 10 RPF Failed Yes 100 100 0
20 15 MCAST END STATION Yes 2000 2000 0
21 13 LOGGING Yes 1000 1000 0
22 7 Punt Webauth No 1000 1000 0
23 10 Crypto Control Yes 100 100 0
24 10 Exception Yes 100 100 0
25 3 General Punt No 500 500 0
26 10 NFL SAMPLED DATA Yes 100 100 0
27 2 Low Latency Yes 1800 1800 0
28 10 EGR Exception Yes 100 100 423424
29 16 Nif Mgr No 1000 1000 0
30 9 MCAST Data Yes 500 500 0
31 10 Gold Pkt Yes 100 100 0
show processes cpu platform sorted
CPU utilization for five seconds: 85%, one minute: 85%, five minutes: 85%
Core 0: CPU utilization for five seconds: 84%, one minute: 83%, five minutes: 84%
Core 1: CPU utilization for five seconds: 88%, one minute: 85%, five minutes: 86%
Core 2: CPU utilization for five seconds: 86%, one minute: 85%, five minutes: 86%
Core 3: CPU utilization for five seconds: 86%, one minute: 86%, five minutes: 85%
Pid PPid 5Sec 1Min 5Min Status Size Name
--------------------------------------------------------------------------------
11470 10478 69% 71% 73% R 2021191680 linux_iosd-imag
13508 12439 55% 54% 53% R 636878848 smd
10395 9510 51% 50% 49% R 559919104 fman_rp
16844 14617 31% 30% 30% S 1395531776 sif_mgr
16378 14115 22% 22% 23% S 2472513536 fed main event
32746 31253 17% 17% 17% S 302845952 repm
28777 26333 3% 3% 3% S 648773632 fman_fp_image
10850 9563 1% 1% 1% R 508071936 hman
21 2 1% 1% 1% S 0 rcuc/3
16 2 1% 1% 1% S 0 rcuc/2
32744 32006 0% 0% 0% S 4268032 bexec.sh
32273 31033 0% 0% 0% S 502558720 plogd
32006 31970 0% 0% 0% S 5914624 brelay.sh
31970 31174 0% 0% 0% S 11988992 in.telnetd
31969 1 0% 0% 0% S 13627392 rotee
show proc cpu sorted | exc una
CPU utilization for five seconds: 68%/26%; one minute: 68%; five minutes: 69%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
74 1252680005 201873207 6205 37.03% 35.58% 36.32% 0 IOSD ipc task
136 38552894 13802235 2793 1.75% 1.22% 1.17% 0 PLFM-MGR IPC pro
515 2304009 4509346 510 0.63% 0.07% 0.06% 0 OSPF-78 Router
67 13389246 8259489 1621 0.47% 0.40% 0.40% 0 Net Background
507 11904606 14368028 828 0.47% 0.35% 0.35% 0 LLDP Protocol
233 13940851 33098957 421 0.39% 0.41% 0.41% 0 UDLD
196 13327456 8038351 1657 0.39% 0.39% 0.39% 0 CDP Protocol
35 33331041 28267894 1179 0.31% 1.55% 1.12% 0 ARP Input
91 3144807 7429974 423 0.15% 0.09% 0.08% 0 cpf_process_tpQ
377 2624577 21287959 123 0.15% 0.09% 0.08% 0 PM Callback
137 4959664 831790 5962 0.15% 0.13% 0.14% 0 FEP background p
520 904 1277 707 0.15% 0.05% 0.05% 2 SSH Process
118 1247143 6415582 194 0.07% 0.03% 0.05% 0 IOSXE-RP Punt Se
190 4074112 45135507 90 0.07% 0.11% 0.10% 0 VRRS Main thread
236 176379 288076 612 0.07% 0.00% 0.00% 0 CEF background p
207 2965601 61394092 48 0.07% 0.08% 0.07% 0 IP ARP Retry Age
show memory debug leaks
I ran this command, but it didn't display any output.
Thanks for the help.
07-25-2018 09:28 AM - edited 07-25-2018 12:53 PM
I am having the exact same problem on a few of my 3850 stacks running 16.6.3. All of them are set up and provisioned exactly the same.
show processes cpu platform sorted
CPU utilization for five seconds: 62%, one minute: 69%, five minutes: 68%
Core 0: CPU utilization for five seconds: 53%, one minute: 70%, five minutes: 70%
Core 1: CPU utilization for five seconds: 52%, one minute: 69%, five minutes: 67%
Core 2: CPU utilization for five seconds: 57%, one minute: 69%, five minutes: 68%
Core 3: CPU utilization for five seconds: 55%, one minute: 68%, five minutes: 68%
Pid PPid 5Sec 1Min 5Min Status Size Name
--------------------------------------------------------------------------------
4559 3923 100% 99% 100% R 2131861504 linux_iosd-imag
3838 2817 41% 55% 57% S 745037824 fman_rp
32294 30366 37% 49% 50% R 804732928 smd
20722 19415 13% 16% 17% S 1611341824 sif_mgr
19196 17292 11% 10% 10% S 2624487424 fed main event
31002 29282 7% 10% 10% S 843722752 fman_fp_image
30822 28916 5% 5% 5% S 755179520 repm
show proc cpu sort | ex 0.00
CPU utilization for five seconds: 92%/36%; one minute: 92%; five minutes: 92%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
74 4654743 816164 5703 55.16% 53.49% 52.62% 0 IOSD ipc task
206 62037 56743 1093 0.64% 0.72% 0.72% 0 Spanning Tree
132 35399 18016 1964 0.64% 0.41% 0.39% 0 PLFM-MGR IPC pro
I found this to be odd.
sh platform software infrastructure thread scheduler
Statistics for IOS scheduler activities:
  0 msec minimum clock runtime, 4335 msec maximum clock runtime
  0 msec minimum cpu runtime, 3698 msec maximum cpu runtime
  0 msec minimum sleep time, 27 msec maximum sleep time, 578255 interrupted
  6835531 IOS Scheduler invocation, 3529093 idle
  0 sec wallclock, (4464448471132%) IOS tasks, (13393345526371%) scheduler, (13393345548240%) sleep
Perhaps related to this?
https://bst.cloudapps.cisco.com/bugsearch/bug/CSCus28046
07-26-2018 12:31 AM
Sorry for my delayed response,
Do you have dot1x configured on this switch?
Can you please run this command 3 times and provide the outputs?
show platform software fed switch active cpu-interface
I noticed that there are some EGR exception drops at the control plane:
show platform hardware fed switch active qos queue stats internal cpu policer
...omitted output...
28 10 EGR Exception Yes 100 100 423424
So, some egress traffic is not being processed as expected and is being punted to the CPU to be handled there.
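If you want to see exactly which packets are behind those punts, recent 16.x releases on the 3850 include a FED punt packet capture. The syntax below is from memory and may vary slightly between releases:

```
debug platform software fed switch active punt packet-capture start
! let it run for a few seconds while the CPU is busy
debug platform software fed switch active punt packet-capture stop
show platform software fed switch active punt packet-capture brief
```

The brief output lists captured punted frames with their cause codes, which makes it easier to tie a queue like EGR Exception back to actual source and destination addresses.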
Regards,
07-26-2018 12:40 AM
Yes we are using 802.1x.
Please see the output from sho platform software fed switch active cpu-interface.
c3850-a01-zamjp-101-CMDCTR-0101-1s1#sho platform software fed switch active cpu-interface
queue retrieved dropped invalid hol-block
-------------------------------------------------------------------------
Routing Protocol 310452 0 0 0
L2 Protocol 107244 0 0 0
sw forwarding 62523 0 0 0
broadcast 124876 0 0 0
icmp 0 0 0 0
icmp redirect 0 0 0 0
logging 10221 0 0 0
rpf-fail 0 0 0 0
DOT1X authentication 37992 0 0 0
Forus Traffic 2069488 0 0 0
Forus Resolution 12886731 0 0 0
Wireless q5 0 0 0 0
Wireless q1 0 0 0 0
Wireless q2 0 0 0 0
Wireless q3 0 0 0 0
Wireless q4 0 0 0 0
Learning cache 0 0 0 0
Topology control 0 0 0 0
Proto snooping 1489 0 0 0
BFD Low latency 0 0 0 0
Transit Traffic 0 0 0 0
Multi End station 37561 0 0 0
Health Check 0 0 0 0
Health Check 0 0 0 0
Crypto control 0 0 0 0
Exception 0 0 0 0
General Punt 0 0 0 0
NFL sampled data 0 0 0 0
STG cache 0 0 0 0
EGR exception 25224 0 0 0
FSS 0 0 0 0
Multicast data 0 0 0 0
c3850-a01-zamjp-101-CMDCTR-0101-1s1#sho platform software fed switch active cpu-interface
queue retrieved dropped invalid hol-block
-------------------------------------------------------------------------
Routing Protocol 310473 0 0 0
L2 Protocol 107246 0 0 0
sw forwarding 62523 0 0 0
broadcast 124885 0 0 0
icmp 0 0 0 0
icmp redirect 0 0 0 0
logging 10221 0 0 0
rpf-fail 0 0 0 0
DOT1X authentication 37992 0 0 0
Forus Traffic 2069592 0 0 0
Forus Resolution 12886757 0 0 0
Wireless q5 0 0 0 0
Wireless q1 0 0 0 0
Wireless q2 0 0 0 0
Wireless q3 0 0 0 0
Wireless q4 0 0 0 0
Learning cache 0 0 0 0
Topology control 0 0 0 0
Proto snooping 1489 0 0 0
BFD Low latency 0 0 0 0
Transit Traffic 0 0 0 0
Multi End station 37564 0 0 0
Health Check 0 0 0 0
Health Check 0 0 0 0
Crypto control 0 0 0 0
Exception 0 0 0 0
General Punt 0 0 0 0
NFL sampled data 0 0 0 0
STG cache 0 0 0 0
EGR exception 25224 0 0 0
FSS 0 0 0 0
Multicast data 0 0 0 0
c3850-a01-zamjp-101-CMDCTR-0101-1s1#sho platform software fed switch active cpu-interface
queue retrieved dropped invalid hol-block
-------------------------------------------------------------------------
Routing Protocol 310484 0 0 0
L2 Protocol 107248 0 0 0
sw forwarding 62523 0 0 0
broadcast 124891 0 0 0
icmp 0 0 0 0
icmp redirect 0 0 0 0
logging 10221 0 0 0
rpf-fail 0 0 0 0
DOT1X authentication 37999 0 0 0
Forus Traffic 2069615 0 0 0
Forus Resolution 12886775 0 0 0
Wireless q5 0 0 0 0
Wireless q1 0 0 0 0
Wireless q2 0 0 0 0
Wireless q3 0 0 0 0
Wireless q4 0 0 0 0
Learning cache 0 0 0 0
Topology control 0 0 0 0
Proto snooping 1489 0 0 0
BFD Low latency 0 0 0 0
Transit Traffic 0 0 0 0
Multi End station 37564 0 0 0
Health Check 0 0 0 0
Health Check 0 0 0 0
Crypto control 0 0 0 0
Exception 0 0 0 0
General Punt 0 0 0 0
NFL sampled data 0 0 0 0
STG cache 0 0 0 0
EGR exception 25224 0 0 0
FSS 0 0 0 0
Multicast data 0 0 0 0
07-26-2018 06:33 AM
Hello,
I upgraded to 16.6.4 and the high CPU issue went away.
show proc cpu sorted | ex 0.00
CPU utilization for five seconds: 11%/3%; one minute: 50%; five minutes: 41%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
468 1273 21408 59 1.43% 0.88% 0.31% 0 IP SLAs Responde
467 1266 21468 58 1.27% 0.89% 0.32% 0 IP SLAs Control
465 527 674 781 0.87% 0.18% 0.11% 0 LLDP Protocol
132 414 498 831 0.63% 0.15% 0.08% 0 PLFM-MGR IPC pro
206 1219 1507 808 0.39% 0.48% 0.29% 0 Spanning Tree
340 1308 324 4037 0.31% 0.07% 0.19% 5 SSH Process
114 566 3986 141 0.31% 0.18% 0.12% 0 IOSXE-RP Punt Se
144 343 344 997 0.31% 0.05% 0.02% 0 Inline Power
67 628 579 1084 0.23% 0.15% 0.10% 0 Net Background
79 686 49 14000 0.23% 0.14% 0.09% 0 Crimson flush tr
324 162 6473 25 0.15% 0.07% 0.03% 0 MMON MENG
91 76 275 276 0.15% 0.07% 0.01% 0 cpf_process_tpQ
292 210 3372 62 0.15% 0.09% 0.04% 0 MMA DB TIMER
74 15566 5149 3023 0.15% 1.74% 2.46% 0 IOSD ipc task
207 281 2390 117 0.15% 0.13% 0.07% 0 UDLD
323 187 3373 55 0.07% 0.06% 0.03% 0 MMA DP TIMER
195 177 5326 33 0.07% 0.08% 0.04% 0 IP ARP Retry Age
181 195 3374 57 0.07% 0.07% 0.04% 0 VRRS Main thread
133 176 69 2550 0.07% 0.06% 0.04% 0 FEP background p
377 499 1804 276 0.07% 0.07% 0.08% 0 SISF Main Thread
07-26-2018 09:52 AM
07-26-2018 05:27 PM
Ok. The current setup does not call for a voice network yet, so it was not configured. Because I knew voice was going to be installed at a later date, I went ahead and added it to the switch. CPU instantly dropped to 25% and is continuing to hold. Again, I am on 16.3.6 with a stack of 4 3850s being used as a layer 3 device.
Thank you for the help.
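For anyone landing here with the same symptom: the post above doesn't show the exact change, but on a 3850 "adding voice" typically means configuring a voice VLAN on the access ports. A minimal sketch; the VLAN IDs (10 data, 20 voice) and the interface range are hypothetical placeholders, so substitute your own:

```
! hypothetical VLAN IDs and interface range -- adjust to your environment
vlan 20
 name VOICE
!
interface range GigabitEthernet1/0/1 - 48
 switchport mode access
 switchport access vlan 10
 switchport voice vlan 20
```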
07-26-2018 04:12 PM
Unfortunately I am not able to upgrade to 16.6.X. We are limited to 16.3.X, and the latest, 16.3.6, is what we are running. This is only happening on this one device; however, it is also the only device running layer 3.
Thoughts?