02-15-2012 05:19 AM - edited 03-07-2019 04:57 AM
Hi Experts,
We have a core switch (30 VLANs) in our environment, and its CPU utilization has been high in recent days. I have checked the logs and processes but couldn't find the root cause. The issue occurs only during office hours (outside those hours the CPU is idle and utilization is normal). I have already used the following document to troubleshoot the issue:
http://www.cisco.com/en/US/products/hw/switches/ps663/products_tech_note09186a00804cef15.shtml
Please see the following outputs from the core switch:
CPU utilization for five seconds: 99%/0%; one minute: 99%; five minutes: 99%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
30 668755704 1432420936 466 47.52% 47.10% 45.98% 0 Cat4k Mgmt LoPri
55 3066707488 4269560256 0 45.36% 45.50% 45.85% 0 IP Input
29 2855285692 2870089254 0 3.59% 3.50% 3.57% 0 Cat4k Mgmt HiPri
49 405212948 846027881 478 1.03% 0.99% 0.99% 0 Spanning Tree
59 56846588 590581776 96 0.71% 0.87% 0.88% 0 HSRP (Standby)
3 18379520 156281914 117 0.23% 0.23% 0.23% 0 OSPF Hello
117 116 146 794 0.15% 0.08% 0.02% 2 Virtual Exec
18 42606344 200951937 212 0.07% 0.07% 0.07% 0 ARP Input
Core1#sh platform health
%CPU %CPU RunTimeMax Priority Average %CPU Total
Target Actual Target Actual Fg Bg 5Sec Min Hour CPU
Lj-poll 1.00 0.02 2 0 100 500 0 0 0 145:58
GalChassisVp-review 3.00 0.15 10 14 100 500 0 0 0 732:59
S2w-JobEventSchedule 10.00 0.44 10 9 100 500 0 0 0 2694:00
Stub-JobEventSchedul 10.00 0.00 10 0 100 500 0 0 0 0:00
StatValueMan Update 1.00 0.08 1 1 100 500 0 0 0 364:31
Pim-review 0.10 0.00 1 1 100 500 0 0 0 34:51
Ebm-host-review 1.00 0.00 8 4 100 500 0 0 0 91:39
Ebm-port-review 0.10 0.00 1 0 100 500 0 0 0 3:02
Protocol-aging-revie 0.20 0.00 2 0 100 500 0 0 0 0:09
Acl-Flattener 1.00 0.00 10 5 100 500 0 0 0 0:00
KxAclPathMan create/ 1.00 0.00 10 5 100 500 0 0 0 0:03
KxAclPathMan update 2.00 0.00 10 0 100 500 0 0 0 30:18
KxAclPathMan reprogr 1.00 0.00 2 0 100 500 0 0 0 25:00
TagMan-RecreateMtegR 1.00 0.00 10 0 100 500 0 0 0 0:00
K2CpuMan Review 30.00 41.56 30 33 100 500 39 44 32 96152:23
K2AccelPacketMan: Tx 10.00 2.31 20 1 100 500 2 2 5 22504:47
K2AccelPacketMan: Au 0.10 0.00 0 0 100 500 0 0 0 0:00
K2AclMan-taggedFlatA 1.00 0.00 10 0 100 500 0 0 0 0:00
K2AclCamMan stale en 1.00 0.00 10 0 100 500 0 0 0 0:00
K2AclCamMan hw stats 3.00 0.19 10 5 100 500 0 0 0 2336:37
K2AclCamMan kx stats 1.00 0.66 10 5 100 500 0 0 0 966:25
K2AclCamMan Audit re 1.00 0.00 10 5 100 500 2 0 0 1112:38
K2AclPolicerTableMan 1.00 0.00 10 2 100 500 0 0 0 88:31
K2L2 Address Table R 2.00 0.29 12 5 100 500 0 0 0 827:26
K2L2 New Static Addr 2.00 0.00 10 5 100 500 0 0 0 0:02
K2L2 New Multicast A 2.00 0.00 10 5 100 500 0 0 0 0:01
K2L2 Dynamic Address 2.00 0.00 10 5 100 500 0 0 0 0:00
K2L2 Vlan Table Revi 2.00 0.00 12 8 100 500 0 0 0 0:03
K2 L2 Destination Ca 2.00 0.00 10 0 100 500 0 0 0 0:00
K2PortMan Review 2.00 1.61 15 11 100 500 1 1 1 8004:40
----------------------------------------------------------------------------------------------------------------------------------------
Core1#sh platform cpu packet statistics
Packets Dropped In Hardware By CPU Subport (txQueueNotAvail)
CPU Subport TxQueue 0 TxQueue 1 TxQueue 2 TxQueue 3
------------ --------------- --------------- --------------- ---------------
0 0 0 0 421934692
2 0 101147 0 0
RkiosSysPacketMan:
Packet allocation falures: 0
Packet Buffer(Software Common) allocation falures: 0
Packet Buffer(Software ESMP) allocation falures: 0
Packet Buffer(Software EOBC) allocation falures: 0
IOS Packet Buffer Wrapper allocation falures: 0
Packets Dropped In Processing Overall
Total 5 sec avg 1 min avg 5 min avg 1 hour avg
-------------------- --------- --------- --------- ----------
72 0 0 0 0
Packets Dropped In Processing by CPU event
Event Total 5 sec avg 1 min avg 5 min avg 1 hour avg
----------------- -------------------- --------- --------- --------- ----------
SA Miss 16 0 0 0 0
Packets Dropped In Processing by Priority
Priority Total 5 sec avg 1 min avg 5 min avg 1 hour avg
----------------- -------------------- --------- --------- --------- ----------
Normal 56 0 0 0 0
Medium 72 0 0 0 0
Packets Dropped In Processing by Reason
Reason Total 5 sec avg 1 min avg 5 min avg 1 hour avg
------------------ -------------------- --------- --------- --------- ----------
SrcAddrTableFilt 16 0 0 0 0
L2DstDrop 2 0 0 0 0
NoDstPorts 4 0 0 0 0
NoFloodPorts 50 0 0 0 0
Total packet queues 16
Packets Received by Packet Queue
Queue Total 5 sec avg 1 min avg 5 min avg 1 hour avg
---------------------- --------------- --------- --------- --------- ----------
Control 1592264321 61 52 45 34
Host Learning 1140162 0 0 0 0
L3 Fwd High 149 0 0 0 0
L3 Fwd Medium 3 0 0 0 0
L3 Fwd Low 18667481859 1558 1390 1177 943
L2 Fwd Medium 255 0 0 0 0
L2 Fwd Low 425118994 19 9 9 4
L3 Rx High 499 0 0 0 0
L3 Rx Low 54609808 0 0 0 0
RPF Failure 134 0 0 0 0
ACL fwd(snooping) 426175149 16 10 7 0
Packets Dropped by Packet Queue
Queue Total 5 sec avg 1 min avg 5 min avg 1 hour avg
---------------------- --------------- --------- --------- --------- ----------
Host Learning 299235 0 0 0 0
L3 Fwd Low 1732 0 0 0 0
L2 Fwd Low 63674 0 0 0 0
----------------------------------------------------------------------------------------------------------------------------------------
Core1# sh spanning-tree summary
Switch is in pvst mode
Root bridge for: none
Extended system ID is enabled
Portfast Default is disabled
PortFast BPDU Guard Default is disabled
Portfast BPDU Filter Default is disabled
Loopguard Default is disabled
EtherChannel misconfig guard is enabled
UplinkFast is disabled
BackboneFast is disabled
Configured Pathcost method used is short
Name Blocking Listening Learning Forwarding STP Active
---------------------- -------- --------- -------- ---------- ----------
VLAN0001 2 0 0 5 7
VLAN0044 2 0 0 5 7
VLAN0055 2 0 0 5 7
VLAN0066 2 0 0 5 7
VLAN0100 2 0 0 5 7
VLAN0110 2 0 0 5 7
VLAN0111 2 0 0 5 7
VLAN0112 2 0 0 5 7
VLAN0113 2 0 0 5 7
VLAN0114 2 0 0 5 7
VLAN0115 2 0 0 5 7
VLAN0116 2 0 0 5 7
VLAN0117 2 0 0 5 7
VLAN0118 2 0 0 5 7
VLAN0119 2 0 0 5 7
VLAN0120 2 0 0 5 7
VLAN0121 2 0 0 5 7
VLAN0122 2 0 0 5 7
VLAN0123 2 0 0 5 7
VLAN0124 2 0 0 5 7
Name Blocking Listening Learning Forwarding STP Active
---------------------- -------- --------- -------- ---------- ----------
VLAN0125 2 0 0 5 7
VLAN0126 2 0 0 5 7
VLAN0127 2 0 0 5 7
VLAN0128 2 0 0 5 7
VLAN0129 2 0 0 5 7
VLAN0130 2 0 0 5 7
VLAN0131 2 0 0 5 7
VLAN0132 2 0 0 5 7
VLAN0200 2 0 0 5 7
VLAN0555 2 0 0 5 7
---------------------- -------- --------- -------- ---------- ----------
30 vlans 60 0 0 150 210
----------------------------------------------------------------------------------------------------------------------------------------
Core1#sh logging
Syslog logging: enabled (0 messages dropped, 30 messages rate-limited, 0 flushes, 0 overruns, xml disabled, filtering disabled)
Console logging: level debugging, 450 messages logged, xml disabled,
filtering disabled
Monitor logging: level informational, 0 messages logged, xml disabled,
filtering disabled
Buffer logging: level debugging, 479 messages logged, xml disabled,
filtering disabled
Exception Logging: size (8192 bytes)
Count and timestamp logging messages: disabled
Trap logging: level notifications, 376 message lines logged
Logging to 10.55.44.11, 16 message lines logged, xml disabled,
filtering disabled
Log Buffer (4096 bytes):
1y4w: %C4K_L2MAN-6-INVALIDSOURCEADDRESSPACKET: (Suppressed 2 times)Packet received with invalid source MAC address (00:00:00:00:00:00) on port Gi3/2 in vlan 1
1y4w: %C4K_L2MAN-6-INVALIDSOURCEADDRESSPACKET: Packet received with invalid source MAC address (00:00:00:00:00:00) on port Gi3/2 in vlan 1
1y4w: %C4K_L2MAN-6-INVALIDSOURCEADDRESSPACKET: (Suppressed 1 times)Packet received with invalid source MAC address (00:00:00:00:00:00) on port Gi3/2 in vlan 1
Note: an invalid source MAC entry is logged each day, always from the same port (the output above was collected after clearing the logs three days ago).
My questions about this issue are as follows:
1) How can we find the root cause of the high CPU utilization?
2) Can a single invalid MAC address drive the CPU this high just from suppressing it?
Please provide your suggestions regarding this issue.
Thanks in advance & Regards,
Sihanu
02-15-2012 08:02 AM
I have noticed many packets traveling around my network with invalid source MAC addresses. It turned out these packets were from two different sources.
One source was valid: clients requesting DHCP addresses.
The second was a wanna-be technical security engineer trying out new software, just to see what would happen.
The second was invalid, and removing him from the production network solved that issue.
:
I have also seen periods of high utilization caused by spanning-tree loops, especially on the less-smart 4500 platforms. The 4500 could not keep up with the traffic and so ran the CPU at 99%. This was fixed by mapping out the network and breaking the loop.
:
Just last week, I had to change the IOS version I was running because something within that IOS release caused 99% utilization 100% of the time.
:
I don't know if any of this helps; hopefully it does.
Regards
Frank
02-15-2012 01:38 PM
Hi Frank,
Many thanks for the information.
Please let us know the following details:
1) How do we find and break loops on the 4503 core switch?
2) Which IOS version is stable for the 4503?
Thanks & Regards,
Sihanu N
02-15-2012 04:22 PM
Hi Sihanu,
Before we start down the spanning-tree road, I think it would be more beneficial to figure out what is causing the high CPU utilization and then focus on that issue. Below is a link to a Cisco doc that helps identify the reasons for high CPU utilization on 4500 series switches. Let us know how it goes.
Identify the Reason for High CPU Utilization on Catalyst 4500
http://www.cisco.com/en/US/products/hw/switches/ps663/products_tech_note09186a00804cef15.shtml
02-15-2012 10:48 PM
Hi Frank,
I have already gone through the Cisco document you mentioned but couldn't find the root cause of this issue. Please see my first post in this discussion, which contains all the required show command outputs. It would be great if you could identify the root cause from those outputs and let me know whether any further show command output needs to be posted.
I have also noted that one of the access switches in our environment is the root bridge. Could this be related to the high CPU utilization on the core switch?
Kindly suggest a solution for this issue.
Thanks and Regards,
Sihanu N
02-16-2012 05:38 PM
Hi Sihanu
Looking through your original post, some of the outputs are missing, and the output you listed is very hard to follow because of the font.
When you paste CLI output into a post, select the COURIER NEW font so that all the data fields line up; size 8 usually keeps each line from wrapping (not always, but usually).
The troubleshooting doc at the URL in my previous post describes what to do next when K2CpuMan Review is the busy process (which it is in your case), but you did not provide the output from those next steps.
:
Perhaps the document will not help, maybe it will.
One other thought: if spanning-tree is the problem, you will see very high spanning-tree recalculation (topology change) counts. If the counts are low, move on to something else.
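As a rough way to check those counts, here is a minimal Python sketch that scans text saved from `show spanning-tree detail` and sorts the VLANs by topology-change count. The sample text is made up for illustration, and the regular expressions assume the usual "VLANxxxx is executing ..." and "Number of topology changes N last change occurred ... ago" line layout, which may vary between IOS releases.

```python
import re

# Hypothetical sample, shaped like `show spanning-tree detail` output.
# Replace it with text captured from your own switch.
SAMPLE = """\
 VLAN0001 is executing the ieee compatible Spanning Tree protocol
  Number of topology changes 3 last change occurred 5d02h ago
          from GigabitEthernet3/2
 VLAN0044 is executing the ieee compatible Spanning Tree protocol
  Number of topology changes 18452 last change occurred 00:00:07 ago
          from GigabitEthernet3/2
"""

# Assumed line layouts; adjust if your IOS release prints them differently.
VLAN_RE = re.compile(r"^\s*(VLAN\d+) is executing")
TC_RE = re.compile(r"Number of topology changes (\d+) last change occurred (\S+) ago")


def topology_changes(text):
    """Return [(vlan, count, last_change)] sorted by count, highest first."""
    rows = []
    vlan = None
    for line in text.splitlines():
        match = VLAN_RE.match(line)
        if match:
            vlan = match.group(1)  # remember which VLAN the next lines belong to
            continue
        match = TC_RE.search(line)
        if match and vlan is not None:
            rows.append((vlan, int(match.group(1)), match.group(2)))
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    for vlan, count, last in topology_changes(SAMPLE):
        print(f"{vlan}: {count} topology changes, last change {last} ago")
```

A VLAN whose count keeps climbing between two runs, with a very recent last change, is the one worth tracing back through the "from" interface.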
:
I don't have any solid details at this point. Try to follow the steps in the troubleshooting doc and let's see what shows up.
Another idea is to open a TAC case, as the Cisco techs will focus on your issue with you.
Best Regards
Frank
:
The general troubleshooting steps are:
1. Issue the show processes cpu command in order to identify the Cisco IOS processes that consume CPU cycles.
2. Issue the show platform health command in order to further identify the platform-specific processes.
3. If the highly active process is K2CpuMan Review, issue the show platform cpu packet statistics command in order to identify the type of traffic that hits the CPU. If the activity is not due to the K2CpuMan Review process, skip Step 4 and go on to Step 5.
4. Identify the packets that hit the CPU with use of the Troubleshooting Tools to Analyze the Traffic Destined to the CPU, if necessary. An example of the troubleshooting tools to use is the CPU Switched Port Analyzer (SPAN).
5. Review this document and the section Troubleshoot Common High CPU Utilization Problems for common causes.
If you still cannot identify the root cause, contact Cisco Technical Support.
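To make the `show platform cpu packet statistics` step easier to read, here is a minimal Python sketch that pulls the "Packets Received by Packet Queue" rows out of saved output and sorts them by the 5-second average. The sample rows are copied from the output posted earlier in this thread; the function name and the regular expression are my own and assume each row ends with five integer columns.

```python
import re

# Rows copied from the `show platform cpu packet statistics` output in this thread.
SAMPLE = """\
Packets Received by Packet Queue
Queue Total 5 sec avg 1 min avg 5 min avg 1 hour avg
---------------------- --------------- --------- --------- --------- ----------
Control 1592264321 61 52 45 34
Host Learning 1140162 0 0 0 0
L3 Fwd Low 18667481859 1558 1390 1177 943
L2 Fwd Low 425118994 19 9 9 4
ACL fwd(snooping) 426175149 16 10 7 0
Packets Dropped by Packet Queue
Queue Total 5 sec avg 1 min avg 5 min avg 1 hour avg
Host Learning 299235 0 0 0 0
"""

# Queue name (may contain spaces) followed by exactly five integer columns.
ROW_RE = re.compile(r"^(.+?)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$")


def received_queues(text):
    """Return [(queue, five_sec_avg, total)] sorted by five_sec_avg, busiest first."""
    rows = []
    in_section = False
    for line in text.splitlines():
        if line.startswith("Packets Received by Packet Queue"):
            in_section = True
            continue
        if line.startswith("Packets Dropped by Packet Queue"):
            in_section = False
            continue
        if not in_section:
            continue
        match = ROW_RE.match(line)
        if match:  # header and separator lines do not match and are skipped
            name = match.group(1).strip()
            total, five_sec = int(match.group(2)), int(match.group(3))
            rows.append((name, five_sec, total))
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    for name, five_sec, total in received_queues(SAMPLE):
        print(f"{name:<20} {five_sec:>6} pkts/s (5 sec avg)  total {total}")
```

With the numbers posted in this thread, the L3 Fwd Low queue is by far the busiest, so that is the traffic to capture in Step 4.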
02-17-2012 01:52 AM
Hi Frank,
Many thanks for your valuable suggestions and solution steps. Please see the output of show platform cpu packet statistics in the correct format:
Core1# sh platform cpu packet statistics
Packets Dropped In Hardware By CPU Subport (txQueueNotAvail)
CPU Subport TxQueue 0 TxQueue 1 TxQueue 2 TxQueue 3
------------ --------------- --------------- --------------- ---------------
0 0 0 0 421854793
2 0 101147 0 0
RkiosSysPacketMan:
Packet allocation falures: 0
Packet Buffer(Software Common) allocation falures: 0
Packet Buffer(Software ESMP) allocation falures: 0
Packet Buffer(Software EOBC) allocation falures: 0
IOS Packet Buffer Wrapper allocation falures: 0
Packets Dropped In Processing Overall
Total 5 sec avg 1 min avg 5 min avg 1 hour avg
-------------------- --------- --------- --------- ----------
72 0 0 0 0
Packets Dropped In Processing by CPU event
Event Total 5 sec avg 1 min avg 5 min avg 1 hour avg
----------------- -------------------- --------- --------- --------- ----------
SA Miss 16 0 0 0 0
Packets Dropped In Processing by Priority
Priority Total 5 sec avg 1 min avg 5 min avg 1 hour avg
----------------- -------------------- --------- --------- --------- ----------
Normal 56 0 0 0 0
Medium 72 0 0 0 0
Packets Dropped In Processing by Reason
Reason Total 5 sec avg 1 min avg 5 min avg 1 hour avg
------------------ -------------------- --------- --------- --------- ----------
SrcAddrTableFilt 16 0 0 0 0
L2DstDrop 2 0 0 0 0
NoDstPorts 4 0 0 0 0
NoFloodPorts 50 0 0 0 0
Total packet queues 16
Packets Received by Packet Queue
Queue Total 5 sec avg 1 min avg 5 min avg 1 hour avg
---------------------- --------------- --------- --------- --------- ----------
Control 1591972297 61 52 45 34
Host Learning 1139997 0 0 0 0
L3 Fwd High 149 0 0 0 0
L3 Fwd Medium 3 0 0 0 0
L3 Fwd Low 18661965352 823 796 769 880
L2 Fwd Medium 255 0 0 0 0
L2 Fwd Low 425038440 24 10 8 4
L3 Rx High 499 0 0 0 0
L3 Rx Low 54595043 3 1 0 0
RPF Failure 134 0 0 0 0
ACL fwd(snooping) 426097006 15 10 7 0
Packets Dropped by Packet Queue
Queue Total 5 sec avg 1 min avg 5 min avg 1 hour avg
---------------------- --------------- --------- --------- --------- ----------
Host Learning 299235 0 0 0 0
L3 Fwd Low 1732 0 0 0 0
L2 Fwd Low 63674 0 0 0 0
show logging output
1y4w: %C4K_L2MAN-6-INVALIDSOURCEADDRESSPACKET: (Suppressed 2 times)Packet received with invalid source MAC address (00:00:00:00:00:00) on port Gi3/2 in vlan 1
1y4w: %C4K_L2MAN-6-INVALIDSOURCEADDRESSPACKET: Packet received with invalid source MAC address (00:00:00:00:00:00) on port Gi3/2 in vlan 1
1y4w: %C4K_L2MAN-6-INVALIDSOURCEADDRESSPACKET: (Suppressed 1 times)Packet received with invalid source MAC address (00:00:00:00:00:00) on port Gi3/2 in vlan 1
1y5w: %C4K_L2MAN-6-INVALIDSOURCEADDRESSPACKET: (Suppressed 1 times)Packet received with invalid source MAC address (00:00:00:00:00:00) on port Gi3/2 in vlan 1
1y5w: %C4K_L2MAN-6-INVALIDSOURCEADDRESSPACKET: Packet received with invalid source MAC address (00:00:00:00:00:00) on port Gi3/2 in vlan 1
We see an invalid MAC address in the logs every day. We have decided to SPAN the traffic on Gi3/2 and use Wireshark to monitor it.
Kindly let me know your suggestions on the following questions:
1) Is there any tool other than Wireshark that can monitor the traffic and effectively find the source of the invalid MAC address?
2) Is there any other debugging on the core switch that can find the exact reason for the high CPU utilization?
3) Please clarify how to monitor traffic to the CPU using SPAN (which physical interfaces or VLAN interfaces should be monitored?).
Thanks and Regards,
Sihanu N
02-17-2012 08:44 AM
Sihanu,
1. Yes, Wireshark is the best tool for identifying issues on the network, but let me also say that it is not a tool for a new engineer, as there is far too much data to understand.
:
2. There are other tools, but their output is very similar to Wireshark's: pages and pages of it. Again, they are not for a new engineer, as there is far too much data, and the IOS tools can render your device unmanageable. Monitoring with Wireshark is the best and safest way to proceed.
:
3. To narrow your search, start by monitoring the VLAN and/or trunk link(s). When you see specific traffic, note the details; this will let you focus on the exact port and/or physical interface(s).
:
4. I am not sure whether the logging output in your last post is complete. If you are only seeing a few log entries with invalid MAC addresses, to me this is a moot and minor issue, since the box is processing gigabytes of traffic.
:
:
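To cut the capture data down to something manageable, here is a minimal Python sketch that counts frames per source MAC and flags the all-zero address from the log message. It assumes the capture has been exported to plain text with one source MAC per line (for example with tshark's `-T fields -e eth.src` options); the sample addresses and the function name are made up for illustration.

```python
from collections import Counter

# The invalid source address reported by the switch log in this thread.
INVALID_MACS = {"00:00:00:00:00:00"}

# Hypothetical export: one source MAC per captured frame.
SAMPLE = """\
00:1b:54:aa:bb:01
00:00:00:00:00:00
00:1b:54:aa:bb:01
00:1B:54:AA:BB:02
00:00:00:00:00:00
00:1b:54:aa:bb:01
"""


def summarize_sources(text, top=5):
    """Return (top talkers as [(mac, frames)], {invalid_mac: frames})."""
    counts = Counter(
        line.strip().lower()  # normalize case so the same MAC is counted once
        for line in text.splitlines()
        if line.strip()
    )
    invalid = {mac: n for mac, n in counts.items() if mac in INVALID_MACS}
    return counts.most_common(top), invalid


if __name__ == "__main__":
    talkers, invalid = summarize_sources(SAMPLE)
    for mac, frames in talkers:
        print(f"{mac}  {frames} frames")
    for mac, frames in invalid.items():
        print(f"invalid source MAC {mac} seen in {frames} frames")
```

The count alone does not identify the sending device; once the invalid frames are isolated, the rest of each frame (destination, protocol, payload) still has to be inspected in Wireshark.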
A couple of things I would do before moving forward:
1. Set the date and time.
2. Set the logging level [for the buffer] so that you see debugging-level messages.
3. Consider (ONLY with management's approval) reloading your switch, as it has been in operation for more than a year. Reloading may clear some unknown (stuck) session(s); no guarantee, but it may help.
4. If you reload, reset the date and time (as it doesn't appear you are using NTP).
5. If the problem remains, open a TAC case to get dedicated support.
:
HTH
Frank
02-20-2012 02:10 AM
Hi Frank,
Thanks for all your support and sorry for the delayed reply.
Anyway, we are planning to set up SPAN and monitor the traffic to find the root cause. We have already configured the time manually, and it shows correctly when we check the clock on the switch.
1) Why does the time not show in the log entries (they currently show 1y5w or 1y6w)?
2) We also recently set up a syslog server, but the switches are not sending messages at the configured trap levels (only console messages are received), while the routers and ASAs send messages according to the trap levels we set. What might be the reason for this?
Thanks and Regards,
Sihanu N
02-20-2012 07:49 AM
This will add timestamps to your log messages
service timestamps log datetime
service timestamps debug datetime
Send logging messages to a remote syslog server
logging
HTH
Frank
02-29-2012 04:09 AM
Hi Frank,
Sorry for the delayed reply for the post
Thanks for your valuable reply to my queries. It works like a charm.
Thanks and Regards
Sihanu N
10-02-2014 04:17 PM
I just want to share that I resolved my issue by disabling IP redirects on my VLAN interfaces. One VLAN was causing the issue, and CPU dropped from 80% to 30%. Happy days.