09-09-2014 06:16 PM - edited 03-01-2019 05:04 PM
Troubleshooting High CPU on 3750
Let us see how to troubleshoot high CPU utilization on the DSBU (Catalyst 2K/3K) switches.
Before troubleshooting high CPU, we need to verify a few things.
1. What changes in the network/device might have triggered this issue?
Below are the commands that help you identify this.
debug platform cpu-queues {broadcast-q | cbt-to-spt-q | cpuhub-q | host-q |
icmp-q | igmp-snooping-q | layer2-protocol-q | logging-q | remote-console-q |
routing-protocol-q | rpffail-q | software-fwd-q | stp-q}
Note: this debug is intrusive in nature; run it only when you see drops in a queue.
2. Check traffic punted to the CPU:
Unlike the Catalyst 4500 and 6500, there is no sniffer trace (e.g. debug netdr, inband RP trace, debug platform packet all receive buffer) on the Catalyst 2K/3K for traffic punted to the CPU, so troubleshooting high CPU caused by interrupts is fairly tedious. This guide shows a step-by-step procedure for doing so.
Any traffic coming from or going to the CPU is placed into one of 16 queues. Below is the mapping from queue 0 to queue 15:
Queue #  Name              Description
0        rpc               Remote procedure call; used by IOS processes to communicate across the stack
1        stp               Spanning Tree Protocol
2        ipc               Interprocess communication; used by IOS processes to communicate across the stack
3        routing protocol  Receive queue for routing protocol packets
4        L2 protocol       Queue for protocol packets such as LACP, UDLD, CDP, etc.
5        remote console    Queue used by "session <switch number>" to open a console on stack members
6        sw forwarding     Traffic that requires software switching (e.g. unknown multicast, IP header with options)
7        host              Traffic destined to the switch, including directed broadcast
8        broadcast         Broadcast packets (e.g. ARP, RIPv1)
9        cbt-to-spt        Packets hitting a (*,G) entry that exceed the SPT threshold
10       igmp snooping     Packets placed on this queue as a result of hitting an IGMP entry
11       icmp              ICMP redirects and ICMP unreachables
12       logging           ACL exceptions
13       rpf-fail          Multicast traffic that fails the RPF check
14       dstats            Drop stats; unused during normal operation
15       cpu heartbeat     CPU keepalive to check the health of CPU queues
Note that the queues are numbered 0 to 15, not 1 to 16. When traffic is placed into one of the queues above, a buffer is allocated to store it temporarily. If CPU utilization at interrupt level is high, it is usually caused by a specific type of punted traffic.
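For quick reference when reading debugs or counters, the queue-number-to-name mapping above can be captured in a small lookup table. This is just a convenience sketch for your own scripts, not a Cisco tool:

```python
# Queue-number-to-name mapping for Catalyst 2K/3K CPU receive queues,
# as listed in the table above (queues are numbered 0-15).
CPU_QUEUES = {
    0: "rpc",
    1: "stp",
    2: "ipc",
    3: "routing protocol",
    4: "L2 protocol",
    5: "remote console",
    6: "sw forwarding",
    7: "host",
    8: "broadcast",
    9: "cbt-to-spt",
    10: "igmp snooping",
    11: "icmp",
    12: "logging",
    13: "rpf-fail",
    14: "dstats",
    15: "cpu heartbeat",
}

def queue_name(q: int) -> str:
    """Return the queue name for a CPU queue number, e.g. 6 -> 'sw forwarding'."""
    return CPU_QUEUES[q]
```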
Collect the output of show controllers cpu-interface multiple times. Below is an example of show controllers cpu-interface:
cpu-queue-frames retrieved dropped invalid hol-block stray
----------------- ---------- ---------- ---------- ---------- ----------
rpc 0 0 0 0 0
stp 737164 0 0 0 0
ipc 0 0 0 0 0
routing protocol 1146606170 0 0 0 0
L2 protocol 65643 0 0 0 0
remote console 0 0 0 0 0
sw forwarding 0 0 0 0 0
host 5 0 0 0 0
broadcast 19394 0 0 0 0
cbt-to-spt 0 0 0 0 0
igmp snooping 0 0 0 0 0
icmp 0 0 0 0 0
logging 0 0 0 0 0
rpf-fail 0 0 0 0 0
dstats 0 0 0 0 0
cpu heartbeat 29100077 0 0 0 0
Look at the queue whose retrieved counter shows the largest difference between snapshots. That queue is likely the source of the traffic.
From the queue name, try to work out a logical reason why traffic is punted to the CPU. For example, configure no ip unreachables and no ip redirects on Layer 3 interfaces if the icmp queue (i.e. queue 11) has the most traffic punted to the CPU. For the sw forwarding queue (i.e. queue 6), see the common processes below.
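To make the comparison concrete, here is a minimal Python sketch (an illustration with hypothetical counter values, not a Cisco utility) that takes two snapshots of the retrieved counters and reports the queue with the largest increase:

```python
def busiest_queue(snapshot1: dict, snapshot2: dict) -> tuple:
    """Given two {queue_name: retrieved_count} snapshots taken some time
    apart, return (queue_name, delta) for the queue whose 'retrieved'
    counter grew the most -- the likely source of punted traffic."""
    deltas = {q: snapshot2[q] - snapshot1[q] for q in snapshot1}
    top = max(deltas, key=deltas.get)
    return top, deltas[top]

# Example with hypothetical counter values from two show controllers
# cpu-interface snapshots:
before = {"stp": 737164, "routing protocol": 1146606170, "icmp": 0}
after  = {"stp": 737300, "routing protocol": 1147000000, "icmp": 5}
print(busiest_queue(before, after))  # routing protocol grew the most here
```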
Collect the following output only if necessary, because going through it is time consuming. Collect the output of show buffer pool RxQ<X> packet, where X is the queue number with the highest increments. For example, to look at the contents of the sw forwarding queue, use show buffer pool RxQ6 packet.
3. Check input queue drops:
It is also useful to look at show interface output for Layer 3 interfaces with a huge number of packets sitting in the input queue.
These are packets going to the CPU. You can dump them using show buffer input-interface X dump, where X is the interface name, like vlan 10.
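As an illustration of what to look for, show interface reports the input queue as "Input queue: <size>/<max>/<drops>/<flushes>". The Python sketch below (a hypothetical helper, not an IOS command) pulls those four counters out of a captured output line; verify the field layout against your own output:

```python
import re

def input_queue_stats(show_interface_line: str):
    """Parse 'Input queue: size/max/drops/flushes' from a line of
    show interface output; return the counters as ints, or None."""
    m = re.search(r"Input queue:\s*(\d+)/(\d+)/(\d+)/(\d+)", show_interface_line)
    if not m:
        return None
    size, qmax, drops, flushes = map(int, m.groups())
    return {"size": size, "max": qmax, "drops": drops, "flushes": flushes}

# Example line (hypothetical values): a nearly full queue with many
# drops means packets destined to the CPU are backing up here.
line = "Input queue: 74/75/1043/0 (size/max/drops/flushes); Total output drops: 0"
print(input_queue_stats(line))
```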
4. Below are a few common processes utilizing the CPU:
a) High CPU due to Hulc LED:
The Hulc LED process performs the following tasks:
- Check link status on every port
- If the switch supports PoE, check whether a powered device (PD) is detected
- Check the status of the transceivers
- Update fan status
- Set the main LED and port LEDs
- Update the status of both power supplies and the RPS
- Check system temperature status
b) High CPU due to HL3U bkgrd process:
The hl3u bkgrd process manages quite a few background tasks, such as:
- Hardware ARP throttling housekeeping tasks
- Retrying adjacencies/FIB entries after an out-of-hardware-resource condition
- Sending gratuitous ARPs in certain scenarios, such as a master switchover
- Proxy ARP housekeeping functions
- ICMP redirect processing
- ICMP TTL error generation in some conditions
- Handling correct route forwarding when an output ACL is full
c) High CPU due to SNMP:
# show snmp ===> capture several times at 5-minute intervals (to verify the SNMP traffic)
If the SNMP ENGINE process is consuming most of the CPU, then only show stack 318 (where 318 is the SNMP ENGINE process ID in this example) will provide a useful SNMP stack trace.
The above steps are useful only when we see high CPU for the SNMP ENGINE process.
Hope the above helps. If you still need help, you can raise a TAC case or post in the thread and one of us will help you.
Regards
Inayath
https://supportforums.cisco.com/t5/network-management/cpu/m-p/3079496/highlight/false#M113815
This helped me; it may help you also.