
Ask the Expert: Troubleshooting High CPU in Catalyst Switches

Monica Lluis
Level 9
 

This session will provide an opportunity to learn and ask questions about how to troubleshoot CPU issues in the Cisco Catalyst Switches IOS architecture.

 

Ask questions from Monday, January 25 to Friday, February 5, 2016

Featured Experts

Naveen Venkateshaiah is a customer support engineer in High-Touch Technical Services (HTTS). He is an expert on Routing, LAN Switching and Data Center products. His areas of expertise include Cisco Catalyst 3000, 4000, 6500, and Cisco Nexus 7000, Nexus 5000, Nexus 3000, Nexus 2000, UCS, and MDS SAN Switches. He has over 8 years of industry experience working with large enterprise and Service Provider networks. Venkateshaiah holds CCNA, CCNP, CCDP-ARCH, AWLANFE, and LCSAWLAN certifications. He is currently working to obtain a CCIE in Data Center.

 

Abhishek Soni is a customer support engineer in High-Touch Technical Services (HTTS). He is an expert on Routing, LAN Switching and Data Center products. His areas of expertise include Cisco Catalyst 3000, 4000, 6500, and Cisco Nexus 7000. He has over 8 years of industry experience working with large enterprise and Service Provider networks. Soni holds CCNA and CCNP certifications. He is currently working to obtain a CCIE in Routing and Switching.

 

Find other events at https://supportforums.cisco.com/expert-corner/events.

** Ratings Encourage Participation! **
Please be sure to rate the Answers to Questions

 


 

I hope you and your loved ones are safe and healthy
Monica Lluis
Community Manager Lead

Jessica Deaken
Level 1

Hello Naveen and Abhishek,

I suspect my Cisco Catalyst 6500 has high CPU utilization because I am seeing slow performance. How do we check whether the device has high CPU?

Thank you for your prompt response.

 - Jessica

The basic command is:

show process cpu

To see the history for the cpu you can use:

show process cpu history

If you want a link try - http://www.cisco.com/c/en/us/support/docs/switches/catalyst-6500-series-switches/63992-6k-high-cpu.html

Andy

Hi,

     We are running a 4503 as our core switch, and it uses only the default VLAN.

It worked fine at first, but after some time we could not ping the VLAN 1 SVI IP address, 10.0.0.254. When I ping that SVI continuously, I get many request timeouts; after a while it works fine again.

I checked the CPU and found that "Cat4k Mgmt LoPri" is using more than 49% of the CPU.

I also found that the same MAC address is learned from two different switch ports on the Cat 4500. (The Cat 4500 switch ports are connected to switches, servers, and fiber links.)

What should I check here, and what would be the solution for the same MAC being learned from different switch ports? Please advise.

Hi Vasanth,

As Naveen said, the switch fails to respond when CPU utilization is high, so you may get request timeouts when you ping while the CPU is high.

We need to check for STP TCNs because you are seeing a MAC flap (the same MAC learned from two different ports).

Look for Topology change notifications (TCN):


- show spanning-tree detail | in executing|changes|from
- show spann det | in ieee|from|occur|is exec

If spanning tree is stable, please trace the MAC address that is flapping between the two switch ports. Log in to the devices on those ports and run 'show mac address-table address <mac>' to see where the MAC is being learned from.

Also please attach the following outputs:

- show process cpu sort

- show spanning-tree detail | in executing|changes|from
- show spann det | in ieee|from|occur|is exec

If MAC is flapping, it is expected to see high CPU utilization by "Cat4k Mgmt LoPri".

Regards,

Abhishek Soni

Hi Abi,

  Thanks for your time and answer.

I copied the output at the end of this post from the log.

From this output, how can we tell that TCNs are not the cause? And if the high CPU utilization is due to TCNs, how do we identify that in the output (for example, a high number of changes, or the time of the last change)? An example would be really helpful.

If one port bundled in a channel fails, does a topology change occur and does STP recompute the topology? STP treats a port channel as a single port, and the channel stays up after one member port fails, so why would STP run again?

 VLAN0001 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 21 last change occurred 7w2d ago
          from StackPort1
 VLAN0010 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 26 last change occurred 13w0d ago
          from Port-channel17
 VLAN0020 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 37 last change occurred 11w0d ago
          from Port-channel11
 VLAN0030 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 28 last change occurred 13w0d ago
          from Port-channel17
 VLAN0040 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 32 last change occurred 1w1d ago
          from Port-channel5
 VLAN0050 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 26 last change occurred 13w0d ago
          from Port-channel17
 VLAN0060 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 165 last change occurred 1w1d ago
          from Port-channel5
 VLAN0070 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 135 last change occurred 1w2d ago
          from StackPort1
 VLAN0080 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 27 last change occurred 13w0d ago
          from Port-channel17
 VLAN0090 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 27 last change occurred 13w0d ago
          from Port-channel17
 VLAN0099 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 8 last change occurred 17w0d ago
          from Port-channel38
 VLAN0100 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 27 last change occurred 13w0d ago
          from Port-channel17
 VLAN0110 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 23 last change occurred 13w0d ago
          from Port-channel17
 VLAN0120 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 9 last change occurred 17w0d ago
          from Port-channel38
 VLAN0130 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 14 last change occurred 1w1d ago
          from Port-channel5
 VLAN0140 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 8 last change occurred 17w0d ago
          from Port-channel38
 VLAN0150 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 13 last change occurred 1w1d ago
          from Port-channel5
 VLAN0160 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 8 last change occurred 17w0d ago
          from Port-channel38
 VLAN0165 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 117 last change occurred 7w2d ago
          from StackPort1
 VLAN0166 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 11 last change occurred 17w0d ago
          from Port-channel38
 VLAN0167 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 117 last change occurred 7w2d ago
          from StackPort1
 VLAN0168 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 11 last change occurred 17w0d ago
          from Port-channel38
 VLAN0170 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 238 last change occurred 1w0d ago
          from StackPort1
 VLAN0180 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 7 last change occurred 17w0d ago
          from Port-channel38
 VLAN0200 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 13 last change occurred 1w2d ago
          from StackPort1
 VLAN0204 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 9 last change occurred 17w0d ago
          from Port-channel38
 VLAN0208 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 8 last change occurred 17w0d ago
          from Port-channel38
 VLAN0212 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 8 last change occurred 17w0d ago
          from Port-channel38
 VLAN0990 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 118 last change occurred 7w2d ago
          from StackPort1
 VLAN0999 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 123 last change occurred 7w2d ago
          from StackPort1
 VLAN1000 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 122 last change occurred 7w2d ago
          from StackPort1

Hi Vasanth,

Thanks for asking a good question. As the output above shows, spanning tree has been stable for at least the past week; the most recent topology change on any VLAN was 1w0d ago.

For example:

 VLAN0040 is executing the rstp compatible Spanning Tree protocol
 Number of topology changes 32 last change occurred 1w1d ago
          from Port-channel5

++ On VLAN 40, the last topology change was received 1 week and 1 day ago, from Port-channel 5. This means spanning tree has been stable on that VLAN for the past week.

++ We asked you to check spanning tree because an unstable spanning tree can result in MAC flaps and high CPU.

++ Also, if a MAC is flapping between two different ports, the control plane has to learn and relearn the MAC, and ARP is refreshed as a result. This can contribute to CPU load if the traffic rate is high.

++ You mentioned high CPU due to Cat4k Mgmt LoPri. Each of the platform-specific processes has a target or expected CPU utilization. When a process is within its target, the CPU executes it in the high-priority context, and the show processes cpu command output displays that utilization under Cat4k Mgmt HiPri. If a process exceeds its target, it runs under the low-priority context, and the show processes cpu command output counts that additional utilization under Cat4k Mgmt LoPri.
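The HiPri/LoPri accounting described above can be sketched as simple arithmetic. This is a simplified model of that description only, not the switch's actual scheduler; the function name and the target/actual percentages are hypothetical and chosen purely for illustration.

```python
def split_mgmt_utilization(actual_pct, target_pct):
    """Split one platform process's CPU share into (HiPri, LoPri) portions.

    Simplified model: utilization up to the target is counted as HiPri,
    and anything beyond the target is counted as LoPri.
    """
    hi_pri = min(actual_pct, target_pct)
    lo_pri = max(0.0, actual_pct - target_pct)
    return hi_pri, lo_pri


# Hypothetical (target, actual) percentages, not taken from any real switch.
examples = {"within target": (5.0, 3.0), "over target": (5.0, 12.0)}

for label, (target, actual) in examples.items():
    hi, lo = split_mgmt_utilization(actual, target)
    print(f"{label}: HiPri={hi:.2f}% LoPri={lo:.2f}%")
```

Under this model, a large Cat4k Mgmt LoPri value suggests that one or more platform processes are running well beyond their targets, which is why the per-process target and actual values are worth comparing.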

If you are still seeing high CPU, please upload the following outputs in a text file so that we can assist with troubleshooting:

- show clock

- show version

- show module

- show logging

- show process cpu sort

- show spanning-tree detail | in executing|changes|from

- show spann det | in ieee|from|occur|is exec

- show platform health | exc 0.00

++ Based on the above outputs, we will decide the next course of troubleshooting.

++ You also need to check the following, as I advised earlier:

If spanning tree is stable, please trace the MAC address that is flapping between the two switch ports. Log in to the devices on those ports and run 'show mac address-table address <mac>' to see where the MAC is being learned from.

Please feel free to ask if you have any questions.

Best Regards,

Abhishek Soni
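For readers who want to apply the reading above to a long output like the one posted earlier, here is a minimal Python sketch. It assumes the three-line-per-VLAN format shown in this thread; the function names are made up for illustration, and the age parsing covers only the common IOS uptime styles (such as 1w1d or hh:mm:ss), so treat it as a starting point rather than a complete parser.

```python
import re

# A few entries copied from the output posted earlier in this thread.
SAMPLE = """\
 VLAN0040 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 32 last change occurred 1w1d ago
          from Port-channel5
 VLAN0099 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 8 last change occurred 17w0d ago
          from Port-channel38
 VLAN0170 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 238 last change occurred 1w0d ago
          from StackPort1
"""

VLAN_RE = re.compile(r"^\s*(VLAN\d+) is executing")
CHANGE_RE = re.compile(
    r"Number of topology changes (\d+) last change occurred (\S+) ago"
)
FROM_RE = re.compile(r"^\s*from (\S+)")


def age_in_days(age):
    """Convert an age such as '1w1d' or '00:12:34' to days (None if unknown)."""
    if re.fullmatch(r"\d+:\d{2}:\d{2}", age):
        return 0.0  # hh:mm:ss means the change happened less than a day ago
    units = {"y": 365, "w": 7, "d": 1, "h": 1 / 24}
    parts = re.findall(r"(\d+)([ywdh])", age)
    if not parts or "".join(n + u for n, u in parts) != age:
        return None
    return sum(int(n) * units[u] for n, u in parts)


def parse_tcn(text):
    """Return one dict per VLAN: vlan, changes, age, days, port."""
    records, current = [], None
    for line in text.splitlines():
        m = VLAN_RE.match(line)
        if m:
            current = {"vlan": m.group(1), "changes": None,
                       "age": None, "days": None, "port": None}
            records.append(current)
            continue
        if current is None:
            continue
        m = CHANGE_RE.search(line)
        if m:
            current["changes"] = int(m.group(1))
            current["age"] = m.group(2)
            current["days"] = age_in_days(m.group(2))
            continue
        m = FROM_RE.match(line)
        if m:
            current["port"] = m.group(1)
    return records


records = parse_tcn(SAMPLE)
# Most recent change first: these are the VLANs and ports to look at first.
recent = sorted((r for r in records if r["days"] is not None),
                key=lambda r: r["days"])
for r in recent:
    print(r["vlan"], r["changes"], r["age"], r["port"])
```

If the most recent entry is days or weeks old, as in this thread, spanning tree is unlikely to be the cause of a CPU problem that is happening right now. A last change measured in seconds or minutes, together with a counter that keeps increasing between runs, would point toward TCNs.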

Hi Abi,

   I understand, but:

1

=====

VLAN0040 is executing the rstp compatible Spanning Tree protocol
 Number of topology changes 32 last change occurred 1w1d ago
          from Port-channel5

What does "from Port-channel5" mean here?

2

=====

VLAN1000 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 122 last change occurred 7w2d ago
          from StackPort1

What does "from StackPort1" mean here?

Are both due to a port channel misconfiguration, with STP re-executing to find the new topology?

My important question is:

When one port in a port channel goes down, how does STP treat that condition? Is it a topology change, given that the port channel is still up by means of its other member ports?

Please correct me if I'm wrong.

Thank you,

Thanks for reaching out, Vasanth. I will try to explain; if you still have any questions, please feel free to ask.

1.

=====

VLAN0040 is executing the rstp compatible Spanning Tree protocol
 Number of topology changes 32 last change occurred 1w1d ago
          from Port-channel5

++ The output above indicates that TCN BPDUs are coming in on Port-channel 5, which should be connected to a switch.

++ Now we need to log in to the switch connected to Port-channel 5 and run the same commands:

show spanning-tree detail | in executing|changes|from
sh spanning-tree detail | i ieee|occur|from

++ This way we can follow the TCNs back to the originating port. A TCN is generated and forwarded to your switch when a port flaps on the neighbor switch.

2

=====

VLAN1000 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 122 last change occurred 7w2d ago
          from StackPort1

What does "from StackPort1" mean here?

++ As I explained above, you need to see where your stack port 1 is connected. If stack port 1 is connected to Switch 2, one of the ports on Switch 2 must be receiving TCNs from somewhere.

A port channel misconfiguration should not cause any STP TCNs unless the whole port channel is flapping.

3.

=====

When one port in a port channel goes down, how does STP treat that condition? Is it a topology change, given that the port channel is still up by means of its other member ports?

++ If you have redundant links between two switches, spanning tree blocks all of the redundant ports except one.

++ A port channel bundles multiple ports into a single logical link, and all of the member ports forward traffic. From the spanning-tree perspective, it is a single port.

++ To answer your question: if one member port of a port channel goes down, STP does not detect a topology change. So you are correct.

In summary, if we see TCNs coming from a port channel, they are coming from the neighbor device, and we need to log in to the neighbor switch and track the TCNs from there.
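To decide which neighbor to log in to first, it can help to tally the "from" lines by port. The following is a minimal Python sketch under the assumption that the output has the same format as the lines posted in this thread; the helper name is hypothetical.

```python
import re
from collections import Counter

# Entries copied from the output posted earlier in this thread.
SAMPLE = """\
 VLAN0040 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 32 last change occurred 1w1d ago
          from Port-channel5
 VLAN0060 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 165 last change occurred 1w1d ago
          from Port-channel5
 VLAN0070 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 135 last change occurred 1w2d ago
          from StackPort1
 VLAN0099 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 8 last change occurred 17w0d ago
          from Port-channel38
"""


def tcn_sources(text):
    """Count, per port, how many VLANs saw their last topology change from it."""
    ports = re.findall(r"^[ \t]*from (\S+)[ \t]*$", text, flags=re.MULTILINE)
    return Counter(ports)


for port, vlan_count in tcn_sources(SAMPLE).most_common():
    print(f"{port}: last TCN source for {vlan_count} VLAN(s)")
```

The port at the top of the list is a reasonable first hop: log in to the device behind it, run the same show commands there, and repeat until the "from" line points at an edge port rather than another switch.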

Hi Jessica,

Thanks for raising this question.

I suggest verifying whether any of the following events are happening on your switch; these are possible causes or symptoms of a high CPU issue.

1. Spanning-tree topology change: When a Layer 2 network device does not receive timely spanning-tree BPDUs on its root port, it considers the Layer 2 path to the root switch to be down and tries to find a new path. Spanning tree then reconverges in the Layer 2 network.

Look for Topology change notifications (TCN):


show spanning-tree detail | in executing|changes|from

2. Routing topology change, such as BGP route flapping or OSPF route flapping.

3. EtherChannel links bounce: When the network device at the other end of the EtherChannel does not receive the protocol packets required to maintain the EtherChannel link, it might bring down the link.

4. The switch fails to respond to normal management requests:

- ICMP ping requests

- SNMP timeouts

- Telnet or SSH sessions that are slow or cannot be started

- UDLD flapping: the switch relies on keepalives from its peer in aggressive mode.

- IP SLA failures due to SLA responses beyond the acceptable threshold.

- DHCP or IEEE 802.1x failures if the switch cannot forward or respond to requests.

- Dropped packets or increased latency for packets routed in software.

5. We can enable MAC-move notification, which helps identify whenever a MAC address or host moves between different switch ports.

This example shows how to enable MAC-move notification:


Switch(config)# mac-address-table notification mac-move

Then check the output of "show logging".

6. Look for HSRP flapping messages.

Based on these findings, we can troubleshoot the cause of the high CPU.

Also collect the output of the commands below while the CPU is high.

show process cpu | ex 0.00
show process cpu sorted | ex 0.00
show process cpu history
show logging

Regards,
Naveen Venkateshaiah.
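When reading the show process cpu output, the first line is worth a close look: in "five seconds: X%/Y%", X is the total CPU utilization and Y is the part spent at interrupt level, so X minus Y is the process-level share. The Python sketch below pulls those numbers out of the header line. The sample line and its values are made up for illustration (they are not from any switch in this thread), and the function name is hypothetical.

```python
import re

HEADER_RE = re.compile(
    r"CPU utilization for five seconds: (\d+)%/(\d+)%; "
    r"one minute: (\d+)%; five minutes: (\d+)%"
)


def parse_cpu_header(line):
    """Parse the first line of 'show processes cpu' output into a dict."""
    m = HEADER_RE.search(line)
    if m is None:
        return None
    total, interrupt, one_min, five_min = (int(g) for g in m.groups())
    return {
        "five_sec_total": total,
        "five_sec_interrupt": interrupt,
        # Whatever is not interrupt-level time was spent in processes.
        "five_sec_process": total - interrupt,
        "one_min": one_min,
        "five_min": five_min,
    }


# Illustrative header line with made-up values.
sample = ("CPU utilization for five seconds: 87%/12%; "
          "one minute: 80%; five minutes: 76%")
print(parse_cpu_header(sample))
```

A high interrupt share usually points at traffic being punted to the CPU, while a high process share points at one of the processes listed below the header, which is where the sorted output helps.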

For TCNs, what command should I use to identify the cause of a topology change?

4. The switch fails to respond to normal management requests:

Hi Naveen,

Can you please explain this 4th point?

Does repeatedly pinging the IP cause the high CPU utilization,

OR

do we get request timeouts when pinging the switch because of the high CPU utilization?

Please correct me.

Thank you,

Hi Vasanth,

Yes. The switch fails to respond when CPU utilization is high, so you may get request timeouts when you ping while the CPU is high.

Look for Topology change notifications (TCN):


show spanning-tree detail | in executing|changes|from
show spann det | in ieee|from|occur|is exec

Regards,

Naveen Venkateshaiah.

Hi Naveen,

 Thanks.

If I ping a physical interface IP (gateway) or a VLAN interface (gateway) from multiple systems, what will happen?

Will all of the systems get a response from the switch?

Could packets be dropped because multiple systems are pinging a single IP?

Thank you,

Hi Vasanth,

++ If you ping an IP address on the switch itself (a physical interface IP or a VLAN interface acting as gateway), the ICMP packets go to the CPU.

++ Ideally, all of the systems should get a response from the switch, unless the switch CPU is busy handling earlier ICMP packets or other control plane traffic.

++ So if your switch has high CPU, you might see ping drops from various systems.

++ Also, if you are sending a lot of ICMP packets (pinging from many different systems), the switch CPU will go up.

I hope that answers your question. Please feel free to reply if you have any further questions.

Thanks,

Abhishek Soni

 

Thank you,

I asked this question because user systems, and also all of the network engineers' systems, ping the gateway continuously, all the time.

From your answer, I understand that this is not good practice for a network.

So what traffic and what tasks are processed by the CPU?

And which major processes cause the most CPU utilization?

Knowing this will help troubleshoot cases like this.

 
