2502 Views, 1 Helpful, 30 Replies

Need help in increasing CPU utilization for Cisco Switch

mahazz12
Level 1

Hello,

I have a Cisco switch (Cisco Catalyst C9300) and want to increase its CPU utilization. I am sending heavy traffic of many different types through the switch with TRex, but the CPU utilization is stuck at 1%. No matter what I try, it does not increase.

I also tried disabling STP, which increased CPU utilization to about 40%, but every time it cut the SNMP connection between my monitoring tool and the switch, so I could no longer monitor the switch (for throughput etc.).
 
I am new to networking and want to conduct experiments, so I want to hold the CPU utilization at 0%, 50% and 100% (or, put differently, vary it up to the maximum) in order to carry out the tests. Can someone please help me work out how to increase the CPU utilization?
 
TIA.
 
30 Replies

Ramblin Tech
Spotlight

Switches/routers that forward data-plane traffic in hardware (NPU) will not have their CPU utilization materially increased by increasing the amount of transit traffic that they forward, if that transit traffic stays in the NPU. If, for your experimental purposes, you want to increase CPU utilization, then I see the basic approach as giving the CPU (not NPU) more work to do. A few ideas come to mind here:

 - If the Cat 9300 supports process switching, then configure TRex to generate traffic that must be punted out of the NPU and up into the CPU for forwarding. This might include using option fields in the IP header that are not handled in the NPU. I have not worked closely with XE in some time so I am not sure if this is even possible anymore (I worked mostly with XR in the recent past, where process switching does not exist and any transit traffic not handled in the NPU is dropped). But, if the 9300 supports process switching and every single packet you send on every interface must be process switched, you might run out of CPU quickly.

 - Generate lots of control-plane traffic and disable any control-plane policing features. If you flood the interfaces with L3 and L2 control-plane protocol traffic then the CPU will have to respond to a constant barrage of messages saying that peers or topologies have changed.

 - Generate lots of management-plane traffic. Walking SNMP MIB structures was a notorious CPU hog in the past, so do that. Also turn on lots of debug and logging messages (send them to a file or syslog server, else your console will get overwhelmed).

 - Something as simple as sending the Cat 9300 ARP messages at line rate on multiple interfaces might do the trick, as long as control-plane policing is disabled.
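
To make the ARP idea concrete, here is a minimal Python sketch that only builds the bytes of one broadcast ARP request (Ethernet header plus the 28-byte ARP body). The MAC and IP addresses are made-up examples, and the function names are hypothetical. It does not send anything; feeding the frame to a traffic generator or a raw socket at a useful rate is left to whatever tool you use, and whether it actually loads the 9300's CPU depends on control-plane policing, as noted above.

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6
ETHERTYPE_ARP = 0x0806


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    parts = mac.split(":")
    if len(parts) != 6:
        raise ValueError(f"bad MAC address: {mac!r}")
    return bytes(int(part, 16) for part in parts)


def build_arp_request(src_mac: str, src_ip: str, target_ip: str) -> bytes:
    """Return a 42-byte Ethernet frame carrying an ARP 'who-has' request."""
    sender_mac = mac_to_bytes(src_mac)
    eth_header = BROADCAST_MAC + sender_mac + struct.pack("!H", ETHERTYPE_ARP)
    arp_body = struct.pack(
        "!HHBBH6s4s6s4s",
        1,                           # hardware type: Ethernet
        0x0800,                      # protocol type: IPv4
        6,                           # hardware address length
        4,                           # protocol address length
        1,                           # operation: request
        sender_mac,
        socket.inet_aton(src_ip),
        b"\x00" * 6,                 # target MAC is unknown in a request
        socket.inet_aton(target_ip),
    )
    return eth_header + arp_body


# Example addresses only (locally administered MAC, documentation IP range).
frame = build_arp_request("02:00:00:00:00:01", "192.0.2.10", "192.0.2.1")
print(len(frame), frame.hex())
```

Varying the target IP per frame would make each request distinct, which is closer to the "constant barrage" described in the list.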

Disclaimer: I am long in CSCO

"If the Cat 9300 supports process switching"

Possibly via disabling CEF? But on a switch, CEF might only be used for the first packet. I.e., non-CEF traffic might still be forwarded in hardware.

Also, BTW, way back when, on Brand X, after doing some post Code Red analysis, I wrote a script to send UDP packets, where I just kept incrementing the destination IP for each packet.  That hardware was 1st packet flow based (similar to a 6500 sup1, I believe) and it only took about 3 Mbps of such packets to "crush" a multi-gig capable L3 switch (Brand X equivalent of a 6500).
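
That original script is not available, so the following is only a rough Python sketch of the same idea: send one small UDP packet to each of a run of consecutive destination addresses. The function names, the starting address, the port, and the payload are all assumptions for illustration. Whether this has any effect on a Cat 9300 is unknown; the effect described above was on first-packet, flow-based hardware.

```python
import ipaddress
import socket
from typing import Iterator


def incrementing_destinations(start: str, count: int) -> Iterator[str]:
    """Yield `count` consecutive IPv4 addresses beginning at `start`."""
    first = ipaddress.IPv4Address(start)
    for offset in range(count):
        yield str(first + offset)


def send_sweep(start: str, count: int, port: int = 9, payload: bytes = b"x") -> int:
    """Send one UDP datagram to each destination; return how many were sent.

    The host's routing table must point these destinations at the switch
    under test for the packets to reach it.
    """
    sent = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for dst in incrementing_destinations(start, count):
            sock.sendto(payload, (dst, port))
            sent += 1
    return sent


# Preview the first few destinations without sending anything.
print(list(incrementing_destinations("10.0.0.254", 3)))
```

Only run `send_sweep` against a lab network you own; it is deliberately not called here.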

Thank you for your response. Can you please suggest some ways to generate control-plane traffic on the switch?

Joseph W. Doherty
Hall of Fame

As @Ramblin Tech also explained, on a typical L3 switch, all "normal" frame/packet forwarding (the data plane) is done on dedicated hardware, usually referred to as application-specific integrated circuit (ASIC) chips.  That's why you don't see the CPU usage rise.

To increase CPU utilization, as the CPU is used for the control plane, you need to make those functions "busy" to drive the CPU usage up. Jim provides a list of things to do that usually will impact control-plane functions, which will drive up CPU utilization.

Something to understand, which may very much affect whatever testing you have in mind: control-plane feature processes are usually prioritized and/or proportionally scheduled for execution. This means that even if CPU utilization is 100%, some control-plane functions might barely be impacted, while others are unable to do what they need to do when they need to do it.

If you're not used to thinking about proportional usage, consider two flows that want to use a link. Either flow wants, and can use, the link's whole bandwidth. If either is the only flow using the link, the link will show 100% utilization (and the flow is "happy"). However, if you send both flows concurrently, the link will still show 100% utilization, but halving the bandwidth available to each flow might be adverse to both. (I.e., if you were testing either flow's performance, it would likely be degraded compared to when it was the only flow using the link.)

Now, if we absolutely prioritize one flow, the link will again show 100% utilization. But now, depending on which flow we're monitoring, one will show it's getting the full link bandwidth, just as before, while the other isn't obtaining any bandwidth!

So, again, with two flows and 100% utilization, the effective impact on a particular flow (or on a CPU function's processes) may be very different depending on what you're monitoring (or testing).
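
The two-flow example can be worked through numerically. This is a small Python sketch with made-up numbers (a link of capacity 100 and two flows that each want 100); the function names are illustrative and do not describe how the switch's scheduler is actually implemented.

```python
def equal_share(capacity, demands):
    """Split capacity evenly among flows, handing back what a flow cannot use."""
    alloc = [0.0] * len(demands)
    remaining = float(capacity)
    active = [i for i, demand in enumerate(demands) if demand > 0]
    while active and remaining > 1e-9:
        share = remaining / len(active)
        still_hungry = []
        for i in active:
            give = min(share, demands[i] - alloc[i])
            alloc[i] += give
            remaining -= give
            if alloc[i] < demands[i] - 1e-9:
                still_hungry.append(i)
        if len(still_hungry) == len(active):
            break  # every flow took its full share, so nothing is left over
        active = still_hungry
    return alloc


def strict_priority(capacity, demands):
    """Serve flows in list order; a later flow only gets what is left."""
    alloc = []
    remaining = capacity
    for demand in demands:
        give = min(demand, remaining)
        alloc.append(give)
        remaining -= give
    return alloc


demands = [100, 100]
print("equal share:    ", equal_share(100, demands))      # [50.0, 50.0]
print("strict priority:", strict_priority(100, demands))  # [100, 0]
```

In both cases the allocations sum to 100, i.e. the link (or CPU) reads 100% utilized, yet what an individual flow (or process) receives differs completely.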

Again, on a switch, CPU utilization generally has no impact on data transfer performance. However, if something like a dynamic routing process gets insufficient CPU, the device may no longer "know" where to direct packets, and the impact of that is usually quite noticeable.

I mention all the above because whatever testing you have in mind might be unproductive.

Thank you for your response!

Basically, I'm aiming to observe variations in the readings of the CPU's temperature sensor and of the internal temperature sensors located near the inlet and outlet fans of the switch. I want to monitor these temperature changes, and I believe increasing the CPU utilization should help me achieve this.

Currently, when I increase the traffic, the temperature sensors show only a slight variation of about one degree. However, my expectation is that as the CPU works more intensively, we should observe more significant fluctuations in temperature, which is why I'm looking for ways to increase the CPU utilization of the switch.

As far as I am aware, the fans operate in two modes:  ON or OFF. 

Okay, understood.

I suspect even running the CPU at 100% won't accomplish that.

One reason: on a switch, the CPU is "small", since it does (relatively) little work. I.e., it might be incapable, alone, of generating enough heat to affect the temperature in the switch with its fans operating normally.

So is there any way to increase the CPU utilisation without impacting the connection with SNMP? Since my monitoring tool (Zabbix) is using SNMP to read statistics from the switch.

That I don't know.

In theory, a lowest-priority process could consume spare CPU cycles up to whatever level of CPU consumption is desired, but what process is that? I don't know. Even if we knew, how would you get it to run at the CPU level desired?

For a user of the product, what you want to accomplish might be nearly impossible. Even if you accomplish what you want, it's unclear there's much value.
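
Purely to illustrate what such a "consume CPU up to a target level" process looks like in general, here is a minimal duty-cycle sketch in Python: spin for a fraction of each short period and sleep for the rest. It assumes you have somewhere to run user code at all (for example a container or guest shell, if the platform offers one; this thread does not confirm that for the 9300). It also loads only one core, so on a multi-core CPU the overall percentage would be lower than the target, and the OS scheduler decides the real share.

```python
import time


def duty_cycle_split(target_fraction: float, period_s: float = 0.1):
    """Return (busy_seconds, idle_seconds) per period for a target load in 0.0-1.0."""
    if not 0.0 <= target_fraction <= 1.0:
        raise ValueError("target_fraction must be between 0.0 and 1.0")
    busy_s = target_fraction * period_s
    return busy_s, period_s - busy_s


def burn(target_fraction: float, duration_s: float, period_s: float = 0.1) -> int:
    """Approximate `target_fraction` load on one core for about `duration_s` seconds.

    Returns the number of busy/idle cycles that were completed.
    """
    busy_s, idle_s = duty_cycle_split(target_fraction, period_s)
    deadline = time.monotonic() + duration_s
    cycles = 0
    while time.monotonic() < deadline:
        spin_until = time.monotonic() + busy_s
        while time.monotonic() < spin_until:
            pass  # busy-wait: this is the part that consumes CPU
        if idle_s > 0:
            time.sleep(idle_s)  # yield the CPU for the rest of the period
        cycles += 1
    return cycles


# Roughly 50% of one core for two seconds.
burn(0.5, 2.0)
```

Because the loop sleeps between bursts, other processes (such as an SNMP agent) still get scheduled, which is the property being asked for; it is not a substitute for a real low-priority system process.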

For example, when I execute the debug all command on the switch, its CPU utilisation increases to 12%, so I just wanted to know whether there are any other commands or functionalities that might increase the CPU utilisation?

Leo Laohoo
Hall of Fame

Woooooooooooooooo ... 

You want to blow the CPU and memory of the switch, do you? Which one, the control plane or the data plane?

Start doing this: 

1.  Get two switches and stack them.  
2.  Load them with, say, 16.12.1.
3.  Loosen the stacking cable so that the stacking module will start to flap.
4.  Watch the CPU using the following commands: 

  • Data Plane:  sh platform resources
  • Control Plane:  sh platform software status con brief

Alternatively, use the hidden command "test crash".  This command can only be used through a console cable.  

I want to increase the CPU utilisation in steady steps, like 0%, 50% and then near 100%. Meanwhile, I will be passing different types of traffic (i.e. UDP, IMIX, HTTP) at different utilisation levels through the TRex traffic generator to the Catalyst C9300 switch. This switch is connected to the Zabbix monitoring tool through SNMP, so I also don't want to break the SNMP connection. I tried creating a loop and disabling STP on the switch, but that broke my SNMP connection and I couldn't get any statistics in Zabbix after that.

Will this solution not break my SNMP connection?

@mahazz12 wrote:
Will this solution not break my SNMP connection?

Oh for sure! I guarantee it will. I even have a picture of what a "runaway switch" may look like:

[Image: 3850 stack, 16.12.6]

The memory spike is due to a fault with the stacking cable. Around or before 4pm, the stack could be pinged, but console and remote access were no longer possible. This is why we termed it a "runaway switch".

Between 4pm and 17:35, SNMP stopped working. I personally cold-rebooted the entire stack at around 17:45.

 

Exactly the same scenario is happening to me as well. Is there any solution that can increase the CPU utilisation without choking the CPU, while also maintaining the SNMP connection so I can monitor the switch through Zabbix?