Souvik Ghosh is a customer support engineer at the Cisco Technical Assistance Center in Bangalore, India. He has three and a half years of experience in LAN switching technologies. LAN switching products such as the Cisco Catalyst 6500, 4500, 3750, and 2960 Series Switches are his areas of expertise. He has been involved in various escalation requests from India, Singapore, and Australia and is currently working as a technical lead for the LAN switching team in Bangalore, India. He holds CCNP and CCIP certifications.
The following experts helped Souvik answer a few of the questions asked during the session: Amit Singh, Akshay Balaganur, and Ranganatha Raju. Amit, Akshay, and Ranganatha are support engineers with extensive knowledge of 6500-related topics.
The Complete Recording of this live Webcast can be accessed here.
A. This depends entirely on the kind of broadcast. We can implement storm control on access layer switches to protect the CPU in case of a broadcast storm. For ARP, there are two options. The first is the "mls qos protocol arp" command, which rate limits ARP packets in hardware. The second is to enable "ip dhcp snooping" along with ARP inspection and configure a rate limit for ARP inspection. The drawback of both is that they are system wide, so normal broadcast packets are also dropped. As a result, storm control is the best option.
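As an illustration of the options above, here is a minimal sketch. The interface name, the 5% threshold, and the 32000 bps policing rate are assumptions chosen for the example, not recommended values, and the exact syntax can vary by platform and IOS release.

```
! Access layer switch (for example a 2960 or 3750): limit broadcast on one port.
! GigabitEthernet1/0/1 and the 5.00 percent level are hypothetical.
interface GigabitEthernet1/0/1
 storm-control broadcast level 5.00
 storm-control action trap
!
! Catalyst 6500: police ARP packets in hardware (system wide).
! 32000 bps is an illustrative rate only.
mls qos protocol arp police 32000
```

The storm-control lines act per port, which is why they do not carry the system-wide drawback mentioned for the ARP options.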
A. A constant 60% CPU usage is not normal. However, an intermittent spike might occur and be normal. It also depends on whether you are using non-modular IOS (mz image) or modular IOS (vz image).
A. In addition to COPP, we have the option of hardware-based limiters on the CPU.
A. Refer to Protection against Denial of Services, which contains details to avoid DDoS attacks.
A. We can log in remotely to SP, or use the command “remote command switch show process cpu” and “remote command switch show process cpu history”.
A. That depends on whether it is a Layer 3 port or a Layer 2 port. For a Layer 3 port, the input queue is a software queue. For a Layer 2 port, the input queue is a hardware queue.
A. The CPU on the line card is used for communicating with the Supervisor. This CPU will not be used for maintaining the control plane protocols and will only be used for responding to diagnostic requests from the Supervisor. You can use "attach <module no>" command to log on to the module and check the cpu utilization, using the "show process cpu" command.
A. It depends on the kind of network with which we are dealing. The TTL becomes 1 when a loop in the network causes the TTL to keep decrementing. This should not be treated as an issue with high CPU itself; rather, high CPU is a symptom of some other issue, such as a loop in the network. It can also be caused by a bug within an application that causes the TTL to go down to a value of 1.
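If TTL-expired packets are reaching the CPU while the underlying loop or application issue is being investigated, one possible mitigation on the Catalyst 6500 is the hardware rate limiter for TTL failures. This is a sketch under assumptions: the values of 100 packets per second and a burst of 10 are illustrative only.

```
! Global configuration: rate-limit packets punted because of TTL failure.
! 100 pps with a burst of 10 is a hypothetical starting point.
mls rate-limit all ttl-failure 100 10
!
! Exec mode: list the hardware rate limiters and their state.
show mls rate-limit
```

This only protects the CPU; it does not remove the loop or fix the application that is generating the packets.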
A. Yes, the "show platform hardware capacity" command will show the utilization.
A. If it is Virtual Exec, that process is used for servicing vty lines. Vty lines are used for logging into the switch. If we are trying to dump a huge output like "show tech", it is expected to see high CPU, and this is not a matter for concern.
A. High CPU utilization under the IOS-BASE process is for Supervisors running modular IOS. First, find out the PID of the IOS-BASE process from the output of the "show process cpu" command. Following that use the command "show process cpu detail <PID of the IOS-BASE process>" and find out which sub-process under the IOS-BASE process is consuming most of the CPU cycles.
A. Seeing a very small CPU spike while running “show tech” is expected as it is caused by “SSH Process”. It is not a result of IOS.
A. These are two different things. Hardware rate limiters are available for specific features. If they are not available for the feature we are looking for, we need to fall back to COPP. Hardware rate limiters are the best option because of where packets are dropped (right at the PFC). In COPP, it is done in software, so the packet needs to go from the PFC to software and is dropped in between.
A. Ideally, when the network protocols have converged, it should be 0% or, at a maximum, 10%.
A. Yes and no. Depending on what caused the CPU utilization, we need to check whether an upgrade will help or not.
A. Yes, we have a CPU on line cards running DFCs. However, it is not used for maintaining control plane protocols; it is used for diagnostic purposes. This CPU is used for responding to the Supervisor when the Supervisor polls the line card. You might see CPU utilization there, but it has to do with internal management of the line card and system, not network traffic. You should get in touch with TAC in such instances, as this can be a bug as well.
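To make the contrast concrete, here is a minimal sketch of both approaches. The ACL, class, and policy names, the ICMP match, and all rates are hypothetical and only illustrate the shape of the configuration.

```
! Option 1: hardware rate limiter for a specific feature
! (ICMP unreachables for packets with no route; 100 pps, burst 10 are illustrative).
mls rate-limit unicast ip icmp unreachable no-route 100 10
!
! Option 2: COPP for traffic that has no dedicated hardware rate limiter.
! COPP-ICMP and COPP-POLICY are hypothetical names.
mls qos
ip access-list extended COPP-ICMP
 permit icmp any any
class-map match-all COPP-ICMP
 match access-group name COPP-ICMP
policy-map COPP-POLICY
 class COPP-ICMP
  police 32000 1500 1500 conform-action transmit exceed-action drop
control-plane
 service-policy input COPP-POLICY
```

A COPP policy should be built from the traffic actually observed on the switch, so the class and rate above are placeholders rather than a template to copy.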
A. If there is CPU utilization due to process switching, you should see some process associated with it. If you do not see a process, it could be a bug in the IOS code.
A. Yes, we can make it zero.
A. From version 12.2(18)SXF onwards, you do not see any impact on the switch by enabling Netdr capture. On older versions of code, it is just like enabling any other debug, so there would be some impact on the CPU.
A. CPU rising threshold monitors both RP and SP. If you need to monitor only SP CPU, then this EEM script is helpful:
event manager applet POLL-SP-CP
event snmp oid 1.3.6.220.127.116.11.18.104.22.168.22.214.171.124 get-type exact entry-op gt entry-val "60" poll-interval 10
action 1.1 syslog priority notifications msg "CPU utilization >60% on SP processor, logging data"
action 2.1 cli command "enable"
action 2.2 cli command "rem comm swi show proc cpu sort | redirect bootflash:SP-CPU"
action 2.3 cli command "end"
A. Yes, it is the same for 7600 platform.
A. First analyze the netdr captured outputs in order to understand why these UDP packets are hitting the CPU. If these packets are to be dropped, use an ACL.
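As a sketch of that last step, assuming the Netdr output showed unwanted UDP packets to a single destination port arriving on one VLAN interface, an ACL along these lines could drop them. The ACL name, UDP port 9999, and interface Vlan10 are hypothetical and should be replaced with what the capture actually shows.

```
! Hypothetical ACL: drop the offending UDP traffic, permit everything else.
ip access-list extended BLOCK-UNWANTED-UDP
 deny udp any any eq 9999
 permit ip any any
!
! Apply inbound on the interface where the packets arrive.
interface Vlan10
 ip access-group BLOCK-UNWANTED-UDP in
```

The trailing "permit ip any any" matters because of the implicit deny at the end of every ACL.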
A. It is safe to use this command as it just takes a dump of packets in the buffer.
A. Yes, we can use ELAM to find what traffic is going to CPU. Note that this is an internal tool, so you should have TAC assistance to run this tool.
A. Yes. The IP Input process handles the interrupt/process switched traffic, meaning that the traffic is handled by CPU. As a result, we need to do a packet capture to see what traffic is being processed by CPU.
A. Yes, you can use RP/SP inband span on 7600 as well. There are no limitations.
A. No, we do not have any recommended COPP configuration. It totally depends on your network and traffic pattern.
A. In this case, only one Supervisor will be in the active state. As a result, you will not see high CPU on the redundant Supervisor. You will see high CPU on the active one only, so that is what you need to troubleshoot using the tools we discussed.
A. No. Refer to the Feature Navigator tool in order to check which version supports it. After that version, there is a specific driver used for EEM functionality. As a result, we should not see any high CPU due to EEM running in background.
A. Yes, but depending on IOS version running, the commands can differ.
A. Netdr can always be used when you are trying to capture packets that are bound to or from the CPU. If you suspect that the CPU is not sending certain traffic, you can configure Netdr in the "Tx" direction and check the output of the "show netdr capture" command. The other options available with Netdr are used for filtering the traffic captured by the "debug netdr capture" command.
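A typical capture sequence might look like the following sketch. It assumes the full form of the show command is "show netdr captured-packets"; the exact keywords and available filters can differ between IOS releases.

```
! Exec mode on the Supervisor: capture packets the CPU transmits.
debug netdr capture tx
!
! Display the packets currently held in the capture buffer.
show netdr captured-packets
!
! Stop debugging and clear the capture buffer when finished.
undebug all
debug netdr clear-capture
```

Replacing "tx" with "rx" captures packets bound for the CPU instead, which is the usual direction when investigating high CPU.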
A. It will be Tx direction.