09-11-2015 03:13 PM - edited 03-08-2019 01:44 AM
This session will provide an opportunity to learn and ask questions about Cisco Catalyst Switches IOS architecture, and how to troubleshoot any unexpected reboots and other errors on switches.
Ask questions from Monday, October 5 to Friday, October 16, 2015
Featured Experts
Ivan Shirshin is a customer support engineer in High-Touch Technical Services (HTTS). He is an expert on Routing, LAN Switching and Data Center products. His areas of expertise include Cisco Catalyst 2000, 3000, 4000, 6500, Cisco Nexus 7000, ISRs, as well as Cisco routers ASR1000, 7600, 10000 and XR platforms. He has over 7 years of industry experience working with large Enterprise and Service Provider networks. Shirshin holds a CCNA, CCNP, CCDP, and CCIE (# 43481) in routing and switching, as well as XR specialist certifications.
Naveen Venkateshaiah is a customer support engineer in High-Touch Technical Services (HTTS). He is an expert on Routing, LAN Switching and Data Center products. His areas of expertise include Cisco Catalyst 3000, 4000, 6500, and Cisco Nexus 7000. He has over 7 years of industry experience working with large enterprise and Service Provider networks. Venkateshaiah holds a CCNA, CCNP, and CCDP-ARCH, AWLANFE, LCSAWLAN Certification. He is currently working to obtain a CCIE in routing and switching.
Find other events at https://supportforums.cisco.com/expert-corner/events.
** Ratings Encourage Participation! **
Please be sure to rate the Answers to Questions
10-09-2015 11:26 PM
Hi Semaj,
We need to check for a crashinfo file on the switch to find the cause of the reload. If there is no crashinfo file, check the output of show stacks, which reports the stack usage of processes and interrupt routines. If there is no error and no crash information on this switch, the reload reason reported by show version is the next thing to check.
From the show version command we can see that the reason for the last reload was "power-on", which means that something is wrong either with the power source or with the cables, and that caused this reload.
The switch might have reloaded due to a power fluctuation. Look for logs stored on the syslog server around the time of the reload.
Normally, to find the cause of a crash we need to check the show tech output from the switch and also the generated crashinfo file, which is stored in the switch flash memory.
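The show version check described above can be scripted when many switches need to be reviewed. The following is a minimal Python sketch, assuming the output contains a line such as "System returned to ROM by power-on" or "Last reload reason: ..." (the exact wording varies by platform and release, so both patterns are assumptions). The function names are hypothetical.

```python
import re

# Assumed line formats; adjust them to what your platform actually prints.
RELOAD_PATTERNS = [
    re.compile(r"System returned to ROM by (.+)"),
    re.compile(r"Last reload reason:\s*(.+)"),
]


def reload_reason(show_version_text):
    """Return the first reload reason found in 'show version' output, or None."""
    for line in show_version_text.splitlines():
        for pattern in RELOAD_PATTERNS:
            match = pattern.search(line)
            if match:
                return match.group(1).strip()
    return None


def looks_like_power_event(reason):
    """True when the reason mentions power, e.g. 'power-on'."""
    return reason is not None and "power" in reason.lower()


# Hypothetical two-line excerpt used only for illustration.
sample = "Switch uptime is 2 days, 3 hours\nSystem returned to ROM by power-on\n"
reason = reload_reason(sample)
print(reason, looks_like_power_event(reason))
```

A reason that mentions power points toward the power source or cabling rather than a software crash, so the crashinfo file is not expected to exist in that case.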
Let me know if you have any further doubt.
Regards,
Naveen Venkateshaiah.
10-09-2015 10:43 AM
Hello Experts,
We encountered a high cpu usage on our 6500 switch. First we are getting an error log message of
*Sep 6 16:45:08.947 PST: %EARL_NETFLOW-SP-4-TCAM_THRLD: Netflow TCAM threshold exceeded, TCAM Utilization [92%]
*Sep 6 16:48:11.685 PST: %EARL_NETFLOW-SP-4-TCAM_THRLD: Netflow TCAM threshold exceeded, TCAM Utilization [92%]
Saw consistent overutilization on the TCAM netflow table
Earl in Module 5
Summary of Netflow CAM Utilization (as a percentage)
====================================================
TCAM Utilization : 86%
After changing the packet-based sampling rate to 4096, the TCAM utilization dropped to 26%
Summary of Netflow CAM Utilization (as a percentage)
====================================================
TCAM Utilization : 26%
Cisco TAC is still seeing some traffic coming from VLAN 56 that is possibly causing the CPU spikes
------- dump of incoming inband packet -------
interface Vl56, routine draco2_process_rx_packet_inline
dbus info: src_vlan 0x38(56), src_indx 0x4A(74), len 0x56(86)
bpdu 0, index_dir 0, flood 0, dont_lrn 0, dest_indx 0x380(896)
38020400 00380000 004A0000 56000000 002F0438 00000400 00000000 0380A9D1
mistral hdr: req_token 0x0(0), src_index 0x4A(74), rx_offset 0x76(118)
requeue 0, obl_pkt 0, vlan 0x38(56)
destmac 00.19.A9.9D.FA.C0, srcmac 00.D0.83.05.A0.73, protocol 0800
protocol ip: version 0x04, hlen 0x05, tos 0x00, totlen 68, identifier 13053 df 0, mf 0, fo 0, ttl 30, src 192.168.56.195, dst 10.0.0.1, proto 47
Based on the findings gathered by Cisco TAC, it is best practice to configure the WCCP assignment method as MASK: with HASH assignment, the CPU can reach up to 90% if the switch receives more than 750 CPS (connections per second). Upon checking, the WCCP 2 assignment status is “HASH” and the WCCP 1 assignment status is “MASK”.
any opinions?
10-10-2015 01:42 AM
Hello,
I will answer the issues you listed separately below.
First issue is the Netflow TCAM notifications occurring repeatedly.
The messages "%EARL_NETFLOW-4-TCAM_THRLD: Netflow TCAM threshold exceeded, TCAM Utilization" indicate that the NetFlow ternary content addressable memory (TCAM) is almost full. The Supervisor Engine 720 checks how full the NetFlow table is every 30 seconds. The Supervisor Engine turns on aggressive aging when the table size reaches 90 percent.
The idea behind aggressive aging is that the table is nearly full, so there are new active flows that cannot be created. Therefore, it makes sense to aggressively age-out the less active flows (or inactive flows) in the table in order to make space for more active flows.
The capacity for each policy feature card (PFC) NetFlow table (IPv4), for PFC3A and PFC3B, is 128,000 flows. For the PFC3BXL, the capacity is 256,000 flows.
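The relationship between flow count, table capacity and the aggressive aging threshold can be worked through numerically. This is a small Python sketch that uses only the figures quoted above (128,000 and 256,000 flows, 90 percent); the function names are hypothetical and the code does not query a switch.

```python
# IPv4 NetFlow table capacities as quoted in the text above.
NETFLOW_CAPACITY = {"PFC3A": 128_000, "PFC3B": 128_000, "PFC3BXL": 256_000}

# Utilization (percent) at which aggressive aging is turned on, per the text.
AGGRESSIVE_AGING_THRESHOLD = 90


def tcam_utilization(flows, pfc):
    """Return NetFlow table utilization as a percentage."""
    return 100.0 * flows / NETFLOW_CAPACITY[pfc]


def aggressive_aging_expected(flows, pfc):
    """True when utilization has reached the aggressive aging threshold."""
    return tcam_utilization(flows, pfc) >= AGGRESSIVE_AGING_THRESHOLD


# 117,760 flows on a PFC3B corresponds to the 92% seen in the log message.
print(tcam_utilization(117_760, "PFC3B"))
print(aggressive_aging_expected(117_760, "PFC3B"))
```

The same flow count on a PFC3BXL would sit at 46 percent, which is why the table size of the installed PFC matters when reading the utilization figure.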
This issue with the NetFlow TCAM may happen when you set the NetFlow mask to "full" mode or when there are too many flows: the NetFlow TCAM can overflow because there are so many entries. WCCP also uses NetFlow resources in its operation. You can use the "show mls netflow ip count" command to check the number of NetFlow entries. Another way to reduce the number of entries is to change the sampling, which you did by changing the packet-based sampling rate to 4096.
Note that TCAM for packet forwarding and TCAM for NetFlow accounting are separate, so there is no impact to packet forwarding because of this issue.
Second issue is related to CPU spikes on the switch. Looking into the packet dump you collected, it seems that the packet is sent to the CPU for processing: dest_indx 0x380 indicates a unicast packet punted to the CPU port (15/1).
------- dump of incoming inband packet -------
interface Vl56, routine draco2_process_rx_packet_inline
dbus info: src_vlan 0x38(56), src_indx 0x4A(74), len 0x56(86)
bpdu 0, index_dir 0, flood 0, dont_lrn 0, dest_indx 0x380(896)
38020400 00380000 004A0000 56000000 002F0438 00000400 00000000 0380A9D1
mistral hdr: req_token 0x0(0), src_index 0x4A(74), rx_offset 0x76(118)
requeue 0, obl_pkt 0, vlan 0x38(56)
destmac 00.19.A9.9D.FA.C0, srcmac 00.D0.83.05.A0.73, protocol 0800
protocol ip: version 0x04, hlen 0x05, tos 0x00, totlen 68, identifier 13053 df 0, mf 0, fo 0, ttl 30, src 192.168.56.195, dst 10.0.0.1, proto 47
These packets likely caused high CPU utilization due to interrupts.
You mentioned that this traffic source was related to WCCP. Also, protocol type shows 47 - which is GRE.
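The fields referred to above can be pulled out of the dump programmatically when many captures have to be compared. This is a minimal Python sketch that parses two lines copied from the dump in this thread; the regular expressions assume that exact layout, and the function names are hypothetical.

```python
import re

# A few well-known IP protocol numbers; 47 is GRE.
IP_PROTOCOL_NAMES = {1: "ICMP", 6: "TCP", 17: "UDP", 47: "GRE"}

# Lines copied from the inband packet dump above.
SAMPLE_DBUS_LINE = "bpdu 0, index_dir 0, flood 0, dont_lrn 0, dest_indx 0x380(896)"
SAMPLE_IP_LINE = (
    "protocol ip: version 0x04, hlen 0x05, tos 0x00, totlen 68, "
    "identifier 13053 df 0, mf 0, fo 0, ttl 30, "
    "src 192.168.56.195, dst 10.0.0.1, proto 47"
)


def parse_dest_index(line):
    """Return dest_indx as an integer, or None if the field is absent."""
    match = re.search(r"dest_indx (0x[0-9A-Fa-f]+)", line)
    return int(match.group(1), 16) if match else None


def parse_ip_line(line):
    """Return source, destination and protocol from the 'protocol ip' line."""
    match = re.search(r"src ([\d.]+), dst ([\d.]+), proto (\d+)", line)
    if match is None:
        return None
    proto = int(match.group(3))
    return {
        "src": match.group(1),
        "dst": match.group(2),
        "proto": proto,
        "proto_name": IP_PROTOCOL_NAMES.get(proto, "unknown"),
    }


print(hex(parse_dest_index(SAMPLE_DBUS_LINE)))
print(parse_ip_line(SAMPLE_IP_LINE))
```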
The reason this causes high CPU is indeed that there is HASH assignment instead of MASK. Using HASH is not recommended on Catalyst 6500 switches.
The assignment method in WCCP determines how traffic will be distributed among multiple WCCP clients in a given service group. There are two assignment methods available, hash-based and mask-based. The assignment method chosen for a given service-group is negotiated between the router and the WCCP clients.
The negotiation of the assignment method is performed between the router and the clients via the WCCPv2 ISU and WCCPv2 HIA messages, respectively. The Cisco Catalyst 6500 supports both the hash-based and mask-based assignment methods and will advertise these capabilities in its ISU messages. The WCCP client must be configured for mask-based assignment and then implicitly choose the mask-based assignment method by first observing the supported method in the router's ISU message and then advertising mask-based assignment in its subsequent HIA messages.
The hash-based assignment method is the default and will be chosen unless the client is configured to support the mask-based assignment method.
1. MASK assignment:
The combination of an ingress traffic intercept method with mask-based assignment provides a full hardware-based traffic assignment method. This means that CPU resources are not used with this type of assignment, so there would not be CPU spikes.
Traffic is filtered for WCCP redirection using an Access Control List. The WCCP mask value is then applied to the redirect ACL to create entries in the Cisco Catalyst 6500 ACL TCAM. The TCAM entries are used to provide hardware accelerated lookups and to derive a specific WCCP client which will service the traffic flow. In this way the forwarding path is performed completely in the Cisco Catalyst 6500 hardware resources.
2. HASH assignment:
The hash-based assignment method is supported but not recommended on the Cisco Catalyst 6500. A hash-based assignment method will utilize a combination of software and hardware forwarding resources. Traffic flows will need to be forwarded via software initially while also setting up flow entries using the Cisco Catalyst 6500 Netflow resources. This approach is certainly viable for some deployments but is not the best practice solution for the Cisco Catalyst 6500.
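To illustrate why mask-based assignment lends itself to hardware lookups, here is a simplified Python sketch. It is not the WCCP implementation: the mask value 0x3, the client names and the round-robin mapping of buckets to clients are all hypothetical. The point it shows is that a mask with n set bits yields only 2^n possible results, so every result can be enumerated in advance (much like pre-programmed TCAM entries) instead of being computed per flow in software.

```python
import ipaddress


def mask_values(mask):
    """Enumerate every value that 'address & mask' can produce, sorted."""
    values = []
    value = mask
    while True:
        values.append(value)
        if value == 0:
            break
        # Step to the next smaller value that only uses bits set in the mask.
        value = (value - 1) & mask
    return sorted(values)


def assign_client(src_ip, mask, clients):
    """Pick a client for a source IP using a precomputed bucket list."""
    buckets = mask_values(mask)
    bucket = buckets.index(int(ipaddress.IPv4Address(src_ip)) & mask)
    # Hypothetical policy: spread buckets over clients round-robin.
    return clients[bucket % len(clients)]


clients = ["client-a", "client-b"]  # hypothetical WCCP client names
print(mask_values(0x3))
print(assign_client("192.168.56.195", 0x3, clients))
```

Because the bucket list is fixed once the mask is known, the per-packet decision reduces to a table lookup, which is the property the mask-based method relies on.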
10-12-2015 06:12 AM
Hi
I wonder if you could clarify the Spanning tree instance limits on lower end switches such as 2960's, 3560's and 3750's. The documentation states they are limited to 128 spanning tree instances. I previously thought that a spanning tree instance is created per port and per vlan. When I do the command "show spanning-tree summary totals" I thought the number of spanning tree instances was reflected in the STP active column. However, in testing on a switch with a documented limit of 128 instances I have no problem creating vlans until I hit the 128th vlan. Even though I have 3 trunks on the switch and the STP active value is over 300 at that stage.
So in short is a spanning tree instance just a single instance per vlan no matter how many ports are in the vlan ?
Thanks, Stuart.
10-13-2015 04:08 AM
Hi Stuart,
"is a spanning tree instance just a single instance per vlan no matter how many ports are in the vlan",
The limitations are as follows on switches running PVST, PVST+ or Rapid-PVST:
2950 SI: Maximum 64 STP instances, Maximum 128 VLANs.
2950 EI: Maximum 64 STP instances, Maximum 250 VLANs.
3550, 3560, 3750: Maximum 128 STP instances, Maximum 1005 VLANs.
When the limit is exceeded, the switch logs a message such as:
“SPANTREE_VLAN_SW-2-MAX_INSTANCE: Platform limit of 64 STP instances exceeded. No instance created for VLANxxx”
The maximum number of Per VLAN Spanning Tree instances on the 6500 switch is 128. In your case, after you grow past 64 (or 128) VLANs, you will need to configure MSTP and begin grouping VLANs into common Spanning Tree instances. The 2950 and the 6500 both support MSTP; you might want to check the following link for a detailed description of how to configure it:
Catalyst 2950 and Catalyst 2955 Switch Software Configuration Guide
Configuring MSTP
http://www.cisco.com/c/en/us/td/docs/switches/lan/catalyst2950/software/release/12-1_14_ea1/configuration/guide/2950scg/swmstp.html
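The limit check and the idea of grouping VLANs into shared instances can be sketched in a few lines of Python. This assumes one spanning tree instance per VLAN in the PVST modes, which is what the per-platform limits above imply; the round-robin grouping and the function names are hypothetical, and a real MST design would group VLANs by topology rather than by number.

```python
def pvst_instances_needed(vlans):
    """One instance per distinct VLAN, regardless of how many ports carry it."""
    return len(set(vlans))


def needs_mst(vlans, platform_limit):
    """True when per-VLAN instances would exceed the platform limit."""
    return pvst_instances_needed(vlans) > platform_limit


def group_vlans_into_instances(vlans, instance_count):
    """Spread VLANs round-robin over MST instances numbered from 1."""
    mapping = {instance: [] for instance in range(1, instance_count + 1)}
    for position, vlan in enumerate(sorted(set(vlans))):
        mapping[position % instance_count + 1].append(vlan)
    return mapping


vlans = list(range(1, 130))  # 129 VLANs, one more than a 128-instance limit
print(needs_mst(vlans, 128))
print({k: len(v) for k, v in group_vlans_into_instances(vlans, 2).items()})
```

With grouping in place, the number of spanning tree instances is set by the number of MST instances configured, not by the VLAN count.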
Regards,
Naveen
10-13-2015 02:40 AM
Hello Ivan, Naveen,
For SUP7-E and SUP8-E, how can I monitor CPU utilization per core?
Can you advise on the OID? Thanks in advance.
10-13-2015 06:22 AM
Hi Rojer,
event manager applet CheckCPUCore
event timer cron cron-entry "00 11 * * *" /----------- line (1)
action 1.0 cli command "enable"
action 2.0 cli command "show proc cpu | include Core"
action 3.0 set cpu_output $_cli_result
action 4.0 mail server <mail_server_IP address> to navevenk@cisco.com from logs@cisco.com subject " CPU Core 0 & 1 utilization" body "show proc cpu | i Core; current status is $_cli_result"
!
! If your router has TACACS configured, add the statement below as well. Its purpose is to log
! the user that runs the script; it does not need a password.
!
event manager session cli username <TACACS username>
***********************************************
To run the applet every 5 minutes instead of once a day at 11:00, line (1) can be changed to line (2):
event timer cron name PERIODIC cron-entry "*/5 * * * *" /-------------- line (2)
To trigger the email action when the average CPU usage over the last 5 minutes reaches a threshold (70%), line (1) can be changed to line (3) below. Once the average CPU usage drops to 30% or below, the command output is no longer sent via email.
event snmp oid "1.3.6.1.4.1.9.9.109.1.1.1.1.8" get-type exact entry-op ge entry-val 70 exit-op le exit-val 30 poll-interval 5 /-------------- line (3)
http://www.cisco.com/en/US/docs/ios/netmgmt/command/reference/nm_06.html#wp1157622
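The entry and exit thresholds in line (3) form a hysteresis band: the condition becomes active at 70 or above and only clears at 30 or below, so values in between keep the previous state. The Python sketch below models just that state logic on a list of polled values. It is an illustration under those assumptions, not EEM code, and the function name is hypothetical.

```python
def hysteresis_states(samples, entry_val=70, exit_val=30):
    """Return, per sample, whether the alert condition is active.

    Mirrors 'entry-op ge entry-val 70' and 'exit-op le exit-val 30'.
    """
    active = False
    states = []
    for value in samples:
        if not active and value >= entry_val:
            active = True       # entry criterion met
        elif active and value <= exit_val:
            active = False      # exit criterion met
        states.append(active)
    return states


# Hypothetical 5-minute CPU averages from successive polls.
print(hysteresis_states([10, 75, 50, 31, 30, 60, 70]))
```

The gap between the two thresholds prevents the condition from flapping when the CPU average hovers around a single value.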
Thanks,
Naveen Venkateshaiah
10-13-2015 06:04 AM
Hello Ivan, Naveen
I'm trying to monitor a cat 3560G which is configured for storm control. I want to monitor the following OIDs via snmp but snmpwalk says these OIDs are not available.
cErrDisableInterfaceEventRev1 (1.3.6.1.4.1.9.9.548.0.2)
cErrDisableIfStatusCause (1.3.6.1.4.1.9.9.548.1.3.1.1.2)
portAdditionalOperStatus(1.3.6.1.4.1.9.5.1.4.1.1.23)
image I'm running is c3560-ipservicesk9-mz.122-55.SE10. According to http://tools.cisco.com/Support/SNMP/do/BrowseOID.do?local=en
this image supports the above OIDs.
1.3.6.1.4.1.9.9.548.1 is available but not 1.3.6.1.4.1.9.9.548.0
1.3.6.1.4.1.9.9.548.1.2 is available but not 1.3.6.1.4.1.9.9.548.1.3
1.3.6.1.4.1.9.5.1.4.1.1 is available only up to 1.3.6.1.4.1.9.5.1.4.1.1.12
When I do a 'show snmp MIB' on the switch, all the above 3 are listed.
Any help is really appreciated.
Thanks
Pani
10-13-2015 10:02 PM
10-14-2015 01:48 AM
Thanks Ivan.
I tried again and 1.3.6.1.4.1.9.9.548.1.3.1.1.2 isn't available. Only up to 1.3.6.1.4.1.9.9.548.1.2 is available.
I'll try the image 12.2(50)SE5 and update you.
Thanks again.
Pani
10-14-2015 02:47 AM
Hi Ivan,
Just tried the 12.2(50)SE5 image.
Switch Ports Model SW Version SW Image
------ ----- ----- ---------- ----------
* 1 52 WS-C3560G-48PS 12.2(50)SE5 C3560-IPSERVICESK9-M
OID 1.3.6.1.4.1.9.9.548.1.3.1.1.2 still isn't available. Only up to 1.3.6.1.4.1.9.9.548.1.2 is available.
Is it possible that there's a problem with this 3560 (license?)? I don't have another one to test.
Thanks
Pani
10-16-2015 01:11 AM
Hi Pani,
We are going to check that in the lab on our 3560 shortly.
Kind Regards,
Ivan
10-15-2015 06:31 AM
Hello to all,
I'm looking to create a custom report through Cisco UCCX Historical Reports that tells me the number of repeat calls our organization is receiving per queue. We want to know when the repeat calls are coming in per day and what the queue is that the call is hitting. Is this something that would be easily customizable through Cisco? If so, what are the steps I would need to take in order to build this.
Thank you for your help,
Eric
10-15-2015 08:51 PM
Hello Eric,
This expert session is for Switch and IOS Architecture questions. You would need to contact Unified Contact Center experts for solution to your problem.
Kind Regards,
Ivan