09-11-2015 03:13 PM - edited 03-08-2019 01:44 AM
This session will provide an opportunity to learn and ask questions about Cisco Catalyst Switches IOS architecture, and how to troubleshoot any unexpected reboots and other errors on switches.
Ask questions from Monday, October 5 to Friday, October 16, 2015
Featured Experts
Ivan Shirshin is a customer support engineer in High-Touch Technical Services (HTTS). He is an expert on Routing, LAN Switching and Data Center products. His areas of expertise include Cisco Catalyst 2000, 3000, 4000, 6500, Cisco Nexus 7000, ISRs, as well as Cisco routers ASR1000, 7600, 10000 and XR platforms. He has over 7 years of industry experience working with large Enterprise and Service Provider networks. Shirshin holds a CCNA, CCNP, CCDP, and CCIE (# 43481) in routing and switching, as well as XR specialist certifications.
Naveen Venkateshaiah is a customer support engineer in High-Touch Technical Services (HTTS). He is an expert on Routing, LAN Switching and Data Center products. His areas of expertise include Cisco Catalyst 3000, 4000, 6500, and Cisco Nexus 7000. He has over 7 years of industry experience working with large enterprise and Service Provider networks. Venkateshaiah holds a CCNA, CCNP, and CCDP-ARCH, AWLANFE, LCSAWLAN Certification. He is currently working to obtain a CCIE in routing and switching.
Find other events at https://supportforums.cisco.com/expert-corner/events.
** Ratings Encourage Participation! **
Please be sure to rate the Answers to Questions
10-05-2015 07:29 AM
Hello,
We have a WS-X6704-10GE module with a WS-F6700-DFC3B. How can we identify whether a DFC-equipped module has reset on its own?
Thank you for your prompt response.
Jessica
10-05-2015 07:51 AM
Hi Jessica,
Thanks for raising this question.
If a module equipped with a Distributed Forwarding Card (DFC) has rebooted on its own, without a user-initiated reload, you can check the bootflash of the DFC in order to see if it crashed. If a crash information (crashinfo) file is available, you can use it to find the root cause of the crash.
Issue the dir dfc#module#-bootflash: command in order to verify if there is a crash information file and when it was written.
If the DFC reset matches the crashinfo timestamp, issue the more dfc#module#-bootflash:filename command.
We can also issue the copy dfc#module#-bootflash:filename tftp command in order to transfer the file via TFTP to a TFTP server.
cat6kSwitch#dir dfc#6-bootflash:
Directory of dfc#6-bootflash:/
-#- ED ----type---- --crc--- -seek-- nlen -length- -----date/time------ name
1 .. crashinfo 2B245A6A C24D0 25 261332 Sep 22 2014 21:35:25 crashinfo_20140922-204842
After you have the crashinfo file available, collect the output of the show logging and show tech commands and contact TAC support to further investigate the cause of the crash.
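As a minimal sketch of the timestamp check described above, the following Python snippet pulls crashinfo entries out of saved "dir dfc#module#-bootflash:" output and compares the time each file was written with the time of the reset. The function names, the one-hour tolerance, and the reset time used at the end are illustrative assumptions, not Cisco tooling. The sample listing is the one from this post, with the file name on a single line.

```python
import re
from datetime import datetime, timedelta

# Matches the date/time and name columns of a crashinfo row, for example:
# "... 261332 Sep 22 2014 21:35:25 crashinfo_20140922-204842"
ROW = re.compile(
    r"(?P<stamp>[A-Z][a-z]{2} +\d{1,2} \d{4} \d{2}:\d{2}:\d{2}) +"
    r"(?P<name>crashinfo_\S+)"
)


def find_crashinfo(dir_output):
    """Return (written_at, filename) pairs for crashinfo files in the listing."""
    found = []
    for line in dir_output.splitlines():
        match = ROW.search(line)
        if match:
            # Collapse repeated spaces so strptime sees a single-space format.
            stamp = " ".join(match.group("stamp").split())
            written = datetime.strptime(stamp, "%b %d %Y %H:%M:%S")
            found.append((written, match.group("name")))
    return found


def written_near(written, reset_time, tolerance=timedelta(hours=1)):
    """True if the file was written within `tolerance` of the reset (assumed value)."""
    return abs(written - reset_time) <= tolerance


SAMPLE = """\
cat6kSwitch#dir dfc#6-bootflash:
Directory of dfc#6-bootflash:/
-#- ED ----type---- --crc--- -seek-- nlen -length- -----date/time------ name
1 .. crashinfo 2B245A6A C24D0 25 261332 Sep 22 2014 21:35:25 crashinfo_20140922-204842
"""

# Hypothetical reset time, taken from wherever you recorded the module reset.
reset_time = datetime(2014, 9, 22, 21, 40, 0)
for written, name in find_crashinfo(SAMPLE):
    print(name, written, "matches reset:", written_near(written, reset_time))
```

If a file matches, that is the one to read with the more command or copy off the switch via TFTP.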
Let me know if you have any further doubt.
Regards,
Naveen Venkateshaiah.
10-05-2015 11:40 PM
Hi Naveen ,
One of our 6500 switches with a Sup32 has reloaded on its own twice, and crashinfo files were generated. In the show version output, the last reset reason was power-on. We ran the crashinfo files through the Output Interpreter, but it returned no relevant results. Do we have to open a TAC case just to decode the files, or, as a partner, do we have access to any tools to decode them?
Thanks,
Tamil.
10-06-2015 04:31 AM
Hi Tamil,
Thanks for raising this question,
The Cisco Catalyst 6000/6500 Switches can unexpectedly reload due to an unknown cause. The output of the show version command displays an error message similar to this:
System returned to ROM by unknown reload cause - suspect
boot_data[BOOT_COUNT] 0x0, BOOT_COUNT 0, BOOTDATA 19 (SP by power-on)
This issue is documented in Cisco bug ID CSCef80423 (registered customers only). Upgrade the switch to the latest Cisco IOS Software release that is not affected by the bug in order to resolve this issue.
example:
From SP:
======
Mar 25 10:46:19.074 GMT: %C6K_PLATFORM-SP-2-PEER_RESET: SP is being reset by the RP << Here the Switch Processor (SP) is reset by the Route Processor (RP).
Hence we have to look for the RP crashinfo.
From RP:
======
Mar 25 10:46:11.166 GMT: %SYSTEM_CONTROLLER-3-ERROR: Error condition detected: TM_NPP_PARITY_ERROR
Mar 25 10:46:11.166 GMT: %SYSTEM_CONTROLLER-3-FATAL: An unrecoverable error has been detected. The system is being reset.
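The same triage can be sketched on a saved log in Python: extract the %FACILITY-SEVERITY-MNEMONIC tag from each line and flag the case where the SP reports that the RP reset it, which points to the RP crashinfo. This is only an illustration built on the messages quoted in this post. The function names are hypothetical, and the regular expression is an assumption about the tag layout, not an official parser.

```python
import re

# Tag layout assumed here: %FACILITY-SEVERITY-MNEMONIC:
# The facility may itself contain hyphens (for example C6K_PLATFORM-SP).
TAG = re.compile(r"%([A-Z0-9_]+(?:-[A-Z0-9_]+)*)-(\d)-([A-Z0-9_]+):")


def parse_events(lines):
    """Return (facility, severity, mnemonic) for every tagged log line."""
    events = []
    for line in lines:
        match = TAG.search(line)
        if match:
            events.append((match.group(1), int(match.group(2)), match.group(3)))
    return events


def needs_rp_crashinfo(events):
    """True if the SP logged that it was reset by the RP."""
    return any(
        facility == "C6K_PLATFORM-SP" and mnemonic == "PEER_RESET"
        for facility, _severity, mnemonic in events
    )


LOG = [
    "Mar 25 10:46:19.074 GMT: %C6K_PLATFORM-SP-2-PEER_RESET: SP is being reset by the RP",
    "Mar 25 10:46:11.166 GMT: %SYSTEM_CONTROLLER-3-ERROR: Error condition detected: TM_NPP_PARITY_ERROR",
    "Mar 25 10:46:11.166 GMT: %SYSTEM_CONTROLLER-3-FATAL: An unrecoverable error has been detected. The system is being reset.",
]

events = parse_events(LOG)
if needs_rp_crashinfo(events):
    print("SP was reset by the RP: look at the RP crashinfo")
```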
Let me know if you have any further doubt.
Regards,
Naveen Venkateshaiah.
10-07-2015 04:45 AM
Hi,
Can you please explain the IOS XE image naming convention and how to identify which version is currently running on my switch? Is there any difference when we run show commands on IOS and IOS-XE?
Regards
Dhiresh
10-07-2015 07:41 AM
Hi,
image name: cat4500e-universalk9.SPA.03.01.00.SG.15-01.SG
cat4500e: Platform Designator.
universal: Feature Set Designator.
k9: Crypto Designator if crypto code is present in IOSd package.
SPA: Indicates image is digitally signed.
03.01.00.SG: IOS XE Release Version number.
15-01.SG: IOSd package version number. This allows you to correlate the version of IOSd to another platform running classic IOS.
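To make the field breakdown concrete, here is a small Python sketch that splits an image name of exactly this shape into the parts listed above. The regular expression and the dictionary keys are my own assumptions based on this one example. Names that follow a different layout (for instance, without the SPA marker) will simply not match.

```python
import re

# Layout assumed from the example in this post:
# <platform>-<feature set>[k9].SPA.<XE release>.<IOSd version>
IMAGE = re.compile(
    r"(?P<platform>[^-]+)-"
    r"(?P<feature_set>[a-z]+?)(?P<crypto>k9)?"
    r"\.(?P<signed>SPA)"
    r"\.(?P<xe_release>\d{2}\.\d{2}\.\d{2}\.[A-Z]+)"
    r"\.(?P<iosd_version>.+)"
)


def parse_image_name(name):
    """Split an IOS XE image name into its designators, or return None."""
    match = IMAGE.fullmatch(name)
    if match is None:
        return None
    return {
        "platform": match.group("platform"),
        "feature_set": match.group("feature_set"),
        "crypto": match.group("crypto") is not None,   # k9 present
        "signed": match.group("signed") == "SPA",      # digitally signed
        "xe_release": match.group("xe_release"),
        "iosd_version": match.group("iosd_version"),
    }


info = parse_image_name("cat4500e-universalk9.SPA.03.01.00.SG.15-01.SG")
for field, value in info.items():
    print(f"{field}: {value}")
```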
Kernel Version:
=========
cat4500e#show version running
Package: Base, version: 03.00.00, status: active
File: cat4500e-basek9.SPA.03.00.00.pkg, on: Slot3
From Bundle: cat4500e-universalk9.03.01.00.SG
Infrastructure Version
==============
Package: Infra, version: 03.00.00, status: active
File: cat4500e-infra.SPA.03.00.00.pkg, on: Slot3
From Bundle: cat4500e-universalk9.03.01.00.SG
IOSd Version
========
Package: IOS, version: 150-1.SG, status: active
File: cat4500e-universalk9.SPA.15-01.SG.pkg, on: Slot3
From Bundle: cat4500e-universalk9.03.01.00.SG
Below are a few example commands:
IOS Command | IOS-XE Command | Comments on New CLI
Regards,
Naveen Venkateshaiah.
10-05-2015 07:33 AM
Hello Ivan and Naveen,
We are getting this error message: “%CFIB-7-CFIB_EXCEPTION: FIB TCAM exception, Some entries will be software switched”
We could not find the meaning of this error in the documentation. Can you let me know what it means and whether there is anything I should do in order to stop receiving it?
Thank you,
Ivan
10-05-2015 07:38 AM
Hey Ivan, I have seen this before.
Here's a link that explains it and what to do. Go to the section "FIB TCAM Exception":
https://supportforums.cisco.com/document/59926/troubleshooting-high-cpu-6500-sup720
10-05-2015 09:52 PM
The error message indicates that the number of installed route entries is about to reach the hardware FIB capacity or the maximum routes limit set for the specified protocol. If the limit is reached, some prefixes are dropped.
There is a workaround available. You need to reload the router in order to exit the exception mode.
Then enter the “mls cef maximum-routes” command in global configuration mode in order to increase the maximum number of routes for the protocol.
You should use the “show mls cef maximum-routes” command in order to check the maximum-routes. And use the “show mls cef summary” command, which shows the summary of CEF table information, in order to check the current usage.
10-07-2015 10:20 PM
Hi Naveen & Ivan,
We have identified an issue on a Cisco RSP8 (R7000) 7513mx chassis. The router has two MPLS links, both of which went down at the same time and did not come back up until we physically rebooted the router, suspecting that it had hung. Below is the error message that I noticed when I executed the "show ver" command. Does this error message correspond to a hardware or an IOS issue? Kindly clarify.
System returned to ROM by processor memory parity error at PC 0x406A0D1C, address 0x0 at 02:42:04
FYI, this error message remained even after the router was rebooted. Both MPLS links were restored, but the error is still shown on the router. Can this error lead to issues such as the router hanging again later? If so, kindly advise on a possible solution to overcome the issue.
Thanks in advance..
10-07-2015 10:31 PM
Hi Ajar,
The message indicates that there was a memory parity error in the processor DRAM. This is a problem related to hardware.
Note that the message reports the reason for the last restart and will not be cleared until the next restart. It does not mean the problem is still occurring at this time.
There are two kinds of parity errors:
1. Soft parity errors
These errors occur when an energy level within the chip (for example, a one or a zero) changes. When referenced by the CPU, such errors cause the system either to crash (if the error is in an area that is not recoverable) or to recover by restarting the affected subsystem (for example, the CyBus complex restarts if the error was in the packet memory (MEMD)). In case of a soft parity error, there is no need to swap the board or any of the components. See the Related Information section for additional information about soft parity errors.
2. Hard parity errors
These errors occur when there is a chip or board failure that corrupts data. In this case, you need to re-seat or replace the affected component, which usually involves a memory chip swap or a board swap. There is a hard parity error when multiple parity errors occur at the same address. There are more complicated cases that are harder to identify. In general, if you see more than one parity error in a particular memory region in a relatively short period, you can consider it to be a hard parity error.
Studies have shown that soft parity errors are 10 to 100 times more frequent than hard parity errors. Therefore, Cisco highly recommends that you wait for a second parity error before you replace anything. This greatly reduces the impact on your network.
If you see this error once, I recommend monitoring for 2-3 days. If the issue reoccurs, you need to replace the DRAM on this card.
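The rule of thumb above (one isolated error is most likely soft, repeated errors in the same memory region within a short period are most likely hard) can be written down as a small Python sketch. The region size, the three-day window, and the example addresses are assumptions picked for illustration. They are not Cisco-defined thresholds, so adjust them to your own monitoring policy.

```python
from datetime import datetime, timedelta


def classify_parity_errors(events, region_size=0x10000, window=timedelta(days=3)):
    """Classify a list of (timestamp, address) parity errors.

    Returns "none", "soft" (likely soft) or "hard" (likely hard).
    region_size and window are assumed values, not Cisco thresholds.
    """
    by_region = {}
    for when, address in events:
        # Group errors by memory region so nearby addresses count together.
        by_region.setdefault(address // region_size, []).append(when)
    if not by_region:
        return "none"
    for times in by_region.values():
        times.sort()
        for earlier, later in zip(times, times[1:]):
            if later - earlier <= window:
                # Two errors in the same region close together in time.
                return "hard"
    return "soft"


# Hypothetical error history: one error, then a repeat nearby a day later.
first = (datetime(2015, 10, 7, 2, 42, 4), 0x406A0D1C)
repeat = (datetime(2015, 10, 8, 9, 15, 0), 0x406A0D20)

print(classify_parity_errors([first]))
print(classify_parity_errors([first, repeat]))
```

A "hard" result is the point at which re-seating or replacing the memory makes sense. A single "soft" result only calls for continued monitoring.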
Kind Regards,
Ivan
10-07-2015 11:00 PM
Hi Ivan,
Thanks for the info...
As of now the issue has not recurred for this branch after the physical reboot, but we will go with your advice of replacing the processor DRAM on the router.
If the issue repeats, should the "system parity error" disappear from the output of "show ver" after the DRAM is replaced?
Which commands can be issued on this router (Cisco RSP8 (R7000) 7513mx chassis) when this kind of "system parity error" occurs?
Also, please let me know which post-checks, with the required commands, can be run on this router once the new processor DRAM is installed.
Please advise..
10-07-2015 11:12 PM
Hi Ajar,
If this problem happened once and does not reoccur, then it is highly probable that it was a soft parity error and no action is needed. It is very rare to see the same soft parity error twice.
You can read more about this kind of issue and the difference between soft and hard parity errors here:
http://www.cisco.com/c/en/us/support/docs/routers/7200-series-routers/6345-crashes-pmpe.html#softvshard
Parity errors are reported in "show ver", in the crashinfo file if memory corruption led to a crash (the file is usually saved in flash or bootflash), or in syslog, which you can check on your syslog server or with the "show log" command.
The system runs parity checks automatically at bootup and during operation, so you do not need to do anything manually to test new memory.
Kind Regards,
Ivan
10-07-2015 11:23 PM
Thanks a lot, Ivan, for the information.