Issues with TAC support procedures for router hangs.... =(
I have a problem with the direction TAC is asking me to take:
They essentially want me to make changes to the config register and set up the router to respond to the console during a crash. The issue I have is that we have only one router and it's in production all the time. Sitting and waiting for it to impact our production network isn't a great solution.
We are running a relatively new version of code:
Is there any other way to proceed? Would anyone be able to speak to moving to different software instead of being a guinea pig? I suppose we could have bad hardware, but I doubt it. Should I open a ticket to have Cisco help us find more stable hardware?
Exact steps recommended below:
Procedure to be performed when a device stops responding
The procedure has three parts. The first is done while the router is working properly and is not network impacting: enter the config-register and scheduler allocate commands, then write the configuration (steps 1-3).
The second part involves sending the device to ROMMON and must be done in a maintenance window (steps 4-6), since it is network impacting. The third part is done when the router is hanging and has not been reloaded.
Note: If the console is unresponsive during the event and if the device is an ISR router add this command so the router will be forced to crash and it could be sent to us for analysis.
Here is a detailed step-by-step list of instructions for troubleshooting router hangs:
1. Set the configuration register to 0x2002 using the "config-register 0x2002" command issued from global configuration mode of the device.
2. Configure the device to allow console access during high CPU utilization by issuing the "scheduler allocate 30000 1000" command from configuration mode of the device.
Router(config)# scheduler allocate 30000 1000
3. Write the configuration to memory using the "write mem" command from enable mode of the device.
4. Reload the router using the "reload" command from enable mode of the device.
5. Test the use of the "stack 50" or "k 50" command from the ROMMON prompt to gather diagnostic data. Then test the use of the "cont" command from the ROMMON prompt to return to IOS. This will be service impacting, so please do this during a service window.
rommon2>stack 50 (if the device doesn't take "stack 50" use "k 50")
6. Reload the router again using the "reload" command from enable mode of the device.
Steps 5 to 6 of the second part can be omitted if you have no time for testing because the router in question is in production, but they are definitely a very accurate way to confirm that the configuration changes were applied correctly and will work at the time of a real hang.
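For anyone wondering what step 1 actually changes, here is a small Python sketch that decodes the configuration register. It assumes the router is currently at the common default of 0x2102 (the post does not say what the current value is) and uses the commonly documented bit meanings, where bit 8 disables the console Break key and bits 0-3 are the boot field. The function names are made up for illustration; check the bit definitions against the documentation for your platform.

```python
# Commonly documented bits of the 16-bit configuration register.
# Assumption: these meanings apply to your platform; verify before relying on them.
BREAK_DISABLED = 0x0100   # bit 8: when set, a console Break is ignored once IOS is running
BOOT_FIELD_MASK = 0x000F  # bits 0-3: boot field


def describe_confreg(value: int) -> dict:
    """Summarize the parts of the register relevant to this procedure."""
    return {
        "value": f"0x{value:04X}",
        "break_enabled": not (value & BREAK_DISABLED),
        "boot_field": value & BOOT_FIELD_MASK,
    }


def changed_bits(old: int, new: int) -> list:
    """Return the bit positions that differ between two register values."""
    diff = old ^ new
    return [bit for bit in range(16) if (diff >> bit) & 1]


if __name__ == "__main__":
    # 0x2102 is assumed as the starting value; 0x2002 is the value TAC asks for.
    for reg in (0x2102, 0x2002):
        print(describe_confreg(reg))
    print("bits changed:", changed_bits(0x2102, 0x2002))
```

Under that assumption, the only difference between the two values is bit 8, which is why the Break sequence can drop a running router to ROMMON after the change.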
AT TIME OF HANG:
1. Connect to console port of the device and start logging output to a plain text file.
2. Confirm that the device is in a hang state by pressing Enter several times. No response means that the device is in a hang state.
3. Use the break sequence to drop the router to ROMMON (in Windows/HyperTerminal, it is usually Ctrl-Break).
4. Enter the command "stack 50" or "k 50" from the ROMMON prompt in order to display diagnostic information about the hang state.
rommon3>stack 50 (if the device doesn't take "stack 50" use "k 50")
5. Enter the command "cont" from the ROMMON prompt in order to go back to the IOS and the hang state.
6. Repeat steps 2-5 about 10 times. This is important in order to make sure we get an accurate reading of where the CPU is hanging.
7. Stop capture and send in the resulting plain text file containing the log from the entire procedure.
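Before sending the file in, it may be worth checking that the capture really contains about 10 stack dumps, as step 6 asks. The sketch below counts the lines in a saved console log where "stack 50" or "k 50" was entered at a ROMMON prompt. The prompt pattern is an assumption based on the "rommon3>" examples above, the function names are hypothetical, and the sample log in the demo is a placeholder, not real router output.

```python
import re

# Matches a ROMMON prompt followed by "stack 50" or "k 50",
# e.g. "rommon3>stack 50" or "rommon 3 > k 50".
# Assumption: your terminal log echoes the typed command on the prompt line.
STACK_CMD = re.compile(r"^\s*rommon\s*\d+\s*>\s*(?:stack|k)\s+50\b", re.IGNORECASE)


def count_stack_samples(log_text: str) -> int:
    """Count how many times a stack dump was requested in the log."""
    return sum(1 for line in log_text.splitlines() if STACK_CMD.match(line))


def enough_samples(log_text: str, wanted: int = 10) -> bool:
    """True if the log holds at least `wanted` stack dumps (step 6 asks for about 10)."""
    return count_stack_samples(log_text) >= wanted


if __name__ == "__main__":
    # Placeholder log: ten stack/cont cycles with dummy output lines.
    sample_log = "\n".join(
        f"rommon{n}>stack 50\n<stack output here>\nrommon{n + 1}>cont"
        for n in range(1, 20, 2)
    )
    print("stack dumps found:", count_stack_samples(sample_log))
    print("enough for TAC:", enough_samples(sample_log))
```

To use it on a real capture, read the plain text file from step 1 and pass its contents to count_stack_samples.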