9200 Stack Master Randomly Rebooting

Spray123
Level 1

I have a stack of two C9200L-48P-4X switches running 16.12.4, and about once a week whichever member is set as master will randomly crash/reboot/reload. It doesn't matter which member is master at the time; it's happened with both stack members being master. Log output for the time of the reload is:

Nov 4 07:08:31.047: %PLATFORM_INFRA-5-IOS_INTR_OVER_LIMIT: IOS thread disabled interrupt for 272 msec
-Traceback= 1#714eae76875aa6a2c50124258b444fa8 :10000+421D2B8 :10000+1697708 :10000+1673728 :10000+1676EE8 :10000+56054A0
Nov 4 07:08:33.090: %HMANRP-6-HMAN_IOS_CHANNEL_INFO: HMAN-IOS channel event for switch 2: EMP_RELAY: Channel UP!
Nov 4 07:08:35.218: %PLATFORM-6-HASTATUS: RP switchover, received chassis event to become active
Nov 4 07:08:35.368: %REDUNDANCY-3-SWITCHOVER: RP switchover (PEER_NOT_PRESENT)
Nov 4 07:08:35.381: %REDUNDANCY-3-SWITCHOVER: RP switchover (PEER_DOWN)
Nov 4 07:08:35.381: %REDUNDANCY-3-SWITCHOVER: RP switchover (PEER_REDUNDANCY_STATE_CHANGE)
Nov 4 07:08:36.912: %PLATFORM-6-HASTATUS: RP switchover, sent message became active. IOS is ready to switch to primary after chassis confirmation
Nov 4 07:08:38.276: %HMANRP-6-EMP_NO_ELECTION_INFO: Could not elect active EMP switch, setting emp active switch to 0: EMP_RELAY: Could not elect switch with mgmt port UP
Nov 4 07:08:38.296: %PLATFORM-6-HASTATUS: RP switchover, received chassis event became active
Nov 4 07:08:38.299: %PLATFORM_FEP-1-FRU_PS_SIGNAL_OK: Switch 2: signal on power supply A is restored
Nov 4 07:08:38.344: %PLATFORM_FEP-1-FRU_PS_SIGNAL_OK: Switch 2: signal on power supply B is restored
Nov 4 07:08:38.600: %PLATFORM-6-HASTATUS_DETAIL: RP switchover, received chassis event became active. Switch to primary (count 4)
Nov 4 07:08:38.707: %HA-6-SWITCHOVER: Route Processor switched from standby to being active
Nov 4 07:08:38.838 UTC: Unable to set IPV4 table id for BT interface

Nov 4 07:08:38.852 UTC: Unable to set IPV6 table id for BT interface

Nov 4 07:08:42.145: %SYS-6-LOGGINGHOST_STARTSTOP: Logging to host 172.24.253.2 port 514 started - CLI initiated
Nov 4 07:08:42.315: %HMANRP-5-CHASSIS_DOWN_EVENT: Chassis 1 gone DOWN!

 

CPU utilization also spikes to 90+% during this time. 
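
For anyone comparing several of these weekly reloads, the log above can be reduced to just the switchover-related messages with a short script. This is only a rough sketch: the regular expression assumes the `timestamp: %FACILITY-SEVERITY-MNEMONIC: text` layout seen in the output above, and the function names (`parse_syslog`, `switchover_events`) and the choice of which facilities count as "switchover-related" are my own assumptions, not anything defined by IOS XE.

```python
import re

# Assumed layout, taken from the log lines in this post:
# "Nov 4 07:08:35.368: %REDUNDANCY-3-SWITCHOVER: RP switchover (PEER_DOWN)"
SYSLOG_RE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}\.\d{3}): "
    r"%(?P<facility>[A-Z0-9_]+)-(?P<severity>\d)-(?P<mnemonic>[A-Z0-9_]+): "
    r"(?P<text>.*)$"
)


def parse_syslog(lines):
    """Return one dict per line that matches the assumed syslog layout.

    Lines that do not match (tracebacks, free-form messages) are skipped.
    """
    events = []
    for line in lines:
        match = SYSLOG_RE.match(line.strip())
        if match:
            event = match.groupdict()
            event["severity"] = int(event["severity"])
            events.append(event)
    return events


def switchover_events(events):
    """Keep messages from the REDUNDANCY and HA facilities, plus chassis-down events.

    The selection is an illustrative assumption; adjust it to taste.
    """
    wanted_facilities = {"REDUNDANCY", "HA"}
    return [
        e for e in events
        if e["facility"] in wanted_facilities
        or e["mnemonic"] == "CHASSIS_DOWN_EVENT"
    ]


# A few lines copied from the log output in this post.
SAMPLE = [
    "Nov 4 07:08:31.047: %PLATFORM_INFRA-5-IOS_INTR_OVER_LIMIT: IOS thread disabled interrupt for 272 msec",
    "-Traceback= 1#714eae76875aa6a2c50124258b444fa8 :10000+421D2B8 :10000+1697708 :10000+1673728 :10000+1676EE8 :10000+56054A0",
    "Nov 4 07:08:35.368: %REDUNDANCY-3-SWITCHOVER: RP switchover (PEER_NOT_PRESENT)",
    "Nov 4 07:08:38.707: %HA-6-SWITCHOVER: Route Processor switched from standby to being active",
    "Nov 4 07:08:38.838 UTC: Unable to set IPV4 table id for BT interface",
    "Nov 4 07:08:42.315: %HMANRP-5-CHASSIS_DOWN_EVENT: Chassis 1 gone DOWN!",
]

if __name__ == "__main__":
    for event in switchover_events(parse_syslog(SAMPLE)):
        print(event["timestamp"], event["facility"], event["mnemonic"], event["text"])
```

Running the same filter over the saved logs from each reload makes it easier to see whether the `IOS_INTR_OVER_LIMIT` message always precedes the switchover by a few seconds, as it does here.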


balaji.bandi
Hall of Fame

First, I would power off the stack.

Re-seat the stack cables and check that they are tight. If you have spares, replace the cables with new ones.

If the issue persists after that, I suggest upgrading to the latest IOS XE 17.3.X code and testing again.

BB


I've tested and reseated the cables, and they are tightened down as tight as they will go. Unfortunately, I don't have any spare stack cables. My next course of action, when I can get a maintenance window, is to upgrade the OS to 16.12.8, which is the latest recommended release for the 16.x OS train.

 

Post the complete output of the following commands:

  • sh platform resource 
  • sh platform software status control-processor brief

If the stack is on 16.12.4, then I suspect there is a memory leak due to FN - 72323 - Cisco IOS XE Software: QuoVadis Root CA 2 Decommission Might Affect Smart Licensing, Smart Call Home, and Other Functionality

If neither workaround has been applied, my guess is it would take about 6 to 8 months before the memory leak looks like the picture below:

3850 (4 x switches), Firmware version: 16.12.4, Uptime: 1y43w4d

And if this is the case, then I'd drill down further to determine the process monopolizing the memory.  

Memory leak due to "keyman" process

The "keyman" process is usually between 10k and 15k. It is one of the few processes attributed to Cisco Smart Licensing.

Finally, 16.12.4 is not a stable firmware version for the 9200.  Downgrade to the latest 16.6.X for stability purposes.