07-29-2020 12:27 AM
Hi Everyone,
In our stack, a member switch (Cisco 9300-48P) went into the Removed state after a reload.
Switch#  Role    Mac Address     Priority  Version  Current State
-------------------------------------------------------------------------------------
*1       Active  4ce1.75f1.a400  1         V03      Ready
 2       Member  0000.0000.0000  0         V03      Removed
Is there any way to fix it without a reload?
This is what I see in "show logging":
%HMANRP-6-HMAN_IOS_CHANNEL_INFO: HMAN-IOS channel event for switch 2: EMP_RELAY: Channel DOWN!
Please share your thoughts!
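If you want to check for this state from a script rather than by eye, here is a minimal sketch in Python that parses "show switch" text and lists any member that is not Ready. The sample text is copied from the output in this post. The function names (parse_show_switch, not_ready) are made up for illustration, and the regex assumes the column layout shown here, which may differ on other releases.

```python
import re

# Sample text copied from the "show switch" output in this thread.
SHOW_SWITCH = """\
Switch# Role Mac Address Priority Version Current State
-------------------------------------------------------------------------------------
*1 Active 4ce1.75f1.a400 1 V03 Ready
2 Member 0000.0000.0000 0 V03 Removed
"""

# One row: optional '*', switch number, role, MAC, priority, HW version, state.
ROW = re.compile(
    r"^\*?(?P<num>\d+)\s+(?P<role>\S+)\s+(?P<mac>[0-9a-f.]{14})\s+"
    r"(?P<prio>\d+)\s+(?P<ver>\S+)\s+(?P<state>.+?)\s*$"
)


def parse_show_switch(text):
    """Return one dict per switch row found in the text."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            rows.append(match.groupdict())
    return rows


def not_ready(rows):
    """Return the switch numbers whose state is anything other than Ready."""
    return [int(row["num"]) for row in rows if row["state"] != "Ready"]


if __name__ == "__main__":
    print(not_ready(parse_show_switch(SHOW_SWITCH)))
```

Header and separator lines are skipped because they do not start with a switch number.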
07-29-2020 12:39 AM
Hello
Have you performed a physical check of the failed switch?
Is there power? Did it even come back up?
Are the stack cables seated correctly?
08-05-2021 01:47 AM - edited 08-05-2021 01:48 AM
Hello Paul
I have exactly the same issue. I added a second switch to my stack of one switch and it worked for 2-3 hours, with the second switch marked as STANDBY, but now I find it has been removed. Any ideas what is wrong here? The two C9200L switches are new.
Below is the output of "sh platform":
#sh platform
Switch Ports Model Serial No. MAC address Hw Ver. Sw Ver.
------ ----- --------- ----------- -------------- ------- --------
1 52 C9200L-48PXG-4X ########## 28af.fd04.ec80 V01 17.03.03
Switch/Stack Mac Address : 28af.fd04.ec80 - Local Mac Address
Mac persistency wait time: Indefinite
Current
Switch# Role Priority State
-------------------------------------------
*1 Active 1 Ready
2 Member 0 Removed
In the log I have:
Aug 4 17:48:41.715: %HMANRP-6-HMAN_IOS_CHANNEL_INFO: HMAN-IOS channel event for switch 2: EMP_RELAY: Channel DOWN!
Aug 4 17:50:08.060: %SMART_LIC-3-COMM_FAILED: Communications failure with the Cisco Smart License Utility (CSLU) : Unable to resolve server hostname/domain name
07-29-2020 12:49 AM - edited 07-29-2020 12:50 AM
Post the complete output of the command "sh platform".
If my suspicion is correct, the stack could be running a buggy version with "stack merge" as the cause.
08-05-2021 02:33 AM
- Use latest-advisory software release, check if that can help.
M.
08-05-2021 02:41 AM
How do I find out the version of the latest advisory software release of IOS?
08-05-2021 03:14 AM
- FYI : https://software.cisco.com/download/home/286313983/type/282046477/release/Amsterdam-17.3.3
(Check the download page for your particular model and look for the Suggested Release.)
M.
08-05-2021 03:23 AM
Thank you for the pointer. So the suggested release is 17.3.3 and that's exactly the version number both of my switches in the stack are running.
08-05-2021 03:39 AM
- Besides the possible solutions offered here, the usual next step is to escalate the problem by contacting TAC.
M.
08-05-2021 04:37 AM
If that 2nd switch has joined the stack, post the complete output of the following commands:
dir flash-2:
dir flash-2:core
dir crashinfo-2:
sh log on switch 2 uptime detail
08-05-2021 05:23 AM
I actually managed to reload that switch earlier today running "reload slot 2" and it rejoined the stack, so far so good. Here is the output you have asked for:
sw#dir flash-2:
Directory of flash-2:/
8106 -rw- 2097152 Aug 5 2021 12:09:30 +02:00 nvram_config_bkup
8112 -rw- 2097152 Aug 5 2021 12:09:30 +02:00 nvram_config
40481 drwx 4096 Aug 5 2021 11:10:46 +02:00 .installer
8109 -rw- 556 Aug 5 2021 11:09:30 +02:00 vlan.dat
48579 drwx 4096 Aug 5 2021 11:09:26 +02:00 license_evlog
56673 drwx 4096 Aug 5 2021 11:09:09 +02:00 .prst_sync
8107 -rw- 15139 Aug 5 2021 11:08:04 +02:00 rdope_out.txt
8105 -rw- 0 Aug 5 2021 11:08:04 +02:00 dope_hist
8108 -rw- 89 Aug 5 2021 11:08:01 +02:00 rdope.log
8102 -rw- 134458 Aug 5 2021 11:05:55 +02:00 memleak.tcl
8098 -rw- 2130 Aug 5 2021 11:05:40 +02:00 boothelper.log
80984 drwx 4096 Aug 5 2021 11:05:38 +02:00 dc_profile_dir
8099 -rw- 407 Aug 5 2021 11:05:19 +02:00 bootloader_evt_handle.log
8110 drwx 4096 Aug 5 2021 11:04:08 +02:00 .rommon_sync
8101 -rw- 2130 Aug 5 2021 10:59:05 +02:00 boothelper.log.old
40491 drwx 4096 Jun 28 2021 11:57:05 +02:00 pnp-tech
48578 drwx 4096 May 24 2021 12:17:45 +02:00 .rollback_timer
40518 -rw- 40648801 May 24 2021 12:12:24 +02:00 cat9k_lite-rpboot.17.03.03.SPA.pkg
40514 -rw- 4919 May 24 2021 12:12:24 +02:00 packages.conf
40517 -rw- 11031572 May 24 2021 12:11:00 +02:00 cat9k_lite-webui.17.03.03.SPA.pkg
40516 -rw- 4133912 May 24 2021 12:11:00 +02:00 cat9k_lite-srdriver.17.03.03.SPA.pkg
40515 -rw- 426931224 May 24 2021 12:11:00 +02:00 cat9k_lite-rpbase.17.03.03.SPA.pkg
40486 drwx 4096 May 24 2021 12:05:48 +02:00 .dbpersist
40484 drwx 4096 May 24 2021 12:02:56 +02:00 core
48580 drwx 4096 May 24 2021 12:02:36 +02:00 pnp-info
40488 drwx 4096 May 24 2021 12:02:35 +02:00 onep
89057 drwx 4096 May 24 2021 12:01:21 +02:00 .USWAP
113345 drwx 4096 May 24 2021 11:56:40 +02:00 Tbot
105249 drwx 4096 May 24 2021 11:56:39 +02:00 .CRFT
80986 drwx 4096 May 24 2021 11:56:33 +02:00 sys_report
80961 drwx 4096 May 24 2021 11:56:32 +02:00 tech_support
56676 drwx 4096 May 24 2021 11:56:32 +02:00 ss_disc
8100 -rw- 5242880 May 24 2021 11:56:32 +02:00 ssd
1956904960 bytes total (1359216640 bytes free)

sw#dir flash-2:core
Directory of flash-2:/core/
40490 -rw- 1 Aug 5 2021 14:09:26 +02:00 .callhome
64769 drwx 4096 May 24 2021 11:56:29 +02:00 modules
1956904960 bytes total (1359216640 bytes free)

sw#dir crashinfo-2:
Directory of crashinfo-2:/
29313 drwx 24576 Aug 5 2021 14:19:42 +02:00 tracelogs
15 -rw- 11033250 Aug 5 2021 10:57:35 +02:00 sw_2_RP_0-system-report_2_20210805-105728-CEST.tar.gz
14 -rw- 2932907 Aug 4 2021 19:48:30 +02:00 sw_2_RP_0_trace_archive_0-20210804-194826.tar.gz
13 -rw- 9819246 Aug 4 2021 18:00:54 +02:00 sw_2_RP_0-system-report_2_20210804-180048-CEST.tar.gz
11 -rw- 2797408 Aug 4 2021 18:00:46 +02:00 sw_2_RP_0_trace_archive_0-20210804-180041.tar.gz
12 -rw- 0 Dec 12 2020 05:36:51 +01:00 koops.dat
825753600 bytes total (751304704 bytes free)

sw#sh log on switch 2 uptime detail
--------------------------------------------------------------------------------
UPTIME SUMMARY INFORMATION
--------------------------------------------------------------------------------
First customer power on : 05/24/2021 12:01:55
Total uptime : 0 years 0 weeks 1 days 0 hours 43 minutes
Total downtime : 0 years 10 weeks 2 days 1 hours 20 minutes
Number of resets : 8
Number of slot changes : 1
Current reset reason : Reload Slot Command
Current reset timestamp : 08/05/2021 11:06:41
Current slot : 2
Chassis type : 247
Current uptime : 0 years 0 weeks 0 days 3 hours 0 minutes
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
UPTIME CONTINUOUS INFORMATION
--------------------------------------------------------------------------------
Time Stamp          | Reset  | Uptime
MM/DD/YYYY HH:MM:SS | Reason | years weeks days hours minutes
--------------------------------------------------------------------------------
05/24/2021 12:01:55 Power Failure or Unknown 0 0 0 0 0
05/24/2021 12:20:42 Image Install 0 0 0 0 15
05/24/2021 12:24:27 Reload Command 0 0 0 0 0
06/28/2021 11:54:49 Power Failure or Unknown 0 0 0 0 0
06/28/2021 15:37:30 Power Failure or Unknown 0 0 0 1 0
08/04/2021 15:39:30 Power Failure or Unknown 0 0 0 1 0
08/04/2021 18:03:26 stack merge 0 0 0 1 58
08/05/2021 11:00:07 stack merge 0 0 0 16 0
08/05/2021 11:06:41 Reload Slot Command 0 0 0 0 0
--------------------------------------------------------------------------------
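For anyone who wants to tally the reset reasons from the "UPTIME CONTINUOUS INFORMATION" section instead of reading them by eye, here is a rough Python sketch. The sample rows are copied from the output above, one entry per line. The name reset_reasons is hypothetical, and the regex assumes each entry is a timestamp, a free-text reason, and five uptime numbers, as seen here.

```python
import re
from collections import Counter

# Rows copied from the "sh log on switch 2 uptime detail" output in this thread.
UPTIME_LOG = """\
05/24/2021 12:01:55 Power Failure or Unknown 0 0 0 0 0
05/24/2021 12:20:42 Image Install 0 0 0 0 15
05/24/2021 12:24:27 Reload Command 0 0 0 0 0
06/28/2021 11:54:49 Power Failure or Unknown 0 0 0 0 0
06/28/2021 15:37:30 Power Failure or Unknown 0 0 0 1 0
08/04/2021 15:39:30 Power Failure or Unknown 0 0 0 1 0
08/04/2021 18:03:26 stack merge 0 0 0 1 58
08/05/2021 11:00:07 stack merge 0 0 0 16 0
08/05/2021 11:06:41 Reload Slot Command 0 0 0 0 0
"""

# Timestamp, reset reason (free text), then years/weeks/days/hours/minutes.
ENTRY = re.compile(
    r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\s+(.+?)"
    r"\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)$"
)


def reset_reasons(text):
    """Count how often each reset reason appears in the uptime history."""
    counts = Counter()
    for line in text.splitlines():
        match = ENTRY.match(line.strip())
        if match:
            counts[match.group(2)] += 1
    return counts


if __name__ == "__main__":
    for reason, count in reset_reasons(UPTIME_LOG).most_common():
        print(f"{count:2d}  {reason}")
```

Lines that do not match the pattern (headers, separators) are ignored, so the full command output can be pasted in as is.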
08-05-2021 03:59 PM
@MichaelBalzer77353 wrote:
08/04/2021 18:03:26 stack merge 0 0 0 1 58
08/05/2021 11:00:07 stack merge 0 0 0 16 0
Hello there, my "friend". Nice to see you again. (F*ck!)
@MichaelBalzer77353 wrote:
15 -rw- 11033250 Aug 5 2021 10:57:35 +02:00 sw_2_RP_0-system-report_2_20210805-105728-CEST.tar.gz
13 -rw- 9819246 Aug 4 2021 18:00:54 +02:00 sw_2_RP_0-system-report_2_20210804-180048-CEST.tar.gz
There are several known bugs on the 9200/9200L and 9300/9300L that involve "stack merge". It is a very "generic" reset reason that gets logged when something "blows up" in the switch stack process, such as a memory leak.
I cannot guarantee anything but if you can attach the two system reports, I may be able to determine what is causing the stack merge. Again, I cannot guarantee anything -- I do not work for Cisco or Cisco TAC and I have very simple methods of "looking" (vs analyzing) those crashinfo files.
08-05-2021 06:03 AM
Yes, I joined switch #2 to the stack yesterday, but when I checked this morning it was "Removed". So earlier today I ran "reload slot 2" and now it all looks good. But I wonder how it could end up "Removed" on its own...
08-05-2021 06:55 AM
It crashed again and now I have the full log from the buffer:
Aug 5 15:29:15: %REDUNDANCY-3-STANDBY_LOST: Standby processor fault (PEER_NOT_PRESENT)
Aug 5 15:29:15: %REDUNDANCY-3-STANDBY_LOST: Standby processor fault (PEER_DOWN)
Aug 5 15:29:15: %REDUNDANCY-3-STANDBY_LOST: Standby processor fault (PEER_REDUNDANCY_STATE_CHANGE)
Aug 5 15:29:15: %HMANRP-5-CHASSIS_DOWN_EVENT: Chassis 2 gone DOWN!
Aug 5 15:29:15: %STACKMGR-4-SWITCH_REMOVED: Switch 1 R0/0: stack_mgr: Switch 2 has been removed from the stack.
Aug 5 15:29:15: %RF-5-RF_RELOAD: Peer reload. Reason: EHSA standby down
Aug 5 15:29:16: %IOSXE_REDUNDANCY-6-PEER_LOST: Active detected switch 2 is no longer standby
Aug 5 15:29:16: %HMANRP-6-HMAN_IOS_CHANNEL_INFO: HMAN-IOS channel event for switch 2: EMP_RELAY: Channel DOWN!
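To pick the more severe messages out of a longer buffer, something like the Python sketch below may help. It splits each line on the usual %FACILITY-SEVERITY-MNEMONIC pattern and keeps entries at severity 4 (warning) or numerically lower, since lower numbers are more severe in IOS logging. The log text is copied from this post; the names parse_syslog and at_least_as_severe are made up for this example.

```python
import re

# Log lines copied from the buffer output in this thread.
LOG = """\
Aug 5 15:29:15: %REDUNDANCY-3-STANDBY_LOST: Standby processor fault (PEER_NOT_PRESENT)
Aug 5 15:29:15: %REDUNDANCY-3-STANDBY_LOST: Standby processor fault (PEER_DOWN)
Aug 5 15:29:15: %REDUNDANCY-3-STANDBY_LOST: Standby processor fault (PEER_REDUNDANCY_STATE_CHANGE)
Aug 5 15:29:15: %HMANRP-5-CHASSIS_DOWN_EVENT: Chassis 2 gone DOWN!
Aug 5 15:29:15: %STACKMGR-4-SWITCH_REMOVED: Switch 1 R0/0: stack_mgr: Switch 2 has been removed from the stack.
Aug 5 15:29:15: %RF-5-RF_RELOAD: Peer reload. Reason: EHSA standby down
Aug 5 15:29:16: %IOSXE_REDUNDANCY-6-PEER_LOST: Active detected switch 2 is no longer standby
Aug 5 15:29:16: %HMANRP-6-HMAN_IOS_CHANNEL_INFO: HMAN-IOS channel event for switch 2: EMP_RELAY: Channel DOWN!
"""

# %FACILITY-SEVERITY-MNEMONIC: message text
MESSAGE = re.compile(
    r"%(?P<facility>[A-Z0-9_]+)-(?P<severity>[0-7])-(?P<mnemonic>[A-Z0-9_]+):"
    r"\s*(?P<text>.*)$"
)


def parse_syslog(text):
    """Return a dict per log line that carries a %FACILITY-SEV-MNEMONIC tag."""
    entries = []
    for line in text.splitlines():
        match = MESSAGE.search(line)
        if match:
            entry = match.groupdict()
            entry["severity"] = int(entry["severity"])
            entries.append(entry)
    return entries


def at_least_as_severe(entries, level):
    """Keep entries whose severity number is <= level (lower is more severe)."""
    return [entry for entry in entries if entry["severity"] <= level]


if __name__ == "__main__":
    for entry in at_least_as_severe(parse_syslog(LOG), 4):
        print(entry["facility"], entry["severity"], entry["mnemonic"])
```

This only sorts what is already in the buffer; it does not explain why the standby was lost.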
08-05-2021 09:53 AM
>....
Aug 5 15:29:15: %REDUNDANCY-3-STANDBY_LOST: Standby processor fault (PEER_NOT_PRESENT)
Are there any other related messages around the same time, before this message (see above)?
M.