05-12-2022 04:01 PM
Hi Everyone,
We have a couple of Cisco C9200L-48PL-4X switches as our edge switches, and we found that whenever we try to save the config, the master switch in the stack crashes/reboots; the "standby" switch does not. We see this happening only on stacked switches, and we get the error log below:
Switch 1 reloaded due to "Critical Software Exception" and Switch 2 rebooted due to "Bulk Sync Failure".
What we do then is wait for the switches to come back up and issue another reload to get the stack operating normally again.
#sh switch
Switch/Stack Mac Address : bcd2.95e8.5000 - Local Mac Address
Mac persistency wait time: Indefinite
                                             H/W   Current
Switch#   Role    Mac Address     Priority Version  State
-------------------------------------------------------------------------------------
*1       Active   bcd2.95e8.5000     15     V01     Ready
 2       Member   bcd2.95d8.4c80     1      V01     Ready
 3       Standby  bcd2.95d8.4a00     10     V01     Ready
#sh ver
Cisco IOS XE Software, Version 17.07.01
Cisco IOS Software [Cupertino], Catalyst L3 Switch Software (CAT9K_LITE_IOSXE), Version 17.7.1, RELEASE SOFTWARE (fc5)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2021 by Cisco Systems, Inc.
Compiled Sat 04-Dec-21 15:32 by mcpre
Switch Ports Model              SW Version        SW Image              Mode
------ ----- -----              ----------        ----------            ----
*    1 52    C9200L-48PL-4X     17.07.01          CAT9K_LITE_IOSXE      INSTALL
     2 52    C9200L-48PL-4X     17.07.01          CAT9K_LITE_IOSXE      INSTALL
     3 52    C9200L-48PL-4X     17.07.01          CAT9K_LITE_IOSXE      INSTALL
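For anyone checking many stacks, the per-member table above can be compared by script rather than by eye. The following is only a rough sketch in Python: it assumes the `sh ver` output has been saved as plain text, and the function name `member_versions` and the `SAMPLE` text are made up for illustration, not part of any Cisco tooling.

```python
def member_versions(show_version_text: str) -> dict[int, str]:
    """Map stack member number -> SW version from the 'sh ver' summary table.

    Hypothetical helper: it only looks for rows shaped like
    '<switch> <ports> <model> <version> <image> <mode>'.
    """
    versions: dict[int, str] = {}
    for line in show_version_text.splitlines():
        # The active switch row is prefixed with '*'; treat it as whitespace.
        fields = line.replace("*", " ").split()
        if len(fields) == 6 and fields[0].isdigit() and fields[1].isdigit():
            versions[int(fields[0])] = fields[3]
    return versions


# Sample text copied from the table in this post.
SAMPLE = """\
Switch Ports Model SW Version SW Image Mode
------ ----- ----- ---------- ---------- ----
* 1 52 C9200L-48PL-4X 17.07.01 CAT9K_LITE_IOSXE INSTALL
2 52 C9200L-48PL-4X 17.07.01 CAT9K_LITE_IOSXE INSTALL
3 52 C9200L-48PL-4X 17.07.01 CAT9K_LITE_IOSXE INSTALL
"""

if __name__ == "__main__":
    versions = member_versions(SAMPLE)
    print(versions)
    if len(set(versions.values())) > 1:
        print("members are running different versions")
    else:
        print("all members report the same version")
```

The header and separator rows are skipped because they do not start with two numeric fields, so the same function should work whether or not the column alignment survived the copy and paste.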
05-12-2022 04:03 PM
17.7.1 is the latest code. It is worth opening a TAC case, or downgrading to 17.3.3 (or a similar release) and testing.
05-12-2022 06:31 PM - edited 05-12-2022 06:40 PM
I would not use 17.X.X on 9200/9200L because there have been several memory leaks involving the stack-mgr process. When this process blows up, it will show up as "EHSA standby down" or "Bulk Sync Failure" or something else like "rc_0_0_0".
Downgrade the firmware to 16.12.7 and see if this makes the stack stable.
Please post the complete output to the following commands:
11-16-2023 11:38 AM - edited 11-16-2023 11:39 AM
Leo Laohoo, do you still feel this way after a year and a half? We are looking at moving to 17.9.4a and came across this post.
11-16-2023 03:00 PM
We have >800 stacks of 9300 and >400 9500 on 17.9.3 and we have not seen any issues.
I am slowly upgrading our switches to 17.9.4a.
11-16-2023 03:26 PM
thanks.
11-16-2023 03:21 PM
Yes, we upgraded to 17.9.4a and it looks good so far.
11-16-2023 03:26 PM
thanks
01-11-2024 07:05 AM
I'm having sporadic issues with 17.9.4a. I have had 4 stacks randomly crash a single member. I am thinking about rolling back to 17.9.3. That seems to be the most stable release, unless anyone else is experiencing issues with that release.
01-11-2024 08:54 AM
We are not aware of any issues with 17.9.4a. It may be worth opening a TAC case so they can investigate for you.
If you are happy with 17.9.3, feel free to downgrade to restore service.
01-11-2024 01:45 PM
@ParsonsAP82 wrote:
I'm having sporadic issues with 17.9.4a. I have had 4 stacks randomly crash a single member.
Post the complete output to the following commands:
sh version
sh logging onboard switch <CRASHING SWITCH MEMBER> uptime detail
dir crashinfo-<CRASHING SWITCH MEMBER>:
dir crashinfo-<CRASHING SWITCH MEMBER>:tracelogs | exclude gz
dir flash-<CRASHING SWITCH MEMBER>:core
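If several members have crashed, it can help to fill in the member number for each one instead of editing the list by hand. Here is a minimal sketch in Python, assuming the placeholder `<CRASHING SWITCH MEMBER>` is simply the stack member number; the function name `crash_diag_commands` is hypothetical and the script only builds the command strings, it does not connect to a switch.

```python
def crash_diag_commands(member: int) -> list[str]:
    """Return the commands listed above with the member number filled in."""
    if member < 1:
        raise ValueError("stack member numbers start at 1")
    return [
        "sh version",
        f"sh logging onboard switch {member} uptime detail",
        f"dir crashinfo-{member}:",
        f"dir crashinfo-{member}:tracelogs | exclude gz",
        f"dir flash-{member}:core",
    ]


if __name__ == "__main__":
    # Example: member 2 is the one that crashed.
    for cmd in crash_diag_commands(2):
        print(cmd)
```

The printed lines can then be pasted into the CLI session for the affected stack and the complete output posted back here.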