02-02-2021 07:52 AM
Hello,
We have an issue with one of our C9300-48P switch stacks; it has 4 members.
When we do "wr" or "copy run start", it sometimes gets stuck and the switch no longer responds in the CLI.
Here are the version details :
C9300-48P 16.12.4 CAT9K_IOSXE INSTALL
Are you aware of this bug, and do you know why it's doing that?
04-09-2021 12:51 AM - edited 04-09-2021 12:52 AM
FYI, this issue is solved. A TAC case was opened and a Cisco engineer was able to find the cause: it is related to bug CSCvp92564.
There is also a Reddit thread about this bug: KRON defects : Cisco (reddit.com)
So it was the scheduled KRON task that writes the config that was causing the freeze + reboot issue each time we did WR or "copy run start".
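To check whether a stack has a scheduled KRON task of this kind, a saved copy of the running config can be scanned for kron policy-lists that contain a config-save command. The following is only a minimal sketch: the sample config, the policy-list names, and the function name are hypothetical and not taken from this thread.

```python
import re

# Hypothetical config excerpt for illustration only; not from the thread.
SAMPLE_CONFIG = """\
kron occurrence SaveConfig at 23:00 recurring
 policy-list SaveConfigPolicy
!
kron policy-list SaveConfigPolicy
 cli write memory
!
kron policy-list ShowClock
 cli show clock
"""

# Matches "cli wr", "cli write memory", "cli copy run start" and longer forms.
SAVE_COMMAND = re.compile(
    r"^\s*cli\s+(wr\S*(\s+mem\S*)?|copy\s+run\S*\s+start\S*)\s*$",
    re.IGNORECASE,
)


def kron_lists_that_save(config_text):
    """Return the names of kron policy-lists that contain a config-save command."""
    hits = []
    current = None
    for line in config_text.splitlines():
        header = re.match(r"^kron policy-list\s+(\S+)", line)
        if header:
            current = header.group(1)
        elif not line.startswith(" "):
            current = None  # an unindented line ends the policy-list block
        elif current and SAVE_COMMAND.match(line) and current not in hits:
            hits.append(current)
    return hits


print(kron_lists_that_save(SAMPLE_CONFIG))
```

Any policy-list it reports is only a candidate to review; whether it is really the trigger on a given stack is something for TAC to confirm.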
The solution proposed by Cisco, in the config :
02-02-2021 08:06 AM
We are not aware of any such bug; we are running many devices with the same version.
Can you post
show switch
show process cpu sort | ex 0.00
show process cpu history
02-02-2021 09:45 AM
Here are the results :
#sh switch
Switch/Stack Mac Address : 40f0.xxxx.xxxx - Local Mac Address
Mac persistency wait time: Indefinite
                                             H/W   Current
Switch#   Role    Mac Address     Priority Version  State
-------------------------------------------------------------------------------------
 1       Standby  40f0.xxxx.xxxx     15     V04     Ready
 2       Member   40f0.xxxx.xxxx     1      V04     Ready
*3       Active   40f0.xxxx.xxxx     1      V04     Ready
 4       Member   40f0.xxxx.xxxx     5      V04     Ready
show process cpu sort | ex 0.00
CPU utilization for five seconds: 2%/0%; one minute: 2%; five minutes: 2%
 PID Runtime(ms)   Invoked  uSecs   5Sec   1Min   5Min TTY Process
 101     1531419   1484196   1031  0.79%  0.63%  0.60%   0 Crimson flush tr
 255       29104     70405    413  0.23%  0.26%  0.25%   0 Spanning Tree
 162       10680      6199   1722  0.15%  0.11%  0.10%   0 IOMD IPC process
  81       39919    244684    163  0.07%  0.13%  0.15%   0 IOSD ipc task
 129        6186    108176     57  0.07%  0.04%  0.05%   0 IOSXE-RP Punt Se
 133      363664  38259431      9  0.07%  0.05%  0.05%   0 L2 LISP Punt Pro
 117      439483   3561577    123  0.07%  0.07%  0.07%   0 Crimson config p
 554        8638     36653    235  0.07%  0.08%  0.07%   0 LLDP Protocol
 549        4029     73136     55  0.07%  0.03%  0.03%   0 PDU DISPATCHER
  73        5937     35059    169  0.07%  0.07%  0.05%   0 Net Background
 257       94582   6154633     15  0.07%  0.07%  0.07%   0 UDLD
#show process cpu history
CPU% per second (last 60 seconds): 2-4%
CPU% per minute (last 60 minutes): 3-5%
CPU% per hour (last 72 hours): mostly 1-3%, with a few peaks of roughly 10-20%
* = maximum CPU% # = average CPU%
02-02-2021 10:00 AM
#sh switch - looking at the output, the stack members are jumbled. The stack may require a reboot (once we see the output below, we can tell whether any switches rebooted, and their uptime).
Can you post show version (complete output)
02-02-2021 10:07 AM
show version
Cisco IOS XE Software, Version 16.12.04
Cisco IOS Software [Gibraltar], Catalyst L3 Switch Software (CAT9K_IOSXE), Version 16.12.4, RELEASE SOFTWARE (fc5)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2020 by Cisco Systems, Inc.
Compiled Thu 09-Jul-20 21:49 by mcpre
Cisco IOS-XE software, Copyright (c) 2005-2020 by cisco Systems, Inc.
All rights reserved. Certain components of Cisco IOS-XE software are
licensed under the GNU General Public License ("GPL") Version 2.0. The
software code licensed under GPL Version 2.0 is free software that comes
with ABSOLUTELY NO WARRANTY. You can redistribute and/or modify such
GPL code under the terms of GPL Version 2.0. For more details, see the
documentation or "License Notice" file accompanying the IOS-XE software,
or the applicable URL provided on the flyer accompanying the IOS-XE
software.
ROM: IOS-XE ROMMON
BOOTLDR: System Bootstrap, Version 17.3.1r[FC3], RELEASE SOFTWARE (P)
WROSWC1 uptime is 2 weeks, 5 days, 5 hours, 19 minutes
Uptime for this control processor is 2 weeks, 5 days, 5 hours, 20 minutes
System returned to ROM by SSO Switchover
System image file is "flash:packages.conf"
Last reload reason: PowerOn
This product contains cryptographic features and is subject to United
States and local country laws governing import, export, transfer and
use. Delivery of Cisco cryptographic products does not imply
third-party authority to import, export, distribute or use encryption.
Importers, exporters, distributors and users are responsible for
compliance with U.S. and local country laws. By using this product you
agree to comply with applicable laws and regulations. If you are unable
to comply with U.S. and local laws, return this product immediately.
A summary of U.S. laws governing Cisco cryptographic products may be found at:
http://www.cisco.com/wwl/export/crypto/tool/stqrg.html
If you require further assistance please contact us by sending email to
export@cisco.com.
Technology Package License Information:
------------------------------------------------------------------------------
Technology-package                                     Technology-package
Current                        Type                       Next reboot
------------------------------------------------------------------------------
network-advantage       Smart License                    network-advantage
dna-advantage           Subscription Smart License       dna-advantage
AIR License Level: AIR DNA Advantage
Next reload AIR license Level: AIR DNA Advantage
Smart Licensing Status: UNREGISTERED/EVAL MODE
cisco C9300-48P (X86) processor with 1343576K/6147K bytes of memory.
Processor board ID FOC2444L966
12 Virtual Ethernet interfaces
208 Gigabit Ethernet interfaces
32 Ten Gigabit Ethernet interfaces
8 TwentyFive Gigabit Ethernet interfaces
8 Forty Gigabit Ethernet interfaces
2048K bytes of non-volatile configuration memory.
8388608K bytes of physical memory.
1638400K bytes of Crash Files at crashinfo:.
1638400K bytes of Crash Files at crashinfo-2:.
11264000K bytes of Flash at flash:.
11264000K bytes of Flash at flash-2:.
0K bytes of WebUI ODM Files at webui:.
1638400K bytes of Crash Files at crashinfo-1:.
11264000K bytes of Flash at flash-1:.
1638400K bytes of Crash Files at crashinfo-4:.
11264000K bytes of Flash at flash-4:.
Base Ethernet MAC Address : xx:xx:xx:xx:xx:xx
Motherboard Assembly Number : 73-19918-03
Motherboard Serial Number : FOXXXX
Model Revision Number : A0
Motherboard Revision Number : A0
Model Number : C9300-48P
System Serial Number : FOXXXX
Switch Ports Model              SW Version        SW Image              Mode
------ ----- -----              ----------        ----------            ----
     1 65    C9300-48P          16.12.4           CAT9K_IOSXE           INSTALL
     2 65    C9300-48P          16.12.4           CAT9K_IOSXE           INSTALL
*    3 65    C9300-48P          16.12.4           CAT9K_IOSXE           INSTALL
     4 65    C9300-48P          16.12.4           CAT9K_IOSXE           INSTALL
Switch 01
---------
Switch uptime : 1 week, 3 hours, 16 minutes
Base Ethernet MAC Address : xx:xx:xx:xx:xx:xx
Motherboard Assembly Number : 73-19918-03
Motherboard Serial Number : FOXXXX
Model Revision Number : A0
Motherboard Revision Number : A0
Model Number : C9300-48P
System Serial Number : FOXXXX
Last reload reason : Critical software exception
Switch 02
---------
Switch uptime : 2 weeks, 5 days, 5 hours, 21 minutes
Base Ethernet MAC Address : xx:xx:xx:xx:xx:xx
Motherboard Assembly Number : 73-19918-03
Motherboard Serial Number : FOXXXX
Model Revision Number : A0
Motherboard Revision Number : A0
Model Number : C9300-48P
System Serial Number : FOXXXX
Last reload reason : PowerOn
Switch 04
---------
Switch uptime : 2 hours, 55 minutes
Base Ethernet MAC Address : xx:xx:xx:xx:xx:xx
Motherboard Assembly Number : 73-19918-03
Motherboard Serial Number : FOXXXX
Model Revision Number : A0
Motherboard Revision Number : A0
Model Number : C9300-48P
System Serial Number : FOXXXX
Last reload reason : Critical software exception
Configuration register is 0x102
02-02-2021 10:24 AM
You have an inconsistency: for some reason some switches have rebooted. Check the uptimes below.
1. Check the power source.
2. Check the logs for anything abnormal; capture it for your records or post it here.
Take a maintenance window, power off the whole stack, power the switches back on in priority order, and check.
Switch 3
---------
uptime is 2 weeks, 5 days, 5 hours, 19 minutes
Switch 01
---------
Switch uptime : 1 week, 3 hours, 16 minutes
Switch 02
---------
Switch uptime : 2 weeks, 5 days, 5 hours, 21 minutes
Switch 04
---------
Switch uptime : 2 hours, 55 minutes
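The uptime comparison above can also be done mechanically. As a rough sketch (the function name and the one-hour threshold are my own assumptions, and only the time units that appear in this thread are handled), the uptime strings can be converted to minutes and any member that came up well after the longest-running one flagged:

```python
import re

# Minutes per unit; only the units that appear in this thread are handled.
UNIT_MINUTES = {"week": 10080, "day": 1440, "hour": 60, "minute": 1}


def uptime_minutes(text):
    """Convert an uptime string such as '1 week, 3 hours, 16 minutes' to minutes."""
    total = 0
    for count, unit in re.findall(r"(\d+)\s+(week|day|hour|minute)s?", text):
        total += int(count) * UNIT_MINUTES[unit]
    return total


# Uptime strings copied from the show version output in this thread.
uptimes = {
    "Switch 3": "2 weeks, 5 days, 5 hours, 19 minutes",
    "Switch 01": "1 week, 3 hours, 16 minutes",
    "Switch 02": "2 weeks, 5 days, 5 hours, 21 minutes",
    "Switch 04": "2 hours, 55 minutes",
}

minutes = {name: uptime_minutes(value) for name, value in uptimes.items()}
longest = max(minutes.values())

for name, value in sorted(minutes.items(), key=lambda item: item[1]):
    # Assumed threshold: more than 60 minutes behind the longest-running member.
    flag = "  <- restarted later than the rest" if longest - value > 60 else ""
    print(f"{name}: {value} min{flag}")
```

With these values, Switch 01 and Switch 04 are flagged, which matches the two members that show "Critical software exception" as their last reload reason.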
02-02-2021 11:30 AM
Thanks, I will check that. Do you think the stack inconsistency could cause the issue with WR?
02-02-2021 12:08 PM
Yes, fixing that should resolve your issue. First, investigate why the switches are rebooting; the logs should give some indication of what is wrong.
02-02-2021 02:07 PM
@Clem58 wrote:
Critical software exception
Hmmmm ...
Can you post the complete output to the following commands:
dir crashinfo-1:
dir crashinfo-4:
dir flash-1:core
dir flash-4:core
02-02-2021 10:27 PM
dir crashinfo-1:
Directory of crashinfo-1:/
11 -rw- 0 Dec 1 2020 06:16:03 +00:00 koops.dat
23665 drwx 24576 Feb 3 2021 06:22:43 +00:00 tracelogs
63105 drwx 4096 Feb 2 2021 15:06:53 +00:00 license_evlog
12 -rw- 38135814 Jan 12 2021 13:32:34 +00:00 system-report_1_20210112-133232-UTC.tar.gz
13 -rw- 38882607 Jan 26 2021 14:43:36 +00:00 system-report_1_20210126-144334-UTC.tar.gz
14 -rw- 1917962 Feb 2 2021 15:04:00 +00:00 SWNAME_1_RP_0_trace_archive_0-20210202-150359.tar.gz
1651507200 bytes total (1481637888 bytes free)
dir crashinfo-4:
Directory of crashinfo-4:/
11 -rw- 0 Dec 1 2020 06:04:47 +00:00 koops.dat
15777 drwx 32768 Feb 3 2021 06:23:31 +00:00 tracelogs
7889 drwx 4096 Jan 14 2021 12:45:32 +00:00 license_evlog
12 -rw- 1393456 Jan 12 2021 13:31:51 +00:00 SWNAME_4_RP_0_trace_archive_0-20210112-133150.tar.gz
13 -rw- 1530024 Jan 26 2021 14:42:52 +00:00 SWNAME_trace_archive_0-20210126-144251.tar.gz
14 -rw- 38381941 Feb 2 2021 15:04:47 +00:00 system-report_4_20210202-150445-UTC.tar.gz
1651507200 bytes total (1520435200 bytes free)
dir flash-1:core
Directory of flash-1:/core/
98305 drwx 4096 Dec 1 2020 05:55:06 +00:00 modules
81922 -rw- 1 Feb 3 2021 06:21:59 +00:00 .callhome
11353980928 bytes total (9936306176 bytes free)
dir flash-4:core
Directory of flash-4:/core/
376833 drwx 4096 Dec 1 2020 05:43:38 +00:00 modules
360450 -rw- 1 Feb 2 2021 14:48:49 +00:00 .callhome
11353980928 bytes total (9936306176 bytes free)
02-02-2021 11:32 PM
@Clem58 wrote:
SWNAME_1_RP_0_trace_archive_0-20210202-150359.tar.gz
system-report_4_20210202-150445-UTC.tar.gz
Export these two files and send them to TAC for analysis.
Alternatively, 16.12.5 was released 01 February 2021. I would recommend upgrading to that firmware.
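If there are many members to go through, the archives worth exporting can be picked out of saved "dir" output with a small script. This is only a sketch: the regular expression assumes the column layout shown in this thread, and the function name is made up for illustration.

```python
import re

# One file line of "dir" output: index, permissions, size, date, time, offset, name.
DIR_LINE = re.compile(
    r"^\s*\d+\s+-rw-\s+(?P<size>\d+)\s+"
    r"(?P<date>\w{3}\s+\d{1,2}\s+\d{4})\s+\S+\s+\S+\s+"
    r"(?P<name>\S+\.tar\.gz)\s*$"
)

# Lines copied from the "dir crashinfo-4:" output posted in this thread.
DIR_OUTPUT = """\
11 -rw- 0 Dec 1 2020 06:04:47 +00:00 koops.dat
15777 drwx 32768 Feb 3 2021 06:23:31 +00:00 tracelogs
12 -rw- 1393456 Jan 12 2021 13:31:51 +00:00 SWNAME_4_RP_0_trace_archive_0-20210112-133150.tar.gz
13 -rw- 1530024 Jan 26 2021 14:42:52 +00:00 SWNAME_trace_archive_0-20210126-144251.tar.gz
14 -rw- 38381941 Feb 2 2021 15:04:47 +00:00 system-report_4_20210202-150445-UTC.tar.gz
"""


def archives(dir_output):
    """Return (name, date, size_in_bytes) for every .tar.gz file in dir output."""
    found = []
    for line in dir_output.splitlines():
        match = DIR_LINE.match(line)
        if match:
            found.append((match["name"], match["date"], int(match["size"])))
    return found


for name, date, size in archives(DIR_OUTPUT):
    print(f"{date}  {size:>10}  {name}")
```

The dates on the archives line up with the days a member reloaded, so the most recent system-report is usually the first one to hand over.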
02-02-2021 11:37 PM - edited 02-03-2021 12:20 AM
We have 2 switches with priority 1, could that be a problem? On our other core switches the priorities are 15-14-6-5.
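For reference, shared priorities can be spotted in "show switch" output with a few lines of Python. This is a sketch only; the regular expression assumes the row layout posted earlier in this thread, and the function names are hypothetical. A shared priority is not necessarily a fault, it just means the election between those members is decided by other criteria.

```python
import re

# Member rows copied from the "show switch" output posted in this thread.
SHOW_SWITCH = """\
 1 Standby 40f0.xxxx.xxxx 15 V04 Ready
 2 Member 40f0.xxxx.xxxx 1 V04 Ready
*3 Active 40f0.xxxx.xxxx 1 V04 Ready
 4 Member 40f0.xxxx.xxxx 5 V04 Ready
"""

# Switch number, role, MAC address, priority, H/W version, state.
ROW = re.compile(
    r"^\s*\*?\s*(\d+)\s+(Active|Standby|Member)\s+(\S+)\s+(\d+)\s+\S+\s+(\S+)"
)


def parse_show_switch(text):
    """Return one dict per stack member row."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            rows.append({
                "switch": int(match[1]),
                "role": match[2],
                "priority": int(match[4]),
                "state": match[5],
            })
    return rows


def shared_priorities(rows):
    """Map each priority used by more than one member to those member numbers."""
    by_priority = {}
    for row in rows:
        by_priority.setdefault(row["priority"], []).append(row["switch"])
    return {prio: members for prio, members in by_priority.items() if len(members) > 1}


print(shared_priorities(parse_show_switch(SHOW_SWITCH)))
```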
02-03-2021 06:46 AM
We now have these lines flooding the log :
*Feb 3 2021 14:42:00.550 UTC: %PARSER-6-WMLRETRY: Write memory lock currently held by pid '147', automatic retry. -Process= "SSH Process", ipl= 0, pid= 128
*Feb 3 2021 14:42:30.551 UTC: %PARSER-6-WMLRETRY: Write memory lock currently held by pid '147', automatic retry. -Process= "SSH Process", ipl= 0, pid= 128
*Feb 3 2021 14:43:00.551 UTC: %PARSER-6-WMLRETRY: Write memory lock currently held by pid '147', automatic retry. -Process= "SSH Process", ipl= 0, pid= 128
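To see which process is holding the write memory lock and how often the retry fires, the messages can be summarised with a short script. This is a sketch under the assumption that the messages keep exactly the format shown above; the function name is made up.

```python
import re
from collections import Counter

# Log lines copied from this thread.
LOG = """\
*Feb 3 2021 14:42:00.550 UTC: %PARSER-6-WMLRETRY: Write memory lock currently held by pid '147', automatic retry. -Process= "SSH Process", ipl= 0, pid= 128
*Feb 3 2021 14:42:30.551 UTC: %PARSER-6-WMLRETRY: Write memory lock currently held by pid '147', automatic retry. -Process= "SSH Process", ipl= 0, pid= 128
*Feb 3 2021 14:43:00.551 UTC: %PARSER-6-WMLRETRY: Write memory lock currently held by pid '147', automatic retry. -Process= "SSH Process", ipl= 0, pid= 128
"""

WMLRETRY = re.compile(
    r"%PARSER-6-WMLRETRY: Write memory lock currently held by pid '(?P<holder>\d+)'.*"
    r'-Process= "(?P<waiter>[^"]+)".*pid= (?P<waiter_pid>\d+)'
)


def lock_holders(log_text):
    """Count retries per (holder pid, waiting process name, waiting pid)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = WMLRETRY.search(line)
        if match:
            key = (int(match["holder"]), match["waiter"], int(match["waiter_pid"]))
            counts[key] += 1
    return counts


for (holder, waiter, waiter_pid), count in lock_holders(LOG).items():
    print(f"pid {holder} holds the lock; {waiter} (pid {waiter_pid}) retried {count} times")
```

Here every retry points at pid 147 as the holder, so that is the PID to look up in the process list (the PID column of "show process cpu").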
02-03-2021 07:39 AM
This could be a bug.
Please post the show run config,
and also show version, to see whether the switches are still rebooting.
Your priorities are OK, but if 15 and 14 go down, one of the others will become master; that is expected behaviour.
I also suggest raising a TAC case.
Try the below:
clear tcp tcb
02-03-2021 07:47 AM
No, it's not rebooting, and the 2 commands ran without any problem.
I didn't run "clear tcp tcb"; I'm afraid it could break something?