
Switch 3850 stacks upgrade failed and stuck in boot fail loop

Wayne.spq
Level 1

Hi guys,

I need some advice. Today I tried to upgrade IOS on a Cisco 3850 switch stack, and everything seemed fine until the system reloaded after the upgrade.

Then I got stuck in a boot fail loop. I followed the emergency recovery guide and pressed the Mode button, but nothing happened.

Do you know of any other way to get past this? It is driving me crazy.

 

Wayne

19 Replies

balaji.bandi
Hall of Fame

Can you explain more about the procedure you followed, including the versions you upgraded from and to?

What is the exact model?

Can you also post the console logs so we can understand the problem better?

 

BB

***** Rate All Helpful Responses *****

How to Ask The Cisco Community for Help

Hi BB,

Here is the whole log from the upgrade.

I tried to upgrade from 3.2.3 to 16.9.5, and I have 5 switches in the stack:

switch 1 provision ws-c3850-48p
switch 2 provision ws-c3850-48t
switch 3 provision ws-c3850-48t
switch 4 provision ws-c3850-48p
switch 5 provision ws-c3850-48t
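
The versions involved can be read straight from the file names that appear in the install log. As a rough illustration only, the Python sketch below pulls the dotted version out of a cat3k image or package name. The helper name `parse_version` is hypothetical, and the pattern assumes three two-digit fields, as in the names in this thread.

```python
import re

def parse_version(filename):
    """Return the version embedded in a cat3k file name as a tuple of ints.

    Assumes the version appears as three two-digit fields (for example
    16.09.05 or 03.02.03), which holds for the names in this thread.
    """
    match = re.search(r"(\d{2})\.(\d{2})\.(\d{2})", filename)
    if match is None:
        raise ValueError(f"no version found in {filename!r}")
    return tuple(int(part) for part in match.groups())

old = parse_version("cat3k_caa-base.SPA.03.02.03.SE.pkg")
new = parse_version("cat3k_caa-universalk9.16.09.05.SPA.bin")
print(old, "->", new)  # (3, 2, 3) -> (16, 9, 5)
```

Comparing the tuples makes the size of the jump explicit: the stack went from the 3.x train directly to 16.9.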

 

Switch#$stall file flash:cat3k_caa-universalk9.16.09.05.SPA.bin switch ?
  Valid switch ids: 1-5

  Multiple switch ids may be specified using a
  comma separated list or  '-' separated range

  For example:
    1,3,4     specifies switch ids 1, 3 and 4
    1-3       specifies switch ids 1, 2 and 3

  WORD  Switch id(s)

Switch#$stall file flash:cat3k_caa-universalk9.16.09.05.SPA.bin switch 1-5
Preparing install operation ...
[1]: Copying software from active switch 1 to switches 2,3,4,5
[1]: Finished copying software to switches 2,3,4,5
[1 2 3 4 5]: Starting install operation
[1 2 3 4 5]: Expanding bundle flash:cat3k_caa-universalk9.16.09.05.SPA.bin
[1 2 3 4 5]: Copying package files
[1 2 3 4 5]: Package files copied
[1 2 3 4 5]: Finished expanding bundle flash:cat3k_caa-universalk9.16.09.05.SPA.bin
[1 2 3 4 5]: Verifying and copying expanded package files to flash:
[1 2 3 4 5]: Verified and copied expanded package files to flash:
[1 2 3 4 5]: Starting compatibility checks
[1 2 3 4 5]: Finished compatibility checks
[1 2 3 4 5]: Starting application pre-installation processing
[1 2 3 4 5]: Finished application pre-installation processing
[1]: Old files list:
    Removed cat3k_caa-base.SPA.03.02.03.SE.pkg
[2]: Old files list:
    Removed cat3k_caa-base.SPA.03.02.03.SE.pkg
[3]: Old files list:
    Removed cat3k_caa-base.SPA.03.02.03.SE.pkg
[4]: Old files list:
    Removed cat3k_caa-base.SPA.03.02.03.SE.pkg
[5]: Old files list:
    Removed cat3k_caa-base.SPA.03.02.03.SE.pkg
[1]: New files list:
    Added cat3k_caa-guestshell.16.09.05.SPA.pkg
    Added cat3k_caa-rpbase.16.09.05.SPA.pkg
    Added cat3k_caa-rpcore.16.09.05.SPA.pkg
    Added cat3k_caa-srdriver.16.09.05.SPA.pkg
    Added cat3k_caa-webui.16.09.05.SPA.pkg
[2]: New files list:
    Added cat3k_caa-guestshell.16.09.05.SPA.pkg
    Added cat3k_caa-rpbase.16.09.05.SPA.pkg
    Added cat3k_caa-rpcore.16.09.05.SPA.pkg
    Added cat3k_caa-srdriver.16.09.05.SPA.pkg
    Added cat3k_caa-webui.16.09.05.SPA.pkg
[3]: New files list:
    Added cat3k_caa-guestshell.16.09.05.SPA.pkg
    Added cat3k_caa-rpbase.16.09.05.SPA.pkg
    Added cat3k_caa-rpcore.16.09.05.SPA.pkg
    Added cat3k_caa-srdriver.16.09.05.SPA.pkg
    Added cat3k_caa-webui.16.09.05.SPA.pkg
[4]: New files list:
    Added cat3k_caa-guestshell.16.09.05.SPA.pkg
    Added cat3k_caa-rpbase.16.09.05.SPA.pkg
    Added cat3k_caa-rpcore.16.09.05.SPA.pkg
    Added cat3k_caa-srdriver.16.09.05.SPA.pkg
    Added cat3k_caa-webui.16.09.05.SPA.pkg
[5]: New files list:
    Added cat3k_caa-guestshell.16.09.05.SPA.pkg
    Added cat3k_caa-rpbase.16.09.05.SPA.pkg
    Added cat3k_caa-rpcore.16.09.05.SPA.pkg
    Added cat3k_caa-srdriver.16.09.05.SPA.pkg
    Added cat3k_caa-webui.16.09.05.SPA.pkg
[1 2 3 4 5]: Creating pending provisioning file
[1 2 3 4 5]: Finished installing software.  New software will load on reboot.
[1 2 3 4 5]: Committing provisioning file

[1 2 3 4 5]: Do you want to proceed with reload? [yes/no]: yes

System configuration has been modified. Save? [yes/no]: yes
Building configuration...
Compressed configuration from 29248 bytes to 7090 bytes[OK]
[2 3 4 5]: Reloading
[1]: Pausing before reload
[1]: Pausing before reload[1]: Reloading


Switch#
<Sat Jun 20 12:40:25 2020> Message from sysmgr: Reset Reason:Reset/Reload requested by [stack-manager]. [User requested reload]

Unmounting ng3k filesystems...
Unmounted /dev/sda3...
Warning! - some ng3k filesystems may not have unmounted cleanly...
Please stand by while rebooting the system...
Restarting system.

Booting...Initializing RAM +++++++@@@@@@@@...++++++++++++++++++++++++++++++++
Base ethernet MAC Address: 38:ed:18:db:58:00

Interface GE 0 link down***ERROR: PHY link is down
Initializing Flash...

flashfs[7]: 0 files, 1 directories
flashfs[7]: 0 orphaned files, 0 orphaned directories
flashfs[7]: Total bytes: 6784000
flashfs[7]: Bytes used: 1024
flashfs[7]: Bytes available: 6782976
flashfs[7]: flashfs fsck took 1 seconds....done Initializing Flash.
Getting rest of image
Reading full image into memory....done
Reading full base package into memory...: done = 30896047
Bundle Image
--------------------------------------
Kernel Address    : 0x6042e3b4
Kernel Size       : 0x368b7b/3574651
Initramfs Address : 0x60796f2f
Initramfs Size    : 0x19b5880/26957952
Compression Format: mzip

Bootable image at @ ram:0x6042e3b4
Bootable image segment 0 address range [0x81100000, 0x81c145b0] is in range [0x80180000, 0x90000000].
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@boot_system: 377
Loading Linux kernel with entry point 0x816e5570 ...
Bootloader: Done loading app on core_mask: 0xf

### Launching Linux Kernel (flags = 0x5)


Booting...Initializing RAM +++++++@@@@@@@@...++++++++++++++++++++++++++++++++
Base ethernet MAC Address: 38:ed:18:db:58:00

Interface GE 0 link down***ERROR: PHY link is down
Initializing Flash...

flashfs[7]: 0 files, 1 directories
flashfs[7]: 0 orphaned files, 0 orphaned directories
flashfs[7]: Total bytes: 6784000
flashfs[7]: Bytes used: 1024
flashfs[7]: Bytes available: 6782976
flashfs[7]: flashfs fsck took 1 seconds....done Initializing Flash.
Getting rest of image
Reading full image into memory....done
Reading full base package into memory...: done = 30896047
Bundle Image
--------------------------------------
Kernel Address    : 0x6042e3b4
Kernel Size       : 0x368b7b/3574651
Initramfs Address : 0x60796f2f
Initramfs Size    : 0x19b5880/26957952
Compression Format: mzip

Bootable image at @ ram:0x6042e3b4
Bootable image segment 0 address range [0x81100000, 0x81c145b0] is in range [0x80180000, 0x90000000].
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@boot_system: 377
Loading Linux kernel with entry point 0x816e5570 ...
Bootloader: Done loading app on core_mask: 0xf

### Launching Linux Kernel (flags = 0x5)

%IOSXEBOOT-5c8e9d6656e9d89a8dedeae457871084-new_cksum: (rp/0): 4
%IOSXEBOOT-5c8e9d6656e9d89a8dedeae457871084-saved_cksum: (rp/0): 4
%IOSXEBOOT-Sat-###: (rp/0): Jun 20 12:44:48 Universal 2020 PLEASE DO NOT POWER CYCLE ### BOOT LOADER UPGRADING 4
? tdl.service - TDL Resolve
   Loaded: loaded (/lib/systemd/system/tdl.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Sat 2020-06-20 12:44:55 Universal; 69ms ago
  Process: 3990 ExecStart=/usr/binos/conf/tdl_boottime.sh (code=exited, status=1/FAILURE)
 Main PID: 3990 (code=exited, status=1/FAILURE)

[   64.814976] localhost systemd[1]: Starting TDL Resolve...
[   84.610154] RP_0 tdl_boottime.sh[3990]: luajit: /usr/binos/conf/epoch_lib.lua:798: attempt to compare two nil values
[   84.610946] RP_0 tdl_boottime.sh[3990]: stack traceback:
[   84.611732] RP_0 tdl_boottime.sh[3990]:         /usr/binos/conf/epoch_lib.lua:798: in function 'add_and_merge_record'
[   84.612502] RP_0 tdl_boottime.sh[3990]:         /usr/binos/conf/epoch_lib.lua:816: in function 'generate_merged_metadata'
[   84.613279] RP_0 tdl_boottime.sh[3990]:         /usr/binos/conf/epoch_lib.lua:906: in function 'generate_domain'
[   84.613998] RP_0 tdl_boottime.sh[3990]:         /usr/binos/conf/epoch_resolve.lua:209: in main chunk
[   84.614815] RP_0 tdl_boottime.sh[3990]:         [C]: at 0xaab527ccb0
[   84.620863] RP_0 tdl_boottime.sh[3990]: Time to check parameters: 0 seconds.
[   84.621663] RP_0 tdl_boottime.sh[3990]: Time to build package table: 0.002 seconds.
[   84.622446] RP_0 tdl_boottime.sh[3990]: Time to cache objects: 7.587 seconds.
[   84.629181] RP_0 systemd[1]: tdl.service: Main process exited, code=exited, status=1/FAILURE
[   84.631516] RP_0 systemd[1]: Failed to start TDL Resolve.
[   84.639898] RP_0 systemd[1]: tdl.service: Unit entered failed state.
[   84.641589] RP_0 systemd[1]: tdl.service: Failed with result 'exit-code'.
Jun 20 12:44:55.688: %BOOT-3-SYSD_STARTFAIL: R0/0: Failed to launch boot task tdl.service ( exit-code )
>>> Rebooting
Failed to set wall message, ignoring: The name org.freedesktop.login1 was not provided by any .service files
Failed to reboot system via logind: The name org.freedesktop.login1 was not provided by any .service files
reboot: Restarting system



BAUD=9600
BOOT=flash:packages.conf
CFG_MODEL_NUM=WS-C3850-48P-S
CLEI_CODE_NUMBER=IPM8E00ARA
CSR_PCIERST_DISCONNECTED=yes
ECI_CODE_NUMBER=466678
LINUX_COREMASK=15
MAC_ADDR=38:ED:18:DB:58:00
MANUAL_BOOT=no
MODEL_NUM=WS-C3850-48P
MODEL_REVISION_NUM=T0
MOTHERBOARD_ASSEMBLY_NUM=73-15800-07
MOTHERBOARD_REVISION_NUM=A0
MOTHERBOARD_SERIAL_NUM=FOC1927C0YW
POE1_ASSEMBLY_NUM=73-16439-01
POE1_REVISION_NUM=A0
POE1_SERIAL_NUM=FOC1927BUK2
POE2_ASSEMBLY_NUM=73-16439-01
POE2_REVISION_NUM=A0
POE2_SERIAL_NUM=FOC1927BU9M
RECOVERY_BUNDLE=sda9:cat3k_caa-recovery.bin
STKPWR_ASSEMBLY_NUM=73-11956-08
STKPWR_REVISION_NUM=B0
STKPWR_SERIAL_NUM=FOC1927A7DE
SYSTEM_SERIAL_NUM=FOC1928X0ME
TAN_NUM=800-43041-01
TAN_REVISION_NUMBER=C0
TERMLINES=0
USB_ASSEMBLY_NUM=73-16576-01
USB_REVISION_NUM=A0
USB_SERIAL_NUM=FOC19280GLC
VERSION_ID=V05
TEMPLATE=advanced
BSI=0
ABNORMAL_RESET_COUNT=0
SWITCH_NUMBER=1
RANDOM_NUM=797870126
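
For anyone reading a dump like this, the two variables that matter most for the boot path are BOOT and MANUAL_BOOT. The Python sketch below is only an illustration (the names `parse_env` and `boot_mode` are hypothetical): it turns KEY=value lines into a dictionary and classifies the BOOT target, on the understanding that a `packages.conf` target means install mode and a `.bin` target means bundle mode.

```python
def parse_env(dump):
    """Turn KEY=value lines into a dict, skipping blank or malformed lines."""
    env = {}
    for line in dump.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key:
            env[key] = value
    return env

def boot_mode(boot_value):
    """Classify the BOOT target: packages.conf -> install, .bin -> bundle."""
    if boot_value.endswith("packages.conf"):
        return "install"
    if boot_value.endswith(".bin"):
        return "bundle"
    return "unknown"

# A few lines copied from the dump above.
sample = """\
BAUD=9600
BOOT=flash:packages.conf
MANUAL_BOOT=no
SWITCH_NUMBER=1
"""
env = parse_env(sample)
print(boot_mode(env["BOOT"]), env["MANUAL_BOOT"])  # install no
```

With MANUAL_BOOT=no the switch tries to boot the BOOT target automatically, which fits the reboot loop shown in the log.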

After this I got a boot fail loop like the one below:

 

BOOT FAIL
BOOT FAIL
BOOT FAIL
BOOT FAIL
BOOT FAIL
BOOT FAIL
BOOT FAIL
BOOT FAIL
BOOT FAIL
BOOT FAIL
BOOT FAIL
BOOT FAIL
BOOT FAIL
BOOT FAIL
BOOT FAIL
BOOT FAIL
BOOT FAIL

 

It looks like something went wrong, or you may have hit a bug (not sure):

 

[   84.622446] RP_0 tdl_boottime.sh[3990]: Time to cache objects: 7.587 seconds.
[   84.629181] RP_0 systemd[1]: tdl.service: Main process exited, code=exited, status=1/FAILURE
[   84.631516] RP_0 systemd[1]: Failed to start TDL Resolve.
[   84.639898] RP_0 systemd[1]: tdl.service: Unit entered failed state.
[   84.641589] RP_0 systemd[1]: tdl.service: Failed with result 'exit-code'.
Jun 20 12:44:55.688: %BOOT-3-SYSD_STARTFAIL: R0/0: Failed to launch boot task tdl.service ( exit-code )
>>> Rebooting
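
When the console capture is long, the failure lines are easy to miss. The following is a small Python sketch, not a Cisco tool, that scans a saved capture for the three kinds of line seen in this excerpt. The pattern list and the name `find_failures` are assumptions chosen for this log; other failures may use different wording.

```python
import re

# Patterns taken from the lines in this thread's log; extend as needed.
FAIL_PATTERNS = [
    re.compile(r"%BOOT-3-SYSD_STARTFAIL.*"),
    re.compile(r"Failed to start .*"),
    re.compile(r"luajit: .*"),
]

def find_failures(log_text):
    """Return the matching part of every line that hits one of the patterns."""
    hits = []
    for line in log_text.splitlines():
        for pattern in FAIL_PATTERNS:
            match = pattern.search(line)
            if match:
                hits.append(match.group(0))
                break  # one hit per line is enough
    return hits

sample_log = """\
[   84.610154] RP_0 tdl_boottime.sh[3990]: luajit: /usr/binos/conf/epoch_lib.lua:798: attempt to compare two nil values
[   84.622446] RP_0 tdl_boottime.sh[3990]: Time to cache objects: 7.587 seconds.
[   84.631516] RP_0 systemd[1]: Failed to start TDL Resolve.
Jun 20 12:44:55.688: %BOOT-3-SYSD_STARTFAIL: R0/0: Failed to launch boot task tdl.service ( exit-code )
"""
for hit in find_failures(sample_log):
    print(hit)
```

On the sample it prints the luajit error, the "Failed to start TDL Resolve." line, and the %BOOT-3-SYSD_STARTFAIL message, and skips the timing line.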

 

You need to go the emergency-install route.

 

https://www.cisco.com/c/en/us/support/docs/switches/catalyst-3850-series-switches/117552-technote-cat3850-00.html#anc20

 

Copy the image onto a USB drive. Once you get the switch: prompt, run:

 

switch: emergency-install usbflash0:/cat3k_caa-universalk9.16.09.05.SPA.bin
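
Before copying the image to the USB drive, it may be worth confirming that the file is intact. This is a minimal Python sketch and not part of the Cisco procedure. It assumes you have the SHA-512 checksum published alongside the image download; the function names `file_digest` and `image_matches` are made up for illustration.

```python
import hashlib

def file_digest(path, algorithm="sha512", chunk_size=1 << 20):
    """Hash a file in chunks so a large image does not have to fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def image_matches(path, expected_hex, algorithm="sha512"):
    """Compare the file's digest with the published checksum (case-insensitive)."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()
```

A call would look like `image_matches("cat3k_caa-universalk9.16.09.05.SPA.bin", published_checksum)`, where `published_checksum` is the value copied from the download page.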

 

BB


Hi BB,
Thank you. I know I have to go through emergency recovery, but sadly I cannot break the boot cycle; the Mode button doesn't work.
I'm still looking for other ways to fix it.

Wayne

Here are the steps (be patient):

 

SUMMARY STEPS

1.    Connect a terminal or PC to the switch.

2.    Set the line speed on the emulation software to 9600 baud.

3.    Power off the standalone switch or the entire switch stack.

4.    Reconnect the power cord to the switch or the active switch. Within 15 seconds, press the Mode button while the System LED is still flashing green. Continue pressing the Mode button until all the system LEDs turn on and remain solid; then release the Mode button. (Hold the button until the lights go amber.)

5.    After recovering the password, reload the switch or the active switch.

6.    Power on the remaining switches in the stack.

 

 

BB


Hello BB,

Thank you, but I'm still stuck. Could you take a look if you have some spare time?

I took a video of what I did, but I couldn't get into recovery mode.

Here is the YouTube video: https://www.youtube.com/watch?v=fBtY_CSVNRo

 

Thank you.

Wayne

If it is possible to force each switch to go into ROMMON, then you'll need to do an Emergency Recovery.

It looks like you have tried all the options mentioned. I'd like to try a different approach so that we can at least recover part of the stack.

1. Before doing this, make sure you have a backup of the configuration stored off the box.

2. Remove all the stack cables and stack power cables from every switch. At this stage each switch acts as a standalone switch, not as part of the stack.

3. Try the above-mentioned procedure on switches 1 to 5 separately and see whether any of them reaches the switch: prompt.

4. If every switch still shows BOOT FAIL, the only option is to contact Cisco TAC and arrange an RMA (I hope you have a SmartNet contract).

 

BB


Leo Laohoo
Hall of Fame
Console into the switch and reboot the switch.
Post the entire boot-up process.

Hi Leo,

I took videos of the boot failure. I hope you can take a look and offer some advice.

These switches don't have a support contract, so I'm hoping to fix the issue this weekend.

Switches 1 and 2 show the same console message: https://youtu.be/fBtY_CSVNRo

Switches 3 and 4 show the same message: https://youtu.be/8kXkZrBjC-Y

Switch 5 shows no LEDs and no console message: https://youtu.be/aaOejhsxKw8

 

Message from switch 3:

▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒
BAUD=9600
BOOT=flash:packages.conf
BSI=0
CFG_MODEL_NUM=WS-C3850-48T-S
CLEI_CODE_NUMBER=IPMW999AR9
DEFAULT_ROUTER=10.1.1.1
ECI_CODE_NUMBER=466863
EI_NOPROG=1
IP_ADDR=10.1.1.50/255.255.255.0
LINUX_CMDLINE=rw console=ttyS0,9600,n8
LINUX_COREMASK=15
MAC_ADDR=
MANUAL_BOOT=no
MODEL_NUM=WS-C3850-48T
MODEL_REVISION_NUM=S0
MOTHERBOARD_ASSEMBLY_NUM=73-999999-07
MOTHERBOARD_REVISION_NUM=A0
MOTHERBOARD_SERIAL_NUM=FOC9999999V
RECOVERY_BUNDLE=sda9:cat3k_caa-recovery.bin
STKPWR_ASSEMBLY_NUM=73-11956-08
STKPWR_REVISION_NUM=B0
STKPWR_SERIAL_NUM=F999999999D
SYSTEM_SERIAL_NUM=F9999999990
TAN_NUM=800-37552-03
TAN_REVISION_NUMBER=B0
TEMPLATE=advanced
TERMLINES=0
USB_ASSEMBLY_NUM=73-12923-05
USB_REVISION_NUM=C0
USB_SERIAL_NUM=F0000000005
VERSION_ID=V04
ABNORMAL_RESET_COUNT=0
RANDOM_NUM=183
SWITCH_NUMBER=5
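
A variable dump copied from a console can arrive one variable per line or run together with spaces, and a value such as LINUX_CMDLINE contains spaces of its own. The Python sketch below is one possible way to split such text safely. It assumes every variable name is upper case, which holds for this dump; `parse_flat_env` is a hypothetical helper name.

```python
import re

# Split only on whitespace that is followed by an upper-case NAME=, so the
# lower-case "console=..." inside LINUX_CMDLINE stays part of its value.
KEY_BOUNDARY = re.compile(r"\s+(?=[A-Z][A-Z0-9_]*=)")

def parse_flat_env(text):
    """Parse space- or newline-separated KEY=value pairs into a dict."""
    env = {}
    for chunk in KEY_BOUNDARY.split(text.strip()):
        key, sep, value = chunk.partition("=")
        if sep:
            env[key] = value.strip()
    return env

flat = ("BOOT=flash:packages.conf LINUX_CMDLINE=rw console=ttyS0,9600,n8 "
        "LINUX_COREMASK=15 MAC_ADDR= MANUAL_BOOT=no SWITCH_NUMBER=5")
env3 = parse_flat_env(flat)
print(env3["LINUX_CMDLINE"])   # rw console=ttyS0,9600,n8
print(repr(env3["MAC_ADDR"]))  # ''
```

Parsed this way, the empty MAC_ADDR in this dump stands out immediately.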

 

thank you..

Wayne

 

Were you able to fix the stack boot fail problem?

Hi, unfortunately the problem could not be fixed. I RMA'd it in the end.

 