3850 failed upgrade how to recover?

I am getting this boot loop after attempting a package update to the recommended release on our 3850 48T-S.

How do I proceed when it never gets to a prompt?


Getting rest of image
Reading full image into memory...Check base package header ...: done = 16384
Getting rest of image
Reading full image into memory....done
Reading full base package into memory...: done = 22301472
Bundle Image
--------------------------------------
Kernel Address : 0x53778384
Kernel Size : 0x34e9e1/3467745
Initramfs Address : 0x53ac6d65
Initramfs Size : 0x119d5bb/18470331
Compression Format: mzip

Bootable image at @ ram:0x53778384
Bootable image segment 0 address range [0x81100000, 0x81b8adc0] is in range [0x80180000, 0x90000000].
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@boot_system: 623
Loading Linux kernel with entry point 0x816902d0 ...
Bootloader: Done loading app on core_mask: 0xf

### Launching Linux Kernel (flags = 0x5)

>>> Boot Failed - pulling status and logs:
● late-network.service - Late network setup
Loaded: loaded (/lib/systemd/system/late-network.service; static)
Active: failed (Result: exit-code) since Wed 2018-02-07 19:04:53 Universal; 10s ago
Process: 988 ExecStart=/etc/init.d/network start (code=exited, status=203/EXEC)
Main PID: 988 (code=exited, status=203/EXEC)

[ 46.696742] localhost systemd[988]: Failed at step EXEC spawning /etc/init.d/network: No such file or directory
[ 46.713657] localhost systemd[1]: late-network.service: main process exited, code=exited, status=203/EXEC
[ 46.714705] localhost systemd[1]: Failed to start Late network setup.
[ 46.715737] localhost systemd[1]: Unit late-network.service entered failed state.

● sshd.service - SSH Daemon
Loaded: loaded (/lib/systemd/system/sshd.service; static)
Active: failed (Result: exit-code) since Wed 2018-02-07 19:04:54 Universal; 9s ago
Process: 1068 ExecStart=/bin/mcp_pkg_wrap rp_security /etc/init.d/sshd start (code=exited, status=127)

[ 47.520236] localhost mcp_pkg_wrap[1068]: /bin/mcp_pkg_wrap: line 122: /tmp/sw/rp/0/0/rp_security/mount/etc/init.d/sshd: No such file or directory
[ 47.532279] localhost systemd[1]: sshd.service: control process exited, code=exited status=127
[ 47.533519] localhost systemd[1]: Failed to start SSH Daemon.
[ 47.534524] localhost systemd[1]: Unit sshd.service entered failed state.

● tdl.service - TDL Resolve
Loaded: loaded (/lib/systemd/system/tdl.service; static)
Active: failed (Result: exit-code) since Wed 2018-02-07 19:05:04 Universal; 69ms ago
Process: 1691 ExecStart=/usr/binos/conf/tdl_boottime.sh (code=exited, status=1/FAILURE)
Main PID: 1691 (code=exited, status=1/FAILURE)

[ 57.247188] RP_0 tdl_boottime.sh[1691]: luajit: /usr/binos/conf/epoch_lib.lua:788: attempt to compare two nil values
[ 57.248229] RP_0 tdl_boottime.sh[1691]: stack traceback:
[ 57.249268] RP_0 tdl_boottime.sh[1691]: /usr/binos/conf/epoch_lib.lua:788: in function 'add_and_merge_record'
[ 57.250290] RP_0 tdl_boottime.sh[1691]: /usr/binos/conf/epoch_lib.lua:806: in function 'generate_merged_metadata'
[ 57.251324] RP_0 tdl_boottime.sh[1691]: /usr/binos/conf/epoch_lib.lua:896: in function 'generate_domain'
[ 57.252347] RP_0 tdl_boottime.sh[1691]: /usr/binos/conf/epoch_resolve.lua:206: in main chunk
[ 57.253356] RP_0 tdl_boottime.sh[1691]: [C]: at 0x100029a0
[ 57.258156] RP_0 tdl_boottime.sh[1691]: Time to check parameters: 0 seconds.
[ 57.259215] RP_0 tdl_boottime.sh[1691]: Time to build package table: 0 seconds.
[ 57.260238] RP_0 tdl_boottime.sh[1691]: Time to cache objects: 3.7 seconds.
[ 57.265676] RP_0 systemd[1]: tdl.service: main process exited, code=exited, status=1/FAILURE
[ 57.267331] RP_0 systemd[1]: Failed to start TDL Resolve.
[ 57.274789] RP_0 systemd[1]: Unit tdl.service entered failed state.
%IOSXEBOOT-5c8e9d6656e9d89a8dedeae457871084-new_cksum: (rp/0): 4
%IOSXEBOOT-5c8e9d6656e9d89a8dedeae457871084-saved_cksum: (rp/0): 4
>>> Rebooting
octeon_wdt: WDT device closed unexpectedly. WDT will not stop!
reboot: Restarting system
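
For anyone comparing their own console capture against the log above, here is a minimal Python sketch that lists which units failed and which files were reported missing. This is not a Cisco tool. The function names `failed_units` and `missing_files` are made up for illustration, and the patterns assume the exact wording seen in this log ("Unit ... entered failed state", "No such file or directory"), which other releases may phrase differently.

```python
import re

# Assumed wording, taken from the log in this thread, e.g.
# "... systemd[1]: Unit sshd.service entered failed state."
FAILED_UNIT = re.compile(r"Unit (\S+\.service) entered failed state")
# A path followed by ": No such file or directory" at the end of a line.
MISSING_FILE = re.compile(r"(/\S+): No such file or directory\s*$")


def _unique_matches(pattern, log_text):
    """Collect group 1 of every matching line, in order, without duplicates."""
    seen = []
    for line in log_text.splitlines():
        match = pattern.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


def failed_units(log_text):
    """Hypothetical helper: unit names that entered a failed state."""
    return _unique_matches(FAILED_UNIT, log_text)


def missing_files(log_text):
    """Hypothetical helper: paths the boot process could not find."""
    return _unique_matches(MISSING_FILE, log_text)


# A few lines copied from the console log in this post.
SAMPLE = """\
[ 46.696742] localhost systemd[988]: Failed at step EXEC spawning /etc/init.d/network: No such file or directory
[ 46.715737] localhost systemd[1]: Unit late-network.service entered failed state.
[ 47.520236] localhost mcp_pkg_wrap[1068]: /bin/mcp_pkg_wrap: line 122: /tmp/sw/rp/0/0/rp_security/mount/etc/init.d/sshd: No such file or directory
[ 47.534524] localhost systemd[1]: Unit sshd.service entered failed state.
[ 57.274789] RP_0 systemd[1]: Unit tdl.service entered failed state.
"""

if __name__ == "__main__":
    print(failed_units(SAMPLE))
    print(missing_files(SAMPLE))
```

Seeing the missing paths side by side makes it easier to tell whether the image is being booted against the wrong set of packages, which is what the "No such file or directory" lines here seem to point at.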

Accepted Solutions

So I tried various things.

I tried copying the new software to flash:. The copy says the file already exists, but it does not show up in dir flash:, dir flash-1: or dir flash-2:. It is as if there is a hidden flash file somewhere else.

Copying packages.conf.02- to packages.conf reported success, but the switch just went through the same failed boot loop.

Finally, what worked was running this from the switch: prompt:

emergency-install usbflash0:cat3k_caa-universalk9.SPA.03.06.06.E.152-2.E6.bin

I had to run it on each switch individually. The first switch booted up correctly, and I am now doing the second in the stack.

So we are now in "first birthday" mode (except that it kept the enable password from the prior settings).

15 Replies

Read this:  Emergency Recovery

Thanks Leo, but that did not help. The unit never gets to a prompt.

When the front button is held for 10 seconds, the unit proceeds with the same loop.

Just to add a bit more information:

The unit was running 3.02.02 SE.

The package that was installed was cat3k_caa-universalk9.16.03.05b.SPA

 

Hi,

 

I had this exact same issue when attempting to upgrade to 16.3.5b from 3.6.6E in install mode at the weekend. To recover I had to do the following:

 

1. Power down the switch

2. Hold the mode button

3. Power on the switch without letting go of mode button

4. Wait until the SYST LED changes to amber (roughly 10 seconds) and then let go of the mode button

5. From the switch: prompt boot the backup packages.conf file 'packages.conf.00-'

 

switch: boot packages.conf.00-

 

This booted the switch using the previous 3.6.6E version. Once it had booted, I rolled back to the original packages file using the 'software rollback' command.

 

I hope that this helps

 

Will

OK, I can get part way there.

What am I missing in the syntax?

When I run dir on the flash, it shows the file as packages.conf.01- at the bottom of the list.

Might it be worth noting that switch A only has packages.conf.01-, while switch B has packages.conf.00-?

 

7746 drwx 4096 Feb 8 2018 09:36:07 -08:00 dc_profile_dir
15500 drwx 4096 Feb 7 2018 10:16:13 -08:00 gs_script
77463 -rwx 1224 Aug 30 2013 01:22:49 -07:00 packages.conf.01-

1621966848 bytes total (805244928 bytes free)


Cisco3850a#software rollback flash:packages.conf.01-
^
% Invalid input detected at '^' marker.

 

Cisco3850a#software rollback provisioning-file flash:/packages.conf.01-
Preparing rollback operation ...
[1 2]: Starting rollback operation
[2]: % Provisioning file flash:/packages.conf.01- not found. Operation aborted.


Cisco3850a#software rollback provisioning-file flash:packages.conf.01-
Preparing rollback operation ...
[1 2]: Starting rollback operation
[2]: % Provisioning file flash:packages.conf.01- not found. Operation aborted.


Cisco3850a#

That's odd, as I would have expected there to be a packages.conf.00- file, which would have been the snapshot of the packages.conf file prior to the upgrade. packages.conf.01- would have been the snapshot of packages.conf from two installations ago.

 

When you finally managed to get to the switch: prompt, what did you do to boot the switch to get back up and running?

 

Will

Each switch had to be booted individually into its own packages.conf.xx- file. I found this out after running dir on each switch to see which file to use.

When I ran the software rollback from switch 2, which has the .00- file, I got this:

 

Cisco3850a#software rollback
Preparing rollback operation ...
[1 2]: Starting rollback operation
[1 2]: Starting compatibility checks
[1 2]: Finished compatibility checks
[1 2]: Starting application pre-installation processing
[1 2]: Finished application pre-installation processing
[1]: No old package files removed
[2]: No old package files removed
[1]: No new package files added
[2]: No new package files added
[1]: % Could not create rollback provisioning file /flash/packages.conf.01-.00-. Operation aborted.
[2]: % Could not create rollback provisioning file /flash/packages.conf.00-.00-. Operation aborted.


Cisco3850a#
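
Since each stack member ended up with a differently numbered snapshot, it can help to paste the dir output from every member into a small script and see which packages.conf.NN- files each one actually has. The following is only a sketch. The helper name `rollback_snapshots` is hypothetical, and it assumes snapshots are always named with two digits and a trailing hyphen, as they are in this thread.

```python
import re

# Assumed naming, based on the files seen in this thread: packages.conf.NN-
# The lookahead rejects odd names such as "packages.conf.01-.00-".
SNAPSHOT = re.compile(r"\bpackages\.conf\.\d{2}-(?=\s|$)")


def rollback_snapshots(dir_output):
    """Hypothetical helper: snapshot names in 'dir' output, lowest number first."""
    found = set()
    for line in dir_output.splitlines():
        for match in SNAPSHOT.finditer(line):
            found.add(match.group(0))
    # Two-digit suffixes sort correctly as plain strings.
    return sorted(found)


# Lines copied from the 'dir' output posted earlier in this thread.
SWITCH_A_DIR = """\
7746 drwx 4096 Feb 8 2018 09:36:07 -08:00 dc_profile_dir
15500 drwx 4096 Feb 7 2018 10:16:13 -08:00 gs_script
77463 -rwx 1224 Aug 30 2013 01:22:49 -07:00 packages.conf.01-
"""

if __name__ == "__main__":
    print(rollback_snapshots(SWITCH_A_DIR))
```

Going by the earlier reply, the lowest-numbered file should be the snapshot taken just before the most recent install, so it is the first one to try with boot from the switch: prompt on that member.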

Ok I have not seen that issue before. You may need to download and reinstall the original version to repair the installation.

 

I'd be interested to see the outcome

 

Right now I am curious about the outcome myself.

3.2.2 is no longer available for download.

I still have these on the flash, but I can find no help on how to reload them.

 

77443 -rwx 74369252 cat3k_caa-base.SPA.03.02.02.SE.pkg
77444 -rwx 5808828 cat3k_caa-drivers.SPA.03.02.02.SE.pkg
77445 -rwx 32488292 cat3k_caa-infra.SPA.03.02.02.SE.pkg
77446 -rwx 30403764 cat3k_caa-iosd-universalk9.SPA.150-1.EX2.pkg
77447 -rwx 16079584 cat3k_caa-platform.SPA.03.02.02.SE.pkg
77448 -rwx 64580300 cat3k_caa-wcm.SPA.10.0.111.0.pkg

 

I tried to use packages.conf.02- as an install file, only to be told by the software that it is a rollback file, at which point it terminates. Searching for how to turn a rollback file into an operational one is a frustrating experience. Cisco only covers creating such a file and then proceeds with documentation on package mode rather than install mode. I can find no documentation on how to make an install-mode rollback file into the bootable file.

I also do not understand why each device has its own flash:, yet once they are stacked back together they show up as only one.

Even something as simple as making the currently running packages.conf.0x- the file the switch boots from is very unclear.

I just found the following for a similar issue. This suggests that you can simply copy packages.conf.00- to packages.conf and then set the switch to boot packages.conf as normal

 

boot system flash:packages.conf

 

Are you able to try this?

 

Conditions:
Packages.conf.00- is being booted instead of packages.conf

Workaround:
copy the packages.conf.00- to packages.conf or any other regular file name and boot this file or use the software roll-back option

 

It turns out that once the switch is in the loop, the only trick that works is to wait for a specific spot in the boot cycle.

It first has to reach the "Interface GE 0" portion, about half a minute after power-on. Then hold the button for ten seconds. The Actv, APX and S.Power LEDs will all turn orange, indicating that the switch has entered switch: mode.

 

Booting...

 

Interface GE 0 link down***ERROR: PHY link is down
The "IP_ADDR" environment variable is not set.

The system has been interrupted prior to initializing some
filesystems and loading the operating system software.
Console will be reset to 9600 baud rate, need to change terminal setting first.
The following commands will initialize the remaining filesystems,
and finish loading the operating system software:

flash_init
boot


switch:

This shows as solved; however, I had the same issue and used a different approach to slam in the latest version (if going the emergency-install route):

 

1. Pull Power

2. Hold Mode

3. Plug in Power

4. Continue holding Mode for approx. 25 seconds. You'll see several lights go solid after SYST has flashed for the first 20ish seconds. The console will show "The system has been interrupted..."

5. switch: `flash_init`

6. switch: `emergency-install usbflash0:{yourversion}.bin` -- in my case it was: `emergency-install usbflash0:cat3k_caa-universalk9.16.09.05.SPA.bin`

7. Sit back and wait (microcode update installs if missing).

8. Done.
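
Before walking to the switch with a USB stick, it may be worth double-checking which release the .bin file on it actually is, since the file names in this thread come in two different layouts. Below is a rough Python sketch. The helper `image_version` is hypothetical, and the pattern only assumes that the release appears as three two-digit groups separated by dots, as in the names quoted in this thread.

```python
import re

# Assumed layout: the first NN.NN.NN group in the name is the release,
# optionally followed by a single lowercase letter (e.g. "16.03.05b").
VERSION = re.compile(r"(?<!\d)(\d{2})\.(\d{2})\.(\d{2})([a-z]?)")


def image_version(filename):
    """Hypothetical helper: (major, minor, patch, suffix), or None if no match."""
    match = VERSION.search(filename)
    if match is None:
        return None
    major, minor, patch, suffix = match.groups()
    return (int(major), int(minor), int(patch), suffix)


if __name__ == "__main__":
    # File names quoted in this thread.
    for name in (
        "cat3k_caa-universalk9.16.09.05.SPA.bin",
        "cat3k_caa-universalk9.SPA.03.06.06.E.152-2.E6.bin",
        "cat3k_caa-universalk9.16.03.05b.SPA",
    ):
        print(name, image_version(name))
```

This only reads the file name. It says nothing about whether the image itself is intact or supported on a given switch.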
