John Apricena
Beginner

Cisco 3650 Reboot Loop

Hey Guys,

 

We had an issue last night with one of our 3650 switches. It looked power related: the switch was running in a degraded state and wouldn't bring up a link light on any port. I was able to console in and reload it, but it then went into a reboot loop before finally freezing at the lines below. Since this switch is out of warranty, do you think this is hardware related based on the logs below, and should I just replace it? Thanks in advance!

 

Fatal exception: panic in 5 seconds

Kernel panic - not syncing: Fatal exception

 

 

AGLNJSW01#reload
Reload command is being issued on Active unit, this will reload the whole stack
Proceed with reload? [confirm]

*Nov 28 15:17:40.881: %SYS-5-RELOAD: Reload requested by on console. Reload Reason: Reload command.
*Nov 28 15:17:41.488: %STACKMGR-1-RELOAD_REQUEST: 1 stack-mgr: Received reload request for all switches, reason Reload command
*Nov 28 15:17:41.489: %STACKMGR-1-RELOAD: 1 stack-mgr: Reloading due to reason Reload command
*Nov 28 15:17:41.990: %IOSXE-3-PLATFORM: 1 process sysmgr: Reset/Reload requested by [stack-manager].
<Tue Nov 28 15:17:41 2017> Message from sysmgr: Reason Code:[3] Reset Reason:Reset/Reload requested by [stack-manager]. [Reload command]
umount: /proc/fs/nfsd: not mounted
Unmounting ng3k filesystems...
Unmounted /dev/sda3...
Warning! - some ng3k filesystems may not have unmounted cleanly...
Please stand by while rebooting the system...
Restarting system.


Booting...Initializing RAM +++++++@@@@@@@@...++++++++++++++++++++++++++++++++
Base ethernet MAC Address: 78:da:6e:3a:3a:80

Interface GE 0 link down***ERROR: PHY link is down
Initializing Flash...

flashfs[7]: 0 files, 1 directories
flashfs[7]: 0 orphaned files, 0 orphaned directories
flashfs[7]: Total bytes: 6784000
flashfs[7]: Bytes used: 1024
flashfs[7]: Bytes available: 6782976
flashfs[7]: flashfs fsck took 1 seconds....done Initializing Flash.
Getting rest of image
Reading full image into memory....done
Reading full base package into memory...: done = 79118924
Nova Bundle Image
--------------------------------------
Kernel Address : 0x6042d350
Kernel Size : 0x402ecf/4206287
Initramfs Address : 0x60830220
Initramfs Size : 0xdb902a/14389290
Compression Format: .mzip

Bootable image at @ ram:0x6042d350
Bootable image segment 0 address range [0x81100000, 0x82110000] is in range [0x80180000, 0x90000000].
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@boot_system: 377
Loading Linux kernel with entry point 0x81653a10 ...
Bootloader: Done loading app on core_mask: 0xf

### Launching Linux Kernel (flags = 0x5)

All packages are Digitally Signed
Starting System Services


<Tue Nov 28 15:22:10 2017> Message from sysmgr: Reason Code:[2] Reset Reason:Service [fed] pid:[5671] terminated abnormally [6].
Details:
--------
Service: fed
Description: Forwarding Engine Driver
Executable: /tmp/sw/mount/cat3k_caa-platform.SPA.03.03.02SE.pkg//usr/binos/bin/fed

Started at Tue Nov 28 15:21:09 2017 (964929 us)
Stopped at Tue Nov 28 15:22:10 2017 (165098 us)
Uptime: 1 minutes 1 seconds

Start type: SRV_OPTION_RESTART_STATELESS (23)
Death reason: SYSMGR_DEATH_REASON_FAILURE_SIGNAL (2)
Last heartbeat 0.00 secs ago

PID: 5671
Exit code: signal 6 (no core)

CWD: /var/sysmgr/work


PID: 5671
UUID: 3005
Unmounting ng3k filesystems...
Unmounted /dev/sda3...
Warning! - some ng3k filesystems may not have unmounted cleanly...
Please stand by while rebooting the system...
Restarting system.


Booting...Initializing RAM +++++++@@@@@@@@...++++++++++++++++++++++++++++++++
Base ethernet MAC Address: 78:da:6e:3a:3a:80

Interface GE 0 link down***ERROR: PHY link is down
Initializing Flash...

flashfs[7]: 0 files, 1 directories
flashfs[7]: 0 orphaned files, 0 orphaned directories
flashfs[7]: Total bytes: 6784000
flashfs[7]: Bytes used: 1024
flashfs[7]: Bytes available: 6782976
flashfs[7]: flashfs fsck took 1 seconds....done Initializing Flash.
Getting rest of image
Reading full image into memory....done
Reading full base package into memory...: done = 79118924
Nova Bundle Image
--------------------------------------
Kernel Address : 0x6042d350
Kernel Size : 0x402ecf/4206287
Initramfs Address : 0x60830220
Initramfs Size : 0xdb902a/14389290
Compression Format: .mzip

Bootable image at @ ram:0x6042d350
Bootable image segment 0 address range [0x81100000, 0x82110000] is in range [0x80180000, 0x90000000].
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@boot_system: 377
Loading Linux kernel with entry point 0x81653a10 ...
Bootloader: Done loading app on core_mask: 0xf

### Launching Linux Kernel (flags = 0x5)

All packages are Digitally Signed
Starting System Services


Restricted Rights Legend

Use, duplication, or disclosure by the Government is
subject to restrictions as set forth in subparagraph
(c) of the Commercial Computer Software - Restricted
Rights clause at FAR sec. 52.227-19 and subparagraph
(c) (1) (ii) of the Rights in Technical Data and Computer
Software clause at DFARS sec. 252.227-7013.

cisco Systems, Inc.
170 West Tasman Drive
San Jose, California 95134-1706

 

Cisco IOS Software, IOS-XE Software, Catalyst L3 Switch Software (CAT3K_CAA-UNIVERSALK9-M), Version 03.03.02SE RELEASE SOFTWARE (fc2)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2014 by Cisco Systems, Inc.
Compiled Thu 20-Feb-14 21:17 by prod_rel_team

Cisco IOS-XE software, Copyright (c) 2005-2014 by cisco Systems, Inc.
All rights reserved. Certain components of Cisco IOS-XE software are
licensed under the GNU General Public License ("GPL") Version 2.0. The
software code licensed under GPL Version 2.0 is free software that comes
with ABSOLUTELY NO WARRANTY. You can redistribute and/or modify such
GPL code under the terms of GPL Version 2.0.
(http://www.gnu.org/licenses/gpl-2.0.html) For more details, see the
documentation or "License Notice" file accompanying the IOS-XE software,
or the applicable URL provided on the flyer accompanying the IOS-XE
software.

 

 

FIPS: Flash Key Check : Begin
FIPS: Flash Key Check : End, Not Found,FIPS Mode Not Enabled

This product contains cryptographic features and is subject to United
States and local country laws governing import, export, transfer and
use. Delivery of Cisco cryptographic products does not imply
third-party authority to import, export, distribute or use encryption.
Importers, exporters, distributors and users are responsible for
compliance with U.S. and local country laws. By using this product you
agree to comply with applicable laws and regulations. If you are unable
to comply with U.S. and local laws, return this product immediately.

A summary of U.S. laws governing Cisco cryptographic products may be found at:
http://www.cisco.com/wwl/export/crypto/tool/stqrg.html

If you require further assistance please contact us by sending email to
export@cisco.com.

cisco WS-C3650-48PS (MIPS) processor with 4194304K bytes of physical memory.
Processor board ID FDO1742Q06N
2048K bytes of non-volatile configuration memory.
4194304K bytes of physical memory.
250456K bytes of Crash Files at crashinfo:.
1609272K bytes of Flash at flash:.
0K bytes of Dummy USB Flash at usbflash0:.
0K bytes of at webui:.

Base Ethernet MAC Address : 
Motherboard Assembly Number : 
Motherboard Serial Number :
Model Revision Number : A0
Motherboard Revision Number : A0
Model Number : WS-C3650-48PS
System Serial Number : 

Data bus error, epc == 0000000054ab2958, ra == 0000000054ab4cdc
Data bus error, epc == 00000000551258a8, ra == 0000000055127c2c
Data bus error, epc == 0000000054ab2958, ra == 0000000054ab4cdc
Data bus error, epc == 0000000054c267f8, ra == 0000000054c28b7c
Data bus error, epc == 0000000054ab2958, ra == 0000000054ab4cdc

<Tue Nov 28 15:29:24 2017> Message from sysmgr: Reason Code:[2] Reset Reason:Service [stack-mgr] pid:[5683] terminated abnormally [10].
Details:
--------
Service: stack-mgr
Description: Stack Manager
Executable: /tmp/sw/mount/cat3k_caa-platform.SPA.03.03.02SE.pkg//usr/binos/bin/stack-mgr

Started at Tue Nov 28 15:27:24 2017 (343283 us)
Stopped at Tue Nov 28 15:29:19 2017 (407062 us)
Uptime: 1 minutes 55 seconds

Start type: SRV_OPTION_RESTART_STATELESS (23)
Death reason: SYSMGR_DEATH_REASON_FAILURE_SIGNAL (2)
Last heartbeat 0.00 secs ago

PID: 5683
Exit code: signal 10 (no core)

CWD: /var/sysmgr/work


PID: 5683
UUID: 3006
Data bus error, epc == 0000000054c267f8, ra == 0000000054c28b7c
Data bus error, epc == 00000000551258a8, ra == 0000000055127c2c
Data bus error, epc == 0000000054ab2958, ra == 0000000054ab4cdc

 

Unmounting ng3k filesystems...
Unmounted /dev/sda3...
Warning! - some ng3k filesystems may not have unmounted cleanly...
Please stand by while rebooting the system...
Data bus error, epc == ffffffff81227248, ra == ffffffff81589344
CRASHINFO_OOPS:
Oops[#1]:
Cpu 0
$ 0 : 0000000000000000 0000000010108ce1 8001190500000000 8001190400000000
$ 4 : 8001190000000000 0000010000000000 8001190500200000 8001190500200000
$ 8 : 8001190500200052 0000180000000000 0000000000000000 0000000008204d80
$12 : 0000000000000000 ffffffff80000008 ffffffff81289860 c000000000000000
$16 : 8001190500000000 8001190500200000 0000000000000000 a8000001085d9800
$20 : 0000000000000001 0000000000000052 0000000000000002 0000000000000000
$24 : 7802000000000000 ffffffff81545680
$28 : a8000000f02f4000 a8000000f02f7b90 000000007fa59980 ffffffff81589344
Hi : 0000000000000129
Lo : 0000000000001a94
epc : ffffffff81227248 cvmx_pcie_config_read16+0xb8/0x100
Tainted: P
ra : ffffffff81589344 octeon_pcie_read_config+0x1a4/0x328
Status: 10108ce2 KX SX UX KERNEL EXL
Cause : 4080881c
PrId : 000d900a (Cavium Octeon II)
Modules linked in: rtc_ds1307 mtd_map bpa_mem crashinfo pds tun cpumem exportfs nfsd ipv6 OOBnd(P) OOBhal(P) dplr_pci cvmx_mdio cvmx_gpio aipcmod(P) mtsmod [last unloaded: procfs]
Process reboot (pid: 12209, threadinfo=a8000000f02f4000, task=a8000000d1b2a938, tls=000000002b868a60)
Stack : 0000000000000000 0000000000000000 0000000000000052 000000002a000000
00000000000007d0 a8000000f02f7c20 ffffffff820f0000 ffffffff81589344
0000000000000002 ffffffff80000008 a8000000f02f7c70 ffffffff820f0000
0000000000000001 a8000001085d9800 0000000010720408 0000000000000061
00000000100e0000 ffffffff8148b1d4 00000000100e0000 0000000000000000
0000000000000000 0000000000000052 a80000010826dd38 a8000001085dd800
0000000000000052 0000000000000000 0000000000000050 ffffffff8149702c
0000000010108ce3 0000000000000129 a8000001085dd800 a8000000fe530b00
a8000000fe530b10 ffffffff8149735c 0000000000000000 0000000000000000
0000000000000000 a8000001085dd800 ffffffff820f0000 0000000001234567
...
Call Trace:
[<ffffffff81227248>] cvmx_pcie_config_read16+0xb8/0x100
[<ffffffff81589344>] octeon_pcie_read_config+0x1a4/0x328
[<ffffffff8148b1d4>] pci_bus_read_config_word+0x84/0xc0
[<ffffffff8149702c>] msi_set_enable+0x34/0x78
[<ffffffff8149735c>] pci_msi_shutdown+0x6c/0x120
[<ffffffff81492198>] pci_device_shutdown+0x38/0x50
[<ffffffff814b9c78>] device_shutdown+0x38/0xb0
[<ffffffff812897bc>] kernel_restart_prepare+0x2c/0x38
[<ffffffff8128980c>] kernel_restart+0x14/0x60
[<ffffffff81289a20>] SyS_reboot+0x1c0/0x278
[<ffffffff81244404>] handle_sysn32+0x44/0x84


Code: 00e84025 1100000d dfbf0038 <95030000> dfbf0038 dfb10030 dfb00028 3063ffff 00032202
Sending IPI to other cpus...
Will call new kernel at 04005eb0
Bye ...
Linux version 2.6.32.59-mips-kcrash.cge-cavium-octeon (yiliu@sjc-ads-598) (gcc version 4.4.1 (MontaVista Linux G++ 4.4-1211141130) ) #2 SMP PREEMPT Tue Apr 16 17:11:20 PDT 2013
CVMSEG size: 2 cache lines (256 bytes)
Cavium Inc. SDK-2.3
early_param: BoardId = 23
bootconsole [early0] enabled
CPU revision is: 000d900a (Cavium Octeon II)
Checking for the multiply/shift bug... no.
Checking for the daddiu bug... no.
Determined physical RAM map:
memory: 0000000002efd000 @ 0000000004000000 (usable)
Wasting 917504 bytes for tracking 16384 unused pages
Initrd not found or empty - disabling initrd
pkg not found or empty - disabling package support
Using internal Device Tree.
Placing 0MB software IO TLB between a800000005708000 - a800000005748000
software IO TLB at phys 0x5708000 - 0x5748000
Zone PFN ranges:
DMA32 0x00004000 -> 0x000f0000
Normal 0x000f0000 -> 0x000f0000
Movable zone start PFN for each node
early_node_map[1] active PFN ranges
0: 0x00004000 -> 0x00006efd
Cavium Hotplug: Available coremask 0x5a5a5a5a
PERCPU: Embedded 11 pages/cpu @a800000005833000 s12416 r8192 d24448 u65536
pcpu-alloc: s12416 r8192 d24448 u65536 alloc=16*4096
pcpu-alloc: [0] 0 [0] 1 [0] 2 [0] 3
Built 1 zonelists in Zone order, mobility grouping off. Total pages: 11864
Kernel command line: root=/dev/ram maxcpus=1 init 1 irqpoll console=ttyS0,9600,n8 BoardId=23 PLATFORM_TYPE=WS-C3650 elfcorehdr=113652K savemaxmem=32M
Misrouted IRQ fixup and polling support enabled
This may significantly impact system performance
PID hash table entries: 256 (order: -1, 2048 bytes)
Dentry cache hash table entries: 8192 (order: 4, 65536 bytes)
Inode-cache hash table entries: 4096 (order: 3, 32768 bytes)
Primary instruction cache 37kB, virtually tagged, 37 way, 8 sets, linesize 128 bytes.
Primary data cache 32kB, 32-way, 8 sets, linesize 128 bytes.
Secondary unified cache 2048kB, 16-way, 1024 sets, linesize 128 bytes.
Memory: 23040k/48116k available (5086k kernel code, 25056k reserved, 9762k data, 8100k init, 0k highmem)
Hierarchical RCU implementation.
NR_IRQS:453
Calibrating delay loop (skipped) preset value.. 1600.00 BogoMIPS (lpj=8000000)
Security Framework initialized
Mount-cache hash table entries: 256
Checking for the daddi bug... no.
Brought up 1 CPUs
checking TSC synchronization across all online CPUs: passed.
devtmpfs: initialized
NET: Registered protocol family 16
PCIe: Initializing port 0

pcie 0: Setting up 4 lane pciePCIe: Port 0 link active, 4 lanes, speed gen1
PCIe: Initializing port 1
CRASHINFO_OOPS:
do_cpu invoked from kernel context![#1]:
Cpu 0
$ 0 : 0000000000000000 0000000010108ce1 0000000000000001 0000000000000320
$ 4 : 0000000088d2a1c5 00000000010d0000 00000000010d9300 ffffffff84e20000
$ 8 : 0000000000000b78 ffffffffffffffff 000000000000ffff 0000000000000b5d
$12 : a80000000603ffe0 0000000000008c00 ffffffff84df0000 0000000000000000
$16 : 0000000000000001 8001180000001620 0000000000000000 00011b0fffffffff
$20 : 00011a05ffffffff 0000000000000000 0000000000000000 0000000000000000
$24 : 0000000000000001 ffffffff843199c8
$28 : a80000000603c000 a80000000603fca0 0000000000000000 ffffffff8413e07c
Hi : 000000000000002b
Lo : 99999999999999db
epc : ffffffff8401ab00 __udelay+0x30/0x40
Not tainted
ra : ffffffff8413e07c cvmx_pcie_rc_initialize+0xd4c/0x14d8
Status: 10108ce3 KX SX UX KERNEL EXL IE
Cause : 1080002c
PrId : 000d900a (Cavium Octeon II)
Modules linked in:
Process swapper (pid: 1, threadinfo=a80000000603c000, task=a800000006039038, tls=0000000000000000)
Stack : 0000000000000000 ffffffff842834c0 a80000000603fd38 00000000ffff8ae9
ffffffff84df0000 ffffffff8569b680 ffffffff8569b680 ffffffff8418913c
0000000000000000 ffffffff85690000 0000000000000001 a80000000603fd88
ffffffff84d15c78 ffffffff84ea1d14 ffffffff84e69ae8 00011b0fffffffff
00011a05ffffffff 0000000000000000 0000000000000000 0000000000000000
0000000000000000 ffffffff8400ee04 a80000000603fd88 a800000006039038
0000000000000000 ffffffff84e69960 0000000000000000 ffffffff84e69960
0000000000000000 00011b0fffffffff 00011a05ffffffff 0000000000000000
0000000000000000 0000000000000000 0000000000000000 ffffffff84ea1d6c
0000000000000000 ffffffff84ea19f4 ffffffff84e7e690 ffffffff85670000
...
Call Trace:
[<ffffffff8401ab00>] __udelay+0x30/0x40
[<ffffffff8413e07c>] cvmx_pcie_rc_initialize+0xd4c/0x14d8
[<ffffffff84ea1d6c>] octeon_pcie_setup+0x378/0x670
[<ffffffff84016580>] do_one_initcall+0x38/0x180
[<ffffffff84e81338>] kernel_init+0x1d4/0x27c
[<ffffffff84141550>] kernel_thread_helper+0x10/0x18


Code: 00000000 40224806 0044102b <1440fffd> 00000000 03e00008 00000000 40224806 3c038567
Disabling lock debugging due to kernel taint
KOOPS: Failed to create koops file /crashinfo/koops.dat
Fatal exception: panic in 5 seconds
Kernel panic - not syncing: Fatal exception

 

 

1 ACCEPTED SOLUTION

Leo L
VIP Community Legend

I have re-read the error messages and the boot-up process, and I believe this is a hardware bug, CSCva30394.

The 3650 comes with an Enhanced Limited Lifetime Warranty, which means that if you can prove to the Entitlements Team that you are the end user of the appliance (with a purchase order or a delivery receipt, for example), they will provide a replacement unit after ten (10) business days.
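
For anyone triaging a similar console capture, here is a minimal Python sketch (not a Cisco tool) that pulls out the three signatures visible in the log above: sysmgr service deaths, "Data bus error" lines, and the kernel panic. The function name `summarize_console_log` and the `MIPS_SIGNALS` table are made up for illustration. The signal names assume the switch's MIPS kernel uses the stock Linux numbering for MIPS, where signal 10 is SIGBUS (on x86 it would be SIGUSR1) and signal 6 is SIGABRT.

```python
import re
from collections import Counter

# Stock Linux signal numbers on MIPS. Assumption: the platform kernel
# does not renumber them. Only the values relevant to this log are listed.
MIPS_SIGNALS = {6: "SIGABRT", 10: "SIGBUS", 11: "SIGSEGV"}

SERVICE_RE = re.compile(r"^Service:\s*(\S+)")
EXIT_RE = re.compile(r"^Exit code:\s*signal\s+(\d+)")
BUS_RE = re.compile(r"Data bus error, epc == ([0-9a-f]+)")


def summarize_console_log(text):
    """Collect service crashes, bus-error addresses, and panic state."""
    crashes = []            # (service, signal number, signal name)
    bus_errors = Counter()  # epc address -> number of occurrences
    service = None
    panic = False
    for raw in text.splitlines():
        line = raw.strip()
        m = SERVICE_RE.match(line)
        if m:
            # Remember the service until its "Exit code" line shows up.
            service = m.group(1)
            continue
        m = EXIT_RE.match(line)
        if m and service is not None:
            sig = int(m.group(1))
            crashes.append((service, sig, MIPS_SIGNALS.get(sig, "unknown")))
            service = None
            continue
        m = BUS_RE.search(line)
        if m:
            bus_errors[m.group(1)] += 1
        if line.startswith("Kernel panic"):
            panic = True
    return {"crashes": crashes, "bus_errors": bus_errors, "kernel_panic": panic}
```

Fed the capture above, this would report `fed` dying on signal 6 and `stack-mgr` on signal 10, alongside repeated bus errors at the same few `epc` addresses. Under the stated numbering assumption, the signal 10 exit is a bus error too, which lines up with the "Data bus error" messages and the oops in the PCIe config read path.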


6 REPLIES
Leo L
VIP Community Legend

You may need to break into ROMmon and upgrade the firmware using Emergency Recovery.

Thanks much for the reply. I'll give this a try!

Hi Leo,

 

We have no warranty support, so I'd be unable to get the new firmware, though I can extract the same firmware from another 3650 switch we have here if you think it would be worth it to attempt to boot to that. The firmware version is the same on both switches though. One switch boots properly and the other doesn't.

TAC replaced it under the Enhanced Warranty. Thanks, Leo!
Leo L
VIP Community Legend

Happy to hear it's all sorted, John.
And thanks for taking the time to provide us with the feedback and rating our posts. :)