nxos9000 image crashes on CML 2.0

pgaltieri
Level 1

My network consists of 2 systems, both running Linux, attached to a Cisco 2960 switch.  One of the systems acts as a DHCP server and the other runs CML 2.0.  The port on the 2960 switch attached to the system running CML 2.0 is configured for switchport port-security.  I configured a topology on CML consisting of an external connector, an unmanaged switch, an iosv device and an nxos9000 device.  The interface on the iosv system is configured via DHCP and works.  During the simulation, while I was accessing the nxos9000 device, the link to the DHCP server shut down due to a port security violation from a MAC address which happens to be the gig0/0 interface on the iosv system.  Now when I try to reboot the nxos9000 device I get 2 errors.  One is from CML and says:

 

System Health
Low-Level Driver is not connected. Please check system logs for more information.

 

I'm not sure what this refers to and what to look for in the logs.  The more interesting error is the following:

 

Boot Time: 6/9/2020 1:56:0
[ 53.813969] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 28s! [swapper/0:1]
[ 53.816367] Kernel panic - not syncing: softlockup: hung tasks
[ 53.816367] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G L 4.1.21-WR8.0.0.25-standard #1
[ 53.816367] Hardware name: Red Hat KVM, BIOS 0.0.0 02/06/2015
[ 53.816367] 0000000000000000 ffff8801bfc03c90 ffffffff817d3d0f ffffffff81a58931
[ 53.816367] ffff8801b8ab8000 ffff8801bfc03d10 ffffffff817d10fa 0000000000000088
[ 53.816367] 0000000000000008 ffff8801bfc03d20 ffff8801bfc03cc0 ffff8801bfc03d10
[ 53.816367] Call Trace:
[ 53.816367] <IRQ> [<ffffffff817d3d0f>] dump_stack+0x63/0x81
[ 53.816367] [<ffffffff817d10fa>] panic+0xc1/0x218
[ 53.816367] [<ffffffff81117e54>] watchdog_timer_fn+0x1d4/0x210
[ 53.816367] [<ffffffff810da274>] __run_hrtimer+0x154/0x230
[ 53.816367] [<ffffffff810e10c2>] ? ktime_get_update_offsets_now+0x62/0x100
[ 53.816367] [<ffffffff81117c80>] ? watchdog+0x40/0x40
[ 53.816367] [<ffffffff810da983>] hrtimer_interrupt+0xf3/0x200
[ 53.816367] [<ffffffff8103c0b2>] local_apic_timer_interrupt+0x62/0x70
[ 53.816367] [<ffffffff817ddcd7>] smp_apic_timer_interrupt+0x47/0x60
[ 53.816367] [<ffffffff817dbdcf>] apic_timer_interrupt+0x7f/0x90
[ 53.816367] [<ffffffff8100d0a9>] ? sched_clock+0x9/0x10
[ 53.816367] [<ffffffff81085732>] ? __do_softirq+0x92/0x320
[ 53.816367] [<ffffffff810856f1>] ? __do_softirq+0x51/0x320
[ 53.816367] [<ffffffff810adcc8>] ? sched_clock_cpu+0xa8/0xc0
[ 53.816367] [<ffffffff81085bb0>] irq_exit+0x50/0xb0
[ 53.816367] [<ffffffff817ddcdc>] smp_apic_timer_interrupt+0x4c/0x60
[ 53.816367] [<ffffffff817dbdcf>] apic_timer_interrupt+0x7f/0x90
[ 53.816367] <EOI> [<ffffffff81373103>] ? smk_tskacc+0x53/0xb0
[ 53.816367] [<ffffffff81373103>] ? smk_tskacc+0x53/0xb0
[ 53.816367] [<ffffffff8137318d>] smk_curacc+0x2d/0x30
[ 53.816367] [<ffffffff81370e1b>] smack_inode_setattr+0x6b/0x70
[ 53.816367] [<ffffffff811c4680>] ? path_lookupat+0x550/0x5d0
[ 53.816367] [<ffffffff81369e97>] security_inode_setattr+0x47/0x70
[ 53.816367] [<ffffffff8136addc>] ? security_inode_alloc+0x3c/0x60
[ 53.816367] [<ffffffff811d2f80>] notify_change+0x1e0/0x380
[ 53.816367] [<ffffffff811b56e2>] chown_common.isra.13+0x102/0x1c0
[ 53.816367] [<ffffffff811b651b>] SyS_fchownat+0x9b/0xf0
[ 53.816367] [<ffffffff811b658d>] SyS_chown+0x1d/0x20
[ 53.816367] [<ffffffff81d26437>] do_name+0x131/0x1dc
[ 53.816367] [<ffffffff81d25ba1>] write_buffer+0x27/0x39
[ 53.816367] [<ffffffff81d25c1e>] flush_buffer+0x6b/0x8b
[ 53.816367] [<ffffffff81d25bb3>] ? write_buffer+0x39/0x39
[ 53.816367] [<ffffffff81d5a4f1>] ? decompress_method+0x57/0x57
[ 53.816367] [<ffffffff81d5a7aa>] __gunzip+0x2af/0x365
[ 53.816367] [<ffffffff81d5a87d>] gunzip+0x1d/0x1f
[ 53.816367] [<ffffffff81d25a45>] ? initrd_load+0x3f/0x3f
[ 53.816367] [<ffffffff81d25f66>] unpack_to_rootfs+0x144/0x216
[ 53.816367] [<ffffffff81d25a45>] ? initrd_load+0x3f/0x3f
[ 53.816367] [<ffffffff81d26808>] ? free_initrd+0x9e/0x9e
[ 53.816367] [<ffffffff81d26863>] populate_rootfs+0x5b/0x7f
[ 53.816367] [<ffffffff81000445>] do_one_initcall+0x105/0x1c0
[ 53.816367] [<ffffffff81d25235>] do_initcall_level+0x7d/0x92
[ 53.816367] [<ffffffff81d2495b>] ? mm_init+0x24/0x24
[ 53.816367] [<ffffffff81d2525b>] do_initcalls+0x11/0x1a
[ 53.816367] [<ffffffff81d2528d>] do_basic_setup+0x29/0x30
[ 53.816367] [<ffffffff81d25359>] kernel_init_freeable+0xba/0x171
[ 53.816367] [<ffffffff817d73ee>] ? __schedule+0x5ce/0x682
[ 53.816367] [<ffffffff817d73e2>] ? __schedule+0x5c2/0x682
[ 53.816367] [<ffffffff817d73ee>] ? __schedule+0x5ce/0x682
BdsDxe: loading Boot0002 "UEFI QEMU HARDDISK QM00005 " from PciRoot(0x0)/Pci(0x6
,0x0)/Sata(0x0,0xFFFF,0x0)
BdsDxe: starting Boot0002 "UEFI QEMU HARDDISK QM00005 " from PciRoot(0x0)/Pci(0x
6,0x0)/Sata(0x0,0xFFFF,0x0)
Sysconf checksum failed. Using default values
WARNING: No BIOS Info found
Sysconf checksum failed. Using default values
Sysconf checksum failed. Using default values
Sysconf checksum failed. Using default values
ATE0Q1&D2&C1S0=1
Standalone chassis
check_bootmode: grub2pxe: grub failed, launch ipxe
Trying to load ipxe
Loading Application:
/Vendor(429bdb26-48a6-47bd-664c-801204061400)/UnknownMedia(6)/EndEntire
cannot load imageFailed to launch ipxe
Came back to grub, now load efi shell
Trying to load efishell
Loading Application:
/Vendor(429bdb26-48a6-47bd-664c-801204061400)/UnknownMedia(6)/EndEntire
cannot load imageFailed to launch shell
Trying to read config file /boot/grub/menu.lst.local from (hd0,4)
Filesystem type is ext2fs, partition type 0x83
Trying to read config file /boot/grub/menu.lst.local from (hd0,5)
Filesystem type is ext2fs, partition type 0x83
Sysconf checksum failed. Using default values
console (dumb)

Booting nxos.9.2.3.bin...
Booting nxos.9.2.3.bin
Trying diskboot
Filesystem type is ext2fs, partition type 0x83

.
.
.

loader >

 

The dots above signify there may have been more messages, but I didn't get them.  

 

And it gets weirder.  If I add another instance of the nxos9000 and boot it up, it comes up just fine, but when I stop it, connect it to the unmanaged switch and start the simulation, the second nxos9000 instance drops into the loader> prompt.  I added a third nxos device and booted it up without any simulation running and it booted fine.  When I connect this one to the unmanaged switch and then stop and restart the node, it drops into the loader> prompt.

 

What the heck is going on and how do I fix it?  It seems that if the nxos9000 node is already up and running successfully when I connect it to the unmanaged switch it will continue to work, but if I then stop the node and restart it, it's toast.

 

Paolo

 

3 Replies

pieterh
VIP

>>> The port on the 2960 switch attached to the system running CML 2.0 is configured for switchport port-security.  <<<

port-security limits the MAC addresses that are learned on this switchport.
The default limit is 1 (one single MAC address).

An unmanaged switch, an iosv device and an nxos9000 device will present multiple MAC addresses to the 2960 switch port, which will lead to an err-disabled state -> the switchport will be shut down.

==>> The low-level driver message will result from not being able to access the DHCP server -> no IP address!

-> You need to configure the 2960 to re-enable the port (shut, then no shut) and increase the MAC address limit on this port.
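
As a rough sketch of the steps above on the 2960 (the interface name GigabitEthernet0/1 and the limit of 10 are assumptions for illustration; use the port that actually faces the CML host and a limit that covers every MAC address the lab presents):

```
! Check whether the port is err-disabled and what port-security has recorded
Switch# show interfaces status err-disabled
Switch# show port-security interface GigabitEthernet0/1

! Raise the MAC address limit, then bounce the port to clear err-disabled
Switch# configure terminal
Switch(config)# interface GigabitEthernet0/1
Switch(config-if)# switchport port-security maximum 10
Switch(config-if)# shutdown
Switch(config-if)# no shutdown
Switch(config-if)# end
```

If the lab does not need port-security on that port at all, removing it with "no switchport port-security" under the interface is another option.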

 

 

Ollie Y
Level 1
Couple of things worth checking regarding the NX9K:

1) That you have enough CPU to allow each device's CPU to operate simultaneously, i.e. if you have 4 devices each with 1 vCPU, then make sure your server has a minimum of 4 (probably plus a couple for the CML OS). NXOS9Kv is quite sensitive to a lack of CPU, I've found, and you get that "CPU#0 stuck for 28s". From memory the 9K needs 2 vCPUs at a decent clock speed.

2) When you reboot the 9K without adding a boot command it will drop to the boot loader. Just add "boot nxos <version>.bin" to the config and save before rebooting and it won't drop to the boot loader on reload.
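
For item 2, a sketch of what this might look like on the node's console. It assumes the image is the nxos.9.2.3.bin file named in the boot log above and that it sits in bootflash; the exact prompts and path may differ on your node. From the loader prompt, boot the image by hand first:

```
loader > dir
loader > boot nxos.9.2.3.bin
```

Once NX-OS is up, set the boot variable in the running config and save it so the next restart finds the image:

```
switch# configure terminal
switch(config)# boot nxos bootflash:nxos.9.2.3.bin
switch(config)# end
switch# copy running-config startup-config
switch# show boot
```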

Not saying these will solve your issues, but worth thinking about and hopefully will help.

I'd like to try out #2. Where is this config?