UCS-FI-6248UP Fabric Interconnect lead state INAPPLICABLE mgmt services state DOWN

Sergey Roshchin
Level 1

Hello! We have 2 x UCS-FI-6248UP Fabric Interconnects and 1 x Cisco UCS 5108 chassis (7 servers), with 2 x Nexus 5548UP switches with L3 modules (data center routed access design).

In UCS Manager, FI A shows the state "inapplicable" and FI B is currently "primary".

Using CLI (connect local-mgmt A context) I can see:

NSX6248UP1-A(local-mgmt)# show cluster extended-state
Cluster Id: 0x5a479a88bbd711e2-0xbab0002a6a07b044

Start time: Tue Jan 30 11:42:09 2018
Last election time: Tue Jan 30 11:46:41 2018

A: UP, INAPPLICABLE, (Management services: DOWN)
B: UP, PRIMARY

A: memb state UP, lead state INAPPLICABLE, mgmt services state: DOWN
B: memb state UP, lead state PRIMARY, mgmt services state: UP
   heartbeat state PRIMARY_OK

INTERNAL NETWORK INTERFACES:
eth1, UP
eth2, UP

HA NOT READY
Management services are unresponsive on local Fabric Interconnect
Waiting for response from device.
Device count, expected: 1, active: 0
Detailed state of the device selected for HA storage:
Chassis 1, serial: FOX1702GT4F, state: inactive

Fabric A, Unable to connect to local chassis-shared-storage management interface: FOX1702GT4F
Warning: there are pending management I/O errors on one or more devices, failover may not complete
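
For anyone who wants to check this state from a saved console log, here is a minimal Python sketch that pulls the per-fabric states out of the output above. The function name `parse_cluster_state` and the regular expression are my own assumptions based only on the lines shown in this post; other UCS Manager releases may format the output differently.

```python
import re

# Sample lines copied from the "show cluster extended-state" output above.
SAMPLE = """\
A: memb state UP, lead state INAPPLICABLE, mgmt services state: DOWN
B: memb state UP, lead state PRIMARY, mgmt services state: UP
   heartbeat state PRIMARY_OK

HA NOT READY
"""

# Matches the detailed per-fabric line; the format is assumed from this post only.
LINE_RE = re.compile(
    r"^(?P<fabric>[AB]): memb state (?P<memb>\w+), "
    r"lead state (?P<lead>\w+), mgmt services state: (?P<mgmt>\w+)"
)


def parse_cluster_state(text):
    """Return ({fabric: {memb, lead, mgmt}}, ha_ready) parsed from CLI text."""
    fabrics = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            fabrics[m["fabric"]] = {
                "memb": m["memb"],
                "lead": m["lead"],
                "mgmt": m["mgmt"],
            }
    # Treat the cluster as ready only if "HA READY" appears without "HA NOT READY".
    ha_ready = "HA READY" in text and "HA NOT READY" not in text
    return fabrics, ha_ready


if __name__ == "__main__":
    fabrics, ha_ready = parse_cluster_state(SAMPLE)
    for name, state in sorted(fabrics.items()):
        print(name, state)
    print("HA ready:", ha_ready)
```

With the output from this thread, fabric A is reported with management services DOWN and the cluster as not HA ready.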

Additional pmon data:

NSX6248UP1-A(local-mgmt)# show pmon state

SERVICE NAME             STATE     RETRY(MAX)    EXITCODE    SIGNAL    CORE
------------             -----     ----------    --------    ------    ----
svc_sam_controller     running           0(4)           0         0      no
svc_sam_dme             failed           5(4)           0         6     yes
svc_sam_dcosAG         running           0(4)           0         0      no
svc_sam_bladeAG        running           0(4)           0         0      no
svc_sam_portAG         running           0(4)           0         0      no
svc_sam_statsAG        running           0(4)           0         0      no
svc_sam_hostagentAG    running           0(4)           0         0      no
svc_sam_nicAG          running           0(4)           0         0      no
svc_sam_licenseAG      running           0(4)           0         0      no
svc_sam_extvmmAG       running           0(4)           0         0      no
httpd.sh               running           0(4)           0         0      no
httpd_cimc.sh          running           0(4)           0         0      no
svc_sam_sessionmgrAG   running           0(4)           0         0      no
svc_sam_pamProxy       running           0(4)           0         0      no
dhcpd                  running           0(4)           0         0      no
sam_core_mon           running           0(4)           0         0      no
svc_sam_rsdAG          running           0(4)           0         0      no
svc_sam_svcmonAG       running           0(4)           0         0      no
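
The table above is easy to scan by eye, but if you collect it from several FIs, a small script can flag the services that are not running. This is a rough sketch that assumes the six-column layout shown here; `unhealthy_services` is a hypothetical helper name, not part of any Cisco tool.

```python
# A few rows copied from the "show pmon state" output above.
SAMPLE = """\
SERVICE NAME             STATE     RETRY(MAX)    EXITCODE    SIGNAL    CORE
------------             -----     ----------    --------    ------    ----
svc_sam_controller     running           0(4)           0         0      no
svc_sam_dme             failed           5(4)           0         6     yes
httpd.sh               running           0(4)           0         0      no
"""


def unhealthy_services(text):
    """Return a list of dicts for every service whose STATE is not 'running'."""
    problems = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have exactly six columns; the header splits into seven
        # ("SERVICE NAME" is two words) and the separator row starts with "-".
        if len(fields) != 6 or fields[0].startswith("-"):
            continue
        name, state, retry, exitcode, signal, core = fields
        if state != "running":
            problems.append(
                {
                    "name": name,
                    "state": state,
                    "retry": retry,
                    "exitcode": int(exitcode),
                    "signal": int(signal),
                    "core": core == "yes",
                }
            )
    return problems


if __name__ == "__main__":
    for svc in unhealthy_services(SAMPLE):
        print(svc)
```

For the output in this post, the only entry returned is svc_sam_dme: failed, retries 5(4), signal 6, core dumped.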

The message "ERROR: MGMT partition has unrecoverable error" also appears during boot.

[Image: Boot time error]

I cannot open a TAC case without a service subscription.

 

Thanks a lot for any help.

 
Accepted Solution

Sergey Roshchin
Level 1

Solved by a full rebuild of FI A (reference steps from https://supportforums.cisco.com/t5/data-center-documents/how-to-recover-from-a-software-failure-on-the-6120-fabric/ta-p/3121751).

 

Some tips:

1. Downloaded the infrastructure firmware bundle ucs-k9-bundle-infra.3.2.2d.A.bin (it must be exactly the same version as on the healthy FI so that the rebuilt FI can join the cluster) and extracted these files with 7-Zip:

ucs-6100-k9-kickstart.5.0.3.N2.3.22c.bin

ucs-6100-k9-system.5.0.3.N2.3.22c.bin

ucs-manager-k9.3.2.2d.bin

2. After booting the kickstart image from TFTP, I used the init system command to re-initialize the file systems. This is very destructive, especially if any licenses are installed.

3. Copied all the .bin files to bootflash, then copied ucs-manager-k9.3.2.2d.bin to nuova-sim-mgmt-nsg.0.1.0.001.bin (a special fixed name). I used Open TFTP Server (https://sourceforge.net/projects/tftp-server).

4. The boot version for FI A must then be set in UCS Manager; otherwise the next boot drops to the loader.
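
Since tip 1 depends on the image version matching the healthy FI exactly, here is a small Python sketch for checking file names before starting. It assumes UCS Manager displays versions in the form "3.2(2d)" and that image files embed the same version as "3.2.2d"; the helper names `normalize_version`, `image_version` and `matches_running_version` are hypothetical.

```python
import re

# Version token directly before ".bin" or ".A.bin", e.g. "3.2.2d".
# The naming pattern is assumed from the file names in this post.
IMAGE_RE = re.compile(r"\.(\d+\.\d+\.\d+[a-z]*)\.(?:A\.)?bin$")


def normalize_version(displayed):
    """Convert a displayed version such as '3.2(2d)' to '3.2.2d'."""
    return re.sub(r"\((\w+)\)", r".\1", displayed.strip())


def image_version(filename):
    """Return the dotted version embedded in an image file name, or None."""
    m = IMAGE_RE.search(filename)
    return m.group(1) if m else None


def matches_running_version(filename, displayed):
    """True if the image file version equals the version shown on the good FI."""
    found = image_version(filename)
    return found is not None and found == normalize_version(displayed)


if __name__ == "__main__":
    for name in ("ucs-k9-bundle-infra.3.2.2d.A.bin", "ucs-manager-k9.3.2.2d.bin"):
        print(name, matches_running_version(name, "3.2(2d)"))
```

This only compares strings in file names; it does not verify the image contents.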

 

Quick steps for copy and paste, using my version and TFTP IP:

... reboot FI ...
NSX6248UP1-A# connect local-mgmt A
NSX6248UP1-A(local-mgmt)# reboot

Break into the bootloader as the system begins to boot: CTRL+L
...

loader> set ip 10.1.253.241 255.255.255.0
loader> set gw 10.1.253.1

loader> boot tftp://10.3.11.68/ucs-6100-k9-kickstart.5.0.3.N2.3.22c.bin

switch(boot)# init system

switch(boot)# conf terminal
switch(boot)(config)# interface mgmt 0
switch(boot)(config-if)# ip address 10.1.253.241 255.255.255.0
switch(boot)(config-if)# no shut
switch(boot)(config-if)# exit
switch(boot)(config)# ip default-gateway 10.1.253.1
switch(boot)(config)# exit
switch(boot)# copy tftp://10.3.11.68/ucs-6100-k9-kickstart.5.0.3.N2.3.22c.bin bootflash:
switch(boot)# copy tftp://10.3.11.68/ucs-6100-k9-system.5.0.3.N2.3.22c.bin bootflash:
switch(boot)# copy tftp://10.3.11.68/ucs-manager-k9.3.2.2d.bin bootflash:

switch(boot)# copy bootflash:ucs-manager-k9.3.2.2d.bin bootflash:nuova-sim-mgmt-nsg.0.1.0.001.bin
switch(boot)# exit

... rebooting to loader ...
loader> boot ucs-6100-k9-kickstart.5.0.3.N2.3.22c.bin ucs-6100-k9-system.5.0.3.N2.3.22c.bin

... wait for initialization, then complete the initial configuration to rejoin the cluster ...
... set the boot version for the FI in UCS Manager ...
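
If your addresses or image versions differ from mine, the following Python sketch regenerates the same command sequence from parameters so you can paste it into the console. It only formats strings in the order of the steps above and does not talk to the FI; the function name `rebuild_commands` and its parameters are hypothetical.

```python
# Fixed file name the UCS Manager image is copied to (from tip 3 in this post).
FIXED_MANAGER_NAME = "nuova-sim-mgmt-nsg.0.1.0.001.bin"


def rebuild_commands(mgmt_ip, netmask, gateway, tftp_ip, kickstart, system, manager):
    """Return the loader and switch(boot) commands from this post as a list."""
    tftp = f"tftp://{tftp_ip}"
    return [
        # Loader: bring up networking and boot the kickstart image over TFTP.
        f"loader> set ip {mgmt_ip} {netmask}",
        f"loader> set gw {gateway}",
        f"loader> boot {tftp}/{kickstart}",
        # Kickstart prompt: wipe the file systems (destructive).
        "switch(boot)# init system",
        # Configure mgmt0 so the images can be copied.
        "switch(boot)# conf terminal",
        "switch(boot)(config)# interface mgmt 0",
        f"switch(boot)(config-if)# ip address {mgmt_ip} {netmask}",
        "switch(boot)(config-if)# no shut",
        "switch(boot)(config-if)# exit",
        f"switch(boot)(config)# ip default-gateway {gateway}",
        "switch(boot)(config)# exit",
        # Copy the three images to bootflash.
        f"switch(boot)# copy {tftp}/{kickstart} bootflash:",
        f"switch(boot)# copy {tftp}/{system} bootflash:",
        f"switch(boot)# copy {tftp}/{manager} bootflash:",
        # Copy the manager image to the fixed name.
        f"switch(boot)# copy bootflash:{manager} bootflash:{FIXED_MANAGER_NAME}",
        "switch(boot)# exit",
        # Back in the loader: boot kickstart and system from bootflash.
        f"loader> boot {kickstart} {system}",
    ]


if __name__ == "__main__":
    for line in rebuild_commands(
        mgmt_ip="10.1.253.241",
        netmask="255.255.255.0",
        gateway="10.1.253.1",
        tftp_ip="10.3.11.68",
        kickstart="ucs-6100-k9-kickstart.5.0.3.N2.3.22c.bin",
        system="ucs-6100-k9-system.5.0.3.N2.3.22c.bin",
        manager="ucs-manager-k9.3.2.2d.bin",
    ):
        print(line)
```

Run with the values from this thread, it prints the same commands as listed above.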


Wes Austin
Cisco Employee

You can attempt a reboot of the FI or a restart of pmon services from the CLI in an effort to recover FI-A. Unfortunately, if your management database has been corrupted or needs to be recovered, it would require a TAC case to load a debug image to conduct the repair.

 

 

I have rebooted both FI A and FI B. Nothing changed.
I cannot open a TAC case; I have no subscription right now :(
Is there a procedure to rebuild the FI (if there is no less time-consuming option)?

