01-10-2016 03:23 AM - edited 03-01-2019 12:32 PM
We had 5 UCS domains, which we wanted to upgrade from 2.2.3d to 2.2.6d.
We used autoinstall.
3 out of 5 worked ok
2 were hanging "waiting for activation to complete on peer fabric interconnect" at 97%
see attachments.
We opened a TAC case; however, because a WebEx session was not possible for administrative reasons, TAC could not provide a solution.
Are there any known bugs?
PS.
All possible tricks were tried, without success
e.g. triggering another autoinstall with force
terminating autoinstall and manual activate
.....
01-10-2016 04:59 AM
Hi Walter,
Try rebooting the FI from the command line:
connect local-mgmt a (or b), then reboot
and see if that helps.
Check the firmware status after the upgrade. If you still see the old version, try activating the new firmware from the CLI:
http://www.cisco.com/c/en/us/td/docs/unified_computing/ucs/sw/firmware-mgmt/cli/2-1/b_CLI_Firmware_Management_21/b_CLI_Firmware_Management_21_chapter_0110.html#task_6551496931252847717
I don't see any known bug at this moment.
Thanks-
Afroz
01-10-2016 02:15 PM
Hi Afroz
We terminated the hanging "autoinstall infra", then tried to activate manually from the CLI, which didn't help.
The reboot ended in total chaos: the FI was corrupt and had to be restored; finally we had to downgrade to 2.2.3d.
Doing this we finally stepped into the next trap (see my newest post, chassis renumbering after power fail).
This weekend was a total frustration.
Walter.
01-12-2016 05:19 AM
Hi Walter,
Yeah, you most likely ran into that known bug we linked above. It causes a corrupt Fabric Interconnect, which requires TAC to do a restore. I apologize for that. I haven't heard of the UCS Central VM causing hang problems; I will look into that for you in more detail.
Regards,
Qiese Dides
01-13-2016 01:18 AM
Hi Qiese
Just a short clarification.
If autoinstall infrastructure updates the subordinate FI and hangs in activation, what exact procedure will TAC follow?
Is the subordinate FI corrupted and causing the trouble, or the primary FI (which still runs the old firmware version)?
Walter.
01-13-2016 05:36 AM
We would try to manually activate it. Can you still SSH into the subordinate FI or primary? If not you might have to console into them to see what output is being generated.
01-13-2016 06:11 AM
If it gets stuck, you can also attempt to roll the upgrade back to the previous version, in case of any other issues. That way you can at least address the issues without proceeding with the upgrade.
--Wes
01-13-2016 11:58 AM
Hi Wes
I got the following partial log of the situation:
The autoinstall infra process, with forced autoinstalls, was hanging in the activation step.
We opened the console and found FI-B sitting at the "console/gui" setup prompt after its first reboot to activate the new 2.2.6d firmware.
After going through the setup again, a downgrade to 2.2.3d was done; however, after that it entered the setup again:
---- Basic System Configuration Dialog ----
This setup utility will guide you through the basic configuration of
the system. Only minimal configuration including IP connectivity to
the Fabric interconnect and its clustering mode is performed through these steps.
Type Ctrl-C at any time to abort configuration and reboot system.
To back track or make modifications to already entered values,
complete input till end of section and answer no when prompted
to apply configuration.
Enter the configuration method. (console/gui) ?
Unfortunately the initial boot messages were lost, as we only attached to the console after realising the FI would not come back up. We never disconnected the L1/L2 cables.
We then tried to proceed with a normal config from the console - and were asked to downgrade, as the primary FI had a lower version than the sub FI (expected behaviour):
..........
Enter the configuration method. (console/gui) ? console
Installer has detected the presence of a peer Fabric interconnect. This Fabric interconnect will be added to the cluster. Continue (y/n) ? y
Enter the admin password of the peer Fabric interconnect:
Connecting to peer Fabric interconnect... done
Retrieving config from peer Fabric interconnect... done
Installer has determined that the peer Fabric Interconnect is running a different firmware version than the local Fabric. Cannot join cluster.
Local Fabric Interconnect
UCSM version : 2.2(6d)
Kernel version : 5.2(3)N2(2.26d)
System version : 5.2(3)N2(2.26d)
local_model_no : 6296
Peer Fabric Interconnect
UCSM version : 2.2(3d)
Kernel version : 5.2(3)N2(2.23d)
System version : 5.2(3)N2(2.23d)
peer_model_no : 6296
Do you wish to update firmware on this Fabric Interconnect to the Peer's version? (y/n): y
Updating firmware of Fabric Interconnect....... [ Please don't press Ctrl+c while updating firmware ]
Updating images
Please wait for firmware update to complete....
Checking the Compatibility of new Firmware..... [ Please don't Press ctrl+c ].
Verifying image bootflash:/installables/switch/ucs-6100-k9-kickstart.5.2.3.N2.2.23d.bin for boot variable "kickstart".
..........
Switch will be reloaded for disruptive upgrade.
Install is in progress, please wait.
Performing runtime checks.
[####################] 100% -- SUCCESS
..........
Install has been successful.
Firmware Updation Successfully Completed. Please wait to enter the IP address
cat: /tmp/tp: No such file or directory
Peer Fabric interconnect Mgmt0 IPv4 Address: 192.168.15.17
Peer Fabric interconnect Mgmt0 IPv4 Netmask: 255.255.255.0
Cluster IPv4 address : 192.168.15.18
Peer FI is IPv4 Cluster enabled. Please Provide Local Fabric Interconnect Mgmt0 IPv4 Address
Physical Switch Mgmt0 IP address : 192.168.15.16
Apply and save the configuration (select 'no' if you want to re-enter)? (yes/no): yes
Applying configuration. Please wait.
cat: /tmp/tp: No such file or directory
Sat Jan 9 02:08:58 UTC 2016
[ 822.470467] Restarting system.
[ 822.506859] machine restart
[ 822.540134] Resetting board (uc)
N5000 BIOS v.3.6.0, Wed 05/09/2012, 03:15 PM
....new boot with 2.23d .... Notice the 'corruption':
Booting kickstart image: bootflash:/installables/switch/ucs-6100-k9-kickstart.5
.2.3.N2.2.23d.bin...................................................................................
..............................................Image verification OK
ĆæUsage: init 0123456SsQqAaBbCcUu
INIT: [ 9.801805] I2C - Mezz present
Starting system POST.....
Executing Mod 1 1 SEEPROM Test:...done (0 seconds)
Executing Mod 1 1 GigE Port Test:....done (32 seconds)
Executing Mod 1 1 PCIE Test:.................done (0 seconds)
Mod 1 1 Post Completed Successfully
POST is completed
can't create lock file /var/lock/mtab~207: No such file or directory (use -n flag to override) <-- "normal" behaviour; not corrupt.
S10mount-ramfs.supnuovaca Mounting /isan 3000m
Mounted /isan
Creating /callhome..
Mounting /callhome..
Creating /callhome done.
Callhome spool file system init done.
nohup: redirecting stderr to stdout
autoneg unmodified, ignoring
autoneg unmodified, ignoring
Checking all filesystems..r.r.r. done. (on other boots, we would also see: Checking all filesystems.....ERROR: MGMT partition has unrecoverable error)
Checking NVRAM block device ... done
The startup-config won't be used until the next reboot.
.
Loading system software
Uncompressing system image: bootflash:/installables/switch/ucs-6100-k9-system.5.2.3.N2.2.23d.bin
Loading plugin 0: core_plugin...
Loading plugin 1: eth_plugin...
Loading plugin 2: fc_plugin...
/bin/dd: invalid number `Error'
gzip: stdin: unexpected end of file
/bin/tar: Child returned status 1
/bin/tar: Error exit delayed from previous errors
ethernet end-host mode on CA
FC end-host mode on CA
n_port virtualizer mode.
---------------------------------------------------------------
INIT: Entering runlevel: 3
touch: cannot touch `/var/lock/subsys/netfs': No such file or directory
Mountin
/isan/bin/first-setup.core: line 2040: /isan/etc/common_defs: No such file or directory
/isan/bin/muxif_config: fex vlan id: -f,4042
Set name-type for VLAN subsystem. Should be visible in /proc/net/vlan/config
Added VLAN with VID == 4042 to IF -:muxif:-
cp: cannot stat `/isan/plugin_img/fex.bin': No such file or directory
ERROR: Failed to upgrade SAM CLI commands. Details in /tmp/sam_upgrade_samcli.log
2016 Jan 9 04:11:07 %$ VDC-1 %$ %USER-2-SYSTEM_MSG: CLIS: loading cmd files begin - clis
2016 Jan 9 04:11:11 %$ VDC-1 %$ Jan 9 04:11:10 %KERN-0-SYSTEM_MSG: [ 9.801805] I2C - Mezz present - kernel
2016 Jan 9 04:11:17 %$ VDC-1 %$ %USER-2-SYSTEM_MSG: CLIS: loading cmd files end - clis
2016 Jan 9 04:11:17 %$ VDC-1 %$ %USER-2-SYSTEM_MSG: CLIS: init begin - clis
2016 Jan 9 04:11:22 %$ VDC-1 %$ %SNMPD-2-CRITICAL: SNMP log critical : load_mib_module :Error, while loading the mib module /isan/lib/libsvc_sam_extSnmpPlugin.so (/isan/lib/libsvc_sam_extSnmpPlugin.so: cannot open shared object file: No such file or directory)
2016 Jan 9 04:12:02 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_controller crashed with crash type:256
2016 Jan 9 04:12:02 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2016 Jan 9 04:12:03 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_dme crashed with crash type:256
2016 Jan 9 04:12:03 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
..........
System is coming up ... Please wait ...
System is coming up ... Please wait ...
2016 Jan 9 04:12:32 %$ VDC-1 %$ %VDC_MGR-2-VDC_ONLINE: vdc 1 has come online
System is coming up ... Please wait ...
touch: missing file operand
Try `touch --help' for more information.
/isan/bin/first-setup.core: line 1243: bond0: command not found
/isan/bin/first-setup.core: line 1244: bond0: command not found
/isan/bin/first-setup.core: line 1245: bond0: command not found
/isan/bin/first-setup.core: line 1247: bond0: command not found
/isan/bin/first-setup.core: line 2101: -t: command not found
nohup: appending output to `nohup.out'
nohup: cannot run command `/isan/bin/initial_setup.sh': No such file or directory
up: error fetching interface information: Device not found
2016 Jan 9 04:12:37 switch %$ VDC-1 %$ %USER-2-SYSTEM_MSG: Running in PIO stats mode - carmelusd
---- Basic System Configuration Dialog ----
This setup utility will guide you through the basic configuration of
the system. Only minimal configuration including IP connectivity to
the Fabric interconnect and its clustering mode is performed through these steps.
Type Ctrl-C at any time to abort configuration and reboot system.
To back track or make modifications to already entered values,
complete input till end of section and answer no when prompted
to apply configuration.
Enter the configuration method. (console/gui) ? console
Usage: grep [OPTION]... PATTERN [FILE]...
Try `grep --help' for more information.
Usage: grep [OPTION]... PATTERN [FILE]...
Try `grep --help' for more information.
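For anyone triaging a similar console capture, the failure lines in the output above can be counted with a small script. This is only a minimal sketch in Python: the signature list is an assumption built from strings copied out of this log, and reading them as signs of a damaged image is this thread's interpretation, not an official Cisco list. The function name scan_console_log is hypothetical.

```python
import re
from collections import Counter

# Assumed signatures: substrings copied from the console output in this post.
# Treating them as indicators of a damaged image is an interpretation made
# in this thread, not a documented Cisco diagnostic list.
SIGNATURES = {
    "truncated_archive": re.compile(r"gzip: stdin: unexpected end of file"),
    "tar_failure": re.compile(r"/bin/tar: (Child returned status \d+|Error exit delayed)"),
    "missing_library": re.compile(r"cannot open shared object file"),
    "process_crash": re.compile(r"\S+ crashed with crash type"),
    "mgmt_partition": re.compile(r"MGMT partition has unrecoverable error"),
}


def scan_console_log(text):
    """Return a Counter mapping signature name to the number of matching lines."""
    hits = Counter()
    for line in text.splitlines():
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                hits[name] += 1
    return hits


# A few lines taken verbatim from the capture above.
sample = """\
Loading plugin 2: fc_plugin...
gzip: stdin: unexpected end of file
/bin/tar: Child returned status 1
/bin/tar: Error exit delayed from previous errors
2016 Jan 9 04:12:02 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_controller crashed with crash type:256
2016 Jan 9 04:12:03 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_dme crashed with crash type:256
"""

result = scan_console_log(sample)
for name, count in sorted(result.items()):
    print(f"{name}: {count}")
```

Generic "No such file or directory" lines are deliberately not matched, because at least one of them (the mtab lock file message) is marked above as normal behaviour.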
01-18-2016 03:04 AM
Hi community
You should be aware of the following Field Notice, which applies to FI model 6296UP only; both of our failing upgrades were on 6296UPs.
http://www.cisco.com/c/en/us/support/docs/field-notices/640/fn64094.html?emailclick=CNSemail
Field Notice: FN - 64094 - Nexus 5596/UCS FI 6296 - System Fails to Boot After a Power Cycle, Microcode Upgrade Required
01-18-2016 04:53 AM
Hi Walter,
Were you hitting that FN? You would have to upgrade to 1.1 and then reboot for the upgrade to take effect.
01-18-2016 07:41 AM
Qiese
I assume the upgrade of the FI from 1.0 to 1.1 is disruptive, requiring a reboot.
Walter.
01-18-2016 08:33 AM
Hi Walter,
You can get TAC to apply the workaround with no disruption, but a reboot is required for it to take effect.
Qiese Dides
05-17-2017 09:09 PM
Hi Walter,
How did you terminate the hanging "autoinstall infra"?
BR
01-12-2016 01:03 AM
Hi Afroz
A question: this UCS domain was registered in UCS Central, and most likely the UCS Central VM was not reachable during the upgrade. Could this cause any trouble, e.g. hanging in activation?
Walter.
01-15-2016 10:19 AM