
Upgrade failure 2.2.3d to 2.2.6d

Walter Dey
VIP Alumni

We had 5 UCS domains, which we wanted to upgrade from 2.2.3d to 2.2.6d.

We used autoinstall.

Three out of five upgraded without problems.

Two hung at 97% with the message "waiting for activation to complete on peer fabric interconnect".

See attachments.

We opened a TAC case; however, because no WebEx session was possible for administrative reasons, TAC could not provide a solution.

Are there any known bugs?

PS.

We tried every trick we could think of, without success, e.g.:

triggering another autoinstall with force

terminating autoinstall and activating manually

.....


AFROJ AHMAD
Cisco Employee

Hi Walter,

Try rebooting the FI from the command line:

connect local-mgmt a/b >reboot

and see if that helps.

Check the firmware status after the upgrade. If you still see the old version, try activating the new firmware from the CLI.

http://www.cisco.com/c/en/us/td/docs/unified_computing/ucs/sw/firmware-mgmt/cli/2-1/b_CLI_Firmware_Management_21/b_CLI_Firmware_Management_21_chapter_0110.html#task_6551496931252847717

I don't see any known bug at this moment.

Thanks-

Afroz


Hi Afroz

We terminated the hanging "autoinstall infra", then tried to activate manually from the CLI, which didn't help.

The reboot ended in total chaos: the FI was corrupt and had to be restored, and finally we had to downgrade to 2.2.3d.

Doing this, we stepped into the next trap (see my newest post, chassis renumbering after power fail).

This weekend was a total frustration.

Walter.

Hi Walter,

Yeah, you most likely ran into the known bug we linked above. It corrupts the Fabric Interconnect, which then requires TAC to do a restore. I apologize for that. I haven't heard of the UCS Central VM causing hang problems; I will look into that for you in more detail.

Regards,

Qiese Dides

Hi Qiese

Just a short clarification.

If autoinstall infrastructure updates the subordinate FI and hangs during activation, what is the exact procedure that TAC will follow?

Is it the corrupted subordinate FI that causes the trouble, or the master FI (which still runs the old firmware version)?

Walter.

We would try to activate it manually. Can you still SSH into the subordinate or primary FI? If not, you might have to console into them to see what output is being generated.

If it gets stuck, you can also attempt to roll the upgrade back to the previous version, in case of any other issues. That way you can at least address the issues without proceeding with the upgrade.

--Wes

Hi Wes

I got the following partial log of the situation:

The autoinstall infra process, with forced autoinstalls, was hanging in the activation step.

We opened the console and found that FI-B was sitting at the "console/gui" setup prompt after its first reboot to activate the new 2.2.6d firmware:

After going through the setup again, we downgraded to 2.2.3d; however, the FI then entered the setup dialog again.

           ---- Basic System Configuration Dialog ----

  This setup utility will guide you through the basic configuration of
  the system. Only minimal configuration including IP connectivity to
  the Fabric interconnect and its clustering mode is performed through these steps.

  Type Ctrl-C at any time to abort configuration and reboot system.
  To back track or make modifications to already entered values,
  complete input till end of section and answer no when prompted
  to apply configuration.


  Enter the configuration method. (console/gui) ?



Unfortunately the initial boot messages were lost, as we only attached to the console after realising the FI would not come back up.  We never disconnected the L1/L2 cables.

We then tried to proceed with a normal config from the console - and were asked to downgrade, as the primary FI had a lower version than the sub FI (expected behaviour):

..........
 
Enter the configuration method. (console/gui) ? console

  Installer has detected the presence of a peer Fabric interconnect. This Fabric interconnect will be added to the cluster. Continue (y/n) ? y

  Enter the admin password of the peer Fabric interconnect:
    Connecting to peer Fabric interconnect... done
    Retrieving config from peer Fabric interconnect... done
    Installer has determined that the peer Fabric Interconnect is running a different firmware version than the local Fabric. Cannot join cluster.

    Local Fabric Interconnect
      UCSM version     : 2.2(6d)
      Kernel version   : 5.2(3)N2(2.26d)
      System version   : 5.2(3)N2(2.26d)
      local_model_no   : 6296

    Peer Fabric Interconnect
      UCSM version     : 2.2(3d)
      Kernel version   : 5.2(3)N2(2.23d)
      System version   : 5.2(3)N2(2.23d)
      peer_model_no    : 6296


  Do you wish to update firmware on this Fabric Interconnect to the Peer's version? (y/n): y
Updating firmware of Fabric Interconnect....... [ Please don't press Ctrl+c while updating firmware ]
 Updating images
 Please wait for firmware update to complete....
 Checking the Compatibility of new Firmware..... [ Please don't Press ctrl+c ].
Verifying image bootflash:/installables/switch/ucs-6100-k9-kickstart.5.2.3.N2.2.23d.bin for boot variable "kickstart".
..........
Switch will be reloaded for disruptive upgrade.

Install is in progress, please wait.

Performing runtime checks.
[####################] 100% -- SUCCESS


..........

Install has been successful.

 Firmware Updation Successfully Completed. Please wait to enter the IP address
cat: /tmp/tp: No such file or directory
    Peer Fabric interconnect Mgmt0 IPv4 Address: 192.168.15.17
    Peer Fabric interconnect Mgmt0 IPv4 Netmask: 255.255.255.0
    Cluster IPv4 address          : 192.168.15.18

    Peer FI is IPv4 Cluster enabled. Please Provide Local Fabric Interconnect Mgmt0 IPv4 Address

  Physical Switch Mgmt0 IP address : 192.168.15.16


  Apply and save the configuration (select 'no' if you want to re-enter)? (yes/no): yes
  Applying configuration. Please wait.

cat: /tmp/tp: No such file or directory
Sat Jan  9 02:08:58 UTC 2016
[  822.470467] Restarting system.
[  822.506859] machine restart
[  822.540134] Resetting board (uc)

N5000 BIOS v.3.6.0, Wed 05/09/2012, 03:15 PM

....new boot with 2.23d .... Notice the 'corruption':



Booting kickstart image: bootflash:/installables/switch/ucs-6100-k9-kickstart.5.2.3.N2.2.23d.bin...................................................................................
..............................................Image verification OK

ÿUsage: init 0123456SsQqAaBbCcUu
INIT: [    9.801805] I2C - Mezz present
Starting system POST.....
  Executing Mod 1 1 SEEPROM Test:...done (0 seconds)
  Executing Mod 1 1 GigE Port Test:....done (32 seconds)
  Executing Mod 1 1 PCIE Test:.................done (0 seconds)
  Mod 1 1 Post Completed Successfully
POST is completed
can't create lock file /var/lock/mtab~207: No such file or directory (use -n flag to override)          <-- "normal" behaviour; not corrupt.
S10mount-ramfs.supnuovaca Mounting /isan 3000m
Mounted /isan
Creating /callhome..
Mounting /callhome..
Creating /callhome done.
Callhome spool file system init done.
nohup: redirecting stderr to stdout
autoneg unmodified, ignoring
autoneg unmodified, ignoring
Checking all filesystems..r.r.r. done.   (on other boots, we would also see: Checking all filesystems.....ERROR: MGMT partition has unrecoverable error)
Checking NVRAM block device ... done
The startup-config won't be used until the next reboot.
.
Loading system software
Uncompressing system image: bootflash:/installables/switch/ucs-6100-k9-system.5.2.3.N2.2.23d.bin

Loading plugin 0: core_plugin...
Loading plugin 1: eth_plugin...
Loading plugin 2: fc_plugin...


/bin/dd: invalid number `Error'

gzip: stdin: unexpected end of file
/bin/tar: Child returned status 1
/bin/tar: Error exit delayed from previous errors
ethernet end-host mode on CA

FC end-host mode on CA
n_port virtualizer mode.
---------------------------------------------------------------
INIT: Entering runlevel: 3
touch: cannot touch `/var/lock/subsys/netfs': No such file or directory
Mountin
/isan/bin/first-setup.core: line 2040: /isan/etc/common_defs: No such file or directory
/isan/bin/muxif_config: fex vlan id: -f,4042
Set name-type for VLAN subsystem. Should be visible in /proc/net/vlan/config
Added VLAN with VID == 4042 to IF -:muxif:-
cp: cannot stat `/isan/plugin_img/fex.bin': No such file or directory
ERROR: Failed to upgrade SAM CLI commands. Details in /tmp/sam_upgrade_samcli.log
2016 Jan  9 04:11:07  %$ VDC-1 %$ %USER-2-SYSTEM_MSG: CLIS: loading cmd files begin  - clis
2016 Jan  9 04:11:11  %$ VDC-1 %$ Jan  9 04:11:10 %KERN-0-SYSTEM_MSG: [    9.801805] I2C - Mezz present  - kernel
2016 Jan  9 04:11:17  %$ VDC-1 %$ %USER-2-SYSTEM_MSG: CLIS: loading cmd files end  - clis
2016 Jan  9 04:11:17  %$ VDC-1 %$ %USER-2-SYSTEM_MSG: CLIS: init begin  - clis
2016 Jan  9 04:11:22  %$ VDC-1 %$ %SNMPD-2-CRITICAL: SNMP log critical : load_mib_module :Error, while loading the mib module /isan/lib/libsvc_sam_extSnmpPlugin.so (/isan/lib/libsvc_sam_extSnmpPlugin.so: cannot open shared object file: No such file or directory)

2016 Jan  9 04:12:02  %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_controller crashed with crash type:256
2016 Jan  9 04:12:02  %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2016 Jan  9 04:12:03  %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_dme crashed with crash type:256
2016 Jan  9 04:12:03  %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH

..........

System is coming up ... Please wait ...
System is coming up ... Please wait ...
2016 Jan  9 04:12:32  %$ VDC-1 %$ %VDC_MGR-2-VDC_ONLINE: vdc 1 has come online
System is coming up ... Please wait ...
touch: missing file operand
Try `touch --help' for more information.
/isan/bin/first-setup.core: line 1243: bond0: command not found
/isan/bin/first-setup.core: line 1244: bond0: command not found
/isan/bin/first-setup.core: line 1245: bond0: command not found
/isan/bin/first-setup.core: line 1247: bond0: command not found
/isan/bin/first-setup.core: line 2101: -t: command not found
nohup: appending output to `nohup.out'
nohup: cannot run command `/isan/bin/initial_setup.sh': No such file or directory
up: error fetching interface information: Device not found

2016 Jan  9 04:12:37 switch %$ VDC-1 %$ %USER-2-SYSTEM_MSG: Running in PIO stats mode  - carmelusd
 
           ---- Basic System Configuration Dialog ----

  This setup utility will guide you through the basic configuration of
  the system. Only minimal configuration including IP connectivity to
  the Fabric interconnect and its clustering mode is performed through these steps.

  Type Ctrl-C at any time to abort configuration and reboot system.
  To back track or make modifications to already entered values,
  complete input till end of section and answer no when prompted
  to apply configuration.


  Enter the configuration method. (console/gui) ? console
Usage: grep [OPTION]... PATTERN [FILE]...
Try `grep --help' for more information.
Usage: grep [OPTION]... PATTERN [FILE]...
Try `grep --help' for more information.
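
For anyone triaging a similar console capture, the failure markers in the log above can be searched for mechanically. The following is a minimal Python sketch, not a Cisco tool: the signature strings are copied from this transcript, while the labels, the `triage` function name, and the sample text are my own assumptions for illustration. It only reports which markers appear and which UCSM versions the installer printed; it does not diagnose the root cause.

```python
import re

# Marker strings copied from the console log in this thread.
# The descriptions on the right are an informal reading, not official Cisco wording.
SIGNATURES = {
    "gzip: stdin: unexpected end of file": "system image archive appears truncated",
    "/bin/tar: Child returned status 1": "plugin extraction failed",
    "MGMT partition has unrecoverable error": "filesystem check failed on the MGMT partition",
    "crashed with crash type": "a UCSM service crashed during startup",
    "Basic System Configuration Dialog": "FI fell back to the initial setup dialog",
}

# Matches lines such as "UCSM version     : 2.2(6d)" from the installer output.
VERSION_RE = re.compile(r"UCSM version\s*:\s*(\S+)")


def triage(log_text):
    """Return (findings, versions) for a captured console log.

    findings is a list of (line_number, description) tuples;
    versions is the list of UCSM version strings in order of appearance.
    """
    findings = []
    for lineno, line in enumerate(log_text.splitlines(), start=1):
        for needle, description in SIGNATURES.items():
            if needle in line:
                findings.append((lineno, description))
    versions = VERSION_RE.findall(log_text)
    return findings, versions


# Hypothetical excerpt assembled from lines quoted in this thread.
sample = """\
      UCSM version     : 2.2(6d)
      UCSM version     : 2.2(3d)
gzip: stdin: unexpected end of file
/bin/tar: Child returned status 1
2016 Jan  9 04:12:02  %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_dme crashed with crash type:256
"""

findings, versions = triage(sample)
for lineno, description in findings:
    print(f"line {lineno}: {description}")
if len(set(versions)) > 1:
    print("version mismatch between local and peer FI:", versions)
```

Running it against a full capture (read the file into a string and pass it to `triage`) gives a quick list of lines worth showing to TAC. The signature list is deliberately small and should be extended with whatever markers appear in your own logs.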



Hi community

You should be aware of the following field notice (FN), which applies to FI model 6296UP only; both of our failed upgrades were on 6296UPs.

http://www.cisco.com/c/en/us/support/docs/field-notices/640/fn64094.html?emailclick=CNSemail

Field Notice: FN - 64094 - Nexus 5596/UCS FI 6296 - System Fails to Boot After a Power Cycle, Microcode Upgrade Required

Hi Walter,

Were you hitting that FN? You would have to upgrade to 1.1 and then reboot for the upgrade to take effect.

Qiese

I assume the upgrade of the FI from 1.0 to 1.1 is disruptive, requiring a reboot.

Walter.

Hi Walter,

You can get TAC to apply the workaround with no disruption, but a reboot is required for it to take effect.

Qiese Dides

Hi Walter,

How did you terminate the hanging "autoinstall infra"?

BR

Hi Afroz

Q. This UCS domain was registered in UCS Central; most likely the UCS Central VM was not reachable during the upgrade. Could this cause any trouble, e.g. hanging in activation?

Walter.
