
Chassis Installation, IOM stuck at auto-upgrade status

scottcuk1
Level 1

We are implementing a new UCS environment consisting of a single 5108 chassis, a pair of 6120XP interconnects and Nexus 5K.

The chassis IOMs have four twinax cables each to the interconnects, and I had not set the discovery policy before powering up the chassis.

The interconnects are clustered and functional, but after connecting the chassis to the interconnects and powering it up, the overall status of both IOMs on the general tab is reporting "auto-upgrade".

The fault tab on the IOM shows code F0435 / ID 47401, IOM1/2 (B) is auto upgrading firmware.

It has been stuck in this state for 2.5 hours now. The FSM tab shows nothing related to the upgrade and is at 100%.

I have tried decommissioning and then recommissioning the chassis, which did not work.

I then reset both IOMs, but that had no effect and the auto-upgrade is still running. UCS Manager is running firmware version 1.3(n) and the interconnects are running 4.1(3)N2(1.3n).


Can anyone advise the best course of action to resolve this?


padramas
Cisco Employee

Hello,

Can you please provide the output of the following commands:

show chassis iom detail

connect nxos a

show int bri

show int fex-fabric

Can you also verify that the cabling between the IOMs and the FIs is correct, i.e. each IOM in a chassis should be connected to a single FI only?

http://www.cisco.com/en/US/docs/unified_computing/ucs/hw/chassis/install/install.html#wp1331197

Padma

Hi,

Thanks for the quick response padramas.

I did explicitly request that the cables from each IOM go to a separate FI. It doesn't matter which ports, does it? We used 1/1, 1/2, 1/3 and 1/4 on each FI.

# show chassis iom detai

Chassis Id: 1

ID: 1

Side: Left

Fabric ID: A

Overall Status: Auto Upgrade

Oper qualifier: N/A

Operability: N/A

Presence: Equipped

Thermal Status: N/A

Discovery: Auto Upgrading

Controller Subject:

Config State: Ok

Peer Comm Status: Unknown

Product Name:

PID:

VID:

Vendor: Cisco Systems Inc

Serial (SN): FCHXXXXXXX

Revision: 0

Chassis Id: 1

ID: 2

Side: Right

Fabric ID: B

Overall Status: Auto Upgrade

Oper qualifier: N/A

Operability: N/A

Presence: Equipped

Thermal Status: N/A

Discovery: Auto Upgrading

Controller Subject:

Config State: Ok

Peer Comm Status: Unknown

Product Name:

PID:

VID:

Vendor: Cisco Systems Inc

Serial (SN): FCXXXXXXX

Revision: 0

----

)# show int bri

--------------------------------------------------------------------------------

Ethernet      VLAN   Type Mode   Status  Reason                   Speed     Port

Interface                                                                   Ch #

--------------------------------------------------------------------------------

Eth1/1        1      eth  fabric down    Link not connected          10G(D) --

Eth1/2        1      eth  fabric down    Link not connected          10G(D) --

Eth1/3        1      eth  fabric down    Link not connected          10G(D) --

Eth1/4        1      eth  fabric down    Link not connected          10G(D) --

Eth1/5        1      eth  access down    SFP not inserted            10G(D) --

Eth1/6        1      eth  access down    SFP not inserted            10G(D) --

Eth1/7        1      eth  access down    SFP not inserted            10G(D) --

Eth1/8        1      eth  access down    SFP not inserted            10G(D) --

Eth1/9        1      eth  access down    SFP not inserted            10G(D) --

Eth1/10       1      eth  access down    SFP not inserted            10G(D) --

Eth1/11       1      eth  access down    SFP not inserted            10G(D) --

Eth1/12       1      eth  access down    SFP not inserted            10G(D) --

Eth1/13       1      eth  access down    SFP not inserted            10G(D) --

Eth1/14       1      eth  access down    SFP not inserted            10G(D) --

Eth1/15       1      eth  access down    SFP not inserted            10G(D) --

Eth1/16       1      eth  access down    SFP not inserted            10G(D) --

Eth1/17       1      eth  access down    Administratively down       10G(D) --

Eth1/18       1      eth  access down    Administratively down       10G(D) --

Eth1/19       1      eth  trunk  up      none                        10G(D) --

Eth1/20       1      eth  trunk  up      none                        10G(D) --

--------------------------------------------------------------------------------

Port   VRF          Status IP Address                              Speed    MTU

--------------------------------------------------------------------------------

mgmt0  --           down   10.10.10.10                            --       1500

--------------------

# show int fex-fabric

     Fabric      Fabric       Fex                FEX

Fex  Port      Port State    Uplink    Model         Serial

---------------------------------------------------------------

  1    Eth1/1    Configured     1         N20-C6508  FCHXXXXXXXX

  1    Eth1/2    Configured     2         N20-C6508  FCHXXXXXXXX

  1    Eth1/3    Configured     3         N20-C6508  FCHXXXXXXXX

  1    Eth1/4    Configured     4         N20-C6508  FCHXXXXXXXX

Hello,

Can you please share the following output too:

connect nxos

show run int eth 1/1

show fex 1 detail

Padma

Thanks, here's the output:

# show run int eth 1/1

version 4.1(3)N2(1.3n)

interface Ethernet1/1

  switchport mode fex-fabric

  pinning server

  fex associate 1 chassis-serial FOXXXXXXXX module-serial FCHXXXXXXX module-slot left

  no shutdown

------------------------

)# show fex 1 detail

FEX: 1 Description: FEX0001   state: Image Download

  FEX version: 5.0(3)N2(2.03a) [Switch version: 4.1(3)N2(1.3n)]

  FEX Interim version: 5.0(3)N2(2.03a)

  Switch Interim version: 4.1(3)N2(1.3n)

  Module Sw Gen: 21  [Switch Sw Gen: 21]

pinning-mode: static    Max-links: 1

  Fabric port for control traffic: Eth1/1

  Fabric interface state:

    Eth1/1 - Interface Up. State: Active

    Eth1/2 - Interface Up. State: Active

    Eth1/3 - Interface Up. State: Active

    Eth1/4 - Interface Up. State: Active

  Fex Port        State  Fabric Port  Primary Fabric

Logs:

[08/30/2012 13:44:38.344711] Module Offline

[08/30/2012 13:45:12.810625] Module register received

[08/30/2012 13:45:12.813530] Registration response sent

[08/30/2012 13:45:12.813927] Requesting satellite to download image

[08/30/2012 13:45:16.820891] Module register received

[08/30/2012 13:45:16.823762] Registration response sent

[08/30/2012 13:45:16.824154] Requesting satellite to download image

[08/30/2012 13:45:22.910233] Module register received

[08/30/2012 13:45:22.913061] Registration response sent

[08/30/2012 13:45:22.913457] Requesting satellite to download image

[08/30/2012 13:45:30.911357] Module register received

[08/30/2012 13:45:30.914216] Registration response sent

[08/30/2012 13:45:30.914608] Requesting satellite to download image

[08/30/2012 13:45:40.920251] Module register received

[08/30/2012 13:45:40.923088] Registration response sent

[08/30/2012 13:45:40.923480] Requesting satellite to download image

[08/30/2012 13:45:52.960578] Module register received

[08/30/2012 13:45:52.963430] Registration response sent

[08/30/2012 13:45:52.963826] Requesting satellite to download image

[08/30/2012 13:46:06.960109] Module register received

[08/30/2012 13:46:06.962942] Registration response sent

[08/30/2012 13:46:06.963336] Requesting satellite to download image

[08/30/2012 14:46:23.410493] Module register received

[08/30/2012 14:46:23.413267] Registration response sent

[08/30/2012 14:46:23.413613] Requesting satellite to download image

[08/30/2012 14:46:41.660324] Module register received

[08/30/2012 14:46:41.663081] Registration response sent

[08/30/2012 14:46:41.663429] Requesting satellite to download image

[08/30/2012 14:47:01.660190] Module register received

[08/30/2012 14:47:01.662935] Registration response sent

[08/30/2012 14:47:01.663284] Requesting satellite to download image

[08/30/2012 14:47:23.910897] Module register received

[08/30/2012 14:47:23.913670] Registration response sent

[08/30/2012 14:47:23.914015] Requesting satellite to download image

[08/30/2012 14:47:48.159955] Module register received

[08/30/2012 14:47:48.162716] Registration response sent

[08/30/2012 14:47:48.163072] Requesting satellite to download image

[08/30/2012 14:48:14.484160] Module register received

[08/30/2012 14:48:14.486910] Registration response sent

[08/30/2012 14:48:14.487270] Requesting satellite to download image

[08/30/2012 14:48:42.660104] Module register received

[08/30/2012 14:48:42.662872] Registration response sent

[08/30/2012 14:48:42.663220] Requesting satellite to download image

[08/30/2012 14:49:12.660206] Module register received

[08/30/2012 14:49:12.662983] Registration response sent

[08/30/2012 14:49:12.663327] Requesting satellite to download image

[08/30/2012 14:49:44.910129] Module register received

[08/30/2012 14:49:44.912908] Registration response sent

[08/30/2012 14:49:44.913254] Requesting satellite to download image

[08/30/2012 14:50:19.182966] Module register received

[08/30/2012 14:50:19.185741] Registration response sent

[08/30/2012 14:50:19.186088] Requesting satellite to download image

[08/30/2012 14:50:55.662032] Module register received

[08/30/2012 14:50:55.664795] Registration response sent

[08/30/2012 14:50:55.665144] Requesting satellite to download image

[08/30/2012 14:51:34.130030] Module register received

[08/30/2012 14:51:34.132871] Registration response sent

[08/30/2012 14:51:34.133230] Requesting satellite to download image

[08/30/2012 14:52:14.412295] Module register received

[08/30/2012 14:52:14.415056] Registration response sent

[08/30/2012 14:52:14.415405] Requesting satellite to download image

[08/30/2012 14:52:56.460612] Module register received

[08/30/2012 14:52:56.463344] Registration response sent

[08/30/2012 14:52:56.463693] Requesting satellite to download image
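The log above repeats the same three-step cycle (register, response, download request) without ever finishing the download. As a rough illustration only, here is a minimal Python sketch that counts those download requests in pasted `show fex 1 detail` log text. The function names and the threshold of 3 are my own assumptions, not anything defined by UCS or NX-OS.

```python
import re
from datetime import datetime

# Matches lines such as "[08/30/2012 13:45:12.813927] Requesting satellite to download image"
LOG_LINE = re.compile(r"\[(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}\.\d+)\]\s+(.*)")
DOWNLOAD_MSG = "Requesting satellite to download image"


def download_requests(log_text):
    """Return the timestamp of every download request found in the log text."""
    stamps = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match and match.group(2).strip() == DOWNLOAD_MSG:
            stamps.append(datetime.strptime(match.group(1), "%m/%d/%Y %H:%M:%S.%f"))
    return stamps


def looks_stuck(log_text, threshold=3):
    """Hypothetical heuristic: several download requests suggest a retry loop."""
    return len(download_requests(log_text)) >= threshold


# First three cycles copied from the output above.
sample = """\
[08/30/2012 13:45:12.810625] Module register received
[08/30/2012 13:45:12.813530] Registration response sent
[08/30/2012 13:45:12.813927] Requesting satellite to download image
[08/30/2012 13:45:16.820891] Module register received
[08/30/2012 13:45:16.823762] Registration response sent
[08/30/2012 13:45:16.824154] Requesting satellite to download image
[08/30/2012 13:45:22.910233] Module register received
[08/30/2012 13:45:22.913061] Registration response sent
[08/30/2012 13:45:22.913457] Requesting satellite to download image
"""

stamps = download_requests(sample)
# Seconds between consecutive download requests, rounded to whole seconds.
gaps = [round((later - earlier).total_seconds()) for earlier, later in zip(stamps, stamps[1:])]
print(len(stamps), gaps, looks_stuck(sample))
```

Run against the full log, the gaps between requests keep growing, which is consistent with the IOM retrying a download that never succeeds.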

Hello,

I believe this is a 2204 IOM. Can you please confirm?

The IOM is currently running version 2.03a and is attempting to download the 1.3 image, which will not work.

Is this a new installation?

If yes, please upgrade the FIs and UCSM to a 2.0.X version and verify the behavior.

[EDIT] 2204 and 2208 IO modules are supported from version 2.0 onward.
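For anyone who wants to spot this mismatch quickly, the check can be sketched in Python against the `FEX version` line of `show fex 1 detail`. This is only a sketch: it assumes the last parenthesised group of the version string (e.g. `2.03a` or `1.3n`) begins with the UCS major release, and the function names are made up for illustration.

```python
import re

# Captures the FEX and switch version strings from a line such as
# "FEX version: 5.0(3)N2(2.03a) [Switch version: 4.1(3)N2(1.3n)]"
VERSION_LINE = re.compile(r"FEX version:\s*(\S+)\s*\[Switch version:\s*(\S+)\]")
TRAILING_GROUP = re.compile(r"\(([^()]+)\)$")


def ucs_major(nxos_version):
    """Assumption: the last parenthesised group starts with the UCS major release."""
    group = TRAILING_GROUP.search(nxos_version)
    if group is None:
        raise ValueError("unexpected version format: " + nxos_version)
    return int(group.group(1).split(".")[0])


def fex_newer_than_switch(line):
    """True when the IOM (FEX) reports a higher major release than the FI."""
    match = VERSION_LINE.search(line)
    if match is None:
        raise ValueError("no FEX version line found")
    fex_version, switch_version = match.groups()
    return ucs_major(fex_version) > ucs_major(switch_version)


# Line copied from the output earlier in this thread.
line = "FEX version: 5.0(3)N2(2.03a) [Switch version: 4.1(3)N2(1.3n)]"
print(fex_newer_than_switch(line))
```

A `True` result here matches the situation in this thread: the IOM is on a 2.x image while the FI is still on 1.3.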

Padma

I'm not sure of the IOM model, as it's not showing up in UCSM with any detail.

As it's UCSM 1.3 there's no local upload, so after a lot of FTP issues with our firewall configuration I managed to upload the latest package over SCP (cs-k9-bundle-b-series.2.0.3c.B.bin).

After unpacking completes, the files are visible in packages and images, but after clicking upgrade firmware there is nothing to select. Is >2.0 a compatible upgrade for 1.3, or do I have to go to 1.4 first?

Hello,

If you physically look at the IOM, there should be a label on it that says UCS 2204XP or UCS 2208XP.

Otherwise, send me a private message with your IOM serial number (FCHXXXXXX) and I can confirm it for you.

Regarding the upgrade, please download the UCS infra bundle (ucs-k9-bundle-infra.2.0.3c.A.bin) and start with the UCSM upgrade.

http://www.cisco.com/cisco/software/release.html?mdfid=283612660&flowid=22121&softwareid=283655658&release=2.0%283c%29&relind=AVAILABLE&rellifecycle=&reltype=latest

If it is not in production, to keep the upgrade simple, remove the links between the FIs and the IOMs and then upgrade the NX-OS image on both FIs.

Once both FIs are running version 2.0.3c, reconnect the links between the IOMs and the FIs.

Padma

Hi,

I'm not onsite (the site is several hours away), unfortunately, so I don't have visibility of the device. I'm fairly sure they are the same as in our other environments: UCS-2104-XP.

I downloaded the bundle and activated UCSM, but it would only set to startup mode without actually initialising.

I went ahead and activated on the subordinate FI, which failed, I'm guessing because UCSM was still at 1.3.

I rebooted the primary FI to force the UCSM upgrade; it came back up and shows the 2.0(3c) splash screen. After entering the login credentials it times out after 60 seconds with

Login error:

java.net.SocketTimeoutException: Read timed out

From ssh,

show version

2.0 (3c)

show system:

Software Error: Exception during execution: [Error: Timed out communicating with DME]

show fabric-interconnect

Software Error: Exception during execution: [Error: Timed out communicating with DME]

Hello,

It can take time for the DME process to initialize after activating UCSM.

Can you please provide the following output:

show system firmware

connect local-mgmt

show pmon state

Padma

Hi Padma

This is the output (4 hours after the attempted upgrade).

# show system firmware

Software Error: Exception during execution: [Error: Timed out communicating with DME]

# connect local-mgmt

(local-mgmt)# show pmon state

SERVICE NAME             STATE     RETRY(MAX)    EXITCODE    SIGNAL    CORE

------------             -----     ----------    --------    ------    ----

svc_sam_controller     running           0(4)           0         0      no

svc_sam_dme            running           0(4)           0         0      no

svc_sam_dcosAG         running           0(4)           0         0      no

svc_sam_bladeAG        running           0(4)           0         0      no

svc_sam_portAG         running           0(4)           0         0      no

svc_sam_statsAG        running           0(4)           0         0      no

svc_sam_hostagentAG    running           0(4)           0         0      no

svc_sam_nicAG          running           0(4)           0         0      no

svc_sam_licenseAG      running           0(4)           0         0      no

svc_sam_extvmmAG       running           0(4)           0         0      no

httpd.sh               running           0(4)           0         0      no

svc_sam_sessionmgrAG   running           0(4)           0         0      no

svc_sam_pamProxy       running           0(4)           0         0      no

sfcbd                  running           0(4)           0         0      no

dhcpd                  running           0(4)           0         0      no

sam_core_mon           running           0(4)           0         0      no

svc_sam_rsdAG          running           0(4)           0         0      no

svc_sam_svcmonAG       running           0(4)           0         0      no
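When reading a long `show pmon state` listing like the one above, it can help to filter for anything that is not `running`. Below is a small Python sketch for that, assuming the six-column layout shown in this output; the function name is hypothetical.

```python
def services_not_running(pmon_output):
    """Return the names of services whose STATE column is not 'running'."""
    problems = []
    for line in pmon_output.splitlines():
        fields = line.split()
        # Data rows have six columns; the header splits into seven fields.
        if len(fields) != 6:
            continue
        # Skip the row of dashes under the header.
        if set(fields[0]) == {"-"}:
            continue
        name, state = fields[0], fields[1]
        if state != "running":
            problems.append(name)
    return problems


# A few rows copied from the output above.
sample = """\
SERVICE NAME             STATE     RETRY(MAX)    EXITCODE    SIGNAL    CORE
------------             -----     ----------    --------    ------    ----
svc_sam_controller     running           0(4)           0         0      no
svc_sam_dme            running           0(4)           0         0      no
httpd.sh               running           0(4)           0         0      no
"""

print(services_not_running(sample))
```

Note that an empty result does not prove the system is healthy. In this thread every service, including `svc_sam_dme`, shows as running while the CLI still reports "Timed out communicating with DME".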

Hello,

Please open a TAC service request so that we can take a look at your system via webex.

Thanks

Padma

For clarity, the primary would not allow me to log in until the secondary was powered off. Once we got back into the newly updated UCSM, it was just a matter of updating the other components.

To close the loop on the IOM upgrade issue: it was a 220X device, which requires a UCSM / NX-OS 2.0 image.

The issue was resolved once we upgraded the UCSM / NX-OS version.

Padma
