
5 IOM replaced via RMA in 6 month...

Defdefred
Level 1

| IOM         | DC | Date        |
|-------------|----|-------------|
| IOM 6/1 (A) | A  | 10.nov.2023 |
| IOM 3/1 (A) | B  | 09.jan.2024 |
| IOM 4/2 (B) | A  | 12.feb.2024 |
| IOM 6/2 (B) | A  | 21.feb.2024 |
| IOM 6/1 (A) | A  | 16.apr.2024 |

Hello, do you have any idea what could cause multiple IOM hardware failures?

The hardware is new and the firmware is up to date...

The same IOM, 6/1 (A), has already failed twice!
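To make the pattern in the RMA list easier to see, here is a minimal sketch that tallies the replacements per slot and computes the time span they cover. The records are transcribed from the list above; the variable names are only illustrative.

```python
from collections import Counter
from datetime import date

# RMA records transcribed from the list above: (IOM, DC, replacement date).
rmas = [
    ("IOM 6/1 (A)", "A", date(2023, 11, 10)),
    ("IOM 3/1 (A)", "B", date(2024, 1, 9)),
    ("IOM 4/2 (B)", "A", date(2024, 2, 12)),
    ("IOM 6/2 (B)", "A", date(2024, 2, 21)),
    ("IOM 6/1 (A)", "A", date(2024, 4, 16)),
]

# Count replacements per (DC, IOM) position and keep the positions hit more than once.
per_slot = Counter((dc, iom) for iom, dc, _ in rmas)
repeat_slots = [slot for slot, count in per_slot.items() if count > 1]

# Number of days between the first and the last replacement.
dates = [d for _, _, d in rmas]
span_days = (max(dates) - min(dates)).days

print(repeat_slots)  # [('A', 'IOM 6/1 (A)')]
print(span_days)     # 158
```

Five replacements in 158 days, with one position hit twice, is the kind of summary that is useful to attach to a new TAC case.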

Chassis: Cisco UCS 5108 AC2 Chassis

IOM: Cisco UCS 2408

regards

6 Replies

Steven Tardy
Cisco Employee

Is the hardware really bad or just flagged as bad or thought to be bad?
I have not heard of any major hardware issues with the 2408 IOM (doesn't mean they don't exist).

Discuss this with your TAC team.
Open a new case detailing the prior TAC cases and RMAs and request further investigation.
TAC will review prior cases / failures to look for commonalities.
After opening the TAC case, for recurring issues like this I recommend working with your Account Team to open a "GFEP" (Account Teams know the process) to engage Engineering.

TAC working with Engineering should be able to provide sufficient details regarding the prior failures.

 

Hello Steven,

The IOM became completely disconnected, with all LEDs off. Reseating it doesn't change anything.

Nothing was found in the OBFL logs.

A dedicated case has been opened and an EFA is ongoing.

I will ask for information about the GFEP process...

Thanks.

Defdefred
Level 1

Yet another dead IOM.

Still waiting for EFA...

No one else impacted?

 

We upgraded 150-200 IOMs from 2204 to 2408 in the past year or two as part of FI upgrades from 6248 to 6454 across 12 domains, and have not had any DoA IOMs. What we have seen is IOMs stuck in an auto-update loop when upgrading from 2204 to 2408, caused by the IOM firmware version being very old (4.0(4c)) and UCSM being unable to auto-upgrade the IOM to the FI firmware version (4.2(3)): CSCwk79370: While migrating from UCS-IOM-2204 to UCS-IOM-2408, IOMs get stuck at Auto Updating/Auto Activating

I hope these volumes give you a larger sample for judging whether what you are seeing is abnormal (it seems so to me, since at our volumes we have never had a DoA IOM).

How new are these IOMs? Have they been in storage for long? The IOMs we had issues with came from batches delivered in 2021 (all had serial numbers starting with FCH25), and the jump from the ship-from-factory version to the current version was just too big. IOMs delivered in 2023, with serial numbers starting with FCH27, had no issues.
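If you have an inventory export, a quick way to see which batches your IOMs come from is to group the serial numbers by prefix. This is only a sketch: the FCH25 prefix is our own observation from the batches described above, not an official Cisco rule, and the function names and the serials in the usage example are made up.

```python
from collections import defaultdict

# Prefix we saw on the 2021 batches that had upgrade trouble (an observation, not a Cisco rule).
SUSPECT_PREFIXES = ("FCH25",)


def group_by_prefix(serials, prefix_len=5):
    """Group serial numbers by their first `prefix_len` characters."""
    groups = defaultdict(list)
    for serial in serials:
        serial = serial.strip().upper()
        groups[serial[:prefix_len]].append(serial)
    return dict(groups)


def flag_suspect(serials):
    """Return the serials whose prefix matches one of SUSPECT_PREFIXES."""
    cleaned = (s.strip().upper() for s in serials)
    return [s for s in cleaned if s.startswith(SUSPECT_PREFIXES)]
```

Usage, with invented serial numbers: `group_by_prefix(["FCH25AAAAAA", "FCH27BBBBBB"])` returns one list per prefix, and `flag_suspect(...)` on the same input returns only the FCH25 entry.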

Another possibly related bug: CSCvy20701 2408 IOM upgrade/downgrade failure on subset shipped April 2021 (this seems to relate to auto-upgrade failure, not DoA).

ITM-Team
Level 1

We had some problems with the 2408 IOM hitting bug CSCwf03588, which resulted in errors and failures during IOM firmware upgrades. Because the /var/volatile/log/sau_dbg.log file is not rotated correctly, the IOM eventually runs out of space.

You can check the size of the file by generating a tech support file for the 2408 and looking inside it.
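If you have many bundles to go through, something like the following can do the lookup for you. It is a rough sketch that assumes the tech support bundle has already been extracted to a local directory and that the log file itself is present in it; the exact layout of the bundle is not something I can vouch for, so the script simply searches the whole tree by file name. The function name is made up.

```python
import os


def find_log_sizes(root, filename="sau_dbg.log"):
    """Return (path, size_in_bytes) for every file named `filename` under `root`, largest first."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if filename in filenames:
            path = os.path.join(dirpath, filename)
            hits.append((path, os.path.getsize(path)))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    import sys

    # Usage: python find_log.py <extracted-techsupport-dir>
    for path, size in find_log_sizes(sys.argv[1] if len(sys.argv) > 1 else "."):
        print(f"{size / 1024 / 1024:8.1f} MiB  {path}")
```

If the bundle only contains a directory listing rather than the file, you would have to search the text files for the file name instead.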

martinc_intact
Level 1

We had similar issues where a brand new IOM just dies: no lights, nothing, and a reseat doesn't help. It needs to be replaced. We had 3 failures on the same domain in a short period. It seems to be related to a bug in 4.2(3d) and (3e): https://bst.cloudapps.cisco.com/bugsearch/bug/CSCwf52054

We also hit bug CSCwf03588, where /var/volatile fills up, several times (>15x). This can also cause an IOM firmware upgrade to fail. We lost a whole chassis during a firmware upgrade because of that...
