06-04-2012 03:31 AM - edited 03-01-2019 10:26 AM
Hi everyone!
I've installed my first UCS system: 2 UCS 5108 & 2 UCS 6248
The first chassis holds six blade servers (2 B230 M2 and 4 B200 M2); the second holds 5 B200 M2. I have two air conditioners in the server room running at maximum. Over the last week I've received three faults on the first chassis (Fault Code: F0411). The IOM temperature was about 45-46. After that I moved 3 blade servers to the second chassis until I solve this problem.
UCSM version - 2.0.2r
Everything else is working well, except for the thermal problem. All blade servers are discovered, with 0 errors and 0 critical faults.
06-04-2012 04:26 AM
Hello Sergey,
Please open a TAC service request with the UCSM and Chassis 1 and 2 tech support bundles.
We need more logs to investigate the thermal fault.
Padma
06-04-2012 04:51 AM
Please reset the IOM physically present in that chassis. I have done this twice for the thermal issue, and the issue never recurred.
Ram
06-04-2012 06:17 AM
I'll try to reset them. Could you tell me how to do this correctly?
06-04-2012 06:45 AM
Just unplug the right-hand IOM and plug it back in. Wait 20 minutes until everything comes up, then repeat the steps for the other side.
Ram
06-04-2012 07:03 AM
So I shouldn't turn the power off? Just unplug one IOM, put it back, and then do the same with the other?
06-04-2012 04:50 PM
Yes, don't power off; it is not required.
06-04-2012 06:19 AM
I can't open a TAC case at the moment; my SMARTnet contract is still being registered. I've created tech support files for Chassis 1 and 2. Should I post them here or wait for my SMARTnet?
06-04-2012 08:53 AM
This can be caused by an I2C issue on the server.
You can try the following:
Reset fans one by one.
Reset PSUs one by one.
Finally, reset the IOMs, starting with the faulty one.
Also, determine which blade is showing any alarms and try to reseat the blade on the chassis.
Please make sure to wait a couple of minutes during the resetting of the components.
06-05-2012 12:09 AM
How do I do this correctly? Power off, then reset, or what?
06-04-2012 05:51 PM
I think this is a code bug and you need to go to 2.0(q). I'm running two 6248 systems at that level and not having the issue. This thermal stuff plagued ALL the 1.4x releases.
Craig
06-05-2012 12:11 AM
I've got the same errors on 2.0.2q...
06-05-2012 07:42 AM
Sergey,
If this is a real I2C issue, you may still see the same behavior on a 2.0.x release if the I2C bus was not cleared before the upgrade. (At this moment I don't know whether you recently upgraded the system.)
The I2C bus transports information about the different components of the Unified Computing System: chassis, IOMs, fans, PSUs, etc. What happens is that all those components try to send their status updates at the same time, the I2C bus gets overwhelmed, and then no one can report their real status. So we usually recommend that the customer reseat all major components, one at a time, to clear the bus and then do the upgrade. If that is not done before the upgrade, it should still be done after.
Try reseating the fans and PSUs, one at a time, leaving a minute in between, and then the IOMs, one at a time, leaving three minutes in between and beginning with the subordinate to cause minimum disruption.
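The reseat order described here can be written down as a checklist before going to the rack. What follows is a minimal Python sketch, not a Cisco tool: the component labels (fan-1, psu-1, iom-2 and so on), the Step type, and build_reseat_plan are hypothetical names chosen for illustration. Only the ordering and the wait times come from the advice in this post.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    component: str      # hypothetical label for the part to reseat
    wait_minutes: int   # how long to wait after reseating it


def build_reseat_plan(fans: List[str], psus: List[str],
                      subordinate_iom: str, primary_iom: str) -> List[Step]:
    """Order the reseats: fans, then PSUs, then IOMs (subordinate first)."""
    plan = [Step(fan, 1) for fan in fans]        # one minute between fans
    plan += [Step(psu, 1) for psu in psus]       # one minute between PSUs
    plan += [Step(subordinate_iom, 3),           # three minutes between IOMs
             Step(primary_iom, 3)]
    return plan


if __name__ == "__main__":
    # Labels below are made up; use the slot numbers of your own chassis.
    plan = build_reseat_plan(["fan-1", "fan-2"], ["psu-1", "psu-2"],
                             subordinate_iom="iom-2", primary_iom="iom-1")
    for number, step in enumerate(plan, start=1):
        print(f"{number}. reseat {step.component}, "
              f"then wait {step.wait_minutes} min")
```

Which IOM counts as the subordinate depends on your fabric interconnect roles, so it is passed in explicitly rather than guessed.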
If this does not clear the situation, you will need to remove the components already mentioned, one at a time, and run "show tech-support chassis # all brief" to see what the I2C bus reports segment by segment (chassis, fans, PSUs...). Once you remove a component and the errors on each segment stop incrementing, you have found your faulty piece of hardware, and a TAC case will be needed to send a replacement.
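The elimination step can be tracked with a few lines of Python once the error counts have been read off the tech-support output by hand. This is only a sketch under assumptions: it does not parse any real UCSM output, the segment names and numbers are invented, and growing_segments and find_suspect are hypothetical helpers. Each trial records which component was removed plus two readings of the per-segment error counters taken some time apart.

```python
from typing import Dict, List, Optional, Tuple

Snapshot = Dict[str, int]  # segment name -> error count, typed in by hand


def growing_segments(first: Snapshot, second: Snapshot) -> Dict[str, int]:
    """Return segments whose error count rose between the two readings."""
    return {
        seg: second[seg] - count
        for seg, count in first.items()
        if second.get(seg, count) > count
    }


def find_suspect(trials: List[Tuple[str, Snapshot, Snapshot]]) -> Optional[str]:
    """Return the first removed component after which no counter grew."""
    for removed, first, second in trials:
        if not growing_segments(first, second):
            return removed
    return None


if __name__ == "__main__":
    # Made-up readings for illustration only.
    trials = [
        ("fan-1", {"chassis": 10, "fans": 4}, {"chassis": 12, "fans": 4}),
        ("psu-2", {"chassis": 12, "fans": 4}, {"chassis": 12, "fans": 4}),
    ]
    print(find_suspect(trials))
```

A result of None means the counters kept growing in every trial, so the remaining components still need to be tested.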
For further analysis or assistance, I strongly recommend a TAC case to be opened.
-Kenny
06-05-2012 02:45 PM
Have gone through powering off the whole 6248 UCS on 2q, and issues remain.
Craig
06-05-2012 02:54 PM
Actually, you don't have to power off the 6248 FIs. An effective but heavy-handed solution is to decommission and power-cycle the chassis that is generating those faults, including removing the power cords. Then you can wait a minute and recommission the chassis. After that, all thermal faults should go away.