08-30-2024 03:30 AM
Hi,
I need some help. I have a C220 M4 server that was running fine until late yesterday, when I decided to upgrade its CPUs from E5 v3 to E5 v4.
Now, when powered on, the server spins the fans to max and the front LEDs are off: no green or amber, nothing. Only the power LED is green. No VGA output, just fans at max.
When I swapped the old CPUs back in, it was the same: loud fans and a dead server.
I checked the fault LEDs on the inside, and all are off: CPU, RAM, fans, etc.
The exception is the LEDs next to the power supply (LSB): all 8 are lit. I attached a photo.
Any idea what's going on, and potentially what to do?
Many thanks
Michael
08-30-2024 05:25 AM
- Depending on the state of the server: if diagnostics can still be launched, look at:
https://www.cisco.com/c/en/us/td/docs/unified_computing/ucs/sw/ucsscu/scu_diag/user/guide/b_UCS_Server_Diagnostics_User_Guide_50.html
https://www.cisco.com/c/en/us/td/docs/unified_computing/ucs/release/notes/b-release-notes-for-cisco-ucs-server-diagnostics-utility.html
(provides additional info)
- For the rest: loud fans usually mean 'hardware blocked'. If the old CPUs were put back, perhaps something was touched;
sometimes pressing on other hardware components can help, as can replacing and/or reconnecting the power supplies.
In a worst-case scenario, the previously used CPUs got damaged when they were removed.
M.
08-30-2024 06:41 AM
Thank you for your response. Much appreciated.
I can't access any diagnostics any more; the server is just in this weird state with the fans at max and the LEDs that I showed before.
I checked the components and disconnected everything that I could. I even removed the CPUs and mounted a single CPU from a different server: nothing.
I tested the CPUs in another server and all 4 of them (old and new) are good.
The only thing that I have noticed since the last time is that the LEDs next to all the fans are on; by on I mean slightly magenta. Please see the attached photo.
I even replaced the RTC battery. I don't know why, but it felt good.
Thank you
MM
08-30-2024 06:54 AM
- Unfortunately I won't be able to help any further; you may need to involve Cisco's support.
M.
08-30-2024 09:37 AM
thank you for your help
08-30-2024 07:03 AM
If you previously set up the CIMC interface, then start by connecting to it with a browser and check the status of the components from there. It uses a dedicated CPU, so it should not be affected by hardware issues (except power issues...).
regards
08-30-2024 09:38 AM
I did set up CIMC, but the server is unresponsive even to CIMC; well, except for the fans...
I will continue trying to find the fault, and I will report back if I find something.
Thank you
08-30-2024 09:46 AM
- Do you also have the option of not restoring all the original CPUs at once, but starting with only one?
M.
08-30-2024 10:31 AM
08-30-2024 11:25 PM
- What about the status of these: https://www.cisco.com/c/en/us/td/docs/unified_computing/ucs/c/hw/C240M4/install/C240M4/replace.html#13008
M.
08-31-2024 01:59 AM
Hi,
Nothing, they don't turn on. Every time I disconnect the power from the PSU, the network LED on the far right flashes green and then turns off.
The power LED is green when powered on.
The drive LEDs are all green, but I disconnected the drives anyway, just in case.
I made a short video to show what happens when you turn on the server (attached).
I disassembled the entire server and put it back together after inspection: no visible faults, damage, etc.
The CPUs currently mounted are different; I used v3 CPUs from a different server.
I took the server home. You know, we all take the servers back home on a weekend.
08-31-2024 03:02 AM
- Could you also try with a minimum of memory (a single module), and/or re-seat that single memory module first?
M.
08-31-2024 05:32 AM
Hi, I removed the RAM, re-seated it, tried different RAM... nothing.
I started taking photos of the motherboard and checking for any defects; nothing so far.
The only thing I will add: when I switch SW6 no. 4 to ON (this is to reset the CMOS), the server turns on the network LED on the front. It will not start when the power button is pressed, but it is still something. When I change the switch back to OFF, the server starts automatically, but it is in the same unresponsive state.
Details for the DIP switch (SW6): https://www.cisco.com/c/en/us/td/docs/unified_computing/ucs/c/hw/C220M4/install/C220M4/replace.pdf
See "Using the Clear CMOS DIP Switch", page 3-72.
09-05-2024 11:31 AM
Hi all,
Just as an update: I believe I have found the reason why the system became non-bootable, but no solution for bringing it back to life. I am adding this to leave some information for anyone who may come across a similar problem.
Please have a look at Cisco page: https://www.cisco.com/c/en/us/td/docs/unified_computing/ucs/c/CPU/v4/install/v4-C.html#40992
Section: Cisco UCS C-Series Servers Upgrade Guide for Intel Xeon v4 Series CPUs
Caution: You must follow the procedures in this guide when upgrading an existing server that has earlier version CPUs to use Intel Xeon v4 Series CPUs. You must upgrade server firmware before you upgrade the CPU hardware, as described in the procedures in this document. Failure to follow these procedures might result in a non-bootable server.
The Caution in that guide explains that if the BIOS and CIMC are not upgraded to the minimum firmware version, the system can become non-bootable, and that is what has happened here. I did not check the firmware version before performing the upgrade, and I believe this caused the issue.
For anyone intending to upgrade from Xeon v3 to v4 on the C220 M4: please upgrade the firmware to the minimum version first, before attempting any CPU swap.
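If you want to script that pre-check, here is a minimal sketch in Python. It assumes the firmware versions are strings in the usual Cisco form such as 2.0(10b), i.e. major.minor(patch plus optional letter). The function names and the version values used in the example are hypothetical placeholders of mine; take the real minimum version from the upgrade guide linked above and the installed version from your own CIMC.

```python
import re

# Assumed format: "major.minor(patch + optional lowercase letters)", e.g. "2.0(10b)".
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\((\d+)([a-z]*)\)$")


def parse_version(text):
    """Split a version string into a tuple that compares correctly."""
    match = _VERSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    major, minor, patch, suffix = match.groups()
    # Numeric parts are compared as integers so that 10 sorts after 3.
    return (int(major), int(minor), int(patch), suffix)


def safe_to_swap_cpus(installed, minimum):
    """Return True if the installed firmware is at least the required minimum."""
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    # Placeholder values only; NOT the real minimum from the Cisco guide.
    print(safe_to_swap_cpus("2.0(3d)", "2.0(10b)"))
```

The patch number is compared as an integer on purpose: a plain string comparison would rank "2.0(3d)" above "2.0(10b)", which is exactly the kind of mistake that would let an upgrade go ahead on firmware that is too old.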
Thank you all for your help.
MM