11-19-2015 02:52 PM - edited 03-01-2019 12:28 PM
Hello All,
I have been facing the error below on UCS 460 M1 servers. The server is stuck and gives the following errors.
In the CIMC console I'm getting the following (snapshot attached):
Critical F0174 sys/rack-unit-1/board equipment-inoperable CATERR_N: A catastrophic fault has occurred on one of the processors: Please check the processors' status.
Major F0743 sys/rack-unit-1 psu-redundancy-fail PS_REDUNDANCY: Power Supply redundancy is lost : Reseat or replace Power Supply
I have searched a lot but could not find anything.
Thanks in advance for your kind support. Regards,
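For anyone who wants to triage faults like the two quoted above from a pasted list, here is a minimal sketch that splits each line into its fields. It assumes the layout seen in this post (severity, fault code, affected object, cause, then free-text description); `Fault` and `parse_fault` are illustrative names, not part of any Cisco tool.

```python
import re
from typing import NamedTuple, Optional


class Fault(NamedTuple):
    severity: str
    code: str
    dn: str
    cause: str
    description: str


# Assumed layout: "<Severity> <Fxxxx> <object DN> <cause> <description...>"
FAULT_RE = re.compile(
    r"^(?P<severity>\w+)\s+(?P<code>F\d{4})\s+(?P<dn>\S+)\s+"
    r"(?P<cause>\S+)\s+(?P<description>.+)$"
)


def parse_fault(line: str) -> Optional[Fault]:
    """Return the parsed fields, or None if the line does not match the assumed layout."""
    match = FAULT_RE.match(line.strip())
    return Fault(**match.groupdict()) if match else None


if __name__ == "__main__":
    sample = (
        "Major F0743 sys/rack-unit-1 psu-redundancy-fail "
        "PS_REDUNDANCY: Power Supply redundancy is lost : Reseat or replace Power Supply"
    )
    fault = parse_fault(sample)
    print(fault.severity, fault.code, fault.dn, fault.cause)
```

Lines that do not follow the assumed layout come back as `None` rather than raising, so a mixed paste can be filtered safely.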
11-19-2015 03:45 PM
Hello,
It appears that one of the CPUs in your server may have failed. I would collect the tech support files and open a TAC case to investigate.
As for the second error, judging by the rest of the faults in the screen capture, you are running on two PSUs, as PSU3 and PSU4 appear to be missing. If you are running on only two PSUs, the server's power redundancy is set up to raise an alarm when one or more PSUs are missing.
HTH,
Wes
11-20-2015 07:28 AM
Hi Wesley,
We have been facing the same error on all our Cisco UCS 460M1 servers. Previously all the servers were working fine, but we want to install Server 2012 R2 on them, so I ran the Host Upgrade Utility, and after that the problem appeared on all our servers.
11-23-2015 05:58 AM
Hey,
It sounds like the thresholds for power redundancy changed when you upgraded the firmware. For example, older versions of CIMC may not raise a Major fault, or any fault at all, if you are running on only two PSUs when four slots are available.
In newer versions, if there are four available slots and only two are occupied, the power redundancy threshold is technically not met, so a fault is raised. This can be safely ignored if you are okay with using only two PSUs.
HTH,
Wes
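To make the threshold explanation above concrete, here is a rough sketch of the comparison being described. The threshold values are assumptions chosen only to mirror the "2 of 4 slots" example; the actual CIMC logic and its per-version values are not given in this thread, and `redundancy_fault` is a hypothetical name.

```python
def redundancy_fault(present_psus: int, required_psus: int) -> bool:
    """Return True when fewer PSUs are present than the redundancy threshold requires."""
    if present_psus < 0 or required_psus < 0:
        raise ValueError("PSU counts must be non-negative")
    return present_psus < required_psus


if __name__ == "__main__":
    present = 2  # PSU1 and PSU2 installed, PSU3 and PSU4 missing

    # Assumed thresholds, for illustration only (not taken from Cisco documentation).
    older_firmware_threshold = 2
    newer_firmware_threshold = 4

    print("older firmware raises fault:", redundancy_fault(present, older_firmware_threshold))
    print("newer firmware raises fault:", redundancy_fault(present, newer_firmware_threshold))
```

Under these assumed values the same two-PSU hardware is quiet before the upgrade and flagged after it, which matches the behavior reported in this thread.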
11-20-2015 12:17 AM
Hi Srahmedcisco,
If you could upload your show-tech-support logs, that would greatly help in identifying this issue. On the surface, it looks like you have a hardware failure on a processor and an issue with the power supply.
While uploading the logs, could you try the following steps to see if they clear the errors?
1) Turn off the server and open the cover.
2) Remove all CPUs and all DIMMs from the server.
3) Install only 1 CPU and 1 DIMM (the minimum depends on the server specification), then boot the server. If it boots, this CPU is fine.
4) Remove this CPU, install the second CPU with one DIMM, and boot again. If it boots, the second CPU is fine.
5) If the server does not boot in any of these configurations, the motherboard has failed.
6) Also reseat the power supplies and see if that clears fault F0743, depending on how many PSUs you are running.
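The decision logic behind steps 3 to 5 can be sketched as follows. This is only an illustration of the reasoning in this post, assuming you record whether the server booted with each CPU installed alone; `diagnose` and the CPU labels are hypothetical, and a real diagnosis should still go through TAC.

```python
from typing import Dict


def diagnose(boot_results: Dict[str, bool]) -> str:
    """Interpret single-CPU boot tests.

    boot_results maps a CPU label to True if the server booted with
    only that CPU (and one DIMM) installed.
    """
    if not boot_results:
        raise ValueError("no boot tests recorded")

    failed = [cpu for cpu, booted in boot_results.items() if not booted]

    if not failed:
        return "all tested CPUs booted individually"
    if len(failed) == len(boot_results):
        # Per step 5: nothing boots in any configuration.
        return "no configuration booted: suspect motherboard"
    return "suspect CPU(s): " + ", ".join(failed)


if __name__ == "__main__":
    print(diagnose({"CPU1": True, "CPU2": False}))
    print(diagnose({"CPU1": False, "CPU2": False}))
```

Swapping in a different DIMM for the failing case is a reasonable extra check before concluding anything, since the test above cannot tell a bad DIMM from a bad CPU.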
11-20-2015 07:22 AM
This is a duplicate of : https://supportforums.cisco.com/discussion/12710116/caterrn-catastrophic-fault-has-occurred-one-processors-please-check-processors
-Kenny