P_CATERR_N Processor Error
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
03-12-2015 02:23 AM
Hello,
We had a Server Shutdown on UCS C220.
In the IMC Console we see this
[F0174][critical][equipment-inoperable][sys/rack-unit-1/board] P_CATERR_N: A catastrophic fault has occurred on one of the processors: Please check the processors' status.
P_CATERR_N: Processor sensor, Predictive Failure asserted
I searched web for
P_CATERR_N, but could not find anything.
What is wrong ? Any idea ? Thanks a lot
Armin
- Labels:
-
Server Networking

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
03-12-2015 09:55 AM
Hi,
Well, this is not good sign. I have seen this ERROR leads to resolution as either firmware upgrade or RMA. Please open a TAC case for it.
- Ashok
******************************************************************************************************
Please rate the post or mark as correct answer as it will help others looking for similar information
******************************************************************************************************

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
03-12-2015 08:37 PM
P_CATERR-N means a Processor Catastrophic Error on your server... Sometimes this errors show up during server POST and then go away the next second; so the best advice is to open a TAC case and see if your crash matches the time CATERR error in the logs so we can tell you if that is the real cause of the reboot/shutdown.
-Kenny
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
11-07-2017 02:02 PM - edited 11-07-2017 02:04 PM
Anyone ever have success working around/through this? Running Cisco ESXi 6.5 on our UCSC-C240-M3S with BIOS Version: C240M3.3.0.3a.0 (Build Date: 03/15/17)
This is almost certainly a cause of attempting to passthrough a single PCI device -- nVidia Quadro 2000 (we have three equipped, just trying to get a single GPU). Thank you kindly for your attention to our little matter.
CIMC shows a EQUIPMENT_INOPERABLE Fault [0174][critical][equipment-inoperable][sys/rack-unit-1/board] P_CATERR_N: A catastrophic fault has occurred on one of the processors: Please check the processors' status... which is then immediately resolved upon power cycling the machine. Unfortunately, it takes everything down with it. The single node is brought to its knees.
I have attempted to refer to this documentation: https://www.cisco.com/c/en/us/td/docs/unified_computing/ucs/c/sw/fault/reference/guide/Cisco_UCS_C-Series_Servers_CIMC_Faults/CIMC_Faults.html
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
11-08-2017 09:48 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
11-07-2018 05:12 AM
@peeat I'm facing the same issue. Could you solved your issue.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
02-12-2019 02:12 AM - edited 02-12-2019 02:16 AM
No sir, @Ali Amir I was never able to successfully work through this. I was able to get a Windows 10 VM to brief support the same/similar (single Quadro 2000 -- my Cisco rack server has three of these installed in it, perfect for VDI deployment) in a custom HPE 6.7 ESXi build, but that only lasted until I rebooted the VM, now I'm getting the entire host hanging and other odd behaviors. If I wait a very long time, the machine finally comes up and I can access it remotely, but if you were to stand there locally and watch the screen, the progress bar gets stuck and never appears to finish loading.
Trying to update firmware on my C240-M3S to latest version 3.0(4j) and hopefully try again with CISCO Custom Image for ESXi 6.7 U1 GA. Will keep you posted if I make any progress, sir. Please do the same if you have found a resolution. Hate having three GPUs in this machine taking up space and wasting electricity with no ability to properly utilize them. I know they are working find, as I booted the machine into Windows and was able to apply drivers and test all three cards independently, so it's definitely an issue with either VMware's product, or more likely, my configuration.
EDIT: I can tell you that simply rebooting the host resolves the "catastrophic" failure, but boy that certainly scared me the first time I saw it come up and wasn't sure what I had done, if I had truly messed up. Thankfully it was just related to the PCI Passthrough. Will be looking to invest some time in the coming days/weeks to hopefully give it another go and see if we can work around the issue.
There was this thread that I attempted to refer to, it seemed to have significantly more information than anyone around here was able to provide: https://forums.servethehome.com/index.php?threads/troubleshooting-gpu-passthrough-esxi-6-5.12631/
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
05-15-2019 03:50 AM
This appeared on one of our servers, however we do not utilize GPU cards within the systems, we are utilizing the latest firmware for the systems (ucs-c240m5-huu-4.0.2f) and ESXi6.5U2 Custom ISO for Cisco (VMware-ESXi-6.5.0-9298722-Custom-Cisco-6.5.2.2). Before opening a TAC case I wanted to see if anyone came to a resolution.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
09-15-2019 03:49 PM
@brian-henry my apologies I missed your reply. Sadly, I wasn't ever able to resolve the issue, despite being able to replicate it on demand -- i got sick of crashing the entire host, so I just gave up and stopped messing with them and the machine has continued to operate wonderfully (as long as I'm not tinkering with PCI Passthrough). I've heard others talk about temperature sensors going haywire and it causing random issues, but I'm hardly an expert on such things.
Hopefully in the months since you've been able to resolve your issue(s).
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
07-21-2022 11:19 PM
I am facing the same issue on HX 220C M5SX Firmware Version: 4.1(1d)
2022-07-14 14:42:23 Warning IERR: Processor sensor, Predictive Failure asserted was asserted
2022-07-14 14:42:23 Warning System Software event: Node manager sensor sensor, Record type: 02, Sensor Number: 17, Event Data: A0 0D 03, Raw SEL: 2c 60 04 DC 17 75 A0 0D 03 was asserted.
