
Errors with UCS

v-miamos
Level 1

I'm in the process of building out a large number of chassis with UCS B200 M3 blades (I'm on chassis 10 of 40), and I just encountered an error I haven't seen before. I can't find a document that explains what the error code means:

 

"IERR:Sensor Failure Asserted;"

Details - 
<computeHealthLedSensorAlarm
alarmDesc="Sensor Failure Asserted"
alarmSeverity="minor"

dn="sys/chassis-3/blade-4/health-led/sensor-alarm-153"

sensorId="153"
sensorName="IERR"
>
</computeHealthLedSensorAlarm>
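
When the same alarm shows up on several blades, it can help to pull the location and sensor name out of each alarm record rather than reading the XML by eye. The following is a minimal Python sketch using only the standard library; the function name `summarize_alarm` is hypothetical, and it assumes each alarm arrives as a standalone XML element shaped like the one above (with the chassis and blade encoded in the `dn` attribute).

```python
import xml.etree.ElementTree as ET

# The alarm exactly as posted above (blank lines removed).
ALARM_XML = """
<computeHealthLedSensorAlarm
alarmDesc="Sensor Failure Asserted"
alarmSeverity="minor"
dn="sys/chassis-3/blade-4/health-led/sensor-alarm-153"
sensorId="153"
sensorName="IERR"
>
</computeHealthLedSensorAlarm>
"""


def summarize_alarm(xml_text):
    """Return the sensor, severity, and chassis/blade location of one alarm.

    Hypothetical helper: assumes the dn has the form
    sys/chassis-N/blade-M/... as in the alarm above.
    """
    root = ET.fromstring(xml_text.strip())
    summary = {
        "sensor": root.attrib.get("sensorName"),
        "sensor_id": root.attrib.get("sensorId"),
        "severity": root.attrib.get("alarmSeverity"),
        "description": root.attrib.get("alarmDesc"),
        "chassis": None,
        "blade": None,
    }
    # Walk the dn path and pick out the chassis-N and blade-M components.
    for part in root.attrib.get("dn", "").split("/"):
        name, sep, value = part.partition("-")
        if sep and name in ("chassis", "blade"):
            summary[name] = value
    return summary


if __name__ == "__main__":
    print(summarize_alarm(ALARM_XML))
```

Running this over the alarms from every affected blade makes it easy to see whether the IERR sensor is tied to particular slots or follows the swapped components.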

 

It's reporting as "minor", but it's causing my server builds on these three blades to hang.

I'm reasonably certain it's not the chassis, as this chassis was running a set of M2 blades before I started this upgrade process.  This is also the second set of M3 blades I've had in the slots with issues, so I'm reasonably certain it's not the blades themselves.

 

What does this error code reference?  One possibility is memory, as that's the one component swapped over when I changed out the blades; but generally when there's an issue with memory it gets called out explicitly with which slot(s) are having problems rather than the error code above.

 

Any ideas?

5 Replies

Wes Austin
Cisco Employee

This code is indicative of a processor error. If you are seeing this on a single blade, it may require a motherboard replacement.

Which sensor does that error code track to?

 

And also, I replaced the physical blade with another one, and I'm getting the same error code occurring.  It seems odd that I'd suddenly have 6 bad blades with the same error codes.

Please do send me what that alert maps to if you can, but I've traced the issue to problems with some of the installed memory.

 

Actually, if there's a reference guide somewhere that just maps all of the error codes to sources that would be great; I could just refer to that when/if I encounter any more errors.

An IERR is a processor error, sometimes indicative of a failed hardware component such as the system board.

 

How to Recover from an IERR for Intel® Server Boards

 

What am I seeing?
An IERR is a Processor Internal Error.

Why am I seeing it?
An IERR is a signal that indicates an unrecoverable processor error. A non-CPU event, such as a system bus interruption or a memory interruption, can also trigger this signal.

How to fix it
On the Intel® Server Boards listed at the bottom of this page, you can confirm or discard a Processor IERR from the Basic Input/Output System (BIOS) Setup Utility under Advanced > Processor Configuration > CPU Retest.

The IERR Filtering Algorithm helps you determine if the IERR signal came from a false CPU internal error or from another hardware source. This filtering algorithm helps you prevent unnecessary processor replacements. At the same time, this algorithm helps you to isolate IERR events. If the IERR returns after the CPU Retest, the IERR signal most likely came from the CPU itself. If you have more than one processor installed, check the System Event Log (SEL) to find out which processor is generating the IERR.

In some cases, a system restart can also eliminate an IERR.

I've tried checking the SEL and doing a system restart with no success. What else can I try?
If the problem persists:

  1. Try to start the system with one processor at a time.
  2. Test another processor if possible.
  3. Remove and reinstall the memory.

 

Please ensure you are installing memory that is supported and configured per the B200 M3 spec sheet; otherwise, unexpected behavior may occur. I agree that if you are seeing this across multiple servers, it is likely related to memory configuration.

MA Khatri
Level 1

Hello, 

I am having the same issue, and after putting a different server in the same chassis slot I got the same error.

I would appreciate it if you could share how you were able to solve this issue.

Thank You, 
