05-13-2025 01:54 PM
Hello Everyone,
I am receiving the following error in Cisco UCS Manager:
<faultInst
ack="yes"
cause="health-led-amber-blinking"
changeSet=""
code="F1236"
created="2023-06-13T21:20:13Z"
descr="sys/rack-unit-4/health-led shows error. Reason DDR4_P1_A1_ECC:Sensor Threshold Crossed; "
highestSeverity="critical"
id="1696528"
lastTransition="2023-06-13T21:23:27Z"
lc="none"
occur="2"
origSeverity="critical"
prevSeverity="cleared"
rule="equipment-health-led-critical-error"
severity="critical"
tags="server"
type="equipment"
dn="sys/rack-unit-4/health-led/fault-F1236"
status="created"
sacl="addchild,del,mod">
</faultInst>
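For anyone who wants to pull the key fields out of a fault record like the one above, here is a minimal Python sketch using only the standard library. It assumes the XML has been pasted in as a string (trimmed here to the attributes used), and the helper name `summarize_fault` is hypothetical, not part of any Cisco tool.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the fault above; attributes not needed for the summary are left out.
FAULT_XML = '''<faultInst
 ack="yes"
 cause="health-led-amber-blinking"
 code="F1236"
 descr="sys/rack-unit-4/health-led shows error. Reason DDR4_P1_A1_ECC:Sensor Threshold Crossed; "
 severity="critical"
 occur="2"
 dn="sys/rack-unit-4/health-led/fault-F1236">
</faultInst>'''


def summarize_fault(xml_text):
    """Return the fields most useful for triage (hypothetical helper)."""
    node = ET.fromstring(xml_text)
    return {
        "code": node.get("code"),
        "severity": node.get("severity"),
        "cause": node.get("cause"),
        "dn": node.get("dn"),
        "occurrences": int(node.get("occur", "0")),
        "descr": node.get("descr", "").strip(),
    }


if __name__ == "__main__":
    for key, value in summarize_fault(FAULT_XML).items():
        print(f"{key}: {value}")
```

The `descr` field is usually the one to read first, since it names the sensor that triggered the fault.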
05-13-2025 02:50 PM
Hi Hami,
It appears that one of your DIMMs, DDR4_P1_A1, has crossed the threshold for correctable ECC errors. You can find more details in the server SEL logs. It looks like the DIMM has to be replaced. If you have an active Cisco support contract, you can open a Cisco SR to RMA the DIMM.
Derek
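As a rough illustration of how the sensor name in the fault description maps to a DIMM position, the Python sketch below extracts it with a regular expression. The pattern is based only on the single description string in this thread. Reading `P1` as processor 1 is an assumption, and `dimm_from_descr` is a hypothetical helper name.

```python
import re

DESCR = ("sys/rack-unit-4/health-led shows error. "
         "Reason DDR4_P1_A1_ECC:Sensor Threshold Crossed; ")

# Assumed naming scheme: DDR4_P<cpu>_<slot>_ECC, e.g. DDR4_P1_A1_ECC.
SENSOR_RE = re.compile(
    r"Reason\s+(?P<sensor>DDR4_P(?P<cpu>\d+)_(?P<slot>[A-Z]\d+)_ECC):"
)


def dimm_from_descr(descr):
    """Return sensor, CPU number and slot label, or None if the text does not match."""
    match = SENSOR_RE.search(descr)
    if match is None:
        return None
    return {
        "sensor": match.group("sensor"),
        "cpu": int(match.group("cpu")),   # assumption: P1 means processor 1
        "slot": match.group("slot"),
    }


if __name__ == "__main__":
    print(dimm_from_descr(DESCR))
```

Descriptions that use a different sensor naming scheme simply return `None`, so the pattern would need adjusting for other hardware.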
05-14-2025 07:27 AM
Hi,
How can I check whether we have an active Cisco support contract?
05-15-2025 07:01 AM
1. If you log a support call with Cisco and enter the server serial number, it should tell you whether the server is still under maintenance.
2. You can check device coverage by entering the serial number at https://cway.cisco.com/sncheck/
05-15-2025 10:20 AM
Hi,
I checked, and it's not under warranty.
I am also having these errors.
Server 4 (service profile: org-root/org-server/org-VoIP/ls-Storage1) health: inoperable
RAID Battery on server 4 operability: inoperable. Reason: BBU has failed, needs replacement
DIMM DIMM_A1 on server 4 operability: inoperable
sys/rack-unit-4/health-led shows error. Reason DDR4_P1_A1_ECC:Sensor Threshold Crossed;
Does anyone know how I can fix this error? I don't have any experience with UCS, so any help would be really appreciated.
Thank you
05-16-2025 01:10 AM
The error messages for failing hardware are usually fairly self-explanatory:
RAID Battery on server 4 operability: inoperable. Reason: BBU has failed, needs replacement
= replace the battery on the RAID controller or swap with a known good one.
DIMM DIMM_A1 on server 4 operability: inoperable
= replace the DIMM in the failed position. Sometimes reseating the DIMM, or swapping it with a DIMM from another slot and reseating both, helps. Also have a look at "Troubleshoot DIMM Memory Issues in UCS".
05-16-2025 06:52 AM
Hi,
Thanks Riaan.
Do you know how I can enable SSH? I have access to the UCS GUI, but I don't know the SSH password. Can I reset it or set a new password on UCS?
05-16-2025 08:15 AM
Hi,
I finally found the issue: the RAM in DIMM A1 is not working anymore, so all of the following errors are due to this.
Error: DIMM DIMM_A1 on server 4 operability: inoperable
Code: F0185
Error: sys/rack-unit-4/health-led shows error. Reason DDR4_P1_A1_ECC:Sensor Threshold Crossed;
Code: F1236
Error: Server 4 (service profile: org-root/org-server/org-VoIP/ls-Storage1) health: inoperable
Code: F0317
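To show why these three faults point at one root cause, here is a small Python sketch that groups fault codes by the server number found in each description. The fault list is copied from this post; the regular expression and the helper name `group_by_server` are illustrative assumptions, not a UCS Manager feature.

```python
import re
from collections import defaultdict

# (code, description) pairs copied from the errors listed above.
FAULTS = [
    ("F0185", "DIMM DIMM_A1 on server 4 operability: inoperable"),
    ("F1236", "sys/rack-unit-4/health-led shows error. "
              "Reason DDR4_P1_A1_ECC:Sensor Threshold Crossed;"),
    ("F0317", "Server 4 (service profile: org-root/org-server/org-VoIP/ls-Storage1) "
              "health: inoperable"),
]

# Matches "server 4" / "Server 4" as well as "rack-unit-4".
SERVER_RE = re.compile(r"(?:server |rack-unit-)(\d+)", re.IGNORECASE)


def group_by_server(faults):
    """Map server number -> list of fault codes (None if no server is named)."""
    groups = defaultdict(list)
    for code, descr in faults:
        match = SERVER_RE.search(descr)
        key = int(match.group(1)) if match else None
        groups[key].append(code)
    return dict(groups)


if __name__ == "__main__":
    print(group_by_server(FAULTS))
```

When every code lands under the same server, as here, it is worth looking for a single failing component before treating each fault separately.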
Now I just need a couple of things.
First, how can I enable SSH from the UCS GUI?
Second, how can I shut down this server gracefully?
Thank you