07-24-2012 04:38 AM - edited 03-01-2019 10:31 AM
Hi,
I know this error used to be common a while back, but I'm running 2.0(2q) with four apparently healthy PSUs per chassis in N+1 mode, consuming less than 800 watts across four B200 M2 blades. Help?
Thanks,
Hamish
07-25-2012 12:23 PM
Can you paste the output of the following commands:
scope chassis x
show psu detail
show psu-control detail
show fault
Regards,
Robert
07-27-2012 04:32 PM
Hi Hamish,
If you access UCSM, do you see any PSU with an N/A status?
Does UCSM report any alerts such as upper non-recoverable, thermal alerts, or power redundancy lost?
Do you see any amber light on the power supplies?
You can reseat the power supplies one by one in order to see if they come back online.
Sometimes the chassis can generate thermal alerts and it can be related to a fan or even the IOM.
Go ahead and check the status of the power supplies physically and on UCSM.
Also collect the information from the commands Robert suggested.
05-15-2013 02:08 AM
Hello,
I'm having the same issue in 2.0(4d). I have a UCS system with two 6120XP fabric interconnects and three 5108 chassis.
The system is configured for N+1 redundancy:
show psu-control detail
Psu Control:
Redundancy: NPlus1
Input Power: Ok
Output Power: Ok
Cluster Power: Slot 2 Master
Overall Status: Failed
Config Error: Redundancy Lost
C61UCSscto01-B /chassis # show fault
Severity Code Last Transition Time ID Description
--------- -------- ------------------------ -------- -----------
Major F0408 2013-05-14T18:33:53.694 740475 Power state on chassis 1 is redundancy-failed
show psu detail
PSU:
PSU: 1
Overall Status: Operable
Operability: Operable
Threshold Status: OK
Power State: On
Presence: Equipped
Thermal Status: OK
Voltage Status: OK
Product Name: 2500W 200-240VAC PSU for UCS 5108 Blade Server Chassis
PID: N20-PAC5-2500W
VID: V00
Vendor: Cisco Systems Inc
Serial (SN): DTM1616016M
HW Revision: 0
PSU: 2
Overall Status: Operable
Operability: Operable
Threshold Status: OK
Power State: On
Presence: Equipped
Thermal Status: OK
Voltage Status: OK
Product Name: 2500W 200-240VAC PSU for UCS 5108 Blade Server Chassis
PID: N20-PAC5-2500W
VID: V00
Vendor: Cisco Systems Inc
Serial (SN): DTM161703MM
HW Revision: 0
PSU: 3
Overall Status: N/A
Operability: N/A
Threshold Status: N/A
Power State: PwrSave
Presence: Equipped
Thermal Status: OK
Voltage Status: N/A
Product Name: 2500W 200-240VAC PSU for UCS 5108 Blade Server Chassis
PID: N20-PAC5-2500W
VID: V00
Vendor: Cisco Systems Inc
Serial (SN): DTM161703L8
HW Revision: 0
PSU: 4
Overall Status: Operable
Operability: Operable
Threshold Status: OK
Power State: On
Presence: Equipped
Thermal Status: OK
Voltage Status: OK
Product Name: 2500W 200-240VAC PSU for UCS 5108 Blade Server Chassis
PID: N20-PAC5-2500W
VID: V00
Vendor: Cisco Systems Inc
Serial (SN): DTM170304LJ
HW Revision: 0
Any clue?
Regards,
Javier
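For anyone comparing several chassis, output like the above can be checked with a small script. The following is a minimal Python sketch, assuming the `show psu detail` text has been saved or pasted as a string; the function names `parse_psu_detail` and `non_operable` are hypothetical, and the sample is an abbreviated copy of the fields posted above. It only lists PSUs whose Overall Status is not Operable. It does not decide whether that state is expected, which this thread does not establish.

```python
def parse_psu_detail(text):
    """Split 'show psu detail' style text into one dict per PSU.

    Assumes each PSU block starts with a line like 'PSU: 1' and that
    every other line is a 'Key: value' pair, as in the posted output.
    """
    psus = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "PSU" and value.isdigit():
            # A numbered 'PSU: n' line opens a new block.
            current = {"PSU": int(value)}
            psus.append(current)
        elif current is not None and value:
            current[key] = value
    return psus


def non_operable(psus):
    """Return (PSU number, Power State) for PSUs not reported Operable."""
    return [
        (p["PSU"], p.get("Power State"))
        for p in psus
        if p.get("Overall Status") != "Operable"
    ]


# Abbreviated sample taken from the output posted in this thread.
SAMPLE = """\
PSU:
PSU: 1
Overall Status: Operable
Power State: On
PSU: 2
Overall Status: Operable
Power State: On
PSU: 3
Overall Status: N/A
Power State: PwrSave
PSU: 4
Overall Status: Operable
Power State: On
"""

flagged = non_operable(parse_psu_detail(SAMPLE))
print(flagged)  # [(3, 'PwrSave')]
```
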
05-15-2013 03:13 AM
Hi,
Not sure why this is the first reply I've received a notification for.
I opened a case with Cisco on this and was advised to change the policy to Grid, then back to N+1. This cleared the error for me.
Hope this helps.
Hamish
05-15-2013 05:05 AM
The problem might be with the subordinate IOM. The active one, which returned the outputs requested above, looks good.
Please do the following from the UCSM CLI:
ssh to fabric interconnect A
connect iom x (x = chassis # showing redundancy lost)
show platform software cmcctrl power redundancy
ssh to fabric interconnect B
connect iom x
show platform software cmcctrl power redundancy
Thanks,
Robert
05-15-2013 08:54 AM
Hi Robert,
Here is the output of the command:
FabricB (active)
fex-1# show platform software cmcctrl power redundancy
==============================
Last update TS : 1718362
Stale TS : 1718422
Now : 1718378
Cluster master : yes
Policy : N+1
State : Lost
Total power available : 7500
Total power usage : 1713
Power budget requested : 5472
-----------
Grid : 0
Active PS : 0 1 3
Spare PS : 2
Unavailable PS :
-----------
==============================
FabricA (subordinate)
fex-1# show platform software cmcctrl power redundancy
==============================
Last update TS : 1718486
Stale TS : 1718546
Cluster master : no
Policy : N+1
State : Lost
Total power available : 7500
Total power usage : 1732
Power budget requested : 5472
-----------
Grid : 0
Active PS : 0 1 3
Spare PS : 2
Unavailable PS :
-----------
==============================
Thanks!
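To compare the two IOM outputs side by side, the fields can be pulled into a dictionary. This is a rough Python sketch, assuming the `Key : value` layout shown above; `parse_redundancy` and `summarize` are hypothetical helper names, and the sample is the Fabric B output from this post. The two headroom figures are plain subtraction on the posted numbers. How the chassis management controller actually derives `State` is not described in this thread, so the sketch makes no claim about it.

```python
def parse_redundancy(text):
    """Parse 'Key : value' lines from the cmcctrl power redundancy output.

    Lines without a colon (separators, the prompt line) are skipped.
    Fields ending in ' PS' become lists of ints; other all-digit
    values become ints; everything else stays a string.
    """
    fields = {}
    for raw in text.splitlines():
        key, sep, value = raw.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key.endswith(" PS"):
            fields[key] = [int(tok) for tok in value.split()]
        elif value.isdigit():
            fields[key] = int(value)
        else:
            fields[key] = value
    return fields


def summarize(fields):
    """Collect the values most relevant when comparing two IOMs."""
    available = fields["Total power available"]
    return {
        "state": fields["State"],
        "active_count": len(fields["Active PS"]),
        "spare_count": len(fields["Spare PS"]),
        "headroom_vs_budget": available - fields["Power budget requested"],
        "headroom_vs_usage": available - fields["Total power usage"],
    }


# Fabric B output as posted above.
FABRIC_B = """\
Cluster master : yes
Policy : N+1
State : Lost
Total power available : 7500
Total power usage : 1713
Power budget requested : 5472
-----------
Grid : 0
Active PS : 0 1 3
Spare PS : 2
Unavailable PS :
-----------
"""

summary = summarize(parse_redundancy(FABRIC_B))
print(summary)
```

For the Fabric B sample this gives three active supplies, one spare, 2028 of headroom against the requested budget, and 5787 against current usage, while `State` is still `Lost`.
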
05-15-2013 12:48 PM
Have you tried to change the power policy to non-redundant and then back to N+1?
This is not disruptive.
05-15-2013 02:45 PM
Hi,
Yes, we did, but it didn't work for us.
Regards,
Javier
05-16-2013 04:27 PM
Javier,
Please open a TAC case and attach the chassis tech support bundles (one for each chassis) and the UCSM tech support bundle to the case.
-Kenny
05-16-2013 08:53 PM
Please open a TAC case as Kenny suggested, but also look into the following known defect and apply the workaround if applicable:
05-19-2013 11:50 PM
Hi,
According to the Bug Toolkit, CSCub53747 should be fixed in 2.0(4a). We're now on 2.0(4d).
We've opened a TAC case. I'll post the conclusions here.
Thanks!
Javier
05-29-2013 11:38 PM
Hi Javier,
Did TAC provide you with a workable solution? I seem to have the same issue on my end. In my case it's a UCS 5108 with four power supplies connected, set to N+1.
05-30-2013 01:13 AM
Hi,
Not yet. We're still working on it, but TAC recommends upgrading to 2.0(5b). These bugs seem to be hitting us:
CSCue49366, CSCud48637/CSCue33889.
Javier
06-21-2013 11:51 AM
Hi Javier,
Have you solved the issue yet?
I'm having the same problem, and I'm on 2.1(1e).
Regards,
Bruno