07-24-2012 04:38 AM - edited 03-01-2019 10:31 AM
Hi,
I know this error used to be common a while back, but I'm running 2.0(2q) with four apparently healthy PSUs per chassis in N+1 mode, consuming less than 800 watts across four B200 M2 blades. Help?
Thanks,
Hamish
06-21-2013 01:47 PM
Bruno,
Could you please run the following commands and attach them here:
*connect local a
*connect iom # <<< # = number of the chassis where you see the power problem
*show platform soft cmc thermal status
*show platform soft cmc power redundancy
Next
*connect local b
*connect iom # <<< again same chassis number
*show platform soft cmc power redundancy
-Kenny
06-21-2013 03:29 PM
Hi Kenny,
See the attached files regarding Fabric-A and Fabric-B
FI-BERNA-A /chassis # show psu-control detail
Psu Control:
Redundancy: NPlus1
Input Power: Ok
Output Power: Ok
Cluster Power: Slot 1 Master
Overall Status: Failed
Config Error: Redundancy Lost
FI-BERNA-A /chassis # show fault
Severity Code Last Transition Time ID Description
--------- -------- ------------------------ -------- -----------
Major F0408 2013-06-20T11:03:06.262 341965 Power state on chassis 2 is redundancy-failed
FI-BERNA-A /chassis # show psu detail
PSU:
PSU: 1
Overall Status: Operable
Operability: Operable
Threshold Status: OK
Power State: On
Presence: Equipped
Thermal Status: OK
Voltage Status: OK
Product Name: Platinum AC PSU for N20-C6508 Blade Server Chassis
PID: UCSB-PSU-2500ACPL
VID: V00
Vendor: Cisco Systems Inc
Serial (SN): DTM163000M7
HW Revision: 0
Firmware Version: N/A
PSU: 2
Overall Status: Operable
Operability: Operable
Threshold Status: N/A
Power State: PwrSave
Presence: Equipped
Thermal Status: OK
Voltage Status: N/A
Product Name: Platinum AC PSU for N20-C6508 Blade Server Chassis
PID: UCSB-PSU-2500ACPL
VID: V00
Vendor: Cisco Systems Inc
Serial (SN): DTM163000MT
HW Revision: 0
Firmware Version: N/A
PSU: 3
Overall Status: Operable
Operability: Operable
Threshold Status: OK
Power State: On
Presence: Equipped
Thermal Status: OK
Voltage Status: OK
Product Name: Platinum AC PSU for N20-C6508 Blade Server Chassis
PID: UCSB-PSU-2500ACPL
VID: V00
Vendor: Cisco Systems Inc
Serial (SN): DTM162900A0
HW Revision: 0
Firmware Version: N/A
PSU: 4
Overall Status: Operable
Operability: Operable
Threshold Status: OK
Power State: On
Presence: Equipped
Thermal Status: OK
Voltage Status: OK
Product Name: Platinum AC PSU for N20-C6508 Blade Server Chassis
PID: UCSB-PSU-2500ACPL
VID: V00
Vendor: Cisco Systems Inc
Serial (SN): DTM162900A1
HW Revision: 0
Firmware Version: N/A
Kind regards,
Bruno Fernandes
06-21-2013 03:51 PM
Bruno,
Thanks for the information.
So PSU 2 is in power-save mode:
PSU: 2
Overall Status: Operable
Operability: Operable
Power State: PwrSave <<<
From the Active IOM, I can see this:
fex-1# show platform software cmcctrl power redundancy
==============================
Cluster master : yes <<< Shows we are on the primary IOM
Policy : N+1
State : Lost <<< This is the only problem, because the PSU is fine
Total power available : 7500 <<< 3 PSUs available
Total power usage : 856 <<< 1 PSU is more than enough to cover this
Power budget requested : 5472 <<< However, the chassis asks for 3 PSUs to be active; this is not an expected value
-----------
Grid : 0
Active PS : 0 2 3
Spare PS : 1 <<<< 1 is actually PSU 2, which shows up in power save mode
Unavailable PS :
-----------
==============================
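The figures in the redundancy output above can be sanity-checked with a little arithmetic. The sketch below is a simplified model, not the actual CMC logic: it assumes each PSU contributes 2500 W (inferred from the UCSB-PSU-2500ACPL PID and the 7500 W reported for three PSUs), and the function names are made up for illustration.

```python
import math

# Assumed per-PSU rating in watts (7500 W / 3 PSUs in the output above).
PSU_CAPACITY_W = 2500


def psus_needed(load_w: int, capacity_w: int = PSU_CAPACITY_W) -> int:
    """Smallest number of PSUs whose combined capacity covers load_w."""
    return math.ceil(load_w / capacity_w)


def n_plus_1_ok(load_w: int, installed: int, capacity_w: int = PSU_CAPACITY_W) -> bool:
    """N+1 holds when at least one PSU beyond the minimum is installed."""
    return installed >= psus_needed(load_w, capacity_w) + 1


usage_w = 856     # "Total power usage"
budget_w = 5472   # "Power budget requested"
installed = 4     # PSUs equipped in the chassis

print(psus_needed(usage_w))              # 1
print(psus_needed(budget_w))             # 3
print(n_plus_1_ok(budget_w, installed))  # True
```

Under these assumptions, even the inflated 5472 W budget needs only three PSUs, so four operable PSUs should still satisfy N+1. That is consistent with the "Lost" state being a stale or incorrect status rather than a real shortage.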
Actions suggested:
1-Change the power policy from N+1 to Grid and vice versa
2-Follow the instructions in the bug CSCty64894 (Note those steps are not disruptive)
http://tools.cisco.com/Support/BugToolKit/search/getBugDetails.do?method=fetchBugDetails&bugId=CSCty64894
I hope this helps, otherwise, let us know.
-Kenny
06-21-2013 04:15 PM
Kenny,
Just to confirm with you:
Regarding the suggested actions:
1-Change the power policy from N+1 to Grid and vice versa
2-Follow the instructions in the bug CSCty64894 (Note those steps are not disruptive)
Neither step 1 nor step 2 is disruptive, correct? Step 2 is stated to be non-disruptive. For step 1, I see no reason for it to be disruptive, but I'm not 100% confident. Sorry for the basic question, but I have no spare UCS to confirm on and this is already in production, so I need to be 100% confident.
Kind regards,
Bruno
06-21-2013 04:18 PM
Bruno,
Totally safe steps; no disruption whatsoever, since all your PSUs show up as operable.
-Kenny
06-22-2013 02:49 AM
Hi Kenny,
I did both steps with no result, but this morning I just repeated step 1 and waited a little longer, and the fault cleared. The chassis also recovered to a healthy state (regarding power redundancy). However, the same PSU still shows a strange result for "Threshold Status" and "Voltage Status".
Could this be because we are using N+1, and in this case only 3 PSUs are in use?
FI-BERNA-B /chassis # show psu detail
PSU:
PSU: 1
Overall Status: Operable
Operability: Operable
Threshold Status: OK
Power State: On
Presence: Equipped
Thermal Status: OK
Voltage Status: OK
Product Name: Platinum AC PSU for N20-C6508 Blade Server Chassis
PID: UCSB-PSU-2500ACPL
VID: V00
Vendor: Cisco Systems Inc
Serial (SN): DTM163000M7
HW Revision: 0
Firmware Version: N/A
PSU: 2
Overall Status: Operable
Operability: Operable
Threshold Status: N/A
Power State: On
Presence: Equipped
Thermal Status: OK
Voltage Status: N/A
Product Name: Platinum AC PSU for N20-C6508 Blade Server Chassis
PID: UCSB-PSU-2500ACPL
VID: V00
Vendor: Cisco Systems Inc
Serial (SN): DTM163000MT
HW Revision: 0
Firmware Version: N/A
PSU: 3
Overall Status: Operable
Operability: Operable
Threshold Status: OK
Power State: On
Presence: Equipped
Thermal Status: OK
Voltage Status: OK
Product Name: Platinum AC PSU for N20-C6508 Blade Server Chassis
PID: UCSB-PSU-2500ACPL
VID: V00
Vendor: Cisco Systems Inc
Serial (SN): DTM162900A0
HW Revision: 0
Firmware Version: N/A
PSU: 4
Overall Status: Operable
Operability: Operable
Threshold Status: OK
Power State: On
Presence: Equipped
Thermal Status: OK
Voltage Status: OK
Product Name: Platinum AC PSU for N20-C6508 Blade Server Chassis
PID: UCSB-PSU-2500ACPL
VID: V00
Vendor: Cisco Systems Inc
Serial (SN): DTM162900A1
HW Revision: 0
Firmware Version: N/A
Kind regards,
Bruno Fernandes
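For anyone comparing several chassis, the `show psu detail` output above can be scanned for the same symptom (a PSU that is powered on but reports N/A for threshold and voltage). This is only a rough sketch: it assumes the plain "Key: Value" layout shown in this thread, and the function names are hypothetical, not part of any Cisco tool.

```python
def parse_psu_detail(text: str) -> dict[int, dict[str, str]]:
    """Parse 'show psu detail' text into {psu_number: {field: value}}."""
    psus: dict[int, dict[str, str]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if ":" not in line:
            continue                  # skip prompts and blank lines
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "PSU":
            if value.isdigit():       # "PSU: 2" starts a new record
                current = psus.setdefault(int(value), {})
            continue                  # a bare "PSU:" is just the section header
        if current is not None:
            current[key] = value
    return psus


def suspicious(psus: dict[int, dict[str, str]]) -> list[int]:
    """PSU numbers that are powered on but report N/A status fields."""
    return [
        n for n, f in sorted(psus.items())
        if f.get("Power State") == "On"
        and "N/A" in (f.get("Threshold Status"), f.get("Voltage Status"))
    ]


sample = """\
PSU:
PSU: 1
Threshold Status: OK
Power State: On
Voltage Status: OK
PSU: 2
Threshold Status: N/A
Power State: On
Voltage Status: N/A
"""
print(suspicious(parse_psu_detail(sample)))  # [2]
```

A PSU in PwrSave is deliberately not flagged here, on the assumption that N/A readings are expected while the unit is idle as a spare.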
06-24-2013 08:55 AM
Bruno,
Thanks for the feedback, I am glad the power redundancy error message is gone now.
Regarding the power supply not showing all the correct status info, I recommend opening a case. As Javier mentioned, this can be an I2C bus issue, where either the PSU is not able to deliver its status messages or the primary IOM is just not receiving them, but this definitely needs deeper analysis.
Please open a TAC case.
-Kenny
06-23-2013 02:03 PM
Hi Bruno,
Not yet. TAC engineers are still working on the case (625642101). We're waiting for an RMA of the 4 PSUs in one chassis. One PSU seems to be causing errors on the I2C bus. We'll probably upgrade to 2.1 for compatibility with new SAN equipment (and also to fix the bugs that seem to be affecting the system).
Regards
06-21-2013 07:03 PM
Hello,
Please check the mentioned link. I hope it helps.
06-23-2013 10:07 AM
Hi Guys
Just to give everyone an update:
1-Change the power policy from N+1 to Grid and vice versa
Worked for us.
07-25-2013 02:06 PM
Hi,
We recently upgraded to 2.1(2a). All seems to be working fine. Let's see how it behaves from now on...
Regards,
Javier
07-26-2013 12:07 AM
Got a notification only this week regarding the power supplies for the chassis.
UCS B-Series chassis power supplies have an issue which can cause a shutdown when they are activated in a redundancy switchover.
Affected units can be identified by the version and serial number format defined in the link below.
07-26-2013 12:38 AM
Hi,
Thanks for the info. We have 2 chassis potentially affected by this issue. We have to check the deviation label.
Regards
07-26-2013 07:58 AM
Hello All,
If you happen to be affected by this Field Notice, please remember that you need a TAC Service Request number; just reference this FN#. Attaching screenshots or pictures will speed up the process, since TAC will not have to ask for further information.
Also, there is no need to open a separate case for each PSU. Confirm how many of these PSUs have problems, then specify the quantity in the form with the serial numbers separated by commas. If that exceeds 4,000 characters, including blank spaces and commas, you will need to fill out additional forms.
-Kenny
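If you have a long list of affected serial numbers, the 4,000-character limit mentioned above can be handled by packing the list into as few comma-separated strings as possible. A minimal sketch, assuming the limit counts every character including commas; `split_serials` is a hypothetical helper, not part of any Cisco form or tool.

```python
def split_serials(serials, limit=4000, sep=","):
    """Greedily pack serial numbers into strings of at most `limit` characters."""
    forms, current = [], ""
    for sn in serials:
        candidate = sn if not current else current + sep + sn
        if len(candidate) <= limit:
            current = candidate       # still fits in the current form
        else:
            if current:
                forms.append(current) # close the current form
            current = sn              # start a new form with this serial
    if current:
        forms.append(current)
    return forms


serials = ["DTM163000M7", "DTM163000MT", "DTM162900A0", "DTM162900A1"]

# With the real limit, these four serials fit in a single form.
print(len(split_serials(serials)))  # 1

# A tiny limit is used here only to show the splitting behaviour.
print(split_serials(serials, limit=25))
# ['DTM163000M7,DTM163000MT', 'DTM162900A0,DTM162900A1']
```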
05-24-2015 11:31 PM