06-17-2015 04:53 AM - edited 03-01-2019 12:14 PM
We have 4 new UCS chassis in a rack. Three of them show PSUs 2 & 4 in an 'off' state, even though the green light is lit on all 4 PSUs in the chassis. All chassis are plugged into the same PDU in the rear of the rack, so I know we have clean power.
I have tried switching the power policy from Grid to N+1 and back, to no avail. Any tips would be helpful.
Chassis 1:
Overall Status: Power Problem
Operability: Operable
Power State: Ok
Thermal Status: Ok
PSU 1:
Threshold Status: OK
Overall Status: Operable
Operability: Operable
Power State: On
Thermal Status: OK
Voltage Status: OK
PSU 2:
Threshold Status: N/A
Overall Status: N/A
Operability: N/A
Power State: Off
Thermal Status: N/A
Voltage Status: N/A
PSU 3:
Threshold Status: N/A
Overall Status: Operable
Operability: Operable
Power State: On
Thermal Status: OK
Voltage Status: N/A
PSU 4:
Threshold Status: N/A
Overall Status: N/A
Operability: N/A
Power State: Off
Thermal Status: N/A
Voltage Status: N/A
06-17-2015 05:54 AM
We need to begin with the physical layer. Did you reseat the power cables and PSUs already? Have you swapped the cable on an affected PSU with one from a working PSU to see if the issue is with the cables? (Unlikely, but better to rule it out.)
Let me know if it doesn't help.
-Kenny
06-17-2015 06:44 AM
Yes, I reseated PSUs 2 & 4. I have also swapped the cables between PSUs 3 & 4 and even swapped PSUs 1 & 4.
Something appears to be looping, maybe... Here's some output from my FI:
cdcfi01-B /chassis # show psu detail
PSU:
PSU: 1
Overall Status: Operable
Operability: Operable
Threshold Status: OK
Power State: On
Presence: Equipped
Thermal Status: OK
Voltage Status: OK
Product Name: Platinum II AC Power Supply for UCS 5108 Chassis
PID: UCSB-PSU-2500ACDV
VID: V01
Part Number: 341-0571-01
Vendor: Cisco Systems Inc
Serial (SN):
HW Revision: 0
Firmware Version: N/A
PSU: 2
Overall Status: Operable
Operability: Operable
Threshold Status: N/A
Power State: On
Presence: Equipped
Thermal Status: N/A
Voltage Status: N/A
Product Name: Platinum II AC Power Supply for UCS 5108 Chassis
PID: UCSB-PSU-2500ACDV
VID: V01
Part Number: 341-0571-01
Vendor: Cisco Systems Inc
Serial (SN):
HW Revision: 0
Firmware Version: N/A
PSU: 3
Overall Status: Operable
Operability: Operable
Threshold Status: N/A
Power State: On
Presence: Equipped
Thermal Status: OK
Voltage Status: N/A
Product Name: Platinum II AC Power Supply for UCS 5108 Chassis
PID: UCSB-PSU-2500ACDV
VID: V01
Part Number: 341-0571-01
Vendor: Cisco Systems Inc
Serial (SN):
HW Revision: 0
Firmware Version: N/A
PSU: 4
Overall Status: Operable
Operability: Operable
Threshold Status: N/A
Power State: On
Presence: Equipped
Thermal Status: N/A
Voltage Status: N/A
Product Name: Platinum II AC Power Supply for UCS 5108 Chassis
PID: UCSB-PSU-2500ACDV
VID: V01
Part Number: 341-0571-01
Vendor: Cisco Systems Inc
Serial (SN):
HW Revision: 0
Firmware Version: N/A
cdcfi01-B /chassis # show psu
PSU:
PSU Overall Status
---------- --------------
1 Operable
2 N/A
3 Operable
4 N/A
cdcfi01-B /chassis # show psu detail
<CR>
> Redirect it to a file
>> Redirect it to a file in append mode
expand Expand
| Pipe command output to filter
cdcfi01-B /chassis # show psu detail
PSU:
PSU: 1
Overall Status: Operable
Operability: Operable
Threshold Status: OK
Power State: On
Presence: Equipped
Thermal Status: OK
Voltage Status: OK
Product Name: Platinum II AC Power Supply for UCS 5108 Chassis
PID: UCSB-PSU-2500ACDV
VID: V01
Part Number: 341-0571-01
Vendor: Cisco Systems Inc
Serial (SN):
HW Revision: 0
Firmware Version: N/A
PSU: 2
Overall Status: N/A
Operability: N/A
Threshold Status: N/A
Power State: Off
Presence: Equipped
Thermal Status: N/A
Voltage Status: N/A
Product Name: Platinum II AC Power Supply for UCS 5108 Chassis
PID: UCSB-PSU-2500ACDV
VID: V01
Part Number: 341-0571-01
Vendor: Cisco Systems Inc
Serial (SN):
HW Revision: 0
Firmware Version: N/A
PSU: 3
Overall Status: Operable
Operability: Operable
Threshold Status: N/A
Power State: On
Presence: Equipped
Thermal Status: OK
Voltage Status: N/A
Product Name: Platinum II AC Power Supply for UCS 5108 Chassis
PID: UCSB-PSU-2500ACDV
VID: V01
Part Number: 341-0571-01
Vendor: Cisco Systems Inc
Serial (SN):
HW Revision: 0
Firmware Version: N/A
PSU: 4
Overall Status: N/A
Operability: N/A
Threshold Status: N/A
Power State: Off
Presence: Equipped
Thermal Status: N/A
Voltage Status: N/A
Product Name: Platinum II AC Power Supply for UCS 5108 Chassis
PID: UCSB-PSU-2500ACDV
VID: V01
Part Number: 341-0571-01
Vendor: Cisco Systems Inc
Serial (SN):
HW Revision: 0
Firmware Version: N/A
cdcfi01-B /chassis #
cdcfi01-B /chassis #
This is the first time I've caught status 'On' for PSU 2 & 4 via CLI
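For anyone who wants to catch this kind of flapping without eyeballing two long dumps, here is a minimal Python sketch that pulls the Power State per PSU out of saved `show psu detail` text and reports which PSUs changed between two captures. The function names and the trimmed sample text are my own illustration, not anything from UCS Manager, and it assumes the field layout shown in the output above.

```python
import re

def parse_psu_states(cli_text):
    """Return {psu_number: power_state} from saved 'show psu detail' text."""
    states = {}
    current = None
    for raw in cli_text.splitlines():
        line = raw.strip()
        # "PSU: 2" starts a new PSU block; the bare "PSU:" header has no number
        m = re.match(r"^PSU:\s*(\d+)$", line)
        if m:
            current = int(m.group(1))
            continue
        m = re.match(r"^Power State:\s*(\S+)$", line)
        if m and current is not None:
            states[current] = m.group(1)
    return states

def changed_psus(before, after):
    """Map each PSU whose state differs to its (before, after) pair."""
    return {
        n: (before[n], after[n])
        for n in sorted(before)
        if n in after and before[n] != after[n]
    }

# Trimmed-down stand-ins for the two captures pasted above (hypothetical files).
first = """PSU:
PSU: 1
Power State: On
PSU: 2
Power State: On
PSU: 3
Power State: On
PSU: 4
Power State: On
"""
second = """PSU:
PSU: 1
Power State: On
PSU: 2
Power State: Off
PSU: 3
Power State: On
PSU: 4
Power State: Off
"""

flapping = changed_psus(parse_psu_states(first), parse_psu_states(second))
print(flapping)  # {2: ('On', 'Off'), 4: ('On', 'Off')}
```

Feeding it the full pasted output should work the same way, since it only looks at the `PSU: <n>` and `Power State:` lines and ignores everything else.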
06-17-2015 07:43 AM
Do the following:
#connect local-mgmt a
#connect iom <chassis number>
#show platform software cmcctrl power redundancy
#show platform software cmcctrl showi2c
#exit
#connect local-mgmt b >>>>> and repeat the commands above
What is the firmware version you are running?
-Kenny
06-17-2015 07:49 AM
06-17-2015 08:34 AM
This seems to be a problem with the I2C bus. What firmware are you running?
cdcfi01-B# connect local-mgmt a
cdcfi01-A(local-mgmt)# connect iom 1
Attaching to FEX 1 ...
To exit type 'exit', to abort type '$.'
Bad terminal type: "dumb". Will assume vt100.
fex-1# show platform software cmcctrl po
post power
fex-1# show platform software cmcctrl power redundancy
==============================
Cluster master : yes <<< Primary IOM in the chassis
Policy : Grid
State : Lost <<<< Not compliant with the power policy
Total power available : 5000 <<<< 2 PSUs available
Total power usage : 412
Power budget requested : 5000
-----------
Grid : 0
Active PS : 0
Spare PS :
Unavailable PS : 1 <<<<<< PSU 2
-----------
-----------
Grid : 1
Active PS : 2
Spare PS :
Unavailable PS : 3 <<<< PSU 4
-----------
==============================
fex-1#
fex-1# show platform software cmcctrl showi2c
# I2C Bus Statistics Wed Jun 17 09:46:56 CDT 2015
# I2C Bus 2
busn=1 nseg=5
segment 0 local
segment 1 chassis
norxack 25
pca9541seterr 32
wait_gt_deadline 67
segment 2 blade
segment 3 fan
norxack 333 <<< Not good
wait_gt_deadline 22068
segment 4 psu
norxack 766 <<<< not good
pca9541clrerrprs 572 <<<< not good
pca9541seterr 194
wait_gt_deadline 20453
gilroy.error.pca9541_control_state 35
error_pca9541_per_device:
c.ms 3
p.psu3.psmi 52 <<<< PSU ERRORS
p.psu0.ms 13 <<<< PSU ERRORS
p.psu1.fru 54 <<<< PSU ERRORS
p.psu1.ms 253 <<<< PSU ERRORS
p.psu1.psmi 35 <<<< PSU ERRORS
p.psu3.fru 53 <<<< PSU ERRORS
p.psu3.ms 306 <<<< PSU ERRORS
c.gpio0 8 <<< Chassis midplane errors
c.gpio1 8
c.gpio2 8
c.gpio3 8
# I2C Device Statistics
p.psu0.psmi={SUCCESS=959191,EBUSY=1} <<< BUSY signal from PSU when the IOM queries its status
p.psu1.fru={SUCCESS=2,ETIMEDOUT=9} <<< Timeout signal from PSU2 when the IOM queries its status
p.psu3.fru={SUCCESS=2,ETIMEDOUT=8} <<< Timeout signal from PSU4 when the IOM queries its status
*******************************************************************************************
cdcfi01-A(local-mgmt)# exit
cdcfi01-B# connect iom 1
fex-1# show platform software cmcctrl po
post power
fex-1# show platform software cmcctrl power redundancy
==============================
Cluster master : no
Policy : Grid
State : Lost
Total power available : 5000
Total power usage : 414
Power budget requested : 5000
-----------
Grid : 0
Active PS : 0
Spare PS :
Unavailable PS : 1
-----------
-----------
Grid : 1
Active PS : 2
Spare PS :
Unavailable PS : 3
-----------
==============================
fex-1# show platform software cmcctrl showi2c
# I2C Bus 2
busn=1 nseg=5
segment 0 local
segment 1 chassis
norxack 52
pca9541seterr 31
wait_gt_deadline 61
segment 2 blade
segment 3 fan <<< Clear, compared against the other IOM
segment 4 psu
norxack 398 <<< Not good
pca9541clrerrprs 268
pca9541seterr 130
wait_gt_deadline 12969
error_pca9541_per_device:
c.ms 2
p.psu3.psmi 42 <<< Not good
p.psu0.ms 26 <<< Not good
p.psu1.fru 30
p.psu1.ms 119
p.psu1.psmi 34
p.psu3.fru 24
p.psu3.ms 123
c.gpio0 8
c.gpio1 8
c.gpio2 8
c.gpio3 7
# I2C Device Statistics
p.psu1.fru={SUCCESS=5,ETIMEDOUT=5} <<< Timeout signal from PSU2 when the IOM queries its status
p.psu3.fru={SUCCESS=4,ETIMEDOUT=4} <<< Timeout signal from PSU4 when the IOM queries its status
-Kenny
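To make the per-device error counters in the annotated output easier to compare, here is a rough Python sketch that sums the `p.psuN.*` counters from the `error_pca9541_per_device` section of saved `showi2c` text. It assumes the IOM numbers PSUs from zero, as the annotations above indicate (p.psu1 is PSU 2, p.psu3 is PSU 4). The function name and sample string are illustrative only, and no counter value is treated as an official threshold.

```python
import re
from collections import defaultdict

# Excerpt of the per-device counters from the first IOM's output above.
SAMPLE = """\
error_pca9541_per_device:
c.ms 3
p.psu3.psmi 52 <<<< PSU ERRORS
p.psu0.ms 13
p.psu1.fru 54
p.psu1.ms 253
p.psu1.psmi 35
p.psu3.fru 53
p.psu3.ms 306
c.gpio0 8
"""

def psu_error_totals(showi2c_text):
    """Sum per-device I2C error counters for each PSU (1-based PSU numbers)."""
    totals = defaultdict(int)
    for raw in showi2c_text.splitlines():
        # Drop any trailing "<<<" annotation added by hand
        line = raw.split("<<<")[0].strip()
        # Match lines like "p.psu1.ms 253"; "p.psu1.fru={...}" lines do not match
        m = re.match(r"^p\.psu(\d+)\.\w+\s+(\d+)$", line)
        if m:
            psu_number = int(m.group(1)) + 1  # assumption: IOM index is zero-based
            totals[psu_number] += int(m.group(2))
    return dict(totals)

totals = psu_error_totals(SAMPLE)
for psu, count in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(f"PSU {psu}: {count}")
# PSU 4: 411
# PSU 2: 342
# PSU 1: 13
```

On this excerpt the two PSUs reported as Off (2 and 4) carry almost all of the errors, which matches the annotations in the output.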
06-17-2015 08:38 AM
cdcfi01-B# sh version
System version: 2.2(3c)
06-17-2015 09:34 AM
On that version, most of the I2C issues have been fixed, but if the chassis has been affected for a while, the I2C bus may have stayed congested.
To clear the I2C bus, you need to reseat the components that use it: PSUs, fans and IOMs. The recommendation is to:
1-Reseat each fan and PSU, one at a time, waiting 3 minutes before putting it back. Once it is up again, proceed with the next one.
2-Remove the subordinate IOM, keep it out of the chassis for about 5 minutes to drain all power, put it back, and wait until it comes back online before proceeding with the other one.
Be sure you have network redundancy configured before doing this. If you need further assistance or don't feel comfortable with the process, contact TAC.
If this helps, please rate it. If it solves the issue, please mark it as such so future users can benefit from it too.
-Kenny
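To get a feel for how long the reseat procedure above takes, here is a small Python sketch that lays it out as an ordered checklist and adds up the minimum out-of-chassis wait time. The counts of 8 fans and 4 PSUs per chassis are assumptions on my part, passed as parameters so you can adjust them. The total ignores the time each component needs to come back up, so treat it as a lower bound.

```python
def reseat_plan(num_fans=8, num_psus=4, fan_psu_wait_min=3, iom_wait_min=5):
    """Return an ordered list of (step description, wait in minutes)."""
    steps = []
    # Step 1: fans and PSUs, strictly one at a time
    for kind, count in (("fan", num_fans), ("PSU", num_psus)):
        for n in range(1, count + 1):
            steps.append(
                (f"Pull {kind} {n}, wait, reinsert, confirm it is up", fan_psu_wait_min)
            )
    # Step 2: IOMs, subordinate first, never both out at once
    for iom in ("subordinate IOM", "other IOM"):
        steps.append(
            (f"Pull {iom}, wait, reinsert, confirm it is back online", iom_wait_min)
        )
    return steps

plan = reseat_plan()
total_wait = sum(wait for _, wait in plan)

for i, (description, wait) in enumerate(plan, start=1):
    print(f"{i:2d}. {description} ({wait} min)")
print(f"Minimum wait per chassis: {total_wait} min")  # 46 min with the defaults
```

With the defaults that is 14 steps and at least 46 minutes per chassis, which is worth knowing before you schedule the work.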
06-17-2015 11:26 AM
Great, none of this gear is in production yet; we just racked and stacked it a couple of weeks ago. I'll proceed and report back. Thanks for the help.
06-18-2015 07:09 AM
Thanks, Kenny, for the help. Since our gear wasn't even fully set up or running anything, we just pulled power for 10 minutes per chassis and it cleared the errors.
06-18-2015 07:16 AM
That was another way to go! ;0)
Glad to help!
-Kenny