Thermal condition on chassis error

jibber.mark1
Level 1

Hi Members,

I received the following error: Thermal condition on chassis 2. IOM-B reports: IOM unable to get thermal sensor reading from blades(2); IOM-A reports: Thermal sensor reading not available from blades(2)

Any suggestions on what it could be?

Accepted Solutions

Qiese Dides
Cisco Employee

Hi Mark,

I have seen this error in a lot of issues relating to I2C. If you upload your chassis logs, we could get a better idea. If this is I2C, there are two possible workarounds. The first is less disruptive, but the second (the full reseat) is the best method if you are hitting this issue.

The less disruptive workaround is the following. You might still have I2C issues afterwards and need to do the full workaround.

  1. SSH to your Fabric Interconnect.
  2. Switch to side A local-management:
     connect local-mgmt a
  3. Connect to the side A IOM of your chassis:
     connect iom <chassis #>
  4. Enter this command to find out whether it is active or not:
     show platform soft cmc thermal status | grep status:
     => If it says PASSIVE, you need to restart this IOM (IOM 1 on fabric A)
     via UCSM. If it says ACTIVE, it is the other side (IOM 2 on fabric B) that
     needs to be reset.
  5. To do the reset, go to Equipment > Chassis > Chassis <#> > IO
     Modules > IO Module 1/2 > General tab > Reset IO Module.
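
If it helps to keep the rule in step 4 straight, here is a minimal Python sketch of the decision. It assumes you have already copied the line printed on the fabric A IOM; the function name `iom_to_reset` and the sample strings are made up for illustration, and the exact format of the real command output is not shown in this thread, so the sketch only looks for the words PASSIVE or ACTIVE.

```python
def iom_to_reset(status_output: str) -> str:
    """Return which IOM to reset, given the thermal status text seen on the fabric A IOM.

    Rule from the steps above: PASSIVE on side A means reset IOM 1 (fabric A);
    ACTIVE on side A means reset IOM 2 (fabric B).
    """
    # Split on whitespace after turning colons into spaces, so "status:" and
    # the value become separate words regardless of spacing.
    words = status_output.upper().replace(":", " ").split()
    if "PASSIVE" in words:
        return "IO Module 1 (fabric A)"
    if "ACTIVE" in words:
        return "IO Module 2 (fabric B)"
    raise ValueError("Neither ACTIVE nor PASSIVE found in: " + repr(status_output))


if __name__ == "__main__":
    # Hypothetical sample line; the real output format may differ.
    print(iom_to_reset("status: PASSIVE"))
```

This only encodes the rule stated above. The reset itself is still done from UCSM as described in step 5.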

The full workaround is below:

Remove PSU1, let it sit for 2 minutes, replace it, wait 10 seconds, confirm PSU1 has power, then move to PSU2.
Remove PSU2, let it sit for 2 minutes, replace it, wait 10 seconds, confirm PSU2 has power, then move to PSU3.
Remove PSU3, let it sit for 2 minutes, replace it, wait 10 seconds, confirm PSU3 has power, then move to PSU4.
Remove PSU4, let it sit for 2 minutes, replace it, wait 10 seconds, confirm PSU4 has power, then move to Fan1.

Remove Fan1, let it sit for 30 seconds, replace it, wait 10 seconds, confirm Fan1 has power, then move to Fan2.
Remove Fan2, let it sit for 30 seconds, replace it, wait 10 seconds, confirm Fan2 has power, then move to Fan3.
Remove Fan3, let it sit for 30 seconds, replace it, wait 10 seconds, confirm Fan3 has power, then move to Fan4.
Remove Fan4, let it sit for 30 seconds, replace it, wait 10 seconds, confirm Fan4 has power, then move to Fan5.
Remove Fan5, let it sit for 30 seconds, replace it, wait 10 seconds, confirm Fan5 has power, then move to Fan6.
Remove Fan6, let it sit for 30 seconds, replace it, wait 10 seconds, confirm Fan6 has power, then move to Fan7.
Remove Fan7, let it sit for 30 seconds, replace it, wait 10 seconds, confirm Fan7 has power, then move to Fan8.
Remove Fan8, let it sit for 30 seconds, replace it, wait 10 seconds, confirm Fan8 has power, then move to IO Mod 1.

Remove IO Mod 1, let it sit for 5 minutes, then replace it. Confirm that IO Mod 1 is up and running before you reseat IO Mod 2.
Once IO Mod 1 is up and running, remove IO Mod 2, let it sit for 5 minutes, and place it back into the chassis.
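
If you want to print the sequence above as a numbered checklist to tick off at the rack, a small Python sketch like this would generate it. The function name `reseat_checklist` and its parameters are hypothetical; the default counts (4 PSUs, 8 fans, 2 IO modules) and the wait times are taken from the steps above, so adjust them if your chassis is populated differently.

```python
def reseat_checklist(psu_count: int = 4, fan_count: int = 8) -> list[str]:
    """Build the ordered reseat steps: PSUs first, then fans, then the IO modules."""
    steps = []
    # PSUs: 2 minutes out of the chassis, one at a time.
    for n in range(1, psu_count + 1):
        steps.append(
            f"Remove PSU{n}, wait 2 minutes, reinsert, wait 10 seconds, "
            f"confirm PSU{n} has power"
        )
    # Fans: 30 seconds out of the chassis, one at a time.
    for n in range(1, fan_count + 1):
        steps.append(
            f"Remove Fan{n}, wait 30 seconds, reinsert, wait 10 seconds, "
            f"confirm Fan{n} has power"
        )
    # IO modules: 5 minutes each; IOM 1 must be back up before IOM 2 is pulled.
    steps.append("Remove IO Module 1, wait 5 minutes, reinsert, confirm it is up and running")
    steps.append("Remove IO Module 2, wait 5 minutes, reinsert")
    return steps


if __name__ == "__main__":
    for number, step in enumerate(reseat_checklist(), start=1):
        print(f"{number:2d}. {step}")
```

The point of generating it rather than typing it is only to avoid skipping a component; the order and timings are the ones given above.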

This is the complete reseat process to clear the I2C bus.

Once again, this is the workaround for I2C. We can verify this if you attach your chassis logs, but I know that this may not be possible due to security concerns. If that is the case, the best solution would be to open a case with TAC so they can verify this issue.

Please let me know if there is anything else I could assist you with. If you don't mind uploading the logs I can take a look at the issue.

Regards,

Qiese Dides
