UCS Blade B200 M4 degraded memory errors

dganta
Level 1

Hi Team,

There are a few degraded memory errors. According to the documentation, the fix is to run scope cimc and then reset. I have read this in the document, but I want to confirm whether running scope cimc followed by reset on the server is non-disruptive, or whether we need to plan a maintenance window for it. Please clarify.

Thanks,

Dinesh

1 Accepted Solution

Accepted Solutions

Kirk J
Cisco Employee

Greetings.

There are a couple of methods to reset the DIMM counters. One is to reset the DIMM error counters directly, using the commands referenced below. The other is the one you referred to, resetting the CIMC. Resetting the CIMC causes what is termed a 'shallow' discovery and does not impact the running OS.

http://www.cisco.com/c/dam/en/us/products/collateral/servers-unified-computing/ucs-manager/whitepaper-c11-736116.pdf

Cisco UCS B-Series and C-Series Operating in UCSM 2.2 and 3.1

To reset memory-error counters on a Cisco UCS B-Series or C-Series server in UCSM 2.2 and 3.1, run the following commands on the CLI:

    ca-1-A# scope server 1/8
    ca-1-A /chassis/server # reset-all-memory-errors
    ca-1-A /chassis/server* # commit

Cisco UCS B-Series and C-Series Operating in UCSM 2.1

To reset memory-error counters on a Cisco UCS B-Series or C-Series server in UCSM 2.1, run the following commands on the CLI:

    Switch-A # scope server 1/1
    Switch-A /chassis/server # scope memory-array 1
    Switch-A /chassis/server/memory-array # scope dimm 2
    Switch-A /chassis/server/memory-array/dimm # reset-errors
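
If you need to repeat this for several blades, the two procedures above can be written down as a small lookup. The following is only a sketch: the function name reset_commands and its parameters are hypothetical, it only builds the command strings quoted above (it does not connect to UCSM), and it deliberately refuses any release other than the three named in the whitepaper excerpt.

```python
import re


def reset_commands(chassis, blade, ucsm_version, memory_array=1, dimm=2):
    """Return the CLI lines quoted in this thread, in the order they are typed.

    Hypothetical helper: it only formats strings. The memory_array and dimm
    arguments are used for UCSM 2.1 only, where the reset is done per DIMM.
    """
    match = re.match(r"(\d+)\.(\d+)", ucsm_version)
    if match is None:
        raise ValueError(f"unrecognized UCSM version: {ucsm_version!r}")
    release = (int(match.group(1)), int(match.group(2)))

    commands = [f"scope server {chassis}/{blade}"]
    if release in ((2, 2), (3, 1)):
        # One command clears the counters for every DIMM in the server.
        commands += ["reset-all-memory-errors", "commit"]
    elif release == (2, 1):
        # Counters are cleared one DIMM at a time.
        commands += [
            f"scope memory-array {memory_array}",
            f"scope dimm {dimm}",
            "reset-errors",
        ]
    else:
        # Assumption: releases not named in the excerpt are left unhandled.
        raise ValueError(f"no procedure quoted for UCSM {ucsm_version}")
    return commands


if __name__ == "__main__":
    for line in reset_commands(1, 8, "3.1"):
        print(line)
```

Paste the printed lines into the UCSM CLI session yourself, and check them against the whitepaper for your release before running them.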

If you have a blade whose ECC counters keep incrementing, try reseating the DIMM(s) and swapping them between slots during a maintenance window.

If the ECCs follow the moved DIMM in sizable numbers, then you may want to contact TAC at that point.
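
The swap test above boils down to comparing two counters after the move. Here is a rough sketch of that comparison. The function name, the return labels, and especially the default threshold of 100 are assumptions for illustration only, since no figure is given for what counts as "sizable". The "stayed-with-slot" reading is also an inference on my part and not something stated above.

```python
def interpret_swap_test(errors_on_moved_dimm, errors_in_original_slot, sizable=100):
    """Classify the ECC counts seen after a DIMM was moved to another slot.

    errors_on_moved_dimm: count reported for the DIMM in its new slot.
    errors_in_original_slot: count reported for whatever DIMM now sits in the
        slot the suspect DIMM came from.
    sizable: hypothetical threshold, not a Cisco-published value.
    """
    if errors_on_moved_dimm >= sizable:
        # The errors followed the DIMM: this is the case for contacting TAC.
        return "followed-dimm"
    if errors_in_original_slot >= sizable:
        # Inference, not from the thread: the DIMM itself looks less likely.
        return "stayed-with-slot"
    return "inconclusive"
```

Reset the counters before the swap so that both numbers start from zero, otherwise old errors will skew the comparison.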

The 2.2(7) and 3.1(1) UCSM versions do not trigger alerts unless the number of ECC errors is large enough to be a concern.

Thanks,

Kirk...
