UCS C210 Disk Replacement Guide

khanfawaz · ‎05-30-2013

Hi,

We have 2 UCS C210M2 servers. Each UCS has 10 HDDs in it, so there are a total of 20 HDDs across both UCS C210M2 servers. UCS-1 is the primary one and UCS-2 acts as the redundant one. Both UCS servers are in production. ESXi 4.0 is installed on both, and a number of virtual servers are running on both. Now I am going to replace all 20 HDDs in both servers. Kindly let me know the steps to be followed here. Also, kindly give me any Cisco procedure links for the same.

Regards,

Fawaz

Keny Perez · ‎05-30-2013

Fawaz,

The HDDs on all our UCS servers are hot swappable, so there is no major thing you need to do to replace the disks. Unfortunately, your original question lacks some details, so I will try to cover a few scenarios.

If what you are trying to do is increase the array space, there are some recommendations:

http://www.lsi.com/sep/Documents/oracle/files/SAS2_IR_User_Guide.pdf

* The new drives must be at least 50 GB larger than the original drives of the volume.

* After you replace the disk drives and complete the expansion, you must use a commercial tool specific to the operating system to move or increase the size of the partition on the volume.

You can search for "Online Capacity Expansion" together with your RAID controller model to find more instructions/ideas.

Another point to keep in mind is the RAID level used. Remember that, for example, RAID 0 has no tolerance to failures (removals), so by removing a disk you might lose all data.

RAID 5 does not support the failure (removal) of more than one disk and RAID 6 does not support the failure (removal) of more than two disks.
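
As a rough illustration of the tolerances above, here is a minimal Python sketch. The function name `can_pull_disk` is hypothetical, and the RAID 1 entry assumes a plain two-disk mirror; it is a way to reason about the rule, not a tool that talks to a controller.

```python
# Number of simultaneous disk failures (or removals) each RAID level survives.
# RAID 0, 5 and 6 follow the post above; RAID 1 assumes a two-disk mirror.
FAULT_TOLERANCE = {0: 0, 1: 1, 5: 1, 6: 2}


def can_pull_disk(raid_level, disks_already_missing):
    """Return True if removing one more disk still leaves the array readable."""
    return disks_already_missing + 1 <= FAULT_TOLERANCE[raid_level]


if __name__ == "__main__":
    for level in (0, 1, 5, 6):
        verdict = "yes" if can_pull_disk(level, 0) else "no"
        print(f"RAID {level}: can one disk be pulled from a healthy array? {verdict}")
```

Note that a disk that is still rebuilding should be counted as missing, which is why pulling a second disk from a RAID 5 array before the rebuild completes is not safe.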

One more thing: if the situation is that you are trying to replace a motherboard and keep the existing data, just make sure you install the disks in the new chassis in exactly the same order in which they were originally installed.

Let me know if none of these situations matches your question.

I hope this helps.

-Kenny

khanfawaz · ‎05-30-2013

Hi Keny,

Thanks for the reply.

Earlier, one of the HDDs went bad in our UCS server. When we raised an RMA, Cisco told us there is some kind of issue with all the existing HDDs, and they have sent 20 HDDs. Now I have to replace all 20 HDDs in both UCS servers. So I have to physically replace the HDDs and also migrate the data from the existing HDDs to the new HDDs. The virtual servers which are running are 1) CUCM, 2) Cisco Unity Connection & 3) Cisco Presence. Please let me know the details you are looking for.

Regards,

Fawaz

Zaira Vega · ‎05-31-2013

What are your RAID array configuration and RAID controller?

Supposing it is not RAID 0, you can replace the disks one at a time, allowing time for the rebuild process to complete after each one.

Once you replace a disk, it will have an amber LED until the rebuild process completes. You can also see on CIMC or WebBIOS once the drive has completed rebuilding, depending on your RAID controller.

Hard disk replacement:

http://www.cisco.com/en/US/docs/unified_computing/ucs/c/hw/C210M1/install/replace.html#wp1053178

jgarlock9999 · ‎06-01-2013

I believe the only way you'll see the status of the drive replacement is in the BIOS utility for the LSI adapter. You might be rebooting a whole bunch.

Those drives are likely in a striped mirror. If so, each one will probably take a looong time to be completed. Probably 24 hours long. I just replaced a single 300 GB drive in a RAID 1 mirror and it took ~18 hours to sync.

Keep in mind that my post is mostly "likelys" and "probablys".

Keny Perez · ‎06-01-2013

Fawaz,

Thanks for the details given.

Since you have a few VMs running on that server, there is a chance that NOT all the disks belong to the same array, which can make the replacement a little faster.

If you have, for example, 2 arrays configured, let's assume RAID 1 (2 HDDs) and RAID 6 (4 HDDs), then you have the chance to try this:

RAID 1 = Since it supports the failure of one disk, verify both disks are online. If yes, replace one and, while you wait for the rebuild to complete, begin the work on the other array with RAID 6.

RAID 6 = While RAID 1 disk 1 is rebuilding, check that all 4 HDDs are fine and then proceed with the replacement. I suggest 1 at a time (so the array always keeps one disk of redundancy, but it is your choice), because if you take 2 HDDs out and then another one fails, you will lose data.

Check how many arrays you have in each of the servers and what RAID levels are configured. Take screenshots if necessary and let us know; we might come up with some suggestions to make the process easier. 20 disks, one at a time, might just take too much time.

Also, to monitor the rebuild progress/status of your arrays, you can use MegaCLI or MSM (MegaRAID Storage Manager). Both are LSI tools, so as long as you have LSI RAID controllers, you should be able to use them for management purposes.

On the other hand, if you already have a TAC case open, have your CSE take a look at the environment and suggest the best way.
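
If you capture the physical drive listing from MegaCLI to a text file (commonly obtained with something like `MegaCli -PDList -aALL`; check the help output of your version), a small script can tell you whether every drive is back online before you pull the next one. The sketch below is only an illustration: the field names "Slot Number" and "Firmware state" and the sample text are assumptions based on typical MegaCLI output and may differ on your controller or version, and the function names are hypothetical.

```python
# Made-up sample resembling a physical drive listing; not taken from a real system.
SAMPLE_PDLIST = """\
Enclosure Device ID: 252
Slot Number: 0
Firmware state: Online, Spun Up

Enclosure Device ID: 252
Slot Number: 1
Firmware state: Rebuild
"""


def drive_states(pdlist_text):
    """Map slot number -> firmware state string from the listing text."""
    states = {}
    slot = None
    for line in pdlist_text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Slot Number":
            slot = int(value)
        elif key == "Firmware state" and slot is not None:
            states[slot] = value
    return states


def safe_to_pull_next(states):
    """True only when at least one drive is listed and all report Online."""
    return bool(states) and all(s.startswith("Online") for s in states.values())


if __name__ == "__main__":
    states = drive_states(SAMPLE_PDLIST)
    for slot, state in sorted(states.items()):
        print(f"slot {slot}: {state}")
    print("safe to pull next disk:", safe_to_pull_next(states))
```

With the sample above, slot 1 is still rebuilding, so the check reports that it is not yet safe to pull another disk.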

-Kenny

khanfawaz · ‎06-03-2013

Hi,

Thank you all for the update.

The first two HDDs are configured as RAID 1. The remaining eight HDDs are configured as RAID 5. The same RAID configuration is repeated on UCS-2.

We are planning to do it ourselves and are working out the best possible way to do the activity.
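
For planning purposes, a back-of-the-envelope estimate of the rebuild window for this layout can be sketched as below. It assumes one disk at a time within each array, that the RAID 1 and RAID 5 arrays can rebuild in parallel as suggested earlier in the thread, and the anecdotal ~18 hours per rebuild mentioned above. Real rebuild times depend on disk size, controller, and load, so treat the numbers as a rough guide only.

```python
# Assumption: anecdotal figure from a 300 GB RAID 1 rebuild earlier in this thread.
HOURS_PER_REBUILD = 18

# Layout per server as described above: array name -> number of disks.
ARRAYS = {"RAID 1": 2, "RAID 5": 8}


def hours_parallel_arrays(arrays, hours_per_rebuild):
    """One disk at a time per array, arrays rebuilding in parallel:
    the array with the most disks determines the total time."""
    return max(arrays.values()) * hours_per_rebuild


def hours_strictly_sequential(arrays, hours_per_rebuild):
    """One disk at a time across the whole server."""
    return sum(arrays.values()) * hours_per_rebuild


if __name__ == "__main__":
    print(hours_parallel_arrays(ARRAYS, HOURS_PER_REBUILD))      # 144
    print(hours_strictly_sequential(ARRAYS, HOURS_PER_REBUILD))  # 180
```

Under these assumptions that is roughly 6 days per server with the arrays worked in parallel, or 7.5 days strictly one disk at a time.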

Regards,

Fawaz