MegaCli, RAID 5: reintroducing a failed disk after repairing it

turbide
Level 1

Hi, I have a big problem with a RAID 5 array. Does anyone know whether a drive marked as failed can be reintroduced into the array with MegaCli? Here is my problem:

My enclosure has 12 drives: 11 disks in RAID 5 and 1 hot spare (HSP). On Monday I lost one disk. The hot spare started to rebuild, but after a few minutes I lost a second disk. The rebuild stopped and I lost the whole RAID 5 array (19 TB). After some work I was able to repair one disk (I lost disk 5 first and then disk 3; I repaired disk 3 by replacing its electronics board). When I put disk 3 back in the enclosure, MegaCli marked it as Foreign, Unconfigured(good). I cleared the foreign configuration, but the array is still offline, and when I list it with the command:

MegaCli -LdPdInfo -aAll

disk 3 and disk 5 still show no info.

So, is it possible to tell the array that disk 3 is OK and is part of the array, so that I can then transfer my data to other disks? (I don't have a backup of that RAID 5 array.)

1 Reply

Kirk J
Cisco Employee

Greetings.

You will likely want to open a TAC case to make sure there isn't a systemic issue contributing to your multi-drive failure, such as a RAID controller or backplane problem, or drive firmware timeout issues.

Generally, it is probably easier to attempt the repair with WebBIOS, the option ROM utility (Ctrl+H).

Major disclaimer: drive loss that exceeds the RAID parity capabilities is always susceptible to some corruption, even if the original drives are brought back online. The chances that you will have to rebuild the RAID volume are pretty high.
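To illustrate why a second drive loss exceeds what RAID 5 parity can cover, here is a minimal sketch of single-parity XOR on one stripe. The three byte values are arbitrary examples, and a real controller works on whole blocks across all member disks, so treat this only as an illustration of the principle.

```shell
#!/bin/sh
# Three data blocks of one stripe (arbitrary example byte values) and their parity.
d0=165; d1=60; d2=15
parity=$(( d0 ^ d1 ^ d2 ))

# One block lost (d1): XOR of the survivors and the parity restores it exactly.
rebuilt_d1=$(( d0 ^ d2 ^ parity ))
echo "rebuilt d1 = $rebuilt_d1 (original $d1)"

# Two blocks lost (d1 and d2): only their XOR can be computed, not each value.
xor_of_lost=$(( d0 ^ parity ))
echo "d1 XOR d2 = $xor_of_lost, individual values are not recoverable"
```

This is why getting one of the two lost drives back matters: with only one member missing, the array can in principle run degraded and rebuild from parity.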

Try this from your server's LSI WebBIOS (press Ctrl+H during POST when prompted).

Your config will probably show 1 failed drive, one missing, and the VD offline.

The drive that you cleared the foreign config from should be listed as unconfigured good.

Click on that drive.

Check the box that says 'replace missing PD'.

Although you had 2 failed drives, only 1 should be 'missing', and you should be able to leave the 'drive group missing row' value with the default it contains.

Click 'Go'

If you have the option to mark it 'online', do so, and then click 'Go'

The system will warn that VD corruption can occur (you have theoretically already lost your RAID array anyway); click Yes.

My config shows the VD healthy again, because I only had one drive down and re-inserted it. In your config, if all goes well, the VD will still be degraded due to your original drive failure, but it should come back online.

At this point your hot spare should kick back in and start rebuilding the remaining failed drive. If the hot spare is in a funny state, you may have to re-mark it as a hot spare.
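Since the original question asked about MegaCli, the WebBIOS steps above can be sketched from the command line roughly as follows. This is a dry-run sketch under assumptions, not a verified procedure: the adapter, enclosure, slot, array, and row numbers are placeholders (not values from this thread), and the `run` and `plan` helpers are hypothetical names used only for this example. Read the real values from `MegaCli -PDList -aALL` and `MegaCli -PdGetMissing -aALL`, and check the exact syntax against the help output of your MegaCli version before setting `DRY_RUN=0`.

```shell
#!/bin/sh
# Dry-run sketch: prints the MegaCli commands instead of running them (DRY_RUN=1).
# All IDs below are placeholders, not values taken from this thread.
ADAPTER=0        # controller index
ENCLOSURE=32     # enclosure device ID
SLOT=3           # slot of the repaired drive (shown as Unconfigured(good))
ARRAY=0          # array (drive group) with the missing member
ROW=3            # row reported as missing by -PdGetMissing
SPARE_SLOT=11    # slot of the hot spare
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: echo the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: the CLI counterpart of the WebBIOS steps, in order.
plan() {
    # 1. List the array/row slots the controller considers missing.
    run MegaCli -PdGetMissing -a"$ADAPTER"
    # 2. Assign the repaired drive to the missing slot ('replace missing PD').
    run MegaCli -PdReplaceMissing -PhysDrv "[$ENCLOSURE:$SLOT]" -Array"$ARRAY" -row"$ROW" -a"$ADAPTER"
    # 3. Force the drive online (this is the step with the corruption risk).
    run MegaCli -PDOnline -PhysDrv "[$ENCLOSURE:$SLOT]" -a"$ADAPTER"
    # 4. Check whether the VD is back (expected: degraded rather than offline).
    run MegaCli -LDInfo -Lall -a"$ADAPTER"
    # 5. Only if the hot spare lost its role: mark it as a hot spare again.
    run MegaCli -PDHSP -Set -PhysDrv "[$ENCLOSURE:$SPARE_SLOT]" -a"$ADAPTER"
    # 6. Watch the rebuild progress on the spare.
    run MegaCli -PDRbld -ShowProg -PhysDrv "[$ENCLOSURE:$SPARE_SLOT]" -a"$ADAPTER"
}

plan
```

Printing the commands by default is deliberate: forcing a member online in an array that has already lost two drives carries the same corruption risk as the WebBIOS route, so it seems safer to review each line before running anything for real.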

Good luck!!!!