07-10-2014 09:29 AM - edited 03-01-2019 11:44 AM
Hello. We currently have about 12 Cisco UCS C210 M2s in production, and recently, about once a week, we have seen a disk either report a "Predictive Failure" or fail completely. Are there any known issues with failing disks on this model? I've seen field notice FN - 63499 (http://www.cisco.com/c/en/us/support/docs/field-notices/634/fn63499.html) and will start investigating that, but in the meantime I'm wondering whether anyone else is having this issue with this model of UCS. The disks are all in either a RAID 1 or RAID 5 configuration, although it has always been a disk in a RAID 5 that has failed. Any help will be greatly appreciated.
07-10-2014 03:37 PM
Hi,
How do you recover from those failures? Are they cleared after a server reboot, or have you always just replaced the disks?
-Kenny
09-17-2014 03:18 PM
Cisco has been sending us new disks each time. Running down to the data center is getting old, though, and the customer is concerned about the stability of the UCS platform.
07-11-2014 02:46 AM
I have a few of them in production and no problems so far. What firmware are you on right now?
Maybe the disk manufacturer changed something :)
BR,
Dragan
09-17-2014 03:19 PM
We are running either firmware version 1.4(2) or 1.4(3). The recurrence of this issue seems to be the same across firmware versions.
09-17-2014 08:11 PM
This might be related to something called Punctured RAID: in summary, one disk failure causes cascading disk issues. Google it; it might be the root cause. What caught my attention, though, is that the issue you mention involves Predictive Failures. Those generally mean that the disk has run out of spare blocks, which should be expected if the disks were all deployed at about the same time.
It would be useful to open a TAC case for investigation and, if necessary, you may want to ask for an EFA (Engineering Failure Analysis) if you have seen that the issue is very consistent (80-90% of disks have the same issue).
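Before opening the case, it may help to tally the failures you have logged so you can show how consistent the pattern is. Below is a minimal Python sketch of that tally. The record fields, server names, and vendor labels are all hypothetical placeholders rather than data from this thread, and the 0.8 threshold is simply the low end of the 80-90% figure above.

```python
from collections import Counter

# Hypothetical failure log: one record per failed or predictive-failure disk.
# Every value here is an illustrative placeholder, not data from this thread.
failures = [
    {"server": "srv01", "raid": "RAID 5", "vendor": "VendorA", "state": "predictive"},
    {"server": "srv02", "raid": "RAID 5", "vendor": "VendorA", "state": "failed"},
    {"server": "srv03", "raid": "RAID 5", "vendor": "VendorB", "state": "predictive"},
    {"server": "srv04", "raid": "RAID 5", "vendor": "VendorA", "state": "failed"},
    {"server": "srv05", "raid": "RAID 1", "vendor": "VendorB", "state": "predictive"},
]


def share_by(records, key):
    """Return {value: fraction of records having that value} for one field."""
    counts = Counter(record[key] for record in records)
    total = len(records)
    return {value: count / total for value, count in counts.items()}


def dominant(records, key, threshold=0.8):
    """Return (value, share) if a single value reaches the threshold, else None."""
    if not records:
        return None
    value, share = max(share_by(records, key).items(), key=lambda item: item[1])
    return (value, share) if share >= threshold else None


if __name__ == "__main__":
    for field in ("raid", "vendor", "state"):
        print(field, share_by(failures, field), dominant(failures, field))
```

If one RAID level, disk vendor, or failure state dominates the log, that breakdown is worth attaching to the TAC case.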
-Kenny
06-10-2016 06:22 AM
Hi
Did you ever get a resolution to this issue? We have 5 servers, all running BIOS ver 1.4.3f. Every few months I have a drive fail. If we reseat it, it often rebuilds and continues working, although sometimes they fail again in a week or two and then we get Cisco to replace them. The original disks were Seagate, but I see the latest replacement disks are now Toshiba.
But it makes me nervous - I have already had two fail at the same time, and rebuilding servers is not a fun pastime!
Thanks.