[mythtv-users] 2 TB Hard Drive Recommendations

Brad Templeton brad+myth at templetons.com
Sat Dec 11 07:26:06 UTC 2010


On 12/06/2010 11:20 AM, Kevin Ross wrote:
> On 12/06/2010 07:28 AM, Fedor Pikus wrote:
>> Do not use RAID5 with disks over 1TB. Use RAID6. The probability of
>> disk error while reading the whole disk is so high that it is very
>> likely that after replacing a failed disk the array will fail to
>> reconstruct. In RAID5 second fault, even transient, destroys the
>> array.
>>
>
> This isn't 100% true.  You aren't likely to lose the entire array, 
> just a few files.  I've had it happen to me. When the second drive 
> starts dropping out of the array during a rebuild, you copy that drive 
> using ddrescue to a new drive, put the new drive in the array 
> replacing the old one, and let it rebuild.  Any bad sectors that 
> ddrescue couldn't recover will be replaced with 0's.  This could be in 
> the middle of a file, or in filesystem metadata, or completely unused 
> space.  Run an fsck (or equivalent) after the rebuild to fix 
> filesystem metadata.  If you're using a filesystem that keeps 
> checksums of files, like ZFS or BtrFS, then you'll know which files 
> have become corrupt.
>
> Obviously this isn't ideal, but being able to recover as much data as 
> possible is better than losing the entire array.

I have read this report about the risk of a random bit error
destroying a rebuild, and I would be flabbergasted if it were true, or
if you really needed to pull the trick above to save yourself from it.
I would have thought that the RAID, seeing read errors on certain
blocks, would simply declare anything that depends on those blocks to
be bad rather than make the entire array unusable. To do otherwise
goes against the whole philosophy of RAID, because it makes you *more*
vulnerable to a typical disk failure with the RAID than without it.
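
To put a rough number on the report, here is a minimal sketch of the
arithmetic behind it. It assumes an unrecoverable read error rate of 1
per 10^14 bits read, a figure often quoted on consumer drive spec
sheets but not taken from this thread, and it assumes errors are
independent. Real drives may do much better, and the result says
nothing about whether the RAID layer then fails one block or the whole
array.

```python
import math

TB = 10**12  # drive vendors count in decimal terabytes


def p_read_error(bytes_read, errors_per_bit=1e-14):
    """Chance of at least one unrecoverable read error while reading
    bytes_read bytes. errors_per_bit is an assumed spec-sheet value."""
    bits = bytes_read * 8
    # 1 - (1 - rate)^bits, written with log1p/expm1 to stay accurate
    # when the per-bit rate is tiny.
    return -math.expm1(bits * math.log1p(-errors_per_bit))


# A RAID-5 rebuild has to read every surviving drive in full.
for surviving in (1, 2, 3, 4):
    p = p_read_error(surviving * 2 * TB)
    print(f"{surviving} surviving 2 TB drive(s): {p:.1%}")
```

Under those assumptions the chance of hitting at least one bad sector
grows quickly with the number of drives that must be read, which is
the core of the RAID-5 warning.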

If Linux RAID-5 really works this way, I would hope there are efforts
to fix it.

In particular, the way you replace a working (but bound for failure)
drive in a RAID-5 is badly designed in light of this sort of risk.
Today, you add your hot spare and then soft-fail the drive you wish to
pull. The system then rebuilds onto the hot spare using the other
drives, but not the one you soft-failed. Should there be a read error
on one of the remaining drives, the rebuild will fail for those blocks
(and possibly for the whole array?). That is dreadful, because the
soft-failed drive has not actually gone bad: the data is still on it,
and everything could be reconstructed.
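
The point can be illustrated with a toy model of one RAID-5 stripe.
This is only a sketch of XOR parity, not how the Linux md driver is
implemented. The 4-drive layout, the block contents and the drive
numbers are all made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def rebuild_block(stripe, target, unreadable=()):
    """Reconstruct stripe[target] from all the other members.
    Returns None if any member that is needed cannot be read."""
    others = [i for i in range(len(stripe)) if i != target]
    if any(i in unreadable for i in others):
        return None
    return xor_blocks([stripe[i] for i in others])


# One stripe on a hypothetical 4-drive RAID-5: 3 data blocks + parity.
data = [b"AAAA", b"BBBB", b"CCCC"]
stripe = data + [xor_blocks(data)]

old = 0  # healthy drive being replaced (soft-failed)
bad = 2  # remaining drive with a read error in this stripe

# Soft-fail approach: the old drive is ignored, so the block must be
# rebuilt from the others, and the read error makes that impossible.
lost = rebuild_block(stripe, old, unreadable={bad})

# If the old drive were still consulted, its block could simply be
# copied, and the bad block could be regenerated from parity as well.
copied = stripe[old]
repaired = rebuild_block(stripe, bad)

print(lost, copied, repaired)
```

In this model nothing is lost as long as the old drive stays readable
during the replacement, which is exactly the information the soft-fail
procedure throws away.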

RAID-6 can help, but it calls for a fairly large array. With a 4-disk
RAID-6 you might as well be running RAID-1 or RAID-10, I would imagine,
though a RAID-10 is slightly less robust in that it survives the loss
of 2 drives only some of the time (when the two failed drives are not
mirrors of each other).
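
A quick enumeration makes the comparison concrete. This sketch assumes
a 4-disk RAID-10 built as two mirror pairs, (0, 1) and (2, 3), which is
an assumption about the layout rather than something stated in the
thread. Under that layout the count comes out at 4 of the 6 possible
two-drive failures survived, so somewhat better than half.

```python
from itertools import combinations


def survives_raid10(failed, pairs):
    """RAID-10 survives unless some mirror pair loses both members."""
    return not any(set(pair) <= set(failed) for pair in pairs)


def survives_raid6(failed):
    """RAID-6 survives the loss of any two drives."""
    return len(failed) <= 2


drives = range(4)
pairs = [(0, 1), (2, 3)]  # assumed mirror layout
two_failures = list(combinations(drives, 2))

raid10_ok = sum(survives_raid10(f, pairs) for f in two_failures)
raid6_ok = sum(survives_raid6(f) for f in two_failures)

print(f"RAID-10 survives {raid10_ok} of {len(two_failures)} double failures")
print(f"RAID-6  survives {raid6_ok} of {len(two_failures)} double failures")
```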

However, a RAID-6 with 5 drives is a large array. For personal
systems and MythTV boxes it's overkill. It is a lot noisier, a lot
hotter, and uses a lot more power. The extra power in fact costs quite
a bit of money over the life of the system, at least around here.
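
For a feel of the power cost, here is a back-of-the-envelope sketch.
Every number in the example call is a placeholder (watts per drive,
electricity price, system lifetime), not a measurement and not a figure
from this thread, so plug in your own.

```python
def extra_power_cost(extra_drives, watts_per_drive, price_per_kwh,
                     years, hours_per_day=24):
    """Electricity cost of running extra drives over the system's life.
    Ignores cooling overhead and power-supply losses."""
    kwh = extra_drives * watts_per_drive * hours_per_day * 365 * years / 1000
    return kwh * price_per_kwh


# Hypothetical: 2 extra drives at 8 W each, always spinning,
# 0.30 per kWh, over a 5 year life.
cost = extra_power_cost(extra_drives=2, watts_per_drive=8,
                        price_per_kwh=0.30, years=5)
print(f"extra electricity over 5 years: {cost:.2f}")
```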

