[mythtv-users] Question re: available SATA ports and linux software RAID

Simon Hobson linux at thehobsons.co.uk
Thu Apr 7 11:58:38 UTC 2011


Richard Morton wrote:
>>  > RAID1 or RAID10, depending on budget. Mine is probably a minority
>>  > opinion, though.
>>
>>  What a weird view.. So you use RAID1 no matter what? forget about
>>  wasting that much disk ?
>>
>>  You'll also find you get much greater resistance to hardware failure
>>  with parity RAID. You also get a greater throughput.
>>
>
>I would point out that a single write on a raid 5 or 6 requires 
>multiple reads and writes if the file is smaller than the 
>blocksize... whereas this is not the case with raid 1...
>
>So as always it depends on the application: for myth video recordings 
>raid5/6 is fine, but for databases raid 1 or 10 will provide higher 
>performance

I agree, mirroring (or mirroring plus striping) generally performs 
better in many applications, especially for random read/write. I 
believe the only aspect where RAID5 wins is usable capacity for a 
given number of physical disks.

So in the past I've configured machines with a mix of storage. Active 
DBs have been on striped/mirrored arrays (RAID10) for performance, 
while less active DBs have been on RAID5 for better space utilisation.
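
To put rough numbers on the space trade-off, here is a minimal sketch 
of the usual capacity arithmetic. The function name is made up for 
illustration, and it assumes identical disks and "classic" RAID10 built 
from mirrored pairs (some implementations allow other layouts).

```python
def usable_capacity(level: str, disks: int, disk_tb: float) -> float:
    """Usable space in TB for an array of identical disks (illustrative only)."""
    if level == "raid5":
        if disks < 3:
            raise ValueError("RAID5 needs at least 3 disks")
        # One disk's worth of space is consumed by parity.
        return (disks - 1) * disk_tb
    if level == "raid10":
        if disks < 4 or disks % 2:
            raise ValueError("assumes mirrored pairs: even disk count, at least 4")
        # Every block is stored twice, so half the raw space is usable.
        return (disks // 2) * disk_tb
    if level == "raid1":
        if disks < 2:
            raise ValueError("RAID1 needs at least 2 disks")
        # All disks hold the same data.
        return disk_tb
    raise ValueError(f"unknown level: {level}")


if __name__ == "__main__":
    for level in ("raid5", "raid10"):
        print(level, usable_capacity(level, disks=4, disk_tb=2.0), "TB usable")
```

With four 2 TB disks that gives 6 TB for RAID5 against 4 TB for RAID10, 
and the gap widens as more disks are added.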

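Comparison:
A single small write on RAID5 (one that doesn't cover a full stripe) 
requires a minimum of two reads and two writes: read the old data and 
the old parity, then write the new data and the new parity, on two 
different disks. Caching (especially write cache) can reduce the 
impact of this on performance, but can create other issues, such as 
losing cached writes on a power failure unless the cache is protected.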
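
The I/O count for a small RAID5 write can be sketched as below. This is 
a toy model, not how any particular controller or md is implemented: 
CountingDisk and the function names are invented for the example, and 
parity is plain XOR.

```python
class CountingDisk:
    """In-memory 'disk' that counts how many reads and writes it receives."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.reads = 0
        self.writes = 0

    def read(self, index):
        self.reads += 1
        return self.blocks[index]

    def write(self, index, value):
        self.writes += 1
        self.blocks[index] = value


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def raid5_small_write(data_disk, parity_disk, block, new_data):
    """Read-modify-write: 2 reads + 2 writes, touching only two disks."""
    old_data = data_disk.read(block)
    old_parity = parity_disk.read(block)
    # Remove the old data's contribution from the parity, add the new one.
    new_parity = xor_bytes(xor_bytes(old_parity, old_data), new_data)
    data_disk.write(block, new_data)
    parity_disk.write(block, new_parity)


def mirror_write(disks, block, new_data):
    """Mirrored write: one write per drive, no reads."""
    for disk in disks:
        disk.write(block, new_data)


if __name__ == "__main__":
    d0, d1, d2 = (CountingDisk([b]) for b in (b"\x01", b"\x02", b"\x04"))
    parity = CountingDisk([b"\x07"])  # 0x01 ^ 0x02 ^ 0x04
    raid5_small_write(d1, parity, 0, b"\x08")
    print("RAID5 reads/writes:", d1.reads + parity.reads, d1.writes + parity.writes)

    m0, m1 = CountingDisk([b"\x00"]), CountingDisk([b"\x00"])
    mirror_write([m0, m1], 0, b"\x08")
    print("Mirror reads/writes:", m0.reads + m1.reads, m0.writes + m1.writes)
```

The other data disks in the stripe are not touched at all, which is why 
the parity has to be updated from the old values rather than recomputed.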

On RAID5, any individual read can only be satisfied from the one drive 
holding that block (or, failing that, by reading every other drive in 
the stripe and doing the parity calculation).
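
The two read paths can be sketched like this, again as a toy model with 
XOR parity. The stripe is just a list of equal-sized blocks, one per 
drive (data blocks plus the parity block), and raid5_read is a made-up 
name that also returns how many drives had to be read.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte strings together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def raid5_read(stripe, index, failed=None):
    """Return (block, drives_read) for the block held on drive `index`.

    stripe: one block per drive, data and parity alike.
    failed: index of a failed drive, or None if the array is healthy.
    """
    if index != failed:
        # Normal case: only the drive holding the block can serve the read.
        return stripe[index], 1
    # Degraded case: rebuild the block from every surviving drive.
    survivors = [blk for i, blk in enumerate(stripe) if i != index]
    return xor_blocks(survivors), len(survivors)


if __name__ == "__main__":
    data = [b"\x01\x02", b"\x10\x20", b"\xaa\x55"]
    stripe = data + [xor_blocks(data)]        # append the parity block
    print(raid5_read(stripe, 1))              # healthy: one drive read
    print(raid5_read(stripe, 1, failed=1))    # degraded: three drives read
```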

On a two-drive mirrored set, any single write involves exactly two 
writes (one per drive) and no reads. Any individual read can be 
satisfied from either drive. In a "mostly read" situation, an 
intelligent controller can improve performance further by allocating 
reads to drives based on where the drive heads are going to be left by 
the previous access.
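
As a rough illustration of head-aware read allocation, the sketch below 
sends each read to whichever mirror's head was left closest by its 
previous access. It is a deliberately crude model: head position is 
just the last block address served, seek cost is the distance moved, 
and the function names are invented. Real controllers use their own 
heuristics.

```python
def pick_mirror_for_read(head_positions, target):
    """Index of the mirror whose head is currently closest to `target`."""
    return min(range(len(head_positions)),
               key=lambda i: abs(head_positions[i] - target))


def total_seek_distance(requests, mirrors=2):
    """Serve reads in order and return the total head movement."""
    heads = [0] * mirrors          # every head starts at block 0
    total = 0
    for target in requests:
        drive = pick_mirror_for_read(heads, target)
        total += abs(heads[drive] - target)
        heads[drive] = target      # the head is left where the read ended
    return total


if __name__ == "__main__":
    # Two interleaved read streams, far apart on the disk.
    requests = [100, 9000, 110, 9010]
    print("single drive :", total_seek_distance(requests, mirrors=1))
    print("two mirrors  :", total_seek_distance(requests, mirrors=2))
```

Once each mirror has "adopted" one of the two streams, the heads stop 
swinging back and forth across the disk, which is the gain described 
above.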

So in most use cases, mirroring (and striping for larger capacity) 
will give better performance than RAID5. RAID5 may win where activity 
is mostly streaming long sequential writes or reads.


There is another interesting fact to consider.
In both setups, you can get corruption in the data which is 
invisible to the OS. The RAID controller (or software) can detect 
this if it does a consistency check - but in neither setup can it 
correct it, as there is no way to determine which drive holds the 
incorrect data (I believe RAID6 allows certain errors to be 
attributed to a specific drive).
But mirrored drives will correct a block whenever it is over-written, 
since the same data is written to both drives without reference 
to their previous contents. This is not the case with RAID5, since the 
parity is normally updated from the previous data and parity 
(new parity = old parity XOR old data XOR new data) rather than by 
computing a completely fresh parity across the stripe. Any existing 
error is therefore carried over into the new parity, and no matter how 
many times the data is changed, the stripe stays inconsistent.
If the parity in RAID5 is wrong, then it may never show until a drive 
fails - at which point the erroneous parity will be used to compute 
the missing data, and previously correct data will suddenly go bad, 
possibly many years after the error occurred.
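
The whole sequence can be walked through in a few lines. This is only a 
sketch under simple assumptions (XOR parity, read-modify-write updates, 
one block per drive, invented function names), but it shows the error 
moving from the data into the parity and then surfacing on a different 
drive's data after a failure.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def xor_all(blocks):
    return reduce(xor_bytes, blocks)


def rmw_write(data, parity, index, new_block):
    """Read-modify-write: update data[index] in place, return the new parity."""
    old_block = data[index]        # whatever is on disk, corrupt or not
    data[index] = new_block
    return xor_bytes(xor_bytes(parity, old_block), new_block)


def demo():
    good = [b"AAAA", b"BBBB", b"CCCC"]
    data = list(good)
    parity = xor_all(data)

    data[1] = b"BxBB"                              # silent corruption on drive 1
    parity = rmw_write(data, parity, 1, b"DDDD")   # block is later overwritten

    # The data is now all correct, but the parity has inherited the error.
    stripe_consistent = xor_all(data) == parity

    # Years later drive 0 fails and is rebuilt from the rest of the stripe.
    rebuilt_d0 = xor_all([data[1], data[2], parity])
    rebuild_correct = rebuilt_d0 == good[0]

    # Mirror: one copy corrupt, then both copies overwritten with the same data.
    mirror = [b"BBBB", b"BxBB"]
    mirror = [b"DDDD" for _ in mirror]
    mirror_consistent = mirror[0] == mirror[1]

    return stripe_consistent, rebuild_correct, mirror_consistent


if __name__ == "__main__":
    print(demo())
```

Drive 0 never had an error of its own, yet its rebuilt contents come 
out wrong, while the mirror is clean again after the overwrite.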


-- 
Simon Hobson


