[mythtv-users] Fwd: Further Notice of Seagate Hard Drive Class Action and Proposed Settlement

f-myth-users at media.mit.edu f-myth-users at media.mit.edu
Sat Mar 13 20:49:24 UTC 2010


    > Date: Sat, 13 Mar 2010 11:35:11 -0800
    > From: Manuel McLure <manuel at mclure.org>

    > [ . . . ]

    > I've had severe filesystem corruption happen to me because of bad
    > non-ECC RAM. Hundreds of files corrupted before anything crashed. I
    > would rather have had the machine crash immediately on detection of
    > bad RAM.

    > [ . . . ]

    > Yes, but the fact that the machine was crashing hard would give you
    > the idea that maybe RAM was bad. A corrupt filesystem might send you
    > down the path of checking the disk subsystem instead, or you might not
    > notice the corrupt data for months. And again - earlier crash, less
    > chance of corrupted filesystem data. A crash in this case is a _good_
    > thing.

Yeah, what he said.

I have a machine that corrupts memory---every few hundred gigabits---
if and only if CPU throttling is turned on and only with certain bit
patterns.  Discovered it because I'm anal about checksumming things
and saw it during a disk-mirror transfer that moved hundreds of gig to
another machine; the checksums didn't match when I was done.  Spent a
while trying to figure out if it was a network problem; then tracked
it down to memory behavior and not any I/O subsystem; the fact that it
was a crypto filesystem instantly exonerated the disk, and the fact
that it happened w/IDE, SATA, and USB instantly exonerated the I/O
datapaths.  Some work with md5sum in a loop vs a loop with several-
second delays made it fairly obvious fairly quickly what was going
on...
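
For anyone who wants to try the same kind of test, here is a minimal
sketch in Python of the tight-loop vs. delayed-loop comparison described
above.  It is not the script used on that machine (that was plain md5sum
in a shell loop); the function name `repeated_digests` and its parameters
are made up for illustration.  The idea is just to hash the same data
many times and see whether the digest ever changes.

```python
import hashlib
import time


def repeated_digests(path, rounds=100, delay_s=0.0, chunk_size=1 << 20):
    """Hash the file at `path` `rounds` times and return the set of
    distinct MD5 hex digests seen.

    `delay_s` is an optional pause between rounds (hypothetical knob,
    standing in for the "loop with several-second delays").
    """
    seen = set()
    for _ in range(rounds):
        h = hashlib.md5()
        with open(path, "rb") as f:
            # Read in chunks so large files do not need to fit in one buffer.
            for chunk in iter(lambda: f.read(chunk_size), b""):
                h.update(chunk)
        seen.add(h.hexdigest())
        if delay_s > 0:
            time.sleep(delay_s)
    return seen
```

On healthy hardware the returned set should contain exactly one digest
no matter how the loop is paced.  If a run with `delay_s=0` yields more
than one digest while a run with a delay of a few seconds does not (or
the reverse), that points at something load- or throttling-dependent
rather than at the data itself.  Picking a file small enough to stay in
the page cache presumably helps keep the disk out of the picture, though
that is an assumption about the setup rather than something stated above.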

And then I had to spend ten times the effort to make sure that, in the
couple of months the machine had been in that configuration, it hadn't
silently corrupted files it had stored.

Man, I'd rather have had a memory subsystem that would have told me
something was wrong earlier.  (Even a -crash- would have been better
than silent corruption, but notification would have been better still.)

Cheap commodity hardware.  (And no, this isn't worth the 10x cost
increase of enterprise-class stuff--but it -is- worth having the cheap
commodity hardware actually have the ability to see -some- errors...
Or, of course, expecting motherboard manufacturers to get it right
the first time.  Ha.)

