[mythtv-users] Is this a failing drive?

Keith Pyle kpyle at austin.rr.com
Mon Nov 21 16:12:56 UTC 2011


On 11/21/11 06:00, Manuel McLure wrote:
>
> On Sun, Nov 20, 2011 at 7:40 PM, Don Brett <dlbrett at zoominternet.net> wrote:
>> > This is a little off-topic, but it's a new drive on a new Mythbuntu
>> > installation.  Symptoms are:
>> >
>> > -partition table got corrupted (after about 50 hours on the drive); an
>> > 8-hour low level format got it back
>> > -the box occasionally freezes-up for a few seconds
>> > -I see multiple instances of these errors in /var/log/syslog:
>> >
>> > Nov 20 09:54:02 zedo kernel: [    6.292384] EXT4-fs (sda2): re-mounted.
>> > Opts: errors=remount-ro
>> > Nov 20 09:55:12 zedo kernel: [   81.573425] EXT4-fs (sda2): re-mounted.
>> > Opts: errors=remount-ro,commit=0
>> > Nov 20 09:58:51 zedo kernel: [  300.004041] [Hardware Error]: Machine
>> > check events logged
>> >
>> >
>> > From /var/log/mcelog, I see multiple entries of this:
>> >
>> > mcelog: failed to prefill DIMM database from DMI data
>> > Kernel does not support page offline interface
>> > mcelog: mcelog read: No such device
>> > Hardware event. This is not a software error.
>> > MCE 0
>> > CPU 0 4 northbridge
>> > MISC c008000001000000 ADDR 1844184
>> > TIME 1321843678 Sun Nov 20 21:47:58 2011
>> >   Northbridge NB Array Error
>> >         bit42 = L3 subcache in error bit 0
>> >         bit43 = L3 subcache in error bit 1
>> >         bit46 = corrected ecc error
>> >         bit59 = misc error valid
>> >         bit62 = error overflow (multiple errors)
>> >   memory/cache error 'evict mem transaction, generic transaction, level
>> > generic'
>> > STATUS dc074c60001c017b MCGSTATUS 0
>> > MCGCAP 106 APICID 0 SOCKETID 0
>> > CPUID Vendor AMD Family 16 Model 5
>> > Hardware event. This is not a software error.
>> >
>> >
>> > I replaced the sata drive cables, disconnected the dvd drive, tried a
>> > different power supply.  I also tried another drive on the same box (but
>> > it was an ide) and had no errors.  With a different motherboard...still
>> > got the "re-mount" errors but none of the "[Hardware Error]" entries.
>> > Anyone have a suggestion?
>> >
>> >
>> > PS - the hardware is: (everything but the power supply and case is new)
>> > -ASUS M4A78LT-M AM3 AMD 760G HDMI Micro ATX AMD Motherboard
>> > -SAMSUNG EcoGreen F4 HD204UI 2TB SATA 3.0Gb/s 3.5" Internal Hard Drive
>> > -ADATA Gaming Series 2GB 240-Pin DDR3 SDRAM DDR3 1600 (PC3 12800)
>> > Desktop Memory
>> > -ZOTAC ZT-20203-10L GeForce GT 220 1GB 128-bit DDR2 PCI Express 2.0 x16
>> > HDCP Ready Video Card
>> > -ThermalTake 430 power supply
> It's not a disk problem, that's a CPU or motherboard problem. The disk
> corruption is caused by your memory contents getting corrupted and
> being written to disk.
>
> See http://halobates.de/mce.pdf for details on exactly what a "machine
> check exception" is.
With the caveat that I'm not an expert on this...

I suspect your cache memory or perhaps the Northbridge (handles
communication among cores, possibly RAM, video) has a problem. 
Depending on your specific CPU, the Northbridge may be on the CPU, i.e.,
CPU cores and Northbridge are all in one package.  This is the case for
many recent, mainstream processors from both Intel and AMD.

The errors you included point to ECC errors in the L3 cache.  The log
flags them as corrected, and the overflow bit shows that more than one
occurred.  L3 is special memory where (a limited amount of) recently
used instructions and data are stored for faster access by the CPU than
going to main RAM.  As Manuel wrote, if the cache is corrupted, it could
lead to all manner of intermittent and seemingly random problems,
including those you mentioned.
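
To make the mcelog output a little less opaque, here is a rough Python
sketch that checks which of the bits mcelog listed are actually set in
the STATUS value from your log.  The bit labels are copied from the
mcelog text above.  The meaning of bit 61 ("uncorrected") is my
assumption from the generic x86 machine-check layout, not something
your log states, and the helper names are made up for illustration.

```python
# STATUS value copied from the mcelog entry quoted in this thread.
STATUS = 0xDC074C60001C017B

# Bit labels exactly as mcelog printed them for this event.
LABELS = {
    42: "L3 subcache in error bit 0",
    43: "L3 subcache in error bit 1",
    46: "corrected ecc error",
    59: "misc error valid",
    62: "error overflow (multiple errors)",
}


def set_bits(value):
    """Return the positions of all bits that are set in a 64-bit value."""
    return [bit for bit in range(64) if (value >> bit) & 1]


def decode(value, labels=LABELS):
    """Map each set bit that has a known label to that label."""
    return {bit: labels[bit] for bit in set_bits(value) if bit in labels}


decoded = decode(STATUS)
for bit, text in sorted(decoded.items()):
    print(f"bit{bit} = {text}")

# Assumption: bit 61 is the "uncorrected error" flag in the generic
# x86 machine-check STATUS layout.  It is clear here, which agrees with
# mcelog reporting the ECC error as corrected.
uncorrected = bool((STATUS >> 61) & 1)
print("uncorrected:", uncorrected)
```

All five bits mcelog reported are set in that value, so the decoded text
and the raw register agree with each other.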

If this is a Northbridge/cache problem and the Northbridge is on the CPU
die, then your only fix will be to replace the CPU.  If the Northbridge
is a separate chip on the motherboard, then you'll have to replace the
motherboard but could keep the CPU.

It may be worthwhile to see if Asus will help you if this is a new
motherboard.  (I have no personal experience with Asus support and don't
know if/how they will help.)

Keith

