[mythtv-users] Is this a failing drive?
wrsturm at shaw.ca
Tue Nov 22 14:57:13 UTC 2011
On Tue, 2011-11-22 at 09:04 +0000, Tim Draper wrote:
> On 22 November 2011 03:32, Don Brett <dlbrett at zoominternet.net> wrote:
> > On 11/21/2011 10:29 PM, Don Brett wrote:
> > On 11/21/2011 11:12 AM, Keith Pyle wrote:
> > On 11/21/11 06:00, Manuel McLure wrote:
> > On Sun, Nov 20, 2011 at 7:40 PM, Don Brett <dlbrett at zoominternet.net> wrote:
> > This is a little off-topic, but it's a new drive on a new Mythbuntu
> > installation. Symptoms are:
> > -partition table got corrupted (after about 50 hours on the drive); an
> > 8-hour low level format got it back
> > -the box occasionally freezes-up for a few seconds
> > -I see multiple instances of these errors in /var/log/syslog:
> > Nov 20 09:54:02 zedo kernel: [    6.292384] EXT4-fs (sda2): re-mounted.
> > Opts: errors=remount-ro
> > Nov 20 09:55:12 zedo kernel: [   81.573425] EXT4-fs (sda2): re-mounted.
> > Opts: errors=remount-ro,commit=0
> > Nov 20 09:58:51 zedo kernel: [  300.004041] [Hardware Error]: Machine
> > check events logged
> > From /var/log/mcelog, I see multiple entries of this:
> > mcelog: failed to prefill DIMM database from DMI data
> > Kernel does not support page offline interface
> > mcelog: mcelog read: No such device
> > Hardware event. This is not a software error.
> > MCE 0
> > CPU 0 4 northbridge
> > MISC c008000001000000 ADDR 1844184
> > TIME 1321843678 Sun Nov 20 21:47:58 2011
> >   Northbridge NB Array Error
> >        bit42 = L3 subcache in error bit 0
> >        bit43 = L3 subcache in error bit 1
> >        bit46 = corrected ecc error
> >        bit59 = misc error valid
> >        bit62 = error overflow (multiple errors)
> >   memory/cache error 'evict mem transaction, generic transaction, level
> > generic'
> > STATUS dc074c60001c017b MCGSTATUS 0
> > MCGCAP 106 APICID 0 SOCKETID 0
> > CPUID Vendor AMD Family 16 Model 5
> > Hardware event. This is not a software error.
> > I replaced the SATA drive cables, disconnected the DVD drive, tried a
> > different power supply. I also tried another drive on the same box (but
> > it was an IDE) and had no errors. With a different motherboard...still
> > got the "re-mount" errors but none of the "[Hardware Error]" entries.
> > Anyone have a suggestion?
> > PS - the hardware is: (everything but the power supply and case is new)
> > -ASUS M4A78LT-M AM3 AMD 760G HDMI Micro ATX AMD Motherboard
> > -SAMSUNG EcoGreen F4 HD204UI 2TB SATA 3.0Gb/s 3.5" Internal Hard Drive
> > -ADATA Gaming Series 2GB 240-Pin DDR3 SDRAM DDR3 1600 (PC3 12800)
> > Desktop Memory
> > -ZOTAC ZT-20203-10L GeForce GT 220 1GB 128-bit DDR2 PCI Express 2.0 x16
> > HDCP Ready Video Card
> > -ThermalTake 430 power supply
> > It's not a disk problem, that's a CPU or motherboard problem. The disk
> > corruption is caused by your memory contents getting corrupted and
> > being written to disk.
> > See http://halobates.de/mce.pdf for details on exactly what a "machine
> > check exception" is.
> > With the caveat that I'm not an expert on this...
> > I suspect your cache memory or perhaps the Northbridge (handles
> > communication among cores, possibly RAM, video) has a problem.
> > Depending on your specific CPU, the Northbridge may be on the CPU, i.e.,
> > CPU cores and Northbridge are all in one package. This is the case for
> > many recent, mainstream processors from both Intel and AMD.
> > The errors you included suggest a multi-bit error in the L3 cache. L3
> > is special memory where (a limited amount of) recently used instructions
> > and data are stored for faster access by the CPU than going to main
> > RAM. As Manuel wrote, if the cache is corrupted, it could lead to all
> > manner of intermittent and seemingly random problems, including those
> > you mentioned.
> > If this is a Northbridge/cache problem and the Northbridge is on the CPU
> > die, then your only fix will be to replace the CPU. If the Northbridge
> > is a separate chip on the motherboard, then you'll have to replace the
> > motherboard but could keep the CPU.
> > It may be worthwhile trying to see if Asus will help you if this is a new
> > motherboard. (I have no personal experience with Asus support and don't
> > know if/how they will help.)
> > Keith
> > _______________________________________________
> > mythtv-users mailing list
> > mythtv-users at mythtv.org
> > http://www.mythtv.org/mailman/listinfo/mythtv-users
> > I just noticed that I hadn't included the CPU on the list of hardware; it's
> > an AMD Athlon II X3 445 Rana (3.1GHz Socket AM3 95W Triple-Core Desktop
> > Processor ADX445WFGMBOX). Apparently this processor doesn't have an L3
> > cache (from Tom's Hardware - Rana, triple-core, no L3 cache (2.7+ GHz)). Does
> > that matter, or is the cache on the motherboard?
> Shouldn't really matter.
> > I looked up the features on the motherboard chipset, it has a North Bridge
> > (AMD 760G); I assume that means the cpu does not have an integrated
> > northbridge...right? So it looks like my problem might be with the
> > motherboard.
> > Sidenote - Some other threads implied it might be a memory problem, so I
> > played with it a little. The box started with (2) 2 gig ddr3's. I removed
> > one of the sticks...similar behavior. Replaced that with the other stick
> > (still running with a single stick)...errors increased a lot, 2-3 errors a
> > minute. Does that mean anything?
> Could well be a faulty motherboard. Although I don't have any real-life
> tales, I'd have thought a faulty DIMM slot would also have shown up in the
> memtest you ran.
> > I forgot to mention, I ran memtest86. It showed no errors after 5 passes (it
> > ran for about 4 hours).
> It's not the RAM then.
> The only thing left now is CPU and/or motherboard, so based on other
> people's and your own opinions, I'd also say motherboard.
Perhaps one thing to check is if cpuspeed is running. I had similar
freezes that went away after I removed cpuspeed.
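
For anyone who wants to cross-check the bit list that mcelog printed in the
quoted report, here is a rough Python sketch that picks apart the STATUS value
(dc074c60001c017b). The bit names in NAMES are copied from the mcelog output
above. Treating bit 61 as the "uncorrected error" flag is my assumption, taken
from the general x86 machine-check status layout and not from anything in
this thread. The helper names (set_bits, describe) are made up for this
example.

```python
# STATUS value copied from the mcelog entry quoted earlier in the thread.
STATUS = 0xDC074C60001C017B

# Bit meanings exactly as mcelog printed them in that entry.
NAMES = {
    42: "L3 subcache in error bit 0",
    43: "L3 subcache in error bit 1",
    46: "corrected ecc error",
    59: "misc error valid",
    62: "error overflow (multiple errors)",
}

# Assumption: bit 61 is the "uncorrected error" (UC) flag in the generic
# x86 machine-check status register layout.
UC_BIT = 61


def set_bits(value):
    """Return the positions (0..63) of all bits set in value."""
    return [bit for bit in range(64) if (value >> bit) & 1]


def describe(value):
    """Map each set bit that has a known name to that name."""
    return {bit: NAMES[bit] for bit in set_bits(value) if bit in NAMES}


if __name__ == "__main__":
    for bit, name in describe(STATUS).items():
        print(f"bit{bit} = {name}")
    uncorrected = bool((STATUS >> UC_BIT) & 1)
    print("uncorrected error flag set:", uncorrected)
```

If the bit 61 assumption holds, the flag is clear in this value, which lines
up with mcelog calling it a corrected ECC error. The overflow bit (62) being
set only says more than one error was logged before the register was read.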