[mythtv-users] mythbackend hanging with busy disk

Hika van den Hoven hikavdh at gmail.com
Thu Apr 7 10:12:09 UTC 2016


Hi Simon,

Thursday, April 7, 2016, 11:34:18 AM, you wrote:

> A hardware fault can also manifest like this. In theory a disk
> error should result in timeouts and/or errors and things crashing as
> they can't access what they need, but there are some modes where I
> think faults can cause uninterruptible calls which then block. Once
> that happens, any process which needs to read from disk will block
> and the system will grind to a halt.
> I once had a SCO OpenServer system running on IBM hardware with a
> bug in the RAID card driver. Every now and then it would fail to
> return from an I/O call and then that block on the volume would
> become blocked. Anything trying to access that blocked block would
> then block (in a non-interruptible manner), and one day it hit
> something commonly used - the result "wasn't pretty" 8-O#
> Fortunately it didn't take IBM long to fix the bug. Unfortunately
> this was in the days before fast internet, and it took a day or two
> just to download the new software.
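
To spot processes stuck in such uninterruptible calls, something like
the following might help. It is only a minimal sketch, assuming a Linux
system with /proc mounted; the names parse_state and blocked_processes
are made up for this example. It lists every process whose state letter
is "D" (uninterruptible sleep).

```python
import os


def parse_state(stat_text):
    """Return (command name, state letter) from the text of /proc/<pid>/stat."""
    # The command name sits in parentheses and may itself contain spaces
    # or parentheses, so split around the first '(' and the last ')'.
    open_paren = stat_text.index("(")
    close_paren = stat_text.rindex(")")
    name = stat_text[open_paren + 1:close_paren]
    state = stat_text[close_paren + 1:].split()[0]
    return name, state


def blocked_processes(proc_root="/proc"):
    """Return a list of (pid, name) for processes in state 'D'."""
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return []  # no /proc here (assumption: not a Linux system)
    found = []
    for entry in entries:
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                name, state = parse_state(f.read())
        except (OSError, ValueError, IndexError):
            continue  # the process went away while we were looking
        if state == "D":
            found.append((int(entry), name))
    return found
```

Calling blocked_processes() every few seconds from an already open
session and printing the result should show which processes pile up.
A process in "D" for a moment is normal; the same ones staying there
is the suspicious pattern.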

> If you can, try leaving an SSH session open with top running. If
> the system hangs again you should be able to get some ideas from
> what happens in the terminal session.

> Top is still running: It'll show you what processes are using CPU,
> and most importantly, what the CPU percentages are. Unless you see "wa"
> (I/O wait) go to some large value, it's not processes thrashing the disk.

> Top "freezes": Suggests the system has 'stopped'. Top is likely to
> show you a snapshot of what was going on just prior to stopping.

> Top quits and you get some errors: See what the errors say.
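
If you would rather have a timestamped trail of the "wa" value than
watch top, the same number can be derived from /proc/stat. The
following is a minimal sketch, assuming a Linux kernel that reports the
iowait counter as the fifth value on the aggregate "cpu" line; the names
parse_cpu_line, iowait_percent and watch are made up for this example.

```python
import time


def parse_cpu_line(stat_text):
    """Return the aggregate 'cpu' counters from /proc/stat as a list of ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            # user, nice, system, idle, iowait, irq, softirq, steal
            return [int(value) for value in fields[1:9]]
    raise ValueError("no aggregate cpu line found")


def iowait_percent(before, after):
    """Share of CPU time spent waiting for I/O between two samples."""
    deltas = [new - old for old, new in zip(before, after)]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is iowait


def watch(samples=12, interval=5.0, path="/proc/stat"):
    """Print a timestamped iowait percentage every `interval` seconds."""
    with open(path) as f:
        previous = parse_cpu_line(f.read())
    for _ in range(samples):
        time.sleep(interval)
        with open(path) as f:
            current = parse_cpu_line(f.read())
        percent = iowait_percent(previous, current)
        # flush so the line reaches the SSH terminal right away
        print(time.strftime("%H:%M:%S"), "iowait %.1f%%" % percent, flush=True)
        previous = current
```

Run watch() in the SSH session rather than writing to a file: if the
disk is what hangs, output on the remote terminal is what survives.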

> If you can, having a console (monitor screen) connected will show
> you any messages from the last throes of the system crashing. Issue
> "setterm -blank 0" on a console login to disable the screen blanking
> or you won't be able to see the output. Unfortunately, this will
> only show you the last few lines (one screen's worth) and if the
> system has crashed then you won't be able to scroll back.
> In the extreme, you can configure a serial console and connect it to
> another computer set to capture all serial input.


Besides that, non-reproducible errors can come from any hardware part
becoming marginal or from connections getting dirty. In this case the
most obvious suspects would be memory, the SATA controller, or the disk,
but any part could cause it. These kinds of issues are often temperature
related: when the system gets hotter, they occur more often.
If your system has been running for some time, taking every part
out, dusting the system, and reseating the parts can solve it. But never
use a vacuum cleaner, as the static ...
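
To see whether the hangs line up with heat, the temperatures can be
logged alongside the other output. This is a minimal sketch, assuming a
Linux system that exposes sensors under /sys/class/thermal (not every
board does), where each "temp" file holds millidegrees Celsius; the
names millidegrees_to_celsius and read_temperatures are made up for
this example.

```python
import glob


def millidegrees_to_celsius(raw_text):
    """Convert the text of a sysfs 'temp' file to degrees Celsius."""
    return int(raw_text.strip()) / 1000.0


def read_temperatures(pattern="/sys/class/thermal/thermal_zone*/temp"):
    """Return {path: degrees Celsius} for every readable thermal zone."""
    readings = {}
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path) as f:
                readings[path] = millidegrees_to_celsius(f.read())
        except (OSError, ValueError):
            continue  # sensor missing or not readable right now
    return readings
```

An empty result only means no thermal zones were found at that path,
not that the system runs cool.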


Until the next mail,
  Hika                            mailto:hikavdh at gmail.com

"Without hope you cannot live
Without life there is no hope
The eternal dilemma
Especially when you must destroy hope in order to survive!"

The learning Human


