[mythtv-users] Making the scheduler and/or MySQL thrash the disk less

f-myth-users at media.mit.edu
Wed Sep 13 06:44:52 UTC 2006


The scheduler is trashing my recordings, and I'd like advice on how to
make it stop.  I've taken some rather extreme steps and they've only
helped a little.

When a recording ends, it triggers the scheduler to run.  That run
thrashes the disk hard enough that it leads to IOBOUND errors that
drop 10-30 seconds from all streams being captured at the time.
Since almost everything I record runs with 1-2 minutes of postroll,
this pretty much guarantees that any recording that ends will trash
an in-progress recording 1-2 minutes in.  I'm running with "-v all"
logging, and it's really, really obvious that each scheduler run
starts about 10 seconds before the IOBOUNDs start cropping up.

(Amazingly, checking, optimizing, and mysqldumping the database does
-not- thrash the disk hard enough to cause this---but the scheduler
query -does-.)

If 0.20 thrashes the disk less, that'd be good to know, but somehow I
doubt it.  Can I retune MySQL somehow?  Do something else?
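
For concreteness, here's the sort of my.cnf tweaking I have in mind;
the variable names are standard MySQL 4.1/5.0 knobs, but the values
are untested guesses for this 1 GB box, not anything I've verified
actually helps the scheduler query.  The tmp_table_size /
max_heap_table_size pair seems most relevant, on the theory that if
that query spills temporary tables to disk, they land on hda right
next to the rest of the DB:

    # [mysqld] section of /etc/mysql/my.cnf -- starting points only
    key_buffer          = 32M   # MyISAM index cache
    table_cache         = 128
    sort_buffer_size    = 4M
    query_cache_size    = 16M
    tmp_table_size      = 32M   # keep implicit temp tables in RAM...
    max_heap_table_size = 32M   # ...instead of spilling them to hda

If anyone knows which of these (if any) actually matters for the
scheduler's query, I'd love to hear it.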

Configuration:  MBE w/5 PVR-250's on an MSI K7N2 Delta-L w/AMD 2800+
CPU, 1 GB RAM, and 2 Seagate 200 GB PATA IDE drives on opposite buses
(i.e., /dev/hda and /dev/hdc).  SBE/FE w/1 PVR-350, identical RAM
and drives.  100Mbps switched net between them.  Running Ubuntu
Breezy & Myth 0.18.1; root filesystem (also used by MySQL) is ext3fs;
media filesystems are JFS.  The main recording directory is on the
JFS on the MBE's hdc; the SBE streams over the network via NFS to
that directory.  The OS & MySQL DB are on MBE's hda.

Here's what I've done:
o  Yes, the database is frequently checked & optimized.
o  Maximal buffering (see the wiki, e.g., "options ivtv yuv_buffers=32
   mpg_buffers=16 vbi_buffers=16 pcm_buffers=16 dec_osd_buffers=2"),
   which chews up a rather enormous amount of RAM but still leaves me
   with ~150 MB of free RAM on the MBE (I sent a message a week or two
   ago asking -why- the memory usage was so high, but got no response).
o  The OS (and MySQL) are currently using hda, while the media
   filesystem is on a separate spindle -and- bus (hdc).
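
One lower-level sanity check I should mention:  I'm assuming both
drives really are running in DMA mode, since a drive that has quietly
fallen back to PIO would make any contention far worse.  Something
along these lines (standard hdparm flags) should confirm it:

    hdparm -d /dev/hda /dev/hdc   # show whether using_dma is on
    hdparm -d1 -u1 /dev/hda       # enable DMA/irq-unmasking if it isn't
    hdparm -tT /dev/hdc           # rough single-drive throughput check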

If the machine is otherwise unloaded but recording a single stream,
the scheduler query isn't bad enough to get an IOBOUND.  Even if it's
recording all 6 possible streams (but doing nothing else), the
scheduler won't get an IOBOUND.  (OTOH, before I went to maximal
buffering -and- put the DB on its own spindle, I could get IOBOUNDs by
doing innocent scheduling operations [e.g., in MythWeb, going from
dup-check of "All Recordings" to "Only New Episodes" if that caused
the un- or re-scheduling of a dozen episodes of something], so doing
both the buffering and the second spindle -did- help.)

If, on the other hand, I try copying a file from hdc to hda (I've got
a JFS on each spindle [the hda one isn't used directly by Myth]),
that's enough stress that a scheduler run will cause IOBOUNDs on
running streams.

If enough commflagging jobs are running, they will -also- cause this
problem.  (There -could- be 5 running at once, if most of the tuners
are/were recently in use when the scheduler runs, and apparently
that's enough disk load that it's comparable to a direct hdc-to-hda
copy, even though the commflaggers are mostly running on the SBE, not
the MBE.)

Here's why I'm confused:  Clearly, adding lots and lots of ivtv
buffering isn't quite enough to deal with the peaks.  Fine.  But it
was my impression that putting the DB on a spindle -and- IDE bus (hda)
that was -not- in any way being used for recording (hdc) meant that
hda could -not- starve hdc for disk access!  Yet that seems to be
exactly what's going on here.  It's -better- than when everything was
on hda, but I would have expected it to be entirely gone, and it's
not.
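
If it would help diagnose this, I can watch both spindles during a
scheduler run; I'm assuming the per-device await/%util numbers from
iostat (sysstat package) are enough to show whether it's really hda
that saturates, or whether hdc is somehow getting dragged down too:

    iostat -x 1     # extended per-device stats, sampled every second
    vmstat 1        # cruder: watch the bi/bo and wa columns

That should at least settle whether the contention is purely on hda
or whether hdc itself is stalling.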

When I spent a whole bunch of time last week increasing buffering,
adding a spindle, and running tests, I didn't immediately see this
problem, because my testing regime was (a) start 5 manual recordings
of 10 minutes; (b) start 1 manual recording of 5 minutes, and (c)
check to see if the 5-minute recording's end caused glitches in the
other 5 streams.  Unfortunately, the commflagger doesn't -start- until
5 minutes in, and staggers each start by a minute, so I wasn't provoking
enough disk I/O to get hit by the misbehavior (while I figured I
wouldn't be deliberately doing huge hda<->hdc copies by hand while the
scheduler might be running, I'd forgotten about the commflagger's
behavior).  But under a real load, in which most of the streams get
commflagged, I just got bitten by this---in fact, one recording got
bitten by it at about 1, 1.5, and 2 minutes in, from several
successive scheduler runs as postrolls of 1 and 2 minutes ended!

Things I could try, in increasing order of undesirability:
o  Run only 1 simultaneous commflagger, not 5.  This is annoying
   because when the machine is making lots of simultaneous recordings,
   commflagging can fall -way- behind.  This machine is used for
   research and is often in this position; -at the moment- it probably
   doesn't record so much that a single commflagger couldn't
   eventually keep up, but it might in the future.  And if it weren't
   for the scheduler-caused glitches, it's got plenty of disk, net,
   and CPU power to run those 5 jobs and record on 6 streams just
   fine.  (This also means that, even if it -is- keeping up, it might
   be hours before any particular recording is flagged, whereas right
   now I'm telling it to start commflagging "as soon as possible" (not
   at the end of a recording) and doing a bunch in parallel.)
o  Put the DB (and hence the MBE) on a machine that doesn't have any
   tuners at all [or at least doesn't own the filesystem those tuners
   are writing to]---or perhaps flip things around so the MBE only has
   1 tuner of its own and the SBE has the other 5.  This solution is
   really fragile, because it means losing either of these machines
   kills both dead, and I -do- occasionally have wedges (generally
   caused by the PVR-350 on the FE).  The nice thing about the current
   setup (MBE also has most of the tuners -and- the disk that the data
   is going to) is that a crash of the FE/SBE only takes out a single
   tuner and doesn't affect the rest of the system.  And having to put
   the DB on a -third- machine would be even worse---expensive, hot,
   lots of network traffic, etc etc---all because of that single MySQL
   query.

Assuming that there's nothing I can do to either tune MySQL -or-
change the scheduler query so it doesn't push MySQL so hard:

Can I throttle the flagger's rate somehow?  But I might have to
throttle it so much that I might as well just run 1, or even run
1 that only starts -after- the recording ends (which still bites me
if there are many recordings back-to-back, as is usually the case).
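
The closest thing to a throttle I can come up with (assuming the
flaggers show up as ordinary mythcommflag processes) is dropping
their I/O or CPU priority by hand.  ionice needs the CFQ I/O
scheduler and I don't know whether Breezy's kernel supports it, so
treat this as a sketch rather than something I've tested; it also
only helps flaggers running on the MBE itself, since the SBE's
flaggers hit the disk indirectly via NFS:

    ionice -c3 -p <pid of mythcommflag>   # idle I/O class, if available
    renice 19  -p <pid of mythcommflag>   # else at least lower its CPU
                                          # priority, which slows its reads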

How feasible would it be to, e.g., -suspend- commflagging for a couple
of minutes around each scheduler run?  I don't know if there's any way
for the scheduler (or whatever calls that query) to talk to all the
commflaggers to tell them to hold off for a bit---especially since
those flaggers are mostly running on the SBE, not the MBE.
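
The crudest version I can picture (hostnames here are placeholders
for my two boxes, and again I'm assuming the flaggers appear as
mythcommflag processes) is to SIGSTOP every flagger for a fixed
window whenever a recording ends, since that's what triggers the
scheduler run, and then SIGCONT them afterwards:

    for host in mbe sbe; do ssh $host "pkill -STOP mythcommflag"; done
    sleep 120    # ride out the scheduler run and its disk activity
    for host in mbe sbe; do ssh $host "pkill -CONT mythcommflag"; done

But I don't know of any clean hook in 0.18 that fires exactly when
the scheduler kicks off, so this would have to be driven off the
recording's end (or by watching the log), which feels fragile.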

Any other ideas?

Thanks.

