[mythtv-users] Mooting architecture for a DataDirect replacement

Thu Jun 21 04:33:38 UTC 2007

Michael Jones wrote:
> Ok.. if the SQL only is too "myth specific" try this on for size... :-)
> 
> If the data were in an SQL database, a frontend on the service could  
> EASILY export it to a file, if that's what the user/subscription/ 
> preference whatever decided.  The point is not the format that is  
> used.. since the contents of the SQL database can be output into  
> almost any format imaginable at the drop of a digital hat (usually  
> measured in milliseconds).   Keep the format in most general,  
> manageable format and distribute it as needed.

Sure, there's nothing wrong with using a database, but I think an XMLTV file
would work just as well, and could be much cheaper on server resources to
handle.  The scheme you're suggesting requires the server to figure out what's
changed since a certain date for every transaction.  If the number of requests
gets high, that could be expensive.  It also requires a fair bit of coding.  I
would think rsync would be more efficient for the server.

And if rsync is too expensive, a simple alternative that needs only a little
coding and a plain anonymous ftp or http server could lower the costs.  The
server would store, for each market, a set of difference files sufficient to
bring a client up to date from whenever it last updated.  Every
5 minutes, it could compute and store the changes in the last 5 minutes.  On
the hour, it could delete all the 5-minute files and compute the changes over
the last hour, and at midnight it could delete all the 1-hour files and compute
the differences since midnight yesterday.  This only requires the server to
compute one diff every 5 minutes for each TV market it serves, plus serve a
bunch of (mostly small) ftp/http file requests.
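
To make the rotation concrete, here is a rough Python sketch of the decision
the server would make at each 5-minute tick.  The function name
rotation_plan and the tier labels "5min", "hourly" and "daily" are made up for
illustration, and I'm assuming that midnight, being on the hour as well, also
clears out the 5-minute files.

```python
from datetime import datetime


def rotation_plan(now: datetime) -> dict:
    """Decide which diff tier to compute and which older tiers to delete.

    Hypothetical helper: the tier names are illustrative only.
    """
    if now.minute % 5 != 0:
        raise ValueError("expected to be called on a 5-minute boundary")
    if now.hour == 0 and now.minute == 0:
        # Midnight: one diff covering the whole day replaces the hourly files
        # (assumption: the 5-minute files go too, since it is also on the hour).
        return {"compute": "daily", "delete": ["hourly", "5min"]}
    if now.minute == 0:
        # On the hour: one diff covering the last hour replaces the 5-minute files.
        return {"compute": "hourly", "delete": ["5min"]}
    # Any other tick: just record the changes of the last 5 minutes.
    return {"compute": "5min", "delete": []}
```

The actual diffing and file deletion would hang off the returned plan, once
per TV market.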

Then the client just needs to keep around the last midnight file and the last
on-the-hour file in addition to the latest file.  When it connects to the host,
it just downloads all the files generated since it last connected.  If it
downloaded any on-the-hour diffs, it deletes its local 5-minute file; if it
downloaded any midnight files, it also deletes its local on-the-hour file.  Then
it applies the downloaded diffs, in order, to its latest file, bringing it up
to date.  This shouldn't require much CPU.  And the coding should
just require a couple of reasonably simple shell or python scripts.
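
The client-side bookkeeping could look something like the sketch below.  It
only works out which diffs to fetch and which local files to discard; the
Diff record, the tier labels and plan_update are names I've invented for the
example, and the download and patch steps are left out.

```python
from datetime import datetime
from typing import NamedTuple


class Diff(NamedTuple):
    tier: str            # "daily", "hourly" or "5min" (illustrative labels)
    generated: datetime  # when the server produced this diff


def plan_update(available, last_sync):
    """Return (diffs to fetch, oldest first; local files to discard)."""
    # Fetch everything the server generated since the last connection.
    fetch = sorted((d for d in available if d.generated > last_sync),
                   key=lambda d: d.generated)
    tiers = {d.tier for d in fetch}
    discard = []
    if "hourly" in tiers or "daily" in tiers:
        discard.append("5min")    # local 5-minute file is superseded
    if "daily" in tiers:
        discard.append("hourly")  # local on-the-hour file is superseded too
    return fetch, discard
```

A client that last connected at 03:50 and finds diffs stamped 03:55, 04:00
(hourly) and 04:05 would fetch the last three in that order and discard its
local 5-minute file.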

The whole thing could be made more efficient, too, by using a binary difference
format.  Nobody is going to be editing these XMLTV files by hand, and they
won't be changed by different parties independently, so the redundancy of a
text diff isn't needed.  A binary diff format could contain a
cryptographic hash of the final file, so after application the result could be
verified to be byte-correct.  The file itself could just be a sequence of
simple commands to move forward a number of bytes, delete a number of bytes,
and insert a string.  That would all take a bit more coding, but it's optional.
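
For what it's worth, applying such a binary diff is only a few lines of
Python.  This is a sketch under my own assumptions: the commands are
represented as (name, argument) tuples rather than any real on-disk encoding,
"copy" stands for "move forward a number of bytes", and SHA-256 is just one
possible choice of hash.

```python
import hashlib


def apply_patch(old: bytes, ops, expected_sha256: str) -> bytes:
    """Apply copy/delete/insert commands to old, then verify the result."""
    out = bytearray()
    pos = 0  # current read position in the old file
    for op, arg in ops:
        if op in ("copy", "delete") and pos + arg > len(old):
            raise ValueError("patch runs past the end of the old file")
        if op == "copy":        # move forward: keep the next arg bytes
            out += old[pos:pos + arg]
            pos += arg
        elif op == "delete":    # skip the next arg bytes of the old file
            pos += arg
        elif op == "insert":    # emit new bytes; read position is unchanged
            out += arg
        else:
            raise ValueError(f"unknown command: {op!r}")
    out += old[pos:]            # whatever is left over is kept unchanged
    if hashlib.sha256(out).hexdigest() != expected_sha256:
        raise ValueError("hash mismatch: result is not byte-correct")
    return bytes(out)
```

For example, turning b"<title>News</title>" into b"<title>Weather</title>"
takes the commands copy 7, delete 4, insert b"Weather".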

-- 
Peter Schachte              We hang the petty thieves and appoint the great
schachte at cs.mu.OZ.AU        ones to public office.
www.cs.mu.oz.au/~schachte/      -- Aesop
Phone: +61 3 8344 1338