[mythtv-users] Mooting architecture for a DataDirect replacement

Jay R. Ashworth jra at baylink.com
Wed Jun 27 14:36:40 UTC 2007


On Wed, Jun 27, 2007 at 04:27:34PM +1000, Peter Schachte wrote:
> Jay R. Ashworth wrote:
> > On Sun, Jun 24, 2007 at 01:11:17PM -0400, Rod Smith wrote:
> >> That does bring up a question that needs answering before such a
> >> system could be formally designed: Just what form do updates take?
> > 
> > I was planning on an update message to carry one or more XML wrapped
> > collections of scheduling and program data fields, which might be ADD,
> > REMOVE, or REPLACE (which is actually a special case, and might be
> > dropped -- except that I don't think you can; see below).
> 
> I think you can probably get away with just REPLACE. If this is really
> an ADD, then there won't be anything there already, so treat it as
> replacing emptiness. And why would you ever want to REMOVE an entry?
> If you ever do, then replace it with an entry that says "Dead Air."
> Having only REPLACE makes the process more robust, since it works
> correctly even if the previous ADD got lost.

You're right, here, I think.
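
Just to make the REPLACE-only idea concrete, here's roughly the shape I
had in mind.  The element and attribute names below are placeholders,
not a proposed schema, and the Python is only a sketch of the apply
step: every entry is keyed by channel and start time, and "applying" a
message is an unconditional overwrite of that slot, so an ADD really is
just a REPLACE of emptiness.

# Sketch only: apply a REPLACE-only update message to a local schedule.
# Element and attribute names are placeholders, not a real schema.
import xml.etree.ElementTree as ET

UPDATE = """
<update serial="1042" station="WXYZ">
  <program channel="8.1" start="2007-06-28T20:00" end="2007-06-28T21:00">
    <title>Dead Air</title>
  </program>
</update>
"""

def apply_update(schedule, xml_text):
    # schedule is a dict keyed by (channel, start); a REPLACE overwrites
    # the slot, and an "add" is just a replace of an empty slot
    for prog in ET.fromstring(xml_text).findall("program"):
        key = (prog.get("channel"), prog.get("start"))
        schedule[key] = {
            "end": prog.get("end"),
            "title": prog.findtext("title"),
        }
    return schedule

schedule = {}
apply_update(schedule, UPDATE)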

> Specifically, I don't think replacements want to be like diff entries
> that say "look for something like THIS and replace it with THAT."
> That's just extra overhead, and unnecessary because each program has a
> date, time, and channel that uniquely determines where it belongs. The
> only tricky problem is handling the situation where an update replaces
> part of the time span of an existing program.

Another item to think about.
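
It seems tractable, though: since every entry carries its own start and
end, a replacement that cuts into an existing program can just truncate
or drop whatever it collides with.  A rough sketch, using bare tuples
and minutes-since-midnight instead of any real schema:

# Sketch: drop or truncate existing entries that collide with a new one.
def insert_program(listings, new):
    # listings: list of (start, end, title); new: (start, end, title)
    ns, ne, _ = new
    kept = []
    for start, end, title in listings:
        if end <= ns or start >= ne:
            kept.append((start, end, title))     # no overlap, keep as-is
        else:
            if start < ns:
                kept.append((start, ns, title))  # keep the leading remnant
            if end > ne:
                kept.append((ne, end, title))    # keep the trailing remnant
    kept.append(new)
    return sorted(kept)

day = [(1200, 1260, "Local News"), (1260, 1380, "Movie of the Week")]
day = insert_program(day, (1230, 1290, "Breaking Coverage"))
# -> news truncated to end at 20:30, movie now starts at 21:30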

> > 1) Most updates will come directly out of the automation software
> > (or something which drives it) at the station/network, as they're
> > put in.
>
> This is a really crucial point. This would be really good in terms of
> reliability of the data, but do you have any reason to believe that
> you could convince anyone to actually do this? For one thing, this
> assumes that the machines that do the scheduling are networked. If
> I were managing a TV network, I'd probably not want the scheduling
> machine networked, because I wouldn't want someone to be able to hack
> in and air his diatribe on my network. The ultimate defacement: airing
> a documentary like Outfoxed on Fox, or the movie The Insider on CBS,
> right after 60 Minutes. What fun!

There are issues there, certainly; a lot of that depends on whether
those machines are already *on* a LAN -- and I suspect they are.  I
don't actually think the automation system gets updated manually; I
believe they have station management packages that push it, and those
are actually what we need to target.

I'm sure *some* solution can be found, but I do think that working
downhill from the software companies is the best approach.

> > 2) Some updates may come from external sources (production companies
> > may see fit to send out more comprehensive program descriptions at
> > some point, we might talk TV Barn's Aaron Barnhart into sending out
> > his talk show guest updates as overlays, individuals in specific
> > communities may want to send out flash updates as they hear things,
> > for those who choose to accept updates signed with *their* keys,
> > etc.)
>
> That sort of update may be tricky to handle. The production company
> might have good description info to post but not know when it will be
> aired in each market, particularly for a syndicated show. They'd want
> to send out a message like: year 20, show 47 of Oprah has guests X,
> Y, and Z. Then it's up to someone else to work out which program to
> update. There's also the lifetime issue to think about: what about
> when year 20, show 47 of Oprah is rerun some months later? You'd like
> to get the correct guest list, but you don't want to keep around every
> posting made about every show ever aired to get it.

This right here is probably the best argument that's been made yet for
separating the program and airing data; the problem is how to store the
program data.

It's harder to justify pawning *that* off on the NNTP providers.
Perhaps Google would take on *that* part of the problem?  :-)  Or
perhaps we can create a protocol for semantic tagging of that data, so
that the production companies can publish it on their own websites in a
locatable fashion.
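
To put that split in concrete terms -- every name below is illustrative
only -- picture one table of program records keyed by something like
series/season/episode, and a separate table of airings that merely
point at them.  An Oprah guest-list overlay then touches the program
record once, and any later re-airing picks it up for free:

# Sketch: program metadata kept separately from airing records.
# Keys and field names are illustrative, not a proposed format.
programs = {
    ("Oprah", 20, 47): {"title": "The Oprah Winfrey Show", "guests": []},
}

airings = [
    {"channel": "4.1", "start": "2007-06-28T16:00",
     "program": ("Oprah", 20, 47)},
    # a rerun months later points at the very same program record
    {"channel": "4.1", "start": "2007-11-03T02:00",
     "program": ("Oprah", 20, 47)},
]

def apply_overlay(program_key, guests):
    # an external overlay updates the program record once; every airing
    # that references it, including future reruns, sees the change
    programs[program_key]["guests"] = guests

apply_overlay(("Oprah", 20, 47), ["X", "Y", "Z"])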

Or maybe it really *can't* be done without a central server, as much as
I'd hate to admit that.  I *really* don't want to have to run a central
server.

> I do think there's a strong argument for having a central site
> that collates the incoming schedule information and distributes an
> authoritative schedule to everyone. The central site can have the
> resources to keep gigabytes of old program descriptions around, and
> pluck them from the database when they're being re-aired. And the
> central site can date and sign the authoritative schedule, so everyone
> knows they've got the correct latest schedule as of a certain time. In
> fact, if the central site has a policy about update frequency, clients
> can know they have the absolute latest schedule available.

Yeah, but then again, *someone's* gotta run it.
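
For what it's worth, the *client* side of trusting such a site is
cheap.  Assuming the site publishes a detached GPG signature next to
the schedule and promises a republish interval -- the file names and
the 12-hour figure below are made up -- the check is roughly:

# Sketch: verify a downloaded schedule against the central site's key
# and make sure it isn't stale.  Names and policy numbers are invented.
import os
import subprocess
import time

def schedule_is_trustworthy(schedule_path, sig_path, max_age_hours=12):
    # gpg exits non-zero if the detached signature doesn't check out
    if subprocess.call(["gpg", "--verify", sig_path, schedule_path]) != 0:
        return False
    # crude freshness test: the site promises to republish at least
    # this often, so anything older than that is suspect
    age = time.time() - os.path.getmtime(schedule_path)
    return age < max_age_hours * 3600

if schedule_is_trustworthy("schedule.xml", "schedule.xml.asc"):
    print("schedule verified and fresh")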

> From the central site, the data could be distributed by NNTP, UUCP,
> FTP, HTTP, P2P, or carrier pigeon.

Does RFC 1149 have the bandwidth for that?

> > an easy fix for the common case is just to serial-number the updates;
> > you may not know what you missed from the station, but at least
> > you'll know you missed something
>
> Yeah, but if your server doesn't have the missing posting, how can you
> get it?

Well, there you take me into deep water.  Ask Again Later.  :-)
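
Detecting the hole is at least mechanical on the receiving end -- each
station's feed carries its own serial, and the client just remembers
the last one it applied (field names are placeholders again).  Filling
the hole is the part I don't have an answer for yet:

# Sketch: spot gaps in a station's update stream by serial number.
last_seen = {}   # station id -> last serial applied

def check_serials(station, serial):
    # return the list of serials we apparently missed, if any
    prev = last_seen.get(station)
    missed = list(range(prev + 1, serial)) if prev is not None else []
    last_seen[station] = serial
    return missed

check_serials("WXYZ", 1040)         # first contact, nothing to compare
print(check_serials("WXYZ", 1043))  # -> [1041, 1042]: two updates missed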

Cheers,
-- jra
-- 
Jay R. Ashworth                   Baylink                      jra at baylink.com
Designer                     The Things I Think                       RFC 2100
Ashworth & Associates     http://baylink.pitas.com                     '87 e24
St Petersburg FL USA      http://photo.imageinc.us             +1 727 647 1274

