[mythtv-users] Mooting architecture for a DataDirect replacement

Fri Jun 22 01:46:36 UTC 2007

On Thu, Jun 21, 2007 at 08:19:27PM -0400, Christopher X. Candreva wrote:
> On Thu, 21 Jun 2007, Jay R. Ashworth wrote:
> > I was merely trying to propose an architecture that would make
> > practical the distribution of the load of 200,000 Mythboxen looking for
> > guide data every day.  NNTP would.
> 
> I used to run an NNTP system. I'm going to assume INN or its replacement 
> has gotten better, but it wasn't easy, and there is going to be 
> significantly less experience with it today, especially since most ISPs are 
> outsourcing it.

Centralized traffic could be *run* via the commercial providers.
Local machines wouldn't need to carry but a couple newsgroups, and not
a lot of data traffic.

> There is another distributed-database system in place, complete with local 
> caching and variable cache time: DNS . 

Not built for it.  Our data objects are too big, DNS is synchronous,
and it's also non-flooding, in any real sense.  It doesn't keep the
back data the way NNTP does either.

> The anti-spam community has used DNS for years for real-time blacklists of 
> IP addresses. In its simplest form: let's say you want to know if IP address 
> 1.2.3.4 is something you should accept mail from. Do a DNS lookup on 
> 4.3.2.1.blacklistexample.com. If it returns a value, it's blacklisted. If the 
> name is not found, it's OK. Usually the IP returned is in 127.0.0.x, and the 
> last digit indicates the type of listing.

Sure.
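
For reference, that lookup can be sketched in a few lines of Python. This is
only an illustration under stated assumptions: the zone name is the
placeholder from the quoted example, and the function names
(dnsbl_query_name, listing_code, check) are hypothetical, not any real
blacklist client's API.

```python
import ipaddress
import socket

# Placeholder zone taken from the example above; not a real blacklist.
EXAMPLE_ZONE = "blacklistexample.com"


def dnsbl_query_name(ip: str, zone: str = EXAMPLE_ZONE) -> str:
    """Reverse the IPv4 octets and append the blacklist zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def listing_code(answer: str):
    """Return the last octet of a 127.0.0.x answer, or None otherwise."""
    addr = ipaddress.IPv4Address(answer)
    if addr in ipaddress.IPv4Network("127.0.0.0/24"):
        return int(str(addr).split(".")[-1])
    return None


def check(ip: str, zone: str = EXAMPLE_ZONE):
    """Do the lookup; None means the name was not found (not listed)."""
    try:
        answer = socket.gethostbyname(dnsbl_query_name(ip, zone))
    except OSError:
        # Resolution failed, which this scheme treats as "not listed".
        return None
    return listing_code(answer)
```

Only check() touches the network; the other two functions are pure string
handling, so the name construction can be tested offline.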

> Another example: The Clam AntiVirus project distributed the virus databases 
> via simple HTTP. They were doing HTTP HEAD requests to check if a database 
> was available, but the bandwidth was killing them. I suggested using DNS to 
> publish the current serial number of the database. If that number indicates 
> a new version is available, the updater downloads a new copy of the virus 
> DB. In this way, you can check for new virus signatures every 5 minutes, and 
> it only costs a 58 byte UDP packet.
> 
> I can see a similar method, using DNS to indicate an update is available, 
> triggering the transfer of an updated data set/file.
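
For concreteness, the check being proposed amounts to something like the
sketch below. It assumes the published record is a bare integer serial in a
TXT value, which is a simplification for illustration and not the actual
ClamAV record format. fetch_txt and download are hypothetical callables
supplied by the caller (a DNS TXT lookup and a file transfer, respectively).

```python
def parse_serial(txt_value: str) -> int:
    """Parse the serial from a TXT value, assumed to be a bare integer."""
    return int(txt_value.strip().strip('"'))


def needs_update(local_serial: int, txt_value: str) -> bool:
    """True when the published serial is newer than the local copy."""
    return parse_serial(txt_value) > local_serial


def update_if_needed(local_serial: int, fetch_txt, download) -> int:
    """Run one polling cycle and return the serial held afterwards.

    fetch_txt() returns the published TXT value; download() fetches the
    new data set. Both are hypothetical hooks supplied by the caller.
    """
    published = fetch_txt()
    if needs_update(local_serial, published):
        download()
        return parse_serial(published)
    return local_serial
```

The cheap DNS query runs on every cycle; the expensive transfer runs only
when the serial has moved.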

My intuition is that that's not practical, because the entire idea is
to *deaggregate*: I don't *want* One Big Database that has to process
specific queries for 200K users.  eBay does it pretty well, but they
make a lot more money.

NNTP servers will do all the work for you, leveraging properties
already built into the protocol.
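
The property I mean is flood-fill: a server offers each article to its
peers, and a peer declines anything whose message ID it already holds. Here
is a toy simulation of that behavior, not real NNTP; the server names, the
peering map, and the message ID are all made up for illustration.

```python
from collections import deque


def flood(peers, stores, origin, message_id):
    """Propagate one article by flood-fill and return the transfer count.

    peers maps a server name to the servers it feeds; stores maps a
    server name to the set of message IDs it already holds.
    """
    stores.setdefault(origin, set()).add(message_id)
    queue = deque([origin])
    transfers = 0
    while queue:
        server = queue.popleft()
        for peer in peers.get(server, ()):
            held = stores.setdefault(peer, set())
            if message_id in held:
                continue  # peer already has it, so the offer is declined
            held.add(message_id)
            transfers += 1
            queue.append(peer)  # the peer now offers it onward
    return transfers


# Made-up topology: one hub feeding two servers that both feed a third.
peers = {
    "hub": ["a", "b"],
    "a": ["hub", "c"],
    "b": ["hub", "c"],
    "c": ["a", "b"],
}
stores = {}
sent = flood(peers, stores, "hub", "<guide-20070622@example.invalid>")
```

In this topology the article reaches all four servers in three transfers,
and the origin itself only sends two of them; offering the same message ID
again moves nothing.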

I dunno; maybe I'm way off base; I don't seem to be getting any
traction.  But it seems pretty obvious to me...

Cheers,
-- jra
-- 
Jay R. Ashworth                   Baylink                      jra at baylink.com
Designer                     The Things I Think                       RFC 2100
Ashworth & Associates     http://baylink.pitas.com                     '87 e24
St Petersburg FL USA      http://photo.imageinc.us             +1 727 647 1274