[mythtv-users] How does scraped listings data compare with DataDirect's?

Jake Palmer jakep_82 at hotmail.com
Wed Jun 20 17:10:07 UTC 2007


>From: Yeechang Lee <ylee at pobox.com>
>Reply-To: Discussion about mythtv <mythtv-users at mythtv.org>
>To: MythTV user mailing list <mythtv-users at mythtv.org>
>Subject: [mythtv-users] How does scraped listings data compare 
>with DataDirect's?
>Date: Wed, 20 Jun 2007 09:45:53 -0700
>
>I hope, as we all do, that some kind of solution--likely one involving
>fees--will emerge between now and 1 September for the end of the free
>Zap2It service. However, we should also start planning as if
>DataDirect will go away and no comparable direct feed will appear to
>take its place.
>
>I built my MythTV setup after DataDirect's debut and so have never
>known anything else. Compared to DataDirect.com, scraped data from
>tvlistings.com would lack the following:
>
>* Original air dates for TV shows
>* Episode numbers
>* Category
>* The Credits list is less detailed. For example, the _Arrested
>   Development_ episode "The Sword of Destiny" lists nine actors under
>   "Cast" on DataDirect but only five under "Credits" on
>   tvlistings.com.
>* Miscellaneous data like "Guest Stars," "Executive Producer," ratings
>   stars, and "Directed by" is completely missing.
>
>While distressing, none of these to my untrained eye is a truly
>serious loss. Lack of proper air dates is a pity given that I sort by
>it in Recorded Programs, but sorting by Program ID (which, thankfully,
>*is* available, in the URL) would, I suspect, be a mostly-acceptable
>substitute (not always, because program ID doesn't necessarily
>coincide with air order). As I recently learned here, MythTV doesn't
>actually use episode numbers. Searches, including Power Searches, would
>suffer from the lack of ancillary program data, of course. Perhaps
>IMDb's user-contributed ratings can substitute for the ratings stars.
>
>I suggest that we all start testing scrapers now, and am happy to
>volunteer once one is available to try.
>
>--
>Yeechang Lee <ylee at pobox.com> | +1 650 776 7763 | San Francisco CA US

The problem as I see it has nothing to do with the missing data.  It's 
the game of cat and mouse that will surely ensue once thousands of people 
begin scraping sites.  All it would take is a small tweak to the code on a 
website to break a scraper.  If they do that just once a week, the upkeep 
would be exhausting and the WAF would drop dramatically.  What if you go on 
vacation for 2 weeks and return to discover that nothing recorded because 
the scraper broke?  Ultimately if reliable guide data isn't available, my 
2-1/2 year love affair with MythTV will come to a sad and sudden end.

But for now I'm just holding my breath and waiting for Isaac to weigh in on 
the subject.

Jake



