[mythtv-users] Duplicate matching methods - make program ID an explicit choice

Michael T. Dean mtdean at thirdcontact.com
Thu May 19 18:21:52 UTC 2016


On 05/19/2016 02:12 PM, John Veness wrote:
>
> On May 19, 2016 5:13:01 PM GMT+01:00, "Michael T. Dean" <mtdean at thirdcontact.com> wrote:
>> MythTV has always used program ID if it was there and ignored the
>> specified duplicate matching method.  What you're likely thinking of is
>> the fact you used to think you had control because users of XMLTV
>> grabbers used to not get program IDs.  Now they do (they have for a
>> long, long time, now, but you probably remember the before).  For the
>> XMLTV parser, the title of a show is used to generate a series ID that's
>> the ELF hash of the title.  If a program's XMLTV <episode-num> element's
>> system attribute is "dd_progid", the episode_num is used as the
>> programid (and this should remain this way).  This provides a means for
>> the provider to say, "this value should be used as the definitive
>> episode identifier".  Otherwise, if the <episode-num> system attribute
>> is "xmltv_ns" and if the XMLTV listings contain both a season and
>> episode number and if the season is 35 or less, a program ID is
>> generated by concatenating a category type identifier (MV, EP, SP, or
>> SH), the series ID, the episode number, a single-digit season number (in
>> base 36--1-Z, where Z has a decimal value of 35), and, if applicable,
>> the partnumber and parttotal.  This seems like it would be a good
>> "universal" program ID, but it's not--because the season and episode
>> numbers aren't universally consistent (nor defined--reference Firefly
>> and its out-of-order airing, for one).
>>
>> In fact, IMHO, autogenerating program IDs is just plain wrong.  If you
>> want a quick fix, I'd rather see a patch that removes the program ID
>> generation from the xmltvparser (
>> https://github.com/MythTV/mythtv/blob/master/mythtv/programs/mythfilldatabase/xmltvparser.cpp#L551
>> ).  Then, the motivated user could even add in a duplicate matching
>> mechanism that compares season and episode (and part/parttotal) number
>> (after adding season and episode columns to program data and updating
>> the code to insert them appropriately at listings retrieval) since
>> that's really all the autogenerated program IDs were comparing.***  Now,
>> if you're switching to Schedules Direct (or some other provider that
>> provides a program ID encoded using the dd_progid system), this won't
>> help to fix duplicate matching for previous episodes--but if nothing
>> else, it should convince you that there's no harm removing the broken
>> program ID data from your database.
>>
> [snip]
>> The problem is that requires the user to know too much about the
>> internal workings of MythTV.  It should just do the right thing--meaning
>> it should always use program IDs (since they're defined to be unique
>> identifiers of episodes) if they exist and--ideally, after someone who
>> actually cares/switches listings providers or has multiple listings
>> providers with differing program ID writes a patch to make program ID
>> listings-source-specific--come from the same listings provider.  A quick
>> and messy fix isn't the right solution here--it will just prevent anyone
>> from ever doing things right and we already have plenty of old "good
>> enough for now" garbage in MythTV that still needs to be fixed.  If it's
>> going to be messy, it's better to keep the mess out of MythTV and let
>> the users mess with the data in their database and clear out their old
>> program IDs.
>>
>> So, basically, your concerns over program ID result solely from your
>> experience with a) multiple providers with different program ID values
>> (so we need listings-source awareness in the program ID usage) and b)
>> having program IDs where you shouldn't--all of those XMLTV-parser
>> autogenerated garbage values.
>>
>> Mike
>>
>> *** And, FWIW, there are good reasons to add season and episode to
>> program, so doing this would have other benefits, too.
> Thanks for the explanation of the auto-generated program IDs, Mike. Now you mention it, it does sound familiar, but I had forgotten the details.
>
> I haven't checked the raw output of the uk_rt grabber or the new SD JSON one, but if it turns out to be the case that they both use xmltv_ns season/episode numbering, and if program titles are the same, maybe the autogenerated program IDs would match too? Does anyone else know?

That would be the case.  However, I'd expect that the SD JSON one would 
use the dd_progid system.
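
To illustrate why two grabbers would end up with matching autogenerated
IDs, here is a rough Python sketch of the scheme described in the quoted
text above.  It is not the xmltvparser code: the function names, the
field widths, the plain concatenation of part/parttotal, and the 1-based
season input are all assumptions made for illustration.  Only the overall
recipe (category type + ELF hash of the title + episode + base-36 season
digit + optional part/parttotal, and no ID when the season is above 35)
comes from the description.

```python
from typing import Optional

BASE36_DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def elf_hash(data: bytes) -> int:
    """Classic ELF (PJW-style) string hash, kept to 32 bits."""
    h = 0
    for byte in data:
        h = ((h << 4) + byte) & 0xFFFFFFFF
        g = h & 0xF0000000
        if g:
            h ^= g >> 24
        h &= ~g & 0xFFFFFFFF
    return h


def autogen_program_id(category: str, title: str,
                       season: Optional[int], episode: Optional[int],
                       part: Optional[int] = None,
                       parttotal: Optional[int] = None) -> Optional[str]:
    """Hypothetical helper: build an ID from title and numbering alone.

    season/episode are taken as 1-based here (an assumption).  Returns
    None when no ID would be generated.
    """
    if season is None or episode is None:
        return None  # need both a season and an episode number
    if not 1 <= season <= 35:
        return None  # season must fit in one base-36 digit (1-Z)
    series_id = elf_hash(title.encode("utf-8"))
    # Field widths (08X, 03d) are illustrative, not MythTV's format.
    program_id = f"{category}{series_id:08X}{episode:03d}{BASE36_DIGITS[season]}"
    if part is not None and parttotal is not None:
        program_id += f"{part}{parttotal}"
    return program_id


# Same title and same numbering from two sources: same ID.
from_uk_rt = autogen_program_id("EP", "Firefly", 1, 2)
from_other = autogen_program_id("EP", "Firefly", 1, 2)
print(from_uk_rt == from_other)
```

Since the ID depends only on the title and the numbering, any difference
in either between two sources (a retitled show, or a different idea of
which episode is number 2) produces a different ID.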

> I think my problems previously have been when a program is broadcast both on a channel for which I used XMLTV to get listings and on another where only EIT was available. These seemed to get different IDs.
>

Right, they would.  And although the EIT only gets a programid if it 
includes an authority (where the authority serves the purpose of 
identifying the program ID source), that authority isn't used to 
distinguish whether to use the programid when checking for duplicates.  
So, EIT kind of has a half-way solution in place (the authority is the 
"programid source", but programid isn't ignored for programs with 
differing authorities).
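
For what it's worth, a source-aware check might look roughly like the
Python sketch below.  This is hypothetical and is not what MythTV does
today: the Program class, the is_duplicate function and the "authority"
field as a general programid-source tag are names made up for the
example, and the subtitle-and-description fallback simply stands in for
whatever duplicate matching method the user selected.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Program:
    title: str
    subtitle: str = ""
    description: str = ""
    programid: Optional[str] = None
    authority: Optional[str] = None  # hypothetical "programid source" tag


def is_duplicate(a: Program, b: Program) -> bool:
    """Trust programid only when both IDs come from the same source."""
    same_source = bool(
        a.programid and b.programid
        and a.authority is not None
        and a.authority == b.authority
    )
    if same_source:
        # The IDs are comparable, so they are the definitive answer.
        return a.programid == b.programid
    # IDs missing or from different sources: ignore them and fall back
    # to a text comparison (stand-in for the user's selected method).
    return (
        a.title == b.title
        and bool(a.subtitle) and a.subtitle == b.subtitle
        and bool(a.description) and a.description == b.description
    )


# An XMLTV listing and an EIT listing of the same episode, with IDs
# from different sources (all values here are made up).
xmltv = Program("Firefly", "Serenity", "Pilot.", "EP0001", "xmltv")
eit = Program("Firefly", "Serenity", "Pilot.", "crid://x/1", "eit.example")
print(is_duplicate(xmltv, eit))
```

The point of the design is that differing IDs from different sources no
longer rule out a match, while IDs from the same source still override
the text comparison.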

Mike

