[mythtv-users] Duplicate detection

Jan Ceuleers jan.ceuleers at gmail.com
Tue Sep 20 18:02:45 UTC 2016


On 20/09/16 19:05, Michael T. Dean wrote:
> Well, since you've already determined that dup matching won't work for
> this specific situation--regardless of whether you have scrubbed the
> program IDs--removing program IDs isn't helping.  If you stop removing
> program IDs, you'll get valid dup matching when you have showings from
> the same program ID provider that the program you previously recorded
> used.  Otherwise, your rule-specified method will be used and (assuming
> you choose "subtitle" method) it will be treated as a generic (meaning
> it will be recorded).  You're no worse off than you are now, and you're
> better off when the repeat is on the same program ID source as the
> original recording.
> 
> However, for all other "proper" programs--where there is something that
> can be used for dup matching--it will just work.  The program ID will be
> used when available and when matching authorities are specified;
> otherwise, the method your rule specifies will be used.

Excellent points; thank you very much.
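
For the archive, here is how I read that fallback order, as a rough
Python sketch. This is not MythTV's actual code or schema: the Showing
class, the authority field and is_duplicate() are names made up for
illustration, and I am assuming that a showing with an empty subtitle
counts as "generic" under the "subtitle" method.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Showing:
    """Hypothetical stand-in for a listings entry or a past recording."""
    title: str
    subtitle: str = ""
    program_id: Optional[str] = None
    authority: Optional[str] = None  # assumed: who issued program_id


def is_duplicate(new: Showing, old: Showing) -> bool:
    # Program IDs are only comparable when both showings have one and
    # both IDs were issued by the same authority.
    if (new.program_id and old.program_id
            and new.authority and new.authority == old.authority):
        return new.program_id == old.program_id

    # Otherwise fall back to the rule's own method, here "subtitle".
    if new.title != old.title:
        return False
    if not new.subtitle or not old.subtitle:
        # Generic showing: nothing to match on, so it gets recorded.
        return False
    return new.subtitle == old.subtitle
```

With a movie recorded from provider A, a repeat from provider A is
caught by its program ID, while a repeat from provider B has no usable
program ID and no subtitle, so it is treated as generic and recorded.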

Meanwhile gossamer has caught up and I have added a link to the wiki
page pointing here.

>> I had another thought: a duplicate-matching method based on the inetref
>> field. This wouldn't find duplicates until the metadata has been retrieved,
>> of course,
> 
> Right--and does require a lot of hits against a metadata source (there
> are a lot of episodes in people's program listings and they're replaced
> a lot--daily for about 2 weeks, usually--causing re-retrievals).  This
> might even be so many hits we may not want to encourage it.

I wasn't suggesting querying the metadata sources from within the
scheduler; that would of course be too expensive. Only for upcoming
recordings, and from a cron job or a schedule internal to the backend.

>> and it relies on there being a history of inetrefs employing
>> the current format (i.e. not just the number but also the tmdb3.py_ or
>> ttvdb.py_ prefix). Furthermore, it breaks if a new metadata source is
>> introduced in the future.
> 
> Well, the program ID authorities would fix all of that.

I still don't see how this would result in mythtv being able to catch
duplicates between an upcoming recording on a channel whose listings
data has come from listings provider A versus a past recording on a
channel whose listings data came from listings provider B. Both of these
would have program IDs; their program ID authorities would prevent them
being used for duplicate matching and we'd still fall back to
comparisons of some combination of title and subtitle and/or
description. (IOW the program ID authorities wouldn't "help").

Whereas if they are duplicates they'd have the same inetref.
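
To make that concrete, a minimal Python sketch of the comparison I have
in mind. The function names are hypothetical and nothing here is
existing MythTV code; the only thing taken from MythTV is the inetref
format mentioned above (grabber prefix, underscore, number). Old
inetrefs without a prefix, or inetrefs from two different grabbers,
give "don't know" rather than a match.

```python
from typing import Optional, Tuple


def split_inetref(inetref: str) -> Optional[Tuple[str, str]]:
    """Split 'tmdb3.py_603' into ('tmdb3.py', '603').

    Returns None for the old bare-number format, which carries no
    information about which metadata source issued the number.
    """
    grabber, sep, ident = inetref.partition("_")
    if not sep or not grabber or not ident:
        return None
    return grabber, ident


def inetref_match(upcoming: str, recorded: str) -> Optional[bool]:
    """True/False when the inetrefs are comparable, None when not."""
    a = split_inetref(upcoming)
    b = split_inetref(recorded)
    if a is None or b is None:
        return None  # no prefix: fall back to the rule's method
    if a[0] != b[0]:
        return None  # different metadata sources: cannot compare
    return a[1] == b[1]
```

The listings provider never enters into it, so an upcoming showing
from provider A and a past recording from provider B compare equal as
long as both were looked up with the same grabber.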

> The easiest generally-good approach for this specific issue--the movie
> rule--is the title-only dup matching method.  Again, this might be
> considered if someone went to the trouble of coding it, but no one has
> yet felt sufficient need to actually do the work.

Understood.

As always, thanks for your time and patience, Mike. The work you're doing
as a dev answering questions is at least as helpful as writing code.
Much appreciated.

Jan

