[mythtv] New duplicate check method 'smart'

martin at longhome.co.uk martin at longhome.co.uk
Wed Nov 8 09:22:39 UTC 2006



> Also, feel free to come up with additional fixups that massage
> the EIT data into something coherent. I maintain these as I
> see things that need fixing (like the recentish fixup for 24 eps)

I will do. I'm still finding my way around the code, but getting there.
What was the recent fix you did for 24? There are a few things I've come
across from time to time, but I guess you've always got there before me:
things like the [S] and various other tags enclosed in []. Most of
what's left now seems to be beyond simple text processing (e.g. splitting
sentences); for some of it you would need to refer to existing records to
make decisions.
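
A minimal sketch of the simple end of that text processing, stripping
bracketed tags such as [S] from a description. The name stripBracketTags
is hypothetical and is not an existing MythTV fixup; std::string is used
instead of QString only to keep the example self-contained.

```cpp
#include <string>

// Hypothetical helper (not part of MythTV): drop every "[...]" group
// from an EIT description and tidy up the spacing left behind.
// Note: an unmatched '[' causes the rest of the text to be dropped.
std::string stripBracketTags(const std::string &in)
{
    std::string out;
    int depth = 0;  // > 0 while we are inside a [...] group
    for (char c : in)
    {
        if (c == '[') { ++depth; continue; }
        if (c == ']' && depth > 0) { --depth; continue; }
        if (depth > 0) continue;  // skip the tag body itself
        // Avoid leading or doubled spaces where a tag was removed.
        if (c == ' ' && (out.empty() || out.back() == ' ')) continue;
        out += c;
    }
    // Trim any trailing space left by a tag at the end of the text.
    while (!out.empty() && out.back() == ' ')
        out.pop_back();
    return out;
}
```

This only covers the purely textual case; anything that needs existing
records to decide is out of its reach.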

I thought about doing some kind of intelligent string matching,
scoring candidates by the similarity between strings. This would be fairly
easy to implement in programinfo.cpp. However, part of the duplicate
matching is done in SQL, and that would be really difficult to implement
there (not to mention costly in performance). I guess it would also be kind
of unsafe, too.
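
As a rough illustration of similarity scoring, here is a sketch based on
Levenshtein edit distance. Edit distance is just one possible measure,
chosen here as an assumption; the function names are hypothetical and
nothing like this is claimed to exist in programinfo.cpp.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Levenshtein edit distance, keeping only two rows of the table.
std::size_t editDistance(const std::string &a, const std::string &b)
{
    std::vector<std::size_t> prev(b.size() + 1), cur(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j)
        prev[j] = j;  // distance from the empty prefix of a
    for (std::size_t i = 1; i <= a.size(); ++i)
    {
        cur[0] = i;
        for (std::size_t j = 1; j <= b.size(); ++j)
        {
            std::size_t subst = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
            cur[j] = std::min({prev[j] + 1, cur[j - 1] + 1, subst});
        }
        std::swap(prev, cur);
    }
    return prev[b.size()];
}

// Hypothetical score in [0, 1]; 1.0 means the strings are identical.
double similarity(const std::string &a, const std::string &b)
{
    std::size_t longest = std::max(a.size(), b.size());
    if (longest == 0)
        return 1.0;
    return 1.0 - static_cast<double>(editDistance(a, b)) / longest;
}
```

Any cut-off above which two titles count as the same programme would be
a judgment call, which is where the risk of false matches comes in.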

One I noticed the other day was "X Factor - The Result" (the wife, not
me!): they had included an extra space after the '-', so it nearly didn't
record. Sometimes capitalisation can throw it too. I considered creating
some fields in the db to hold lowercase versions of title, subtitle and
description, so that those can be used for matching while the original
fields are kept for display (or using whatever the SQL syntax is in the
query, LOWER()? Again, performance!), but I reckon I'd struggle to get
such things committed.
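
A minimal sketch of the kind of normalisation that would cover both the
extra-space and the capitalisation cases before comparing. The name
normaliseTitle is hypothetical, and the code assumes plain ASCII text;
real EIT titles would need proper Unicode handling.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: lowercase the text, collapse runs of whitespace
// to a single space, and trim leading and trailing whitespace.
std::string normaliseTitle(const std::string &in)
{
    std::string out;
    bool pendingSpace = false;  // a space is owed before the next character
    for (unsigned char c : in)
    {
        if (std::isspace(c))
        {
            // Only remember a space once some text has been emitted.
            pendingSpace = !out.empty();
            continue;
        }
        if (pendingSpace)
        {
            out += ' ';
            pendingSpace = false;
        }
        out += static_cast<char>(std::tolower(c));
    }
    return out;
}
```

With this, "X Factor - The Result" and the variant with two spaces after
the '-' normalise to the same string, so a plain equality check would
match them.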


> I'm looking forward to seeing this in the wild. As soon as they start
> broadcasting it I'm sitting down for the weekend to write the support
> for it.

Me too, looking forward to this. Let me know if there's anything I can
do to help.





