[mythtv-users] Mythvideo Metadata and Punctuation

Daniel Osborne myth at danielosborne.net
Sat Oct 1 19:32:34 UTC 2011

Thanks for the detailed response.

On Sat, Oct 1, 2011 at 8:16 AM, Raymond Wagner <raymond at wagnerrp.com> wrote:

> On 10/1/2011 01:12, Daniel Osborne wrote:
> > Is there a way to ignore punctuation in the metadata grabber?
> >
> > For example, I have my movies organized as such:
> > Star Trek - First Contact.mkv
> > Star Trek - Generations.mkv
> Correct the punctuation and use colons instead.
> > Now, the actual movie contains a colon (see:
> > http://www.themoviedb.org/movie/199), but I obviously can't use that
> > as a character (interoperability with Windows).
> Sure you can.  Use the 'mangled map' operator to have samba convert
> between the two.

My interoperability with Windows really means that I do a lot of file
management through it. How would name mangling work if I wanted to rename a
file in Windows? I know it won't let me type a colon, but could I reverse-map
something to a colon?

> > I know that I can manually edit the title metadata to remove the
> > hyphen then refetch, and it successfully works. However, I'd like the
> > ability to ignore punctuation in Mythvideo automatically.
> That would be the incorrect way to do things, as many movies actually
> have punctuation in their titles, including hyphens.  Take Wall-E for
> example.  "Wall-E" and "Wall·E" work fine with TMDb.  Meanwhile, "WallE"
> returns a movie called "Walled In", and "Wall E" simply faults.

In your example, "Wall E" works for me (at least from the tmdb script
portion); I'm not sure whether Myth itself has a problem, though.

/usr/share/mythtv/metadata/Movie/tmdb.py -l en -M "Wall E"
<?xml version='1.0' encoding='UTF-8'?>
blah, blah, blah...

I would agree that WallE would (and should) fail; my suggestion wouldn't be
as simple as removing any and all punctuation.
In my head (too bad you can't read minds), I was thinking it would try both
methods (the title as written and the title with the punctuation ignored),
since you are correct that TMDb doesn't consider the hyphen close to a colon,
then pick the closest match from there.
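
As a rough illustration of the "try both, then pick the closest" idea, here
is a minimal Python sketch. The names title_variants and closest_match are
hypothetical, the candidate list stands in for whatever the TMDb lookup
returns, and difflib's similarity ratio is only a convenient stand-in for a
real distance measure. None of this is MythTV code.

```python
import difflib


def title_variants(title):
    """Return the title as given, plus a variant with ' - ' rewritten as ': '."""
    variants = [title]
    if " - " in title:
        variants.append(title.replace(" - ", ": "))
    return variants


def closest_match(title, candidates):
    """Return the candidate most similar to any variant of the title.

    Similarity is difflib's ratio (0.0 to 1.0); this is an assumption made
    for the sketch, not how MythTV scores results.
    """
    best, best_score = None, 0.0
    for variant in title_variants(title):
        for candidate in candidates:
            score = difflib.SequenceMatcher(
                None, variant.lower(), candidate.lower()
            ).ratio()
            if score > best_score:
                best, best_score = candidate, score
    return best
```

Because only the " - " separator is rewritten, a title such as "Wall-E"
keeps its hyphen and produces no extra variant.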

> MythTV filters results based off Levenshtein distance, and the distance
> between your title and the correct title is two.  By default, MythTV
> filters anything above five, so as far as MythTV is concerned, it's a
> valid match.  The tmdb.py script just passes the string onto the API,
> and lets the web API deal with it as it chooses.  The problem is the
> TMDb API does not think a ": " and " - " are sufficiently close to
> return a match.
> > If the devs are interested in a patch, I could write up another one.
> Since your proposed solution would solve your specific problem, but in
> the process cause others, it is not something we could accept.
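
To make the distance argument above concrete, here is a small Python sketch
of the textbook Levenshtein computation. It is not MythTV's implementation;
the function names and the MAX_DISTANCE constant are assumptions, with the
cutoff of five taken from the description above.

```python
MAX_DISTANCE = 5  # assumed cutoff, per the "filters anything above five" remark


def levenshtein(a, b):
    """Number of single-character insertions, deletions, or substitutions
    needed to turn string a into string b (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or keep if equal
            ))
        prev = cur
    return prev[-1]


def is_acceptable(query, result, limit=MAX_DISTANCE):
    """True if the result title is within the assumed distance cutoff."""
    return levenshtein(query, result) <= limit
```

With this, "Star Trek - First Contact" and "Star Trek: First Contact" come
out two edits apart (drop one space, swap the hyphen for a colon), which is
within the cutoff.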

On Sat, Oct 1, 2011 at 8:46 AM, Raymond Wagner <raymond at wagnerrp.com> wrote:

> On 10/1/2011 05:15, Daniel Osborne wrote:
> > A little off topic though, I'd also like it to be able to take the
> > year out of the filename (if any), and use that to match a specific
> > remake. For example:
> > "Alice in Wonderland (2010)"
> Take a look at the code that produces movie titles.
> https://github.com/MythTV/mythtv/blob/master/mythtv/libs/libmythmetadata/videometadata.cpp#L1036
> Right now, it simply truncates anything within any form of braces.
> While I don't have the final say on parsing formats, something that
> handled the specific regular expression "\((0-9){4}\)$" to parse out
> four digit years to allow later filtering of the results would likely be
> acceptable.  Note that due to how that function operates, with a
> position response of "0" telling it to stop looping back through, the
> interface would have to be altered to support new values.

If that's an option, then I may dig into that code and see what I can do.
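
A minimal Python sketch of the year handling discussed above, just to pin
down the behaviour. The quoted pattern "\((0-9){4}\)$" appears to intend a
character class, so this assumes \(([0-9]{4})\)$ with optional whitespace in
front of it. split_year is a hypothetical helper, not the function in
videometadata.cpp (which is C++ and has a different interface).

```python
import re

# Assumed pattern: a four-digit year in parentheses at the very end of the title.
YEAR_SUFFIX = re.compile(r"\s*\(([0-9]{4})\)$")


def split_year(title):
    """Split 'Name (YYYY)' into ('Name', YYYY).

    Returns (title, None) unchanged when there is no trailing year, so other
    parenthesised text is left for the existing truncation logic to handle.
    """
    match = YEAR_SUFFIX.search(title)
    if match is None:
        return title, None
    return title[:match.start()], int(match.group(1))
```

For "Alice in Wonderland (2010)" this yields the bare title plus the year
2010, which could then be used to filter the returned results down to the
right remake.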
