[mythtv-users] commercial flagging idea - commercial "fingerprinting"

Matt skd5aner at gmail.com
Mon Apr 11 21:24:56 UTC 2005


Hello All, =D

Sorry this is long:  For the short version, just read the first two paragraphs.


   I had an idea a few months ago, and just thought I'd throw it out
there.  I think that mythcommflag is getting better and better. 
However, it still goofs up at times, and some shows are notoriously
harder than others to flag correctly because of their distinctive
filming style (Lost, Law & Order, CSI, etc...).

How about a method where users can validate mythcommflag's results and
make that info available to the myth community, so that commercial
flagging gets more accurate by drawing on verified data?


For example: you record an episode of "Lost".  You let mythcommflag
run, and it gets the commercials about 90% correct.  Then you go in
and edit the recording and import the commercial flags by pressing
"Z"... so far, everything like normal.  Once the commercial markings
are in there, you edit them so they are correct (cut off the beginning
commercials before the start of the show, make sure commercials are
detected correctly, and make sure that there aren't any improperly
detected cuts in the middle of the show, etc).

Then, once you have explicitly defined the commercials, some process
could mark these as "Human Verified to be Correct".  Then, the
next time a user runs mythcommflag on that episode, it can check its
own results against those verified results and adjust accordingly.
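
To make the "check and adjust" step a bit more concrete, here is a
minimal sketch in Python.  It is not anything mythcommflag does today.
It assumes commercial breaks are stored as (start_frame, end_frame)
pairs and that the detected and verified lists refer to the same
recording (or have already been lined up); the function names are made
up for illustration.

```python
from typing import List, Tuple

Break = Tuple[int, int]  # (start_frame, end_frame), end exclusive


def overlap(a: Break, b: Break) -> int:
    """Number of frames two breaks have in common."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def agreement(detected: List[Break], verified: List[Break]) -> float:
    """Jaccard overlap between two break lists (1.0 means identical).

    Assumes the breaks within each list do not overlap one another.
    """
    shared = sum(overlap(d, v) for d in detected for v in verified)
    total = sum(e - s for s, e in detected) + sum(e - s for s, e in verified)
    union = total - shared
    return 1.0 if union == 0 else shared / union


def false_positives(detected: List[Break], verified: List[Break]) -> List[Break]:
    """Detected breaks that touch no verified break at all."""
    return [d for d in detected if all(overlap(d, v) == 0 for v in verified)]


# Hypothetical frame numbers, for illustration only.
detected = [(0, 2700), (20000, 21000), (40000, 45400)]
verified = [(0, 2700), (40000, 45000)]
print(false_positives(detected, verified))  # [(20000, 21000)]
```

A low agreement score, or a non-empty false-positive list, would be the
signal to fall back on the verified cut list instead of the detected
one.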

This is where it gets difficult and several different solutions might
be available:

1) Digital fingerprints - Not sure if anyone is familiar with
MusicBrainz (http://www.musicbrainz.org/) or not.  What it does is
take an mp3 file, analyze it, and make a digital fingerprint of it. 
Then, it sends that fingerprint to a database to compare with other
fingerprints that people have submitted.  It can then make a
comparison, correctly identify the file, and populate the ID3 tags
of an otherwise unidentified file.  The more people that ID a file,
the "smarter" it becomes and the better it can do at identifying
files.  The same could work here: once the commercials in a recording
have been correctly marked and some kind of "fingerprint" has been
made from the show, it could be uploaded to a central database.
Others could then look up that show and episode and compare their
mythcommflag results against it, so that mythcommflag can do a better
job based on the verified results sent by users who manually made sure
the commercial detection was correct.

2) Time based - Possibly more difficult, since you have to assume
that the length of the show between commercial breaks is going to be
(nearly) identical for everyone.  So, in almost the same way, if
mythcommflag has a false positive for a commercial, it can review what
others have verified as correct and say "hey, there shouldn't be a
commercial here because 40 other users say there is a time block of 15
minutes, not 7.5 + commercial + 7.5 more", etc.

3) Identify individual commercials (not the entire commercial break) -
Along the same lines as #1, identify specific commercials.  So,
anytime an acme company commercial is recorded, mythcommflag
could/would be able to identify it as a commercial solely because it's
"learned" that it is.  This could have more overhead and be more labor
intensive.  With the constant turnover of advertising, as well as
locally broadcast commercials, it might not be worth it, and it'd be a
ton of information.  However, I could see where this might be useful
for short teaser clips you see at the beginning or the end of certain
shows or that certain networks frequently use.

4) Combination of the above - I think if you went with the
fingerprint approach, a time based solution would almost be included
in that.

5) Other - I'm sure others would have other suggestions that might
work for this.
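
As a rough illustration of option 2, here is a minimal Python sketch.
It assumes each user submits the lengths (in seconds) of the show
segments between breaks, and that a local recording's detected show
segments are (start, end) pairs in seconds.  The function names, the
5-second tolerance, and the sample numbers are all made up for
illustration.

```python
from collections import Counter
from statistics import median


def consensus_segments(submissions):
    """Median segment lengths among submissions with the most common count."""
    if not submissions:
        raise ValueError("need at least one verified submission")
    common_count, _ = Counter(len(s) for s in submissions).most_common(1)[0]
    agreeing = [s for s in submissions if len(s) == common_count]
    return [median(column) for column in zip(*agreeing)]


def merge_false_splits(detected, consensus, tolerance=5.0):
    """Rejoin detected show segments that a bogus break split in two."""
    merged = []
    i = 0
    for target in consensus:
        if i >= len(detected):
            break
        start, end = detected[i]
        i += 1
        # Absorb the next segment while the span is still too short,
        # but never grow past what the consensus says it should be.
        while end - start < target - tolerance and i < len(detected):
            next_end = detected[i][1]
            if next_end - start > target + tolerance:
                break
            end = next_end
            i += 1
        merged.append((start, end))
    merged.extend(detected[i:])
    return merged


# Three users agree on two segments; one submission has a false split.
verified = [[900, 600], [902, 598], [898, 601], [450, 450, 600]]
consensus = consensus_segments(verified)           # [900, 600]
detected = [(0, 440), (470, 900), (1080, 1680)]    # false break at 440-470
print(merge_false_splits(detected, consensus))     # [(0, 900), (1080, 1680)]
```

Using the median rather than the mean keeps one bad submission from
dragging the consensus off, which matters if anyone can upload.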


I know that this is a centralized approach, but seeing as it would
be what I call a "bonus" feature, people that don't want to
participate wouldn't have to.  Perhaps there's a decentralized
approach as well.  Mythcommflag could still work WITHOUT this feature;
it's just something that can reinforce the accuracy of commercial
flagging in myth.  I'm also not sure about the legalities, PC, etc.
that a methodology like this would have, or Isaac's stance on the
issue, as I don't know if it's ever been thought of up to this point.

Anyway, I'm not a developer, nor am I pleading or begging anyone to
develop this... I just think it sounds cool and if anyone has the
know-how or if a few developers think it'd be fun to work on it, I'd
love to help out any way I can.

So, what do all of you think?  I know it could be a large project...

Thanks!
Matt

