[mythtv-users] Extracting closed captions / content analysis

Marc Randolph mrand at pobox.com
Fri Aug 14 20:01:31 UTC 2009


On Fri, Aug 14, 2009 at 2:28 PM, Aaron
Delwiche<carbonel.tigereye at gmail.com> wrote:
> Hi,
> I'm a researcher who studies media content. I would like to create a
> relatively inexpensive system that will allow me to capture video content
> while also storing the closed captions in separate text files. This system
> should have at least four separate tuners, and it should be possible to
> easily skip between searchable transcripts and the actual stored video.
> There are commercial systems that offer this functionality, but they are
> extremely expensive (upwards of $18,000). I would like to come up with a
> shoestring solution that delivers comparable functionality.
> Is MythTV a reasonable platform for this type of project? Is there another
> platform that would be more workable?
> I've done a bit of programming in the past with C# and ASP.NET, and am
> capable of writing some of the code from scratch. However, if there are
> affordable widgets out there that would perform these functions, that would
> definitely be preferable.
> Thanks in advance for your time. If you think this question would be more
> appropriate in a different set of forums, please let me know.
> Aaron Delwiche

Howdy Aaron,

MythTV will easily handle your hardware/capture requirements, at
roughly 20x lower cost (just a ballpark; maybe even cheaper if you're
willing to buy used equipment off eBay).
When you say "skip between searchable transcripts and actual stored
video", you are probably talking about some new functionality that
isn't there yet, but it doesn't sound hard to add.
http://www.mythtv.org/wiki/Closed_captioning mentions that
http://ccextractor.sourceforge.net/ is used to generate .srt or .smi
files, so I assume that all you would need to do is:
1. Pop up a window which displays lots of the caption text (so that
you can see more context than just one line's worth)
2. A text entry box to allow you to type some text to search for
3. A "find next" button to find the next occurrence of the entered
text in the .srt or .smi and jump to that text in the pop-up window
4. A "play this" button to close the pop-up window and jump to that
spot in the video (or else you use "find next" again or alter your
search text).
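
To make the "find next" step concrete, here is a rough Python sketch,
assuming the captions have already been extracted to a standard .srt
file. The names Cue, parse_srt and find_next are made up for this
example; none of them come from MythTV or ccextractor. Actually
seeking the player to the returned timestamp is left out, since I
don't know what hook MythTV offers for that.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cue:
    """One caption entry: start/end offsets in seconds plus its text."""
    start_seconds: float
    end_seconds: float
    text: str


# Matches an .srt timing line such as "00:01:05,250 --> 00:01:08,000".
_TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*"
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def _to_seconds(h: str, m: str, s: str, ms: str) -> float:
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000.0


def parse_srt(content: str) -> List[Cue]:
    """Split .srt text into cues; blocks without a timing line are skipped."""
    cues: List[Cue] = []
    normalized = content.replace("\r\n", "\n").strip()
    for block in re.split(r"\n\s*\n", normalized):  # blank line ends a cue
        lines = block.split("\n")
        for i, line in enumerate(lines):
            match = _TIMING.search(line)
            if match:
                g = match.groups()
                # Everything after the timing line is caption text.
                text = " ".join(p.strip() for p in lines[i + 1:] if p.strip())
                cues.append(Cue(_to_seconds(*g[:4]), _to_seconds(*g[4:]), text))
                break
    return cues


def find_next(cues: List[Cue], query: str, start_index: int = 0) -> Optional[int]:
    """Return the index of the first cue at or after start_index whose
    text contains query (case-insensitive), or None if there is none."""
    needle = query.lower()
    for i in range(start_index, len(cues)):
        if needle in cues[i].text.lower():
            return i
    return None
```

The "find next" button would call find_next() with the index just
past the previous hit, and "play this" would use that cue's
start_seconds as the position to jump to.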

An interesting angle would be to allow the search at a higher level -
not while viewing a particular captured show, but to search across a
user-selectable list of captured shows.  Or maybe that is what you
intended to say to begin with.  Either way, it sounds relatively
straightforward to me.

Adding the ability to search across shows could open up new
possibilities for people trying to find a certain episode of a show
(even though they don't care about CC otherwise).
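
The cross-show search could look something like the Python sketch
below. It assumes each recording already has an .srt file sitting
somewhere on disk; how those files get paired up with recordings in
the MythTV database is not covered, and search_recordings is just a
name invented for this example.

```python
import re
from pathlib import Path
from typing import Iterable, List, Tuple

# Captures the start timestamp of an .srt timing line,
# e.g. "00:05:00,500" from "00:05:00,500 --> 00:05:02,000".
_START = re.compile(r"(\d+:\d{2}:\d{2}[,.]\d{3})\s*-->")


def search_recordings(
    srt_paths: Iterable[str], query: str
) -> List[Tuple[str, str, str]]:
    """Search several .srt files for query (case-insensitive).

    Returns (path, start_timestamp, caption_text) for every matching
    cue, in the order the files were given.
    """
    needle = query.lower()
    hits: List[Tuple[str, str, str]] = []
    for path in srt_paths:
        content = Path(path).read_text(encoding="utf-8", errors="replace")
        normalized = content.replace("\r\n", "\n").strip()
        for block in re.split(r"\n\s*\n", normalized):  # one cue per block
            lines = block.split("\n")
            for i, line in enumerate(lines):
                match = _START.match(line.strip())
                if match:
                    # Join the caption lines that follow the timing line.
                    text = " ".join(
                        p.strip() for p in lines[i + 1:] if p.strip()
                    )
                    if needle in text.lower():
                        hits.append((str(path), match.group(1), text))
                    break
    return hits
```

A plain scan like this rereads every file on each search, which is
probably fine for a few dozen recordings. For a large archive you
would likely want to load the cues into a database or a full-text
index instead.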

Disclaimer: I'm not a Myth developer, and barely play one on my MythTV.

Have fun,

   Marc

