[mythtv-users] How to extract subtitles/captions from DVB-T using ProjectX?

Alex Butcher mythlist at assursys.co.uk
Sat Jan 10 22:08:47 UTC 2009


On Sat, 10 Jan 2009, Alex Butcher wrote:

> On Sat, 10 Jan 2009, UB40D wrote:
>> Me, I'd just like something much simpler: a text file with all the text
>> and nothing more. I have tried selecting an output of "txt" and also an
>> output of "none" but neither has produced the desired text file.
>>
>> Has anyone managed to extract the text? If so, what are the steps?
>
> I think you'll need to individually OCR those bitmaps, and concatenate the
> result. Maybe I'll go have a play with gocr.

Hmmm. Mixed results. gocr would need a well-trained database for the
subtitles in order to be useful. There may be multiple fonts in use,
even within a single national broadcasting system.
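
For anyone who wants to try the same thing, the OCR-and-concatenate step
could be scripted roughly as follows. This is only a minimal sketch, not
something I have tuned: it assumes the subtitle bitmaps have already been
exported into one directory in a format gocr can read (PNM here), that
the filenames sort in display order, and that gocr is on the PATH. The
function names and the "*.pnm" pattern are made up for illustration.

```python
import subprocess
from pathlib import Path


def gocr_text(image):
    """Run gocr on a single image and return whatever text it prints."""
    result = subprocess.run(
        ["gocr", "-i", str(image)],  # -i names the input image file
        capture_output=True,
        text=True,
        check=True,  # raise if gocr exits with an error
    )
    return result.stdout


def concatenate_ocr(directory, pattern="*.pnm", ocr=gocr_text):
    """OCR every matching bitmap in filename order and join the results."""
    chunks = []
    for image in sorted(Path(directory).glob(pattern)):
        text = ocr(image).strip()
        if text:  # skip bitmaps that produced no text at all
            chunks.append(text)
    return "\n".join(chunks)


if __name__ == "__main__":
    import sys

    # Usage: python ocr_subs.py /path/to/bitmaps > subtitles.txt
    print(concatenate_ocr(sys.argv[1]))
```

The ocr argument is there so a different OCR engine can be swapped in
without touching the loop. The output will only be as good as the
recognition itself.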

Best Regards,
Alex.

