<br><br><div class="gmail_quote">On Sat, Jan 10, 2009 at 9:40 PM, Alex Butcher <span dir="ltr">&lt;<a href="mailto:mythlist@assursys.co.uk">mythlist@assursys.co.uk</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div class="Ih2E3d">On Sat, 10 Jan 2009, UB40D wrote:<br>
<br>
> As we know, mythtranscode destroys the DVB-T subtitles/captions (which by<br>
> the way are in the stream as character data, not bitmaps).<br>
<br>
</div>Are you quite sure about that? I was under the impression that they're in<br>
the stream as a series of bitmaps, like DVDs.</blockquote><div><br>In ProjectX, the Presettings > Presettings > Subtitle > Schriftart ("font") drop-down even lets you choose which font the subtitles are rendered in for SUP output, so it's obvious that those bitmaps are being generated on the fly from character data.<br>
</div><div> </div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div class="Ih2E3d">> Has anyone managed to extract the text? If so, what are the steps?<br>
<br>
</div>I think you'll need to individually OCR those bitmaps, and concatenate the<br>
result. Maybe I'll go have a play with gocr.<br>
</blockquote></div><br>Nonono... see above. If it required OCR I wouldn't even think about it.<br><br>