[mythtv] Improving closed captions/subtitles

faginbagin mythtv at hbuus.com
Sat Feb 1 06:03:11 UTC 2014

On 1/31/2014 11:48 PM, Jim Stichnoth wrote:
> On Fri, Jan 31, 2014 at 3:45 PM, faginbagin <mythtv at hbuus.com
> <mailto:mythtv at hbuus.com>> wrote:
>> I'd like to improve the appearance and legibility of 608/708 captions
>> in fixes/0.27. Some of the problems I see are:
>
> Cool, it's nice to have more people looking at this.

Great! I know you're the one who's been most active on this front recently, so I was really hoping you would welcome my efforts.

>> - Default font size is smaller than should be necessary to fit 15
>> lines in a 90% safe area. I estimate the default FontMono pixel size
>> could be increased from 32 to 38 for a 1280x720 display and I think I
>> have a relatively simple algorithm to compute a good font pixel size
>> for a variety of display sizes.
>
> This is something you can set in osd_subtitle.xml, right?

I'll have to experiment with this. There are comments in the code that lead me to believe setting pixelsize would be ignored, but maybe I'm wrong.
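
A minimal sketch of the arithmetic behind the 32-to-38 estimate above. The function names and the 1.125 line-height ratio are assumptions chosen for illustration, not values taken from MythTV or Qt; a real implementation would query the font's actual line spacing (for example via QFontMetrics) rather than assume a ratio.

```python
import math

def row_height(display_height, rows=15, safe_percent=90):
    """Vertical pixels available per caption row inside the safe area."""
    return display_height * safe_percent / (100 * rows)

def font_pixel_size(display_height, rows=15, safe_percent=90,
                    line_height_ratio=1.125):
    """Largest whole pixel size whose line height fits in one row.

    line_height_ratio (line spacing divided by pixel size) is an assumed
    value; it differs from font to font.
    """
    per_row = row_height(display_height, rows, safe_percent)
    return math.floor(per_row / line_height_ratio)

print(row_height(720))       # 43.2 pixels per row on a 720-line display
print(font_pixel_size(720))  # 38 under the assumed ratio
```

Under these assumptions a 1280x720 display leaves 43.2 pixels per row, so a pixel size of 38 still fits where the default of 32 leaves room to spare.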

>> - I suspect support for subtitle zoom was added because of the first
>> problem. Maybe it can be eliminated at some point?
>
> Subtitle zoom is there purely for readability and user preference.
> The best value depends on TV size, viewing distance, the viewer's
> eyesight, and other factors.  This is perhaps the most important
> factor to make or break usability of subtitles, which is why it's one
> of the few settings available to the user instead of being baked into
> the theme.  I'm not really interested in removing it. :)

No problem, I won't rip it out and will make sure I don't break it.

>> - I suspect fudge factors like LINE_SPACING and PAD_HEIGHT were added
>> for the same reason there's subtitle zoom and to work around problems
>> created by subtitle zoom. Same goes for word wrap.
>
> There are several tricky issues here.  The PAD_WIDTH and PAD_HEIGHT
> fudge factors are there because Qt font definitions lie about sizes.
> When text is drawn on the screen, the code has to specify a rectangle
> to draw the text on top of, and any pixels outside of the rectangle
> are clipped.  For the vast majority of cases, the fudge factors keep
> that unsightly clipping from happening.  This also doubles as the
> extra space to add to the black (or whatever) rectangle drawn behind
> the text if specified.

I've been tinkering with a little Qt font test app, so I'm well aware of what Qt says about font metrics. And I think I've got a solution that works well with all but one font. I forget which one, but it's not an attractive font for subtitles anyway.
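
To illustrate the kind of padding described in the quoted text, here is a minimal sketch that grows a text rectangle by a fixed margin on each side so that glyphs overshooting the reported metrics are not clipped. The Rect type, the padded() helper, and the pad values are hypothetical; they are not MythTV's PAD_WIDTH and PAD_HEIGHT definitions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    width: int
    height: int

def padded(rect, pad_w, pad_h):
    """Return rect grown by pad_w on the left and right and by pad_h on
    the top and bottom."""
    return Rect(rect.x - pad_w, rect.y - pad_h,
                rect.width + 2 * pad_w, rect.height + 2 * pad_h)

# Hypothetical text box as reported by the font metrics.
text = Rect(100, 600, 320, 38)
print(padded(text, pad_w=4, pad_h=2))
```

The same enlarged rectangle can serve as the background box drawn behind the text.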

> The other issue is design consistency between the 3 subtitle types -
> cc608, cc708, and srt.  The goal is that with all things being equal,
> the text should look consistent across the 3 subtitle types.  The
> most important factor here is font pixel size.  Since we allow up to
> 17 cc608 lines (I think that number is historical in MythTV) and up
> to 20 cc708 lines, we set the pixel size based on 20 cc708 lines, and
> just add extra line spacing for cc608.  And to confuse matters
> further, there is code that keeps lines from overlapping when the
> pixel size is too large (such as using a high zoom level) and from
> falling outside the safe area.  The consistency result is that if you
> flip between cc608 and cc708 subtitles on the same program, barring
> font differences, it will be very difficult to tell the two apart.  And
> if you run mythccextractor and install one of the resulting .srt
> files, it should also look almost identical to the original.

Can you direct me to anything about the 708 spec, even a summary? The only articles I could find lead me to understand that the spec supports 42 columns by 15 rows when the video has an aspect ratio of 16:9. Otherwise it supports 32 x 15 for 4:3, just like 608. But there's also mention of a coordinate system of 210 x 75 for 16:9 and 160 x 75 for 4:3. Both sets of numbers seem to be chosen because when you divide them by 5, you get 42 x 15 and 32 x 15 respectively. But I have no idea what the factor of 5 is all about. And I haven't found anything suggesting the 708 spec calls for more than 15 rows. So I'd really like confirmation that the 708 spec does allow for up to 20 rows, and whether there are certain conditions that determine how many rows a decoder is expected to display, like aspect ratio affecting the number of columns.
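
The relationship between the two sets of numbers quoted above is plain integer division. This sketch only restates the figures from those articles; it makes no claim about what the factor of 5 means in the spec, and the names are made up for illustration.

```python
# Figures as quoted above: (coordinate width, coordinate height) per aspect ratio.
COORDS = {"16:9": (210, 75), "4:3": (160, 75)}
FACTOR = 5

def text_grid(aspect):
    """Return (columns, rows) obtained by dividing the coordinates by 5."""
    width, height = COORDS[aspect]
    return width // FACTOR, height // FACTOR

for aspect in COORDS:
    print(aspect, text_grid(aspect))
```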

FWIW, the articles I've found most instructive were:
I've also got sample videos from wgbh.org.

> Word wrapping is only done for srt subtitles, and that logic has been
> there for a very long time.  This is motivated by subtitles
> translated into a different language.  Translations are almost always
> longer than the original, causing frequent clipping in the absence of
> word wrapping.  The cc708 spec has a provision for word wrap, but it
> is not implemented in MythTV and I have never seen it in the wild.
> (This is not surprising since pretty much all cc708 subtitle streams
> are automatic conversions from cc608, the most imaginative difference
> being a sans serif font.)

That's very helpful to know. What might be the worst case for word-wrapped srt text, specifically the maximum number of rows? Have you seen it exceed 5 rows of text, for example?

>> - Captions are shifted too far to the left. As a result, it is hard
>> to tell who is speaking when two people are on screen. I believe they
>> should be positioned to center 32 characters per line when the video
>> is 4:3 aspect and 42 characters per line when the video is 16:9,
>> based on what I've gleaned about the 708 standard.
>
> I do think this is a problem in the cc708 stream from the
> broadcaster.  But if MythTV is interpreting the data wrong, I would
> definitely like to know.

Too bad the spec isn't public domain.
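
The centering proposed in the quoted text can be sketched as follows, assuming a monospaced font whose character cell width is known. The column counts come from the discussion above; the function names, the aspect-ratio test, and the example cell width of 23 pixels are assumptions for illustration, not MythTV code.

```python
def columns_for(video_width, video_height):
    """42 columns for 16:9 (or wider) video, 32 for narrower video such as 4:3."""
    return 42 if video_width * 9 >= video_height * 16 else 32

def left_margin(display_width, video_width, video_height, cell_width):
    """X offset that centers a full-width caption row on the display."""
    row_width = columns_for(video_width, video_height) * cell_width
    return (display_width - row_width) // 2

# 16:9 video on a 1280-pixel-wide display: (1280 - 42 * 23) // 2
print(left_margin(1280, 1280, 720, 23))  # 157
```

A caption column index would then be placed at left_margin plus column times cell_width, so a row using all 42 (or 32) columns sits centered.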

>> I've tinkered with osd_subtitle.xml and I don't see how I can fix
>> these problems just by tweaking the theme file. I see that the code
>> involved has diverged in master, so I have backported a number of
>> files in my development environment in the hopes that my work might
>> eventually be considered for inclusion in master. I'm sure I can
>> improve the display of 608/708 captions while simplifying the code at
>> the same time, and I believe I know how to test support for SRT
>> subtitles and make sure I don't break DVD subtitles (I've got both
>> NTSC and PAL DVDs). I am worried that I could break teletext
>> subtitles unless I know more about them and/or have test samples.
>
> I'd be very happy to see your code changes.  SRT files are generally
> very easy to test -- for file foo.mpg, just drop a foo.srt file next
> to it.  If you get your .srt file from mythccextractor, you'll have
> to rename it to follow the pattern.  This layout code is entirely for
> text subtitles, whereas DVD subtitles are bitmaps, so there's almost
> no chance of breaking that.  Text-based teletext subtitles (as
> opposed to the bitmap DVB subtitles) are currently in a completely
> different part of the code - teletextscreen.cpp instead of
> subtitlescreen.cpp.

I'm relieved to hear that DVD, DVB and text-based teletext are elsewhere.
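
The naming pattern described in the quoted text (foo.srt next to foo.mpg) can be sketched like this. The helper names are hypothetical, and the name of the file produced by mythccextractor is left as a parameter because it is not stated here.

```python
from pathlib import Path
import shutil

def srt_path_for(video):
    """Path of the .srt file expected next to the given video file."""
    return Path(video).with_suffix(".srt")

def install_srt(extracted_srt, video):
    """Copy an extracted .srt file to the name that matches the video."""
    target = srt_path_for(video)
    shutil.copyfile(extracted_srt, target)
    return target

print(srt_path_for("foo.mpg"))
```

Copying rather than renaming keeps the original extractor output around for comparison.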

>> My questions are:
>> - Am I off base w.r.t zoom, fudge factors and word wrap? If so,
>> please educate me.
>
> I hope the above clarified it a bit.

Most definitely.

>> - Would there be interest in any work I do to improve the display of
>> 608/708 captions?
>
> Definitely!


>> - If so, can someone direct me to some test videos for teletext, and
>> perhaps advise me on unit test cases I should pay special attention
>> to or haven't mentioned above?
>
> No need to worry about teletext since the code is separate.  Be sure
> to consider that the theme might explicitly set the font pixel size.

Definitely got to play with osd_subtitle.xml some more.

