[mythtv-commits] Ticket #12308: Improve mythcommflag accuracy
MythTV
noreply at mythtv.org
Mon Oct 27 04:58:01 UTC 2014
#12308: Improve mythcommflag accuracy
---------------------------------------+----------------------
      Reporter:  faginbagin <mythtv@…> |      Owner:  cpinkham
          Type:  Patch - Bug Fix       |     Status:  new
      Priority:  minor                 |  Milestone:  unknown
     Component:  MythTV - Mythcommflag |    Version:  0.27.4
      Severity:  medium                |   Keywords:
 Ticket locked:  0                     |
---------------------------------------+----------------------
The patches submitted with this ticket focus on improving accuracy for SD
recordings, but care has also been taken to make sure commflagging for HD
recordings is either unchanged or improved.
In summary, I found and fixed the following problems:
1) Mythcommflag assumed that the number of bytes per scan line for the
luma array equals the video width. That may have been true before
mythcommflag was changed to use ffmpeg's lowres decoding option, but it is
not true when the lowres option is used. The bytes per scan line are
actually the width rounded up to a multiple of 16. Some of the SD
resolutions, when divided by 4 (as is the case for the lowres option), are
not multiples of 16. The patch, mcf-bytesPerLine.patch, fixes this
problem and improves accuracy from 51% to 54%. More on how I came up with
these percentages later.
2) Many years before the lowres option was added to improve performance,
the default value of the CommDetectBorder setting was set to 20. The
setting excludes CommDetectBorder pixels at the top, bottom, left and right
of the frame from examination. It should, for example, exclude any VBI data
that might be visible at the top of a frame or parts of the video that
would be hidden due to overscan. The default value was never adjusted to
account for lowres decoding, nor did it take into consideration the
variety of video widths and heights that exist today. The patch,
mcf-border.patch, assumes the CommDetectBorder setting was established for
video with a height of 720 and scales it by the ratio of the lowres video
height to 720 to compute the border used. For example:
If video source height = 480 then border = 20 * 480 / 4 / 720 = 3
If video source height = 720 then border = 20 * 720 / 4 / 720 = 5
If video source height = 1080 then border = 20 * 1080 / 4 / 720 = 7
mcf-border.patch improves accuracy to 67%.
3) During the "Initial Block pass" where scoring based on different
criteria is applied, there was one test that didn't take into
consideration whether logo detection had failed. The patch,
mcf-nologo.patch, addresses that oversight and improves accuracy to 69%.
4) The CommDetectBorder setting did dual duty, as it was also used to
limit the area examined by logo detection. The mcf-border.patch for
problem 2) did not help a number of SD recordings, because it allowed logo
detection to examine letter-boxed areas of SD video and caused logo
detection to fail on recordings where it used to work. To detect logos, we
need to exclude letter-boxed areas from SD video, but if we exclude too
much from HD video, we'll miss the logo. Using the same border of 16 for
both SD & HD with no scaling seems to be a good compromise. The patch,
mcf-logo-border.patch, adds a new setting, CommDetectLogoBorder, with a
default value of 16 and uses it as the border for logo detection. It
improves accuracy to 72%.
5) Logo detection required the number of pixels in a mask, found by
examining several video frames, to be greater than 50 for the mask to be
considered a valid logo mask. The variety of video resolutions was not
taken into consideration, and the 50 pixel minimum was not adjusted when
lowres decoding was implemented. The patch, mcf-logo-pixInMask.patch,
assumes the 50 pixel value was established for 1280x720 video and uses
that area to compute a minimum pixel value using this formula: 50 *
(width*height) / (1280*720 / 16). It improves accuracy to 75%.
6) Mythcommflag assumed that video can be either normal, letter-boxed or
pillar-boxed, but not both letter-boxed and pillar-boxed. I have found
that video is often both letter-boxed and pillar-boxed, especially in
commercials found in SD recordings and on the SD version of our PBS
channel. The patch, mcf-letter-pillar.patch, changes the enum frameFormat
into a bit mask, allowing the detection of video that's both letter-boxed
and pillar-boxed. This patch did not make any overall change
to accuracy, at least not with the set of recordings I chose to measure
results. But it didn't make things worse, and I do have other recordings
where it helps. I just wasn't prepared to go back and repeat all the tests
with a different set of recordings.
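The arithmetic behind items 1), 2), 5) and 6) can be sketched in a few
lines of Python. This is only an illustration of the formulas described
above, not code taken from the patches: every function, constant and enum
member name is made up for the example, the divisions are assumed to be
integer (truncating) divisions, and the width and height passed to the
logo mask formula are assumed to be the lowres dimensions.

```python
from enum import IntFlag

LOWRES_DIVISOR = 4      # lowres decoding divides width and height by 4
REFERENCE_HEIGHT = 720  # height the default border of 20 is assumed to target


def lowres_stride(source_width, alignment=16):
    """Item 1: bytes per luma scan line, i.e. the lowres width rounded up
    to a multiple of 16."""
    lowres_width = source_width // LOWRES_DIVISOR
    return (lowres_width + alignment - 1) // alignment * alignment


def scaled_border(source_height, setting=20):
    """Item 2: CommDetectBorder scaled by lowres height / 720."""
    return setting * (source_height // LOWRES_DIVISOR) // REFERENCE_HEIGHT


def min_logo_pixels(lowres_width, lowres_height):
    """Item 5: 50 * (width*height) / (1280*720 / 16), assuming the width
    and height are those of the lowres frame."""
    return 50 * (lowres_width * lowres_height) // (1280 * 720 // 16)


class FrameFormat(IntFlag):
    """Item 6: hypothetical bit-mask version of the frame format enum."""
    NORMAL = 0
    LETTERBOX = 1
    PILLARBOX = 2


if __name__ == "__main__":
    # Compare the lowres width with the assumed stride for each source width.
    for width in (528, 640, 704, 720, 1280, 1920):
        lowres = width // LOWRES_DIVISOR
        print(width, lowres, lowres_stride(width))

    # A frame can now carry both flags at once.
    both = FrameFormat.LETTERBOX | FrameFormat.PILLARBOX
    print(FrameFormat.LETTERBOX in both, FrameFormat.PILLARBOX in both)
```

Under these assumptions, only the 528 and 720 wide recordings end up with
a stride that differs from the lowres width (144 vs 132 and 192 vs 180),
which is consistent with only some of the SD resolutions being affected.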
The details:
While trying to gain more insight into how mythcommflag works, I found
debugging code that Chris Pinkham had posted on mythtv-dev but had never
put under source control, even though there was ifdef'ed code referencing
the debugging code. Here's his post:
https://www.mythtv.org/pipermail/mythtv-dev/2006-January/043094.html
The debugging code can be used to visualize logo detection and also to see
video frames as they are processed. I'm glad I found it, because it is
what led to the bytes per scan line discovery. I could clearly see that
some SD recordings were severely skewed. The patch,
mcf-display-debug.patch, adds the debugging code. It is not enabled by
default, but can be enabled by executing these commands in the
mythtv/programs/mythcommflag directory:
$ make clean
$ qmake DEFINES+=SHOW_DEBUG_WIN
$ make
The resulting version of mythcommflag will prompt you to hit enter during
each pass of logo detection. You will also need to hit enter as each video
frame is decoded, although there is no prompt. To disable the debugging
code, execute the following:
$ make clean
$ qmake
$ make
So how did I come up with the accuracy percentages mentioned in my
summary? I wanted to figure out how to measure the impact of any code
changes. So, I picked a set of 20 test recordings, one from each channel I
regularly record. 13 are from SD channels, 7 are from HD channels.
The SD recordings all have a height of 480, but the widths vary: 2 are
528, 1 is 640, 5 are 704 (two of those have some commercials that are
720), and 5 are 720.
11 have an aspect ratio of 4:3 and 2 have an aspect ratio of 16:9.
Of the 4:3 recordings: 5 are letter-boxed and 6 are full-screen.
Of the 16:9 recordings, one is wide-screen and one is pillar-boxed.
The HD recordings come in the two ATSC HD resolutions: 3 are 1280x720p, 4
are 1920x1080i.
I then used the cut list editor to mark commercial breaks. This gave me
something to compare to the commercial marks found by different iterations
of mythcommflag. If a starting commercial mark was within 200 frames of a
starting cut mark, I counted it as a match. And if an ending commercial
mark was within 200 frames of an ending cut mark, I counted it as a match.
The number of matches was divided by the total number of cut marks (both
start and end) to arrive at a measure of accuracy. I used some shell
scripting, a little Java program to do the calculations given the
recordedmarkup entries for a given recording, and a spreadsheet to get the
numbers.
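For illustration, the matching rule described above could be written as
the following Python sketch. It is not the Java program I used. It assumes
each break is given as a (start frame, end frame) pair, that "within 200
frames" includes a difference of exactly 200, and that each cut mark
counts as matched at most once; the function names are made up.

```python
from typing import List, Tuple

TOLERANCE = 200  # frames

Break = Tuple[int, int]  # (start frame, end frame)


def has_match(target: int, candidates: List[int],
              tolerance: int = TOLERANCE) -> bool:
    """True if any candidate mark lies within `tolerance` frames of target."""
    return any(abs(target - mark) <= tolerance for mark in candidates)


def accuracy(cuts: List[Break], flagged: List[Break]) -> float:
    """Matched cut marks divided by the total number of cut marks.

    `cuts` are the hand-made cut list breaks, `flagged` are the breaks
    found by mythcommflag. Starts are only compared with starts, and
    ends only with ends.
    """
    if not cuts:
        return 0.0
    flagged_starts = [start for start, _ in flagged]
    flagged_ends = [end for _, end in flagged]
    matches = sum(has_match(start, flagged_starts) for start, _ in cuts)
    matches += sum(has_match(end, flagged_ends) for _, end in cuts)
    return matches / (2 * len(cuts))


if __name__ == "__main__":
    cuts = [(1000, 4000), (20000, 25000)]
    flagged = [(1100, 4500), (19900, 25150)]
    # Three of the four cut marks have a flagged mark within 200 frames.
    print(accuracy(cuts, flagged))  # 0.75
```

Counting per cut mark keeps the result between 0 and 1 even when
mythcommflag finds extra breaks that are not in the cut list.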
In all tests, I only used the default commercial detection method: "All
Available Methods" (CommercialSkipMethod = 7). Before making any changes,
I collected statistics with "Strict commercial detection" disabled and
enabled (AggressiveCommDetect = 0 and 1). It turns out that "Strict
commercial detection" is more accurate.
For the unpatched mythcommflag, accuracy is 36% with it disabled vs 51%
with it enabled.
For the patched mythcommflag, accuracy is 66% with it disabled vs 75%
with it enabled.
I'm also attaching mcf-combined.patch, which combines all the patches
described above into one patch.
--
Ticket URL: <https://code.mythtv.org/trac/ticket/12308>
MythTV <http://www.mythtv.org>
MythTV Media Center