[mythtv-commits] Ticket #12308: Improve mythcommflag accuracy

Mon Oct 27 04:58:01 UTC 2014

#12308: Improve mythcommflag accuracy
---------------------------------------+----------------------
     Reporter:  faginbagin <mythtv@…>  |      Owner:  cpinkham
         Type:  Patch - Bug Fix        |     Status:  new
     Priority:  minor                  |  Milestone:  unknown
    Component:  MythTV - Mythcommflag  |    Version:  0.27.4
     Severity:  medium                 |   Keywords:
Ticket locked:  0                      |
---------------------------------------+----------------------
 The patches submitted with this ticket focus on improving accuracy for SD
 recordings, but I also took care to make sure that commflagging for HD
 recordings was either unchanged or improved.

 In summary, I found and fixed the following problems:

 1) Mythcommflag assumed that the number of bytes per scan line for the
 luma array equals the video width. That may have been true before
 mythcommflag was changed to use ffmpeg's lowres decoding option, but it is
 not true when the lowres option is used. The bytes per scan line are
 actually the width rounded up to a multiple of 16. Some of the SD
 resolutions, when divided by 4 (as is the case for the lowres option), are
 not multiples of 16. The patch, mcf-bytesPerLine.patch, fixes this
 problem, and improved accuracy from 51% to 54%. More on how I came up with
 these percentages later.
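
 To illustrate the line size issue in 1), here is a minimal Python sketch.
 It is not the mythcommflag code: the helper names are hypothetical, and
 the 16-byte alignment and the divide-by-4 lowres factor are taken from the
 description above. It shows which lowres widths differ from their padded
 line size, and how a pixel would be addressed using that line size:

```python
def bytes_per_line(width, alignment=16):
    """Round width up to the next multiple of alignment (16, per the text)."""
    return (width + alignment - 1) // alignment * alignment


def luma_at(luma, x, y, stride):
    """Index a flat luma buffer using the padded line size, not the width."""
    return luma[y * stride + x]


# Source widths decoded at 1/4 size (a lowres shift of 2 is assumed).
for source_width in (528, 640, 704, 720, 1280, 1920):
    lowres_width = source_width >> 2
    stride = bytes_per_line(lowres_width)
    print(source_width, lowres_width, stride, stride != lowres_width)
```

 Under these assumptions, only the 528 and 720 wide sources end up with a
 padded line size larger than the lowres width.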

 2) Many years before the lowres option was added to improve performance,
 the default value of the CommDetectBorder setting was set to 20. It is used
 to exclude CommDetectBorder pixels from the top, bottom, left and right of
 the frame from examination. It should, for example, exclude any VBI data
 that might be visible at the top of a frame or parts of the video that
 would be hidden due to overscan. The default value was never adjusted to
 account for lowres decoding, nor did it take into consideration the
 variety of video widths and heights that exist today. The patch,
 mcf-border.patch, assumes the CommDetectBorder setting was established for
 video with a height of 720, and scales the setting by the ratio of the
 lowres video height to 720 to compute the border used. For example:
 If video source height =  480 then border = 20 *  480 / 4 / 720 = 3
 If video source height =  720 then border = 20 *  720 / 4 / 720 = 5
 If video source height = 1080 then border = 20 * 1080 / 4 / 720 = 7
 mcf-border.patch improves accuracy to 67%.
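
 The scaling can be sketched in a few lines of Python. This is only an
 illustration of the arithmetic above, not the patch itself: the function
 name and parameters are hypothetical, and truncating integer division is
 an assumption.

```python
def scaled_border(setting, source_height, lowres_shift=2, reference_height=720):
    """Scale a border tuned for reference_height to the lowres frame height."""
    lowres_height = source_height >> lowres_shift  # divide by 4 for lowres
    return setting * lowres_height // reference_height


for height in (480, 720, 1080):
    print(height, scaled_border(20, height))
```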

 3) During the "Initial Block pass" where scoring based on different
 criteria is applied, there was one test that didn't take into
 consideration whether logo detection had failed. The patch,
 mcf-nologo.patch, addresses that oversight, and improves accuracy to 69%.

 4) The CommDetectBorder setting did dual duty, as it was also used to
 limit the area examined by logo detection. The mcf-border.patch for
 problem 2) did not help a number of SD recordings, because it allowed logo
 detection to examine letter-boxed areas of SD video and caused logo
 detection to fail on recordings where it used to work. To detect logos, we
 need to exclude letter-boxed areas from SD video, but if we exclude too
 much from HD video, we'll miss the logo. Using the same border of 16 for
 both SD & HD with no scaling seems to be a good compromise. The patch,
 mcf-logo-border.patch, adds a new setting, CommDetectLogoBorder with a
 default value of 16 and uses it as the border for logo detection. It
 improves accuracy to 72%.

 5) Logo detection required a mask, found by examining several video
 frames, to contain more than 50 pixels to be considered a valid logo mask.
 That threshold did not take the variety of video resolutions into account,
 and it was not adjusted when lowres decoding was implemented. The patch,
 mcf-logo-pixInMask.patch,
 assumes the 50 pixel value was established for 1280x720 video and uses
 that area to compute a minimum pixel value using this formula: 50 *
 (width*height) / (1280*720 / 16). It improves accuracy to 75%.
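
 As a rough Python sketch of that formula (the names are hypothetical, and
 I am assuming here that width and height are the lowres frame dimensions
 and that the division truncates):

```python
REFERENCE_AREA = 1280 * 720 // 16  # 1280x720 decoded at 1/4 width and height
REFERENCE_MIN_PIXELS = 50


def min_pixels_in_mask(width, height):
    """Scale the 50 pixel minimum by the lowres frame area."""
    return REFERENCE_MIN_PIXELS * (width * height) // REFERENCE_AREA


for source_width, source_height in ((720, 480), (1280, 720), (1920, 1080)):
    w, h = source_width >> 2, source_height >> 2
    print(source_width, source_height, min_pixels_in_mask(w, h))
```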

 6) Mythcommflag assumed that video can be either normal, letter-boxed or
 pillar-boxed, but not both letter-boxed and pillar-boxed. I have found
 there are often instances where it can be both letter-boxed and pillar-
 boxed, especially in commercials found in SD recordings and on the SD
 version of our PBS channel. The patch, mcf-letter-pillar.patch, changes the
 enum frameFormat into a bit mask allowing the detection of video that's
 both letter and pillar-boxed. This patch did not make any overall change
 to accuracy, at least not with the set of recordings I chose to measure
 results. But it didn't make things worse, and I do have other recordings
 where it helps. I just wasn't prepared to go back and repeat all the tests
 with a different set of recordings.
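
 The bit mask idea can be shown with a small Python sketch. The member
 names and values are hypothetical and are not the ones used in the patch.
 The point is only that two independent flags can be set at the same time,
 which a plain enumeration cannot express:

```python
from enum import IntFlag


class FrameFormat(IntFlag):
    """Hypothetical bit mask version of a frame format enum."""
    NORMAL = 0
    LETTERBOX = 1
    PILLARBOX = 2


def classify(has_top_bottom_bars, has_side_bars):
    """Combine two independent bar detections into one format value."""
    fmt = FrameFormat.NORMAL
    if has_top_bottom_bars:
        fmt |= FrameFormat.LETTERBOX
    if has_side_bars:
        fmt |= FrameFormat.PILLARBOX
    return fmt


both = classify(True, True)
print(bool(both & FrameFormat.LETTERBOX), bool(both & FrameFormat.PILLARBOX))
```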

 The details:

 To gain more insight into how mythcommflag works, I found debugging code
 that Chris Pinkham had posted on mythtv-dev but had never put under
 source control, even though there was ifdef'ed code referencing the
 debugging code. Here's his post:
 https://www.mythtv.org/pipermail/mythtv-dev/2006-January/043094.html
 The debugging code can be used to visualize logo detection and also to see
 video frames as they are processed. I'm glad I found it, because it is
 what led to the bytes per scan line discovery. I could clearly see that
 some SD recordings were severely skewed. The patch,
 mcf-display-debug.patch, adds the debugging code. It is not enabled by
 default, but can be, by executing these commands in the
 mythtv/programs/mythcommflag directory:
 $ make clean
 $ qmake DEFINES+=SHOW_DEBUG_WIN
 $ make
 The resulting version of mythcommflag will prompt you to hit enter during
 each pass of logo detection. You will also need to hit enter as each video
 frame is decoded, although there is no prompt. To disable the debugging
 code, execute the following:
 $ make clean
 $ qmake
 $ make

 So how did I come up with the accuracy percentages mentioned in my
 summary? I wanted to figure out how to measure the impact of any code
 changes. So, I picked a set of 20 test recordings, one from each channel I
 regularly record. 13 are from SD channels, 7 are from HD channels.

 The SD recordings have a mix of resolutions. All have a height of 480, but
 the widths vary: 2 are 528, 1 is 640, 5 are 704 (two of those have some
 commercials that are 720), and 5 are 720.
 11 have an aspect ratio of 4:3, 2 have 16:9 aspect ratio.
 Of the 4:3 recordings: 5 are letter-boxed and 6 are full-screen.
 Of the 16:9 recordings, one is wide-screen and one is pillar-boxed.

 The HD recordings come in the two ATSC HD resolutions: 3 are 1280x720p, 4
 are 1920x1080i.

 I then used the cut list editor to mark commercial breaks. This gave me
 something to compare to the commercial marks found by different iterations
 of mythcommflag. If a starting commercial mark was within 200 frames of a
 starting cut mark, I counted it as a match. And if an ending commercial
 mark was within 200 frames of an ending cut mark, I counted it as a match.
 The number of matches was divided by the total number of cut marks (both
 start and end) to arrive at a measure of accuracy. I used some shell
 scripting, a little Java program to do the calculations given the
 recordedmarkup entries for a given recording, and a spreadsheet to get the
 numbers.
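
 For illustration, the measure can be sketched in Python as follows. This
 is not the Java program mentioned above. The data layout is hypothetical,
 and I am assuming that "within 200 frames" is inclusive and that each cut
 mark counts at most once:

```python
def accuracy(cut_marks, comm_marks, tolerance=200):
    """Fraction of cut marks matched by a flagged mark of the same kind.

    Each mark is a (kind, frame) tuple, where kind is "start" or "end".
    """
    if not cut_marks:
        return 0.0
    matches = 0
    for kind, frame in cut_marks:
        if any(k == kind and abs(f - frame) <= tolerance for k, f in comm_marks):
            matches += 1
    return matches / len(cut_marks)


# Made-up frame numbers, for demonstration only.
cuts = [("start", 1000), ("end", 5000), ("start", 20000), ("end", 24000)]
flags = [("start", 1100), ("end", 5500), ("start", 20050), ("end", 24190)]
print(accuracy(cuts, flags))
```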

 In all tests, I only used the default commercial detection method: "All
 Available Methods" (CommercialSkipMethod = 7). Before making any changes,
 I collected statistics with "Strict commercial detection" disabled and
 enabled (AggressiveCommDetect = 0 and 1). Turns out that "Strict
 commercial detection" is more accurate.
 For the unpatched mythcommflag, accuracy is 36% (disabled) vs 51% (enabled).
 For the patched mythcommflag, accuracy is 66% (disabled) vs 75% (enabled).

 I'm also attaching mcf-combined.patch, which combines all the patches
 described above into one patch.

--
Ticket URL: <https://code.mythtv.org/trac/ticket/12308>