[mythtv] Comm Flag questions/confusion

Tue Sep 4 22:56:46 UTC 2012

On 2012-09-04 22:11, Bryan wrote:
> It's entirely possible I'm reading it wrong.  I'm not familiar with
> video stuff, but even given what you've taught me about pixel formats, I
> still think it's broken.
>
> Here's a slightly simplified version of the Logo Detector (
> https://github.com/MythTV/mythtv/blob/master/mythtv/programs/mythcommflag/ClassicLogoDetector.cpp#L524 ):
>
>      unsigned char *buf = videoFrame->buf;
>      unsigned char pixel;
>      unsigned int pos, x, y;
>
>      for (y = 0; y < videoFrame->height; y++)
>      {
>          for (x = 0; x < videoFrame->width; x++)
>          {
>              pos = y * width + x;
>              pixel = buf[pos];
>
>              // ... use pixel ...
>          }
>      }
>
> I don't see how that can work properly unless the video frame is 8bpp.
>
> That's the easy one to understand.  The rest of the ClassicCommDetector
> code (including the scene change detector) does something a little
> weirder.  It skips ahead and only looks at every 4th byte.  It seems like
> that code would work (examining 1/2 of the data) if the video data was
> stored in 16bpp (or 12bpp with 4 bits of padding).
>
> Here's a simplified version of the Histogram thing used in the
> SceneChangeDetector (
> https://github.com/MythTV/mythtv/blob/master/mythtv/programs/mythcommflag/Histogram.cpp#L32 )
>
>      for(unsigned int y = 0; y < videoFrame->height; y += 4)
>      {
>          for(unsigned int x = 0; x < videoFrame->width; x += 4)
>          {
>              data[frame[y * frameWidth + x]]++;
>              numberOfSamples++;
>          }
>      }
>
> ... I don't see how this could work unless the data is byte-aligned,
> which it isn't for me. I am getting 12-bit packed video data.  Perhaps
> it was at one time padded to 16-bits, or a different format used? That
> would make this code work and simply skip every other pixel.

As Raymond suggests below, it's only the gray-scale data that is being 
read in the examples above.

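The thing is that with YV12 all 480*272 Y samples come first in one
sequence, then 480*272/4 samples of V, and then 480*272/4 samples of U.
Each sample is one byte, so buf[y * width + x] always lands on the Y
sample of pixel (x, y) and never touches the chroma data.
YUV420P is identical except that the order of V and U is reversed.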
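
To make the layout concrete, here is a minimal sketch of how the plane
offsets fall out of the frame dimensions. This is my own illustration, not
MythTV code: the struct and function names are hypothetical, and it assumes
a tightly packed buffer (no row padding) with even width and height.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not part of MythTV: byte offsets of the three
// planes in a tightly packed YV12 buffer (assumes no row padding).
struct Yv12Layout
{
    std::size_t yOffset;   // start of the luma (Y) plane
    std::size_t vOffset;   // start of the V plane (V comes before U in YV12)
    std::size_t uOffset;   // start of the U plane
    std::size_t totalSize; // bytes occupied by all three planes
};

Yv12Layout yv12Layout(std::size_t width, std::size_t height)
{
    // Chroma is subsampled by two in each direction, so this sketch
    // only handles even dimensions.
    assert(width % 2 == 0 && height % 2 == 0);

    const std::size_t ySize = width * height;             // one byte per pixel
    const std::size_t cSize = (width / 2) * (height / 2); // one byte per 2x2 block
    return { 0, ySize, ySize + cSize, ySize + 2 * cSize };
}
```

For a 480x272 frame this gives offsets 0, 130560 and 163200, and a total of
195840 bytes, i.e. 12 bits per pixel on average even though every sample is
a whole byte.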

> Here's a gdb print of a typical VideoFrame while either of these pieces
> of code is executing.  The original video is 1080p, but with
> kDecodeLowRes 480x272 comes out:
> $1 = {codec = FMT_YV12, buf = 0x7cbe40 "", width = 480, height = 272,
> aspect = 1.77777779,
>    frame_rate = -1, bpp = 12, size = 195842, frameNumber = 3040,
> timecode = 74484127,
>    disp_timecode = 101306, priv = {0x0, 0x0, 0x0, 0x0}, qscale_table =
> 0x0, qstride = 0,
>    interlaced_frame = 1, top_field_first = 1, repeat_pict = 0, forcekey
> = 0, dummy = 0, pitches = {480,
>      240, 240}, offsets = {0, 130560, 163200}, pix_fmt = 0}

Notice the offsets: 130560 = 480*272 is where the V plane starts, and
163200 = 480*272 + 480*272/4 is where the U plane starts.
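
As a sanity check on the sampling loop quoted earlier, the sketch below
rebuilds it in a standalone form and records the largest index it touches.
The function name and the maxIndex out-parameter are my own additions for
illustration, and it assumes the luma pitch equals the width (as
pitches[0] = 480 does in the dump above).

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the quoted histogram loop: samples every 4th
// pixel of every 4th row, indexing from the start of the frame buffer.
// Assumes the luma pitch equals width (no row padding).
std::array<unsigned, 256> lumaHistogram(const unsigned char *buf,
                                        std::size_t width,
                                        std::size_t height,
                                        std::size_t *maxIndex)
{
    assert(buf != nullptr);

    std::array<unsigned, 256> hist{}; // zero-initialised bins
    std::size_t highest = 0;

    for (std::size_t y = 0; y < height; y += 4)
    {
        for (std::size_t x = 0; x < width; x += 4)
        {
            const std::size_t pos = y * width + x;
            hist[buf[pos]]++;          // one byte = one Y sample
            if (pos > highest)
                highest = pos;
        }
    }

    if (maxIndex)
        *maxIndex = highest;           // largest byte offset that was read
    return hist;
}
```

On a 480x272 frame the largest index read is 268*480 + 476 = 129116, which
is below the V offset of 130560, so the loop only ever sees Y samples.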

> Note that size is within 2 bytes of width*height*12/8, implying that the
> buffer is not padded to 16bpp.
>
> My doubts come from the fact that the code works at all as it is now.  I feel
> like I must be missing something because if it is actually doing what I
> think, then I'd expect almost complete garbage to come out, but it
> actually works pretty well.
>
> Maybe I ought to just do it and submit a patch and then whoever looks at
> it can verify I didn't do something horribly stupid?
>
> On Mon, Sep 3, 2012 at 10:07 PM, Raymond Wagner <raymond at wagnerrp.com>
> wrote:
>
>     On 9/3/2012 21:07, Bryan wrote:
>
>         However, all of the "Classic" commflag code treats the data as
>         if it is 8bpp (YUV420P is 12).
>
>
>     That's not exactly true. YUV420P still stores 24 bits of data for
>     each pixel. The gray (Y) channel is at full resolution, while
>     the color (U and V) are both at one quarter resolution, half in both
>     directions. The color data is applied across a 2x2 square of gray
>     pixels. Is it possible you are misinterpreting the code reading a
>     stream of gray-scale data?
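
The 2x2 sharing Raymond describes can be sketched as a per-pixel lookup.
Again this is only an illustration under assumptions: the Yuv struct and
pixelAt function are hypothetical names, the buffer is taken to be tightly
packed YV12 (Y, then V, then U), and width and height are even.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical types for illustration only, not MythTV API.
struct Yuv
{
    unsigned char y;
    unsigned char u;
    unsigned char v;
};

// Fetch the Y, U and V samples that apply to pixel (x, y) in a tightly
// packed YV12 buffer. Each chroma sample is shared by a 2x2 block of
// luma samples, hence the divisions by two.
Yuv pixelAt(const unsigned char *buf, std::size_t width, std::size_t height,
            std::size_t x, std::size_t y)
{
    assert(x < width && y < height);

    const std::size_t ySize  = width * height;
    const std::size_t cWidth = width / 2;
    const std::size_t cSize  = cWidth * (height / 2);

    const unsigned char *yPlane = buf;            // full resolution
    const unsigned char *vPlane = buf + ySize;    // V comes first in YV12
    const unsigned char *uPlane = vPlane + cSize; // then U

    const std::size_t cPos = (y / 2) * cWidth + (x / 2);
    return { yPlane[y * width + x], uPlane[cPos], vPlane[cPos] };
}
```

Pixels (0,0), (1,0), (0,1) and (1,1) all map to the same cPos, which is how
full-resolution gray plus quarter-resolution colour adds up to 12 bits per
pixel on average.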

--
Jonatan