[mythtv] Comm Flag questions/confusion

Mark Spieth mark at digivation.com.au
Tue Sep 4 22:54:53 UTC 2012


On 9/5/2012 6:11 AM, Bryan wrote:
> It's entirely possible I'm reading it wrong.  I'm not familiar with 
> video stuff, but even given what you've taught me about pixel formats, 
> I still think it's broken.
>
> Here's a slightly simplified version of the Logo Detector ( 
> https://github.com/MythTV/mythtv/blob/master/mythtv/programs/mythcommflag/ClassicLogoDetector.cpp#L524 ):
>
>     unsigned char *buf = videoFrame->buf;
>     unsigned char pixel;
>     unsigned int pos, x, y;
>
>     for (y = 0; y < videoFrame->height; y++)
>     {
>         for (x = 0; x < videoFrame->width; x++)
>         {
>             pos = y * width + x;
>             pixel = buf[pos];
>
>             // ... use pixel ...
>          }
>      }
>
> I don't see how that can work properly unless the video frame is 8bpp.
>
> That's the easy one to understand.  The rest of the 
> ClassicCommDetector code (including the scene change detector) do 
> something a little more weird.  They skip and only look at every 4th 
> byte.  It seems like that code would work (examining 1/2 of the data) 
> if the video data was stored in 16bpp (or 12bpp with 4bits padding).
>
> Here's a simplified version of the Histogram thing used in the 
> SceneChangeDetector ( 
> https://github.com/MythTV/mythtv/blob/master/mythtv/programs/mythcommflag/Histogram.cpp#L32 )
>
>     for(unsigned int y = 0; y < videoFrame->height; y += 4)
>     {
>         for(unsigned int x = 0; x < videoFrame->width; x += 4)
>         {
>             data[frame[y * frameWidth + x]]++;
>             numberOfSamples++;
>         }
>     }
>
> ... I don't see how this could work unless the data is byte-aligned, 
> which it isn't for me. I am getting 12-bit packed video data.  Perhaps 
> it was at one time padded to 16-bits, or a different format used? That 
> would make this code work and simply skip every other pixel.
>
12bpp is an average, not a packed format. YV12 (YUV420) stores all of Y
first at full resolution, followed by U and V at 1/4 resolution (1/2 in x
and 1/2 in y).
Y is 8 bits per pixel at full resolution. U and V average 2 bits per pixel
each, because every 8-bit chroma sample applies to a 2x2 block of 4
full-resolution pixels.

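So the struct for a 480x272 image would be:
struct pic {
    uint8_t Y[272][480];       /* luma, full resolution */
    uint8_t U[272/2][480/2];   /* chroma, half width and half height */
    uint8_t V[272/2][480/2];   /* chroma, half width and half height */
};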

size = 272*480 + (272/2)*(480/2) + (272/2)*(480/2) = 195840 bytes
bytes/pixel = size/(272*480) = 1 + 1/4 + 1/4 = 1.5
bpp = 1.5 * 8 = 12
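
The same arithmetic can be written as a small helper. This is only a sketch under assumptions: `yuv420_layout` is a hypothetical name, not a MythTV function, and it assumes even dimensions and tightly packed rows (no padding). The chroma planes are numbered rather than named so that no U/V ordering is implied.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of MythTV: plane layout of a tightly
 * packed planar 4:2:0 frame (one full-resolution luma plane followed
 * by two half-width, half-height chroma planes). */
struct yuv420_layout {
    size_t pitches[3];  /* bytes per row of each plane */
    size_t offsets[3];  /* byte offset of each plane in the buffer */
    size_t size;        /* total bytes for all three planes */
};

static struct yuv420_layout yuv420_layout(size_t width, size_t height)
{
    /* Assumption: even dimensions and no row padding. */
    assert(width % 2 == 0 && height % 2 == 0);

    struct yuv420_layout l;
    size_t luma   = width * height;              /* 1 byte per pixel */
    size_t chroma = (width / 2) * (height / 2);  /* 1 byte per 2x2 block */

    l.pitches[0] = width;
    l.pitches[1] = width / 2;
    l.pitches[2] = width / 2;
    l.offsets[0] = 0;
    l.offsets[1] = luma;           /* first chroma plane follows luma */
    l.offsets[2] = luma + chroma;  /* second chroma plane follows the first */
    l.size       = luma + 2 * chroma;
    return l;
}
```

For 480x272 this gives pitches {480, 240, 240}, offsets {0, 130560, 163200} and 195840 bytes in total, i.e. 12 bits per pixel on average. Those pitches and offsets are the ones shown in the gdb dump quoted below, and the total is 2 bytes less than the reported size.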

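Hope this is clear. See the wiki or similar for details on YV12.

There is nothing wrong with the existing implementation AFAICT. Both
loops only index the first width*height bytes of the buffer, which is
exactly the Y plane, so they are reading 8-bit grayscale data.
It just doesn't work very well (the logo detection, that is).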
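
To illustrate, here is a rough sketch of a subsampled histogram that reads only luma from a planar 4:2:0 buffer. It is not the MythTV Histogram code: `luma_histogram` is a hypothetical name, and it assumes the luma pitch equals the width, as it does in the gdb dump quoted below.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch, not the MythTV Histogram class: sample every
 * 4th pixel of every 4th row of the luma plane only.
 * Assumption: luma rows are tightly packed (pitch == width). */
static unsigned luma_histogram(const uint8_t *buf, size_t width,
                               size_t height, unsigned hist[256])
{
    assert(buf != NULL && hist != NULL);

    unsigned samples = 0;
    memset(hist, 0, 256 * sizeof hist[0]);

    for (size_t y = 0; y < height; y += 4) {
        for (size_t x = 0; x < width; x += 4) {
            /* The largest index used is below width * height, so the
             * chroma bytes stored after the luma plane are never read. */
            hist[buf[y * width + x]]++;
            samples++;
        }
    }
    return samples;
}
```

With a 480x272 frame this takes 120 samples from each of 68 rows, all of them inside the first 130560 bytes of the buffer.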

mark

> Here's a gdb print of a typical VideoFrame while either of these 
> pieces of code are executing.  The original video is 1080p, but with 
> kDecodeLowRes 480x272 comes out:
> $1 = {codec = FMT_YV12, buf = 0x7cbe40 "", width = 480, height = 272, 
> aspect = 1.77777779,
>   frame_rate = -1, bpp = 12, size = 195842, frameNumber = 3040, 
> timecode = 74484127,
>   disp_timecode = 101306, priv = {0x0, 0x0, 0x0, 0x0}, qscale_table = 
> 0x0, qstride = 0,
>   interlaced_frame = 1, top_field_first = 1, repeat_pict = 0, forcekey 
> = 0, dummy = 0, pitches = {480,
>     240, 240}, offsets = {0, 130560, 163200}, pix_fmt = 0}
>
> Note that size is within 2 bytes of width*height*12/8, implying that 
> the buffer is not padded to 16bpp.
>
> My doubts come from that the code works at all as it is now.  I feel 
> like I must be missing something because if it is actually doing what 
> I think, then I'd expect almost complete garbage to come out, but it 
> actually works pretty well.
>
> Maybe I ought to just do it and submit a patch and then whoever looks 
> at it can verify I didn't do something horribly stupid?
>
> On Mon, Sep 3, 2012 at 10:07 PM, Raymond Wagner <raymond at wagnerrp.com 
> <mailto:raymond at wagnerrp.com>> wrote:
>
>     On 9/3/2012 21:07, Bryan wrote:
>
>         However, all of the "Classic" commflag code treats the data as
>         if it is
>         8bpp (YUV420P is 12).
>
>
>     That's not exactly true. YUV420P still stores 24 bits of data for
>     each pixel. The gray (Y) channel is at full resolution, while
>     the color (U and V) are both at one quarter resolution, half in
>     both directions. The color data is applied across a 2x2 square of
>     gray pixels. Is it possible you are misinterpreting the code
>     reading a stream of gray-scale data?
>     _______________________________________________
>     mythtv-dev mailing list
>     mythtv-dev at mythtv.org <mailto:mythtv-dev at mythtv.org>
>     http://www.mythtv.org/mailman/listinfo/mythtv-dev
