[mythtv] ffmpeg SWSCALE!

Daniel Kristjansson danielk at cuymedia.net
Thu Aug 31 05:00:50 UTC 2006


On Wed, 2006-08-30 at 23:51 -0400, Yeasah Pell wrote:
> Daniel Kristjansson wrote:
> >> The answer I came up with is "if you have a properly set up viewing 
> >> distance given your eyes and the size of the display device, you do not."
> Forget about the subpixels for a second. Assume the pixel on a panel is 
> simply a colored square. That is what I mean by "ideal discrete output 
> device". In visible character, LCDs and the ilk come quite close to this 
> in my opinion, but I'm just talking theoretically here -- a grid of 
> colored squares.
Right, I understand the artifact you are talking about; it is
simply not called aliasing.

> A grid of colored squares is an ideal representation of a pre-filtered 
> raw discrete output, it's the 2 dimensional analog of the stair-stepped
Umm, I would not consider that ideal at all. I would consider point
samples convolved with an infinite sinc function to be ideal. And
I would consider the sampled Gaussian blur of a CRT to be closer to
ideal than this box filter.
 
> output of an audio DAC that hasn't been filtered at all. In essence, it 
> contains the original signal, plus all the output sample aliasing that 
Yes, in the sense of information it contains all the data that the
sinc-reconstructed signal would have, but the box filter is just about
the worst filter you can use. And it is entirely because of this poor
reconstruction filter that the difference image is so large.

> is of a higher frequency than the Nyquist frequency of the image being 
> displayed. In audio, this aliasing would be ideally stripped off with 
> the equivalent of a low pass filter. In video, it is ideally stripped 
> off with a convolution that is a 2 dimensional low pass filter ("blur")
No, ideally they would be reconstructed with an infinite sinc. Practically
you would use a low pass filter as a minimum, and preferably would use
something that approximated sinc without giving you too much ringing.

> Ok? Understand what I mean by aliasing here? Some pictures might help.
Yes, but it is not called aliasing. This artifact is due to the
reconstruction filter not being a sinc function. All physically
realizable filters give you reconstruction errors.
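To make the distinction concrete, here is a small hypothetical 1-D sketch (pure Python, illustrative names only): a band-limited signal is sampled, then reconstructed once with a box filter (sample-and-hold, the "grid of colored squares") and once with a truncated sinc. The truncated sinc is not the infinite ideal either, so it still has reconstruction error, just a much smaller one than the box filter.

```python
import math

def signal(t):
    # band-limited test signal, well below the Nyquist rate of 0.5
    return math.sin(2 * math.pi * 0.05 * t)

N = 200
samples = [signal(n) for n in range(N)]

def box_reconstruct(t):
    # "grid of colored squares": hold each sample constant (box filter)
    return samples[min(int(t), N - 1)]

def sinc(x):
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def sinc_reconstruct(t, taps=20):
    # truncated sinc interpolation: a practical stand-in for the
    # infinite sinc, so it still has (small) reconstruction error
    lo = max(0, int(t) - taps)
    hi = min(N, int(t) + taps + 1)
    return sum(samples[n] * sinc(t - n) for n in range(lo, hi))

# evaluate halfway between sample points, away from the edges
ts = [20 + 0.5 + i for i in range(160)]
box_err = max(abs(box_reconstruct(t) - signal(t)) for t in ts)
sinc_err = max(abs(sinc_reconstruct(t) - signal(t)) for t in ts)
# the truncated sinc comes much closer to the original signal
```

Both reconstructions contain the same sample data; only the reconstruction filter differs, and that is where the error comes from.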

> The difference between those two pictures shows the high frequency 
> components that should have been filtered out, i.e. the nyquist aliasing:
Nope, only a small portion of the residual is due to the high-frequency
components; you would need to filter with the sinc function in
order for the difference image to contain just the high frequencies.
And that still wouldn't be called aliasing.

> You can't filter this out -- it's inherent in the displaying of a grid 
> of colored blocks. The only way you can possibly "filter" that is to a) 
> position yourself so you can't see details of that fineness,
This is correct.

> or b) feed your display with content that is of lower resolution, which
> allows you to reduce it by filtering the image.
You still get blocking artifacts :-|
Sitting far away or putting a diffusor between you and the LCD
is the only workable solution (an effective diffusor would lower
the resolution greatly, so it isn't practical with the current
LCD technology).

> Option b) is what I believe Mike has been talking about.
I dunno. Maybe Mike can answer this?

> Yes, actual display devices aren't idealized discrete output devices, 
> especially CRTs. But I believe LCDs, et. al. are close enough that they 
> can be viably thought of as such. Certainly the aliasing described above 
> will be present. Some differences will be present on a CRT as well, but 
> they will probably be more like errors -- lower frequency deviations in 
> the signal that are below the nyquist limit of the original signal.
The difference between an LCD and a decent CRT of the same
resolution is that the CRT is like an LCD with 10x as many
pixels, but those extra pixels are filled in with the values
you would get if you convolved a Gaussian with the original
image. There is no more information, in fact there is a bit
less in the CRT image, but the reconstruction of the original
analog signal is much better: you have the Gaussian as your
reconstruction filter rather than the box filter. If you blur
your original image so that it is the same resolution as your
low-res image, then subtract that from your original image,
you will see that the difference image has a smaller magnitude,
and it will have much more of its energy in the higher
frequencies than the difference image from the box-filter
reconstructed image. The Gaussian is NOT an ideal filter, but
it is better than the box filter for reconstruction. It loses
information, but it gives a more accurate reconstruction of
the analog input signal than the box filter.
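A toy 1-D version of that comparison (hypothetical code, illustrative names): sample a smooth signal at low resolution, then rebuild it on a fine grid once with a box filter (LCD-like flat blocks) and once with a Gaussian reconstruction kernel (CRT-like), and compare the RMS of the two difference signals.

```python
import math

FINE = 400          # fine-grid points ("analog" resolution)
STEP = 8            # one low-res sample per 8 fine points

def orig(i):
    return math.sin(2 * math.pi * i / 80.0)

lowres = [orig(i) for i in range(0, FINE, STEP)]

def box(i):
    # each low-res sample painted as a flat block of STEP fine points
    return lowres[i // STEP]

def gauss(i, sigma=0.6 * STEP):
    # low-res samples convolved with a (normalized) Gaussian kernel
    num = den = 0.0
    for k, s in enumerate(lowres):
        w = math.exp(-((i - k * STEP) ** 2) / (2 * sigma * sigma))
        num += w * s
        den += w
    return num / den

def rms(recon):
    # residual vs. the fine-grid original; skip the one-sided right edge
    pts = range(FINE - STEP)
    return math.sqrt(sum((recon(i) - orig(i)) ** 2 for i in pts) / len(pts))

box_rms = rms(box)
gauss_rms = rms(gauss)
# the Gaussian residual has smaller magnitude: a better, though still
# imperfect, reconstruction of the analog signal than the box filter
```

The Gaussian slightly attenuates the signal (information loss), but the box filter's blocky residual dominates it, which matches the argument above.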

<snip, why the MythTV video is not aliased>
> Of course that's all true -- but it also has nothing to do with what I 
> was talking about. Xv will not scale a signal that is the same 
> resolution as the output device, which as I've said is explicitly the 
> case I'm talking about. I'm off on a tangent here, you should probably 
> be replying to somebody else's less-tangential post. :-)
You are right, I was thinking of some other people's misconceptions.
I don't think Michael has this misconception either; I got a little
too high on my horse there.

> > I think there are better ways to improve DVD playback though.   
> I don't disagree with any of that either, though the judder issue is 
> more complicated than that --
Yep..

> you can't always just change the 
> framerate, since a very common setup is to have the multichannel audio 
> decoded in an external receiver, so you don't have direct control over 
> the audio clock (and can't play bending tricks with the audio, the only
Oh, I wasn't saying you should play the video back any faster.
What I was saying you could do is interpolate the motion vectors
and the residuals to generate new frames in between the encoded
frames. This would work 90% of the time, but you would need to
detect scene changes. You would need more accuracy in the
reconstruction, which means more CPU, and it would require very
extensive changes to ffmpeg. But you can think of the video as
just a 3-D column of data from which you sample, and you do not
have to sample at the same spots as in the source, so long as
there is no discontinuity. This cannot be done practically as a
post filter, but the data needed for it is already in the MPEG
stream: MPEG works by creating this function in 3-D and then
coding residuals for the errors in the prediction based on the
function representing the 3-D column.

The problem would be how good your reconstructions would be,
because your residuals would be off, but you could maybe snap to
the encoder frames when the residuals were large and the motion
was small. Then you would get judder only at scene changes, where
it is less noticeable anyway. In high motion, losing a little
pixel fidelity is probably not as important as keeping the motion
spot on -- think of Michael Jordan making a slam dunk. I didn't
say it would be completely trivial to implement :)
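The in-between-frame idea can be sketched in a few lines. This is a hypothetical toy, not ffmpeg code: 1-D "frames", one motion vector per block, and a backward warp that scales each vector by the fractional time t. It ignores residuals, occlusions, and scene-change detection, which is exactly where the hard work described above lies.

```python
W, BLOCK = 16, 4  # 1-D "frames" of 16 pixels, blocks of 4 pixels

def interpolate(prev, mvs, t):
    """Predict the frame at fraction t (0..1) after `prev` by fetching
    each output pixel from `prev`, displaced by t times the motion
    vector of the block it belongs to (backward warp, no holes)."""
    out = []
    for dst in range(W):
        mv = mvs[dst // BLOCK]
        src = min(max(dst - round(mv * t), 0), W - 1)  # clamp to frame
        out.append(prev[src])
    return out

# an "edge" moving right by 2 pixels per frame, in every block
prev = [1.0] * 8 + [0.0] * 8
mvs = [2, 2, 2, 2]
half = interpolate(prev, mvs, 0.5)  # halfway frame: edge moved 1 pixel
```

A real implementation would also apply interpolated residuals and snap to the nearest decoded frame when the residual energy is large, as suggested above.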

-- Daniel


