[mythtv] A prelude to transcoding MPEG2->MPEG2
ou401cru02 at sneakemail.com
Sun Dec 7 14:28:13 EST 2003
Well, I've done some more investigating, and I am beginning to formulate
a plan for removing chunks from mpeg2 streams.
From my analysis, the way to go appears to be to demux the stream on the
fly, clip out GOP chunks, rewrite the GOP header, and remux.
I found a very good spec of the MPEG2-PS stream on one of the sites
mentioned earlier, and implemented a crude mpeg parser (which can find
interesting frames, and decode the data inside).
There are nasty things that need to be dealt with though.
A GOP is not necessarily self-contained (and usually is not). A common GOP
sequence, in display order:
a b c d e f g h i j k l m n o p q r s
I B B P B B P B B I B B P B B P B B I ...
(a-i would be within a single GOP).
However, to decode h and i, both the g and j frames are needed, so this gets
stored in the stream as:
a d b c g e f j h i m k l p n o s q r ...
I P B B P B B I B B P B B P B B I B B
As you can see, the frames are all out of order, and a given GOP can contain
frames from the previous GOP. This will make accurate (frame-exact)
clipping more challenging.
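
The reordering shown above can be reproduced with a few lines of Python. This is only an illustrative sketch (the function name display_to_coded is made up for this example): each B frame is held back until the next I or P frame, which it needs as a forward reference, has been written.

```python
def display_to_coded(frames):
    """Reorder (name, type) pairs from display order to coded order.

    B frames are held back until the following I or P frame (their
    forward reference) has been emitted.
    """
    coded = []
    pending_b = []
    for name, ftype in frames:
        if ftype == "B":
            pending_b.append((name, ftype))
        else:
            coded.append((name, ftype))   # anchor frame (I or P) goes first
            coded.extend(pending_b)       # then the B frames that precede it
            pending_b = []
    coded.extend(pending_b)               # trailing B frames with no later anchor
    return coded


names = "abcdefghijklmnopqrs"
types = "IBBPBBPBBIBBPBBPBBI"
coded = display_to_coded(list(zip(names, types)))
print(" ".join(n for n, _ in coded))  # a d b c g e f j h i m k l p n o s q r
print(" ".join(t for _, t in coded))  # I P B B P B B I B B P B B P B B I B B
```

Note that h and i land after j, i.e. after the I frame that opens the next GOP, which is exactly what makes frame-exact cuts awkward.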
Unfortunately MPEG2 is not designed for random access (there is no way to
tell how big a given GOP with its frames will be), so searching for
headers is sort of slow.
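
To illustrate why this is a linear search: every MPEG header starts with the byte prefix 00 00 01 followed by a one-byte code (0xB3 sequence header, 0xB8 GOP, 0x00 picture, 0xBA pack header), and the only way to find the next one is to scan for it. A minimal sketch, assuming the video elementary stream has already been demuxed into a single byte buffer (in a real program stream a start code can straddle a PES packet boundary, which this does not handle); find_start_codes and NAMES are names invented for the example:

```python
START_CODE_PREFIX = b"\x00\x00\x01"

# A few well-known start code values; anything else is reported in hex.
NAMES = {
    0x00: "picture",
    0xB3: "sequence_header",
    0xB8: "group_of_pictures",
    0xBA: "pack_header",
}


def find_start_codes(buf):
    """Yield (offset, code) for every 00 00 01 xx start code in buf."""
    pos = buf.find(START_CODE_PREFIX)
    while pos != -1 and pos + 3 < len(buf):
        yield pos, buf[pos + 3]
        pos = buf.find(START_CODE_PREFIX, pos + 3)


# Synthetic buffer: sequence header, GOP header and picture header,
# each followed by filler bytes standing in for the real payload.
sample = (
    b"\x00\x00\x01\xb3" + b"\x11" * 8
    + b"\x00\x00\x01\xb8" + b"\x22" * 4
    + b"\x00\x00\x01\x00" + b"\x33" * 4
)
found = [(pos, NAMES.get(code, hex(code))) for pos, code in find_start_codes(sample)]
print(found)
# [(0, 'sequence_header'), (12, 'group_of_pictures'), (20, 'picture')]
```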
For now, I plan to rewrite the gop headers with correct timecodes and
frame numbers, and to strip out any 'B' frames following the initial I
frame. In theory, the MPEG2 spec allows for setting a 'broken' flag to
indicate to the decoder to ignore B frames that refer to the previous
GOP, but it seems that these flags are not always well supported, and
discarding the bogus B frames will work better. It will also mean
rewriting the frame numbers inside the frames, but that isn't too hard.
Things can be further complicated by the sequence headers which can
contain different quantization matrices for a given set of GOPs, so I may
need to build new sequence headers too (so far the MPEG2 streams I've
seen have a 1:1 correspondence between GOP and SEQ headers, but this
may not always hold).
It does not appear that libavcodec/libavformat gives me sufficient
control to do this sort of thing (it may well be able to, but the
documentation is horrendous, and I haven't figured out how yet). Since
for this type of transcoding, there is no need to decode the actual
frames, I don't need av* for that.
So what does this all mean to the end user? I think I can do commercial
clipping to the GOP boundaries without too much difficulty. It will
initially only work for PVR streams (MPEG-PS). I will have to write my
own MPEG parsing code (this isn't really too difficult, but could
probably be replaced with some other parser in the future).
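
To make "clipping to the GOP boundaries" concrete, here is one possible policy, sketched with made-up names and numbers: given the display-frame index at which each GOP starts and a list of cut ranges, drop only those GOPs that lie entirely inside a cut. This is an assumption about how the snapping could work, not a settled design; it errs on the side of keeping a few commercial frames rather than losing program frames.

```python
def gops_to_keep(gop_starts, total_frames, cuts):
    """Return the indices of the GOPs that survive the cuts.

    gop_starts   -- sorted first-frame index of every GOP
    total_frames -- number of frames in the whole recording
    cuts         -- list of (start, end) frame ranges to remove, end exclusive
    """
    keep = []
    for i, start in enumerate(gop_starts):
        # A GOP runs up to the start of the next one (or the end of the file).
        end = gop_starts[i + 1] if i + 1 < len(gop_starts) else total_frames
        fully_cut = any(c0 <= start and end <= c1 for c0, c1 in cuts)
        if not fully_cut:
            keep.append(i)
    return keep


# Five 9-frame GOPs, one cut covering frames 10..29.
kept = gops_to_keep([0, 9, 18, 27, 36], 45, [(10, 30)])
print(kept)  # [0, 1, 3, 4]
```

Only the GOP covering frames 18..26 is removed; the GOPs on either side overlap the cut only partially and are kept whole.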
This will all suck, because it requires a completely different code-path
to deal with the MPEG2-MPEG2 case in the transcoder. All the
fast-forward code will need to be duplicated into my mpeg parser.
So far, I think I can actually implement all of the above in a reasonably
straightforward manner. I haven't looked into processing the audio yet,
and that may make things more challenging, as I expect I'll need to
reencode the audio to keep the sync.
I will try to implement the above. I'll probably start with implementing
a standalone prototype which takes the cutpoints on the command line. If
that all works, we'll see how to best merge it with myth.