mirror of
https://github.com/mpv-player/mpv
synced 2024-12-26 00:42:57 +00:00
08ec85ba4b
XviD is the correct spelling of the codec. You can see it written in the codec own documentation and header files. Prefered name capitalization confirmed in conversation with XviD developer (prunedtree). git-svn-id: svn://svn.mplayerhq.hu/mplayer/trunk@26915 b3059339-0415-0410-9bf9-f77b7e298cf2
726 lines
34 KiB
Plaintext
726 lines
34 KiB
Plaintext
|
|
Some important URLs:
|
|
~~~~~~~~~~~~~~~~~~~~
|
|
http://mplayerhq.hu/~michael/codec-features.html <- lavc vs. divx5 vs. xvid
|
|
http://www.ee.oulu.fi/~tuukkat/mplayer/tests/rguyom.ath.cx/ <- lavc benchmarks, options
|
|
http://ffdshow.sourceforge.net/tikiwiki/tiki-view_articles.php <- lavc for win32 :)
|
|
http://www.bunkus.org/dvdripping4linux/index.html <- a nice tutorial
|
|
http://forum.zhentarim.net/viewtopic.php?p=237 <- lavc option comparison
|
|
http://www.ee.oulu.fi/~tuukkat/mplayer/tests/readme.html <- series of benchmarks
|
|
http://thread.gmane.org/gmane.comp.video.mencoder.user/1196 <- free codecs shoutout and recommended encoding settings
|
|
|
|
|
|
================================================================================
|
|
|
|
|
|
FIXING A/V SYNC WHEN ENCODING
|
|
|
|
I know this is a popular topic on the list, so I thought I'd share a
|
|
few comments on my experience fixing a/v sync. As everyone seems to
|
|
know, mencoder unfortunately doesn't have a -delay option. But that
|
|
doesn't mean you can't fix a/v sync. There are a couple ways to still
|
|
do it.
|
|
|
|
In example 1, we'll suppose you want to re-encode the audio anyway.
|
|
This will be essential if your source audio isn't mp3, e.g. for DVD's
|
|
or nasty avi files with divx/wma audio. This approach makes things
|
|
much easier.
|
|
|
|
Step 1: Dump the audio with mplayer -ao pcm -nowaveheader. There are
|
|
various options that can be used to speed this up, most notably -vo
|
|
null, -vc null, and/or -hardframedrop. -benchmark also seemed to help
|
|
in the past. :)
|
|
|
|
Step 2: Figure out what -delay value syncs the audio right in mplayer.
|
|
If this number is positive, use a command like the following:
|
|
|
|
dd if=audiodump.wav bs=1764 skip=[delay] | lame -x - out.mp3
|
|
|
|
where [delay] is replaced by your delay amount in hundredths of a
|
|
second (1/10 the value you use with mplayer). Otherwise, if delay is
|
|
negative, use a command like this:
|
|
|
|
( dd if=/dev/zero bs=1764 skip=[delay] ; cat audiodump.wav ) | lame -x - out.mp3
|
|
|
|
Don't include the minus (-) sign in delay. Also, keep in mind you'll
|
|
have to change the 1764 number and provide additional options to lame
|
|
if your audio stream isn't 44100/16bit/little-endian/stereo.
|
|
|
|
Step 3: Use mencoder to remux your new mp3 file with the movie:
|
|
|
|
mencoder -audiofile out.mp3 -oac copy ...
|
|
|
|
You can either copy video as-is (with -ovc copy) or re-encode it at
|
|
the same time you merge in the audio like this.
|
|
|
|
Finally, as a variation on this method (makes things a good bit faster
|
|
and doesn't use tons of temporary disk space) you can merge steps 1
|
|
and 2 by making a named pipe called "audiodump.wav" (type mkfifo
|
|
audiodump.wav) and have mplayer write the audio to it at the same time
|
|
you're running lame to encode.
|
|
|
|
Now for example 2. This time we won't re-encode audio at all. Just
|
|
dump the mp3 stream from the avi file with mplayer -dumpaudio. Then,
|
|
you have to cut and paste the raw mp3 stream a bit...
|
|
|
|
If delay is negative, things are easier. Just use lame to encode
|
|
silence for the duration of delay, at the same samplerate and
|
|
samplesize used in your avi file. Then, do something like:
|
|
|
|
cat silence.mp3 stream.dump > out.mp3
|
|
mencoder -audiofile out.mp3 -oac copy ...
|
|
|
|
On the other hand, if delay is positive, you'll need to crop off part
|
|
of the mp3 from the beginning. If it's (at least roughly) CBR this is
|
|
easy -- just take off the first (bitrate*delay/8) bytes of the file.
|
|
You can use the excellent dd tool, or just your favorite
|
|
binary-friendly text editor to do this. Otherwise, you'll have to
|
|
experiment with cutting off different amounts. You can test with
|
|
mplayer -audiofile before actually spending time remuxing/encoding
|
|
with mencoder to make sure you cut the right amount.
|
|
|
|
I hope this has all been informative. If anyone would like to clean
|
|
this message up a bit and make it into part of the docs, feel free. Of
|
|
course mencoder should eventually just get -delay. :)
|
|
|
|
Rich
|
|
|
|
|
|
================================================================================
|
|
|
|
|
|
ENCODING QUALITY - OR WHY AUTOMATISM IS BAD.
|
|
|
|
Hi everyone.
|
|
|
|
Some days ago someone suggested adding some preset options to mencoder.
|
|
At that time I replied 'don't do that', and now I decided to elaborate
|
|
on that.
|
|
|
|
Warning: this is rather long, and it involves mathematics. But if you
|
|
don't want to bother with either then why are you encoding in the
|
|
first place? Go do something different!
|
|
|
|
The good news is: it's all about the bpp (bits per pixel).
|
|
|
|
The bad news is: it's not THAT easy ;)
|
|
|
|
This mail is about encoding a DVD to MPEG4. It's about the video
|
|
quality, not (primarily) about the audio quality or some other fancy
|
|
things like subtitles.
|
|
|
|
The first step is to encode the audio. Why? Well if you encode the
|
|
audio prior to the video you'll have to make the video fit onto one
|
|
(or two) CD(s). That way you can use oggenc's quality based encoding
|
|
mode which is much more sophisticated than its ABR based mode.
|
|
|
|
After encoding the audio you have a certain amount of space left to
|
|
fill with video. Let's assume the audio takes 60M (no problem with
|
|
Vorbis), and you aim at a 700M CD. This leaves you 640M for the video.
|
|
Let's further assume that the video is 100 minutes or 6000 seconds
|
|
long, encoded at 25fps (those nasty NTSC fps values give me
|
|
headaches. Adjust to your needs, of course!). This leaves you with
|
|
a video bitrate of:
|
|
|
|
$videosize * 8
|
|
$videobitrate = --------------
|
|
$length * 1000
|
|
|
|
$videosize in bytes, $length in seconds, $videobitrate in kbit/s.
|
|
In my example I end up with $videobitrate = 895.
|
|
|
|
And now comes the question: how do I chose my encoding parameters
|
|
so that the results will be good? First let's take a look at a
|
|
typical mencoder line:
|
|
|
|
mencoder dvd://1 -o /dev/null -oac copy -ovc lavc \
|
|
-lavcopts vcodec=mpeg4:vbitrate=1000:vhq:vqmin=2:\
|
|
vlelim=-4:vcelim=9:lumi_mask=0.05:dark_mask=0.01:vpass=1 \
|
|
-vf crop=716:572:2:2,scale=640:480
|
|
|
|
Phew, all those parameters! Which ones should I change? NEVER leave
|
|
out 'vhq'. Never ever. 'vqmin=2' is always good if you aim for sane
|
|
settings - like 'normal length' movies on one CD, 'very long movies'
|
|
on two CDs and so on. vcodec=mpeg4 is mandatory.
|
|
|
|
The 'vlelim=-4:vcelim=9:lumi_mask=0.05:dark_mask=0.01' are parameters
|
|
suggested by D Richard Felker for non-animated movies, and they
|
|
improve quality a bit.
|
|
|
|
But the two things that have the most influence on quality are
|
|
vbitate and scale. Why? Because both together tell the codec how
|
|
many bits it may spend on each frame for each bit: and this is
|
|
the 'bpp' value (bits per pixel). It's simply defined as
|
|
|
|
$videobitrate * 1000
|
|
$bpp = -----------------------
|
|
$width * $height * $fps
|
|
|
|
I've attached a small Perl script that calculates the $bpp for
|
|
a movie. You'll have to give it four parameters:
|
|
a) the cropped but unscaled resolution (use '-vf cropdetect'),
|
|
b) the encoded aspect ratio. All DVDs come at 720x576 but contain
|
|
a flag that tells the player wether it should display the DVD at
|
|
an aspect ratio of 4/3 (1.333) or at 16/9 (1.777). Have a look
|
|
at mplayer's output - there's something about 'prescaling'. That's
|
|
what you are looking for.
|
|
c) the video bitrate in kbit/s and
|
|
d) the fps.
|
|
|
|
In my example the command line and calcbpp.pl's output would look
|
|
like this (warning - long lines ahead):
|
|
|
|
mosu@anakin:~$ ./calcbpp.pl 720x440 16/9 896 25
|
|
Prescaled picture: 1023x440, AR 2.33
|
|
720x304, diff 5, new AR 2.37, AR error 1.74% scale=720:304 bpp: 0.164
|
|
704x304, diff -1, new AR 2.32, AR error 0.50% scale=704:304 bpp: 0.167
|
|
688x288, diff 8, new AR 2.39, AR error 2.58% scale=688:288 bpp: 0.181
|
|
672x288, diff 1, new AR 2.33, AR error 0.26% scale=672:288 bpp: 0.185
|
|
656x288, diff -6, new AR 2.28, AR error 2.17% scale=656:288 bpp: 0.190
|
|
640x272, diff 3, new AR 2.35, AR error 1.09% scale=640:272 bpp: 0.206
|
|
624x272, diff -4, new AR 2.29, AR error 1.45% scale=624:272 bpp: 0.211
|
|
608x256, diff 5, new AR 2.38, AR error 2.01% scale=608:256 bpp: 0.230
|
|
592x256, diff -2, new AR 2.31, AR error 0.64% scale=592:256 bpp: 0.236
|
|
576x240, diff 8, new AR 2.40, AR error 3.03% scale=576:240 bpp: 0.259
|
|
560x240, diff 1, new AR 2.33, AR error 0.26% scale=560:240 bpp: 0.267
|
|
544x240, diff -6, new AR 2.27, AR error 2.67% scale=544:240 bpp: 0.275
|
|
528x224, diff 3, new AR 2.36, AR error 1.27% scale=528:224 bpp: 0.303
|
|
512x224, diff -4, new AR 2.29, AR error 1.82% scale=512:224 bpp: 0.312
|
|
496x208, diff 5, new AR 2.38, AR error 2.40% scale=496:208 bpp: 0.347
|
|
480x208, diff -2, new AR 2.31, AR error 0.85% scale=480:208 bpp: 0.359
|
|
464x192, diff 7, new AR 2.42, AR error 3.70% scale=464:192 bpp: 0.402
|
|
448x192, diff 1, new AR 2.33, AR error 0.26% scale=448:192 bpp: 0.417
|
|
432x192, diff -6, new AR 2.25, AR error 3.43% scale=432:192 bpp: 0.432
|
|
416x176, diff 3, new AR 2.36, AR error 1.54% scale=416:176 bpp: 0.490
|
|
400x176, diff -4, new AR 2.27, AR error 2.40% scale=400:176 bpp: 0.509
|
|
384x160, diff 5, new AR 2.40, AR error 3.03% scale=384:160 bpp: 0.583
|
|
368x160, diff -2, new AR 2.30, AR error 1.19% scale=368:160 bpp: 0.609
|
|
352x144, diff 7, new AR 2.44, AR error 4.79% scale=352:144 bpp: 0.707
|
|
336x144, diff 0, new AR 2.33, AR error 0.26% scale=336:144 bpp: 0.741
|
|
320x144, diff -6, new AR 2.22, AR error 4.73% scale=320:144 bpp: 0.778
|
|
|
|
A word for the $bpp. For a fictional movie which is only black and
|
|
white: if you have a $bpp of 1 then the movie would be stored
|
|
uncompressed :) For a real life movie with 24bit color depth you
|
|
need compression of course. And the $bpp can be used to make the
|
|
decision easier.
|
|
|
|
As you can see the resolutions suggested by the script are all
|
|
dividable by 16. This will make the aspect ratio slightly wrong,
|
|
but no one will notice.
|
|
|
|
Now if you want to decide which resolution (and scaling parameters)
|
|
to chose you can do that by looking at the $bpp:
|
|
|
|
< 0.10: don't do it. Please. I beg you!
|
|
< 0.15: It will look bad.
|
|
< 0.20: You will notice blocks, but it will look ok.
|
|
< 0.25: It will look really good.
|
|
> 0.25: It won't really improve visually.
|
|
> 0.30: Don't do that either - try a bigger resolution instead.
|
|
|
|
Of course these values are not absolutes! For movies with really lots
|
|
of black areas 0.15 may look very good. Action movies with only high
|
|
motion scenes on the other hand may not look perfect at 0.25. But these
|
|
values give you a great idea about which resolution to chose.
|
|
|
|
I see a lot of people always using 512 for the width and scaling
|
|
the height accordingly. For my (real-world-)example this would be
|
|
simply a waste of bandwidth. The encoder would probably not even
|
|
need the full bitrate, and the resulting file would be smaller
|
|
than my targetted 700M.
|
|
|
|
After encoding you'll do your 'quality check'. First fire up the movie
|
|
and see whether it looks good to you or not. But you can also do a
|
|
more 'scientific' analysis. The second Perl script I attached counts
|
|
the quantizers used for the encoding. Simply call it with
|
|
|
|
countquant.pl < divx2pass.log
|
|
|
|
It will print out which quantizer was used how often. If you see that
|
|
e.g. the lowest quantizer (vqmin=2) gets used for > 95% of the frames
|
|
then you can safely increase your picture size.
|
|
|
|
> The "counting the quantesizer"-thing could improve the quality of
|
|
> full automated scripts, as I understand ?
|
|
|
|
Yes, the log file analysis can be used be tools to automatically adjust
|
|
the scaling parameters (if you'd do that you'd end up with a three-pass
|
|
encoding for the video only ;)), but it can also provide answers for
|
|
you as a human. From time to time there's a question like 'hey,
|
|
mencoder creates files that are too small! I specified this bitrate and
|
|
the resulting file is 50megs short of the target file size!'. The
|
|
reason is probably that the codec already uses the minimum quantizer
|
|
for nearly all frames so it simply does not need more bits. A quick
|
|
glance at the distribution of the quantizers can be enlightening.
|
|
|
|
Another thing is that q=2 and q=3 look really good while the 'bigger'
|
|
quantizers really lose quality. So if your distribution shows the
|
|
majority of quantizers at 4 and above then you should probably decrease
|
|
the resolution (you'll definitly see block artefacts).
|
|
|
|
|
|
Well... Several people will probably disagree with me on certain
|
|
points here, especially when it comes down to hard values (like the
|
|
$bpp categories and the percentage of the quantizers used). But
|
|
the idea is still valid.
|
|
|
|
And that's why I think that there should NOT be presets in mencoder
|
|
like the presets lame knows. 'Good quality' or 'perfect quality' are
|
|
ALWAYS relative. They always depend on a person's personal preferences.
|
|
If you want good quality then spend some time reading and - more
|
|
important - understanding what steps are involved in video encoding.
|
|
You cannot do it without mathematics. Oh well, you can, but you'll
|
|
end up with movies that could certainly look better.
|
|
|
|
Now please shoot me if you have any complaints ;)
|
|
|
|
--
|
|
==> Ciao, Mosu (Moritz Bunkus)
|
|
|
|
===========
|
|
ANOTHER APPROACH: BITS PER BLOCK:
|
|
|
|
> $videobitrate * 1000
|
|
> $bpp = -----------------------
|
|
> $width * $height * $fps
|
|
|
|
Well, I came to similar equation going through different route. Only I
|
|
didn't use bits per pixel, in my case it was bits per block (BPB). The block
|
|
is 16x16 because lots of software depends on video width/height being
|
|
divisable by 16. And because I didn't like this 0.2 bit per pixel, when
|
|
bit is quite atomic ;)
|
|
|
|
So the equation was something like:
|
|
|
|
bitrate
|
|
bpb = -----------------
|
|
fps * ((width * height) / (16 * 16))
|
|
|
|
(width and height are from destination video size, and bitrate is in
|
|
bits (i.e. 900kbps is 900000))
|
|
|
|
This way it apeared that the minimum bits per block is ~40, very
|
|
good results are with ~50, and everything above 60 is a waste of bandwidth.
|
|
And what's actually funny is that it was independent of codec used. The
|
|
results were exactly the same, whether I used DIV3 (with tricky nandub's
|
|
magick), ffmpeg odivx, DivX5 on Windows or XviD.
|
|
|
|
Surprisingly there is one advantage of using nandub-DIV3 for bitrate
|
|
starved encoding: ringing almost never apears this way.
|
|
|
|
But I also found out, that the quality/BPB isn't constant for
|
|
drastically different resolutions. Smaller picture (like MPEG1 sizes)
|
|
need more BPB to look good than say typical MPEG2 resolutions.
|
|
|
|
Robert
|
|
|
|
|
|
===========
|
|
DON'T SCALE DOWN TOO MUCH
|
|
|
|
Sometimes I found that encoding to y-scaled only DVD qualty (ie 704 x
|
|
288 for a 2.85 film) gives better visual quality than a scaled-down
|
|
version even if the quantizers are significantly higher than for the
|
|
scaled-down version.
|
|
Keep in mind that blocs, fuzzy parts and generaly mpeg artefacts in a
|
|
704x288 image will be harder to spot in full-screen mode than on a
|
|
512x208 image. In fact I've see example where the same movie looks
|
|
better compressed to 704x288 with an average weighted quantizer of
|
|
~3 than the same movie scaled to 576x240 with an average weighted
|
|
quantizer of 2.4.
|
|
Btw, a print of the weighted average quantizer would be nice in
|
|
countquant.pl :)
|
|
|
|
Another point in favor of not trying to scale down too much : on hard
|
|
scaled-down movies, the MPEG codec will need to compress relatively
|
|
high frequencies rather than low frequencies and it doesn't like that
|
|
at all. You will see less and less returns while you scale down and
|
|
scale down again in desesperate need of some bandwidth :)
|
|
|
|
In my experience, don't try to go below a width of 576 without closely
|
|
watching what's going on.
|
|
|
|
--
|
|
Rémi
|
|
|
|
===========
|
|
TIPS FOR ENCODING
|
|
|
|
That being said, with video you have some tradeoffs you can make. Most
|
|
people seem to encode with really basic options, but if you play with
|
|
single coefficient elimination and luma masking settings, you can save lots
|
|
of bits, resulting in lower quantizers, which means less blockiness and
|
|
less ugly noise (ringing) around sharp borders. The tradeoff, however, is
|
|
that you'll get some "muddiness" in some parts of the image. Play around
|
|
with the settings and see for yourself. The options I typically use for
|
|
(non-animated) movies are:
|
|
|
|
vlelim=-4
|
|
vcelim=9
|
|
lumi_mask=0.05
|
|
dark_mask=0.01
|
|
|
|
If things look too muddy, making the numbers closer to 0. For anime and
|
|
other animation, the above recommendations may not be so good.
|
|
|
|
Another option that may be useful is allowing four motion vectors per
|
|
macroblock (v4mv). This will increase encoding time quite a bit, and
|
|
last I checked it wasn't compatible with B frames. AFAIK, specifying
|
|
v4mv should never reduce quality, but it may prevent some old junky
|
|
versions of DivX from decoding it (can anyone conform?). Another issue
|
|
might be increased cpu time needed for decoding (again, can anyone
|
|
confirm?).
|
|
|
|
To get more fair distribution of bits between low-detail and
|
|
high-detail scenes, you should probably try increasing vqcomp from the
|
|
default (0.5) to something in the range 0.6-0.8.
|
|
|
|
Of course you also want to make sure you crop ALL of the black border and
|
|
any half-black pixels at the edge of the image, and make sure the final
|
|
image dimensions after cropping and scaling are multiples of 16. Failing to
|
|
do so will drastically reduce quality.
|
|
|
|
Finally, if you can't seem to get good results, you can try scaling the
|
|
movie down a bit smaller or applying a weak gaussian blur to reduce the
|
|
amount of detail.
|
|
|
|
Now, my personal success story! I just recently managed to fit a beautiful
|
|
encode of Kundun (well over 2 hours long, but not too many high-motion
|
|
scenes) on one cd at 640x304, with 66 kbit/sec abr ogg audio, using the
|
|
options I described above. So, IMHO it's definitely possible to get very
|
|
good results with libavcodec (certainly MUCH better than all the idiot
|
|
"release groups" using DivX3 make), as long as you take some time to play
|
|
around with the options.
|
|
|
|
|
|
Rich
|
|
|
|
============
|
|
ABOUT VLELIM, VCELIM, LUMI_MASK AND DARK_MASK PART I: LUMA & CHROMA
|
|
|
|
|
|
The l/c in vlelim and vcelim stands for luma (brightness plane) and chroma
|
|
(color planes). These are encoded separately in all mpeg-like algorithms.
|
|
Anyway, the idea behind these options is (at least from what I understand)
|
|
to use some good heuristics to determine when the change in a block is less
|
|
than the threshold you specify, and in such a case, to just encode the
|
|
block as "no change". This saves bits and perhaps speeds up encoding. Using
|
|
a negative value for either one means the same thing as the corresponding
|
|
positive value, but the DC coefficient is also considered. Unfortunately
|
|
I'm not familiar enough with the mpeg terminology to know what this means
|
|
(my first guess would be that it's the constant term from the DCT), but it
|
|
probably makes the encoder less likely to apply single coefficient
|
|
elimination in cases where it would look bad. It's presumably recommended
|
|
to use negative values for luma (which is more noticable) and positive for
|
|
chroma.
|
|
|
|
The other options -- lumi_mask and dark_mask -- control how the quantizer
|
|
is adjusted for really dark or bright regions of the picture. You're
|
|
probably already at least a bit familiar with the concept of quantizers
|
|
(qscale, lower = more precision, higher quality, but more bits needed to
|
|
encode). What not everyone seems to know is that the quantizer you see
|
|
(e.g. in the 2pass logs) is just an average for the whole frame, and lower
|
|
or higher quantizers may in fact be used in parts of the picture with more
|
|
or less detail. Increasing the values of lumi_mask and dark_mask will cause
|
|
lavc to aggressively increase the quantizer in very dark or very bright
|
|
regions of the picture (which are presumably not as noticable to the human
|
|
eye) in order to save bits for use elsewhere.
|
|
|
|
Rich
|
|
|
|
===================
|
|
ABOUT VLELIM, VCELIM, LUMI_MASK AND DARK_MASK PART II: VQSCALE
|
|
|
|
OK, a quick explanation. The quantizer you set with vqscale=N is the
|
|
per-frame quantizer parameter (aka qp). However, with mpeg4 it's
|
|
allowed (and recommended!) for the encoder to vary the quantizer on a
|
|
per-macroblock (mb) basis (as I understand it, macroblocks are 16x16
|
|
regions composed of 4 8x8 luma blocks and 2 8x8 chroma blocks, u and
|
|
v). To do this, lavc scores each mb with a complexity value and
|
|
weights the quantizer accordingly. However, you can control this
|
|
behavior somewhat with scplx_mask, tcplx_mask, dark_mask, and
|
|
lumi_mask.
|
|
|
|
scplx_mask -- raise quantizer on mb's with lots of spacial complexity.
|
|
Spacial complexity is measured by variance of the texture (this is
|
|
just the actual image for I blocks and the difference from the
|
|
previous coded frame for P blocks).
|
|
|
|
tcplx_mask -- raise quantizer on mb's with lots of temporal
|
|
complexity. Temporal complexity is measured according to motion
|
|
vectors.
|
|
|
|
dark_mask -- raise quantizer on very dark mb's.
|
|
|
|
lumi_mask -- raise quantizer on very bright mb's.
|
|
Somewhere around 0-0.15 is a safe range for these values, IMHO. You
|
|
might try as high as 0.25 or 0.3. You should probably never go over
|
|
0.5 or so.
|
|
|
|
Now, about naq. When you adjust the quantizers on a per-mb basis like
|
|
this (called adaptive quantization), you might decrease or (more
|
|
likely) increase the average quantizer used, so that it no longer
|
|
matches the requested average quantizer (qp) for the frame. This will
|
|
result in weird things happening with the bitrate, at least from my
|
|
experience. What naq does is "normalize adaptive quantization". That
|
|
is, after the above masking parameters are applied on a per-mb basis,
|
|
the quantizers of all the blocks are rescaled so that the average
|
|
stays fixed at the desired qp.
|
|
|
|
So, if I used vqscale=4 with naq and fairly large values for the
|
|
masking parameters, I might be likely to see lots of frames using
|
|
qscale 2,3,4,5,6,7 across different macroblocks as needed, but with
|
|
the average sticking around 4. However, I haven't actually tested such
|
|
a setup yet, so it's just speculation right now.
|
|
|
|
Have fun playing around with it.
|
|
|
|
Rich
|
|
|
|
|
|
================================================================================
|
|
|
|
|
|
TIPS FOR ENCODING OLD BLACK & WHITE MOVIES:
|
|
|
|
I found myself that 4:3 B&W old movies are very hard to compress well. In
|
|
addition to the 4:3 aspect ratio which eats lots of bits, those movies are
|
|
typically very "noisy", which doesn't help at all. Anyway :
|
|
|
|
> After a few tries I am
|
|
> still a little bit disappointed with the video quality. Since it is a
|
|
> "dark" movies, there is a lot of black on the pictures, and on the
|
|
> encoded avi I can see a lot of annoying "mpeg squares". I am using
|
|
> avifile codec, but the best I think is to give you the command line I
|
|
> used to encode a preview of the result:
|
|
|
|
>
|
|
> First pass:
|
|
> mencoder TITLE01-ANGLE1.VOB -oac copy -ovc lavc -lavcopts
|
|
> vcodec=mpeg4:vhq:vpass=1:vbitrate=800:keyint=48 -ofps 23.976 -npp lb
|
|
> -ss 2:00 -endpos 0:30 -vf scale -zoom -xy 640 -o movie.avi
|
|
|
|
1) keyint=48 is way too low. The default value is 250, this is in *frames*
|
|
not seconds. Keyframes are significantly larger than P or B frames, so the
|
|
less keyframes you have, better the overall movie will be. (huh, like Yoda
|
|
I speak ;). Try keyint=300 or 350. Don't go beyond that if you want
|
|
relatively precise seeking.
|
|
|
|
2) you may want to play with vlelim and vcelim options. This can gives you
|
|
a significant "quality" boost. Try one of these couples :
|
|
|
|
vlelim=-2:vcelim=3
|
|
vlelim=-3:vcelim=5
|
|
vlelim=-4:vcelim=7
|
|
(and yes, there's a minus)
|
|
|
|
3) crop & rescale the movie before passing it to the codec. First crop the
|
|
movie to not encode black bars if there's any. For a 1h40mn movie
|
|
compressed to a 700 MB file, I would try something between 512x384 and
|
|
480x320. Don't go below that if you want something relatively sharp when
|
|
viewed fullscreen.
|
|
|
|
4) I would recommend using the Ogg Vorbis audio codec with the .ogm
|
|
container format. Ogg Vorbis compress audio better than MP3. On a typical
|
|
old, mono-only audio stream, a 45 kbits/s Vorbis stream is ok. How to
|
|
extract & compress an audio stream from a ripped DVD (mplayer dvd://1
|
|
-dumpstream) :
|
|
|
|
rm -f audiodump.pcm ; mkfifo -m 600 audiodump.pcm
|
|
mplayer -quiet -vc null -vo null -aid 128 -ao pcm -nowaveheader stream.dump &
|
|
oggenc --raw --raw-bits=16 --raw-chan=2 --raw-rate=48000 -q 1 -o audio-us.ogg
|
|
+audiodump.pcm &
|
|
wait
|
|
|
|
For a nice set of utilities to manager the .ogm format, see Moritz Bunkus'
|
|
ogmtools (google is your friend).
|
|
|
|
5) use the "v4mv" option. This could gives you a few more bits at the
|
|
expense of a slightly longer encoding. This is a "lossless" option, I mean
|
|
with this option you don't throw away some video information, it just
|
|
selects a more precise motion estimation method. Be warned that on some
|
|
very un-typical scenes this option may gives you a longer file than
|
|
without, although it's very rare and on a whole film I think it's always a
|
|
win.
|
|
|
|
6) you can try the new luminance & darkness masking code. Play
|
|
with the "lumi_mask" and "dark_mask" options. I would recommend using
|
|
something like :
|
|
lumi_mask=0.07:dark_mask=0.10:naq:
|
|
lumi_mask=0.10:dark_mask=0.12:naq:
|
|
lumi_mask=0.12:dark_mask=0.15:naq
|
|
lumi_mask=0.13:dark_mask=0.16:naq:
|
|
Be warned that these options are really experimental and the result
|
|
could be very good or very bad depending on your visualization device
|
|
(computer CRT, TV or TFT screen). Don't push too hard these options.
|
|
|
|
> Second pass:
|
|
> the same with vpass=2
|
|
|
|
7) I've found that lavc gives better results when the first pass is done
|
|
with "vqscale=2" instead of a target bitrate. The statistics collected
|
|
seems to be more precise. YMMV.
|
|
|
|
> I am new to mencoder, so please tell me any idea you have even if it
|
|
> obvious. I also tried the "gray" option of lavc, to encode B&W only,
|
|
> but strangely it gives me "pink" squares from time to time.
|
|
|
|
Yes, I've seen that too. Playing the resulting file with "-lavdopts gray"
|
|
fix the problem but it's not very nice ...
|
|
|
|
> So if you could tell me what option of mencoder or lavc I should be
|
|
> looking at to lower the number of "squares" on the image, it would be
|
|
> great. The version of mencoder i use is 0.90pre8 on a macos x PPC
|
|
> platform. I guess I would have the same problem by encoding anime
|
|
> movies, where there are a lot of region of the image with the same
|
|
> color. So if you managed to solve this problem...
|
|
|
|
You could also try the "mpeg_quant" flag. It selects a different set of
|
|
quantizers and produce somewhat sharper pictures and less blocks on large
|
|
zones with the same or similar luminance, at the expense of some bits.
|
|
|
|
> This is completely off topic, but do you know how I can create good
|
|
> subtitles from vobsub subtitles ? I checked the -dumpmpsub option of
|
|
> mplayer, but is there a way to do it really fast (ie without having to
|
|
> play the whole movie) ?
|
|
|
|
I didn't find a way under *nix to produce reasonably good text subtitles
|
|
from vobsubs. OCR *nix softwares seems either not suited to the task, not
|
|
powerful enough or both. I'm extracting the vobsub subtitles and simply use
|
|
them with the .ogm
|
|
|
|
/ .avi :
|
|
1) rip the DVD to harddisk with "mplayer dvd://1 -dumpstream"
|
|
2) mount the DVD and copy the .ifo file
|
|
2) extract all vobsubs to one single file with something like :
|
|
|
|
for f in 0 1 2 3 4 5 6 7 8 9 10 11 ; do \
|
|
mencoder -ovc copy -oac copy -o /dev/null -sid $f -vobsubout sous-titres
|
|
+-vobsuboutindex $f -ifo vts_01_0.ifo stream.dump
|
|
done
|
|
|
|
(and yes, I've a DVD with 12 subtitles)
|
|
--
|
|
Rémi
|
|
|
|
|
|
================================================================================
|
|
|
|
|
|
TIPS FOR SMOKE & CLOUDS
|
|
|
|
Q: I'm trying to encode Dante's Peak and I'm having problems with clouds,
|
|
fog and smoke: They don't look fine (they look very bad if I watch the
|
|
movie in TV-out). There are some artifacts, white clouds looks as snow
|
|
mountains, there are things likes hip in the colors so one can see frontier
|
|
curves between white and light gray and dark gray ... (I don't know if you
|
|
can understand me, I want to mean that the colors don't change smoothly)
|
|
In particular I'm using vqscale=2:vhq:v4mv
|
|
|
|
A: Try adding "vqcomp=0.7:vqblur=0.2:mpeg_quant" to lavcopts.
|
|
|
|
Q: I tried your suggestion and it improved the image a little ... but not
|
|
enough. I was playing with different options and I couldn't find the way.
|
|
I suppose that the vob is not so good (watching it in TV trough the
|
|
computer looks better than my encoding, but it isn't a lot of better).
|
|
|
|
A: Yes, those scenes with qscale=2 looks terrible :-(
|
|
|
|
Try with vqmin=1 in addition to mpeg_quant:vlelim=-4:vcelim=-7 (and maybe
|
|
with "-sws 10 -ssf ls=1" to sharpen a bit the image) and read about vqmin=1
|
|
in DOCS/tech/libavc-options.txt.
|
|
|
|
If after the whole movie is encoded you still see the same problem, it will
|
|
means that the second pass didn't picked-up q=1 for this scene. Force q=1
|
|
with the "vrc_override" option.
|
|
|
|
Q: By the way, is there a special difficult in encode clouds or smoke?
|
|
|
|
A: I would say it depends on the sharpness of these clouds / smokes and the
|
|
fact that they are mostly black/white/grey or colored. The codec will do
|
|
the right thing with vqmin=2 for example on a cigarette smoke (sharp) or on
|
|
a red/yellow cloud (explosion, cloud of fire). But may not with a grey and
|
|
very fuzzy cloud like in the chocolat scene. Note that I don't know exactly
|
|
why ;)
|
|
|
|
A = Rémi
|
|
|
|
|
|
================================================================================
|
|
|
|
|
|
TIPS FOR TWEAKING RATECONTROL
|
|
|
|
(For the purpose of this explanation, consider "2nd pass" to be any beyond
|
|
the 1st. The algorithm is run only on P-frames; I- and B-frames use QPs
|
|
based on the adjacent P. While x264's 2pass ratecontrol is based on lavc's,
|
|
it has diverged somewhat and not all of this is valid for x264.)
|
|
|
|
Consider the default ratecontrol equation in lavc: "tex^qComp".
|
|
At the beginning of the 2nd pass, rc_eq is evaluated for each frame, and
|
|
the result is the number of bits allocated to that frame (multiplied by
|
|
some constant as needed to match the total requested bitrate).
|
|
|
|
"tex" is the complexity of a frame, i.e. the estimated number of bits it
|
|
would take to encode at a given quantizer. (If the 1st pass was CQP and
|
|
not turbo, then we know tex exactly. Otherwise it is calculated by
|
|
multiplying the 1st pass's bits by the QP of that frame. But that's not
|
|
why CQP is potentially good; more on that later.)
|
|
"qComp" is just a constant. It has no effect outside the rc_eq, and is
|
|
directly set by the vqcomp parameter.
|
|
|
|
If vqcomp=1, then rc_eq=tex^1=tex, so 2pass allocates to each frame the
|
|
number of bits needed to encode them all at the same QP.
|
|
If vqcomp=0, then rc_eq=tex^0=1, so 2pass allocates the same number of
|
|
bits to each frame, i.e. CBR. (Actually, this is worse than 1pass CBR in
|
|
terms of quality; CBR can vary within its allowed buffer size, while
|
|
vqcomp=0 tries to make each frame exactly the same size.)
|
|
If vqcomp=0.5, then rc_eq=sqrt(tex), so the allocation is somewhere
|
|
between CBR and CQP. High complexity frames get somewhat lower quality
|
|
than low complexity, but still more bits.
|
|
|
|
While the actual selection of a good value of vqcomp is experimental, the
|
|
following underlying factors determine the result:
|
|
Arguing towards CQP: You want the movie to be somewhere approaching
|
|
constant quality; oscillating quality is even more annoying than constant
|
|
low quality. (However, constant quality does not mean constant PSNR nor
|
|
constant QP. Details are less noticeable in high-motion scenes, so you can
|
|
get away with somewhat higher QP in high-complexity frames for the same
|
|
perceived quality.)
|
|
Arguing towards CBR: You get more quality per bit if you spend those bits
|
|
in frames where motion compensation works well (which tends to be
|
|
correlated with "tex"): A given artifact may stick around several seconds
|
|
in a low-motion scene, and you only have to fix it in one frame to improve
|
|
the quality of the whole sequence.
|
|
|
|
Now for why the 1st pass ratecontrol method matters:
|
|
The number of bits at constant quant is as good a measure of complexity as
|
|
any other simple formula for the purpose of allocating bits. But it's not
|
|
perfect for predicting which QP will produce the desired number of bits.
|
|
Bits are approximately inversely proportional to QP, but the exact scaling
|
|
is non-linear, and depends on the amount of detail/noise, the complexity of
|
|
motion, the quality of previous frames, and other stuff not measured by the
|
|
ratecontrol. So it predicts best when the QP used for a given frame in the
|
|
2nd pass is close to the QP used in the 1st pass. When the prediction is
|
|
wrong, lavc needs to distribute the surplus or deficit of bits among future
|
|
frames, which means that they too deviate from the planned distribution.
|
|
Obviously, with vqcomp=1 you can get the 1st pass QPs very close by using
|
|
CQP, and with vqcomp=0 a CBR 1st pass is very close. But with vqcomp=0.5
|
|
it's more ambiguous. The accepted wisdom is that CBR is better for
|
|
vqcomp=0.5, mostly because you then don't have to guess a QP in advance.
|
|
But with vqcomp=0.6 or 0.7, the 2nd pass QPs vary less, so a CQP 1st pass
|
|
(with the right QP) will be a better predictor than CBR.
|
|
|
|
To make it more confusing, 1pass CBR uses the same rc_eq with a different
|
|
meaning. In CBR, we don't have a real encode to estimate from, so "tex" is
|
|
calculated from the full-pixel precision motion-compensation residual.
|
|
While the number of bits allocated to a given frame is decided by the rc_eq
|
|
just like in 2nd pass, the target bitrate is constant (instead of being the
|
|
sum of per-frame rc_eq values). So the scaling factor (which is constant in
|
|
2nd pass) now varies in order to keep the local average bitrate near the
|
|
CBR target. So vqcomp does affect CBR, though it only determines the local
|
|
allocation of bits, not the long-term allocation.
|
|
|
|
--Loren Merritt
|