mirror of https://github.com/mpv-player/mpv synced 2025-03-25 04:38:01 +00:00

x264's encoding and install guide

Based on Jeff Clagg's "preliminary x264 encoding help text"


git-svn-id: svn://svn.mplayerhq.hu/mplayer/trunk@15327 b3059339-0415-0410-9bf9-f77b7e298cf2
This commit is contained in:
gpoirier 2005-05-02 17:45:23 +00:00
parent d07fabd85f
commit 97378713fd
2 changed files with 443 additions and 0 deletions


@ -530,6 +530,149 @@ decoders:
<step><para>compile <application>MPlayer</application></para></step>
</procedure>
</sect3>
<sect3 id="codec-x264">
<title>x264</title>
<sect4 id="codec-x264-whatis">
<title>What is x264?</title>
<para>
<systemitem class="library">x264</systemitem> is a library for
creating H.264 video streams.
It is not 100% complete, but currently it has at least some kind
of support for most of the H.264 features which impact quality.
There are also many advanced features in the H.264 specification
which have nothing to do with video quality per se; many of these
are not yet implemented in
<systemitem class="library">x264</systemitem>.
</para>
<itemizedlist>
<title>Encoder features</title>
<listitem><para>CAVLC/CABAC</para></listitem>
<listitem><para>Multi-references</para></listitem>
<listitem><para>Intra: all macroblock types (16x16 and 4x4 with
all predictions)</para></listitem>
<listitem><para>Inter P: all partitions (from 16x16 down to
4x4)</para></listitem>
<listitem><para>Inter B: partitions from 16x16 down to 8x8
(including SKIP/DIRECT)</para></listitem>
<listitem><para>Ratecontrol: constant quantizer, constant bitrate,
or multipass ABR</para></listitem>
<listitem><para>Scene cut detection</para></listitem>
<listitem><para>Adaptive B-frame placement</para></listitem>
<listitem><para>B-frames as references / arbitrary frame
order</para></listitem>
</itemizedlist>
<itemizedlist>
<title>Encoder limitations</title>
<listitem><para>No real RD</para></listitem>
</itemizedlist>
</sect4>
<sect4 id="codec-h264-whatis">
<title>What is H.264?</title>
<para>
H.264 is one name for a new digital video codec jointly developed
by the ITU and MPEG.
It can also be correctly referred to by the cumbersome names of
"ISO/IEC 14496-10" or "MPEG-4 Part 10".
More frequently, it is referred to as "MPEG-4 AVC" or just "AVC".
</para>
<para>
Whatever you call it, H.264 may be worth trying because it can
typically match the quality of MPEG-4 ASP with 5%-30% less
bitrate.
Actual results will depend on both the source material and the
encoder.
The gains from using H.264 do not come for free: decoding H.264
streams seems to have steep CPU and memory requirements.
For instance, on a 1733 MHz Athlon, a 1500kbps H.264 video uses
around 50% CPU to decode.
By comparison, decoding a 1500kbps MPEG-4 ASP stream requires
around 10% CPU.
This means that decoding high-definition streams is almost out of
the question for most users.
It also means that even a decent DVD rip may sometimes stutter on
processors slower than 2.0 GHz or so.
</para>
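<para>
One rough way to check how demanding a particular file is on your own
machine is a decode-only benchmark.
The sketch below assumes a file named <filename>movie.avi</filename>
(a placeholder); it decodes as fast as possible without displaying
video or playing audio, and <application>MPlayer</application> prints
timing statistics when it exits:
<screen>
mplayer -benchmark -nosound -vo null movie.avi
</screen>
If the reported total time is close to or above the duration of the
clip, real-time playback will probably stutter.
</para>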
<para>
At least with <systemitem class="library">x264</systemitem>,
encoding requirements are not much worse than what you are used to
with MPEG-4 ASP.
For instance, on a 1733 MHz Athlon a typical DVD encode would run
at 5-15fps.
</para>
<para>
This document is not intended to explain the details of H.264,
but if you are interested in a brief overview, you may want to read
<ulink url="http://www.cdt.luth.se/~peppar/kurs/smd151/spie04-h264OverviewPaper.pdf">The H.264/AVC Advanced Video Coding Standard: Overview and Introduction to the Fidelity Range Extensions</ulink>.
</para>
</sect4>
<sect4 id="codec-x264-playback">
<title>How can I play H.264 videos with <application>MPlayer</application>?</title>
<para>
<application>MPlayer</application> uses
<systemitem class="library">libavcodec</systemitem>'s H.264
decoder.
<systemitem class="library">libavcodec</systemitem> has had at
least minimally usable H.264 decoding since around July 2004;
however, major changes and improvements have been made since
then, both in the range of features supported and in CPU usage.
To be safe, it is always a good idea to use a recent CVS
checkout.
</para>
<para>
If you want a quick and easy way to know whether there have been
recent changes to <systemitem class="library">libavcodec</systemitem>'s
H.264 decoding, you might keep an eye on the
<ulink url="http://mplayerhq.hu/cgi-bin/cvsweb.cgi/ffmpeg/libavcodec/h264.c?cvsroot=FFMpeg">FFmpeg CVS repository's web interface</ulink>.
</para>
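<para>
To check that your build includes an H.264 decoder at all, you can
list the available video codecs and filter the output.
This is only a sketch; the exact names and wording in the list depend
on your <application>MPlayer</application> version:
<screen>
mplayer -vc help | grep -i h264
</screen>
If nothing is printed, your <application>MPlayer</application> was
most likely built without H.264 decoding support.
</para>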
</sect4>
<sect4 id="codec-x264-encode">
<title>How can I encode videos using <application>MEncoder</application> and <systemitem class="library">x264</systemitem>?</title>
<para>
If you have the Subversion client installed, the latest x264
sources can be obtained with this command:
<screen>
svn co svn://svn.videolan.org/x264/trunk x264
</screen>
<application>MPlayer</application> sources are updated whenever
an <systemitem class="library">x264</systemitem> API change
occurs, so using CVS <application>MPlayer</application> as well
is always recommended.
Perhaps this situation will change when and if an
<systemitem class="library">x264</systemitem> "release" occurs.
Meanwhile, <systemitem class="library">x264</systemitem> should
be considered very unstable, in the sense that its programming
interface is subject to change.
</para>
<para>
<systemitem class="library">x264</systemitem> is built and
installed in the standard way:
<screen>
./configure &amp;&amp; make &amp;&amp; sudo make install
</screen>
This installs libx264.a in /usr/local/lib and x264.h in
/usr/local/include.
With the <systemitem class="library">x264</systemitem> library
and header placed in the standard locations, building
<application>MPlayer</application> with
<systemitem class="library">x264</systemitem> support is easy.
Just run the standard:
<screen>./configure &amp;&amp; make &amp;&amp; sudo make install</screen>
The configure script will autodetect that you have satisfied the
requirements for <systemitem class="library">x264</systemitem>.
</para>
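<para>
As a quick sanity check after the build, you can ask
<application>MEncoder</application> to list its video encoders and
look for <systemitem class="library">x264</systemitem>.
This is a sketch rather than an official test; the exact wording of
the list varies between versions:
<screen>
mencoder -ovc help | grep -i x264
</screen>
If the command prints nothing, configure probably did not find the
library or header, and its output is the first place to look.
</para>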
</sect4>
</sect3>
</sect2>

View File

@ -1806,6 +1806,306 @@ vcodec=mpeg2video:intra_matrix=8,9,12,22,26,27,29,34,9,10,14,26,27,29,34,37,
</sect1>
<sect1 id="menc-feat-x264">
<title>Encoding with the <systemitem class="library">x264</systemitem> codec</title>
<para>
<systemitem class="library">x264</systemitem> is a free library for
encoding H.264/AVC video streams.
Before starting to encode, you need to <link linkend="codec-x264-encode">
set up <application>MEncoder</application> to support it</link>.
</para>
<sect2 id="menc-feat-x264-intro">
<title>What options should I use to get the best results?</title>
<para>
Please begin by reviewing the
<systemitem class="library">x264</systemitem> section of
<application>MPlayer</application>'s man page.
This section is intended to be a supplement to the man page.
</para>
<orderedlist>
<title>There are mainly three types of considerations when choosing encoding
options:</title>
<listitem><para>Trading off encoding time vs. quality</para></listitem>
<listitem><para>Frame type decision options</para></listitem>
<listitem><para>Ratecontrol and quantization decision options</para></listitem>
</orderedlist>
<para>
This guide is mostly concerned with the first class of options.
The other two types often have more to do with personal
preferences and individual requirements.
</para>
<para>
Before continuing, please note that this guide uses only one
quality metric: global PSNR.
For a brief explanation of what PSNR is, see
<ulink url="http://en.wikipedia.org/wiki/PSNR">the Wikipedia article on PSNR</ulink>.
Global PSNR is the last PSNR number reported when you include
the <option>psnr</option> option in <option>x264encopts</option>.
Whenever you read a claim about PSNR in this guide, one of the
assumptions behind the claim is that equal bitrates are used.
</para>
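<para>
As an illustration, the following command enables PSNR reporting for
a test encode.
The file names and the bitrate of 1000 are placeholders chosen for
this example, not recommendations:
<screen>
mencoder input.avi -ovc x264 -x264encopts bitrate=1000:psnr -nosound -o test.avi
</screen>
The global PSNR is the last PSNR figure printed when the encode
finishes.
</para>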
<para>
Nearly all of this guide's comments assume you are using
two-pass encoding.
When comparing options, there are two major reasons for using
two-pass encoding.
First, two-pass encoding often gains around 1dB PSNR, which is a
very big difference.
Secondly, testing options by doing direct quality comparisons
with 1-pass encodes is a dubious proposition because bitrate
often varies significantly with each encode.
It is not always easy to tell whether quality changes are due
mainly to changed options, or if they mostly reflect
differences in the achieved bitrate.
</para>
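<para>
A minimal two-pass encode might look like the sketch below.
The file names and the bitrate are placeholders; the first pass only
collects statistics in a log file in the current directory, so its
video output is discarded:
<screen>
mencoder input.avi -ovc x264 -x264encopts pass=1:bitrate=1000 -oac copy -o /dev/null
mencoder input.avi -ovc x264 -x264encopts pass=2:bitrate=1000 -oac copy -o output.avi
</screen>
Both passes must be run from the same directory and with the same
target bitrate so that the second pass can use the statistics from
the first.
</para>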
<para>
Of the options which allow you to trade off speed for quality,
<option>subq</option> and <option>frameref</option> are usually
by far the most important.
If you are interested in tweaking either speed or quality, these
are the first options you should consider.
</para>
<para>
On the speed dimension, the <option>frameref</option> and
<option>subq</option> options interact with each other fairly
strongly.
Experience shows that, with one reference frame,
<option>subq=5</option> takes about 35% more time than
<option>subq=1</option>.
With 6 reference frames, the penalty grows to over 60%.
<option>subq</option>'s effect on PSNR seems fairly constant
regardless of the number of reference frames.
Typically, <option>subq=5</option> gains 0.2-0.5 dB
global PSNR over <option>subq=1</option>.
This is usually enough to be visible.
</para>
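<para>
If you want to measure this tradeoff on your own material, one
possible approach is to run a single first pass and then time several
second passes that differ only in <option>subq</option>.
The sketch below assumes a short test clip named
<filename>input.avi</filename> and a placeholder bitrate of 1000:
<screen>
mencoder input.avi -ovc x264 -x264encopts pass=1:bitrate=1000:frameref=1:subq=1 -nosound -o /dev/null
for subq in 1 5; do
  time mencoder input.avi -ovc x264 -x264encopts pass=2:bitrate=1000:frameref=1:subq=$subq:psnr -nosound -o test-subq$subq.avi
done
</screen>
Comparing the elapsed times and the global PSNR of the two runs shows
what <option>subq</option> costs and gains for that clip; repeat with
a higher <option>frameref</option> to see how the two options
interact.
</para>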
</sect2>
<sect2 id="menc-feat-x264-encoding-options">
<title>Encoding options of x264</title>
<itemizedlist>
<listitem><para>
<emphasis role="bold">frameref</emphasis>:
<option>frameref</option> is set to 1 by default, but this
should not be taken to imply that it is reasonable to set it
to 1.
Merely raising <option>frameref</option> to 2 gains around
0.15dB PSNR with a 5-10% speed penalty; this seems like a
good tradeoff.
<option>frameref=3</option> gains around 0.25dB PSNR over
<option>frameref=1</option>, which should be a visible
difference.
<option>frameref=3</option> is around 15% slower than
<option>frameref=1</option>.
Unfortunately, diminishing returns set in rapidly.
<option>frameref=6</option> can be expected to gain only
0.05-0.1 dB over <option>frameref=3</option> at an additional
15% speed penalty.
Above <option>frameref=6</option>, the quality gains are
usually very small (although you should keep in mind throughout
this whole discussion that it can vary quite a lot depending on
your source).
In a fairly typical case, <option>frameref=12</option>
will improve global PSNR by a tiny 0.02dB over
<option>frameref=6</option>, at a speed cost of 15%-20%.
At such high <option>frameref</option> values, the only really
good thing that can be said is that increasing even further will
almost certainly never <emphasis role="bold">harm</emphasis>
PSNR, but the additional quality benefits are barely even
measurable, let alone perceptible.
</para>
<note><title>Note:</title>
<para>
Raising <option>frameref</option> to unnecessarily high values
<emphasis role="bold">can</emphasis> and
<emphasis role="bold">usually does</emphasis>
hurt coding efficiency if you turn CABAC off.
With CABAC on (the default behavior), the possibility of setting
<option>frameref</option> "too high" currently seems too remote
to even worry about, and in the future, optimizations may remove
the possibility altogether.
</para>
</note>
<para>
If you care about speed, a reasonable compromise is to use low
<option>subq</option> and <option>frameref</option> values on
the first pass, and then raise them on the second pass.
Typically, this has a negligible negative effect on the final
quality: you will probably lose well under 0.1dB PSNR, which
should be far too small a difference to see.
However, different values of <option>frameref</option> can
occasionally affect frametype decision.
Most likely, these are rare outlying cases, but if you want to
be pretty sure, consider whether your video has either
fullscreen repetitive flashing patterns or very large temporary
occlusions which might force an I-frame.
Adjust the first-pass <option>frameref</option> so it is large
enough to contain the duration of the flashing cycle (or occlusion).
For example, if the scene flashes back and forth between two images
over a duration of three frames, set the first pass
<option>frameref</option> to 3 or higher.
This issue is probably extremely rare in live action video material,
but it does sometimes come up in video game captures.
</para></listitem>
<listitem><para>
<emphasis role="bold">bframes</emphasis>:
The usefulness of B-frames is questionable in most other codecs
you may be used to.
In H.264, this has changed: there are new techniques and block
types that are possible in B-frames.
Usually, even a naive B-frame choice algorithm can have a
significant PSNR benefit.
It is also interesting to note that if you turn off the adaptive
B-frame decision (<option>nob_adapt</option>), encoding with
<option>bframes</option> usually speeds up encoding somewhat.
</para>
<para>
With adaptive B-frame decision turned off
(<option>x264encopts</option>'s <option>nob_adapt</option>),
the optimal value for this setting will usually range from
<option>bframes=1</option> to <option>bframes=3</option>.
With adaptive B-frame decision on (the default behavior), it is
probably safe to use higher values; the encoder will try to
reduce the use of B-frames in scenes where they would hurt
compression.
</para>
<para>
If you are going to use <option>bframes</option> at all, consider
setting the maximum number of B-frames to 2 or higher in order to
take advantage of weighted prediction.
</para></listitem>
<listitem><para>
<emphasis role="bold">b_adapt</emphasis>:
Note: this is on by default.
</para>
<para>
With this option enabled, the encoder will use some simple
heuristics to reduce the number of B-frames used in scenes that
might not benefit from them as much.
You can use <option>b_bias</option> to tweak how B-frame-happy
the encoder is.
The speed penalty of adaptive B-frames is currently rather modest,
but so is the potential quality gain.
It usually does not hurt, however.
Note that this only affects speed and frametype decision on the
first pass.
<option>b_adapt</option> and <option>b_bias</option> have no
effect on subsequent passes.
</para></listitem>
<listitem><para>
<emphasis role="bold">b_pyramid</emphasis>:
You might as well enable this option if you are using more than 2 B-frames;
as the man page says, you get a little quality improvement with no
speed cost.
Note that these videos cannot be read by libavcodec-based decoders
older than about March 5, 2005.
</para></listitem>
<listitem><para>
<emphasis role="bold">weight_b</emphasis>:
In typical cases, there is not much gain with this option.
However, in crossfades or fade-to-black scenes, weighted
prediction gives rather large bitrate savings.
In MPEG-4 ASP, a fade-to-black is usually best coded as a series
of expensive I-frames; using weighted prediction in B-frames
makes it possible to turn at least some of these into much more
reasonably-sized B-frames.
Encoding time cost seems to be minimal, if there is any.
Also, contrary to what some people seem to guess, the decoder
CPU requirements are not much affected by weighted prediction,
all else being equal.
</para>
<para>
Unfortunately, the current adaptive B-frame decision algorithm
has a strong tendency to avoid B-frames during fades.
Until this changes, it may be a good idea to add
<option>nob_adapt</option> to your <option>x264encopts</option>, if you expect
fades to have a significant effect in your particular video
clip.
</para></listitem>
<listitem><para>
<emphasis role="bold">deblockalpha, deblockbeta</emphasis>:
This topic is going to be a bit controversial.
</para>
<para>
H.264 defines a simple deblocking procedure on I-blocks that uses
pre-set strengths and thresholds depending on the QP of the block
in question.
By default, high QP blocks are filtered heavily, and low QP blocks
are not deblocked at all.
The pre-set strengths defined by the standard are well-chosen and
the odds are very good that they are PSNR-optimal for whatever
video you are trying to encode.
The <option>deblockalpha</option> and <option>deblockbeta</option>
parameters allow you to specify offsets to the preset deblocking
thresholds.
</para>
<para>
Many people seem to think it is a good idea to lower the deblocking
filter strength by large amounts (say, -3).
This is, however, almost never a good idea, and in most cases,
people who are doing this do not understand very well how
deblocking works by default.
</para>
<para>
The first and most important thing to know about the in-loop
deblocking filter is that the default thresholds are almost always
PSNR-optimal.
In the rare cases that they are not optimal, the ideal offset is
plus or minus 1.
Adjusting deblocking parameters by a larger amount is almost
guaranteed to hurt PSNR.
Strengthening the filter will smear more details; weakening the
filter will increase the appearance of blockiness.
</para>
<para>
It is definitely a bad idea to lower the deblocking thresholds if
your source is mainly low in spatial complexity (i.e., not a lot
of detail or noise).
The in-loop filter does a rather excellent job of concealing
the artifacts that occur.
If the source is high in spatial complexity, however, artifacts
are less noticeable.
This is because the ringing tends to look like detail or noise.
Human visual perception easily notices when detail is removed,
but it does not so easily notice when the noise is wrongly
represented.
When it comes to subjective quality, noise and detail are somewhat
interchangeable.
By lowering the deblocking filter strength, you are most likely
increasing error by adding ringing artifacts, but the eye does
not notice because it confuses the artifacts with detail.
</para>
<para>
This <emphasis role="bold">still</emphasis> does not justify
lowering the deblocking filter strength, however.
You can generally get better quality noise from postprocessing.
If your H.264 encodes look too blurry or smeared, try playing with
<option>-vf noise</option> when you play your encoded movie.
<option>-vf noise=8a:4a</option> should conceal most mild
artifacting.
It will almost certainly look better than the results you
would have gotten just by fiddling with the deblocking filter.
</para></listitem>
</itemizedlist>
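<para>
The options discussed above can be combined as in the following
sketch, which uses fast settings on the first pass and slower,
higher-quality settings on the second.
The file names, the bitrate and the particular values are
placeholders chosen for illustration, not tuned recommendations; the
deblocking parameters are left at their defaults:
<screen>
# first pass: low subq and frameref, only the statistics are kept
mencoder input.avi -ovc x264 -x264encopts pass=1:bitrate=1000:subq=1:frameref=1:bframes=3:b_pyramid:weight_b -oac copy -o /dev/null
# second pass: higher subq and frameref, same B-frame settings
mencoder input.avi -ovc x264 -x264encopts pass=2:bitrate=1000:subq=5:frameref=3:bframes=3:b_pyramid:weight_b -oac copy -o output.avi
# playback with added noise instead of a weakened deblocking filter
mplayer -vf noise=8a:4a output.avi
</screen>
If your source contains many fades, consider adding
<option>nob_adapt</option> to the first pass, for the reasons given
under <option>weight_b</option>.
</para>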
</sect2>
</sect1>
<sect1 id="menc-feat-telecine">
<title>How to deal with telecine and interlacing within NTSC DVDs</title>