mirror of https://github.com/mpv-player/mpv synced 2024-12-30 11:02:10 +00:00
mpv/DOCS/tech/encoding-guide.txt
diego 8302f9e9d6 ilmv --> ilme typo fix
patch by compn < tempn == at == twmi == dot == rr == dot == com >


git-svn-id: svn://svn.mplayerhq.hu/mplayer/trunk@16817 b3059339-0415-0410-9bf9-f77b7e298cf2
2005-10-20 13:45:41 +00:00


Topics:

  I. Preparing to encode
    1. Identifying source material and framerate
    2. Selecting the quality you want
    3. Constraints for efficient encoding
    4. Cropping and scaling
    5. Choosing resolution and bitrate
  II. Containers and codecs
    1. Where the movie will be played
    2. Constraints of DVD, SVCD, and VCD
    3. Limitations of AVI container
  III. Basic MEncoder usage
    1. Selecting codecs & format
    2. Selecting input file or device
    3. Loading video filters
    4. Notes on A/V sync
  IV. Encoding procedures
    1. Encoding progressive video
    2. Two-pass encoding
    3. Encoding interlaced video
    4. Deinterlacing
    5. Inverse telecine
    6. Capturing TV input
    7. Dealing with mixed-source content
    8. Low-quality & damaged sources
  V. Optimizing encoding quality
    1. Noise removal
    2. Pure quality-gain options
    3. Questionable-gain options
    4. Advanced MPEG-4 features

I. Preparing to encode
Before you even think about encoding a movie, you need to take several
preliminary steps.
I.1. Identifying source material and framerate
The first and most important step before you encode should be
determining what type of content you're dealing with. If your source
material comes from DVD or broadcast/cable/satellite TV, it will be
stored in one of two formats: NTSC for North America and Japan, and
PAL for Europe, etc. But it's important to realize that this is just
the formatting for presentation on a television, and often does NOT
correspond to the original format of the movie. In order to produce a
suitable encode, you need to know the original format. Failure to take
this into account will result in ugly combing (interlacing) artifacts
in your encode, and will greatly reduce the quality/bitrate ratio of
the encoder!
Here is a list of common types of source material, where you're likely
to find them, and their properties:
Standard Film: Produced for theatrical display at 24fps.
PAL video: Recorded with a PAL video camera at 50 fields per second. A
field consists of just the even or odd numbered lines of a frame.
Television was designed to refresh these in alternation as a cheap
form of analog compression. The human eye supposedly compensates for
this, but once you understand interlacing you'll learn to see it on TV
too and never enjoy TV again. Two fields do NOT make a complete frame,
because they are captured 1/50 of a second apart in time, and thus
they do not line up unless there is no motion.
NTSC video: Recorded with an NTSC video camera at 59.94 fields per
second, or 60 fields per second in the pre-color era. Otherwise
similar to PAL.
Animation: Usually drawn at 24fps, but animation also comes in
mixed-framerate varieties.
Computer Graphics (CG): Can be any framerate, but 24 and 30 fps are
the most frequently encountered in NTSC regions, and 25 fps in PAL
regions.
Old Film: Various lower framerates.
Movies consisting of frames are referred to as progressive, while
those consisting of independent fields are called interlaced, or
sometimes video, although this latter term is ambiguous.
To further complicate matters, some movies will be a mix of several of
the above.
The most important distinction to make between all of these formats is
that some are frame-based, while others are field-based. WHENEVER a
movie is prepared for display on television (including DVD), it is
converted to a field-based format. The various methods by which this
can be done are collectively referred to as "pulldown", of which the
infamous NTSC "3:2 telecine" is one variety. Unless the original
material was also field-based (and the same fieldrate), you are
getting the movie in a format other than the original.
There are several common types of pulldown:
PAL 2:2 pulldown: The nicest of them all. Each frame is shown for two
fields duration, by extracting the even and odd lines and showing them
in alternation. If the original material is 24fps, this process speeds
up the movie by 4%.
PAL 2:2:2:2:2:2:2:2:2:2:2:3 pulldown: Every 12th frame is shown for
three fields duration, instead of just two. This avoids the 4% speedup
issue, but makes the process much more difficult to reverse. It is
usually seen in musical productions where adjusting the speed by 4%
would seriously damage the musical score.
NTSC 3:2 telecine: Frames are shown alternately for 3 fields or 2
fields duration. This gives a fieldrate 5/2 times the original
framerate. The result is also slowed down very slightly from 60 fields
per second to 59.94 fields per second to maintain NTSC fieldrate.
NTSC 2:2 pulldown: Used for showing 30fps material on NTSC. Nice, just
like 2:2 PAL pulldown.
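
The 3:2 pattern is easier to follow with a small model. The Python
sketch below is only an illustration: the function names are made up
for this guide, film frames are represented by letters, and top/bottom
field parity is ignored. It spreads four film frames over ten fields
and then groups consecutive fields in pairs, the way they end up stored
as frames on a DVD.

```python
from itertools import cycle

def telecine_32(frames):
    """Hold successive frames for 3, 2, 3, 2, ... fields (NTSC 3:2 telecine)."""
    fields = []
    for frame, hold in zip(frames, cycle((3, 2))):
        fields.extend([frame] * hold)
    return fields

def pair_fields(fields):
    """Group consecutive fields into stored frames."""
    return [(fields[i], fields[i + 1]) for i in range(0, len(fields) - 1, 2)]

fields = telecine_32("ABCD")   # four film frames become ten fields
stored = pair_fields(fields)   # ten fields become five stored frames
combed = [pair for pair in stored if pair[0] != pair[1]]

print("".join(fields))
print(stored)
print(len(combed), "of", len(stored), "stored frames mix two film frames")
```

The stored frames that mix two different film frames are the ones that
show combing whenever there is motion between those film frames.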
There are also methods for converting between NTSC and PAL video. Such
topics are beyond the scope of this guide. If you encounter such a
movie and want to encode it, your best bet is to find a copy in the
original format. NTSC/PAL conversion is highly destructive and cannot
be reversed cleanly, so your encode will greatly suffer if it is made
from a converted source.
When video is stored on DVD, consecutive pairs of fields are grouped
as a frame, even though they are not intended to be shown at the same
moment in time. The MPEG2 standard used on DVD and digital TV provides
a way to encode the original progressive frames, and store the number
of fields for which each should be shown in the frame headers. If this
method has been used, the term "soft telecine" will often be used to
describe the movie, since the process only directs the DVD player to
apply pulldown to the movie rather than altering the movie itself.
This case is highly preferable since it can easily be reversed
(actually ignored) by the encoder, and since it preserves maximal
quality. However, many DVD and broadcast production studios do not use
proper encoding techniques, and instead produce movies with "hard
telecine", where fields are actually duplicated in the encoded MPEG2.
The procedures for dealing with these cases will be covered later in
this guide. For now, we leave you with some guides to identifying
which type of material you're dealing with:
NTSC regions:
- If MPlayer prints that the framerate has changed to 23.976 when
watching your movie, and never changes back, it's almost certainly
24fps content that has been "soft telecined".
- If MPlayer shows the framerate switching back and forth between
23.976 and 29.97, and you see "combing" at times, then there are
several possibilities. The 23.976 fps segments are almost certainly
24fps progressive content, "soft telecined", but the 29.97 fps parts
could be either hard-telecined 24fps content or NTSC video content.
Use the same guidelines as the following two cases to determine
which.
- If MPlayer never shows the framerate change, and every single frame
with motion appears combed, your movie is NTSC video at 59.94 fields
per second.
- If MPlayer never shows the framerate change, and two frames out of
every five appear combed, your movie is "hard telecined" 24fps
content.
PAL regions:
- If you never see any combing, your movie is 2:2 pulldown.
- If you see combing alternating in and out every half second, then
your movie is 2:2:2:2:2:2:2:2:2:2:2:3 pulldown.
- If you always see combing during motion, then your movie is PAL
video at 50 fields per second.
Hint: MPlayer can slow down movie playback with the -speed option. Try
using -speed 0.2 to watch the movie very slowly and identify the
pattern, if you can't see it at full speed.
I.2. Selecting the quality you want
It's possible to encode your movie at a wide range of qualities. With
modern video encoders and a bit of pre-codec compression (downscaling
and denoising), it's possible to achieve very good quality at 700 MB,
for a 90-110 minute widescreen movie. And all but the longest movies
can be encoded with near-perfect quality at 1400 MB.
If you do not plan to store your movies on CD or other size-limited
media, and you want maximal quality at all costs, you can encode in
constant quantizer mode, which will not aim to meet a specific target
bitrate or filesize but instead use the maximal accuracy encoding for
all frames. This is not recommended in most cases, because you can
achieve significantly smaller file sizes without noticeable loss.
However, it may be desirable for the hardcore archivists out there.
I.4. Cropping and scaling
Recall from the previous section that the final picture size you
encode should be a multiple of 16 (in both width and height). This can
be achieved by cropping, scaling, or a combination of both.
When cropping, there are a few guidelines that must be followed to
avoid damaging your movie. The normal YUV format, 4:2:0, stores chroma
(color) information subsampled, i.e. chroma is only sampled half as
often in each direction as luma (intensity) information. Observe this
diagram, where L indicates luma sampling points and C chroma.
  L   L   L   L   L   L   L   L
    C       C       C       C
  L   L   L   L   L   L   L   L

  L   L   L   L   L   L   L   L
    C       C       C       C
  L   L   L   L   L   L   L   L
As you can see, rows and columns of the image naturally come in pairs.
Thus your crop offsets and dimensions MUST be even numbers. If they
are not, the chroma will no longer line up correctly with the luma. In
theory, it's possible to crop with odd offsets, but it requires
resampling the chroma which is potentially a lossy operation and not
supported by the crop filter.
Further, interlaced video is sampled as follows:
       TOP FIELD                          BOTTOM FIELD

  L   L   L   L   L   L   L   L
    C       C       C       C
                                      L   L   L   L   L   L   L   L

  L   L   L   L   L   L   L   L
                                        C       C       C       C
                                      L   L   L   L   L   L   L   L

  L   L   L   L   L   L   L   L
    C       C       C       C
                                      L   L   L   L   L   L   L   L

  L   L   L   L   L   L   L   L
                                        C       C       C       C
                                      L   L   L   L   L   L   L   L
As you can see, the pattern does not repeat until after 4 lines. So
for interlaced video, your y-offset and height for cropping must be
multiples of 4.
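
The divisibility rules above can be applied mechanically. The following
Python sketch is a hypothetical helper (snap_crop is not part of
MPlayer) that shrinks a candidate rectangle inward until the
x-offset and width are even and the y-offset and height are multiples
of 2, or of 4 for interlaced video. It returns the values in the same
w:h:x:y order the crop filter uses.

```python
def snap_crop(width, height, x, y, interlaced=False):
    """Shrink a crop rectangle until it meets the 4:2:0 alignment rules.

    Horizontal values are aligned to 2; vertical values to 2 for
    progressive video or 4 for interlaced video.
    """
    def shrink(offset, size, align):
        start = -(-offset // align) * align       # round the near edge up
        end = (offset + size) // align * align    # round the far edge down
        return start, end - start

    x, width = shrink(x, width, 2)
    y, height = shrink(y, height, 4 if interlaced else 2)
    return width, height, x, y

print(snap_crop(713, 431, 5, 3))
print(snap_crop(713, 431, 5, 3, interlaced=True))
```

Shrinking rather than growing the rectangle is a deliberate choice
here, so that no border pixels creep back into the picture.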
So how do you determine a crop rectangle to begin with? Sometimes you
can guess, but the cropdetect filter in MPlayer can make it easy. Run
MPlayer with -vf cropdetect and it will print out the crop settings to
remove the borders. You should let the movie run long enough that the
whole picture area is used, in order to get accurate crop values.
Then, test the values you get with MPlayer, using the command line
cropdetect printed, and adjust the rectangle as needed. The rectangle
filter can help by allowing you to interactively position the crop
rectangle over your movie. Remember to follow the above divisibility
guidelines so that you do not misalign the chroma planes.
If you will be scaling your movie, it's usually best to crop only the
black borders and noise, then scale so that the resulting dimensions
are multiples of 16. This can slightly distort the aspect ratio of
your movie, but in practice the error cannot be seen. It's certainly
much less visible than the MPEG artifacts you will see from failing to
crop & scale well.
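
To get a feeling for how small the aspect error is, here is a rough
Python sketch. It assumes square pixels in the output and a width that
is already a multiple of 16; scaled_size is a name invented for this
example. It picks the nearest multiple-of-16 height for a given display
aspect ratio and reports the resulting relative error.

```python
def scaled_size(display_aspect, width, multiple=16):
    """Pick a height that is a multiple of 16 and report the aspect error."""
    height = multiple * round(width / display_aspect / multiple)
    error = (width / height) / display_aspect - 1.0
    return width, height, error

w, h, err = scaled_size(2.35, 640)
print(f"{w}x{h}, aspect error {err:+.2%}")
```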
In certain cases, scaling may be undesirable. Scaling in the vertical
direction is difficult with interlaced video, and if you wish to
preserve the interlacing, you should usually refrain from scaling. If
you will not be scaling but you still want to use multiple-of-16
dimensions, you will have to overcrop. Do not undercrop, since black
borders are very bad for encoding!
I.5. Choosing resolution and bitrate
If you will not be encoding in constant quantizer mode, you need to
select a bitrate. The concept of bitrate is quite simple. It's the
(average) number of bits that will be consumed to store your movie,
per second. Normally bitrate is measured in kilobits (1000 bits) per
second. The size of your movie on disk is the bitrate times the length
of the movie in time, plus a small amount of "overhead" (see the
section on codecs and containers). Other parameters such as scaling,
cropping, etc. will NOT alter the file size unless you change the
bitrate as well!
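
Turning the relationship around gives the video bitrate for a target
file size. The Python sketch below is only an estimate under stated
assumptions: 1 MB is taken as 2**20 bytes, 1 kbit as 1000 bits, and the
container overhead is a fraction you supply yourself (zero by default).
The function name is made up for this example.

```python
def video_bitrate_kbps(target_mb, duration_s, audio_kbps, overhead_fraction=0.0):
    """Average video bitrate in kbit/s that fits the target size."""
    total_bits = target_mb * 1024 * 1024 * 8            # assumes 1 MB = 2**20 bytes
    usable_bits = total_bits * (1.0 - overhead_fraction)
    return usable_bits / duration_s / 1000.0 - audio_kbps

# A 100 minute movie with 128 kbit/sec audio on a 700 MB CD:
print(round(video_bitrate_kbps(700, 100 * 60, 128)))
```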
Bitrate does NOT scale proportional to resolution. That is to say, a
320x240 file at 200 kbit/sec will not be the same quality as the same
movie at 640x480 and 800 kbit/sec! There are two reasons for this:
1. Perceptual: You notice MPEG artifacts more if they're scaled up
bigger! Artifacts appear on the scale of blocks (8x8). Your eye
will not see errors in 4800 small blocks as easily as it sees
errors in 1200 large blocks (assuming you'll be scaling both to
fullscreen).
2. Theoretical: When you scale down an image but still use the same
size (8x8) blocks for the frequency space transform, you move more
data to the high frequency bands. Roughly speaking, each pixel
contains more of the detail than it did before. So even though your
scaled-down picture contains 1/4 the information in the spatial
directions, it could still contain a large portion of the
information in the frequency domain (assuming that the high
frequencies were underutilized in the original 640x480 image).
Past guides have recommended choosing a bitrate and resolution based
on a "bits per pixel" approach, but this is usually not valid due to
the above reasons. A better estimate seems to be that bitrates scale
proportional to the square root of resolution, so that 320x240 and 400
kbit/sec would be comparable to 640x480 at 800 kbit/sec. However this
has not been verified with theoretical or empirical rigor. Further,
given that movies vary greatly with regard to noise, detail, degree of
motion, etc., it's futile to make general recommendations for bits per
length-of-diagonal (the analogue of bits per pixel, using the square
root).
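
For what it is worth, the square-root estimate can be written down as a
one-line formula. The Python sketch below uses an invented function
name and merely restates the unverified rule of thumb from the
paragraph above; treat its output as a starting point for experiments,
not as a recommendation.

```python
import math

def comparable_bitrate(ref_kbps, ref_size, new_size):
    """Scale a bitrate with the square root of the pixel count (unverified heuristic)."""
    ref_pixels = ref_size[0] * ref_size[1]
    new_pixels = new_size[0] * new_size[1]
    return ref_kbps * math.sqrt(new_pixels / ref_pixels)

print(comparable_bitrate(400, (320, 240), (640, 480)))
```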
So far we have discussed the difficulty of choosing a bitrate and
resolution.
.................
II. Containers and codecs
II.1. Where the movie will be played
Perhaps the most important factor to choosing the format in which you
will encode your movie is where you want to be able to play it.
Usually this involves a tradeoff between quality and features, since
the formats supported by the widest variety of players are also the
worst with regard to compression.
If you want to be able to play your encode on standalone/set-top
players, your primary choices are DVD, VCD, and SVCD. There are also
extensions such as KVCD and XVCD which violate the standards but work
on many players and deliver higher quality. Modern players are
beginning to support MPEG-4 ("DivX") movies in AVI and perhaps other
containers as well, but these are often buggy and require you to
restrict your encodes to certain subsets of the full MPEG-4
functionality.
If you wish to be able to share your movies with Windows or Macintosh
users, without them having to install additional software, your
choices are very limited. The ancient MPEG-1 format with MP2 or PCM
audio is probably the only choice that is universally supported.
Interoperability with Windows/Mac also comes into play when deciding
how to encode and whether to scale to preserve aspect, since popular
media player applications for these systems do not honor the aspect
ratio encoding stored in MPEG-4 AVI files.
II.2. Constraints of DVD, SVCD, and VCD
Unfortunately, the DVD, SVCD, and VCD formats are subject to heavy
constraints. Only a small selection of encoded picture sizes & aspect
ratios are available. If your movie does not meet one of these, you
must scale and crop or add black borders (which are bad for quality!)
to make it compliant.
Format     Resolution  V.Codec  A.Codec      FPS    Aspect
NTSC DVD   720x480     MPEG-2   AC3,PCM      24,30  4:3,16:9
NTSC DVD   352x240 *   MPEG-1   AC3,PCM      24,30  4:3
NTSC SVCD  480x480     MPEG-2   MP2          30     4:3
NTSC VCD   352x240     MPEG-1   MP2          24,30  4:3
PAL DVD    720x576     MPEG-2   MP2,AC3,PCM  25     4:3,16:9
PAL DVD    352x288 *   MPEG-1   MP2,AC3,PCM  25     4:3
PAL SVCD   480x576     MPEG-2   MP2          25     4:3
PAL VCD    352x288     MPEG-1   MP2          25     4:3
* These resolutions are rarely used in DVD because they are fairly low
quality.
DVD, VCD, and SVCD also constrain you to relatively low GOP sizes. 18 is
supposed to be the largest allowed GOP size for 30 fps NTSC material;
for 25 or 24 fps, the GOP size should be 15.
VCD video is required to be CBR at 1152 kbps. This highly limiting
constraint also comes along with an extremely low VBV buffer size of
327 kilobits. SVCD allows varying video bitrates up to 2500 kbps, and
a somewhat less insane VBV buffer size of 917 kilobits is allowed. DVD
video bitrates may range anywhere up to 9800 kbps (though typical bitrates
are about half that), and the VBV buffer size is 1835 kilobits.
Here is a list of fields in lavcopts that you may be required to change
in order to make usable video for VCD, SVCD, or DVD:
acodec:       mp2 for VCD, SVCD, or PAL DVD; ac3 is most commonly used for
              DVD. PCM audio may also be used for DVD, but this is mostly a
              big waste of space. Note that mp3 audio isn't spec-compliant
              for any of these formats, but players often have no problem
              playing it anyway.
abitrate:     224 for VCD; user-selectable for DVD and SVCD, but commonly
              used values range from 192 to 384 kbps.
vcodec:       mpeg1video for VCD; mpeg2video for SVCD; mpeg2video is usually
              used for DVD but you may also use mpeg1video for CIF
              resolutions.
keyint:       18 for 30fps material, or 15 for 25/24 fps material.
              Commercial producers seem to prefer keyframe intervals of 12.
vrc_buf_size: 327 for VCD, 917 for SVCD, and 1835 for DVD.
vrc_minrate:  1152 for VCD. May be left alone for SVCD and DVD.
vrc_maxrate:  1152 for VCD; 2500 for SVCD; 9800 for DVD. For SVCD and DVD,
              you might wish to use lower values depending on your own
              personal preferences and requirements.
vbitrate:     1152 for VCD; up to 2500 for SVCD; up to 9800 for DVD. For
              the latter two formats, vbitrate should be set based on
              personal preference. For instance, if you insist on fitting
              20 or so hours on a DVD, you could use vbitrate=400. The
              resulting video quality would probably be quite bad. If you
              are trying to squeeze out the maximum possible quality on a
              DVD, use vbitrate=9800, but be warned that this could
              constrain you to less than an hour of video on a
              single-layer DVD.
Here is a typical minimum set of lavcopts for encoding video for a VCD:
-lavcopts vcodec=mpeg1video:vrc_buf_size=327:vrc_minrate=1152:\
vrc_maxrate=1152:vbitrate=1152:keyint=15:acodec=mp2
SVCD:
-lavcopts vcodec=mpeg2video:vrc_buf_size=917:vrc_maxrate=2500:vbitrate=1800:\
keyint=15:acodec=mp2
DVD:
-lavcopts vcodec=mpeg2video:vrc_buf_size=1835:vrc_maxrate=9800:\
vbitrate=5000:keyint=15:acodec=ac3
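
If you script your encodes, the three minimum sets above fit nicely in
a small table. The Python sketch below is just one way to organize
them: the PRESETS table and the lavcopts() helper are hypothetical
names, the values are copied from the examples above, and the only
check performed is that vbitrate does not exceed vrc_maxrate.

```python
PRESETS = {
    "vcd": [("vcodec", "mpeg1video"), ("vrc_buf_size", 327),
            ("vrc_minrate", 1152), ("vrc_maxrate", 1152),
            ("vbitrate", 1152), ("keyint", 15), ("acodec", "mp2")],
    "svcd": [("vcodec", "mpeg2video"), ("vrc_buf_size", 917),
             ("vrc_maxrate", 2500), ("vbitrate", 1800),
             ("keyint", 15), ("acodec", "mp2")],
    "dvd": [("vcodec", "mpeg2video"), ("vrc_buf_size", 1835),
            ("vrc_maxrate", 9800), ("vbitrate", 5000),
            ("keyint", 15), ("acodec", "ac3")],
}

def lavcopts(fmt, **overrides):
    """Build the argument for -lavcopts from a preset plus optional overrides."""
    opts = dict(PRESETS[fmt])          # insertion order is preserved
    opts.update(overrides)
    if opts["vbitrate"] > opts["vrc_maxrate"]:
        raise ValueError("vbitrate exceeds vrc_maxrate for " + fmt)
    return ":".join(f"{key}={value}" for key, value in opts.items())

print("-lavcopts", lavcopts("vcd"))
print("-lavcopts", lavcopts("dvd", vbitrate=8000))
```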
For higher quality encoding, you may also wish to add quality-enhancing
options to lavcopts, such as trell, mbd=2, and others. Note that qpel
and v4mv, while often useful with MPEG-4, are not usable in MPEG-1 or
MPEG-2. Also, if you are trying to make a very high quality DVD encode,
it may be useful to add dc=10 to lavcopts. Doing so may help reduce the
appearance of blocks in flat-colored areas. Putting it all together,
here is an example of a set of lavcopts for a higher quality DVD:
-lavcopts vcodec=mpeg2video:vrc_buf_size=1835:vrc_maxrate=9800:\
vbitrate=8000:keyint=15:trell:mbd=2:precmp=2:subcmp=2:cmp=2:dia=-10:\
predia=-10:cbp:mv0:vqmin=1:lmin=1:dc=10
If your movie has 2.35:1 aspect (most recent action movies), you will
have to add black borders or crop the movie down to 16:9 to make a DVD
or VCD. If you add black borders, try to align them at 16-pixel
boundaries in order to minimize the impact on encoding performance.
Thankfully DVD has sufficiently excessive bitrate that you do not have
to worry too much about encoding efficiency, but SVCD and VCD are
highly bitrate-starved and require effort to obtain acceptable
quality.
II.3. Limitations of the AVI container
Although it's the most widely-supported format after MPEG-1, AVI also
has some major drawbacks. Perhaps the most obvious is the overhead.
For each chunk of the AVI file, 24 bytes are wasted on headers and
index. This translates into a little over 5 MB per hour, or 1-2.5%
overhead for a 700 MB movie. This may not seem like much, but it could
mean the difference between being able to use 700 kbit/sec video or
714 kbit/sec, and every bit of quality counts.
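
The per-hour figure can be reproduced with simple arithmetic. The
Python sketch below assumes a stream layout that is not stated above:
25 fps video with one chunk per frame, plus 44.1 kHz MP3 audio stored
one 1152-sample frame per chunk. The function name is invented for this
example, and other layouts will give different numbers.

```python
def avi_overhead_bytes(duration_s, video_fps, audio_chunks_per_s, bytes_per_chunk=24):
    """Header and index overhead for a given number of chunks per second."""
    chunks = (video_fps + audio_chunks_per_s) * duration_s
    return chunks * bytes_per_chunk

# Assumed layout: 25 video chunks and 44100/1152 audio chunks per second.
per_hour = avi_overhead_bytes(3600, 25, 44100 / 1152)
print(f"{per_hour / 2**20:.1f} MB per hour")
```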
In addition to gross inefficiency, AVI also has the following major
limitations:
1. Only fixed-fps content can be stored. This is particularly limiting
if the original material you want to encode is mixed content, for
example a mix of NTSC video and film material. Actually there are
hacks that can be used to store mixed-framerate content in AVI, but
they increase the (already huge) overhead fivefold or more so they
are not practical.
2. Audio in AVI files must be either constant-bitrate (CBR) or
constant-framesize (i.e. all frames decode to the same number of
samples). Unfortunately, the most efficient codec, Vorbis, does not
meet either of these requirements. Therefore, if you plan to store
your movie in AVI, you'll have to use a less efficient codec such
as MP3 or AC3.
With all of that said, MEncoder does not support variable-fps output
or Vorbis encoding. Therefore, you may not see these as limitations if
MEncoder is the only tool you will be using to produce your encodes.
However, it is possible to use MEncoder only for the video encoding,
and then use external tools to encode the audio and mux it into
another container format.
III. Basic MEncoder usage
III.1. Selecting codecs & format
Audio and video codecs for encoding are selected with the -oac and
-ovc options, respectively. The following choices are available,
although some may not have been enabled at compile time:
Audio codecs
mp3lame   Encode VBR or CBR MP3 with LAME
lavc      Use one of libavcodec's audio encoders
pcm       Uncompressed PCM audio
copy      Do not reencode, just copy compressed frames
Video codecs
lavc      Use one of libavcodec's video encoders
xvid      XviD
raw       Uncompressed video frames
copy      Do not reencode, just copy compressed frames
frameno   Used for 3-pass encoding (not recommended)
Several other video codecs are available, but not recommended. The
lavc audio and video encoders have additional suboptions to select
which codec to use within lavc. The syntax is:
-lavcopts acodec=audio_codec_name
-lavcopts vcodec=video_codec_name
Your choices for lavc audio are mp2, ac3, and various adpcm formats
(low efficiency). For lavc video, you have many more choices:
mpeg1video   MPEG-1 video
mpeg2video   MPEG-2 video
mpeg4        MPEG-4 video, standards-compliant
msmpeg4      Pre-standard MPEG-4 used by MS (aka DivX3)
msmpeg4v2    Pre-standard MPEG-4 used by MS (low quality)
msmpeg4v1    Pre-standard MPEG-4 used by MS (low quality)
wmv1         Windows Media Video, V1 (aka WMV7)
wmv2         Windows Media Video, V2 (aka WMV8)
dvvideo      DV video (used by DV cameras)
mjpeg        Motion JPEG
ljpeg        Lossless JPEG
ffv1         Lossless FFmpeg video codec #1 (slow)
huffyuv      A standard lossless codec
...and lots more that aren't worth mentioning for most people.
III.2. Selecting input file or device
MEncoder can encode from files or directly from a DVD or VCD disc.
Simply include the filename on the command line to encode from a file,
or dvd://titlenumber or vcd://tracknumber to encode from a DVD title
or VCD track. If you have already copied a DVD to your hard drive and
wish to encode from the copy, you should still use the dvd:// syntax,
along with -dvd-device followed by the path to the copied DVD root.
The -dvd-device and -cdrom-device options can also be used to override
the paths to the device nodes for reading directly from disc, if the
defaults of /dev/dvd and /dev/cdrom do not work on your system.
When encoding from DVD, it is often desirable to select a chapter or
range of chapters to encode. You can use the -chapter option for this
purpose. For example, -chapter 1-4 will only encode chapters 1 through
4 from the DVD. This is especially useful if you will be making a 1400
MB encode targeted for two CDs, since you can ensure the split occurs
exactly at a chapter boundary rather than in the middle of a scene.
If you have a supported TV capture card, you can also encode from the
TV-in device. Use tv://channelnumber as the filename, and -tv to
configure various capture settings. DVB input works similarly.
III.3. Loading video filters
Learning how to use MEncoder's video filters is essential to producing
good encodes. All video processing is performed through the filters --
cropping, scaling, color adjustment, noise removal, sharpening,
deinterlacing, telecine, inverse telecine, and deblocking, just to
name a few. Along with the vast number of supported input formats, the
variety of filters available in MEncoder is one of its main advantages
over other similar programs.
Filters are loaded in a chain using the -vf option:
-vf filter1=options,filter2=options,...
Most filters take several numeric options separated by colons, but the
syntax for options varies from filter to filter, so read the man page
for details on the filters you wish to use.
Filters operate on the video in the order they are loaded. For
example, the following chain:
-vf crop=688:464:12:4,scale=640:464
will first crop the 688x464 region of the picture with upper-left
corner at (12,4), and then scale the result down to 640x464.
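
When a chain gets longer it helps to check the frame size after each
step. The Python sketch below is a toy model, not MEncoder's parser: it
understands only crop=w:h[:x:y] and scale=w:h, centers the crop when no
offsets are given (which is how the man page describes the default),
and assumes a 720x480 input frame for the example. trace_chain is a
made-up name.

```python
def trace_chain(chain, width, height):
    """Follow the frame size through a chain of crop and scale filters."""
    sizes = []
    for item in chain.split(","):
        name, _, args = item.partition("=")
        values = [int(v) for v in args.split(":")]
        if name == "crop":
            w, h = values[:2]
            if len(values) >= 4:
                x, y = values[2:4]
            else:                       # no offsets given: center the crop
                x, y = (width - w) // 2, (height - h) // 2
            if x < 0 or y < 0 or x + w > width or y + h > height:
                raise ValueError("crop rectangle does not fit: " + item)
            width, height = w, h
        elif name == "scale":
            width, height = values[:2]
        else:
            raise ValueError("unsupported filter in this sketch: " + name)
        sizes.append((name, width, height))
    return sizes

for step in trace_chain("crop=688:464:12:4,scale=640:464", 720, 480):
    print(step)
```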
Certain filters need to be loaded at or near the beginning of the
filter chain, in order to take advantage of information from the video
decoder that will be lost or invalidated by other filters. The
principal examples are pp (postprocessing, only when it is performing
deblock or dering operations), spp (another postprocessor to remove
MPEG artifacts), pullup (inverse telecine), and softpulldown (for
converting soft telecine to hard telecine).
Advanced topics in filter chains and usage information for specific
filters will follow in chapters IV and V, as they are needed for the
topics covered.
III.4. Notes on A/V sync
MEncoder's audio/video synchronization algorithms were designed with
the intention of recovering files with broken sync. However, they seem
to cause unnecessary skipping and duplication of frames, and possibly
slight A/V desync, when used with proper input. It is therefore
recommended that you switch to basic A/V sync with the -mc 0 option,
or put this in your ~/.mplayer/mencoder config file, as long as you
are only working with good sources (DVD, TV capture, high quality
MPEG-4 rips, etc) and not broken ASF/RM/MOV files.
If you want to further guard against strange frame skips and
duplication, you can use both -mc 0 and -noskip. This will prevent ALL
A/V sync, and copy frames one-to-one, so you cannot use it if you will
be using any filters that unpredictably add or drop frames, or if your
input file has variable framerate! Therefore, using -noskip is not in
general recommended.
The so-called "three-pass" encoding which MEncoder supports has been
reported to cause A/V desync. This will definitely happen if it is
used in conjunction with certain filters; therefore, it is now
recommended NOT to use three-pass mode. This feature is only left for
compatibility purposes and for expert users who understand when it is
safe to use and when it is not. If you have never heard of three-pass
mode before, forget that we even mentioned it!
There have also been reports of A/V desync when encoding from stdin
with MEncoder. Do not do this! Always use a file or CD/DVD/etc device
as input.
IV. Encoding procedures
IV.1. Encoding progressive video
As long as your input video is progressive (see section I.1),
Let's finally see a few examples:
Encoding from 2:2 pulldown PAL DVD, title 1
2.35:1 picture aspect
1200 kbit/sec MPEG-4 video
128 kbit/sec average-bitrate MP3 audio
mencoder dvd://1 -vf crop=712:432,scale=640:288 -mc 0 -oac mp3lame \
-lameopts abr:br=128 -ovc lavc -lavcopts vcodec=mpeg4:vbitrate=1200
The crop size was presumably obtained by using the cropdetect filter
in MPlayer, or experimenting first with crop rectangles in MPlayer.
The output framerate will be 25 fps, the same as the original DVD. It
would be preferable to adjust the playback speed to match the original
24 fps theatrical rate, but this is not yet possible with MEncoder.
The options we pass to libavcodec are the bare minimum, and will yield
relatively poor quality. We will refine them in subsequent sections.
Now, a second example:
Encoding from soft-telecined NTSC DVD, title 3
2.35:1 picture aspect
900 kbit/sec MPEG-4 video
Keeping the original AC3 audio
mencoder dvd://3 -vf crop=708:360,scale=640:288 -mc 0 -oac copy \
-ovc lavc -lavcopts vcodec=mpeg4:vbitrate=900 -ofps 23.976023976
This example is very similar to the first example, except for the
-ofps option to adjust the output framerate. Unless you tell it
otherwise, MEncoder takes its output framerate from the input
framerate. This is reported as 29.97 fps (actually 30000/1001), or
rather, 29.97 pairs of fields per second. But since the DVD is
soft-telecined, 1/5 of these fields are not actually present, but
intended to be added by the player when it telecines the movie in
realtime. There are actually only 23.976 (24000/1001) frames per
second. If you leave the framerate at the default, 29.97, it will
still work, but every 4th frame will get encoded in duplicate, making
the motion appear choppy.
Finally, a comment on the number 23.976023976. You'll often see
recommendations to use -ofps 23.976, but this is wrong. MEncoder will
reduce 23.976 to 2997/125, which is not the same as 24000/1001. So in
order to get the right framerate written in the output file's header,
always use plenty of precision.
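
You can check the arithmetic yourself with exact rationals. The Python
sketch below uses the standard fractions module as a stand-in; it is
not MEncoder's own conversion code, and the limit_denominator(1001)
call is simply a convenient way to show that the longer decimal is
close enough to recover 24000/1001.

```python
from fractions import Fraction

ntsc_video = Fraction(30000, 1001)     # 29.97 frames (pairs of fields) per second
ntsc_film = Fraction(24000, 1001)      # the rate actually stored on the disc

truncated = Fraction("23.976")
precise = Fraction("23.976023976").limit_denominator(1001)

print(ntsc_video * Fraction(4, 5) == ntsc_film)   # dropping 1/5 of the fields
print(truncated)                                  # what three decimals reduce to
print(truncated == ntsc_film)
print(precise == ntsc_film)
```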
IV.2. Two-pass encoding
The complexity (and thus the number of bits) required to compress the
frames of a movie can vary greatly from one scene to another. Modern
video encoders can adjust to these needs as they go and vary the
bitrate. However, they cannot exceed the requested average bitrate for
long stretches of time, because they do not know the bitrate needs of
future scenes.
Two-pass encoding solves this problem by encoding the movie twice.
During the first pass, statistics are generated regarding the number
of bits used by each frame and the quantization level (quality) at
which it was encoded. Then, when the second pass begins, the encoder
reads these statistics and redistributes the bits from frames where
they are in excess to frames that are suffering from low quality.
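
The idea can be illustrated with a deliberately simplified model. The
Python sketch below is NOT libavcodec's rate control: it assumes that
the product of bits and quantizer from the first pass is a usable
measure of how hard a frame is to compress, and it hands out the
second-pass budget in proportion to that measure. All names are
invented for this example.

```python
def redistribute(first_pass, target_total_bits):
    """Split a bit budget in proportion to a rough per-frame complexity.

    first_pass is a list of (bits, quantizer) pairs, one per frame.
    """
    complexity = [bits * quantizer for bits, quantizer in first_pass]
    total = sum(complexity)
    return [target_total_bits * c / total for c in complexity]

# Three frames that each got 1000 bits in the first pass, but needed
# very different quantizers to fit into them.
stats = [(1000, 2), (1000, 8), (1000, 4)]
budget = redistribute(stats, 3000)
print([round(bits) for bits in budget])
```

Frames that needed a coarse quantizer in the first pass receive more
bits in the second, at the expense of frames that were easy to encode.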
In order for the process to work properly, the encoder should be given
exactly the same sequence of frames during both passes. This means
that the same filters must be used, the same encoder parameters must
be used (with the possible exception of bitrate), and the same frame
drops and duplications (if any) must take place.
In theory it's possible to use -oac pcm or -oac copy during the first
pass to avoid spending time encoding the audio. However, this can
result in slight variations in which frames get dropped or duplicated,
so it may be preferable to encode the audio during the first pass as
well as the second. This also allows you to examine the final audio
bitrate and filesize, and to adjust the audio or video bitrate
slightly between passes if you don't meet your target size.
Here is an example:
Encoding from an existing AVI file
500 kbit/sec MPEG-4 video
96 kbit/sec average-bitrate MP3 audio
mencoder bar.avi -vf scale=448:336 -mc 0 -oac mp3lame -lameopts \
abr:br=96 -ovc lavc -lavcopts vcodec=mpeg4:vbitrate=500:vpass=1
mencoder bar.avi -vf scale=448:336 -mc 0 -oac mp3lame -lameopts \
abr:br=96 -ovc lavc -lavcopts vcodec=mpeg4:vbitrate=500:vpass=2
If you do not want to overwrite the output from the first pass when
you begin the second, you can use the -o option to choose a different
output filename. Note the addition of the vpass option in this
example. If vpass is not specified, single-pass encoding is performed.
If vpass=1, a log file is written with statistics from the first pass.
If vpass=2, the log file is read and the second pass is encoded based
on those statistics. If you are short on disk space or don't want the
extra disk wear from writing the file twice, you can use -o /dev/null
during the first pass. However, sometimes it is beneficial to watch
the first-pass file before beginning the second pass to make sure
nothing went wrong in the encoding.
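
Since both passes must see the same filters and encoder parameters, it
can be safer to generate the two command lines from one place. The
Python sketch below only builds and prints the commands (it does not
run MEncoder); the function name and the output filename bar-final.avi
are hypothetical, and the options are the ones used in the example
above, with the first pass written to /dev/null.

```python
import shlex

def two_pass_commands(source, output, vf, vbitrate, abr):
    """Return the argument lists for pass 1 and pass 2 of a lavc MPEG-4 encode."""
    def command(vpass, out):
        return ["mencoder", source, "-vf", vf, "-mc", "0",
                "-oac", "mp3lame", "-lameopts", f"abr:br={abr}",
                "-ovc", "lavc",
                "-lavcopts", f"vcodec=mpeg4:vbitrate={vbitrate}:vpass={vpass}",
                "-o", out]
    return command(1, "/dev/null"), command(2, output)

first, second = two_pass_commands("bar.avi", "bar-final.avi",
                                  "scale=448:336", 500, 96)
for cmd in (first, second):
    print(" ".join(shlex.quote(arg) for arg in cmd))
```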
Next, an example using XviD instead of libavcodec:
Encoding from an existing AVI file
400 kbit/sec MPEG-4 video
Copying the existing audio stream unmodified
mencoder foo.avi -vf scale=320:240 -mc 0 -oac copy -ovc xvid \
-xvidencopts bitrate=400:pass=1
mencoder foo.avi -vf scale=320:240 -mc 0 -oac copy -ovc xvid \
-xvidencopts bitrate=400:pass=2
The options used are slightly different, but the process is otherwise
the same.
IV.3. Encoding interlaced video
If the movie you want to encode is interlaced (NTSC video or PAL
video), you will need to choose whether you want to deinterlace or
not. While deinterlacing will make your movie usable on progressive
scan displays such as computer monitors and projectors, it comes at a
cost: the field rate of 50 or 59.94 fields per second is halved to 25
or 29.97 frames per second, and roughly half the information in your
movie will be lost during scenes with significant motion.
Therefore, if you are encoding for high quality archival purposes, it
is recommended not to deinterlace. You can always deinterlace the
movie at playback time when displaying it on progressive scan devices,
and future players will be able to deinterlace to full fieldrate,
interpolating 50 or 59.94 entire frames per second from the interlaced
video.
Special care must be taken when working with interlaced video:
1. Crop height and y-offset must be multiples of 4.
2. Any vertical scaling must be performed in interlaced mode.
3. Postprocessing and denoising filters may not work as expected
unless you take special care to operate them a field at a time, and
they may damage the video if used incorrectly.
With these things in mind, here is our first example:
mencoder capture.avi -mc 0 -oac lavc -ovc lavc -lavcopts \
vcodec=mpeg2video:vbitrate=6000:ilme:ildct:acodec=mp2:abitrate=224
Note the ilme and ildct options.