Decoding H264 using Video Decode Acceleration used the custom 'vda_h264_dec'
decoder in FFmpeg.
The Good: This new implementation has some advantages over the previous one:
- It works with Libav: vda_h264_dec never got into Libav since they prefer
client applications to use the hwaccel API.
- It is way more efficient: in my tests this implementation yields a
reduction of CPU usage of roughly ~50% compared to using `vda_h264_dec` and
~65-75% compared to h264 software decoding. This is mainly because
`vo_corevideo` was adapted to perform direct rendering of the
`CVPixelBufferRefs` created by the Video Decode Acceleration API Framework.
The Bad:
- `vo_corevideo` is required to use VDA decoding acceleration.
- only works with versions of ffmpeg/libav new enough (needs reference
refcounting). That is FFmpeg 2.0+ and Libav's git master currently.
The Ugly: VDA was hardcoded to use UYVY (2vuy) for the uploaded video texture.
One one end this makes the code simple since Apple's OpenGL implementation
actually supports this out of the box. It would be nice to support other
output image formats and choose the best format depending on the input, or at
least making it configurable. My tests indicate that CPU usage actually
increases with a 420p IMGFMT output which is not what I would have expected.
NOTE: There is a small memory leak with old versions of FFmpeg and with Libav
since the CVPixelBufferRef is not automatically released when the AVFrame is
deallocated. This can cause leaks inside libavcodec for decoded frames that
are discarded before mpv wraps them inside a refcounted mp_image (this only
happens on seeks).
For frames that enter mpv's refcounting facilities, this is not a problem
since we rewrap the CVPixelBufferRef in our mp_image that properly forwards
CVPixelBufferRetain/CvPixelBufferRelease calls to the underying
CVPixelBufferRef.
So, for FFmpeg use something more recent than `b3d63995` for Libav the patch
was posted to the dev ML in July and in review since, apparently, the proposed
fix is rather hacky.
This is based on the MPlayer VA API patches. To be exact it's based on
a very stripped down version of commit f1ad459a263f8537f6c from
git://gitorious.org/vaapi/mplayer.git.
This doesn't contain useless things like benchmarking hacks and the
demo code for GLX interop. Also, unlike in the original patch, decoding
and video output are split into separate source files (the separation
between decoding and display also makes pixel format hacks unnecessary).
On the other hand, some features not present in the original patch were
added, like screenshot support.
VA API is rather bad for actual video output. Dealing with older libva
versions or the completely broken vdpau backend doesn't help. OSD is
low quality and should be rather slow. In some cases, only either OSD
or subtitles can be shown at the same time (because OSD is drawn first,
OSD is prefered).
Also, libva can't decide whether it accepts straight or premultiplied
alpha for OSD sub-pictures: the vdpau backend seems to assume
premultiplied, while a native vaapi driver uses straight. So I picked
straight alpha. It doesn't matter much, because the blending code for
straight alpha I added to img_convert.c is probably buggy, and ASS
subtitles might be blended incorrectly.
Really good video output with VA API would probably use OpenGL and the
GL interop features, but at this point you might just use vo_opengl.
(Patches for making HW decoding with vo_opengl have a chance of being
accepted.)
Despite these issues, decoding seems to work ok. I still got tearing
on the Intel system I tested (Intel(R) Core(TM) i3-2350M). It was also
tested with the vdpau vaapi wrapper on a nvidia system; however this
was rather broken. (Fortunately, there is no reason to use mpv's VAAPI
support over native VDPAU.)
Move the decoder parts from vo_vdpau.c to a new file vdpau_old.c. This
file is named so because because it's written against the "old"
libavcodec vdpau pseudo-decoder (e.g. "h264_vdpau").
Add support for the "new" libavcodec vdpau support. This was recently
added and replaces the "old" vdpau parts. (In fact, Libav is about to
deprecate and remove the "old" API without deprecation grace period,
so we have to support it now. Moreover, there will probably be no Libav
release which supports both, so the transition is even less smooth than
we could hope, and we have to support both the old and new API.)
Whether the old or new API is used is checked by a configure test: if
the new API is found, it is used, otherwise the old API is assumed.
Some details might be handled differently. Especially display preemption
is a bit problematic with the "new" libavcodec vdpau support: it wants
to keep a pointer to a specific vdpau API function (which can be driver
specific, because preemption might switch drivers). Also, surface IDs
are now directly stored in AVFrames (and mp_images), so they can't be
forced to VDP_INVALID_HANDLE on preemption. (This changes even with
older libavcodec versions, because mp_image always uses the newer
representation to make vo_vdpau.c simpler.)
Decoder initialization in the new code tries to deal with codec
profiles, while the old code always uses the highest profile per codec.
Surface allocation changes. Since the decoder won't call config() in
vo_vdpau.c on video size change anymore, we allow allocating surfaces
of arbitrary size instead of locking it to what the VO was configured.
The non-hwdec code also has slightly different allocation behavior now.
Enabling the old vdpau special decoders via e.g. --vd=lavc:h264_vdpau
doesn't work anymore (a warning suggesting the --hwdec option is
printed instead).
This caused all formats with fewer than 8 bits per component marked as
little endian. (Normally, only some messed up packed RGB formats are
endian-specific in this case.)
Options that take pixel format names now also accept ffmpeg names.
mpv internal names are preferred. We leave this undocumented
intentionally, and may be removed once libswscale stops printing
ffmpeg pixel format names to the terminal (or if we stop passing the
SWS_PRINT_INFO flag to it, which makes it print these).
(We insist on keeping the mpv specific names instead of dropping them
in favor of ffmpeg's name due to NIH, and also because ffmpeg always
appends the endian suffixes "le" and "be".)
This field contained the "average" bit depth per pixel. It serves no
purpose anymore. Remove it.
Only vo_opengl_old still uses this in order to allocate a buffer that is
shared between all planes.
This used to mean that there is more than one plane. This is not very
useful: IMGFMT_Y8 was not considered planar, but it's just a Y plane,
and could be treated the same as actual planar formats. On the other
hand, IMGFMT_NV12 is partially packed, and usually requires special
handling, but was considered planar.
Change its meaning. Now the flag is set if the format has a separate
plane for each component. IMGFMT_Y8 is now planar, IMGFMT_NV12 is not.
As an odd special case, IMGFMT_MONO (1 bit per pixel) is like a planar
RGB format with a single plane.
Remove the strange things the old mp_image_setfmt() code did to the
image format parameters. This includes setting chroma shift to 31 for
gray (Y8) formats and more.
Y8 + vo_opengl_old didn't actually work for unknown reasons (regression
in this branch). Fix this. The difference is that Y8 is now interpreted
as gray RGB (LUMINANCE texture) instead of involving YUV (and levels)
conversion.
Get rid of distinguishing RGB and BGR. There wasn't really any good
reason for this.
Remove mp_get_chroma_shift() and IMGFMT_IS_YUVP16*(). mp_imgfmt_desc
gives the same information and more.
Even though #ifdef ACCURATE is removed, the result should be about the
same. The fallback is only used by packed YUV formats (YUYV, NV12), and
doing 16 bit for them instead of 8 bit is not useful.
A side effect is that Y8 (gray) is not converted drawing subs, and for
alpha formats, the alpha plane is not removed. This means the number of
planes after upsampling can be 1-4 (1: gray, 2: gray+alpha, 3: planar,
4: planar+alpha). The code has to be adjusted accordingly to work on the
color planes only. Also remove the workaround for the chroma shift 31
hack.
mplayer's video chain traditionally used FourCCs for pixel formats. For
example, it used IMGFMT_YV12 for 4:2:0 YUV, which was defined to the
string 'YV12' interpreted as unsigned int. Additionally, it used to
encode information into the numeric values of some formats. The RGB
formats had their bit depth and endian encoded into the least
significant byte. Extended planar formats (420P10 etc.) had chroma
shift, endian, and component bit depth encoded. (This has been removed
in recent commits.)
Replace the FourCC mess with a simple enum. Remove all the redundant
formats like YV12/I420/IYUV. Replace some image format names by
something more intuitive, most importantly IMGFMT_YV12 -> IMGFMT_420P.
Add img_fourcc.h, which contains the old IDs for code that actually uses
FourCCs. Change the way demuxers, that output raw video, identify the
video format: they set either MP_FOURCC_RAWVIDEO or MP_FOURCC_IMGFMT to
request the rawvideo decoder, and sh_video->imgfmt specifies the pixel
format. Like the previous hack, this is supposed to avoid the need for
a complete codecs.cfg entry per format, or other lookup tables. (Note
that the RGB raw video FourCCs mostly rely on ffmpeg's mappings for NUT
raw video, but this is still considered better than adding a raw video
decoder - even if trivial, it would be full of annoying lookup tables.)
The TV code has not been tested.
Some corrective changes regarding endian and other image format flags
creep in.
According to DOCS/OUTDATED-tech/colorspaces.txt, the following formats
are supposed to be palettized:
IMGFMT_BGR8
IMGFMT_RGB8,
IMGFMT_BGR4_CHAR
IMGFMT_RGB4_CHAR
IMGFMT_BGR4
IMGFMT_RGB4
Of these, only BGR8 and RGB8 are actually treated as palettized in some
way. ffmpeg has only one palettized format (AV_PIX_FMT_PAL8), and
IMGFMT_BGR8 was inconsistently mapped to packed non-palettized RGB
formats too (AV_PIX_FMT_BGR8). Moreover, vf_scale.c contained messy
hacks to generate a palette when AV_PIX_FMT_BGR8 is output. (libswscale
does not support AV_PIX_FMT_PAL8 output in the first place.)
Get rid of all of this, and introduce IMGFMT_PAL8, which directly maps
to AV_PIX_FMT_PAL8. Remove the palette creation code from vf_scale.c.
IMGFMT_BGR8 maps to AV_PIX_FMT_RGB8 (don't ask me why it's swapped),
without any palette use. Enabling it in vo_x11 or using it as vf_scale
input seems to give correct results.
Replace the internal pixel format stuff with code that queries the
libavutil list of pixel format descriptors.
Trying to map IMGFMT_IS_RGB() etc. turned out extremely hacky.
Based on a patch by qyot27. Add the missing parts in mp_get_chroma_shift(),
which allow allocation of such images, and which make vo_opengl
automatically accept the new formats. Change the IMGFMT_IS_YUVP16_LE/BE
macros to properly report IMGFMT_444P14 as supported: this pixel format
has the highest numerical bit width identifier (0x55), which is not
covered by the mask ~0xfc. Remove 1 bit from the mask (makes it 0xf8) so
that IMGFMT_IS_YUVP16(IMGFMT_444P14) is 1. This is slightly risky, as
the organization of the image format IDs (actually FourCCs + mplayer
internal IDs) is messy at best, but it should be ok.
This pixel format is sometimes used with yuv4mpeg.
vo_direct3d used its own IMGFMT_Y16 internally for some reason.
vo_opengl, vo_opengl_old, and vo_direct3d should be able to display
this pixel format natively.
Finish renaming directories and moving files. Adjust all include
statements to make the previous commit compile.
The two commits are separate, because git is bad at tracking renames
and content changes at the same time.
Also take this as an opportunity to remove the separation between
"common" and "mplayer" sources in the Makefile. ("common" used to be
shared between mplayer and mencoder.)
Tis drops the silly lib prefixes, and attempts to organize the tree in
a more logical way. Make the top-level directory less cluttered as
well.
Renames the following directories:
libaf -> audio/filter
libao2 -> audio/out
libvo -> video/out
libmpdemux -> demux
Split libmpcodecs:
vf* -> video/filter
vd*, dec_video.* -> video/decode
mp_image*, img_format*, ... -> video/
ad*, dec_audio.* -> audio/decode
libaf/format.* is moved to audio/ - this is similar to how mp_image.*
is located in video/.
Move most top-level .c/.h files to core. (talloc.c/.h is left on top-
level, because it's external.) Park some of the more annoying files
in compat/. Some of these are relicts from the time mplayer used
ffmpeg internals.
sub/ is not split, because it's too much of a mess (subtitle code is
mixed with OSD display and rendering).
Maybe the organization of core is not ideal: it mixes playback core
(like mplayer.c) and utility helpers (like bstr.c/h). Should the need
arise, the playback core will be moved somewhere else, while core
contains all helper and common code.