The new API works like the new vaapi API, using generic hwaccel
support. One minor detail is the error message that will be printed
when using non-4:2:0 surfaces (which, as far as I can tell, are
completely broken in the nVidia drivers and thus not supported by mpv).
The HEVC warning (HEVC decoding is completely broken in the nVidia
drivers, but should work with Mesa) had to be added to the generic
hwaccel code.
This also trashes display preemption recovery. Fuck that. It never
really worked. If someone complains, I might attempt to add it back
somehow.
This is the 4th iteration of the libavcodec vdpau API (after the
separate decoder API, the manual hwaccel API, and the automatic vdpau
hwaccel API). Fortunately, further iterations will be generic, and will
not require many vdpau-specific changes (if any at all).
For an unknown reason, '-Wl,-export-dynamic' doesn't work anymore
on the latest macOS build (10.12.3 with Apple LLVM 8.0.0), so forcing
cplugins is useless because the check fails. Replacing the linker
option with its equivalent '-rdynamic' does the trick.
The syms module from waf still works as expected and only the
symbols specified in mpv.def are exported.
Not needed under any circumstances. While the Windows ones export
functions to which we must link, these functions are always available,
even if libavcodec was compiled with D3D disabled.
Using these was a temporary solution while some compilers implemented
the underlying atomic mechanisms, but not the C11 language parts (or
that's what I guess). Not really useful for us anymore. Also, there is
the slight risk of having subtly incorrect semantics by using
potentially changing compiler internals and such.
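For illustration, the dropped support amounted to mappings of roughly
this shape (a sketch, not the exact macros that were removed):

    /* Emulate the C11 language-level API with the compiler's __atomic
     * builtins, which implement the same memory model. */
    #define atomic_load(p)         __atomic_load_n(p, __ATOMIC_SEQ_CST)
    #define atomic_store(p, v)     __atomic_store_n(p, v, __ATOMIC_SEQ_CST)
    #define atomic_fetch_add(p, v) __atomic_fetch_add(p, v, __ATOMIC_SEQ_CST)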
This replaces the old backend that exclusively used EGL windowing with
one that can also use ANGLE's ability to render directly to a texture.
The advantage is that mpv can create the swap chain itself, which
allows it to use a flip-mode swap chain on a HWND (which avoids
problems with DirectComposition) and to use a longer swap chain with
six backbuffers by default (which reportedly fixes problems with
rendering 24fps video on 24Hz monitors.)
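For reference, creating such a swap chain looks roughly like this (a
sketch with error handling omitted; the exact formats and counts mpv
uses may differ):

    #define COBJMACROS
    #include <windows.h>
    #include <dxgi1_2.h>

    /* Create a flip-mode swap chain with six backbuffers on a HWND. */
    static IDXGISwapChain1 *create_swapchain(IDXGIFactory2 *factory,
                                             IUnknown *device, HWND hwnd,
                                             UINT width, UINT height)
    {
        DXGI_SWAP_CHAIN_DESC1 desc = {
            .Width = width,
            .Height = height,
            .Format = DXGI_FORMAT_R8G8B8A8_UNORM,
            .SampleDesc = { .Count = 1 },
            .BufferUsage = DXGI_USAGE_RENDER_TARGET_OUTPUT,
            .BufferCount = 6,                               /* longer chain */
            .SwapEffect = DXGI_SWAP_EFFECT_FLIP_SEQUENTIAL, /* flip mode */
        };
        IDXGISwapChain1 *swapchain = NULL;
        IDXGIFactory2_CreateSwapChainForHwnd(factory, device, hwnd, &desc,
                                             NULL, NULL, &swapchain);
        return swapchain;
    }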
Also, "screenshot window" should now work on DXGI 1.2 and up (Windows 8
and up.)
video/out/opengl/hwdec_cuda.c is enabled with cuda-hwaccel. But it only
makes sense to build it if GL is enabled, and in fact it fails to link
without GL, as it calls a specific GL helper function.
In theory it would be perfectly possible to use cuda-copy with GL
disabled. But I'm not bothering with the complexity.
Upstream has provided pkg-config files for quite some time now [1,2].
Use them to determine the required flags instead of hardcoding them.
This makes cross-compilation easy, which I dare say is important for
many Raspberry Pi users. It also prevents picking up libEGL and
libGLESv2 from Mesa when they are present, which can happen with the
current code.
Good distros should put these pkg-config files into the default
pkg-config search path or populate PKG_CONFIG_PATH for users. However,
be nice to everybody and manually look into '/opt/vc/lib/pkgconfig'
just in case.
Hence the PKG_CONFIG_PATH mangling.
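Once the .pc files are found, the flags come out of a plain pkg-config
invocation, e.g. (the bcm_host module name is just one of the files
upstream ships):

    PKG_CONFIG_PATH=/opt/vc/lib/pkgconfig pkg-config --cflags --libs bcm_host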
[1]: https://github.com/raspberrypi/userland/issues/245
[2]: 05d60a01d5
In a first pass, we check whether libavcodec is present.
Then we try to compile a snippet and check for FFmpeg vs. Libav. (This
could probably also be done by somehow checking the pkgconfig version.
But pkg-config can't deal with that idiotic FFmpeg idea that a micro
version number >= 100 identifies FFmpeg vs. Libav.)
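The compile snippet boils down to something like this (a sketch of the
idea, not the literal fragment):

    #include <libavcodec/version.h>
    /* FFmpeg sets the micro version to >= 100 to distinguish itself
     * from Libav. */
    #if LIBAVCODEC_VERSION_MICRO < 100
    #error "This is Libav, not FFmpeg."
    #endif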
After that, we check the project-specific version numbers. This means
it can no longer happen that we accidentally allow older, unsupported
versions of FFmpeg just because the corresponding Libav version numbers
happen to be lower.
Also drop the resampler checks. We hardcode which resampler to use with
each project. A user can no longer force the use of libavresample with
FFmpeg.
The correctness of the stdatomic.h emulation via the __sync builtins is
questionable, and we've been relying on exact stdatomic semantics for a
while, so just get rid of it. Compilers which support __sync but not
stdatomic.h will use the slow mutex fallback.
Not sure about the __atomic builtins. It doesn't seem to harm either, so
leave it for now.
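For reference, the removed emulation mapped the C11 functions to the
__sync builtins in roughly this way (a sketch; note that the __sync
family has no plain atomic load or store, which is part of why the
semantics were questionable):

    #define atomic_fetch_add(p, i) __sync_fetch_and_add(&(p)->v, i)
    /* Abuse fetch_add with 0 to approximate an atomic load. */
    #define atomic_load(p)         __sync_fetch_and_add(&(p)->v, 0)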
Not even Libav does. Whoops. The developer who wrote the FFmpeg code
for this said he could not find any improvements when using the "GPU
memcpy"; instead, it actually made things slower on some hardware.
It's not clear to what extent the "GPU memcpy" was needed for vaapi, but
hopefully not very much (see #2317).
This commit enables use of the new vaapi API by default with FFmpeg.
The FFmpeg versions we support all have the APIs we were checking for.
Only Libav missed them. Simplify this by explicitly checking for FFmpeg
in the code, instead of trying to detect the presence of the API.
Since the only way to detect the API is by a version check, this had to
wait until the patches were actually pushed to FFmpeg git (which now
happened).
Since this does not include the new magic GPU memcpy libavutil function
yet, the new vaapi code would be slower if copy mode (like vaapi-copy)
is used. This would be quite bad to use by default, so check for the
function, and if not present, disable the new vaapi code. This
effectively disables it by default on FFmpeg.
(We assume that if the new GPU memcpy exists, vaapi's AVHWFramesContext
implementation will use it.)
libavutil does this for us. Although the new vaapi decode API does not
strictly introduce or even need av_image_copy_uc_from(), it's implied
that it will be present if the new decode API is present - even if it's
not, we can't use our own SSE code with it anyway.
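For reference, the function is declared in libavutil/imgutils.h roughly
as follows (check the actual header for the authoritative version):

    void av_image_copy_uc_from(uint8_t *dst_data[4],
                               const ptrdiff_t dst_linesizes[4],
                               const uint8_t *src_data[4],
                               const ptrdiff_t src_linesizes[4],
                               enum AVPixelFormat pix_fmt,
                               int width, int height);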
This basically reuses the scripting infrastructure.
Note that this needs to be explicitly enabled at compile time. For one,
enabling export for certain symbols from an executable seems to be quite
toolchain-specific. It might not work outside of Linux and cause random
problems within Linux.
If C plugins actually become commonly used and this approach turns out
to be a problem, we can build the mpv CLI as a wrapper for libmpv, which
would remove the requirement that plugins pick up host symbols.
I'm being lazy, so implementation/documentation are parked in existing
files, even if that stuff doesn't necessarily belong there. Sue me, or
better, send patches.
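A minimal C plugin along these lines (a sketch; mpv_open_cplugin is the
entry point the loader looks up in the plugin):

    #include <mpv/client.h>

    /* mpv loads the .so and calls this with a client handle, just like
     * a script would be given one. */
    int mpv_open_cplugin(mpv_handle *handle)
    {
        while (1) {
            mpv_event *event = mpv_wait_event(handle, -1);
            if (event->event_id == MPV_EVENT_SHUTDOWN)
                break;
        }
        return 0;
    }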
The old API is deprecated, and libavcodec prints a warning at runtime.
The new API is a bit nicer and does many things for you, such as
managing the underlying hwaccel decoder. libavutil also provides code
for managing surfaces (we use their surface pool).
The new code does not contain any code from the original MPlayer VAAPI
patch (that was used as the base for some of the vaapi code in mpv).
Thus the new code is LGPL.
The new API actually does not add any visible symbols, so the only way
to detect it is a version check. Of course, the versions overlap
between FFmpeg and Libav, which requires additional care. The new
API did not get merged into FFmpeg yet, so there's no check for
FFmpeg.
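So the detection has to look roughly like this (a sketch; the version
numbers here are made up):

    /* micro < 100 identifies Libav; there is no FFmpeg check yet. */
    #if LIBAVCODEC_VERSION_MICRO < 100 && \
        LIBAVCODEC_VERSION_INT >= AV_VERSION_INT(57, 26, 0)
    #define HAVE_VAAPI_HWACCEL_NEW 1
    #endif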
This reverts commit fae7307931.
Before the waf build system was used, we had a configure script written
in shell. To drop the build dependency on Python, someone rewrote the
Python scripts we had in Perl. Now the shell configure script is gone,
and it makes no sense to have a build dependency on both Perl and
Python.
This isn't just a straight revert. It adds the new Matroska EBML
elements to the old Python scripts, adjusts the waf build system, and of
course doesn't add anything back needed by the old build system.
It would be better if this used matroska.py/file2string.py directly by
importing them as modules, instead of calling them via "python". But for
now this is simpler.
Enca is dead. libguess is relatively useless due to not having a
universal detection mode. On the other hand, libuchardet is actively
developed.
Manpage changes are in the following commit.
The test ended up failing if cuda.h wasn't present, even if cuda.h
isn't used during the actual build.
This test is attempting to establish if the ffmpeg being built
against has dynlink_cuda support. While it might theoretically be
possible to build against the older normally-linked-cuda version
of ffmpeg, it seems more trouble than it's worth.
This fixes the build in mingw-w64/Clang on MSYS2. It also disables the
use of gnu_printf in Clang, which was what was causing most of the
warnings. The Clang-compiled mpv binary appears to work, but there are
no guarantees yet, since until now mpv has only been tested with
mingw-w64/GCC on Windows.
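The gnu_printf part amounts to picking the format archetype per
compiler, along these lines (a sketch; the macro name is made up):

    /* Clang doesn't understand the gnu_printf archetype, so fall back
     * to plain printf there. */
    #ifdef __clang__
    #define PRINTF_ATTRIBUTE(a, b) __attribute__((format(printf, a, b)))
    #else
    #define PRINTF_ATTRIBUTE(a, b) __attribute__((format(gnu_printf, a, b)))
    #endif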
Fixes #3800
This fixes waf setting the wrong LIBDIR for DEST_OS=win32 in
waflib/Tools/c_config.py:get_cc_version().
Any scripts assuming the implib and pkgconfig are in the wrong
place should be changed to move the .dll instead.
Thread-local storage in GCC is platform-specific, and some platforms that
are otherwise perfectly capable of running mpv may lack TLS support in GCC.
This change adds a test for the GCC variant of TLS and relies on its
result instead of an assumption.
Since LLVM's `__thread` support is similar to GCC's, the test is
called "GCC/LLVM TLS".
Signed-off-by: wm4 <wm4@nowhere>
We always want to use __declspec(selectany) to declare GUIDs, but
manually including <initguid.h> in every file that used GUIDs was
error-prone. Since all <initguid.h> does is define INITGUID and include
<guiddef.h>, we can remove all references to <initguid.h> and just
compile with -DINITGUID to get the same effect.
Also, this partially reverts 622bcb0 by re-adding libuuid.a to the
build, since apparently some GUIDs (such as GUID_NULL) are not declared
in the source file, even when INITGUID is set.
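To illustrate the intended pattern (the GUID value here is made up):

    /* With -DINITGUID, DEFINE_GUID expands to an actual
     * __declspec(selectany) definition instead of an extern
     * declaration, so the GUID can be defined at its point of use. */
    #include <guiddef.h>

    DEFINE_GUID(MY_FAKE_GUID,
                0x12345678, 0x1234, 0x5678,
                0x9a, 0xbc, 0xde, 0xf0, 0x12, 0x34, 0x56, 0x78);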
'cuda-gl' isn't right - you can turn this on without any GL and
get some non-zero benefit (with the cuda-copy hwaccel). So
'cuda-hwaccel' seems more consistent with everything else.
Minimal support just for testing.
Only the window surface creation (including size determination) is
really platform specific, so this could be some generic thing with
platform-specific support as some sort of sub-driver, but on the other
hand I don't see much of a need for such a thing.
While most of the fbdev usage is done by the EGL driver, using this
fbdev ioctl is apparently the only way to get the display resolution.
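The ioctl in question is FBIOGET_VSCREENINFO; getting the resolution
amounts to this (a minimal sketch assuming /dev/fb0):

    #include <fcntl.h>
    #include <linux/fb.h>
    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("/dev/fb0", O_RDONLY);
        if (fd < 0)
            return 1;
        struct fb_var_screeninfo vinfo;
        if (ioctl(fd, FBIOGET_VSCREENINFO, &vinfo) == 0)
            printf("%ux%u\n", vinfo.xres, vinfo.yres);
        close(fd);
        return 0;
    }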
Nvidia's "NvDecode" API (up until recently called "cuvid" is a cross
platform, but nvidia proprietary API that exposes their hardware
video decoding capabilities. It is analogous to their DXVA or VDPAU
support on Windows or Linux but without using platform specific API
calls.
As a rule, you'd rather use DXVA or VDPAU as these are more mature
and well supported APIs, but on Linux, VDPAU is falling behind the
hardware capabilities, and there's no sign that Nvidia is making the
investments to update it.
Most concretely, this means that there is no VP8/9 or HEVC Main10
support in VDPAU. On the other hand, NvDecode does export VP8/9 and
partial support for HEVC Main10 (more on that below).
ffmpeg already has support in the form of the "cuvid" family of
decoders. Due to the design of the API, it is best exposed as a full
decoder rather than an hwaccel. As such, there are decoders like
h264_cuvid, hevc_cuvid, etc.
These decoders support two output paths today - in both cases, NV12
frames are returned, either in CUDA device memory or regular system
memory.
In the case of the system memory path, the decoders can be used
as-is in mpv today with a command line like:
mpv --vd=lavc:h264_cuvid foobar.mp4
Doing this will take advantage of hardware decoding, but the cost
of the memcpy to system memory adds up, especially for high
resolution video (4K etc).
To avoid that, we need an hwdec that takes advantage of CUDA's
OpenGL interop to copy from device memory into OpenGL textures.
That is what this change implements.
The process is relatively simple, as only basic device context
acquisition needs to be done by us - the CUDA buffer pool is managed
by the decoder, thankfully.
The hwdec looks a bit like the vdpau interop one - the hwdec
maintains a single set of plane textures and each output frame
is repeatedly mapped into these textures to pass on.
The frames are always in NV12 format, at least until 10bit output
support emerges.
The only slightly interesting part of the copying process is that
CUDA works in terms of PBOs, so we need to define one for each of
the textures.
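In outline, the per-frame copy into a registered PBO looks like this (a
heavily trimmed sketch with error handling omitted; names are made up):

    #include <GL/gl.h>
    #include <cuda.h>
    #include <cudaGL.h>

    /* Done once per plane: associate the GL PBO with CUDA. */
    static CUgraphicsResource register_pbo(GLuint pbo)
    {
        CUgraphicsResource res;
        cuGraphicsGLRegisterBuffer(&res, pbo,
                                   CU_GRAPHICS_REGISTER_FLAGS_WRITE_DISCARD);
        return res;
    }

    /* Done per frame: copy one decoded plane from CUDA device memory
     * into the mapped PBO; afterwards the PBO contents are uploaded to
     * the plane texture with glTexSubImage2D as usual. */
    static void copy_plane(CUgraphicsResource res, CUdeviceptr src,
                           size_t src_pitch, size_t width, size_t height)
    {
        CUdeviceptr dst;
        size_t dst_size;
        cuGraphicsMapResources(1, &res, 0);
        cuGraphicsResourceGetMappedPointer(&dst, &dst_size, res);
        CUDA_MEMCPY2D cp = {
            .srcMemoryType = CU_MEMORYTYPE_DEVICE,
            .srcDevice     = src,
            .srcPitch      = src_pitch,
            .dstMemoryType = CU_MEMORYTYPE_DEVICE,
            .dstDevice     = dst,
            .dstPitch      = width,
            .WidthInBytes  = width,
            .Height        = height,
        };
        cuMemcpy2D(&cp);
        cuGraphicsUnmapResources(1, &res, 0);
    }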
TODO Items:
* I need to add a download_image function for screenshots. This
would do the same copy to system memory that the decoder's
system memory output does.
* There are items to investigate on the ffmpeg side. There appears
to be a problem with timestamps for some content.
Final note: I mentioned HEVC Main10. While there is no 10bit output
support, NvDecode can return dithered 8bit NV12, so you can still take
advantage of the hardware acceleration.
This particular mode requires compiling ffmpeg with a modified
header (or possibly the CUDA 8 RC) and is not upstream in ffmpeg
yet.
Usage:
You will need to specify vo=opengl and hwdec=cuda.
Note that hwdec=auto will probably not work as it will try to use
vdpau first.
mpv --hwdec=cuda --vo=opengl foobar.mp4
If you want to use filters that require frames in system memory,
just use the decoder directly without the hwdec, as documented
above.
This time it's emulation that's supposed to work (not just dummied out).
Unlike the previous emulation, no mpv code has to be disabled, and
everything should work (albeit possibly a bit slowly). On the other
hand, it's not possible to implement this kind of emulation without
compiler support. We use GNU statement expressions and __typeof__ in
this case.
This code is inactive if stdatomic.h is available.
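The idea looks roughly like this (a sketch, not the actual macros):

    /* Wrap the value in a struct so non-atomic access is awkward, and
     * use a statement expression plus __typeof__ so the macro can be
     * used as an expression of the right type, like a real C11
     * atomic_load. */
    typedef struct { long v; } emul_atomic_long;

    #define emul_atomic_load(p) ({      \
        __typeof__((p)->v) val_;        \
        __sync_synchronize();           \
        val_ = (p)->v;                  \
        __sync_synchronize();           \
        val_;                           \
    })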