The correctness of the stdatomic.h emulation via the __sync builtins is
questionable, and we've been relying on exact stdatomic semantics for a
while, so just get rid of it. Compilers which support __sync but not
stdatomic.h will use to the slow mutex fallback.
Not sure about the __atomic builtins. It doesn't seem to harm either, so
leave it for now.
Not even Libav does. Whoops. The developer who wrote the FFmpeg code for
this said he could not find any improvements when using the "GPU memcpy"
; instead, it made it actually slower on some hardware.
It's not clear to what extent the "GPU memcpy" was needed for vaapi, but
hopefully not very much (see #2317).
This commit enables use of the new vaapi API by default with FFmpeg.
The FFmpeg versions we support all have the APIs we were checking for.
Only Libav missed them. Simplify this by explicitly checking for FFmpeg
in the code, instead of trying to detect the presence of the API.
Since the only way to detect the API is by a version check, this had to
wait until the patches were actually pushed to FFmpeg git (which now
happened).
Since this does not include the new magic GPU memcpy libavutil function
yet, the new vaapi code would be slower if copy mode (like vaapi-copy)
is used. This would be quite bad to use by default, so check for the
function, and if not present, disable the new vaapi code. This
effectively disables it by default on FFmpeg.
(We assume that if the new GPU memcpy exists, vaapi's AVHWFramesContext
implementation will use it.)
libavutil does this for us. Although the new vaapi decode API does not
strictly introduce or even need av_image_copy_uc_from(), it's implied
that it will be present if the new decode API is present - even if it's
not, we can't use our own SSE code with it anyway.
This basically reuses the scripting infrastructure.
Note that this needs to be explicitly enabled at compilation. For one,
enabling export for certain symbols from an executable seems to be quite
toolchain-specific. It might not work outside of Linux and cause random
problems within Linux.
If C plugins actually become commonly used and this approach is starting
to turn out as a problem, we can build mpv CLI as a wrapper for libmpv,
which would remove the requirement that plugins pick up host symbols.
I'm being lazy, so implementation/documentation are parked in existing
files, even if that stuff doesn't necessarily belong there. Sue me, or
better send patches.
The old API is deprecated, and libavcodec prints a warning at runtime.
The new API is a bit nicer and does many things for you, such as
managing the underlying hwaccel decoder. libavutil also provides code
for managing surfaces (we use their surface pool).
The new code does not contain any code from the original MPlayer VAAPI
patch (that was used as base for some of the vaapi code in mpv). Thus
the new code is LGPL.
The new API actually does not add any visible symbols, so the only way
to detect it is a version check. Of course, the versions overlap
between FFmpeg and Libav, which requires additional care. The new
API did not get merged into FFmpeg yet, so there's no check for
FFmpeg.
This reverts commit fae7307931.
Before the waf build system was used, we had a configure script written
in shell. To drop the build dependency on Python, someone rewrote the
Python scripts we had to Perl. Now the shell configure script is gone,
and it makes no sense to have a build dependency on both Perl and
Python.
This isn't just a straight revert. It adds the new Matroska EBML
elements to the old Python scripts, adjusts the waf build system, and of
course doesn't add anything back needed by the old build system.
It would be better if this used matroska.py/file2string.py directly by
importing them as modules, instead of calling them via "python". But for
now this is simpler.
Enca is dead. libguess is relatively useless due to not having an
universal detection mode. On the other hand, libuchardet is actively
developed.
Manpages changes in the following commit.
The test ended up failing if cuda.h wasn't present, even if cuda.h
isn't used during the actual build.
This test is attempting to establish if the ffmpeg being built
against has dynlink_cuda support. While it might theoretically be
possible to build against the older normally-linked-cuda version
of ffmpeg, it seems more trouble than it's worth.
This fixes the build in mingw-w64/Clang on MSYS2. It also disables the
use of gnu_printf in Clang, which was what was causing most of the
warnings. The Clang-compiled mpv binary appears to work, but there are
no guarantees yet, since until now mpv has only been tested with
mingw-w64/GCC on Windows.
Fixes#3800
This fixes waf setting the wrong LIBDIR for DEST_OS=win32 in
waflib/Tools/c_config.py:get_cc_version()
Any scripts assuming the implib and pkgconfig are in the wrong
place should be changed to move the .dll instead.
Thread-local storage in GCC is platform-specific, and some platforms that
are otherwise perfectly capable of running mpv may lack TLS support in GCC.
This change adds a test for GCC variant of TLS and relies on its result
instead of assumption.
Provided that LLVM's `__thread` support is similar to GCC, the test is
called "GCC/LLVM TLS".
Signed-off-by: wm4 <wm4@nowhere>
We always want to use __declspec(selectany) to declare GUIDs, but
manually including <initguid.h> in every file that used GUIDs was
error-prone. Since all <initguid.h> does is define INITGUID and include
<guiddef.h>, we can remove all references to <initguid.h> and just
compile with -DINITGUID to get the same effect.
Also, this partially reverts 622bcb0 by re-adding libuuid.a to the
build, since apparently some GUIDs (such as GUID_NULL) are not declared
in the source file, even when INITGUID is set.
'cuda-gl' isn't right - you can turn this on without any GL and
get some non-zero benefit (with the cuda-copy hwaccel). So
'cuda-hwaccel' seems more consistent with everything else.
Minimal support just for testing.
Only the window surface creation (including size determination) is
really platform specific, so this could be some generic thing with
platform-specific support as some sort of sub-driver, but on the other
hand I don't see much of a need for such a thing.
While most of the fbdev usage is done by the EGL driver, using this
fbdev ioctl is apparently the only way to get the display resolution.
Nvidia's "NvDecode" API (up until recently called "cuvid" is a cross
platform, but nvidia proprietary API that exposes their hardware
video decoding capabilities. It is analogous to their DXVA or VDPAU
support on Windows or Linux but without using platform specific API
calls.
As a rule, you'd rather use DXVA or VDPAU as these are more mature
and well supported APIs, but on Linux, VDPAU is falling behind the
hardware capabilities, and there's no sign that nvidia are making
the investments to update it.
Most concretely, this means that there is no VP8/9 or HEVC Main10
support in VDPAU. On the other hand, NvDecode does export vp8/9 and
partial support for HEVC Main10 (more on that below).
ffmpeg already has support in the form of the "cuvid" family of
decoders. Due to the design of the API, it is best exposed as a full
decoder rather than an hwaccel. As such, there are decoders like
h264_cuvid, hevc_cuvid, etc.
These decoders support two output paths today - in both cases, NV12
frames are returned, either in CUDA device memory or regular system
memory.
In the case of the system memory path, the decoders can be used
as-is in mpv today with a command line like:
mpv --vd=lavc:h264_cuvid foobar.mp4
Doing this will take advantage of hardware decoding, but the cost
of the memcpy to system memory adds up, especially for high
resolution video (4K etc).
To avoid that, we need an hwdec that takes advantage of CUDA's
OpenGL interop to copy from device memory into OpenGL textures.
That is what this change implements.
The process is relatively simple as only basic device context
aquisition needs to be done by us - the CUDA buffer pool is managed
by the decoder - thankfully.
The hwdec looks a bit like the vdpau interop one - the hwdec
maintains a single set of plane textures and each output frame
is repeatedly mapped into these textures to pass on.
The frames are always in NV12 format, at least until 10bit output
supports emerges.
The only slightly interesting part of the copying process is that
CUDA works by associating PBOs, so we need to define these for
each of the textures.
TODO Items:
* I need to add a download_image function for screenshots. This
would do the same copy to system memory that the decoder's
system memory output does.
* There are items to investigate on the ffmpeg side. There appears
to be a problem with timestamps for some content.
Final note: I mentioned HEVC Main10. While there is no 10bit output
support, NvDecode can return dithered 8bit NV12 so you can take
advantage of the hardware acceleration.
This particular mode requires compiling ffmpeg with a modified
header (or possibly the CUDA 8 RC) and is not upstream in ffmpeg
yet.
Usage:
You will need to specify vo=opengl and hwdec=cuda.
Note that hwdec=auto will probably not work as it will try to use
vdpau first.
mpv --hwdec=cuda --vo=opengl foobar.mp4
If you want to use filters that require frames in system memory,
just use the decoder directly without the hwdec, as documented
above.
This time it's emulation that's supposed to work (not just dummied out).
Unlike the previous emulation, no mpv code has to be disabled, and
everything should work (albeit possibly a bit slowly). On the other
hand, it's not possible to implement this kind of emulation without
compiler support. We use GNU statement expressions and __typeof__ in
this case.
This code is inactive if stdatomic.h is available.
The current stdatomic check verifies the availability of the function by
calling atomic_load(). It also uses this test to check if linking
against libatomic is needed or not.
Unfortunately, on specific architectures (namely SPARC), using
atomic_load() does *not* require linking against libatomic, while other
atomic operations do. Due to this, mpv's wscript concludes that
stdatomic is available, and that linking against libatomic is not
needed, causing the following link failure:
[190/190] Linking build/mpv
audio/out/ao.c.13.o: In function `ao_query_and_reset_events':
/home/peko/autobuild/instance-0/output/build/mpv-0.18.1/build/../audio/out/ao.c:399: undefined reference to `__atomic_fetch_and_4'
In order to fix this, the stdatomic check is adjusted to call
atomic_fetch_add() instead, which does require libatomic. Thanks to
this, the wscript realizes that linking against libatomic is needed, and
the build works fine.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Always require them, instead of just for some components which have hard
requirements on correct atomic semantics. They should be widely
available, and are supported by all recent gcc and clang compiler
versions. We even have the fallbacks builtins, which should keep this
working on very old gcc releases.
In particular, w32_common.c recently added a hard requirement on
atomics, but checking this properly in the build system would have been
messy. This commit makes sure it always works.
The fallback where weak atomic semantics are always fine is in theory
rather questionable as well.
This greatly improves the result when decoding typical (ST.2084) HDR
content, since the job of tone mapping gets significantly easier when
you're only mapping from 1000 to 250, rather than 10000 to 250.
The difference is so drastic that we can now even reasonably use
`hdr-tone-mapping=linear` and get a very perceptually uniform result
that is only slightly darker than normal. (To compensate for the extra
dynamic range)
Due to weird implementation details, this only seems to be present on
keyframes (or something like that), so we have to cache the last seen
value for the frames in between.
Also, in some files the metadata is just completely broken /
nonsensical, so I decided to apply a simple heuristic to detect
completely broken metadata.
This HDR function is unique in that it's still display-referred, it just
allows for values above the reference peak (super-highlights). The
official standard doesn't actually document this very well, but the
nominal peak turns out to be exactly 12.0 - so we normalize to this
value internally in mpv. (This lets us preserve the property that the
textures are encoded in the range [0,1], preventing clipping and making
the best use of an integer texture's range)
This was grouped together with SMPTE ST2084 when checking libavutil
compatibility since they were added in the same release window, in a
similar timeframe.
This reverts commit b51957fab59fd5a2fde824e1946befd29ed172c1.
Breaks big time. It appears to ignore explicitly configured paths within
the libav* .pc files, which for example breaks mpv-build.
Distros and users alike should be made aware of the fact that old
FFmpeg versions are bad. When users come to us with FFmpeg-related
trouble, the answer is “update FFmpeg” more often than not
(and no further support will be provided until they have done so),
so instead we just nag them about it here.
This now lets us auto-detect appropriately tagged HDR content using
FFmpeg's new TRC entries (when available).
Hidden behind an #if because Libav stable doesn't have it yet.