'cuda-gl' isn't right - you can turn this on without any GL and
get some non-zero benefit (with the cuda-copy hwaccel). So
'cuda-hwaccel' seems more consistent with everything else.
Minimal support just for testing.
Only the window surface creation (including size determination) is
really platform specific, so this could be some generic thing with
platform-specific support as some sort of sub-driver, but on the other
hand I don't see much of a need for such a thing.
While most of the fbdev usage is done by the EGL driver, using this
fbdev ioctl is apparently the only way to get the display resolution.
Nvidia's "NvDecode" API (up until recently called "cuvid" is a cross
platform, but nvidia proprietary API that exposes their hardware
video decoding capabilities. It is analogous to their DXVA or VDPAU
support on Windows or Linux but without using platform specific API
calls.
As a rule, you'd rather use DXVA or VDPAU as these are more mature
and well supported APIs, but on Linux, VDPAU is falling behind the
hardware capabilities, and there's no sign that nvidia are making
the investments to update it.
Most concretely, this means that there is no VP8/9 or HEVC Main10
support in VDPAU. On the other hand, NvDecode does export vp8/9 and
partial support for HEVC Main10 (more on that below).
ffmpeg already has support in the form of the "cuvid" family of
decoders. Due to the design of the API, it is best exposed as a full
decoder rather than an hwaccel. As such, there are decoders like
h264_cuvid, hevc_cuvid, etc.
These decoders support two output paths today - in both cases, NV12
frames are returned, either in CUDA device memory or regular system
memory.
In the case of the system memory path, the decoders can be used
as-is in mpv today with a command line like:
mpv --vd=lavc:h264_cuvid foobar.mp4
Doing this will take advantage of hardware decoding, but the cost
of the memcpy to system memory adds up, especially for high
resolution video (4K etc).
To avoid that, we need an hwdec that takes advantage of CUDA's
OpenGL interop to copy from device memory into OpenGL textures.
That is what this change implements.
The process is relatively simple as only basic device context
aquisition needs to be done by us - the CUDA buffer pool is managed
by the decoder - thankfully.
The hwdec looks a bit like the vdpau interop one - the hwdec
maintains a single set of plane textures and each output frame
is repeatedly mapped into these textures to pass on.
The frames are always in NV12 format, at least until 10bit output
supports emerges.
The only slightly interesting part of the copying process is that
CUDA works by associating PBOs, so we need to define these for
each of the textures.
TODO Items:
* I need to add a download_image function for screenshots. This
would do the same copy to system memory that the decoder's
system memory output does.
* There are items to investigate on the ffmpeg side. There appears
to be a problem with timestamps for some content.
Final note: I mentioned HEVC Main10. While there is no 10bit output
support, NvDecode can return dithered 8bit NV12 so you can take
advantage of the hardware acceleration.
This particular mode requires compiling ffmpeg with a modified
header (or possibly the CUDA 8 RC) and is not upstream in ffmpeg
yet.
Usage:
You will need to specify vo=opengl and hwdec=cuda.
Note that hwdec=auto will probably not work as it will try to use
vdpau first.
mpv --hwdec=cuda --vo=opengl foobar.mp4
If you want to use filters that require frames in system memory,
just use the decoder directly without the hwdec, as documented
above.
This time it's emulation that's supposed to work (not just dummied out).
Unlike the previous emulation, no mpv code has to be disabled, and
everything should work (albeit possibly a bit slowly). On the other
hand, it's not possible to implement this kind of emulation without
compiler support. We use GNU statement expressions and __typeof__ in
this case.
This code is inactive if stdatomic.h is available.
The current stdatomic check verifies the availability of the function by
calling atomic_load(). It also uses this test to check if linking
against libatomic is needed or not.
Unfortunately, on specific architectures (namely SPARC), using
atomic_load() does *not* require linking against libatomic, while other
atomic operations do. Due to this, mpv's wscript concludes that
stdatomic is available, and that linking against libatomic is not
needed, causing the following link failure:
[190/190] Linking build/mpv
audio/out/ao.c.13.o: In function `ao_query_and_reset_events':
/home/peko/autobuild/instance-0/output/build/mpv-0.18.1/build/../audio/out/ao.c:399: undefined reference to `__atomic_fetch_and_4'
In order to fix this, the stdatomic check is adjusted to call
atomic_fetch_add() instead, which does require libatomic. Thanks to
this, the wscript realizes that linking against libatomic is needed, and
the build works fine.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Always require them, instead of just for some components which have hard
requirements on correct atomic semantics. They should be widely
available, and are supported by all recent gcc and clang compiler
versions. We even have the fallbacks builtins, which should keep this
working on very old gcc releases.
In particular, w32_common.c recently added a hard requirement on
atomics, but checking this properly in the build system would have been
messy. This commit makes sure it always works.
The fallback where weak atomic semantics are always fine is in theory
rather questionable as well.
This greatly improves the result when decoding typical (ST.2084) HDR
content, since the job of tone mapping gets significantly easier when
you're only mapping from 1000 to 250, rather than 10000 to 250.
The difference is so drastic that we can now even reasonably use
`hdr-tone-mapping=linear` and get a very perceptually uniform result
that is only slightly darker than normal. (To compensate for the extra
dynamic range)
Due to weird implementation details, this only seems to be present on
keyframes (or something like that), so we have to cache the last seen
value for the frames in between.
Also, in some files the metadata is just completely broken /
nonsensical, so I decided to apply a simple heuristic to detect
completely broken metadata.
This HDR function is unique in that it's still display-referred, it just
allows for values above the reference peak (super-highlights). The
official standard doesn't actually document this very well, but the
nominal peak turns out to be exactly 12.0 - so we normalize to this
value internally in mpv. (This lets us preserve the property that the
textures are encoded in the range [0,1], preventing clipping and making
the best use of an integer texture's range)
This was grouped together with SMPTE ST2084 when checking libavutil
compatibility since they were added in the same release window, in a
similar timeframe.
This reverts commit b51957fab5.
Breaks big time. It appears to ignore explicitly configured paths within
the libav* .pc files, which for example breaks mpv-build.
Distros and users alike should be made aware of the fact that old
FFmpeg versions are bad. When users come to us with FFmpeg-related
trouble, the answer is “update FFmpeg” more often than not
(and no further support will be provided until they have done so),
so instead we just nag them about it here.
This now lets us auto-detect appropriately tagged HDR content using
FFmpeg's new TRC entries (when available).
Hidden behind an #if because Libav stable doesn't have it yet.
We don't have any reason to disable either. Both are loaded dynamically
at runtime anyway. There is also no reason why dxva2 would disappear
from libavcodec any time soon.
ANGLE is _really_ annoying to build. (Requires special toolchain and a
recent MSVC version.) This results in various issues with people
having trouble to build mpv against ANGLE (apparently linking it
against a prebuilt binary doesn't count, or using binaries from
potentially untrusted sources is not wanted).
Dynamically loading ANGLE is going to be a huge convenience. This commit
implements this, with special focus on keeping it source compatible to
a normal build with ANGLE linked at build-time.
Including initguid.h at the top of a file that uses references to GUIDs
causes the GUIDs to be declared globally with __declspec(selectany). The
'selectany' attribute tells the linker to consolidate multiple
definitions of each GUID, which would be great except that, in Cygwin
and MinGW GCC 6.1, this method of linking makes the GUIDs conflict with
the ones declared in libuuid.a.
Since initguid.h obsoletes libuuid.a in modern compilers that support
__declspec(selectany), add initguid.h to all files that use GUIDs and
remove libuuid.a from the build.
Fixes#3097
FFmpeg partially merged the API change. It added the AVCodecParameters
definition, but not the AVCodecContext.codecpar field. The new code
compiles only with the API fully merged, so adjust the check.
Encoding mode uses deprecated API. See previous commit. Encoding mode
will stop working/compiling at some point in the future, so unless
someone fixes the encoding code, it will stay disabled by default.
(Note that the deprecations are not merged in FFmpeg yet, but they will
soon. They've been deprecated in Libav for a while now.)
AVFormatContext.codec is deprecated now, and you're supposed to use
AVFormatContext.codecpar instead.
Handle this for all of the normal playback code.
Encoding mode isn't touched.
This commit adds the d3d11va-copy hwdec mode using the ffmpeg d3d11va
api. Functions in common with dxva2 are handled in a separate decode/d3d.c
file. A future commit will rewrite decode/dxva2.c to share this code.
OpenSL ES is used on Android. At the moment only stereo output is
supported. Two options are supported: 'frames-per-buffer' and
'sample-rate'. To get better latency the user of libmpv should pass
values obtained from AudioManager.getProperty(PROPERTY_OUTPUT_FRAMES_PER_BUFFER)
and AudioManager.getProperty(PROPERTY_OUTPUT_SAMPLE_RATE).
To be more specific, enable vo_opengl and vo_opengl_cb, if libmpv is
compiled, and the GL headers happen to be in the default search paths.
Although platform specific code can be useful for libmpv (for window
embedding, and even with vo_opengl_cb for certain forms of hardware
decoding), it's not a requirement to use the opengl_cb API.
Enabling vo_opengl is not useful here, but the rest of the build system
doesn't distinguish vo_opengl and vo_opengl_cb, and I see no reason to.
We either need to be on Windows, or have posix_spawn available.
If someone can come up with a system that is POSIX, but does not provide
posix_spawn, we could make it optional.
The complex filter support that will be added makes much more complex
use of libavfilter, and I'm not going to bother with adding hacks to
keep libavfilter optional.
It existed for XP-compatibility only. There was also a time where
ao_wasapi caused issues, but we're relatively confident that ao_wasapi
works better or at least as good as ao_dsound on Windows Vista and
later.
Commit 1a6f3c56 added a fallback for the case when C11 TLS was not
available, but GCC TLS was. But it forget to enable VAAPI EGL interop in
the build system in this case.
Just remove the build system check. Should someone find a compiler that
works on Linux and does not support GCC extensions or C11, it will still
compile and just fail to init at runtime.
Actually fixes#2631 (hopefully).
libwaio was added due to the complete inability to cancel synchronous
I/O cleanly using the public Windows API in Windows XP. Even calling
TerminateThread on the thread performing I/O was a bad solution, because
the TerminateThread function in XP would leak the thread's stack.
In Vista and up, however, this is no longer a problem. CancelIoEx can
cancel synchronous I/O running on other threads, allowing the thread to
exit cleanly, so replace libwaio usage with native Vista API functions.
It should be noted that this change also removes the hack added in
8a27025 for preventing a deadlock that only seemed to happen in Windows
XP. KB2009703 says that Vista and up are not affected by this, due to a
change in the implementation of GetFileType, so the hack should not be
needed anymore.