1
0
mirror of https://github.com/mpv-player/mpv synced 2025-01-09 16:39:49 +00:00
Commit Graph

4429 Commits

Author SHA1 Message Date
Jan Ekström
e6447e2e89 vo_gpu/d3d11: add an option for the adapter name
Set it from the adapter name in the d3d11 options.
2019-09-29 19:39:26 +03:00
Jan Ekström
8163906299 video/d3d11: add adapter selection by name into d3d11 options
This lets the user define an adapter name that can then be passed
further into the internals.
2019-09-29 19:39:26 +03:00
Jan Ekström
e205e179e0 vo_gpu/d3d11_helpers: also load up CreateDXGIFactory1
Just a factory, without a device, is required for listing of devices.
2019-09-29 19:39:26 +03:00
Anton Kindestam
0d4f165d81 vo_drm: fix flickering when setting pan/scan
Turns out clearing all frambuffers in reconfig isn't such a great idea
when you also end up here when setting pan/scan.

I guess this is just a leftover from a previous iteration of vo_drm
where doing this made sense.
2019-09-29 12:16:26 +02:00
Philip Langdale
8c1f94f0e7 vo_gpu: hwdec_cuda: Synchronise OpenGL Interop
Previously, there appeared to be implicit synchronisation in the
GL interop path, and we never observed any visual glitches. However,
recently, I started seeing stuttering in the GL path and on closer
examination it looked like read-before-write behaviour where GL
would display an old frame again rather than the current one.

After verifying that disabling hwdec made the problem go away,
I tried adding a cuStreamSynchronize() after the memcpys and that
also resolved the problem, so it's clearly sync related.

cuStreamSynchronize() is a CPU sync and so more heavy-weight than
you want, but it's the only tool we have. There is no mechanism
defined for synchronising GL to CUDA (It looks like there is a way
to synchronise CUDA to EGL but it appears one way and so wouldn't
directly address this problem).

Anyway, empirically, the output now looks the same as with hwdec
off.
2019-09-28 19:24:24 +03:00
Anton Kindestam
0c8eb80e98 vo_drm: support controlling swapchain depth using swapchain-depth option 2019-09-28 14:10:01 +03:00
Anton Kindestam
6290420380 vo: make swapchain-depth option generic for all VOs
In preparation for making vo_drm able to use swapchain-depth
2019-09-28 14:10:01 +03:00
Anton Kindestam
9538fb5a7a drm: refactor page_flipped callback
Avoid duplicating the same callback function in both context_drm_egl
and vo_drm.
2019-09-28 14:10:01 +03:00
Anton Kindestam
77980c8184 vo_drm: Implement N-buffering and presentation feedback
Swapchain depth currently hard-coded to 3 (4 buffers).

As we now avoid redrawing on repeat frames (we simply requeue the same fb
again), this should give a nice performance boost when playing videos with a
lower FPS than the display FPS in video-sync=display-resample mode.

Presentation feedback has also been implemented to help counter the
significant amounts of jitter we would otherwise be seeing.
2019-09-28 14:10:01 +03:00
Anton Kindestam
dfe45f018e vo_drm: fix more than 2 buffers
Now we can increase BUF_COUNT. Yay.
2019-09-28 14:10:01 +03:00
Anton Kindestam
2cf8dd6451 drm: move struct vsync_tuple into drm_common as drm_vsync_tuple
This struct will be useful in vo_drm as well.
2019-09-28 14:10:01 +03:00
Anton Kindestam
22252432e2 context_drm_egl: define EGL_PLATFORM_GBM_MESA, EGL_PLATFORM_GBM_KHR if not in system headers
To account for oddball setups where EGL_PLATFORM_GBM_MESA or
EGL_PLATFORM_GBM_KHR might not be defined for whatever reason.
2019-09-27 20:01:15 +02:00
Wessel Dankers
643417dd17 video: add pure gamma TRC curves for 2.0, 2.4 and 2.6. 2019-09-27 13:21:41 +02:00
Philip Sequeira
21a5c416d5 options: add M_OPT_FILE to some more options that take files 2019-09-27 13:19:29 +02:00
Jonas Karlman
16d2ddb505 vo_gpu: hwdec_drmprime_drm: add hwdec ctx
This allows to use drm hwaccels that require a hwdevice.

Tested with v4l2request hwaccel and cedrus driver on an allwinner device
running mpv with --vo=gpu --gpu-context=drm --hwdec=drm.
2019-09-27 13:08:27 +02:00
wm4
c3687b9eaa hwdec_vaapi_gl: add missing compatibility defines
At first, this code used only 1 plane, so the compatibility stuff was
sufficient. But then use of planes 1 and 2 was added, without extending
the compatibility stuff.

I think I've seen a case recently where this broke the build and caused
users to apply invalid fixes, but I don't remember where.

It's possible that I didn't get all defines that are needed.
2019-09-27 13:05:21 +02:00
sfan5
e350ceef4c vo_gpu: vulkan: add Android context 2019-09-27 00:05:06 +03:00
sfan5
508e35881e context_android: move common code to a separate file
In preparation for a Vulkan Android context.
This also replaces querying for EGL_WIDTH and EGL_HEIGHT
with equivalent ANativeWindow calls.
2019-09-27 00:05:06 +03:00
James Ross-Gowan
03cb8755e1 vo_gpu: d3d11: don't reset frame stats after pause
I think I was wrong about having to reset the stats when mpv stops
producing frames, eg. when it's paused. As long as the swapchain doesn't
underflow, last_queue_display_time will still be accurate, because the
next frame produced should still be presented one vsync after the
last one in the swapchain.

If the swapchain underflows (which is the common case for when mpv is
paused for more than 150ms,) the next predicted frame time should be in
the past. It should be fine to leave last_queue_display_time unset in
this case, since vo.c will use the current time instead, which is a
decent guess (though it doesn't take vsync phase into account.)

last_sync_refresh_count and last_sync_qpc_time should be kept on
swapchain underflow as well. Assuming the display refresh rate doesn't
change while mpv is paused, they'll only provide a more accurate guess
of the vsync duration when mpv starts playing again. Also,
vsync_duration_qpc never needs to get reset. It will get overwritten
immediately in most cases, and when it doesn't, it's still a better
guess of the vsync duration than nothing.
2019-09-26 23:41:38 +03:00
wm4
4d43c79e4c client API: fix potential deadlock problems by throwing more shit at it
The render API (vo_libmpv) had potential deadlock problems with
MPV_RENDER_PARAM_ADVANCED_CONTROL. This required vd-lavc-dr to be
enabled (the default). I never observed these deadlocks in the wild
(doesn't mean they didn't happen), although I could specifically provoke
them with some code changes.

The problem was mostly about DR (direct rendering, letting the video
decoder write to OpenGL buffer memory). Allocating/freeing a DR image
needs to be done on the OpenGL thread, even though _lots_ of threads are
involved with handling images. Freeing a DR image is a special case that
can happen any time. dr_helper.c does most of the evil magic of
achieving this. Unfortunately, there was a (sort of) circular lock
dependency: freeing an image while certain internal locks are held would
trigger the user's context update callback, which in turn would call
mpv_render_context_update(), which processed all pending free requests,
and then acquire an internal lock - which the caller might not release
until a further DR image could be freed.

"Solve" this by making freeing DR images asynchronous. This is slightly
risky, but actually not much. The DR images will be free'd eventually.
The biggest disadvantage is probably that debugging might get trickier.

Any solution to this problem will probably add images to free to some
sort of queue, and then process it later. I considered making this more
explicit (so there'd be a point where the caller forcibly waits for all
queued items to be free'd), but discarded these ideas as this probably
would only increase complexity.

Another consequence is that freeing DR images on the GL thread is not
synchronous anymore. Instead, it mpv_render_context_update() will do it
with a delay. This seems roundabout, but doesn't actually change
anything, and avoids additional code.

This also fixes that the render API required the render API user to
remain on the same thread, even though this wasn't documented. As such,
it was a bug. OpenGL essentially forces you to do all GL usage on a
single thread, but in theory the API user could for example move the GL
context to another thread.

The API bump is because I think you can't make enough noise about this.
Since we don't backport fixes to old versions, I'm specifically stating
that old versions are broken, and I'm supplying workarounds.

Internally, dr_helper_create() does not use pthread_self() anymore, thus
the vo.c change. I think it's better to make binding to the current
thread as explicit as possible.

Of course it's not sure that this fixes all deadlocks (probably not).
2019-09-26 14:14:49 +02:00
der richter
41f290f54e cocoa-cb: add support for 10bit opengl rendering
this will request a 16bit half-float framebuffer instead if a 8bit
integer framebuffer.

Fixes #3613
2019-09-26 00:02:02 +02:00
Anton Kindestam
bbf6e103b4 drm_common: add missing zero-initialization of struct vt_mode variable
Some fields were being left uninitialized. This could be a problem
particularly on non-Linux OS:s with vt_mode (see PR #6976).
2019-09-24 21:46:52 +02:00
der richter
422b486200 cocoa-cb: fix title bar button state on start up
on start up it was possible to click the hidden buttons. hide the
buttons ons tart up to make the state consistent with the visible state.
2019-09-23 21:10:38 +02:00
Anton Kindestam
b6def652a4 context_drm_egl: Don't get stuck forever if drmHandleEvent fails 2019-09-22 22:39:10 +02:00
Anton Kindestam
e2f96535f5 vo_drm: 30bpp support 2019-09-22 15:59:24 +02:00
James Ross-Gowan
a9f9601ca8 vo_gpu: d3d11: add support for presentation feedback
This adds vsync reporting to the D3D11 backend using the presentation
feedback provided by DXGI, which is pretty similar to what's provided by
GLX_OML_sync_control in the GLX backend. In DirectX, PresentCount is the
SBC, PresentRefreshCount and SyncRefreshCount are kind of like the MSC
and SyncQPCTime is the UST.

Unlike GLX, the DXGI API makes it possible for PresentCount and
SyncQPCTime to refer to different physical vsyncs, in which case
PresentRefreshCount and SyncRefreshCount will be different. The code
supports this possibility, even though it's not clear whether it can
happen when using flip-model presentation. The docs say for flip-model
apps, PresentRefreshCount is equal to SyncRefreshCount "when the app
presents on every vsync," but on my hardware, they're always equal, even
when mpv misses a vsync. They can definitely be different in exclusive
fullscreen bitblt mode, though, which mpv doesn't support now, but might
support in future.

Another difference to GLX is that, at least on my hardware,
PresentRefreshCount and SyncRefreshCount always refer to the last
physical vsync on which mpv presented a frame, but glxGetSyncValues can
apparently return a MSC and UST from the most recent physical vsync,
even if mpv didn't present a new frame on it. This might result in
different behaviour between the two backends after dropped frames or
brief pauses.

Also note, the docs for the DXGI presentation feedback APIs are pretty
awful, even by Microsoft standards. In particular the docs for
DXGI_FRAME_STATISTICS are misleading (PresentCount really is the number
of times Present() has been called for that frame, not "the running
total count of times that an image was presented to the monitor since
the computer booted.")

For good documentation, try these:
https://docs.microsoft.com/en-us/windows/win32/direct3ddxgi/dxgi-flip-model
https://docs.microsoft.com/en-us/windows/win32/direct3d9/d3dpresentstats
https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/content/d3dkmthk/ns-d3dkmthk-_d3dkmt_present_stats

(Yeah, the docs for the D3D9Ex and even the kernel-mode version of this
structure are better than the DXGI ones. It seems possible that they're
all rewordings of the same internal Microsoft docs, but whoever wrote
the DXGI one didn't really understand it.)
2019-09-22 23:18:40 +10:00
dudemanguy
0f938b197a wayland: create current_output in wayland_reconfig
Certain mpv config options require wl->current_output to be created
before the video can actually start rendering. Just always create it
here if the current_output doesn't exist (the one exception being the
--fs option with no --fs-screen flag). Incidentally, this also fixes
--fs-screen not working on wayland.
2019-09-22 03:33:21 +00:00
Jan Ekström
6c6aba4359 vf_fingerprint: remove extraneous single quote from description
This happened to break ZSH completion and seemed to be extraneous.

Reported by LaserEyess on IRC.
2019-09-21 23:46:41 +03:00
Dudemanguy911
685e927fbe wayland: avoid handling a 0-value axis event
This shouldn't be possible, but an extra check never hurts.
2019-09-21 10:38:43 -05:00
emersion
600824494d wayland: read xcursor size from XCURSOR_SIZE env
This allows compositors to set the cursor size from user
configuration.
2019-09-21 15:43:54 +02:00
slatchurie
1591ccfff5 x11: fix ICC profiling for multiple monitors
To find the correct ICC profile X atom, the screen number was calculated
directly from the xrandr order of the screens.
But if a primary screen is set, it should be the first Xinerama screen,
even if it is not the first xrandr screen.

Calculate the the proper atom id for each screen.
2019-09-21 15:36:16 +02:00
dudemanguy
cdad5cc65f wayland: don't show cursor when fullscreening 2019-09-21 15:24:06 +02:00
Thomas Weißschuh
9e304ab974 wayland: reconfigure cursor on pointer enter event
On wayland the cursor has to be configured each time the pointer enters.
Currently if the window (re)gains the focus, the pointer is not hidden,
even when configured. After the mouse has been moved the pointer hides
correctly.

https://wayland.freedesktop.org/docs/html/apa.html#protocol-spec-wl_pointer:

    wl_pointer::enter - enter event

    ...

    When a seat's focus enters a surface, the pointer image is undefined
    and a client should respond to this event by setting an appropriate
    pointer image with the set_cursor request.

Fixes #6185.

Signed-off-by: Thomas Weißschuh <thomas@t-8ch.de>
2019-09-21 15:24:06 +02:00
dudemanguy
f54ad8eb05 wayland: add mouse buttons and fix axis scaling
Previously, the only mouse buttons supported in wayland were left,
right, and middle click. This adds the thumb back/forward buttons as
valid bindings. Also it removes the old, default behavior of always
sending a right click if an unrecognized mouse button is clicked.
In a related but different fix, the magnitude of an axis event in
wayland is not important to mpv since it internally handles all scaling.
The only thing we care about is getting the sign when the event occurs.
2019-09-21 14:35:03 +02:00
Cameron Cawley
70f8154b3d vo_sdl: Only create the SDL window once 2019-09-21 12:58:01 +02:00
memeka
0bdcbd75e0 context_drm_egl: Use eglGetPlatformDisplayEXT if available
Check if eglGetPlatformDisplayEXT is available and try to
use it to obtain the display connection. Fall back to eglGetDisplay
if eglGetPlatformDisplayEXT is not available or failing.

From PR #5992
2019-09-20 19:09:36 +02:00
wm4
4fa8f33b92 client API, vo_libmpv: document random deadlock problems
I guess trying to make DR work on libmpv was a mistake.

I never observed such a deadlock, but it's looks like it's theoretically
possible.
2019-09-20 16:47:16 +02:00
wm4
f00af71d12 vo_libmpv: fix some more uninit issues
This is mostly for the case when mpv_render_context_free() is called
while video is going on. This is supposed to gracefully stop video and
deinitialize everything properly. (I feel like it would put too much on
the API user to require that video is stopped before calling this
function. Whether video is running or not is a fairly highlevel thing,
and the API user could not do it in a race-free way.)

One problem was that unit() accessed ctx after ctx->in_use was set to
false. The update(ctx) call was basically a racy use-after-free. It
needed that call to wake up the mpv_render_context_free() loop that
waited for VO uninit. Fix this by triggering the wakeup inside the lock,
and then doing "barrier" locking in mpv_render_context_free().

Another problem was that the wait loop didn't really wait properly. IT
seems the had_kill_update field was a botched attempt to do that. It's
indeed quite hairy to do that with update(). Instead make use of the
dispatch queue (infinite timeout, using mp_dispatch_interrupt()), which
handles the problem of having to wait both for dispatch queue updates
and VO uninit at the same time.
2019-09-20 16:31:53 +02:00
wm4
d12264acc0 vo_libmpv: always create ctx->dispatch
Preparation for the next commit. Until now, it was only needed if DR was
involved. One reason for not always creating it was that you normally
must not use it if advanced_control is not enabled. This is why e.g.
VOCTRL_SCREENSHOT now checks for that variable; it still can't use
ctx->dispatch if the render API user did not enable it.
2019-09-20 14:53:21 +02:00
wnoun
35da5a4d8e render api: fix use-after-free
render api needs to wait for vo to be destroyed before frees the context.
The purpose of kill_cb is to wake up render api after vo is destroyed,
but uninit did that before kill_cb, so kill_cb tries using the freed
memory. Remove kill_cb to fix the issue as uninit is able to do the
work.
2019-09-20 13:54:17 +02:00
Cameron Cawley
db09d77e46 rpi: Update for modern systems 2019-09-20 11:39:06 +02:00
wm4
257534ed18 vo: remove unused equalizer control remains
Equalizer control was redone in 03cf150ff3 (over 2 years
ago). Ever since, the equalizer control structs and the GET voctrl have
been unused. Only the SET voctrl is still used as notification mechanism
(actually a bad hack to avoid some further option change handling
complexity).

Remove the unused parts.
2019-09-20 00:45:17 +02:00
wm4
8e5cd62dca oml_sync: fix typo in comment
I think... Also reword another part of the text.
2019-09-20 00:32:29 +02:00
wm4
7000d91cf8 vf_fingerprint: use aligned_alloc instead of posix_memalign
I was assuming posix_memalign was the most portable function to use, but
MinGW does not provide it for some reason. Switch to C11 aligned_alloc()
which someone suggested was provided by MinGW (but actually isn't,
someone probably confused it with the incompatible _aligned_malloc),
and add a configure check.

Even though it turned out that MinGW doesn't provide it, the function
is slightly more elegant than posix_memalign(), so stay with it.
2019-09-19 23:09:02 +02:00
wm4
9cfeafa89e video: add vf_fingerprint and a skip-logo script
skip-logo.lua is just what I wanted to have. Explanations are on the top
of that file. As usual, all documentation threatens to remove this stuff
all the time, since this stuff is just for me, and unlike a normal user
I can afford the luxuary of hacking the shit directly into the player.

vf_fingerprint is needed to support this script. It needs to scale down
video frames as part of its operation. For that, it uses zimg. zimg is
much faster than libswscale and generates more correct output. (The
filter includes a runtime fallback, but it doesn't even work because
libswscale fucks up and can't do YUV->Gray with range adjustment.)

Note on the algorithm: seems almost too simple, but was suggested to me.
It seems to be pretty effective, although long time experience with
false positives is missing. At first I wanted to use dHash [1][2], which
is also pretty simple and effective, but might actually be worse than
the implemented mechanism. dHash has the advantage that the fingerprint
is smaller. But exact matching is too unreliable, and you'd still need
to determine the number of different bits for fuzzier comparison. So
there wasn't really a reason to use it.

[1] https://pypi.org/project/dhash/
[2] http://www.hackerfactor.com/blog/index.php?/archives/529-Kind-of-Like-That.html
2019-09-19 20:37:05 +02:00
wm4
e1157cb6e8 video: generally try to align image data on 64 bytes
Generally, using x86 SIMD efficiently (or crash-free) requires aligning
all data on boundaries of 16, 32, or 64 (depending on instruction set
used). 64 bytes is needed or AVX-512, 32 for old AVX, 16 for SSE. Both
FFmpeg and zimg usually require aligned data for this reason.

FFmpeg is very unclear about alignment. Yes, it requires you to align
data pointers and strides. No, it doesn't tell you how much, except
sometimes (libavcodec has a legacy-looking avcodec_align_dimensions2()
API function, that requires a heavy-weight AVCodecContext as argument).

Sometimes, FFmpeg will take a shit on YOUR and ITS OWN alignment. For
example, vf_crop will randomly reduce alignment of data pointers,
depending on the crop parameters. On the other hand, some libavfilter
filters or libavcodec encoders may randomly crash if they get the wrong
alignment. I have no idea how this thing works at all.

FFmpeg usually doesn't seem to signal alignment internal anywhere, and
usually leaves it to av_malloc() etc. to allocate with proper alignment.
libavutil/mem.c currently has a ALIGN define, which is set to 64 if
FFmpeg is built with AVX-512 support, or as low as 16 if built without
any AVX support. The really funny thing is that a normal FFmpeg build
will e.g. align tiny string allocations to 64 bytes, even if the machine
does not support AVX at all.

For zimg use (in a later commit), we also want guaranteed alignment.
Modern x86 should actually not be much slower at unaligned accesses, but
that doesn't help. zimg's dumb intrinsic code apparently randomly
chooses between aligned or unaligned accesses (depending on compiler, I
guess), and on some CPUs these can even cause crashes. So just treat the
requirement to align as a fact of life.

All this means that we should probably make sure our own allocations are
64 bit aligned. This still doesn't guarantee alignment in all cases, but
it's slightly better than before.

This also makes me wonder whether we should always override libavcodec's
buffer pool, just so we have a guaranteed alignment. Currently, we only
do that if --vd-lavc-dr is used (and if that actually works). On the
other hand, it always uses DR on my machine, so who cares.
2019-09-19 20:37:05 +02:00
wm4
e265c07547 vo: fix missed option updates under rare circumstances
Dear diary,

today I fixed a shitty bug that was all my fault because I made a
horrible mess. (Except it was a horrible mess before I even touched
this shit, but let's not blame others.)

Sometimes, updates to VO option that control video sizing (like panscan)
didn't update the screen correctly. They were delayed until the next
option change or so.

It turns out that if the option update happens at the "same" time as a
VOCTRL, update_opts() doesn't actually notify the vo_driver of the
change. This in turn happened because run_control() called
m_config_cache_update(). The latter function returns true if the options
changed since the last call, and update_opts() also calls it (on the
same config cache) for the same purpose. The update_opts() call, which
is triggered by a third mechanism, comes later, but the cache update
call will return false (as it should). Basically, given the config API,
you can't act differently on multiple update calls and expect it to
work. The skipped handling in update_opts() meant that the notification
required to apply the changed option wasn't run.

Fix this by simply calling update_opts() directly instead. Now there's
only 1 m_config_cache_update() call on this specific instance. Fix the
call in run_reconfig() too, so the previous sentence isn't a lie (but it
probably doesn't make a difference in practice due to certain details).

I'm not sure how I even ran into this sort-of race condition. The VOCTRL
that messed up the option update was VOCTRL_UPDATE_PLAYBACK_STATE, which
happens semi-regularly.

Why this config cache shit and all the other shit? Rediscovering this
crap wasn't pleasant. It's a bunch of hacks that became necessary when
the ancient MPlayer architecture made it hard to move the VO to a
separate thread.

All the VO code typically accesses vo->opts (whose fields all used to be
global variables in MPlayer). The frontend changes these on user input.
Putting locking around all the options would be a nightmare, and keeping
a copy of the options in the thread was much simpler. You need a way to
propagate option changes, notify the thread, and update the local copy
too. And the result of these thoughts was the config cache mechanism.

In this specific case, the relevant cache update call in update_opts()
triggers a VOCTRL_SET_PANSCAN to the VO driver, which isn't related to
its former function anymore. Instead, it causes the VO driver to update
the video sizing/placing options, which the generic VO code can't do.
(Mostly because the VO driver includes the windowing stuff and is
responsible for resizing etc. itself.)

VOCTRLs sent by the frontend are even worse. MPlayer had no real runtime
option change mechanism. Some options were vaguely duplicated by
properties, so you could effectively change those options at runtime.
Each of these options had its own VOCTRL, which still exist today, e.g.
VOCTRL_FULLSCREEN, or VOCTRL_ONTOP. I tried to make all options runtime
changeable, and to unify properties with options. But I couldn't be
bothered with updating all VO drivers to listen to option changes
directly, because that would be pretty tedious. So the property code is
still all there and sends the old VOCTRLs. But of course you need to
sync up the options, which is why the run_control() code did that.

(Unrelated: VO_EVENT_FULLSCREEN_STATE is the worst shithack of them all.
Currently, only the frontend can actually write to options (for awful
reasons), so if the fullscreen state changes due to outside interaction,
the VO driver can't update the corresponding option fields. So the VO
notifies the frontend with said VO_EVENT_, and the frontend then sends
VOCTRL_GET_FULLSCREEN, and updates the global copy of the option with
the value returned by that. I still like to think the situation is not
that bad considering the monstrous effort of converting single-threaded
code that had hundreds of options in global variables to multi-threaded
code with no global variables at all.)
2019-09-19 20:37:05 +02:00
wm4
9e1945d307 win_state: silence a valgrind warning
m_geometry_apply() will read and modify the dummy variable. It's not
actually used for anything, but valgrind will still warn against
uninitialized data. I'm not sure whether this was UB, but in any case
it's annoying when running valgrind.
2019-09-19 20:37:05 +02:00
wm4
a8be8fe4f4 vd_lavc: put vaapi before vdpau in autoprobe order
--hwdec=auto-copy was preferring vdpau over vaapi. In the HEVC 10 bit
case, this also led to hardware decoding not being enabled. (Probably
because the probing can't start over after enabling hw decoding fails at
runtime, or something like that.)

Possible that this subtly breaks on some setups. You can't always win.
2019-09-19 20:37:05 +02:00
wm4
a17337910c vo_gpu: hwdec_vaegl: silence confusing message during probing
During probing on a system with AMD GPU, mpv used to output the
following messages if hardware decoding was enabled:

[ffmpeg] AVHWFramesContext: Failed to create surface: 2 (resource
allocation failed).
[ffmpeg] AVHWFramesContext: Unable to allocate a surface from internal
buffer pool.

This commit removed the message, with hopefully no other side effects.
Long explanations follow, better don't read them, it's just tedious
drivel about the details. People should learn to write concise commit
messages, not drone on and on endlessly all while they have no fucking
point.

The code probes supported hardware pixel format, and checks whether they
can be mapped as textures. av_hwdevice_get_hwframe_constraints() returns
a list of hardware pixel formats in the valid_sw_formats field (the "sw"
means software, but they're still hardware pixel formats, makes sense).
This contained the format yuv420p, even though this is not a valid
hardware format. Trying to create a surface of this type results in VA
surface creation failure, upon which FFmpeg prints the error messages
above. We'd be fine with this, except FFmpeg has a global log callback,
and there's no way to suppress these messages without creating other
issues.

It turns out that FFmpeg's vaapi implementation returns all formats from
vaQueryImageFormats() if no "hwconfig" is provided. This list includes
yuv420p, which is probably supported for surface upload/download, but
not as native format. Following FFmpeg's logic, it should not appear in
the valid_sw_formats list, because formats for transfers are returned by
another roundabout API.

Idiotically, there doesn't seem to be any vaapi call that determines
whether a format is a valid surface format. All mechanisms to do this
are bound to a VAConfigID (= video codec or video processor), all while
the actual surface creation API strangely does not take a VAConfigID (a
big WTF).

Also, calling the vaCreateSurfaces() API ourselves for probing is out of
the question, because that functions is utterly and idiotically complex.
Look at the FFmpeg code and how much effort it requires to setup a
complete set of attributes - we can't duplicate this.

So the only way left to do this is the most idiotic and tedious way:
enumerating all VAProfile (and VAEntrypoints) to create all possible
VAConfigIDs. Each of the VAConfigIDs is associated with a list of
formats, which FFmpeg can return (by passing the ID along with the
"hwconfig"), and which is probed separately.

Note that VAConfigID actually refers to a dynamic instance of something,
and creating a VAConfigID takes not only the VAProfile and the
VAEntrypoint, but also an arbitrary attribute array. In theory, this
means our attempt to get to know all possible configurations cannot
work, but in practice this attribute array seems to be pointless for
decoding and video processing, and FFmpeg doesn't use it (though the
encoding path does use it). This probably just makes it _barely_ OK to
do it this way.

Could we discard all this probing shit, and somehow do it another way?
Probably not. The EGL API for mapping surfaces doesn't even seem to
provide a way to enumerate supported formats, we may not even know
whether DRM/dmabuf interop is actually supported (AFAIR the EGL
extensions are present even if they don't work), nor do we know whether
the VAAPI driver supports this interop (not sure). So actually trying is
the only way.

Further, mpv initializes the decoder on a another thread, where you
can't just access OpenGL state. This suckage is mostly to be blamed on
OpenGL itself and its crazy thread boundedness. In theory, this could be
done anyway (see how software decoding "direct rendering" tries to get
around this). But to make it worse, the decoder never cares about the
list of supported formats determined by this code; instead,
f_autoconvert.c tries to deal with it and insert a video processor
(well, good luck with this crap, I bet it doesn't even work). So this
whole endeavor might be pointless, other than the fact that failed
probing can disable use of vaapi (which is correct and necessary). But
if you have a shovel, you don't use it to smash the flat end on the heap
of shit that's piled up before you, or do you?

While this method probably works, it's still orgasmically tedious. It
was tedious before: we had to create a real surface, create a GL
texture, map the surface with it, then destroy everything again. But the
added code is tedious on its own. Highlights include the need to malloc
a FFmpeg struct just to pass a single damn integer, the need to
enumerate "entrypoints" for each VA profile, even though all profiles
have exactly 1 entrypoint, and the kind of obnoxious way how vaapi
requires you to preallocate arrays for returned things, even they could
for example reasonably be returned as immutable arrays or have some
other simpler API.

The main grand fuckup is of course that vaapi requires a VAConfigID to
query surface properties, but not for creating surfaces. This
awkwardness even affected the FFmpeg API design, which has a "hwconfig"
concept that is only used by vaapi (vaapi is only 1 out of 10 hardware
decoding APIs supported by the FFmpeg hwcontext stuff). Maybe I'm just
missing something. It's as if vaapi required setting radioactive shit on
fire. Look how clean the native D3D11 code is instead. (Even the ANGLE
code manages to avoid being this fucked up. Or the VDPAU code, despite
supporting multiple mapping methods.)

Another only barely related change is that the valid_sw_formats field
can be NULL, and the API explicitly documents this. Technically, the mpv
code was buggy for not checking this, although until now the FFmpeg
implementation so far could not return it when we still passed NULL for
the hwconfig parameter.
2019-09-19 20:37:05 +02:00