When I introduced the concept of lazy loading of hwdecs by img format,
I did not propagate the probing flag correctly, leading to the new
normal loading path not runnng with probing set, meaning that any
errors would show up, creating unnecessary noise.
This change fixes this regression.
So I can reuse it in vo_gpu_next without having to reinvent the wheel.
In theory, a lot of the stuff could be made more private inside the
hwdec code itself, but for the time being I don't care about refactoring
this code, merely sharing it.
Historically, we have treated hwdec interop loading as a completely
separate step from loading the hwdecs themselves. Some hwdecs need an
interop, and some don't, and users generally configure the exact
hwdec they want, so interops that aren't relevant for that hwdec
shouldn't be loaded to save time and avoid warning/error spam.
The basic approach here is to recognise that interops are tied to
hwdecs by imgfmt. The hwdec outputs some format, and an interop is
needed to get that format to the vo without read back.
So, when we try to load an hwdec, instead of just blindly loading all
interops as we do today, let's pass the imgfmt in and only load
interops that work for that format. If more than one interop is
available for the format, the existing logic (whatever it is) will
continue to be used to pick one.
We also have one callsite in filters where we seem to pre-emptively
load all the interops. It's probably possible to trace down a specific
format but for now I'm just letting it keep loading all of them; it's
no worse than before.
You may notice there is no documentation update - and that's because
the current docs say that when the interop mode is `auto`, the interop
is loaded on demand. So reality now reflects the docs. How nice.
Today, validation is only possible for string type options. But there's
no particular reason why it needs to be restricted in this way, and
there are potential uses, to allow other options to be validated
without forcing the option to have to reimplement parsing from
scratch.
The first part, simply making the validation function an explicit
field instead of overloading priv is simple enough. But if we only do
that, then the validation function still needs to deal with the raw
pre-parsed string. Instead, we want to allow the value to be parsed
before it is validated. That in turn leads to us having validator
functions that should be type aware. Unfortunately, that means we need
to keep the explicit macro like OPT_STRING_VALIDATE() as a way to
enforce the correct typing of the function. Otherwise, we'd have to
have the validator take a void * and hope the implementation can cast
it correctly.
For help, we don't have this problem, as help doesn't look at the
value.
Then, we turn validators that are really help generators into explicit
help functions and where a validator is help + validation, we split
them into two parts.
I have, however, left functions that need to query information for both
help and validation as single functions to avoid code duplication.
In this change, I have not added an other OPT_FOO_VALIDATE() macros as
they are not needed, but I will add some in a separate change to
illustrate the pattern.
VOCTRL_UPDATE_RENDER_OPTS is supposed to be optional so check if it
actually exists before executing the function. Fixes a segfault when
changing the alpha value at runtime on non-wayland platforms.
vo_gpu has a small set of options for ra_ctx that can be set. In
practice, runtime toggling doesn't matter for most of these as they have
no effect while a video is playing. However, changing the alpha option
during runtime can actually work depending on the backend used. mpv
already detected when one of these options changed, but it made no
attempt to update the options in the ra_ctx accordingly (likely because
nothing made any use of this information). Another related change is to
add an update_render_opts to the fns and allow invidiual backends to
(optionally) use it.
Change all OPT_* macros such that they don't define the entire m_option
initializer, and instead expand only to a part of it, which sets certain
fields. This requires changing almost every option declaration, because
they all use these macros. A declaration now always starts with
{"name", ...
followed by designated initializers only (possibly wrapped in macros).
The OPT_* macros now initialize the .offset and .type fields only,
sometimes also .priv and others.
I think this change makes the option macros less tricky. The old code
had to stuff everything into macro arguments (and attempted to allow
setting arbitrary fields by letting the user pass designated
initializers in the vararg parts). Some of this was made messy due to
C99 and C11 not allowing 0-sized varargs with ',' removal. It's also
possible that this change is pointless, other than cosmetic preferences.
Not too happy about some things. For example, the OPT_CHOICE()
indentation I applied looks a bit ugly.
Much of this change was done with regex search&replace, but some places
required manual editing. In particular, code in "obscure" areas (which I
didn't include in compilation) might be broken now.
In wayland_common.c the author of some option declarations confused the
flags parameter with the default value (though the default value was
also properly set below). I fixed this with this change.
The externally driven renderloop was originally added for the wayland
context (to make display sync somewhat work), but it has a lot of issues
with mpv's internal structure. A different approach should be used.
This reverts commit a743fef837.
Use the extension to compute the (hopefully correct) video delay and
vsync phase.
This is very fuzzy, because the latency will suddenly be applied after
some frames have already been shown. This means there _will_ be "jumps"
in the time accounting, which can lead to strange effects at start of
playback (such as making initial "dropped" etc. frames worse). The only
reasonable way to fix this would be running a few dummy frame swaps at
start of playback until the latency is known. The same happens when
unpausing.
This only affects display-sync mode.
Correct function was not confirmed. It only "looks right". I don't have
the equipment to make scientifically correct measurements.
A potentially bad thing is that we trust the timestamps we're receiving.
Out of bounds timestamps could wreak havoc. On the other hand, this will
probably cause the higher level code to panic and just disable DS.
As a further caveat, this makes a bunch of assumptions about UST
timestamps. If there are delayed frames (i.e. we skipped one or more
vsyncs), the latency logic is mostly reset. There is no attempt to make
the vo.c skipped vsync logic to use this. Also, the latency computation
determines a vsync duration, and there's no effort to reconcile or share
the vo.c logic for determining vsync duration.
There is now a better way. Reading the font framebuffer was always a
hack. The new code via VOCTRL_SCREENSHOT renders it into a FBO, which
does not come with the disadvantages of reading the front buffer (like
not being supported by GLES, possibly black regions due to overlapping
windows on some systems).
For now keep VOCTRL_SCREENSHOT_WIN on the VO level, because there are
still some lesser VOs and backends that use it.
Using the GL renderer for color conversion will make sure screenshots
will use the same conversion as normal video rendering. It can do this
for all types of screenshots.
The logic when to write 16 bit PNGs changes. To approximate the old
behavior, we decide by looking whether the source video format has more
than 8 bits per component. We apply this logic even for window
screenshots. Also, 16 bit PNGs now always include an unused alpha
channel. The reason is that FFmpeg has RGB48 and RGBA64 formats, but no
RGB064. RGB48 is 3 bytes and usually not supported by GPUs for
rendering, so we have to use RGBA64, which forces an alpha channel.
Will break for users who use --target-trc and similar options.
I considered creating a new gl_video context, but it could double GPU
memory use, so I didn't.
This uses FBOs instead of glGetTexImage(), because that increases the
chance it could work on GLES (e.g. ANGLE). Untested. No support for the
Vulkan and D3D11 backends yet.
Fixes#5498. Also fixes#5240, because the code for reading back is not
used with the new code path.
Fixes display-sync (though if you change virtual desktops you'll need to seek
to re-enable display-sync) partially under wayland.
As an advantage, rendering is completely disabled if you change desktops or
alt+tab so you lose no performance if you leave mpv running elsewhere as long
as it isn't visible.
This could also be ported to other VOs which supports it.
Make the VO<->decoder interface capable of supporting multiple hwdec
APIs at once. The main gain is that this simplifies autoprobing a lot.
Before this change, it could happen that the VO loaded the "wrong" hwdec
API, and the decoder was stuck with the choice (breaking hw decoding).
With the change applied, the VO simply loads all available APIs, so
autoprobing trickery is left entirely to the decoder.
In the past, we were quite careful about not accidentally loading the
wrong interop drivers. This was in part to make sure autoprobing works,
but also because libva had this obnoxious bug of dumping garbage to
stderr when using the API. libva was fixed, so this is not a problem
anymore.
The --opengl-hwdec-interop option is changed in various ways (again...),
and renamed to --gpu-hwdec-interop. It does not have much use anymore,
other than debugging. It's notable that the order in the hwdec interop
array ra_hwdec_drivers[] still matters if multiple drivers support the
same image formats, so the option can explicitly force one, if that
should ever be necessary, or more likely, for debugging. One example are
the ra_hwdec_d3d11egl and ra_hwdec_d3d11eglrgb drivers, which both
support d3d11 input.
vo_gpu now always loads the interop lazily by default, but when it does,
it loads them all. vo_opengl_cb now always loads them when the GL
context handle is initialized. I don't expect that this causes any
problems.
It's now possible to do things like changing between vdpau and nvdec
decoding at runtime.
This is also preparation for cleaning up vd_lavc.c hwdec autoprobing.
It's another reason why hwdec_devices_request_all() does not take a
hwdec type anymore.
vo_gpu.c will call gl_video_icc_auto_enabled() to check whether it
should retrieve the ICC profile. But the value returned by this function
will be outdated, because gl_video_update_options() is not called yet.
Change the order of function calls so that this is done after updating
the options.
(This is fairly chaotic, but I guess this code will be refactored a
dozen of times anyway in the future.)
With video paused, changing the brightness controls (or similar) would
sometimes not rerender the video frame. So the OSD would redraw, but the
video wouldn't change. This is caused by output caching, and a redraw
request is free to return the cached frame. Change it such to invalidate
the cached frame if any of the options or the equalizer change.
In theory, gl_video_reset_surfaces() could be called if the equalizer
changes - this would apparently force interpolatzion to redraw all
frames. But this looks kind of crappy when changing the equalizer during
playback. It'll "eventually" use the correct settings anyway, and when
paused interpolation is off.
This time based on ra/vo_gpu. 2017 is the year of the vulkan desktop!
Current problems / limitations / improvement opportunities:
1. The swapchain/flipping code violates the vulkan spec, by assuming
that the presentation queue will be bounded (in cases where rendering
is significantly faster than vsync). But apparently, there's simply
no better way to do this right now, to the point where even the
stupid cube.c examples from LunarG etc. do it wrong.
(cf. https://github.com/KhronosGroup/Vulkan-Docs/issues/370)
2. The memory allocator could be improved. (This is a universal
constant)
3. Could explore using push descriptors instead of descriptor sets,
especially since we expect to switch descriptors semi-often for some
passes (like interpolation). Probably won't make a difference, but
the synchronization overhead might be a factor. Who knows.
4. Parallelism across frames / async transfer is not well-defined, we
either need to use a better semaphore / command buffer strategy or a
resource pooling layer to safely handle cross-frame parallelism.
(That said, I gave resource pooling a try and was not happy with the
result at all - so I'm still exploring the semaphore strategy)
5. We aggressively use pipeline barriers where events would offer a much
more fine-grained synchronization mechanism. As a result of this, we
might be suffering from GPU bubbles due to too-short dependencies on
objects. (That said, I'm also exploring the use of semaphores as a an
ordering tactic which would allow cross-frame time slicing in theory)
Some minor changes to the vo_gpu and infrastructure, but nothing
consequential.
NOTE: For safety, all use of asynchronous commands / multiple command
pools is currently disabled completely. There are some left-over relics
of this in the code (e.g. the distinction between dev_poll and
pool_poll), but that is kept in place mostly because this will be
re-extended in the future (vulkan rev 2).
The queue count is also currently capped to 1, because of the lack of
cross-frame semaphores means we need the implicit synchronization from
the same-queue semantics to guarantee a correct result.
Due to the plethora of historical baggage from different eras getting
confusing, I decided to simplify and unify the struct organization and
naming scheme.
Structs that got renamed:
1. fbodst -> ra_fbo (and moved to gpu/context.h)
2. fbotex -> removed (redundant after 2af2fa7a)
3. fbosurface -> surface
4. img_tex -> image
In addition to these structs being renamed, all of the names have been
made consistent. The new scheme is as follows:
struct image img;
struct ra_tex *tex;
struct ra_fbo fbo;
This also affects derived names, e.g. indirect_fbo -> indirect_tex.
Notably also, finish_pass_fbo -> finish_pass_tex and finish_pass_direct
-> finish_pass_fbo.
The new equivalent of fbotex_change() is called ra_tex_resize().
This commit (should) contain no logic changes, just renaming a bunch of
crap.
Turns out the option code apparently tries to directly talloc_free() the
allocated strings, instead of going through a tactx wrapper or
something. So we can't directly overwrite it. Do something else
instead..
This is done in several steps:
1. refactor MPGLContext -> struct ra_ctx
2. move GL-specific stuff in vo_opengl into opengl/context.c
3. generalize context creation to support other APIs, and add --gpu-api
4. rename all of the --opengl- options that are no longer opengl-specific
5. move all of the stuff from opengl/* that isn't GL-specific into gpu/
(note: opengl/gl_utils.h became opengl/utils.h)
6. rename vo_opengl to vo_gpu
7. to handle window screenshots, the short-term approach was to just add
it to ra_swchain_fns. Long term (and for vulkan) this has to be moved to
ra itself (and vo_gpu altered to compensate), but this was a stop-gap
measure to prevent this commit from getting too big
8. move ra->fns->flush to ra_gl_ctx instead
9. some other minor changes that I've probably already forgotten
Note: This is one half of a major refactor, the other half of which is
provided by rossy's following commit. This commit enables support for
all linux platforms, while his version enables support for all non-linux
platforms.
Note 2: vo_opengl_cb.c also re-uses ra_gl_ctx so it benefits from the
--opengl- options like --opengl-early-flush, --opengl-finish etc. Should
be a strict superset of the old functionality.
Disclaimer: Since I have no way of compiling mpv on all platforms, some
of these ports were done blindly. Specifically, the blind ports included
context_mali_fbdev.c and context_rpi.c. Since they're both based on
egl_helpers, the port should have gone smoothly without any major
changes required. But if somebody complains about a compile error on
those platforms (assuming anybody actually uses them), you know where to
complain.