Because VOCTRL_CHECK_EVENTS is processed asynchronously (as of 088a007),
the GUI thread no longer gets regular wakeups, so the old check that
made sure the video window matched the parent window's size in --wid
embedding mode did not run very often. This made --wid embedding not
very usable.
Instead of polling for window size changes, use Windows hooks to react
to them when they happen. When the parent window is owned by the same
process as the video window, use a WH_CALLWNDPROC hook. When the parent
window is not owned by the same process, WinEvents must be used, which
are not as smooth, but still work for this purpose.
Since neither SetWindowsHookEx nor SetWinEventHook takes a context
parameter to send data to the hook function, the hook functions must
find the child window by its class instead, so there are a few changes
to ensure this is fast and the class is unique.
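A minimal sketch of the two hook flavors described above, assuming an
illustrative window class name and resize helper (this is not mpv's actual
code, and the exact WinEvent range used here is an assumption):

    #include <windows.h>

    #define VIDEO_WINDOW_CLASS L"mpv-video-window"  // assumed unique class name

    static void resize_to_parent(HWND parent)
    {
        // No context pointer is available, so locate the child by its class.
        HWND video = FindWindowExW(parent, NULL, VIDEO_WINDOW_CLASS, NULL);
        if (!video)
            return;
        RECT r;
        GetClientRect(parent, &r);
        MoveWindow(video, 0, 0, r.right, r.bottom, TRUE);
    }

    // Same-process parent: a WH_CALLWNDPROC hook sees its messages directly.
    static LRESULT CALLBACK callwndproc_hook(int code, WPARAM wp, LPARAM lp)
    {
        CWPSTRUCT *cwp = (CWPSTRUCT *)lp;
        if (code >= 0 && cwp->message == WM_SIZE)
            resize_to_parent(cwp->hwnd);
        return CallNextHookEx(NULL, code, wp, lp);
    }

    // Cross-process parent: WinEvents are delivered asynchronously instead.
    static void CALLBACK winevent_hook(HWINEVENTHOOK hook, DWORD event,
                                       HWND hwnd, LONG obj, LONG child,
                                       DWORD thread, DWORD time)
    {
        if (event == EVENT_OBJECT_LOCATIONCHANGE && obj == OBJID_WINDOW)
            resize_to_parent(hwnd);
    }

The hooks would be installed with SetWindowsHookEx(WH_CALLWNDPROC, ...) for
the in-process case, and SetWinEventHook(..., WINEVENT_OUTOFCONTEXT) for the
out-of-process case.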
This also fixes up the logic to handle window destruction. When a parent
window is destroyed, its children are also destroyed, so this gives us a
way to react to parent window destruction without polling.
If the video has the same size as the screen, starting with --fs and
then leaving fullscreen doesn't actually leave fullscreen.
The reason is that mpv tries to restore the previous window size if
necessary (otherwise, you'd end up with a window of nearly the same size
as the screen with some WMs). It will typically restore with the
rectangle set exactly to the screen if no other position or size is
forced. This triggers pre-EWMH fullscreen mode, which WMs detect using
various heuristics.
Apparently we triggered this with mutter (but strangely no other WMs).
It's possible that pre-EWMH fullscreen mode actually requires removing
decorations, and that mutter simply ignores this requirement. But this is
speculation and I haven't checked.
Work around this by reducing the requested size by 1 pixel if it
happens.
This was observed with mutter 3.18.2.
Fixes #2072.
This should actually be rather safe - we already check whether the
estimated value jitters less than the (possibly untrustworthy) nominal
one. Remove a "safety" check that disabled this code for small
deviations, and make it trigger earlier during playback. Also lower the log
level of messages about using the estimated display FPS down to verbose.
Normally there's another mechanism for smoothing out minor estimation
differences, but that is not good enough here.
This possibly improves behavior as reported in #3433, which can be
reproduced with --vo=null:fps=48.426 --display-fps=48 (though it doesn't
consider the jitter introduced by a real VO).
Doing this required synchronizing with the VO thread, which could lead
to audio dropouts if the VO was frozen (which can happen in practice if
e.g. an opengl_cb user is not doing what the API demands).
Add a way to send asynchronous VOCTRLs, and use that for the playback
state. In theory, it would be better to make this status update a
separate function and to "merge" several queued updates, but that would be
slightly more effort/code, and the update is so infrequent that the
merging would never happen anyway.
The change to vo_destroy() is to make sure all queued asynchronous
requests are finished before making the VO thread exit.
Even though it's only used on MS Windows, it's run on any platform with
any VO, which makes this worse.
run_control() dereferences a uint32_t as int. Whether this is allowed
depends on what uint32_t is typedefed to (dereferencing an unsigned int
as int should be fine). Fix it by always using int. The uint32_t type
never really made sense.
In display-sync mode, the very first video frame is idiotically fully
timed, even though audio has not been synced yet at this point, and the
video frame is more like a "preview" frame. But since it's fully timed,
an underflow is detected if audio takes longer than the display time of
the frame (we send the second frame only after audio is done).
The timing code will try to compensate for the determined desync, but it
really shouldn't. So explicitly discard the timing info in this specific
case. On the other hand, if the first frame still hasn't finished
displaying, we can pretend everything is ok.
This is a hack - ideally, we either would send a frame without timing
info (and then send it again or so when playback starts properly), or we
would add real pause support to the VO, and pause it during syncing.
VOCTRL_CHECK_EVENTS is called on every frame. This is by design, and is
supposed to check the event queue of the windowing API.
With the decoupled GUI thread in w32_common.c this doesn't make too much
sense, and the purpose of VOCTRL_CHECK_EVENTS is really reduced to
checking event flags. Even worse, waiting on the GUI thread can
interfere with playback, since win32 sometimes blocks the event loop
(e.g. clicking the window title bar).
Change the code such that we really only query the event flags. Use
atomics to avoid having to add a new mutex. (We assume we always have
real atomics available. The build system doesn't check this properly,
and it could fall back to dummy atomics, which are not atomic.)
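A minimal sketch of the idea (illustrative, not the actual w32_common.c
code):

    #include <stdatomic.h>

    static atomic_int vo_event_flags;

    // GUI thread side (e.g. in the window procedure): set a pending event.
    static void flag_event(int event)
    {
        atomic_fetch_or(&vo_event_flags, event);
    }

    // VO thread side, in VOCTRL_CHECK_EVENTS: read and reset in one step.
    static int fetch_events(void)
    {
        return atomic_fetch_and(&vo_event_flags, 0);
    }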
Should help with #3393. Doesn't help if the core happens to send a
synchronous request, most commonly via VOCTRL_SET_CURSOR_VISIBILITY or
VOCTRL_UPDATE_PLAYBACK_STATE.
Prevents segfaults when a fullscreen switch is issued before fully
initializing the VO.
Doesn't change anything since the schedule_resize is only there to
resize in case the image size switches, which happens long after init.
The problem was that when in fullscreen, switching between images did
not issue a resize event, causing none of the images to be rendered
correctly.
This fixes the problem by issuing a resize event with the screen width
and height.
This commit also moves the zeroing of the events field to when it gets
retrieved by mpv rather than randomly after a resize in the vo/backend
code.
surface_handle_configure()'s width and height are just hints given by
the compositor; the application is free to not respect them strictly and
to compensate for e.g. aspect ratio.
This prevents crazy scenarios in which pictures with portrait aspect
ratios have a huge black area to make them 16:9 or whatever the
compositor feels like.
With X11 it was usually left up to the window manager to prevent huge
windows from being out of range, but no Wayland compositor will do
this right now.
Hugely improves usability when using mpv as an image viewer.
Missed during the recent changes.
Also simplify error checking code and check for POLLNVAL
as well (the display fd was never actually checked to be valid).
This requires changing the pixel upload alignment because the odd sizes
might not be aligned to multiples of 4.
Anyway, the restriction has no real benefit, and the sizes between 32
and 64 might be worth using, so just drop it.
Following testing after ebe798a, this is a more than sufficient size to
cover our use case.
The old default was a drop of about 58 dB PSNR using the old code, and
this new default is about 65 dB PSNR, so it's actually an improvement
despite resulting in a smaller size.
There was no outlier whatsoever when comparing sizes around the 64
neighbourhood (with every step corresponding to a PSNR drop of about
0.07 dB), so I picked this since it's a power of two and requires no
change to the current 3dlut-size parsing logic.
I also tested smaller sizes such as 32x32x32 which performed almost as
well on colorful samples, but this results in noticeable black boost in
the dark regions, which is pretty undesirable. Therefore, we should
avoid going much further below 64x64x64.
Either way, this new size is so fast to compute that the 3dlut cache is
almost useless on my end. In fact, it might even be slower to load the
profile from the cache than to recompute it from scratch. (For caches on
disk; for a cache on tmpfs, it makes no difference.)
It seems vo_x11_check_events() was supposed to return the currently
flagged events and reset them. But there are many places where
vo_x11_check_events() is called without checking its return value. This
could lead to forgotten events.
Change the code such that they can't get lost.
This code had the exact same texture indexing bug that the original
scaler code had before the introduction of the LUT_POS macro to fix it.
We can re-use this same macro here, and the performance drop is
virtually entirely negligible. The benefit is greatly improved LUT
accuracy as the 3DLUT size decreases - in particular, the old LUT
started introducing more and more black crush the lower your LUT size is
(because the error was essentially an over-contrast bias, with a
magnitude linearly related to the LUT size).
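For reference, the fix amounts to sampling at texel centers rather than texel
edges; roughly what the LUT_POS macro computes, rendered here as a C helper
(a sketch, not the literal shader macro):

    // Remap x in [0,1] so that 0.0 hits the center of the first texel and
    // 1.0 the center of the last one, instead of the texel edges (which
    // biases the interpolation and crushes blacks at small LUT sizes).
    static inline float lut_pos(float x, int lut_size)
    {
        float lo = 0.5f / lut_size;
        float hi = 1.0f - 0.5f / lut_size;
        return lo + (hi - lo) * x;  // the GLSL equivalent is mix(lo, hi, x)
    }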
The new code improves black stability as the LUT size decreases, and
only at very low values (16 and below) do black levels start noticeably
getting affected (due to crude linearization of the nonlinear response
curve).
The default value of 3dlut-size is definitely generous enough for this
to make no difference out of the box, but it also causes no performance
drop at all on my machine so I see no harm in improving the logic.
Furthermore, this means we could easily decrease the default 3dlut size
in a future commit, perhaps even down to 64x64x64 as a default. (But
more testing is warranted here)
Both backends have code to close each FD of their wakeup_pipe array.
This array is default-initialized with 0, which means if the backends
exit before the wakeup pipe is created (e.g. when probing), they would
close FD 0.
Initialize the FDs with -1. Then we call close(-1) in these situations,
which is perfectly allowed and has no bad consequences.
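A minimal sketch of the pattern:

    #include <unistd.h>

    static int wakeup_pipe[2] = {-1, -1};  // previously default-initialized to 0

    static void uninit_wakeup_pipe(void)
    {
        for (int i = 0; i < 2; i++)
            close(wakeup_pipe[i]);  // close(-1) just fails with EBADF, harmlessly
    }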
This fits natively into the vo/backend and allows us to simplify the
polling code.
One new change is the fact that surface_handle_enter flags VO_EVENT_WIN_STATE
and VO_EVENT_RESIZE instead of only VO_EVENT_WIN_STATE. Before this, the code
hackily relied on the timeout and the loop in the wait_frame function to track
and set the scaling factor. Instead, this triggers mpv to run a schedule_resize
and adjust the new VO output dimensions immediately. This is also more accurate
since surface_handle_enter() gets called when a surface is created, moved and
resized, which is exactly what the rest of the player might be interested in.
This uses GLSL mix() instead of going through an indirect texture
access. Easy to implement and might require less resources on some
devices, since the oversample code was already essentially just a
special case of this.
Could be made the new default (as per issue #2685), but that should be
done in a separate commit.
Until now, this has been either handled over vo.event_fd (which should
go away), or by putting event handling on a separate thread. The
backends which do the latter do it for a reason and won't need this, but
X11 and Wayland will, in order to get rid of event_fd.
There's no need to call wl_display_flush() since all the client-side
buffered data has already been flushed prior to polling the fd.
Instead only check for POLLIN and the usual ERR+HUP.
Don't just cause vo_opengl to update the ICC profile every time the
window is moved. Instead, explicitly check if the screen was changed.
Mostly untested.
The hw_subfmt field roughly corresponds to the field
AVHWFramesContext.sw_format in ffmpeg. The ffmpeg one is of the type
AVPixelFormat (instead of the underlying hardware format), so it's a
good idea to switch to this too for preparation.
Now the hw_subfmt field is an mp_imgfmt instead of an opaque/API-
specific number. VDPAU and Direct3D11 already used mp_imgfmt, but
Videotoolbox and VAAPI had to be switched.
One somewhat user-visible change is that the verbose log will now always
show the hw_subfmt as an image format, instead of as a nonsensical number.
(In the end it would be good if we could switch to AVHWFramesContext
completely, but the upstream API is incomplete and doesn't cover
Direct3D11 and Videotoolbox.)
This should get mpv working on Windows 7 machines without hardware
accelerated graphics adapters. It already worked on Windows 8 and up
because those systems would silently fall back to WARP if there was no
graphics hardware installed.
The normal MPGL_CAP_SW flag is not set, so unlike other opengl backends,
this will choose a software adapter even if opengl:sw is not specified.
The reason for this is that, unlike on Linux (where vo_xv and vo_x11 can
be used), mpv on Windows does not have any VO to fall back on when
hardware acceleration isn't available, so if software adapters are rejected, the
user won't see any video output when using the default settings. WARP
seems to perform quite well, so it should be used in this case.
These mostly happen in situations where the correct behavior is
relatively new and not found in the wild (therefore not worth
implementing) and/or extremely complicated (and thus not worth worrying
about the potential edge cases and UI changes).
Still, it's best to document these where they happen to guide the poor
souls maintaining these files in the future.
This uses eglPostSubBufferNV to trigger ANGLE to check the window size
and update the size of the swapchain to match, which is recommended
here: https://groups.google.com/d/msg/angleproject/RvyVkjRCQGU/gfKfT64IAgAJ
With the D3D11 backend, using eglPostSubBufferNV with a 0-sized update
region will even skip the Present() call, meaning it won't block for a
vsync period. Hopefully ANGLE will have a less hacky way of doing this
in future. See the relevant ANGLE issue: http://anglebug.com/1438
Fixes #3301
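The call in question boils down to something like this (a hedged sketch; the
prototype requires EGL_EGLEXT_PROTOTYPES, or the function may need to be
loaded via eglGetProcAddress, and the wrapper name is made up):

    #include <EGL/egl.h>
    #include <EGL/eglext.h>

    // A 0x0 update region makes ANGLE re-check the window size (and, with
    // the D3D11 backend, skip the Present() call entirely).
    static void force_swapchain_resize(EGLDisplay display, EGLSurface surface)
    {
        eglPostSubBufferNV(display, surface, 0, 0, 0, 0);
    }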
This can for example happen with vo_opengl_cb, if it is used with a GL
implementation that does not support FBOs. (mpv itself should never
attempt to use FBOs if they're not available.)
Without this check it would trigger an assert() in our dummy
glBindFramebuffer wrapper.
Suspected cause of #3308, although it's still unlikely.
This moves some of the bulky user-shader specific logic into the file
dedicated to it. Rather than expose video.c state, variable lookup is
now done via a simulated closure.
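A "simulated closure" in C is just an opaque context pointer plus a callback;
a hedged illustration (all names here are made up, not the real ones in
user_shaders.c) of how a condition could resolve texture sizes without seeing
video.c's state:

    #include <stdbool.h>

    typedef bool (*var_lookup_fn)(void *priv, const char *name, float out_size[2]);

    struct eval_ctx {
        void *priv;             // e.g. the renderer's private texture state
        var_lookup_fn lookup;   // resolves a texture name to its size
    };

    // Example: evaluate "is chroma narrower than luma?" through the closure.
    static bool chroma_is_subsampled(struct eval_ctx *ctx)
    {
        float luma[2], chroma[2];
        if (!ctx->lookup(ctx->priv, "LUMA", luma) ||
            !ctx->lookup(ctx->priv, "CHROMA", chroma))
            return false;
        return chroma[0] < luma[0];
    }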
This involves multiple changes:
1. Brightness metadata is split into nominal peak and signal peak.
For a quick and dirty explanation: nominal peak is the brightest value
that your color space can represent (i.e. the brightness of an encoded
1.0), and signal peak is the brightest value that actually occurs in
the video (i.e. the brightest thing that's displayed).
2. vo_opengl uses a new decision logic to figure out the right nom_peak
and sig_peak for all situations. It also does a better job of picking
the right target gamut/colorspace to use for the OSD (which still is,
and still should be, treated as sRGB). This change in logic also
fixes #3293 en passant.
3. Since it was growing rapidly, the logic for auto-guessing / inferring
the right colorimetry configuration (in pass_colormanage) was split from
the logic for actually performing the adaptation (now pass_color_map).
Right now, the new logic doesn't do a whole lot since HDR metadata is
still ignored (but not for long).
There are two reasons for this:
1. I tend to add new fields to this metadata, and every time I've done
so I've consistently forgotten to update all of the dozens of places in
which this colorimetry metadata might end up getting used. While most
usages don't really care about most of the metadata, sometimes the
intent was simply to “copy” the colorimetry metadata from one struct to
another. With this being inside a substruct, those lines of code can now
simply read a.color = b.color without having to care about added or
removed fields.
2. It makes the type definitions nicer for upcoming refactors.
In going through all of the usages, I also expanded a few where I felt
that omitting the “young” fields was a bug.
Commit 883d3114 seems to have (accidentally?) dropped the FBOTEX_FUZZY
from the output_fbo resize, which means that current master will keep
resizing and resizing the FBO as you change the window size, introducing
severe memory leaking after a while. (Not sure why that would cause
memory leaks, but I blame nvidia)
Either way, it's bad for performance too, so it's worth fixing.
vo_vaapi is the only thing which can't scale RGBA on the GPU. (Other
cases of RGBA scaling are handled in draw_bmp.c for some reason.)
Move this code and get rid of the osd_conv_cache thing.
Functionally, nothing changes.
This is how PBOs are normally supposed to be used.
Unfortunately I can't see any absolute improvement on nVidia binary
drivers and playing 4K material. Compared to the "old" PBO path with 1
buffer, the measured GL time decreases significantly, though.
GL generally does not support flipping the image on upload, meaning
negative strides are not supported. vo_opengl handles this by flipping
rendering if the stride is inverted, and gl_pbo_upload() "ignores"
negative strides by uploading without flipping the image.
If individual planes had strides with different signs, this broke. The
flipping affected the entire image, and only the sign of the first plane
was respected.
This is just a crazy corner case that will never happen, but it turns
out this is quite simple to support, and actually improves the code
somewhat.
This introduces a gl_pbo_upload_tex() function, which works almost like
our gl_upload_tex() glTexSubImage2D() wrapper, except it takes a struct
which caches the PBO handles. It also takes the full texture size (to
make allocating an ideal buffer size easier), and a parameter to disable
PBOs (so that the caller doesn't have to duplicate the gl_upload_tex()
call if PBOs are disabled or unavailable).
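The approximate shape of the new helper, as described (hedged; the real
declaration may differ in details, and "GL" is mpv's function table, shown
opaque here):

    typedef struct GL GL;  // mpv's GL function table (opaque in this sketch)

    struct gl_pbo_upload {
        GLuint pbo[2];       // cached buffer handles, rotated between frames
        size_t buffer_size;
    };

    void gl_pbo_upload_tex(struct gl_pbo_upload *pbo, GL *gl, bool use_pbo,
                           GLenum target, GLenum format, GLenum type,
                           int tex_w, int tex_h,             // full texture size
                           const void *dataptr, int stride,  // source plane
                           int x, int y, int w, int h);      // region to upload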
This also removes warnings and fallbacks on PBO failure. We just
silently try using PBOs on every frame, and if that fails at some point,
revert to normal texture uploads. Probably doesn't matter.
This makes the geometry of the sizing borders more like the ones in
Windows 10. It also fixes an off-by-one error that made the right and
bottom borders thinner than the left and top borders, which made it
difficult to resize the window when using the Windows 7 classic theme
(because it has pretty thin sizing borders to begin with.)
No method of taking a screenshot was implemented at all. vo_opengl
lacked window screenshotting, because ANGLE doesn't allow reading the
frontbuffer. There was no way to read back from a D3D11 texture either.
Implement reading image data from D3D11 textures. This is a low-quality
effort to get basic screenshots done. Eventually there will be a better
implementation: once we use AVHWFramesContext natively, the readback
implementation will be in libavcodec, and will be able to cache the
staging texture correctly. Hopefully. (For now it doesn't even have an
AVHWFramesContext for D3D11 yet. But the abstraction is more appropriate
for this purpose.)
OK, this was dumb. The file didn't have much to do with ANGLE, and the
functionality can simply be moved to d3d.c. That file contains helpers
for decoding, but can always be present (on Windows) since it doesn't
access any D3D specific libavcodec APIs. Thus it doesn't need to be
conditionally built like the actual hwaccel wrappers.
Instead of hard-coding a big list, move some of the functionality
to csputils. Affects both the auto-guess blacklist and the peak
estimation.
Also update the comments.
Too many "exceptions" these days, it's easier to just hard-code a
whitelist instead of a blacklist. And besides, it only really makes
sense to avoid adaptation for BT.601 specifically, since that's the one
we auto-guess based on the resolution.
I'm not even sure why we ever consulted *_src to begin with, since that
just describes the current image format - and not the original metadata.
(And in fact, we specifically had logic to work around the implications
this had on linear scaling)
image_params is *the* authoritative source on the intended (i.e.
reference) image metadata, whereas *_src may be changed by previous
passes already. So only consult image_params for picking auto-generated
values.
Also, add some more missing "wide gamut" and "non-gamma" curves to the
autoconfig blacklist. (Maybe it would make sense to move this list to
csputils in the future? Or perhaps even auto-detect it based on the
associated primaries)
User request and not that hard. Closes #3157.
Note that FFmpeg doesn't support this and there's no signalling in HEVC
etc., so the only way users can access it is by using vf_format
manually.
Mind: This encoding uses full range values, not TV range.
This HDR function is unique in that it's still display-referred, it just
allows for values above the reference peak (super-highlights). The
official standard doesn't actually document this very well, but the
nominal peak turns out to be exactly 12.0 - so we normalize to this
value internally in mpv. (This lets us preserve the property that the
textures are encoded in the range [0,1], preventing clipping and making
the best use of an integer texture's range)
This was grouped together with SMPTE ST2084 when checking libavutil
compatibility since they were added in the same release window, in a
similar timeframe.
The main framebuffer is not the default framebuffer for the dxinterop
backend. Bind the main framebuffer and use the appropriate attachment
when reading the window content.
Fix #3284
The size check introduced in commit d941a57b did not consider that Xv
can round up the image size to the next chroma boundary. Doing that
makes sense, so it certainly can't be considered server misbehavior.
Do 2 things against this: allow if the server returns a larger image (we
just crop it then), and also allocate a properly aligned image in the
first place.
The GL_ARB_timer_query extension and thus the GL_TIME_ELAPSED constant
don't exist for GLES.
For ES, the EXT_disjoint_timer_query extension is used, so take the
constant from that; otherwise, provide the constant manually.
See PR #3216, which introduced this error.
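In practice, "provide the constant manually" amounts to something like this
(the EXT extension defines the same numeric value under the _EXT suffix):

    // GL_TIME_ELAPSED is missing from the GLES headers; EXT_disjoint_timer_query
    // defines the same value as GL_TIME_ELAPSED_EXT.
    #ifndef GL_TIME_ELAPSED
    #define GL_TIME_ELAPSED 0x88BF
    #endif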
Until now, we've always converted vdpau video surfaces to RGB, and then
mapped the resulting RGB texture. Change this so that the surface is
mapped as NV12 plane textures.
The reason this wasn't done until now is because vdpau surfaces are
mapped in an "interlaced" way as separate fields, even for progressive
video. This requires messy reinterleaving. It turns out that even
though it's an extra processing step, the result can be faster than
going through the video mixer for RGB conversion.
Other than some potential speed-gain, doing this has multiple other
advantages. We can apply our own color conversion, which is important in
more complex cases. We can correctly apply debanding and potentially
other processing that requires chroma-specific or in-YUV handling.
If deinterlacing is enabled, this switches back to the old RGB
conversion method. Until we have at least a primitive deinterlacer in
vo_opengl, this will stay this way. The d3d11 and vaapi code paths are
similar. (Of course these don't require any crazy field reinterleaving.)
1. This basically reverts commit de4c74e5a4.
Even with CVDisplayLinkCreateWithActiveCGDisplays and
CVDisplayLinkSetCurrentCGDisplayFromOpenGLContext we still have to
explicitly set the current display ID, otherwise it will just always
choose the display with the lowest refresh rate. Another weird thing is
that we still have to set the display ID a second time with
CVDisplayLinkSetCurrentCGDisplay after the link was started; otherwise
the display period is 0 and the fallback will be used.
If we ever use the callback method for something useful, it's probably
better to use CVDisplayLinkCreateWithActiveCGDisplays, since we will need
to keep the display link around instead of releasing it at the end.
In that case we have to call CVDisplayLinkSetCurrentCGDisplay two times,
once before and once after LinkStart.
2. Add a windowDidChangeScreen delegate to update the display refresh rate
when mpv is moved to a different screen.
We have two problems here.
1. CVDisplayLinkGetActualOutputVideoRefreshPeriod, as the name suggests,
returns a frame period and not a refresh rate. Using this as screen_fps
just leads to a slideshow. Why didn't this break video playback on OS X
completely? The answer to this leads us to the second problem.
2. It seems that CVDisplayLinkGetActualOutputVideoRefreshPeriod always
returns 0 if used without CVDisplayLinkSetOutputCallback, and hence we
always fell back to CVDisplayLinkGetNominalOutputVideoRefreshPeriod. Adding
a callback to CVDisplayLink solves this problem. The callback function
currently doesn't do anything, but could possibly be used in the future.
This can also remove all the stuff for lazily attaching the texture. It
doesn't matter if the dxinterop backend changes the bound framebuffer
during a VOCTRL, since the renderer does not rely on the GL state being
preserved.
The previous few commits changed sd_lavc.c's output to packed RGB sub-
images. In particular, this means all sub-bitmaps are part of a larger,
single bitmap. Change the vo_opengl OSD code such that it can make use
of this, and upload the pre-packed image, instead of packing and copying
them again.
This complicates the upload code a bit (4 code paths due to messy PBO
handling). The plan is to make sub-bitmaps always packed, but some more
work is required to reach this point. The plan is to pack libass images
as well. Since this implies a copy, this will make it easy to refcount
the result.
(This is all targeted towards vo_opengl. Other VOs, vo_xv, vo_x11, and
vo_wayland in particular, will become less efficient. Although at least
vo_vdpau and vo_direct3d could be switched to the new method as well.)
Until now, subtitle renderers could export SUBBITMAP_INDEXED, which is an
8 bit per pixel format with a palette. sd_lavc.c was the only renderer
doing this, and the result was converted to RGBA in every use-case
(except maybe when the subtitles were hidden.)
Change it so that sd_lavc.c converts to RGBA on its own. This simplifies
everything a bit, and the palette handling can be removed from the
common code.
This is also preparation for making subtitle images refcounted. The
"caching" in img_convert.c is a PITA in this respect, and needs to be
redone. So getting rid of some img_convert.c code is a positive side-
effect. Also related to refcounted subtitles is packing them into a
single mp_image. Fewer objects to refcount is easier, and for the libass
format the same will be done. The plan is to remove manual packing from
the VOs which need single images entirely.
Apply the padding internally to each input bitmap, instead of requiring
this for the semi-public API.
Right now, everything still uses packer_pack_from_subbitmaps() to fill
the input bitmap sizes, but that's going to change with the following
commit. Since bitmap_packer.in is mutated during packing anyway, it's
more convenient to add the padding automatically.
Also, guarantee that every sub-bitmap has a padding border around it.
Don't let the padding overlap. Add padding even on the containing
borders.
This is simpler, doesn't cost much in memory usage, and is convenient
for one of the following commits.
This is the ES equivalent to ARB_timer_query. It enables the performance
timers on ANGLE. All the added functions should be identical in
semantics to their desktop GL equivalents.
The OpenGL 3.0+ and ES specs are quite clear on what values are
accepted for the attachment object name parameter. And there's no
overlap for the default framebuffer. Sigh.
Probably fixes Mesa raising an error in this case and might fix #3251.
Regression by the previous vo_opengl change.
Until now, we've used system-specific API (GLX, EGL, etc.) to retrieve
the depth of the default framebuffer. (We equal this to display depth
and use the determined depth for dithering.)
We can actually retrieve this value through standard GL API, and it
works everywhere (except GLES 2 of course). This simplifies everything a
great deal.
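A hedged sketch of the standard-GL query (GL 3.0+ / GLES 3.0; the helper
name is made up, and it assumes GL headers/a loader providing these
entry points — on GLES the attachment name is GL_BACK instead of GL_BACK_LEFT):

    static int default_framebuffer_depth(void)
    {
        GLint r = 0, g = 0, b = 0;
        glGetFramebufferAttachmentParameteriv(GL_FRAMEBUFFER, GL_BACK_LEFT,
            GL_FRAMEBUFFER_ATTACHMENT_RED_SIZE, &r);
        glGetFramebufferAttachmentParameteriv(GL_FRAMEBUFFER, GL_BACK_LEFT,
            GL_FRAMEBUFFER_ATTACHMENT_GREEN_SIZE, &g);
        glGetFramebufferAttachmentParameteriv(GL_FRAMEBUFFER, GL_BACK_LEFT,
            GL_FRAMEBUFFER_ATTACHMENT_BLUE_SIZE, &b);
        // Use e.g. the smallest component size when deciding how to dither.
        int depth = r < g ? r : g;
        return b < depth ? b : depth;
    }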
egl_helpers.c is empty now. But I expect that some EGL boilerplate will
be moved to it, so don't remove it yet.
This was done under the assumption that ANGLE would somehow
always use RG in ES2 mode. But there's no basis for this. Even if ANGLE
supports NV12 textures with drivers that do not allow for texture_rg,
this case is too obscure to worry about. So do the robust and correct
thing instead, and disable this code if texture_rg is not available.
Commit 74e3d11 resulted in the background overlay not getting destroyed
when mpv quits. Add back a piece of code that was removed in that commit
to restore correct functionality.
Fixes issue #3100
This is a common idiom used in MSDN docs and Raymond Chen's example
programs to get a HINSTANCE for the current module, regardless of
whether it's an .exe or a .dll. Using GetModuleHandle(NULL) for this is
technically incorrect, since it always gets a handle to the .exe, even
when the executing code (in libmpv) is running in a .dll. In this case,
using the wrong HINSTANCE could cause namespace issues with window
classes, since CreateWindowEx uses the HINSTANCE to search for the
matching window class name.
See:
https://blogs.msdn.microsoft.com/oldnewthing/20050418-59/?p=35873
https://blogs.msdn.microsoft.com/oldnewthing/20041025-00/?p=37483
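The idiom in question (as given in the linked articles) looks like this:

    #include <windows.h>

    // __ImageBase is a linker-provided symbol located at the base of whatever
    // module this code is linked into, so its address is that module's
    // HINSTANCE, whether the module is an .exe or a .dll.
    EXTERN_C IMAGE_DOS_HEADER __ImageBase;
    #define HINST_THISCOMPONENT ((HINSTANCE)&__ImageBase)

RegisterClassEx/CreateWindowEx are then passed HINST_THISCOMPONENT instead of
GetModuleHandle(NULL).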
There were two mistakes:
- SDL's RGB565 is always equivalent to ffmpeg's RGB565 (both are packed 16bit
native-endian integers in RGB=565 form) - this was wrongly reversed on
big endian platforms.
- SDL's RGB888 doesn't actually mean RGB24, but XRGB8888 (i.e. 32bit
packed integer, top 8 bits unused).
- Use RGB0 not RGBA when there is no alpha.
This is mainly for the nnedi3 user shader. With all of its NN weights
hardcoded into the shader source code, the shader file could be as
large as 300 kB.
Although D3D11 video decoding is unsupported on Windows 7, the
associated APIs almost work. Where they fail is texture creation, where
we try to create D3D11_BIND_DECODER surfaces. So specifically try to
detect this situation.
One issue is that once the hwdec interop is created, the damage is done,
and it can't use another backend (because currently only 1 hwdec backend
is supported). So that's where we prevent attempts to use it.
It still can fail when trying to use d3d11va-copy (since that doesn't
require an interop backend), but at that point we don't care anymore -
dxva2(-copy) is tried before that anyway.
User hooks can now use an extra WHEN expression to specify when the
shader should be run. For example, this can be used to only run a chroma
scaling shader `WHEN CHROMA.w LUMA.w <`.
There's a slight semantics change to user shaders: When trying to bind a
texture that does not exist, a shader will now be silently skipped
(similar to when the condition is false) instead of generating an error.
This allows shader stages to depend on an optional earlier stage without
having to copy/paste the same condition everywhere.
(In other words: there's an implicit condition on all of the bound
textures existing)
When using --hwdec=auto, systems that don't provide
D3D11_CREATE_DEVICE_VIDEO_SUPPORT, which probably includes all Windows
Vista and 7 systems, will print an error message. Reduce the log level
to verbose when probing and skip the error message entirely if d3d11.dll
is not present.
This commit is in a similar spirit to 991af7d.
For clang, it's enough to just put (void) around usages we are
intentionally ignoring the result of.
Since GCC does not seem to want to respect this decision, we are forced
to disable the warning globally.
The default behavior of vo_opengl has pretty much always been 'show the
source colors as-is, without caring to adapt it to the target device'.
This decision is mostly based on the fact that if we do anything else,
lots of people will complain.
With the rise of content like BT.2020, however, it turns out more people
complain about this content being very desaturated than people complain
about this content not matching VLC - so let's just map ultra-wide gamut
content back down to standard gamut by default.
Instead of measuring the actual upload time, this measures the
time needed to render + map the texture via vdpau. These numbers are
still useful, since they're part of the critical path.
This is plumbed through a new VOCTRL, VOCTRL_PERFORMANCE_DATA, and
exposed as properties render-time-last, render-time-avg etc.
All of these numbers are in microseconds, which gives a good precision
range when just outputting them via show-text. (Lua scripts can
obviously still do their own formatting etc.)
Signed-off-by: wm4 <wm4@nowhere>
To avoid blocking the CPU, we use 8 timer query objects and rotate through
them, only blocking until the last possible moment (before we need
access to them on the next iteration through the ring buffer). I tested
it out on my machine and 4 query objects were enough to guarantee
block-free querying, but the extra margin shouldn't hurt.
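A hedged sketch of the rotating-query idea (names are illustrative, error
handling and availability checks omitted; it assumes a loaded desktop GL
context with ARB_timer_query):

    #define NUM_QUERIES 8

    struct timer_ring {
        GLuint query[NUM_QUERIES];  // created once with glGenQueries()
        int pos;
    };

    static void timer_begin(struct timer_ring *t)
    {
        glBeginQuery(GL_TIME_ELAPSED, t->query[t->pos]);
    }

    static GLuint64 timer_end(struct timer_ring *t)
    {
        glEndQuery(GL_TIME_ELAPSED);
        t->pos = (t->pos + 1) % NUM_QUERIES;
        // The query we are about to reuse is the oldest one in the ring, so
        // its result is (almost certainly) available without blocking.
        GLuint64 res = 0;
        glGetQueryObjectui64v(t->query[t->pos], GL_QUERY_RESULT, &res);
        return res;  // GPU nanoseconds spent on that older frame
    }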
Frame render times are just output at the end of each frame, via MP_DBG.
This might be improved in the future. (In particular, I want to expose
these numbers as properties so that users get some more visible feedback
about render times)
Currently, we measure pass_render_frame and pass_draw_to_screen
separately because the former might be called multiple times due to
interpolation. Doing it this way gives more faithful numbers. Same goes
for frame upload times.
When ANGLE is using D3D11 and not running in DirectComposition mode,
DXGI will hook the video window's message loop and override Alt+Enter to
trigger a transition to exclusive fullscreen mode (which doesn't even
work with mpv's renderer for some reason.) This behaviour can be
disabled by getting a pointer to the IDXGIFactory associated with the
D3D11 device and calling MakeWindowAssociation with the appropriate
flags.
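A hedged sketch of that call chain in C (COM error handling omitted; the
helper name and the exact flag set are assumptions, using the C-style COM
macros from the Windows SDK headers):

    #include <d3d11.h>
    #include <dxgi.h>

    static void disable_dxgi_window_hooks(ID3D11Device *dev, HWND window)
    {
        IDXGIDevice *dxgi_dev = NULL;
        IDXGIAdapter *adapter = NULL;
        IDXGIFactory *factory = NULL;

        ID3D11Device_QueryInterface(dev, &IID_IDXGIDevice, (void **)&dxgi_dev);
        IDXGIDevice_GetAdapter(dxgi_dev, &adapter);
        IDXGIAdapter_GetParent(adapter, &IID_IDXGIFactory, (void **)&factory);

        // Stop DXGI from monitoring the message queue and from handling
        // Alt+Enter on this window.
        IDXGIFactory_MakeWindowAssociation(factory, window,
            DXGI_MWA_NO_WINDOW_CHANGES | DXGI_MWA_NO_ALT_ENTER);

        IDXGIFactory_Release(factory);
        IDXGIAdapter_Release(adapter);
        IDXGIDevice_Release(dxgi_dev);
    }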
Instead of implicitly resetting the options to defaults and then
applying the options, they're always applied on top of the current
options (in the same way adding new options to the CLI command line
will).
This does not apply to vo_opengl_cb, because that has an even worse mess
which I refuse to deal with.
Enable m_sub_options_copy() to copy nested sub-options, and also enable
it to create an option struct from defaults. We can get rid of most of
the crap in assign_options() now.
Calling handle_scaler_opt() to get a static allocation for scaler name
is still needed. It's moved to reinit_scaler(), which seems to be a
better place for it. Without it, dangling pointers could be created when
options are changed. (And in fact, this fixes possible dangling pointers
for window.name.) In theory we could create a dynamic copy, but that
seemed even more messy.
Chance of regressions.
Commit 026b75e7 actually enabled changing icc options at runtime (via
vo_cmdline), but it didn't quite work. In particular, changing the icc-
profile option just kept the old profile, because it was cached
accordingly.
As part of this, change gl_lcms.opts from a struct to a pointer to a
struct. We properly copy it, instead of allowing possibly dangling
strings, like it was done in a working but unclean way before.
Also, reinit the whole rendering chain when the auto icc profile
changes, just like it's done when icc options are changed.
Passing the bstr thing as a pointer makes no sense. Everywhere else, bstr
structs are passed by value because they're so small; they're only passed
as pointers when they're supposed to receive a return value.
Originally, video.c did not access any CMS things (other than lut3d
being set on it), but this has changed. In practice, almost all accesses
to it have moved to video.c. vo_opengl only created it, and set the auto
icc profile path.
Complete the move.
Some things wrt. option handling are a bit fishy. (But when is this not
the case.)
icc-profile-auto was not tested, but the distributed human CI will take
care of it.
It gets printed on every alt+tab or desktop switch under mutter and
weston, and offers no useful information since it's handled by
destroying the previous entry.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit will cause the wayland backend and vo to correctly report
the display frame rate. This didn't work as VOCTRL_GET_DISPLAY_FPS was
received way too early, before the window was created (and thus
current_output set).
The VO will now signal VO_EVENT_WIN_STATE after window initialization
and upon a resize.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This algorithm works really well. Setting it is a much better
"out-of-the-box" experience than just clipping, which will always look
ugly.
In other words, with this default, users of mpv will just be able to
play HDR content without even realizing it's HDR (pretty much).
Instead of doing HDR tone mapping on an ad-hoc basis inside
pass_colormanage, the reference peak of an image is now part of the
image params (alongside colorspace, gamma, etc.) and tone mapping is
done whenever peak_src != peak_dst.
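A toy illustration of that rule in C (a Reinhard-style curve, not
necessarily the exact operator mpv uses; values are in linear light with
1.0 = SDR reference white, and only the compression direction is shown):

    static float tone_map(float v, float peak_src, float peak_dst)
    {
        if (peak_src <= peak_dst)
            return v;  // nothing to compress
        float L = v / peak_dst;
        float p = peak_src / peak_dst;
        // Extended Reinhard: maps [0, p] smoothly onto [0, 1], hitting 1.0
        // exactly at the source peak.
        float mapped = L * (1.0f + L / (p * p)) / (1.0f + L);
        return mapped * peak_dst;
    }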
To get sensible behavior when mixing HDR and SDR content and displays,
target-brightness is a generic filler for "the assumed brightness of SDR
content".
This gets rid of the weird display_scaled hack, sets the framework
for multiple HDR functions with different reference peaks, and allows
us to (in a future commit) autodetect the right source peak from
the HDR metadata.
(Apart from metadata, the source peak can also be controlled via
vf_format. For HDR content this adjusts the overall image brightness,
for SDR content it's like simulating a different exposure)
The wayland protocol exposes scaling done by the compositor to
compensate for small window sizes on small high DPI displays. If the
program ignores the scaling, the compositor will ask the program for a
surface N times smaller than the window size and then upscale the
program's surface by N times. The scaling algorithm seems to be bilinear,
so the scaling is quite obvious.
This commit sets up callbacks to listen for the scaling factor of each
output and, on rescale events, notifies the compositor that the
surface's scale is what the compositor asked for and changes the
player's surface to the appropriate size, causing no scaling to be done
by the compositor.
Compositors not supporting this interface will ignore the callbacks and do
nothing, keeping program behaviour the same. For compositors supporting
and using this interface (mutter), this will fix the rendering to be pixel
precise as it should be.
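A hedged sketch of the two pieces involved, using the stock wayland-client
API (function names are illustrative, not mpv's actual code):

    #include <wayland-client.h>

    static void output_geometry(void *data, struct wl_output *out, int32_t x,
                                int32_t y, int32_t phys_w, int32_t phys_h,
                                int32_t subpixel, const char *make,
                                const char *model, int32_t transform) {}
    static void output_mode(void *data, struct wl_output *out, uint32_t flags,
                            int32_t w, int32_t h, int32_t refresh) {}
    static void output_done(void *data, struct wl_output *out) {}

    static void output_scale(void *data, struct wl_output *out, int32_t factor)
    {
        *(int *)data = factor;  // remember this output's scale factor
    }

    static const struct wl_output_listener output_listener = {
        output_geometry, output_mode, output_done, output_scale,
    };

    static void apply_scale(struct wl_surface *surface, int scale,
                            int logical_w, int logical_h)
    {
        // Tell the compositor the buffer is already rendered at this scale,
        // and size the buffer to scale * logical size, so the compositor
        // does no scaling of its own.
        wl_surface_set_buffer_scale(surface, scale);
        // ...resize the EGL window / SHM buffer to logical_w * scale etc.
    }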
Both the opengl wayland backend and the wayland vo have been fixed to support
this. Verified to not break on either weston or mutter.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Position the window around the original window center on video size change
(when switching to the next file with different resolution, for example)
instead of keeping the position of its top-left corner fixed.
We now have a video filter that uses the d3d11 video processor, so it
makes no sense to have one in the VO interop code. The VO uses it for
formats not directly supported by ANGLE (so the video data is converted
to a RGB texture, which ANGLE can take in).
Change this so that the video filter is automatically inserted if
needed. Move the code that maps RGB surfaces to its own interop backend.
Add a bunch of new image formats, which are used to enforce the new
constraints, and to automatically insert the filter only when needed.
The added vf mechanism to auto-insert the d3d11vpp filter is very dumb
and primitive, and will work only for this specific purpose. The format
negotiation mechanism in the filter chain is generally not very pretty,
and mostly broken as well. (libavfilter has a different mechanism, and
these mechanisms don't match well, so vf_lavfi uses some sort of hack.
It only works because hwaccel and non-hwaccel formats are strictly
separated.)
The RGB interop is now only used with older ANGLE versions. The only
reason I'm keeping it is because it's relatively isolated (uses only
existing mechanisms and adds no new concepts), and because I want to be
able to compare the behavior of the old code with the new one for
testing. It will be removed eventually.
If ANGLE has NV12 interop, P010 is now handled by converting to NV12
with the video processor, instead of converting it to RGB and using the
old mechanism to import that as a texture.
This avoids a copy of the video image and lowers vsync jitter. Since
there are now two options to add to the window_attribs list, it has been
made dynamic.
A lot of real-world shaders start off with comments explaining the usage
or license, generating lots of "empty" passes. This simple change allows
us to skip them, which silences the warning spam and prevents us from
having to store and copy around these empty passes.
It also adds a more useful failure check: Attempting to use a user
shader that doesn't define any passes at all.
This requires the GL_EXT_texture_norm16 extension and works in ANGLE.
A default precision had to be set for sampler3Ds, otherwise the shaders
would fail to compile.
Remove the opengl-hq option default that caused it not to autoselect
ANGLE (unlike --vo=opengl). For details, see commit d5df90a2.
Back then the intention was to use ANGLE by default, since it integrates
much nicer with the Windows compositor (instead of native OpenGL, which
tends to cause crazy glitches). On the other hand, many opengl-hq
capabilities are not available with older ANGLE builds, so it didn't
make any sense to autoselect ANGLE for it.
With the GL_EXT_texture_norm16 extension recently added to ANGLE, it has
essentially reached feature parity to desktop GL for the subset we are
using. (Even the integer texture hack for high bit depth input could be
dropped now.)
It (probably) still does not support nnedi3, due to the weird way the NN
coefficients are imported. Also, it uses half-floats instead of 16 bit
fixed-point textures for technical reasons, which implies about 5 bits
of precision loss. If anyone actually manages to distinguish the two
dithering texture formats in a double-blind test, I will fix it.
This must be called if a texture shared between D3D devices is updated.
Often enough, the shared devices will be the same device, but ANGLE
forces using shared surfaces. I suppose there is no guarantee the driver
will do the expected thing. Internally, the driver could for example not
insert the required barriers before the shared texture is used.
Fixes #320 (which is closed as 'not our problem' but eh)
Relevant xorg bug: https://bugs.freedesktop.org/show_bug.cgi?id=70931
For me this happened when (accidentally) trying to play an 8460x2812 jpg
file with mpv. Like the referenced bug, xvinfo reports "maximum XvImage
size: 8192 x 8192". So the returned XvImage is 8192x2812 and memory
corruption happens.
Only after handling this are BadShmSeg X11 errors shown.
Rename it to get out of OpenGL's namespace. The gl_ prefix is used by
other mpv functions, but no OpenGL ones.
The "slice" parameter was never actually used, and all callers passed 0
for it.
The main change is actually that we first copy to a "staging" memory
frame, and then upload this at once. The old non-PBO code called
glTexSubImage2D for each OSD sub-bitmap.
The new non-PBO code path is a bit faster now if there are many small
sub-bitmaps (on Linux/nVidia). It's also a bit simpler, so this is a
win.
(Although I don't particularly appreciate the mixed normal/PBO texture
code.)
Some of these checks became pointless after dropping ES 2.0 support for
extended filtering.
GL_EXT_texture_rg is part of core in ES 3.0, and we already check for
this version, so testing for the extension is redundant.
GL_OES_texture_half_float_linear is also always available, at least as
far as our needs go.
The functionality we need from GL_EXT_color_buffer_half_float is always
available in ES 3.2, and we explicitly check for ES 3.2, so reject this
extension if the ES version is new enough.
For some reason, GLES has no glMapBuffer, only glMapBufferRange.
GLES 2 has no buffer mapping at all, and GL 2.1 does not always have
glMapBufferRange. On those, PBOs remain unsupported (there's no reason to
care about GL 2.1 without the extension).
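For reference, the mapping path that is available looks roughly like this
(a hedged sketch with made-up wrapper names, assuming GL 3.0+/GLES 3.0 or
GL_ARB_map_buffer_range):

    static void *map_upload_buffer(GLuint pbo, size_t size)
    {
        glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);
        // Write-only mapping; the previous contents can be discarded.
        return glMapBufferRange(GL_PIXEL_UNPACK_BUFFER, 0, size,
                                GL_MAP_WRITE_BIT | GL_MAP_INVALIDATE_BUFFER_BIT);
    }

    static void unmap_upload_buffer(void)
    {
        glUnmapBuffer(GL_PIXEL_UNPACK_BUFFER);
        glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);
    }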
This doesn't actually work on ANGLE, and I have no idea why. (There are
artifacts on OSD, as if parts of the OSD data weren't copied.) It works
on desktop OpenGL and at least 1 other ES 3 implementation. Don't enable
it on ANGLE, I guess.
Not sure how much can be gained with this, as we can't use it properly
yet. For now, this is used only before rendering, which probably does
overwhelmingly nothing.
In the future, this should be used after temporary passes, which could
possibly reduce memory usage and even memory bandwidth usage, depending
on the drivers.