RepoMirrors/mpv - mpv

Commit Graph

Author	SHA1	Message	Date
wm4	9f595f3a80	vo_gpu: make screenshots use the GL renderer Using the GL renderer for color conversion will make sure screenshots will use the same conversion as normal video rendering. It can do this for all types of screenshots. The logic when to write 16 bit PNGs changes. To approximate the old behavior, we decide by looking whether the source video format has more than 8 bits per component. We apply this logic even for window screenshots. Also, 16 bit PNGs now always include an unused alpha channel. The reason is that FFmpeg has RGB48 and RGBA64 formats, but no RGB064. RGB48 is 3 bytes and usually not supported by GPUs for rendering, so we have to use RGBA64, which forces an alpha channel. Will break for users who use --target-trc and similar options. I considered creating a new gl_video context, but it could double GPU memory use, so I didn't. This uses FBOs instead of glGetTexImage(), because that increases the chance it could work on GLES (e.g. ANGLE). Untested. No support for the Vulkan and D3D11 backends yet. Fixes #5498. Also fixes #5240, because the code for reading back is not used with the new code path.	2018-02-11 17:45:51 -08:00
wm4	7b1e73139f	vo_gpu: add internal ability to skip osd/subs for rendering Needed for the following commit.	2018-02-11 17:45:51 -08:00
wm4	bff8cfe3f0	vo_gpu: use blit() only if target ra_tex supports it Even if RA_CAP_BLIT is set, this might just not be enabled for the target ra_tex.	2018-02-11 17:45:51 -08:00
Niklas Haas	4e7f4f10ce	vo_gpu: correctly infer HDR peak detection support The re-ordering of commits `e3d93fd` and `0870859` ended up swallowing the change which made the HDR tone mapping algorithm actually check for RA_CAP_NUM_GROUPS support.	2018-02-11 16:45:20 -08:00
Niklas Haas	4c2edecd7d	vo_gpu: refactor HDR peak detection algorithm The major changes are as follows: 1. Use `uint32_t` instead of `unsigned int` for the SSBO size calculation. This doesn't really matter, since a too-big buffer will still work just fine, but since `uint` is a 32-bit integer by definition this is the correct way to do it. 2. Pre-divide the frame_sum by the num_wg immediately at the end of a frame. This change was made to prevent overflow. At 4K screen size, this code is currently already very at risk of overflow, especially once I started playing with longer averaging sizes. Pre-dividing this out makes it just about fit into 32-bit even for worst-case PQ content. (It's technically also faster and easier this way, so I should have done it to begin with). Rename `frame_sum` to `frame_avg` to clearly signal the change in semantics. 3. Implement a scene transition detection algorithm. This basically compares the current frame's average brightness against the (averaged) value of the past frames. If it exceeds a threshold, which I experimentally configured, we reset the peak detection SSBO's state immediately - so that it just contains the current frame. This prevents annoying "eye adaptation"-like effects on scene transitions. 4. As a result of the previous change, we can now use a much larger buffer size by default, which results in a more stable and less flickery result. I experimented with values between 20 and 256 and settled on the new value of 64. (I also switched to a power-of-2 array size, because I like powers of two)	2018-02-11 16:45:20 -08:00
Niklas Haas	e3d93fde2f	vo_gpu: port HDR tone mapping algorithm from libplacebo The current peak detection algorithm was very bugged (which contributed to the excessive cross-frame flicker without long normalization) and also didn't take into account the frame average brightness level. The new algorithm both takes into account frame average brightness (in addition to peak brightness), and also computes the values in a more stable/correct way. (The old path was basically undefined behavior) In addition to improving the algorithm, we also switch to hable tone mapping by default, and try to enable peak computation automatically whever possible (compute shaders + SSBOs supported). We also make the desaturation milder, after extensive testing during libplacebo development. I also had to compensate a bit for the representational differences between mpv and libplacebo (libplacebo treats 1.0 as the reference peak, but mpv treats it as the nominal peak), but it shouldn't have caused any problems. This is still not quite the same as libplacebo, since libplacebo also allows tagging the desired scene average brightness on the output, and it also supports reading the scene average brightness from static metadata (MaxFALL) where available. But those changes are a bit more involved. It's possible we could also read this from metadata in the future, but we have problems communicating with AVFrames as it is and I don't want to touch the mpv colorimetry structs for the time being.	2018-02-05 23:11:18 -08:00
James Ross-Gowan	edb4970ca8	vo_gpu: check for RA_CAP_FRAGCOORD in dumb mode too The RA_CAP_FRAGCOORD checks apply to dumb mode as well, but they were after the check for dumb mode, which returns early, so they never ran. Fixes #5436	2018-01-30 20:22:58 +11:00
wm4	3c1566e736	video: fix crash with vdpau when reinitializing rendering Using vdpau will allocate additional textures for the reinterleaving step, which uninit_rendering() will free. This is a problem because the hwdec image remains mapped when reinitializing, so the reinterleaving textures are turned into dangling pointers. Fix this by freeing the reinterleave textures on full uninit instead. Fixes #5447.	2018-01-27 03:31:53 -08:00
Akemi	828f38e10d	video: change some remaining vo_opengl mentions to vo_gpu	2018-01-20 14:43:49 -08:00
wm4	342e36ea11	vo_gpu: skip DR for unsupported image formats DR (direct rendering) works by having the decoder decode into the GPU staging buffers, instead of copying the video data on texture upload. We did this even for formats unsupported by the GPU or the renderer. This "worked" because the staging memory is untyped, and the video frame was converted by libswscale to a supported format, and then uploaded with a copy using the normal non-DR texture upload path. Even though it "works", we don't gain anything from using the staging buffers for decoding, since we can't use them for upload anyway. Also, staging memory might be potentially limited (what really happens is up to the driver). It's easy to avoid, so just skip it in these cases.	2018-01-18 00:25:00 -08:00
wm4	07753bbb4a	vo_gpu: fix broken 10 bit via integer textures playback The check_gl_features(p) call here checks whether dumb mode can be used. It uses the field use_integer_conversion, which is set _after_ the call in the same function. Move check_gl_features() to the end of the function, when use_integer_conversion is finally set. Fixes that it tried to use bilinear filtering with integer textures. The bug disabled the code that is supposed to convert it to non-integer textures.	2018-01-17 22:59:15 -08:00
Niklas Haas	a42b8b1142	vo_gpu: attempt re-using the FBO format for p->output_tex This allows RAs with support for non-opaque FBO formats to use a more appropriate FBO format for the output tex, possibly enabling a more efficient blit operation. This requires distinguishing between real formats (which can be used to create textures) and fake formats (e.g. ra_gl's FBO hack).	2017-12-25 00:47:53 +01:00
Niklas Haas	dcda8bd36a	vo_gpu: aggressively prefer async compute On AMD devices, we only get one graphics pipe but several compute pipes which can (in theory) run independently. As such, we should prefer compute shaders over fragment shaders in scenarios where we expect them to be better for parallelism. This is amusingly trivial to do, and actually improves performance even in a single-queue scenario.	2017-12-25 00:47:53 +01:00
Niklas Haas	a3c9685257	vo_gpu: invalidate fbotex before drawing Don't discard the OSD or pass_draw_to_screen passes though. Could be faster on some hardware.	2017-12-25 00:47:53 +01:00
Niklas Haas	ba1943ac00	msg: reinterpret a bunch of message levels I've decided that MP_TRACE means “noisy spam per frame”, whereas MP_DBG just means “more verbose debugging messages than MSGL_V”. Basically, MSGL_DBG shouldn't create spam per frame like it currently does, and MSGL_V should make sense to the end-user and provide mostly additional informational output. MP_DBG is basically what I want to make the new default for --log-file, so the cut-off point for MP_DBG is if we probably want to know if for debugging purposes but the user most likely doesn't care about on the terminal. Also, the debug callbacks for libass and ffmpeg got bumped in their verbosity levels slightly, because being external components they're a bit less relevant to mpv debugging, and a bit too over-eager in what they consider to be relevant information. I exclusively used the "try it on my machine and remove messages from MSGL_* until it does what I want it to" approach of refactoring, so YMMV.	2017-12-15 22:28:47 -08:00
wm4	6047333f0b	video: remove code duplication by calling a hwdec loader helper Make gl_video_load_hwdecs() call gl_video_load_hwdecs_all() when all HW decoders should be loaded.	2017-12-11 20:44:59 +02:00
wm4	5196c34aec	video: properly initialize and set hwdec_interop Don't reset --gpu-hwdec-interop if vo_gpu uses dumb mode.	2017-12-11 20:44:59 +02:00
wm4	91586c3592	vo_gpu: make it possible to load multiple hwdec interop drivers Make the VO<->decoder interface capable of supporting multiple hwdec APIs at once. The main gain is that this simplifies autoprobing a lot. Before this change, it could happen that the VO loaded the "wrong" hwdec API, and the decoder was stuck with the choice (breaking hw decoding). With the change applied, the VO simply loads all available APIs, so autoprobing trickery is left entirely to the decoder. In the past, we were quite careful about not accidentally loading the wrong interop drivers. This was in part to make sure autoprobing works, but also because libva had this obnoxious bug of dumping garbage to stderr when using the API. libva was fixed, so this is not a problem anymore. The --opengl-hwdec-interop option is changed in various ways (again...), and renamed to --gpu-hwdec-interop. It does not have much use anymore, other than debugging. It's notable that the order in the hwdec interop array ra_hwdec_drivers[] still matters if multiple drivers support the same image formats, so the option can explicitly force one, if that should ever be necessary, or more likely, for debugging. One example are the ra_hwdec_d3d11egl and ra_hwdec_d3d11eglrgb drivers, which both support d3d11 input. vo_gpu now always loads the interop lazily by default, but when it does, it loads them all. vo_opengl_cb now always loads them when the GL context handle is initialized. I don't expect that this causes any problems. It's now possible to do things like changing between vdpau and nvdec decoding at runtime. This is also preparation for cleaning up vd_lavc.c hwdec autoprobing. It's another reason why hwdec_devices_request_all() does not take a hwdec type anymore.	2017-12-01 05:57:01 +01:00
wm4	4a6b04bdb9	vo_gpu: never pass flipped images to ra or ra backends Seems like the last refactor to this code broke playing flipped images, at least with --opengl-pbo --gpu-api=opengl. Flipping is rather a shitmess. The main problem is that OpenGL does not support flipped uploading. The original vo_gl implementation considered it important to handle the flipped case efficiently, so instead of uploading the image line by line backwards, it uploaded it flipped, and then flipped it in the renderer (basically the upload path ignored the flipping). The ra code and backends probably have an insane and inconsistent mix of semantics, so fix this by never passing it flipped images in the first place. In the future, the backends should probably support flipped images directly. Fixes #5097.	2017-11-10 10:06:33 +01:00
James Ross-Gowan	9b2dae79b1	vo_gpu: d3d11: add RA caps for ra_d3d11 ra_d3d11 uses the SPIR-V compiler to translate GLSL to SPIR-V, which is then translated to HLSL. This means it always exposes the same GLSL version that the SPIR-V compiler supports (4.50 for shaderc/glslang.) Despite claiming to support GLSL 4.50, some features that are tied to the GLSL version in OpenGL are not supported by ra_d3d11 when targeting legacy Direct3D feature levels. This includes two features that mpv relies on: - Reading from gl_FragCoord in the fragment shader (requires FL 10_0) - textureGather from any texture component (requires FL 11_0) These features have been exposed as new RA caps.	2017-11-07 20:27:13 +11:00
wm4	5261d1b099	vo_gpu: don't re-render hwdec frames when repeating frames Repeating frames (for display-sync) is not supposed to render the entire frame again. When using hardware decoding, it unfortunately did: the renderer uses the frame ID to check whether the frame data changed, and unmapping the hwdec frame clears it. Essentially reverts commit `761eeacf54`. Back then I probably thought it would be a good idea to release the hwdec image quickly in order to return it to the decoder, but they're referenced anyway. This should increase the performance and reduce GPU work.	2017-11-03 15:11:56 +01:00
Niklas Haas	c2d4fd0ef4	vo_gpu: change --tone-mapping-desaturate algorithm Comparing mpv's implementation against the ACES ODR reference samples and algorithms, it seems like they're happy desaturating highlights _way_ more aggressively than mpv currently does. And indeed, looking at some example clips like The Redwoods (which is actually well-mastered), the current desaturation produces unnatural-looking brightness fringes where the sky meets the treeline. Adjust the algorithm to make it apply to a much larger, more gradual brightness region; and change the interpretation of the parameter. As a bonus, the new parameter is actually sanely scaled (higher values = more desaturation). Also, make it scale based on the signal level instead of the luminance, to avoid under-desaturating bright blues.	2017-10-25 17:24:27 +02:00
wm4	d3c022779a	video: fix alpha handling Regression since `ec6e8a31e0`. Removal of the explicit else case always applies the conversion to premultiplied alpha in the else branch. We want to scale with multiplied alpha, but we don't want to multiply with alpha again on top of it. Fixes #4983, hopefully.	2017-10-19 19:01:33 +02:00
James Ross-Gowan	d9e3bad500	vo_gpu: add rgba16hf to the list of FBO formats This should be functionally identical to rgba16f, since the formats only differ in their representation on the CPU, but it could be useful for RA backends that don't expose rgba16f, like Vulkan. It's definitely useful for the WIP D3D11 backend.	2017-10-18 23:55:13 +11:00
wm4	c90f76d322	vo_gpu: fix video sometimes not being rerendered on equalizer change With video paused, changing the brightness controls (or similar) would sometimes not rerender the video frame. So the OSD would redraw, but the video wouldn't change. This is caused by output caching, and a redraw request is free to return the cached frame. Change it such to invalidate the cached frame if any of the options or the equalizer change. In theory, gl_video_reset_surfaces() could be called if the equalizer changes - this would apparently force interpolatzion to redraw all frames. But this looks kind of crappy when changing the equalizer during playback. It'll "eventually" use the correct settings anyway, and when paused interpolation is off.	2017-10-17 09:07:35 +02:00
Niklas Haas	eb69e73eb4	vo_gpu: enable 3DLUTs in dumb mode Unless FBOs are unsupported, this works. In particular, it's required to get ICC profiles working in voluntary dumb mode. So instead of blanket-disabling it, only disable it in the !have_fbo false case.	2017-09-30 19:03:34 +02:00
Niklas Haas	07fa5c8a8f	vo_gpu: fix --opengl-gamma redirect It still pointed at --gpu-gamma, but we decided on --gamma-factor instead.	2017-09-28 17:21:56 +02:00
Niklas Haas	791b9c4024	vo_gpu: set the correct number of vertex attribs This was always set to the length of the VAO, but it should have been set to the number of vertex attribs actually in use for this frame. No idea how that managed to survive the test framework on nvidia/linux, but ANGLE caught it.	2017-09-28 12:50:45 +02:00
Niklas Haas	67fd5882b8	vo_gpu: make the vertex attribs dynamic This has several advantages: 1. no more redundant texcoords when we don't need them 2. no more arbitrary limit on how many textures we can bind 3. (that extends to user shaders as well) 4. no more arbitrary limits on tscale radius To realize this, the VAO was moved from a hacky stateful approach (gl_sc_set_vertex_attribs) - which always bothered me since it was required for compute shaders as well even though they ignored it - to be a proper parameter of gl_sc_dispatch_draw, and internally plumbed into gl_sc_generate, which will make a (properly mangled) deep copy into params.vertex_attribs.	2017-09-28 01:54:38 +02:00
Niklas Haas	002a0ce232	vo_gpu: kill some static arrays This gets rid of the hard-coded limits on the number of hooks, textures and hook points.	2017-09-28 01:54:33 +02:00
Niklas Haas	47af509e1f	vo_gpu: attempt to avoid UBOs for dynamic variables This makes the radeon driver shut up about frequently updating STATIC_DRAW UBOs (--opengl-debug), and also reduces the amount of synchronization necessary for vulkan uniform buffers. Also add some extra debugging/tracing code paths. I went with a flags-based approach in case we ever want to extend this.	2017-09-26 17:25:35 +02:00
Niklas Haas	b0ba193b66	vo_gpu: handle texture initialization errors gracefully Tested by making the ra_tex_resize function always fail (apart from the initial FBO check). This required a few changes: 1. reset shaders on failed dispatch 2. reset cleanup binds on failed dispatch 3. fall back to initializing the struct image to 1x1 on failure 4. handle output_fbo_valid gracefully	2017-09-23 09:58:27 +02:00
Niklas Haas	f3ec494613	vo_gpu: reduce the --alpha=blend-tiles checkerboard intensity This was sort of grating by default and made it really hard to actually read e.g. text on top of a transparent background. I decided to approach the problem from both directions, making the whites darker and the grays lighter. This brings it closer to the dynamic range of e.g. the wikipedia transparent svg preview.	2017-09-22 21:14:27 +02:00
Niklas Haas	e3288c4597	vo_gpu: simplify compute shader coordinate calculation Since the removal of FBOTEX_FUZZY, this can be made slightly simpler.	2017-09-22 17:22:53 +02:00
Niklas Haas	62ddc85d17	vo_gpu: simplify structs / names Due to the plethora of historical baggage from different eras getting confusing, I decided to simplify and unify the struct organization and naming scheme. Structs that got renamed: 1. fbodst -> ra_fbo (and moved to gpu/context.h) 2. fbotex -> removed (redundant after `2af2fa7a`) 3. fbosurface -> surface 4. img_tex -> image In addition to these structs being renamed, all of the names have been made consistent. The new scheme is as follows: struct image img; struct ra_tex *tex; struct ra_fbo fbo; This also affects derived names, e.g. indirect_fbo -> indirect_tex. Notably also, finish_pass_fbo -> finish_pass_tex and finish_pass_direct -> finish_pass_fbo. The new equivalent of fbotex_change() is called ra_tex_resize(). This commit (should) contain no logic changes, just renaming a bunch of crap.	2017-09-22 16:58:55 +02:00
Niklas Haas	2af2fa7a27	vo_gpu: kill off FBOTEX_FUZZY I've observed the garbage pixels in more scenarios. They also were never really needed to begin with, originally being a discovered work-around for bug that we fixed since then anyway. Doesn't really seem to even help resizing, since the OpenGL drivers are all smart enough to pool resources internally anyway. Fixes #1814	2017-09-22 16:33:25 +02:00
wm4	fba927de41	options: properly handle deprecated options with CLI actions We want e.g. --opengl-shaders-append=foo to resolve to the new option, all while printing an option name. --opengl-shader is a similar case. These options are special, because they apply "actions" on actual options by specifying a suffix. So the alias/deprecation handling has to be part of resolving the actual option from prefix and suffix.	2017-09-22 11:31:03 +02:00
Niklas Haas	b940691784	vo_gpu: drop the RA_CAP_NESTED_ARRAY req from EWA compute Almost as fast as the old code, but more general. Notably, glslang doesn't support nested arrays. (cf. https://github.com/KhronosGroup/glslang/issues/1057) Also much cleaner code-wise, so I think I'll keep it even if glslang implements array_of_arrays.	2017-09-21 15:15:59 +02:00
Niklas Haas	db0fb3c48b	vo_gpu: fix gamma scale This never really made sense since the BT.1886 changes. It should get brighter for bright rooms, not darker for dark rooms. Picked some new values that seemed reasonable-ish.	2017-09-21 15:01:26 +02:00
Niklas Haas	e92effb14f	vo_gpu: describe the plane merging pass This can get left unknown if something hooks NATIVE	2017-09-21 15:01:22 +02:00
Niklas Haas	65979986a9	vo_opengl: refactor into vo_gpu This is done in several steps: 1. refactor MPGLContext -> struct ra_ctx 2. move GL-specific stuff in vo_opengl into opengl/context.c 3. generalize context creation to support other APIs, and add --gpu-api 4. rename all of the --opengl- options that are no longer opengl-specific 5. move all of the stuff from opengl/* that isn't GL-specific into gpu/ (note: opengl/gl_utils.h became opengl/utils.h) 6. rename vo_opengl to vo_gpu 7. to handle window screenshots, the short-term approach was to just add it to ra_swchain_fns. Long term (and for vulkan) this has to be moved to ra itself (and vo_gpu altered to compensate), but this was a stop-gap measure to prevent this commit from getting too big 8. move ra->fns->flush to ra_gl_ctx instead 9. some other minor changes that I've probably already forgotten Note: This is one half of a major refactor, the other half of which is provided by rossy's following commit. This commit enables support for all linux platforms, while his version enables support for all non-linux platforms. Note 2: vo_opengl_cb.c also re-uses ra_gl_ctx so it benefits from the --opengl- options like --opengl-early-flush, --opengl-finish etc. Should be a strict superset of the old functionality. Disclaimer: Since I have no way of compiling mpv on all platforms, some of these ports were done blindly. Specifically, the blind ports included context_mali_fbdev.c and context_rpi.c. Since they're both based on egl_helpers, the port should have gone smoothly without any major changes required. But if somebody complains about a compile error on those platforms (assuming anybody actually uses them), you know where to complain.	2017-09-21 15:00:55 +02:00

1 2

91 Commits