Now macosx_menubar.m and mpv.rc (win32) use the same copyright string.
(This is a bit roundabout, because mpv.rc can't use C constants. Also
the C code wants to avoid rebuilding real source files if only version.h
changed, so only version.c includes version.h.)
This seems to fix issues when building on windows where compiling mpv.rc
after a `waf clean` resulted in a failure because version.h was not
always present
Apparently some Intel drivers have a bug where copying from staging
buffers to constant buffers does not work. We used to keep a copy of the
buffer data in a staging buffer to enable partial constant buffer
updates. To work around this bug, keep the copy in talloc-allocated
system memory instead.
There doesn't seem to be any noticable performance difference from
keeping the copy in system memory. Our cbuffers are probably too small
for it to matter anyway.
See also: https://crbug.com/593024Fixes#5293
This reverts commit 9513165c99
and commit 4efe330efb.
I had changed --loop-file to interact with --start to work
the same way that --loop-playlist does. (That is, --loop-file
seeks to the --start time upon looping, not the beginning of
the file.) However, the consensus is that the old behavior is
preferred and the interaction with --loop-playlist is the one
that is incorrect.
In addition, this change introduced a bug in the interaction
between Quit-Watch-Later and --loop-file, where upon reaching
playback end it would seek to the resume timestamp, not the
start of the file.
As a result, this commit reverts that change.
The x264 hack requires reading the first video packet, which in turn we
handle with a hack in demux_mkv.c to get the packet without having to
add special crap to demux.c. Another useless MKV feature (which they
enabled by default at one point and which caused many demuxers to break
completely, only to disable it again when it was too late) conflicts
with this, because we actually pass a block as packet contents, instead
of after "decompression".
Fix this by calling demux_mkv_decode().
This fixes when resuming certain broken h264 files encoded by x264. See
FFmpeg commit 840b41b2a643fc8f0617c0370125a19c02c6b586 about the x264
bug itself.
Normally, the unregistered user data SEI (that contains the x264 version
string) is informational only. But libavcodec uses it to workaround a
x264 bug, which was recently fixed in both libavcodec and x264. The fact
that both encoder and decoder were buggy is the reason that it was not
found earlier, and there are apparently a lot of files around created by
the broken decoder. If libavcodec sees the SEI, this bug can be worked
around by using the old behavior.
If you resume a file with mpv (i.e. seeking when the file loads),
libavcodec never sees the first video packet. Consequently it has to
assume the file is not broken, and never applies the workaround,
resulting in garbage being played.
Fix this by always feeding the first video packet to the decoder on
init, and then flushing the codec (to avoid that an unwanted image is
output). Flushing the codec does not remove info such as the x264
version. We also abuse the fact that the first avcodec_send_packet()
always pushes the frame into the decoder (so we don't have to trigger
the decoder by requsting an output frame).
Technically, the user could just use --vd-lavc-o with the same result.
But I find it better to make this an explicit option, so we can document
the ups and downs, and also avoid setting it for non-h264.
This means that we now explicitly set an interval of 1. Although that
should be the EGL default, some drivers could possibly ignore this
(unconfirmed). In any case, this commit also allows disabling vsync, for
users who want it.
Crashed when no vdpau device was loaded. Also there was a mistake of not
setting p->ctx, which broke software surface input mode. This was not
found before, because p->ctx is not needed for anything else.
Fixes#5294.
Show total cache as well as demuxer cache separately.
This adjusts the presented values to be consistent with status line
and OSC modifications made in https://github.com/mpv-player/mpv/pull/5250
This commit introduces a new --oset-metadata key-value-list option,
allowing the user to specify output metadata when encoding
(eg. --oset-metadata=title="Hello",comment="World").
A second option --oremove-metadata is added to exclude existing metadata
from the output file (assuming --ocopy-metadata is enabled).
Not all output formats support all tags, but luckily libavcodec
simply discards unsupported keys.
--copy-metadata describes the result of the option better, (copying metadata
from the source file to the output file). Marks the old --no-ometadata
OPT_REMOVED with a suggestion for the new --no-ocopy-metadata.
A release has been made, so drop options deprecated for that release.
Also drop some options which have been deprecated a much longer time
before.
Also fix a typo in client-api-changes.rst.
The queue family index and the queue info index are not necessarily the
same, so we're forced to do a check based on the queue family index
itself.
Fixes#5049
A vulkan validation layer update pointed out that this was wrong; we
still need to use the access type corresponding to the stage mask, even
if it means our code won't be able to skip the pipeline barrier (which
would be wrong anyway).
In additiona to this, we're also not allowed to specify any source
access mask when transitioning from top_of_pipe, which doesn't make any
sense anyway.
Async compute in particular seems to cause problems on some drivers, and
even when supprted the benefits are not that massive from the tests I
have seen, so it's probably safe to keep off by default.
Async transfer on the other hand seems to work better and offers a more
substantial improvement, so it's kept on.
This gets confused by e.g. SPARSE_BIT on the TRANSFER_BIT, leading to
situations where "more specialized" is ambiguous and the logic breaks
down. So to fix it, only compare the subset we care about.
blit() implies scaling, copy() is the equivalent command to use when the
formats are compatible (same pixel size) and the rects have the same
dimensions.
This allows RAs with support for non-opaque FBO formats to use a more
appropriate FBO format for the output tex, possibly enabling a more
efficient blit operation.
This requires distinguishing between real formats (which can be used to
create textures) and fake formats (e.g. ra_gl's FBO hack).
On AMD devices, we only get one graphics pipe but several compute pipes
which can (in theory) run independently. As such, we should prefer
compute shaders over fragment shaders in scenarios where we expect them
to be better for parallelism.
This is amusingly trivial to do, and actually improves performance even
in a single-queue scenario.
Instead of using a single primary queue, we generate multiple
vk_cmdpools and pick the right one dynamically based on the intent.
This has a number of immediate benefits:
1. We can use async texture uploads
2. We can use the DMA engine for buffer updates
3. We can benefit from async compute on AMD GPUs
Unfortunately, the major downside is that due to the lack of QF
ownership tracking, we need to use CONCURRENT sharing for all resources
(buffers *and* images!). In theory, we could try figuring out a way to
get rid of the concurrent sharing for buffers (which is only needed for
compute shader UBOs), but even so, the concurrent sharing mode doesn't
really seem to have a significant impact over here (nvidia). It's
possible that other platforms may disagree.
Our deadlock-avoidance strategy is stupidly simple: Just flush the
command every time we need to switch queues, and make sure all
submission and callbacks happen in FIFO order. This required lifting the
cmds_pending and cmds_queued out from vk_cmdpool to mpvk_ctx, and some
functions died/got moved as a result, but that's a relatively minor
change.
On my hardware this is a fairly significant performance boost, mainly
due to async transfers. (Nvidia doesn't expose separate compute queues
anyway). On AMD, this should be a performance boost as well due to async
compute.
This is especially interesting for vulkan since it allows completely
skipping the layout transition as part of the renderpass. Unfortunately,
that also means it needs to be put into renderpass_params, as opposed to
renderpass_run_params (unlike #4777).
Closes#4777.
This uses the new vk_signal mechanism to order all access to textures.
This has several advantageS:
1. It allows real synchronization of image access across multiple frames
when using multiple queues for parallelism.
2. It allows using events instead of pipeline barriers, which is a
finer-grained synchronization primitive that allows for more
efficient layout transitions over longer durations.
This commit also restructures some of the implicit transition code for
renderpasses to be more flexible and correct. (Note: this technically
drops the ability to transition the image out of undefined layout when
not blending, but that was a bug anyway and needs to be done properly)
vo_gpu: vulkan: remove no-longer-true optimization
The change to the output_tex format makes this no longer true, and it
actually seems to hurt performance now as well. So just don't do it
anymore. I also realized it hurts performance when drawing an OSD, so
it's probably not a good idea anyway.
This combines VkSemaphores and VkEvents into a common umbrella
abstraction which can resolve to either.
We aggressively try to prefer VkEvents over VkSemaphores whenever the
conditions are met (1. we can unsignal the semaphore, i.e. it comes from
the same frame; and 2. it comes from the same queue).
Instead of being submitted immediately, commands are appended into an
internal submission queue, and the actual submission is done once per
frame (at the same time as queue cycling). Again, the benefits are not
immediately obvious because nothing benefits from this yet, but it will
make more sense for an upcoming vk_signal mechanism.
This also cleans up the way the ra_vk submission interacts with the
synchronization/callbacks from the ra_vk_ctx. Although currently, the
way the dependency is signalled is a bit hacky: normally it would be
associated with the ra_tex itself and waited on in the appropriate stage
implicitly. But that code is just temporary, so I'm keeping it in there
for a better commit order.