Technically, the user could just use --vd-lavc-o with the same result.
But I find it better to make this an explicit option, so we can document
the ups and downs, and also avoid setting it for non-h264.
This commit introduces a new --oset-metadata key-value-list option,
allowing the user to specify output metadata when encoding
(eg. --oset-metadata=title="Hello",comment="World").
A second option --oremove-metadata is added to exclude existing metadata
from the output file (assuming --ocopy-metadata is enabled).
Not all output formats support all tags, but luckily libavcodec
simply discards unsupported keys.
--copy-metadata describes the result of the option better, (copying metadata
from the source file to the output file). Marks the old --no-ometadata
OPT_REMOVED with a suggestion for the new --no-ocopy-metadata.
Async compute in particular seems to cause problems on some drivers, and
even when supprted the benefits are not that massive from the tests I
have seen, so it's probably safe to keep off by default.
Async transfer on the other hand seems to work better and offers a more
substantial improvement, so it's kept on.
Instead of using a single primary queue, we generate multiple
vk_cmdpools and pick the right one dynamically based on the intent.
This has a number of immediate benefits:
1. We can use async texture uploads
2. We can use the DMA engine for buffer updates
3. We can benefit from async compute on AMD GPUs
Unfortunately, the major downside is that due to the lack of QF
ownership tracking, we need to use CONCURRENT sharing for all resources
(buffers *and* images!). In theory, we could try figuring out a way to
get rid of the concurrent sharing for buffers (which is only needed for
compute shader UBOs), but even so, the concurrent sharing mode doesn't
really seem to have a significant impact over here (nvidia). It's
possible that other platforms may disagree.
Our deadlock-avoidance strategy is stupidly simple: Just flush the
command every time we need to switch queues, and make sure all
submission and callbacks happen in FIFO order. This required lifting the
cmds_pending and cmds_queued out from vk_cmdpool to mpvk_ctx, and some
functions died/got moved as a result, but that's a relatively minor
change.
On my hardware this is a fairly significant performance boost, mainly
due to async transfers. (Nvidia doesn't expose separate compute queues
anyway). On AMD, this should be a performance boost as well due to async
compute.
This uses the new vk_signal mechanism to order all access to textures.
This has several advantageS:
1. It allows real synchronization of image access across multiple frames
when using multiple queues for parallelism.
2. It allows using events instead of pipeline barriers, which is a
finer-grained synchronization primitive that allows for more
efficient layout transitions over longer durations.
This commit also restructures some of the implicit transition code for
renderpasses to be more flexible and correct. (Note: this technically
drops the ability to transition the image out of undefined layout when
not blending, but that was a bug anyway and needs to be done properly)
vo_gpu: vulkan: remove no-longer-true optimization
The change to the output_tex format makes this no longer true, and it
actually seems to hurt performance now as well. So just don't do it
anymore. I also realized it hurts performance when drawing an OSD, so
it's probably not a good idea anyway.
I don't want to add another field to display stream and demuxer cache
separately, so just add them up. This strangely makes sense, since the
forward buffered stream cache amount consists of data not read by the
demuxer yet. (If the demuxer cache has buffered the full stream, the
forward buffered stream cache amount is 0.)
Reduce it from 75MB in both directions (forward/backwards) to 10MB each.
The stream cache is kind of becoming useless in favor of the demuxer
cache. Using both doesn't make much sense, because they will contain
duplicated data for no reason.
Still leave it at 10MB, which may help with mp4 a bit. libavformat's mp4
demuxer tends to seek too much, so we try to avoid triggering network
level seeks by having some caching in the stream layer.
Some old crap which nobody needs and which probably nobody uses.
This relies on a GCC extension: using "## __VA_ARGS__" to remove the
comma from the argument list if the va args are empty. It's supported
by clang, and there's some chance newer standards will introduce a
proper way to do this. (Even if it breaks somewhere, it will be a
problem only for 1 release, since I want to drop the deprecated
properties immediately.)
This was off for mpv CLI, but on for libmpv. The motivation behind this
was that it would be confusing for applications if libmpv continued
playback in a severely "degraded" way (without either audio or video),
and that it would be better to fail early.
In reality the behavior was just a confusing difference to mpv CLI, and
has confused actual users as well. Get rid of it.
Not bothering with a version bump, since this is so minor, and it's easy
to ensure compatibility in affected applications by just setting the
option explicitly.
(Also adding the missing next-release-marker in client-api-changes.rst.)
Annoying exception that makes no sense to keep. Normally, users or
client applications will either use --hwdec=auto, or not set the option
at all, which both leads to the expected result.
This commit introduces mp.utils.file_info() for querying information
on file paths, implemented for both Lua and Javascript.
The function takes a file path as an argument and returns a Lua table /
JS object upon success. The table/object will contain the values:
mode, size, atime, mtime, ctime and the convenience booleans is_file, is_dir.
On error, the Lua side will return `nil, error` and the Javascript side
will return `undefined` (and mark the last error).
This feature utilizes the already existing cross-platform `mp_stat()`
function.
This was a bit confused, and I bet nobody understood whether to use
--sub-file or --sub-files, and what the difference is. Explicitly
mention that both variants exist, and how they are related.
Previously when using a libmpv instance to play multiple videos,
once --start was set there was no clear way to unset it. You could
use --start=0, but 0 does not always mean the beginning of the file
(especially when using --rebase-start-time=no). Looking up the start
timestamp and passing that in also does not always work, particularly
when the first timestamp is negative (since negative values to --start
have a special meaning).
This commit adds a new "none" value which maps to the internal
REL_TIME_NONE, matching the default value of the play_start option.
Most options that change the playback endpoint coexist and playback
stops when it reaches any of them. (e.g. --ab-loop-b, --end, or
--chapter). This patch extends that behavior to --length so it isn't
automatically trumped by --end if both are present. These two will
interact now as the other options do.
This change is also documented in DOCS/man/options.rst.
The libavcodec mediacodec support does not conform to the new hwaccel
APIs yet. It has been agreed uppon that this glue code can be deleted
for now, and support for it will be restored at a later point.
Readding would require that it supports the AVCodecContext.hw_device_ctx
API. The hw_device_ctx would then contain the surface ID.
vo_mediacodec_embed would actually perform the task of creating
vo.hwdec_devs and adding a mp_hwdec_ctx, whose av_device_ref is a
AVHWDeviceContext containing the android surface.
Make the VO<->decoder interface capable of supporting multiple hwdec
APIs at once. The main gain is that this simplifies autoprobing a lot.
Before this change, it could happen that the VO loaded the "wrong" hwdec
API, and the decoder was stuck with the choice (breaking hw decoding).
With the change applied, the VO simply loads all available APIs, so
autoprobing trickery is left entirely to the decoder.
In the past, we were quite careful about not accidentally loading the
wrong interop drivers. This was in part to make sure autoprobing works,
but also because libva had this obnoxious bug of dumping garbage to
stderr when using the API. libva was fixed, so this is not a problem
anymore.
The --opengl-hwdec-interop option is changed in various ways (again...),
and renamed to --gpu-hwdec-interop. It does not have much use anymore,
other than debugging. It's notable that the order in the hwdec interop
array ra_hwdec_drivers[] still matters if multiple drivers support the
same image formats, so the option can explicitly force one, if that
should ever be necessary, or more likely, for debugging. One example are
the ra_hwdec_d3d11egl and ra_hwdec_d3d11eglrgb drivers, which both
support d3d11 input.
vo_gpu now always loads the interop lazily by default, but when it does,
it loads them all. vo_opengl_cb now always loads them when the GL
context handle is initialized. I don't expect that this causes any
problems.
It's now possible to do things like changing between vdpau and nvdec
decoding at runtime.
This is also preparation for cleaning up vd_lavc.c hwdec autoprobing.
It's another reason why hwdec_devices_request_all() does not take a
hwdec type anymore.
These couldn't be relicensed, and won't survive the LGPL transition. The
other existing filters are mostly LGPL (except libaf glue code).
This remove the deprecated pan option. I guess it could be restored by
inserting a libavfilter filter (if there's one), but for now let it be
gone.
This temporarily breaks volume control (and things related to it, like
replaygain).
The internal stereo3d filter was removed due to being GPL only, and due
to being a mess that somehow used libavfilter's filter. Without this
filter, it's hard to remove our internal stereo3d image attribute, so
even using libavfilter's stereo3d filter would not work too well (unless
someone fixes it and makes it able to use AVFrame metadata, which we
then could mirror in mp_image).
This was never well thought-through anyway, so just drop it. I think
some "downsampling" support would still make sense, maybe that can be
readded later.
Almost all of them had their guts removed and replaced by libavfilter
long ago, but remove them anyway. They're pointless and have been
scheduled for deprecation.
Still leave vf_format (because we need it in some form) and vf_sub (not
sure).
This will break some builtin functionality: lavfi yadif defaults are
different, auto rotation and stereo3d downconversion are broken. These
might be fixed later.
The option for enabling it has now an "auto" choice, which is the
default, and which will enable it if the media is thought to be via
network or if the stream cache is enabled (same logic as --cache-secs).
Also bump the --cache-secs default from 10 to 120.
Some back buffer is required to make the immediate forward range
seekable. This is because the back buffer limit is strictly enforced.
Just set a rather high back buffer by default. It's not use if
--demuxer-seekable-cache is disabled, so this is without risk.
Until now, the demuxer cache was limited to a single range. Extend this
to multiple range. Should be useful for slow network streams.
This commit changes a lot in the internal demuxer cache logic, so
there's a lot of room for bugs and regressions. The logic without
demuxer cache is mostly untouched, but also involved with the code
changes. Or in other words, this commit probably fucks up shit.
There are two things which makes multiple cached ranges rather hard:
1. the need to resume the demuxer at the end of a cached range when
seeking to it
2. joining two adjacent ranges when the lowe range "grows" into it (and
resuming the demuxer at the end of the new joined range)
"Resuming" the demuxer means that we perform a low level seek to the end
of a cached range, and properly append new packets to it, without adding
packets multiple times or creating holes due to missing packets.
Since audio and video never line up exactly, there is no clean "cut"
possible, at which you could resume the demuxer cleanly (for 1.) or
which you could use to detect that two ranges are perfectly adjacent
(for 2.). The way how the demuxer interleaves multiple streams is also
unpredictable. Typically you will have to expect that it randomly allows
one of the streams to be ahead by a bit, and so on.
To deal with this, we have heuristics in place to detect when one packet
equals or is "behind" a packet that was demuxed earlier. We reuse the
refresh seek logic (used to "reread" packets into the demuxer cache when
enabling a track), which checks for certain packet invariants.
Currently, it observes whether either the raw packet position, or the
packet DTS is strictly monotonically increasing. If none of them are
true, we discard old ranges when creating a new one.
This heavily depends on the file format and the demuxer behavior. For
example, not all file formats have DTS, and the packet position can be
unset due to libavformat not always setting it (e.g. when parsers are
used).
At the same time, we must deal with all the complicated state used to
track prefetching and seek ranges. In some complicated corner cases, we
just give up and discard other seek ranges, even if the previously
mentioned packet invariants are fulfilled.
To handle joining, we're being particularly dumb, and require a small
overlap to be confident that two ranges join perfectly. (This could be
done incrementally with as little overlap as 1 packet, but corner cases
would eat us: each stream needs to be joined separately, and the cache
pruning logic could remove overlapping packets for other streams again.)
Another restriction is that switching the cached range will always
trigger an asynchronous low level seek to resume demuxing at the new
range. Some users might find this annoying.
Dealing with interleaved subtitles is not fully handled yet. It will
clamp the seekable range to where subtitle packets are.
Like the manual says, this is technically undefined behaviour. See:
https://msdn.microsoft.com/en-us/library/windows/desktop/ff476085.aspx
In particular, MSDN says texture arrays created with the BIND_DECODER
flag cannot be used with CreateShaderResourceView, which means they
can't be sampled through SRVs like normal Direct3D textures. However,
some programs (Google Chrome included) do this anyway for performance
and power-usage reasons, and it appears to work with most drivers.
Older AMD drivers had a "bug" with zero-copy decoding, but this appears
to have been fixed. See #3255, #3464 and http://crbug.com/623029.
This is a new RA/vo_gpu backend that uses Direct3D 11. The GLSL
generated by vo_gpu is cross-compiled to HLSL with SPIRV-Cross.
What works:
- All of mpv's internal shaders should work, including compute shaders.
- Some external shaders have been tested and work, including RAVU and
adaptive-sharpen.
- Non-dumb mode works, even on very old hardware. Most features work at
feature level 9_3 and all features work at feature level 10_0. Some
features also work at feature level 9_1 and 9_2, but without high-bit-
depth FBOs, it's not very useful. (Hardware this old is probably not
fast enough for advanced features anyway.)
Note: This is more compatible than ANGLE, which requires 9_3 to work
at all (GLES 2.0,) and 10_1 for non-dumb-mode (GLES 3.0.)
- Hardware decoding with D3D11VA, including decoding of 10-bit formats
without truncation to 8-bit.
What doesn't work / can be improved:
- PBO upload and direct rendering does not work yet. Direct rendering
requires persistent-mapped PBOs because the decoder needs to be able
to read data from images that have already been decoded and uploaded.
Unfortunately, it seems like persistent-mapped PBOs are fundamentally
incompatible with D3D11, which requires all resources to use driver-
managed memory and requires memory to be unmapped (and hence pointers
to be invalidated) when a resource is used in a draw or copy
operation.
However it might be possible to use D3D11's limited multithreading
capabilities to emulate some features of PBOs, like asynchronous
texture uploading.
- The blit() and clear() operations don't have equivalents in the D3D11
API that handle all cases, so in most cases, they have to be emulated
with a shader. This is currently done inside ra_d3d11, but ideally it
would be done in generic code, so it can take advantage of mpv's
shader generation utilities.
- SPIRV-Cross is used through a NIH C-compatible wrapper library, since
it does not expose a C interface itself.
The library is available here: https://github.com/rossy/crossc
- The D3D11 context could be made to support more modern DXGI features
in future. For example, it should be possible to add support for
high-bit-depth and HDR output with DXGI 1.5/1.6.
The main purpose of this commit is avoiding any hidden O(n^2) algorithms
in the code for pruning the demuxer cache, and for determining the
seekable boundaries of the cache. The old code could loop over the whole
packet queue on every packet pruned in certain corner cases.
There are two ways how to reach the goal:
1) commit a cardinal sin
2) do everything incrementally
The cardinal sin is adding an extra field to demux_packet, which caches
the determined seekable range for a keyframe range. demux_packet is a
rather general data structure and thus shouldn't have any fields that
are not inherent to its use, and are only needed as an implementation
detail of code using it. But what are you gonna do, sue me?
In the future, demux.c might have its own packet struct though. Then the
other existing cardinal sin (the "next" field, from MPlayer times) could
be removed as well.
This commit also changes slightly how the seek end is determined. There
is a note on the manpage in case anyone finds the new behavior
confusing. It's somewhat cleaner and might be needed for supporting
multiple ranges (although that's unclear).
We don't hope to auto-detect them at load time, as that would be too
much of a pain - even FFmpeg requires fetching and parsing of video
packets, and exposes the information only via deprecated API.
But there still needs to be a way to select them by default. This is
also needed to get the first CC packet at all (without seeking back).
This commit also attempts to clean up locking a bit, which is a PITA,
but it's better be careful & clean.
See manpage additions.
(In ffmpeg-mpv and Libav, this is still called "cuvid". Libav won't work
yet, because it has no frame params support yet, but this could get
fixed soon.)