Commit Graph

59 Commits

Author SHA1 Message Date
Philip Langdale 19ea8b31bd hwtransfer: use the right hardware config to find conversion targets
The last piece in the puzzle for doing hardware conversions
automatically is ensuring we only consider valid target formats for the
conversion. Although it is unintuitive, some vaapi drivers can expose a
different set of formats for uploads vs for conversions, and that is
the case on the Intel hardware I have here.

Before this change, we would use the upload target list, and our
selection algorithm would pick a format that doesn't work for
conversions, causing everything to fail. Whoops.

Successfully obtaining the conversion target format list is a bit of a
convoluted process, with only parts of it encapsulated by ffmpeg.
Specifically, ffmpeg understands the concept of hardware configurations
that can affect the constraints of a device, but does not define what
configurations are - that is left up to the specific hwdevice type.

In the case of vaapi, we need to create a config for the video
processing endpoint, and use that when querying for constraints.

I decided to encapsulate creation of the config as part of the hwdec
init process, so that the constraint query can be down in the
hwtransfer code in an opaque way. I don't know if any other hardware
will need this capability, but if so, we'll be able to account for it.

Then, when we look at probing, instead of checking for what formats
are supported for transfers, we use the results of the constraint query
with the conversion config. And as that config doesn't depend on the
source format, we only need to do it once.
2023-08-26 10:07:55 -07:00
Philip Langdale 59478b0059 hwtransfer: implement support for hw->hw format conversion
Historically, we have not attempted to support hw->hw format conversion
in the autoconvert logic. If a user needed to do these kinds of
conversions, they needed to manually insert the hw format's conversion
filter manually (eg: scale_vaapi).

This was usually fine because the general rule is that any format
supported by the hardware can be used as well as any other. ie: You
would only need to do conversion if you have a specific goal in mind.

However, we now have two situations where we can find ourselves with a
hardware format being produced by a decoder that cannot be accepted by
a VO via hwdec-interop:
 * dmabuf-wayland can only accept formats that the Wayland compositor
   accepts. In the case of GNOME, it can only accept a handful of RGB
   formats.
 * When decoding via VAAPI on Intel hardware, some of the more unusual
   video encodings (4:2:2, 10bit 4:4:4) lead to packed frame formats
   which gpu-next cannot handle, causing rendering to fail.

In both these cases (at least when using VAAPI with dmabuf-wayland), if
we could detect the failure case and insert a `scale_vaapi` filter, we
could get successful output automatically. For `hwdec=drm`, there is
currently no conversion filter, so dmabuf-wayland is still out of luck
there.

The basic approach to implementing this is to detect the case where we
are evaluating a hardware format where the VO can accept the hardware
format itself, but may not accept the underlying sw format. In the
current code, we bypass autoconvert as soon as we see the hardware
format is compatible.

My first observation was that we actually have logic in autoconvert to
detect when the input sw format is not in the list of allowed sw
formats passed into the autoconverter. Unfortunately, we never populate
this list, and the way you would expect to do that is vo-query-format
returning sw format information, which it does not. We could define an
extended vo-query-format-2, but we'd still need to implement the
probing logic to fill it in.

On the other hand, we already have the probing logic in the hwupload
filter - and most recently I used that logic to implement conversion
on upload. So if we could leverage that, we could both detect when
hw->hw conversion is required, and pick the best target format.

This exercise is then primarily one of detecting when we are in this
case and letting that code run in a useful way. The hwupload filter is
a bit awkward to work with today, and so I refactored a bunch of the
set up code to actually make it more encapsulated. Now, instead of the
caller instantiating it and then triggering the probe, we probe on
creation and instantiate the correct underlying filter (hwupload vs
conversion) automatically.
2023-08-26 10:07:55 -07:00
Philip Langdale e50db42927 vo: hwdec: do hwdec interop lookup by image format
It turns out that it's generally more useful to look up hwdecs by image
format, rather than device type. In the situations where we need to
find one, we generally know the image format we're dealing with. Doing
this avoids us having to create mappings from image format to device
type.

The most significant part of this change is filling in the image format
for the various hw interops. There is a hw_imgfmt field today today, but
only a couple of the interops fill it in, and that seems to be because
we've never actually used this piece of metadata before. Well, now we
have a good use for it.
2022-09-21 09:39:34 -07:00
Philip Langdale 25fa1b0b45 hwdec/drmprime: add drmprime hwdec-interop
In the confusing landscape of hardware video decoding APIs, we have had
a long standing support gap for the v4l2 based APIs implemented for the
various SoCs from Rockship, Amlogic, Allwinner, etc. While VAAPI is the
defacto default for desktop GPUs, the developers who work on these SoCs
(who are not the vendors!) have preferred to implement kernel APIs
rather than maintain a userspace driver as VAAPI would require.

While there are two v4l2 APIs (m2m and requests), and multiple forks of
ffmpeg where support for those APIs languishes without reaching
upstream, we can at least say that these APIs export frames as DRMPrime
dmabufs, and that they use the ffmpeg drm hwcontext.

With those two constants, it is possible for us to write a
hwdec-interop without worrying about the mess underneath - for the most
part.

Accordingly, this change implements a hwdec-interop for any decoder
that produces frames as DRMPrime dmabufs. The bulk of the heavy
lifting is done by the dmabuf interop code we already had from
supporting vaapi, and which I refactored for reusability in a previous
set of changes.

When we combine that with the fact that we can't probe for supported
formats, the new code in this change is pretty simple.

This change also includes the hwcontext_fns that are required for us to
be able to configure the hwcontext used by `hwdec=drm-copy`. This is
technically unrelated, but it seemed a good time to fill this gap.

From a testing perspective, I have directly tested on a RockPRO64,
while others have tested with different flavours of Rockchip and on
Amlogic, providing m2m coverage.

I have some other SoCs that I need to spin up to test with, but I don't
expect big surprises, and when we inevitably need to account for new
special cases down the line, we can do so - we won't be able to support
every possible configuration blindly.
2022-08-09 10:19:18 -07:00
Philip Langdale fcc81cd940 vo_gpu[_next]: hwdec: fix logging regression when probing
When I introduced the concept of lazy loading of hwdecs by img format,
I did not propagate the probing flag correctly, leading to the new
normal loading path not runnng with probing set, meaning that any
errors would show up, creating unnecessary noise.

This change fixes this regression.
2022-03-21 09:53:37 -07:00
Philip Langdale 5186651f30 vo_gpu: hwdec: load hwdec interops on-demand by default
Historically, we have treated hwdec interop loading as a completely
separate step from loading the hwdecs themselves. Some hwdecs need an
interop, and some don't, and users generally configure the exact
hwdec they want, so interops that aren't relevant for that hwdec
shouldn't be loaded to save time and avoid warning/error spam.

The basic approach here is to recognise that interops are tied to
hwdecs by imgfmt. The hwdec outputs some format, and an interop is
needed to get that format to the vo without read back.

So, when we try to load an hwdec, instead of just blindly loading all
interops as we do today, let's pass the imgfmt in and only load
interops that work for that format. If more than one interop is
available for the format, the existing logic (whatever it is) will
continue to be used to pick one.

We also have one callsite in filters where we seem to pre-emptively
load all the interops. It's probably possible to trace down a specific
format but for now I'm just letting it keep loading all of them; it's
no worse than before.

You may notice there is no documentation update - and that's because
the current docs say that when the interop mode is `auto`, the interop
is loaded on demand. So reality now reflects the docs. How nice.
2022-02-17 20:02:32 -08:00
Philip Langdale 9c05be8999 video: cuda: add explicit context creation for copy hwaccels
In the distant past, the cuviddec backed copy hwaccel could be
configured directly using lavc options. However, since that time,
we gained support for automatic hw ctx creation which ended up
bypassing the lavc options.

Rather than trying to find a way to pass those options again, a
better idea is to make the 'cuda-decode-device' option, used by
the interop hwaccels, work for the copy hwaccels too.

And that's pretty simple: we have to add a create function that
checks the option and passes it on to ffmpeg.

Note that this does require a slight re-jig to the configuration
flags, as we now have a scenario where we want to build with support
for the cuda copy hwaccels but not the interop ones. So we need
a distinct configuration flag for that combination.

Fixes #7295.
2019-12-29 14:32:47 -08:00
wm4 77fd4dd681 video: remove mp_image_params.hw_flags field
This was speculatively added 2 years ago in preparation for something
that apparently never happened. The D3D code was added as an "example",
but this too was never used/finished.

No reason to keep this.
2019-10-17 22:43:14 +02:00
wm4 76276c9210 video: rewrite filtering glue code
Get rid of the old vf.c code. Replace it with a generic filtering
framework, which can potentially handle more than just --vf. At least
reimplementing --af with this code is planned.

This changes some --vf semantics (including runtime behavior and the
"vf" command). The most important ones are listed in interface-changes.

vf_convert.c is renamed to f_swscale.c. It is now an internal filter
that can not be inserted by the user manually.

f_lavfi.c is a refactor of player/lavfi.c. The latter will be removed
once --lavfi-complex is reimplemented on top of f_lavfi.c. (which is
conceptually easy, but a big mess due to the data flow changes).

The existing filters are all changed heavily. The data flow of the new
filter framework is different. Especially EOF handling changes - EOF is
now a "frame" rather than a state, and must be passed through exactly
once.

Another major thing is that all filters must support dynamic format
changes. The filter reconfig() function goes away. (This sounds complex,
but since all filters need to handle EOF draining anyway, they can use
the same code, and it removes the mess with reconfig() having to predict
the output format, which completely breaks with libavfilter anyway.)

In addition, there is no automatic format negotiation or conversion.
libavfilter's primitive and insufficient API simply doesn't allow us to
do this in a reasonable way. Instead, filters can use f_autoconvert as
sub-filter, and tell it which formats they support. This filter will in
turn add actual conversion filters, such as f_swscale, to perform
necessary format changes.

vf_vapoursynth.c uses the same basic principle of operation as before,
but with worryingly different details in data flow. Still appears to
work.

The hardware deint filters (vf_vavpp.c, vf_d3d11vpp.c, vf_vdpaupp.c) are
heavily changed. Fortunately, they all used refqueue.c, which is for
sharing the data flow logic (especially for managing future/past
surfaces and such). It turns out it can be used to factor out most of
the data flow. Some of these filters accepted software input. Instead of
having ad-hoc upload code in each filter, surface upload is now
delegated to f_autoconvert, which can use f_hwupload to perform this.

Exporting VO capabilities is still a big mess (mp_stream_info stuff).

The D3D11 code drops the redundant image formats, and all code uses the
hw_subfmt (sw_format in FFmpeg) instead. Although that too seems to be a
big mess for now.

f_async_queue is unused.
2018-01-30 03:10:27 -08:00
wm4 2ce7face96
hwdec: remove unused fields
These were replaced by a different mechanism, but the old fields weren't
removed.
2017-12-21 19:31:36 +01:00
wm4 292724538c video: remove some more hwdec legacy stuff
Finally get rid of all the HWDEC_* things, and instead rely on the
libavutil equivalents. vdpau still uses a shitty hack, but fuck the
vdpau code.

Remove all the now unneeded remains. The vdpau preemption thing was not
unused anymore; if someone cares this could probably be restored.
2017-12-02 04:53:55 +01:00
wm4 23a9efd124 vd_lavc, vdpau, vaapi: restore emulated API avoidance
This code is for trying to avoid using an emulation layer when using
auto probing, so that we end up using the actual API the drivers
provide. It was destroyed in the recent refactor.
2017-12-02 04:53:51 +01:00
wm4 0780d38329 hwdec: don't require setting legacy hwdec fields
With the recent changes, mpv's internal mechanisms got synced to
libavcodec's once more. Some things are still needed for filters (until
the mechanism gets replaced), but there's no need to require other hwdec
methods to use these fields. So remove them where they are unnecessary.

Also fix some minor leaks in the dxva2 backends, and set the driver_name
field in the Apple ones. Untested on Apple crap.
2017-12-02 04:53:51 +01:00
wm4 627fa5aae0 hwdec: remove unused HWDEC_* constants
Removing the rest requires replacing some other mechanisms, later.
2017-12-01 22:09:38 +01:00
wm4 eb8957cea1 vd_lavc: rewrite how --hwdec is handled
Change it from explicit metadata about every hwaccel method to trying to
get it from libavcodec. As shown by add_all_hwdec_methods(), this is a
quite bumpy road, and a bit worse than expected.

This will probably cause a bunch of regressions. In particular I didn't
check all the strange decoder wrappers, which all cause some sort of
special cases each. You're volunteering for beta testing by using this
commit.

One interesting thing is that we completely get rid of mp_hwdec_ctx in
vd_lavc.c, and that HWDEC_* mostly goes away (some filters still use it,
and the VO hwdec interops still have a lot of code to set it up, so it's
not going away completely for now).
2017-12-01 21:11:43 +01:00
wm4 a0d9e15342 video: refactor hw device creation for hwdec copy modes
Lots of shit code for nothing. We probably could just use libavutil's
code for all of this. But for now go with this, since it tends to
prevent stupid terminal messages during probing (libavutil has no
mechanism to selectively suppress errors specifically during probing).

Ignores the "emulated" API flag (for avoiding vaapi/vdpau wrappers), but
it doesn't matter that much for -copy anyway.
2017-12-01 08:05:16 +01:00
wm4 c5fac0c2b0 vd_lavc: move entrypoint for hwframes_refine
The idea is to get rid of vd_lavc_hwdec, so special functionality like
this has to go somewhere else. At this point, hwframes_refine is only
needed for d3d11, and it doesn't do much, so for now the new callback
has no context. In can be made more fancy if really needed.
2017-12-01 08:01:41 +01:00
wm4 91586c3592 vo_gpu: make it possible to load multiple hwdec interop drivers
Make the VO<->decoder interface capable of supporting multiple hwdec
APIs at once. The main gain is that this simplifies autoprobing a lot.
Before this change, it could happen that the VO loaded the "wrong" hwdec
API, and the decoder was stuck with the choice (breaking hw decoding).
With the change applied, the VO simply loads all available APIs, so
autoprobing trickery is left entirely to the decoder.

In the past, we were quite careful about not accidentally loading the
wrong interop drivers. This was in part to make sure autoprobing works,
but also because libva had this obnoxious bug of dumping garbage to
stderr when using the API. libva was fixed, so this is not a problem
anymore.

The --opengl-hwdec-interop option is changed in various ways (again...),
and renamed to --gpu-hwdec-interop. It does not have much use anymore,
other than debugging. It's notable that the order in the hwdec interop
array ra_hwdec_drivers[] still matters if multiple drivers support the
same image formats, so the option can explicitly force one, if that
should ever be necessary, or more likely, for debugging. One example are
the ra_hwdec_d3d11egl and ra_hwdec_d3d11eglrgb drivers, which both
support d3d11 input.

vo_gpu now always loads the interop lazily by default, but when it does,
it loads them all. vo_opengl_cb now always loads them when the GL
context handle is initialized. I don't expect that this causes any
problems.

It's now possible to do things like changing between vdpau and nvdec
decoding at runtime.

This is also preparation for cleaning up vd_lavc.c hwdec autoprobing.
It's another reason why hwdec_devices_request_all() does not take a
hwdec type anymore.
2017-12-01 05:57:01 +01:00
wm4 6b745769b1 vd_lavc: add support for nvdec hwaccel
See manpage additions.

(In ffmpeg-mpv and Libav, this is still called "cuvid". Libav won't work
yet, because it has no frame params support yet, but this could get
fixed soon.)
2017-10-28 19:59:08 +02:00
Lionel CHAZALLON cfcee4cfe7 Add DRM_PRIME Format Handling and Display for RockChip MPP decoders
This commit allows to use the AV_PIX_FMT_DRM_PRIME newly introduced
format in ffmpeg that allows decoders to provide an AVDRMFrameDescriptor
struct.

That struct holds dmabuf fds and information allowing zerocopy rendering
using KMS / DRM Atomic.

This has been tested on RockChip ROCK64 device.
2017-10-23 21:07:24 +02:00
wm4 ddfccd67d5 video: remove special path for hwdec screenshots
This was phased out, and was used only by vdpau by now. Drop the
mechanism and the vdpau special code, which means screenshots won't
include the vf_vdpaupp processing anymore. (I don't care enough about
vdpau, it's on its way out.)
2017-10-16 17:07:35 +02:00
wm4 9e140775bc video: fix previous commit
Sigh.
2017-10-16 17:06:01 +02:00
wm4 1ff6a1c8c7 video: make previously added hwdec params mechanism more generic
The mechanism introduced in b135af6842 assumed AVHWFramesContext would
be enough. Apparently it's not - the intended use with Rockchip (not
Rokchip btw.) requires accessing actual frame data in order to access
the AVDRMFrameDescriptor struct.

Just pass the entire mp_image to the new function. This is more
flexible, although it slightly worries me that it will be less reusable
for things which require setting up mp_image_params before any real
frames are processed (such as filters).
2017-10-16 17:00:38 +02:00
wm4 b135af6842 video: add mp_image_params.hw_flags and add an example
It seems this will be useful for Rokchip DRM hwcontext integration.

DRM hwcontexts have additional internal structure which can be different
depending on the decoder, and which is not part of the generic hwcontext
API. Rockchip has 1 layer, which EGL interop happens to translate to a
RGB texture, while VAAPI (mapped as DRM hwcontext) will use multiple
layers. Both will use sw_format=nv12, and thus are indistinguishable on
the mp_image_params level. But this is needed to initialize the EGL
mapping and the vo_gpu video renderer correctly.

We hope that the layer count is enough to tell whether EGL will
translate the data to a RGB texture (vs. 2 texture resembling raw nv12
data). For that we introduce MP_IMAGE_HW_FLAG_OPAQUE.

This commit adds the flag, infrastructure to set it, and an "example"
for D3D11.

The D3D11 addition is quite useless at this point. But later we want to
get rid of d3d11_update_image_attribs() anyway, while we still need a
way to force d3d11vpp filter insertion, so maybe it has some
justification (who knows). In any case it makes testing this easier.
Obviously it also adds some basic support for triggering the opaque
format for decoding, which will use a driver-specific format, but which
is not supported in shaders. The opaque flag is not used to determine
whether d3d11vpp needs to be inserted, though.
2017-10-16 15:02:12 +02:00
Aman Gupta 61a1612de9 hwdec: add mediacodec hardware decoder for IMGFMT_MEDIACODEC frames 2017-10-09 18:36:54 +02:00
Aman Gupta d08e407c9e hwdec: rename mediacodec to mediacodec-copy 2017-10-09 18:36:54 +02:00
wm4 2a04906063 hwdec: fix 2 comments
The first is outdated, the second was always wrong.
2017-05-24 14:32:23 +02:00
wm4 7aa070e1cb vdpau: crappy hack to allow initializing hw decoding after preemption
If vo_opengl is used, and vo_opengl already created the vdpau interop
(for whatever reasons), and then preemption happens, and then you try to
enable hw decoding, it failed. The reason was that preemption recovery
is not run at any point before libavcodec accesses the vdpau device.

The actual impact was that with libmpv + opengl-cb use, hardware
decoding was permanently broken after display mode switching (something
that caused the display to get preempted at least with older drivers).
With mpv CLI, you can for example enable hw decoding during playback,
then disable it, VT switch to console, switch back to X, and try to
enable hw decoding again.

This is mostly because libav* does not deal with preemption, and NVIDIA
driver preemption behavior being horrible garbage. In addition to being
misdesigned API, the preemption callback is not called before you try to
access vdpau API, and then only with _some_ accesses.

In summary, the preemption callback was never called, neither before nor
after libavcodec tried to init the decoder. So we have to get
mp_vdpau_handle_preemption() called before libavcodec accesses it. This
in turn will do a dummy API access which usually triggers the preemption
callback immediately (with NVIDIA's drivers).

In addition, we have to update the AVHWDeviceContext's device. In theory
it could change (in practice it usually seems to use handle "0").
Creating a new device would cause chaos, as we don't have a concept of
switching the device context on the fly. So we simply update it
directly. I'm fairly sure this violates the libav* API, but it's the
best we can do.
2017-05-19 15:24:38 +02:00
wm4 00d74a509a video: fix a typo in a comment 2017-03-23 11:16:02 +01:00
wm4 6aa4efd1e3 vd_lavc, vaapi: move hw device creation to generic code
hw_vaapi.c didn't do much interesting anymore. Other than the function
to create a device for decoding with vaapi-copy, everything can be done
by generic code. Other libavcodec hwaccels are planned to provide the
same API as vaapi. It will be possible to drop the other hw_ files in
the future. They will use this generic code instead.
2017-02-20 08:39:55 +01:00
wm4 2b5577901d videotoolbox: remove weird format-negotiation between VO and decoder
Originally, there was probably some sort of intention to restrict it to
formats supported by the interop, or something. But in the end it was
overcomplicated nonsense.

In the future, we could use mp_hwdec_ctx.supported_formats or other
mechanisms to handle this in a better way.

mp_hwdec_ctx.ctx is not set to a dummy pointer - hwdec_devices_load() is
only used to detect whether to vo_opengl interop is present, and the
common hwdec code expects that the .ctx field is not NULL.

This also changes videotoolbox-copy to use --videotoolbox-format,
instead of the FFmpeg-set default.
2017-02-17 14:14:22 +01:00
wm4 bbdecb792a hwdec: add a AVBufferRef(AVHWDeviceContext) field
This makes "generic" interaction with libav* components easier.
2017-01-16 16:10:22 +01:00
wm4 812128bab7 vo_opengl, vaapi: properly probe 10 bit rendering support
There are going to be users who have a Mesa installation which do not
support 10 bit, but a GPU which can decode to 10 bit. So it's probably
better not to hardcode whether it is supported.

Introduce a more general way to signal supported formats from renderer
to decoder. Obviously this is imperfect, because it still isn't part of
proper format negotation (for example, what if there's a vavpp filter,
which accepts anything). Still slightly better than before.

I don't know any way to probe for vaapi dmabuf/EGL dmabuf support
properly (in particular testing specific formats, not just general
availability). So we stay with the current approach and try to create
and map dummy surfaces on init to probe for support. Overdo it and check
all formats that AVHWFramesConstraints reports, instead of only NV12 and
P010 surfaces.

Since we can support unknown formats now, add explicitly checks to the
EGL/dmabuf mapper code to reject unsupported formats. I also noticed
that libavutil signals support for RGB0/BGR0, but couldn't get it to
work. Remove the DRM formats that are unused/didn't work the way I tried
to use them.

With this, 10 bit decoding + rendering should work, provided you have
a capable CPU and a patched Mesa. The required Mesa patch adds support
for the R16 and GR32 formats. It was sent by a Kodi developer to the
Mesa developer mailing list and was not accepted yet.
2017-01-13 18:43:35 +01:00
Philip Langdale f5e82d5ed3 vo_opengl: hwdec_cuda: Use dynamic loading for cuda functions
This change applies the pattern used in ffmpeg to dynamically load
cuda, to avoid requiring the CUDA SDK at build time.
2016-11-23 01:07:26 +01:00
wm4 6b18d4dba5 video: add --hwdec=vdpau-copy mode
At this point, all other hwaccels provide -copy modes, and vdpau is the
exception with not having one. Although there is vf_vdpaurb, it's less
convenient in certain situations, and exposes some issues with the
filter chain code as well.
2016-10-20 16:43:02 +02:00
Philip Langdale 0a81fe1cf9 vd_lavc: Add hwdec wrapper for crystalhd
This hardware decodes to system memory so it only requires a wrapper.
2016-10-15 17:44:23 +02:00
wm4 7e6456f43a rpi: add --hwdec=rpi-copy
This means it can be used with normal video filters.

Might help out with #3604.
2016-09-30 13:05:30 +02:00
Philip Langdale 3f7e43c2e2 hwdec_cuda: Add trivial cuda-copy wrapper
The cuvid decoder already knows how to copy back to system memory
if NV12 frames are requested, and this will happen if the decoder
is used without the hwdec.

For convenience, let's add a wrapper hwdec so people don't have
to explicitly pick the cuvid decoder if they want this behaviour.
2016-09-11 10:46:22 +02:00
Philip Langdale 2048ad2b8a hwdec/opengl: Add support for CUDA and cuvid/NvDecode
Nvidia's "NvDecode" API (up until recently called "cuvid" is a cross
platform, but nvidia proprietary API that exposes their hardware
video decoding capabilities. It is analogous to their DXVA or VDPAU
support on Windows or Linux but without using platform specific API
calls.

As a rule, you'd rather use DXVA or VDPAU as these are more mature
and well supported APIs, but on Linux, VDPAU is falling behind the
hardware capabilities, and there's no sign that nvidia are making
the investments to update it.

Most concretely, this means that there is no VP8/9 or HEVC Main10
support in VDPAU. On the other hand, NvDecode does export vp8/9 and
partial support for HEVC Main10 (more on that below).

ffmpeg already has support in the form of the "cuvid" family of
decoders. Due to the design of the API, it is best exposed as a full
decoder rather than an hwaccel. As such, there are decoders like
h264_cuvid, hevc_cuvid, etc.

These decoders support two output paths today - in both cases, NV12
frames are returned, either in CUDA device memory or regular system
memory.

In the case of the system memory path, the decoders can be used
as-is in mpv today with a command line like:

mpv --vd=lavc:h264_cuvid foobar.mp4

Doing this will take advantage of hardware decoding, but the cost
of the memcpy to system memory adds up, especially for high
resolution video (4K etc).

To avoid that, we need an hwdec that takes advantage of CUDA's
OpenGL interop to copy from device memory into OpenGL textures.

That is what this change implements.

The process is relatively simple as only basic device context
aquisition needs to be done by us - the CUDA buffer pool is managed
by the decoder - thankfully.

The hwdec looks a bit like the vdpau interop one - the hwdec
maintains a single set of plane textures and each output frame
is repeatedly mapped into these textures to pass on.

The frames are always in NV12 format, at least until 10bit output
supports emerges.

The only slightly interesting part of the copying process is that
CUDA works by associating PBOs, so we need to define these for
each of the textures.

TODO Items:
* I need to add a download_image function for screenshots. This
  would do the same copy to system memory that the decoder's
  system memory output does.
* There are items to investigate on the ffmpeg side. There appears
  to be a problem with timestamps for some content.

Final note: I mentioned HEVC Main10. While there is no 10bit output
support, NvDecode can return dithered 8bit NV12 so you can take
advantage of the hardware acceleration.

This particular mode requires compiling ffmpeg with a modified
header (or possibly the CUDA 8 RC) and is not upstream in ffmpeg
yet.

Usage:

You will need to specify vo=opengl and hwdec=cuda.

Note that hwdec=auto will probably not work as it will try to use
vdpau first.

mpv --hwdec=cuda --vo=opengl foobar.mp4

If you want to use filters that require frames in system memory,
just use the decoder directly without the hwdec, as documented
above.
2016-09-08 16:06:12 +02:00
Aman Gupta 588b2f48e5 videotoolbox: add --hwdec=videotoolbox-copy for h/w accelerated decoding with video filters 2016-07-15 01:01:17 +02:00
wm4 70b3561270 video: add --hwdec=auto-copy mode
This uses the normal autoprobing rules like "auto", but rejects anything
that isn't flagged as copying data back to system memory.

The chunk in command.c was dead code, so remove it instead of updating
it.
2016-05-11 16:20:13 +02:00
wm4 46fff8d31a video: refactor how VO exports hwdec device handles
The main change is with video/hwdec.h. mp_hwdec_info is made opaque (and
renamed to mp_hwdec_devices). Its accessors are mainly thread-safe (or
documented where not), which makes the whole thing saner and cleaner. In
particular, thread-safety rules become less subtle and more obvious.

The new internal API makes it easier to support multiple OpenGL interop
backends. (Although this is not done yet, and it's not clear whether it
ever will.)

This also removes all the API-specific fields from mp_hwdec_ctx and
replaces them with a "ctx" field. For d3d in particular, we drop the
mp_d3d_ctx struct completely, and pass the interfaces directly.

Remove the emulation checks from vaapi.c and vdpau.c; they are
pointless, and the checks that matter are done on the VO layer.

The d3d hardware decoders might slightly change behavior: dxva2-copy
will not use the VO device anymore if the VO supports proper interop.
This pretty much assumes that any in such cases the VO will not use any
form of exclusive mode, which makes using the VO device in copy mode
unnecessary.

This is a big refactor. Some things may be untested and could be broken.
2016-05-09 20:03:22 +02:00
wm4 833375f88d command: change some hwdec properties
Introduce hwdec-current and hwdec-interop properties.

Deprecate hwdec-detected, which never made a lot of sense, and which is
replaced by the new properties. hwdec-active also becomes useless, as
hwdec-current is a superset, so it's deprecated too (for now).
2016-05-04 16:55:26 +02:00
wm4 3706918311 vo_opengl: D3D11VA + ANGLE interop
This uses ID3D11VideoProcessor to convert the video to a RGBA surface,
which is then bound to ANGLE. Currently ANGLE does not provide any way
to bind nv12 surfaces directly, so this will have to do.

ID3D11VideoContext1 would give us slightly more control about the
colorspace conversion, though it's still not good, and not available
in MinGW headers yet.

The video processor is created lazily, because we need to have the coded
frame size, of which AVFrame and mp_image have no concept of. Doing the
creation lazily is less of a pain than somehow hacking the coded frame
size into mp_image.

I'm not really sure how ID3D11VideoProcessorInputView is supposed to
work. We recreate it on every frame, which is simple and hopefully
doesn't affect performance.
2016-04-27 13:49:47 +02:00
wm4 cf9b415173 hwdec: remove numbers from enum
They don't actually mean anything.

Just HWDEC_NONE should remain 0, because it's the default init value for
structs etc.
2016-04-27 13:37:55 +02:00
wm4 f033481551 videotoolbox: change how videotoolbox format is managed
The underlying intention of this code is to make changing
--videotoolbox-format at runtime work. For this reason, the format can't
just be statically setup, but must be read from the option at runtime.

This means the format is not fixed anymore, and we have to make sure the
renderer is property reinitialized if the format changes. There is
currently no way to trigger reinit on this level, which is why the
mp_image_params.hw_subfmt field was introduced.

One sketchy thing remains: normally, the renderer is supposed to be
involved with VO format negotiation, which would ensure that the VO
can take the format at all. Since the hw_subfmt is not part of this
format negotiation, it's implied the get_vt_fmt() callback only
returns formats supported by the renderer. This is not necessarily
clear because vo_opengl checks this with converted_imgfmt separately.
None of this matters in practice though, because we know all formats
are always supported.

(This still requires somehow triggering decoder reinit to make the
change effective.)
2016-04-07 19:54:58 +02:00
Kevin Mitchell a7110862c8 vd_lavc: add d3d11va hwdec
This commit adds the d3d11va-copy hwdec mode using the ffmpeg d3d11va
api. Functions in common with dxva2 are handled in a separate decode/d3d.c
file. A future commit will rewrite decode/dxva2.c to share this code.
2016-03-30 09:01:27 -07:00
Jan Ekström 5843392db5 Add a mediacodec decoder hwdec wrapper
Does the same thing as the rpi one - makes fallback possible by
pretending that h264_mediacodec is a hwdec.
2016-03-25 21:35:59 +01:00
Kevin Mitchell 084162d6fe dxva2: add interop (non-copyback) hwdec_type
This always falls back to software decoding right now. VO support will be added
in future commits.
2016-02-17 06:59:02 -08:00
wm4 1dd7b7bddc video: remove VDA support
VideoToolbox is preferred. Now that FFmpeg released 2.8, there's no
reason to support VDA anymore. In fact, we had a bug that made VDA not
useable with older FFmpeg versions in some newer mpv releases.

VideoToolbox is supported even on slightly older OSX versions, and if
not, you still can run mpv without hw decoding.
2015-09-28 22:03:14 +02:00