33 Commits

Author SHA1 Message Date
wm4 710872bc22 vaapi: remove dependency on X11
There are at least 2 ways of using VAAPI without X11 (Wayland, DRM).
Remove the X11 requirement from the decoder part and the EGL interop.
This will be used by a following commit, which adds Wayland support.

The worst part of this is the decoder, which includes a bad hack for
using the decoder without any VO interop (also known as "vaapi-copy"
mode). Separate the X11 parts so that they're self-contained. For the
EGL interop code we do something similar (it's kept slightly simpler,
because it essentially only has to translate between our silly
MPGetNativeDisplay abstraction and the vaGetDisplay...() call).
2015-09-27 21:33:15 +02:00
wm4 2172c22ee3 vaapi: add HEVC profile entries
libavcodec does not support HEVC via VAAPI yet, so this won't work.
However, there is ongoing work to add HEVC support to VAAPI, and this
change might help with testing. (Or maybe not - but there is no harm in
this change.)
2015-08-24 23:00:45 +02:00
wm4 bffd78748f vd_lavc: remove unneeded hwdec parameters
All hwdec backends now use a single pixel format, and the format is
always checked.

Also, the init_decoder callback is now mandatory.
2015-08-19 21:33:18 +02:00
wm4 d1c37c0e29 vaapi: allow allocating additional surfaces during decoding
Fixes problems with --vo=opengl:interpolation. The issue here is that
vo_opengl retains more surfaces than what was preallocated for the
decoder. Until now, we just explicitly failed to decode frames for which
no additional surfaces are available. Since modern drivers usually are
fine with not "registering" surfaces before the decoder is created, just
allow allocating additional surfaces if needed.

(We also could probably recreate the HW decoder, since the HW decoder
should be stateless. But let's try to avoid raising the overall
complexity of the code.)
2015-07-15 12:37:28 +02:00
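
The on-demand allocation described in d1c37c0e29 can be pictured with a minimal sketch. This is not the mpv implementation: `example_pool`, `example_pool_get`, `example_pool_release`, and the cap of 32 are hypothetical names and values chosen for illustration. The idea is only that a request first reuses a free surface and creates a new one when every existing surface is still referenced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EXAMPLE_MAX_SURFACES 32  /* hypothetical hard cap, not from mpv */

struct example_pool {
    bool in_use[EXAMPLE_MAX_SURFACES];
    size_t allocated;            /* number of surfaces created so far */
};

/* Return the index of a free surface. If all surfaces created so far are
 * still referenced, create one more instead of failing. Returns -1 only
 * when the cap is reached. */
static int example_pool_get(struct example_pool *p)
{
    assert(p);
    for (size_t i = 0; i < p->allocated; i++) {
        if (!p->in_use[i]) {
            p->in_use[i] = true;
            return (int)i;
        }
    }
    if (p->allocated >= EXAMPLE_MAX_SURFACES)
        return -1;
    p->in_use[p->allocated] = true;
    return (int)p->allocated++;
}

/* Mark a surface as free again so a later request can reuse it. */
static void example_pool_release(struct example_pool *p, int idx)
{
    assert(p && idx >= 0 && (size_t)idx < p->allocated);
    p->in_use[idx] = false;
}
```

With this shape, a VO that retains more frames than expected simply causes the pool to grow, rather than making the decoder fail.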
wm4 9620e37d6a vaapi: increase number of additional surfaces
Sometime recently, hardware decoding started to fail if h264 with full
reference frames was decoded, and --vo=vaapi was used. VAAPI requires
registering all surfaces that the decoder will ever use in advance, so
if the playback chain uses more surfaces than originally allocated, we
fail and drop back to software decoding.

I'm not really sure why or when this started happening. Commit 7b9d7265
for one is not the cause - it can be reproduced with earlier commits. It
also seems to be timing dependent. Possibly it has to do with the way
vo.c retains previous surfaces, and the way they can be queued/unqueued
asynchronously.

Increasing the number of reserved additional surfaces by 1 fixes it.

(Though I have no idea where exactly all these surfaces are being used.
Or rather, _when_.)
2015-07-08 12:18:29 +02:00
wm4 991af7dfb1 video: reduce error message when loading hwdec backend fails
When using --hwdec=auto, about half of all systems will print:

    "[vdpau] Error when calling vdp_device_create_x11: 1"

This happens because usually mpv will be linked against both vdpau and
vaapi libs, but the drivers are not necessarily available. Then trying
to load a driver will fail. This is a normal part of probing, but the
error messages were printed anyway. Silence them by explicitly
distinguishing probing.

This pretty much goes through all the layers. We actually consider
loading hw backends for vo_opengl always "auto probed", even if a hw
backend is explicitly requested. In this case vd_lavc will print a
warning message anyway (adjust this message a bit).
2015-06-20 22:26:57 +02:00
wm4 d3894290ec vaapi: remove direct mapping non-sense
This must have been some non-sense in the original vaapi mplayer patch.
While I still have no good idea what this "direct mapping" business is
about, it appears to be pretty much pointless. Nothing can hold
additional "real" surface references (due to how the API and mpv/lavc
refcounting work), so removing the additional surfaces won't break
anything. It still could be that this was for achieving additional
buffering (not reusing surfaces so soon), but we buffer some additional
data anyway. Plus, the original intention of the vaapi mplayer code was
probably increasing surface count just by 1 or 2, not actually doubling
it, and/or it was a "trick" to get to the maximum count of 21 when h264
is in use.

gstreamer-vaapi uses "ref_frames + SCRATCH_SURFACES_COUNT" here, with
SCRATCH_SURFACES_COUNT defined to 4. It doesn't appear to check the
overlay attributes at all in the decoder.

In any case, remove this non-sense.
2015-05-29 16:28:55 +02:00
wm4 d71bbcbc98 video: un-discourage "vaapi-copy" hwdec mode
Maybe I don't know what I'm doing. I'm fairly certain though that Intel
does not know what they're doing.
2015-02-20 22:24:37 +01:00
wm4 aae9af348e video: have a generic context struct for hwdec backends
Before this commit, each hw backend had their own specific struct types
for context, and some, like VDA, had none at all. Add a context struct
(mp_hwdec_ctx) that provides a somewhat generic way to pass the hwdec
context around. Some things get slightly better, some slightly more
verbose.

mp_hwdec_info is still around; it's still needed, but is reduced to its
role of handling delayed loading of the hwdec backend.
2015-01-22 15:32:23 +01:00
wm4 f1e78306cb vaapi: try dealing with Intel's braindamaged shit drivers
So talking to a certain Intel dev, it sounded like modern VA-API drivers
are reasonably thread-safe. But apparently that is not the case. Not at
all. So add approximate locking around all vaapi API calls.

The problem appeared once we moved decoding and display to different
threads. That means the "vaapi-copy" mode was unaffected, but decoding
with vo_vaapi or vo_opengl led to random crashes.

Untested on real Intel hardware. With the vdpau emulation, it seems to
work fine - but actually it worked fine even before this commit, because
vdpau was written and designed not by morons, but competent people
(vdpau is guaranteed to be fully thread-safe).

There is some probability that this commit doesn't fix things entirely.
One problem is that locking might not be complete. For one, libavcodec
_also_ accesses vaapi, so we have to rely on our own guesses how and
when lavc uses vaapi (since we disable multithreading when doing hw
decoding, our guess should be relatively good, but it's still a lavc
implementation detail). One other reason that this commit might not
help is Intel's amazing potential to fuckup anything that is good and
holy.
2014-08-21 22:45:58 +02:00
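
The "approximate locking around all vaapi API calls" from f1e78306cb might look roughly like the sketch below. It assumes a single mutex stored in a shared context; `example_va_ctx`, `example_driver_call`, and `example_locked_call` are hypothetical stand-ins, not libva or mpv functions.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical context: one mutex guards every call into the driver. */
struct example_va_ctx {
    pthread_mutex_t lock;
    int calls;               /* stand-in for driver state */
};

static void example_ctx_init(struct example_va_ctx *ctx)
{
    assert(ctx);
    pthread_mutex_init(&ctx->lock, NULL);
    ctx->calls = 0;
}

/* Stand-in for a driver entry point that is not thread-safe. */
static int example_driver_call(struct example_va_ctx *ctx)
{
    return ++ctx->calls;
}

/* Wrapper pattern: take the lock, call the driver, release the lock. */
static int example_locked_call(struct example_va_ctx *ctx)
{
    pthread_mutex_lock(&ctx->lock);
    int r = example_driver_call(ctx);
    pthread_mutex_unlock(&ctx->lock);
    return r;
}
```

As the commit message notes, this only protects calls that go through the wrapper; calls made by another library on the same display bypass the mutex.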
wm4 7520d39e8b vaapi: we need more surfaces
Playing with high framedrop could make it run out of surfaces. In
theory, we wouldn't need an additional surface, if we could just clear
the vo_vaapi internal surface - but doing so would probably be a pain,
so I don't care.
2014-08-18 22:59:01 +02:00
wm4 df58e82237 video: move display and timing to a separate thread
The VO is run inside its own thread. It also does most of video timing.
The playloop hands the image data and a realtime timestamp to the VO,
and the VO does the rest.

In particular, this allows the playloop to do other things, instead of
blocking for video redraw. But if anything accesses the VO during video
timing, it will block.

This also fixes vo_sdl.c event handling; but that is only a side-effect,
since reimplementing the broken way would require more effort.

Also drop --softsleep. In theory, this option helps if the kernel's
sleeping mechanism is too inaccurate for video timing. In practice, I
haven't ever encountered a situation where it helps, and it just burns
CPU cycles. On the other hand it's probably actively harmful, because
it prevents the libavcodec decoder threads from doing real work.

Side note:

Originally, I intended that multiple frames can be queued to the VO. But
this is not done, due to problems with OSD and other certain features.
OSD in particular is simply designed in a way that it can be neither
timed nor copied, so you do have to render it into the video frame
before you can draw the next frame. (Subtitles have no such restriction.
sd_lavc was even updated to fix this.) It seems the right solution to
queuing multiple VO frames is rendering on VO-backed framebuffers, like
vo_vdpau.c does. This requires VO driver support, and is out of scope
of this commit.

As a consequence, the VO has a queue size of 1. The existing video queue
is just needed to compute frame duration, and will be moved out in the
next commit.
2014-08-12 23:24:08 +02:00
wm4 056622c33e vaapi: fix uninitialized value read
Found with valgrind. This is somewhat terrifying, because the VA-API API
function is supposed to fill these values, and we access them only if
the API functions return success. So this shouldn't have happened.
2014-08-11 21:57:41 +02:00
wm4 b442b522f6 vaapi: fix destruction with --hwdec=vaapi-copy
This is incomplete; the video chain will still hold some vaapi objects
after destroying the decoder and thus the vaapi context. This is very
bad. Fixing it would require something like refcounting the vaapi
context, but I don't really want to.
2014-05-28 02:08:45 +02:00
wm4 d99f30d726 video: warn if an emulated hwdec API is used
mpv supports two hardware decoding APIs on Linux: vdpau and vaapi. Each
of these has emulation wrappers. The wrappers are usually slower and
have fewer features than their native opposites. In particular the libva
vdpau driver is practically unmaintained.

Check the vendor string and print a warning if emulation is detected.
Checking vendor strings is a very stupid thing to do, but I find the
thought of people using an emulated API for no reason worse.

Also, make --hwdec=auto never use an API that is detected as emulated.
This doesn't work quite right yet, because once one API is loaded,
vo_opengl doesn't unload it, so no hardware decoding will be used if the
first probed API (usually vdpau) is rejected. But good enough.
2014-05-28 02:08:45 +02:00
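
The vendor-string check from d99f30d726 can be sketched as a plain substring match. The marker text below is an assumption for illustration; the exact strings reported by real wrappers may differ, and `example_is_emulated` is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Return true if the vendor string looks like an emulation wrapper.
 * The marker is a hypothetical example, not a verified vendor string. */
static bool example_is_emulated(const char *vendor)
{
    static const char marker[] = "VDPAU backend for VA-API";
    return vendor != NULL && strstr(vendor, marker) != NULL;
}
```

A caller probing with --hwdec=auto could skip any backend for which this returns true, and print a warning when the backend was requested explicitly.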
wm4 49d13f76ca vaapi: make struct va_surface private
It doesn't really need to be public. Other code can just use mp_image.
The only disadvantage is that the other code needs to call an accessor
to get the VASurfaceID.
2014-03-17 18:22:35 +01:00
wm4 31fc5e8563 vaapi: replace image pool implementation with mp_image_pool
Although I at first thought it would be better to have a separate
implementation for hwaccels because the differences from software images
are too large, it turns out you can actually save some code with it.

Note that the old implementation had a small memory management bug. This
got painted over in commit 269c1e1, but is hereby solved properly.

Also note that I couldn't test vf_vavpp.c (due to lack of hardware), and
I hope I didn't accidentally break it.
2014-03-17 18:22:25 +01:00
wm4 3ec7f528c4 vd_lavc: remove compatibility crap
All this code was needed for compatibility with very old libavcodec
versions only (such as Libav 9).

Includes some now-possible simplifications too.
2014-03-16 13:19:19 +01:00
wm4 ccce58d6d6 video: initialize hw decoder in get_format
Apparently the "right" place to initialize the hardware decoder is in
the libavcodec get_format callback.

This doesn't change vda.c and vdpau_old.c, because I don't have OSX, and
vdpau_old.c is probably going to be removed soon (if Libav ever manages
to release Libav 10). So for now the init_decoder callback added with
this commit is optional.

This also means vdpau.c and vaapi.c don't have to manage and check the
image parameters anymore.

This change is probably needed for when libavcodec VDA support gets a
new iteration of its API.
2014-03-10 22:56:26 +01:00
wm4 70af7ab8e5 vaapi: mp_msg conversions
This ended up a little bit messy. In order to get a mp_log everywhere,
mostly make use of the fact that va_surface already references global
state anyway.
2013-12-21 20:50:11 +01:00
wm4 0112143fda Split mpvcore/ into common/, misc/, bstr/
2013-12-17 02:39:45 +01:00
wm4 60cd300558 vaapi: remove unused hw image formats, simplify
PIX_FMT_VDA_VLD and PIX_FMT_VAAPI_VLD were never used anywhere. I'm not
sure why they were even added, and they sound like they are just for
compatibility with XvMC-style decoding, which sucks anyway.

Now that there's only a single vaapi format, remove the
IMGFMT_IS_VAAPI() macro. Also get rid of IMGFMT_IS_VDA(), which was
unused.
2013-11-29 14:19:29 +01:00
wm4 4fa2babacc video: move struct mp_hwdec_info into its own header file
This means most code accessing this struct must now include hwdec.h
instead of dec_video.h. I just put it into dec_video.h at first because
I thought a separate file would be a waste, but it's more proper to do
it this way, as there are too many files which include dec_video.h only
to get the mp_hwdec_info definition.
2013-11-23 21:26:31 +01:00
wm4 2d58fb3b8e vo_opengl: add support for VA-API OpenGL interop
VA-API's OpenGL/GLX interop is pretty bad and perhaps slow (renders a
X11 pixmap into a FBO, and has to go over X11, probably involves one or
more copies), and this code serves more as an example, rather than for
serious use. On the other hand, this might work much better than
vo_vaapi, even if slightly slower.
2013-11-04 00:11:43 +01:00
wm4 24897eb94c video: check profiles with hardware decoding
We had some code for checking profiles earlier, which was removed in
commits 2508f38 and adfb71b. These commits mentioned that (working) hw
decoding was sometimes prevented due to profile checking, but I can't
find the samples anymore that showed this behavior. Also, I changed my
opinion, and I think checking the profiles is something that should be
done for better fallback to software decoding behavior.

The checks roughly follow VLC's vdpau profile checks, although we do
not check codec levels. (VLC's profile checks aren't necessarily
completely correct, but they're a welcome help anyway.)

Add a --vd-lavc-check-hw-profile option, which skips the profile check.
2013-11-01 17:33:33 +01:00
wm4 5b3ae5aaac vaapi: remove non-VLD entrypoints
These probably don't work. libavcodec doesn't seem to support them,
and neither did the original mplayer-vaapi patch.
2013-09-29 13:52:09 +02:00
wm4 60e7926248 vaapi: fix non-sense condition
Attempting signed comparison on unsigned value.
2013-09-29 13:45:49 +02:00
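
To illustrate the class of bug named in 60e7926248: a test such as `id < 0` on an unsigned value can never be true, so the error branch is dead code. The sketch below is not the mpv code; `EXAMPLE_INVALID_ID` and `example_is_invalid` are hypothetical, and the sentinel value is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical "invalid ID" sentinel; a real API defines its own. */
#define EXAMPLE_INVALID_ID UINT32_MAX

/* Broken variant (always false, since id is unsigned):
 *     return id < 0;
 * Fixed variant: compare against the sentinel explicitly. */
static bool example_is_invalid(uint32_t id)
{
    return id == EXAMPLE_INVALID_ID;
}
```

Compilers typically flag the broken form when type-limit warnings are enabled, which is one way such conditions get noticed.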
wm4 4d2f354da6 vaapi: potentially make reading surfaces back to system RAM faster
Don't allocate a VAImage and a mp_image every time. VAImages are cached
in the surfaces themselves, and for mp_image an explicit pool is
created. The retry loop runs only once for each surface now.

This also makes use of vaDeriveImage() if possible.
2013-09-27 17:59:44 +02:00
wm4 641e94cd27 vaapi: allow GPU read-back with --hwdec=vaapi-copy
This code is actually quite inefficient: it reuses the (slow, simple)
screenshot code. It uses an inefficient method to read the image
(vaGetImage() instead of vaDeriveImage()), allocates new memory for
each frame that is read, and it tries all image formats again each
time.

Also, in my tests it always picked NV12 as image format, which is not
ideal if you actually want to filter the video, and vo_xv can't handle
this format without conversion either.

However, a user confirmed that it worked for him, so everything is fine.
2013-09-25 13:53:42 +02:00
xylosper 39d1ab82e5 vaapi: add vf_vavpp and use it for deinterlacing
Merged from pull request #246 by xylosper. Minor cosmetic changes, some
adjustments (compatibility with older libva versions), and manpage
additions by wm4.

Signed-off-by: wm4 <wm4@nowhere>
2013-09-25 13:53:42 +02:00
wm4 2508f38a92 vaapi: use highest available profile, instead of mapping it exactly
Now the code does the same as the original MPlayer VAAPI patch, instead
of trying to map the profiles exactly.

See previous commit for justification and discussion.
2013-08-19 01:05:48 +02:00
wm4 0da9638576 video/decode: pass parameters directly to hwdec allocate_image callback
Instead of passing AVFrame. This also moves the mysterious logic about
the size of the allocated image to common code, instead of duplicating
it everywhere.
2013-08-15 23:40:02 +02:00
wm4 2827295703 video: add vaapi decode and output support
This is based on the MPlayer VA API patches. To be exact it's based on
a very stripped down version of commit f1ad459a263f8537f6c from
git://gitorious.org/vaapi/mplayer.git.

This doesn't contain useless things like benchmarking hacks and the
demo code for GLX interop. Also, unlike in the original patch, decoding
and video output are split into separate source files (the separation
between decoding and display also makes pixel format hacks unnecessary).

On the other hand, some features not present in the original patch were
added, like screenshot support.

VA API is rather bad for actual video output. Dealing with older libva
versions or the completely broken vdpau backend doesn't help. OSD is
low quality and should be rather slow. In some cases, only one of OSD
or subtitles can be shown at a time (because OSD is drawn first,
OSD is preferred).

Also, libva can't decide whether it accepts straight or premultiplied
alpha for OSD sub-pictures: the vdpau backend seems to assume
premultiplied, while a native vaapi driver uses straight. So I picked
straight alpha. It doesn't matter much, because the blending code for
straight alpha I added to img_convert.c is probably buggy, and ASS
subtitles might be blended incorrectly.

Really good video output with VA API would probably use OpenGL and the
GL interop features, but at this point you might just use vo_opengl.
(Patches for making HW decoding work with vo_opengl have a chance of being
accepted.)

Despite these issues, decoding seems to work ok. I still got tearing
on the Intel system I tested (Intel(R) Core(TM) i3-2350M). It was also
tested with the vdpau vaapi wrapper on a nvidia system; however this
was rather broken. (Fortunately, there is no reason to use mpv's VAAPI
support over native VDPAU.)
2013-08-12 01:12:02 +02:00
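
The straight versus premultiplied alpha distinction mentioned in 2827295703 comes down to whether the color channels are already scaled by alpha. A minimal sketch of the conversion for one 8-bit RGBA pixel follows; the function names and the R, G, B, A byte order are assumptions for illustration, and this is not the code from img_convert.c.

```c
#include <assert.h>
#include <stdint.h>

/* Scale one 8-bit color channel by an 8-bit alpha, rounding to nearest. */
static uint8_t example_premul(uint8_t c, uint8_t a)
{
    return (uint8_t)((c * a + 127) / 255);
}

/* Convert one pixel (bytes R, G, B, A) from straight to premultiplied
 * alpha in place. The alpha byte itself is left unchanged. */
static void example_premultiply_rgba(uint8_t px[4])
{
    assert(px);
    px[0] = example_premul(px[0], px[3]);
    px[1] = example_premul(px[1], px[3]);
    px[2] = example_premul(px[2], px[3]);
}
```

If a driver expecting straight alpha is handed premultiplied data, or the reverse, semi-transparent OSD edges come out too dark or too bright, which is why the mismatch between backends matters.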