So talking to a certain Intel dev, it sounded like modern VA-API drivers
are reasonable thread-safe. But apparently that is not the case. Not at
all. So add approximate locking around all vaapi API calls.
The problem appeared once we moved decoding and display to different
threads. That means the "vaapi-copy" mode was unaffected, but decoding
with vo_vaapi or vo_opengl lead to random crashes.
Untested on real Intel hardware. With the vdpau emulation, it seems to
work fine - but actually it worked fine even before this commit, because
vdpau was written and designed not by morons, but competent people
(vdpau is guaranteed to be fully thread-safe).
There is some probability that this commit doesn't fix things entirely.
One problem is that locking might not be complete. For one, libavcodec
_also_ accesses vaapi, so we have to rely on our own guesses how and
when lavc uses vaapi (since we disable multithreading when doing hw
decoding, our guess should be relatively good, but it's still a lavc
implementation detail). One other reason that this commit might not
help is Intel's amazing potential to fuckup anything that is good and
holy.
mpv supports two hardware decoding APIs on Linux: vdpau and vaapi. Each
of these has emulation wrappers. The wrappers are usually slower and
have fewer features than their native opposites. In particular the libva
vdpau driver is practically unmaintained.
Check the vendor string and print a warning if emulation is detected.
Checking vendor strings is a very stupid thing to do, but I find the
thought of people using an emulated API for no reason worse.
Also, make --hwdec=auto never use an API that is detected as emulated.
This doesn't work quite right yet, because once one API is loaded,
vo_opengl doesn't unload it, so no hardware decoding will be used if the
first probed API (usually vdpau) is rejected. But good enough.
It's not really needed to be public. Other code can just use mp_image.
The only disadvantage is that the other code needs to call an accessor
to get the VASurfaceID.
Although I at first thought it would be better to have a separate
implementation for hwaccels because the difference to software images
are too large, it turns out you can actually save some code with it.
Note that the old implementation had a small memory management bug. This
got painted over in commit 269c1e1, but is hereby solved properly.
Also note that I couldn't test vf_vavpp.c (due to lack of hardware), and
I hope I didn't accidentally break it.
This ended up a little bit messy. In order to get a mp_log everywhere,
mostly make use of the fact that va_surface already references global
state anyway.
Don't allocate a VAImage and a mp_image every time. VAImage are cached
in the surfaces themselves, and for mp_image an explicit pool is
created. The retry loop runs only once for each surface now.
This also makes use of vaDeriveImage() if possible.
Merged from pull request #246 by xylosper. Minor cosmetic changes, some
adjustments (compatibility with older libva versions), and manpage
additions by wm4.
Signed-off-by: wm4 <wm4@nowhere>
This is based on the MPlayer VA API patches. To be exact it's based on
a very stripped down version of commit f1ad459a263f8537f6c from
git://gitorious.org/vaapi/mplayer.git.
This doesn't contain useless things like benchmarking hacks and the
demo code for GLX interop. Also, unlike in the original patch, decoding
and video output are split into separate source files (the separation
between decoding and display also makes pixel format hacks unnecessary).
On the other hand, some features not present in the original patch were
added, like screenshot support.
VA API is rather bad for actual video output. Dealing with older libva
versions or the completely broken vdpau backend doesn't help. OSD is
low quality and should be rather slow. In some cases, only either OSD
or subtitles can be shown at the same time (because OSD is drawn first,
OSD is prefered).
Also, libva can't decide whether it accepts straight or premultiplied
alpha for OSD sub-pictures: the vdpau backend seems to assume
premultiplied, while a native vaapi driver uses straight. So I picked
straight alpha. It doesn't matter much, because the blending code for
straight alpha I added to img_convert.c is probably buggy, and ASS
subtitles might be blended incorrectly.
Really good video output with VA API would probably use OpenGL and the
GL interop features, but at this point you might just use vo_opengl.
(Patches for making HW decoding with vo_opengl have a chance of being
accepted.)
Despite these issues, decoding seems to work ok. I still got tearing
on the Intel system I tested (Intel(R) Core(TM) i3-2350M). It was also
tested with the vdpau vaapi wrapper on a nvidia system; however this
was rather broken. (Fortunately, there is no reason to use mpv's VAAPI
support over native VDPAU.)