RepoMirrors/mpv - mpv

Commit Graph

Author	SHA1	Message	Date
Philip Langdale	2048ad2b8a	hwdec/opengl: Add support for CUDA and cuvid/NvDecode Nvidia's "NvDecode" API (up until recently called "cuvid" is a cross platform, but nvidia proprietary API that exposes their hardware video decoding capabilities. It is analogous to their DXVA or VDPAU support on Windows or Linux but without using platform specific API calls. As a rule, you'd rather use DXVA or VDPAU as these are more mature and well supported APIs, but on Linux, VDPAU is falling behind the hardware capabilities, and there's no sign that nvidia are making the investments to update it. Most concretely, this means that there is no VP8/9 or HEVC Main10 support in VDPAU. On the other hand, NvDecode does export vp8/9 and partial support for HEVC Main10 (more on that below). ffmpeg already has support in the form of the "cuvid" family of decoders. Due to the design of the API, it is best exposed as a full decoder rather than an hwaccel. As such, there are decoders like h264_cuvid, hevc_cuvid, etc. These decoders support two output paths today - in both cases, NV12 frames are returned, either in CUDA device memory or regular system memory. In the case of the system memory path, the decoders can be used as-is in mpv today with a command line like: mpv --vd=lavc:h264_cuvid foobar.mp4 Doing this will take advantage of hardware decoding, but the cost of the memcpy to system memory adds up, especially for high resolution video (4K etc). To avoid that, we need an hwdec that takes advantage of CUDA's OpenGL interop to copy from device memory into OpenGL textures. That is what this change implements. The process is relatively simple as only basic device context aquisition needs to be done by us - the CUDA buffer pool is managed by the decoder - thankfully. The hwdec looks a bit like the vdpau interop one - the hwdec maintains a single set of plane textures and each output frame is repeatedly mapped into these textures to pass on. The frames are always in NV12 format, at least until 10bit output supports emerges. The only slightly interesting part of the copying process is that CUDA works by associating PBOs, so we need to define these for each of the textures. TODO Items: * I need to add a download_image function for screenshots. This would do the same copy to system memory that the decoder's system memory output does. * There are items to investigate on the ffmpeg side. There appears to be a problem with timestamps for some content. Final note: I mentioned HEVC Main10. While there is no 10bit output support, NvDecode can return dithered 8bit NV12 so you can take advantage of the hardware acceleration. This particular mode requires compiling ffmpeg with a modified header (or possibly the CUDA 8 RC) and is not upstream in ffmpeg yet. Usage: You will need to specify vo=opengl and hwdec=cuda. Note that hwdec=auto will probably not work as it will try to use vdpau first. mpv --hwdec=cuda --vo=opengl foobar.mp4 If you want to use filters that require frames in system memory, just use the decoder directly without the hwdec, as documented above.	2016-09-08 16:06:12 +02:00
wm4	edbb8f6286	vd_lavc: always force milliseconds for MMAL This libavcodec wrapper should rescale the API timestamps to whatever it internally needs, but it doesn't yet. So restore this code.	2016-08-29 13:15:44 +02:00
wm4	0110b738d5	vd_lavc, ad_lavc: set pkt_timebase, not time_base These are different AVCodecContext fields. pkt_timebase is the correct one for identifying the unit of packet/frame timestamps when decoding, while time_base is for encoding. Some decoders also overwrite the time_base field with some unrelated codec metadata. pkt_timebase does not exist in Libav, so an #if is required.	2016-08-29 12:46:12 +02:00
wm4	c218d9e960	vd_lavc: minor simplification The timebase is now always valid.	2016-08-23 12:07:46 +02:00
wm4	e5f61c2bd5	vd_lavc: remove unnecessary initialization This is already the default value.	2016-08-19 15:00:58 +02:00
wm4	05e4df3f0c	video/audio: always provide "proper" timestamps to libavcodec Instead of passing through double float timestamps opaquely, pass real timestamps. Do so by always setting a valid timebase on the AVCodecContext for audio and video decoding. Specifically try not to round timestamps to a too coarse timebase, which could round off small adjustments to timestamps (such as for start time rebasing or demux_timeline). If the timebase is considered too coarse, make it finer. This gets rid of the need to do this specifically for some hardware decoding wrapper. The old method of passing through double timestamps was also a bit questionable. While libavcodec is not supposed to interpret timestamps at all if no timebase is provided, it was needlessly tricky. Also, it actually does compare them with AV_NOPTS_VALUE. This change will probably also reduce confusion in the future.	2016-08-19 14:59:30 +02:00
wm4	85488f6892	video: change hw_subfmt meaning The hw_subfmt field roughly corresponds to the field AVHWFramesContext.sw_format in ffmpeg. The ffmpeg one is of the type AVPixelFormat (instead of the underlying hardware format), so it's a good idea to switch to this too for preparation. Now the hw_subfmt field is an mp_imgfmt instead of an opaque/API- specific number. VDPAU and Direct3D11 already used mp_imgfmt, but Videotoolbox and VAAPI had to be switched. One somewhat user-visible change is that the verbose log will now always show the hw_subfmt as image format, instead of as nonsensical number. (In the end it would be good if we could switch to AVHWFramesContext completely, but the upstream API is incomplete and doesn't cover Direct3D11 and Videotoolbox.)	2016-07-15 13:04:17 +02:00
Aman Gupta	588b2f48e5	videotoolbox: add --hwdec=videotoolbox-copy for h/w accelerated decoding with video filters	2016-07-15 01:01:17 +02:00
Niklas Haas	5b6cce2b73	vd_lavc: expose mastering display side data reference peak This greatly improves the result when decoding typical (ST.2084) HDR content, since the job of tone mapping gets significantly easier when you're only mapping from 1000 to 250, rather than 10000 to 250. The difference is so drastic that we can now even reasonably use `hdr-tone-mapping=linear` and get a very perceptually uniform result that is only slightly darker than normal. (To compensate for the extra dynamic range) Due to weird implementation details, this only seems to be present on keyframes (or something like that), so we have to cache the last seen value for the frames in between. Also, in some files the metadata is just completely broken / nonsensical, so I decided to apply a simple heuristic to detect completely broken metadata.	2016-07-03 19:42:52 +02:00
Niklas Haas	923e3c7b20	vo_opengl: generalize HDR tone mapping mechanism This involves multiple changes: 1. Brightness metadata is split into nominal peak and signal peak. For a quick and dirty explanation: nominal peak is the brightest value that your color space can represent (i.e. the brightness of an encoded 1.0), and signal peak is the brightest value that actually occurs in the video (i.e. the brightest thing that's displayed). 2. vo_opengl uses a new decision logic to figure out the right nom_peak and sig_peak for all situations. It also does a better job of picking the right target gamut/colorspace to use for the OSD. (Which still is and still should be treated as sRGB). This change in logic also fixes #3293 en passant. 3. Since it was growing rapidly, the logic for auto-guessing / inferring the right colorimetry configuration (in pass_colormanage) was split from the logic for actually performing the adaptation (now pass_color_map). Right now, the new logic doesn't do a whole lot since HDR metadata is still ignored (but not for long).	2016-07-03 19:42:52 +02:00
Niklas Haas	d81fb97f45	mp_image: split colorimetry metadata into its own struct This has two reasons: 1. I tend to add new fields to this metadata, and every time I've done so I've consistently forgotten to update all of the dozens of places in which this colorimetry metadata might end up getting used. While most usages don't really care about most of the metadata, sometimes the intend was simply to “copy” the colorimetry metadata from one struct to another. With this being inside a substruct, those lines of code can now simply read a.color = b.color without having to care about added or removed fields. 2. It makes the type definitions nicer for upcoming refactors. In going through all of the usages, I also expanded a few where I felt that omitting the “young” fields was a bug.	2016-07-03 19:42:52 +02:00
Ben Boeckel	76a73a0a5d	vd_lavc: hide structs behind platform flags Otherwise, warnings about them being unused appear.	2016-07-01 19:12:34 -04:00
wm4	9ca1592f3f	d3d: implement screenshots for --hwdec=d3d11va No method of taking a screenshot was implemented at all. vo_opengl lacked window screenshotting, because ANGLE doesn't allow reading the frontbuffer. There was no way to read back from a D3D11 texture either. Implement reading image data from D3D11 textures. This is a low-quality effort to get basic screenshots done. Eventually there will be a better implementation: once we use AVHWFramesContext natively, the readback implementation will be in libavcodec, and will be able to cache the staging texture correctly. Hopefully. (For now it doesn't even have a AVHWFramesContext for D3D11 yet. But the abstraction is more appropriate for this purpose.)	2016-06-28 20:38:53 +02:00
wm4	17c5738cb4	d3d: merge angle_common.h into d3d.h OK, this was dumb. The file didn't have much to do with ANGLE, and the functionality can simply be moved to d3d.c. That file contains helpers for decoding, but can always be present (on Windows) since it doesn't access any D3D specific libavcodec APIs. Thus it doesn't need to be conditionally built like the actual hwaccel wrappers.	2016-06-28 20:07:56 +02:00
stepshal	c5094206ce	Fix misspellings	2016-06-26 13:47:21 +02:00
wm4	7be37337f4	vo_opengl: vdpau interop without RGB conversion Until now, we've always converted vdpau video surfaces to RGB, and then mapped the resulting RGB texture. Change this so that the surface is mapped as NV12 plane textures. The reason this wasn't done until now is because vdpau surfaces are mapped in an "interlaced" way as separate fields, even for progressive video. This requires messy reinterleraving. It turns out that even though it's an extra processing step, the result can be faster than going through the video mixer for RGB conversion. Other than some potential speed-gain, doing this has multiple other advantages. We can apply our own color conversion, which is important in more complex cases. We can correctly apply debanding and potentially other processing that requires chroma-specific or in-YUV handling. If deinterlacing is enabled, this switches back to the old RGB conversion method. Until we have at least a primitive deinterlacer in vo_opengl, this will stay this way. The d3d11 and vaapi code paths are similar. (Of course these don't require any crazy field reinterleaving.)	2016-06-19 19:58:40 +02:00
wm4	7ecac3ae6f	d3d11va: remove unused d3d11va_surface.subindex field This is now stored within the AVFrame/mp_image.	2016-06-16 18:13:46 +02:00
James Ross-Gowan	88b584656d	dxva2: remove dead code in failure case This was made unnecessary by `a02d77b`, since the only way the function could fail was by failing to add a reference to the DirectX DLLs.	2016-06-07 18:53:05 +10:00
wm4	0348cd080f	video: remove d3d11 video processor use from OpenGL interop We now have a video filter that uses the d3d11 video processor, so it makes no sense to have one in the VO interop code. The VO uses it for formats not directly supported by ANGLE (so the video data is converted to a RGB texture, which ANGLE can take in). Change this so that the video filter is automatically inserted if needed. Move the code that maps RGB surfaces to its own inteorp backend. Add a bunch of new image formats, which are used to enforce the new constraints, and to automatically insert the filter only when needed. The added vf mechanism to auto-insert the d3d11vpp filter is very dumb and primitive, and will work only for this specific purpose. The format negotiation mechanism in the filter chain is generally not very pretty, and mostly broken as well. (libavfilter has a different mechanism, and these mechanisms don't match well, so vf_lavfi uses some sort of hack. It only works because hwaccel and non-hwaccel formats are strictly separated.) The RGB interop is now only used with older ANGLE versions. The only reason I'm keeping it is because it's relatively isolated (uses only existing mechanisms and adds no new concepts), and because I want to be able to compare the behavior of the old code with the new one for testing. It will be removed eventually. If ANGLE has NV12 interop, P010 is now handled by converting to NV12 with the video processor, instead of converting it to RGB and using the old mechanism to import that as a texture.	2016-05-29 19:00:55 +02:00
wm4	a02d77ba0d	d3d: simplify DLL loading For some reason, the d3d9/dxva2/d3d11 DLLs are still optional. But we don't need to try so hard to keep exact references. In fact, there's no reason to unload them at all. So load them once in a central place. For simplicity, the d3d9/d3d11 backends both load all DLLs. (They will error out only if the required DLLs could not be loaded.) In theory, we could just call LoadLibrary multiple times (without calling FreeLibrary), but I'm slightly worried that this could be detected as a "bug", or that the reference count could even have a low static limit that could be hit soon.	2016-05-17 11:59:54 +02:00
wm4	39b64fb176	video: merge dxva2 source files video/dxva2.c exported only 2 functions, both used only by video/decode/dxva2.c. The same was already done for d3d11.	2016-05-17 11:05:51 +02:00
wm4	32c10956e0	vaapi: avoid forward declaration of variable Why is everything so horrible.	2016-05-15 18:37:51 +02:00
wm4	70b3561270	video: add --hwdec=auto-copy mode This uses the normal autoprobing rules like "auto", but rejects anything that isn't flagged as copying data back to system memory. The chunk in command.c was dead code, so remove it instead of updating it.	2016-05-11 16:20:13 +02:00
wm4	fd82e14888	build: merge d3d11va and dxva2 hwaccel checks We don't have any reason to disable either. Both are loaded dynamically at runtime anyway. There is also no reason why dxva2 would disappear from libavcodec any time soon.	2016-05-11 15:40:31 +02:00
wm4	a3d416c3d3	vo_opengl: d3d11egl: native NV12 sampling support This uses EGL_ANGLE_stream_producer_d3d_texture_nv12 and related extensions to map the D3D textures coming from the hardware decoder directly in GL. In theory this would be trivial to achieve, but unfortunately ANGLE does not have a mechanism to "import" D3D textures as GL textures. Instead, an awkward mechanism via EGL_KHR_stream was implemented, which involves at least 5 extensions and a lot of glue code. (Even worse than VAAPI EGL interop, and very far from the simplicity you get on OSX.) The ANGLE mechanism so far supports only the NV12 texture format, which means 10 bit won't work. It also does not work in ES3 mode yet. For these reasons, the "old" ID3D11VideoProcessor code is kept and used as a fallback.	2016-05-10 21:06:34 +02:00
wm4	46fff8d31a	video: refactor how VO exports hwdec device handles The main change is with video/hwdec.h. mp_hwdec_info is made opaque (and renamed to mp_hwdec_devices). Its accessors are mainly thread-safe (or documented where not), which makes the whole thing saner and cleaner. In particular, thread-safety rules become less subtle and more obvious. The new internal API makes it easier to support multiple OpenGL interop backends. (Although this is not done yet, and it's not clear whether it ever will.) This also removes all the API-specific fields from mp_hwdec_ctx and replaces them with a "ctx" field. For d3d in particular, we drop the mp_d3d_ctx struct completely, and pass the interfaces directly. Remove the emulation checks from vaapi.c and vdpau.c; they are pointless, and the checks that matter are done on the VO layer. The d3d hardware decoders might slightly change behavior: dxva2-copy will not use the VO device anymore if the VO supports proper interop. This pretty much assumes that any in such cases the VO will not use any form of exclusive mode, which makes using the VO device in copy mode unnecessary. This is a big refactor. Some things may be untested and could be broken.	2016-05-09 20:03:22 +02:00
wm4	5be40f035b	d3d: DXVA2_ModeMPEG2_VLD supports all profiles Fixes hardware decoding of most mpeg2 things.	2016-05-03 15:46:16 +02:00
James Ross-Gowan	622bcb0e37	win32: replace libuuid.a usage with initguid.h Including initguid.h at the top of a file that uses references to GUIDs causes the GUIDs to be declared globally with __declspec(selectany). The 'selectany' attribute tells the linker to consolidate multiple definitions of each GUID, which would be great except that, in Cygwin and MinGW GCC 6.1, this method of linking makes the GUIDs conflict with the ones declared in libuuid.a. Since initguid.h obsoletes libuuid.a in modern compilers that support __declspec(selectany), add initguid.h to all files that use GUIDs and remove libuuid.a from the build. Fixes #3097	2016-05-01 21:10:24 +10:00
Kevin Mitchell	8d51f08010	d3d11va: fix invalid deref on decoder init failure fixes #3092	2016-04-29 23:36:01 -07:00
wm4	64f9e48bf1	d3d11va, dxva2: return the format struct directly Slight simplification, IMHO.	2016-04-29 23:30:01 +02:00
wm4	016eab2209	d3d11va, dxva2: simplify decoder selection In particular, this moves the depth test to common code. Should be functionally equivalent, except that for DXVA2, the IDirectXVideoDecoderService_GetDecoderRenderTargets API is called more often potentially.	2016-04-29 23:24:28 +02:00
wm4	bda111018c	video: add IMGFMT_P010 alias Gets rid of some silliness, and might be useful in the future.	2016-04-29 22:38:54 +02:00
wm4	dff33893f2	d3d11va: store texture/subindex in IMGFMT_D3D11VA plane pointers Basically this gets rid of the need for the accessors in d3d11va.h, and the code can be cleaned up a little bit. Note that libavcodec only defines a ID3D11VideoDecoderOutputView pointer in the last plane pointers, but it tolerates/passes through the other plane pointers we set.	2016-04-27 14:06:50 +02:00
wm4	9896994688	vd_lavc: adjust D3D11VA autoprobe order We want to prefer d3d11va over dxva2 anything. But since dxva2 copyback is more efficient than d3d11va's currently, d3d11va-copy should come last.	2016-04-27 13:54:20 +02:00
wm4	3706918311	vo_opengl: D3D11VA + ANGLE interop This uses ID3D11VideoProcessor to convert the video to a RGBA surface, which is then bound to ANGLE. Currently ANGLE does not provide any way to bind nv12 surfaces directly, so this will have to do. ID3D11VideoContext1 would give us slightly more control about the colorspace conversion, though it's still not good, and not available in MinGW headers yet. The video processor is created lazily, because we need to have the coded frame size, of which AVFrame and mp_image have no concept of. Doing the creation lazily is less of a pain than somehow hacking the coded frame size into mp_image. I'm not really sure how ID3D11VideoProcessorInputView is supposed to work. We recreate it on every frame, which is simple and hopefully doesn't affect performance.	2016-04-27 13:49:47 +02:00
wm4	da59726776	vd_lavc: hack against videotoolbox crash on failure I guess this won't ever be fixed properly in FFmpeg. Too hairy, and the alternative (using VideoToolbox as "full decoder") is too attractive.	2016-04-26 18:53:58 +02:00
wm4	8ffd2f1dd4	vd_lavc: simplify some unneeded ifdeffery These were for ancient libavcodec versions.	2016-04-25 19:11:16 +02:00
wm4	6a1814dfb8	vd_lavc: make image_format hwdec field optional For Mediacodec in particular we don't care about the format. It can just decode to whatever it wants. The only case we would care about is it not returning an opaque format if we don't have proper interop, but libavcodec always returns non-opaque formats by default.	2016-04-25 12:23:38 +02:00
wm4	7e3d8e7134	vd_lavc: simplify RPI and Mediacodec wrappers Use the recently added lavc_suffix mechanism to select the wrapper decoder. With all hwdec callbacks being optional, and RPI/Mediacodec having only dummy callbacks, all the callbacks can be removed as well. The result is that the vd_lavc_hwdec struct for both of them is tiny. It's better to move them to vd_lavc.c directly, because they are so trivial and small.	2016-04-25 12:23:38 +02:00
wm4	46e49a37be	vd_lavc: make all hwdec callbacks optional	2016-04-25 12:23:37 +02:00
wm4	bb17df1f07	vd_lavc: set AVCodecContext.time_base to forced time base This is a bit sketchy, as there isn't a truly standard way to communicate the timebase.	2016-04-25 12:23:37 +02:00
wm4	4f5509e1dd	vd_lavc: better hwdec wrapper decoder selection This is intended for cases when --hwdec needs to override the decoder implementation in use, like for example on the RPI. It does two things: 1. Allow the hwdec to indicate a decoder suffix. libavcodec by convention adds a suffix to all wrapper decoders, and here we start relying on it. While not necessarily the best idea, it's the only thing we got. libavcodec's hwaccel list is useless, because it only has the codec ID, not the associated decoder's name. 2. Make --hwdec=auto work properly. It shouldn't fail anymore, and hwdec probing should reliably work, even if a different decoder is selected with --vd. The semantics of --hwdec should dictate that it overrides the default decoder.	2016-04-25 12:13:12 +02:00
wm4	85416bc36a	vd_lavc: allow process_image() to return NULL In case of errors or whatever.	2016-04-25 11:30:07 +02:00
wm4	b1a8e8dba6	vd_lavc: fix hwdec fallback if hwdec pre-initialization fails Damn.	2016-04-22 15:52:16 +02:00
Kevin Mitchell	ce153bdb42	d3dva: move Intel_H264_NoFGT_ClearVideo to lower priority This seems to cause problems, so only use it if H264_E is not available. fixes #3059	2016-04-18 02:00:50 -07:00
Kevin Mitchell	2f5c7af914	dxva2: fix missing newline in error message	2016-04-18 01:58:18 -07:00
Kevin Mitchell	611594e7e8	d3dva: include selected decoder and format in verbose output	2016-04-17 18:28:14 -07:00
wm4	f5ff2656e0	vaapi: determine surface format in decoder, not in renderer Until now, we have made the assumption that a driver will use only 1 hardware surface format. the format is dictated by the driver (you don't create surfaces with a specific format - you just pass a rt_format and get a surface that will be in a specific driver-chosen format). In particular, the renderer created a dummy surface to probe the format, and hoped the decoder would produce the same format. Due to a driver bug this required a workaround to actually get the same format as the driver did. Change this so that the format is determined in the decoder. The format is then passed down as hw_subfmt, which allows the renderer to configure itself with the correct format. If the hardware surface changes its format midstream, the renderer can be reconfigured using the normal mechanisms. This calls va_surface_init_subformat() each time after the decoder returns a surface. Since libavcodec/AVFrame has no concept of sub- formats, this is unavoidable. It creates and destroys a derived VAImage, but this shouldn't have any bad performance effects (at least I didn't notice any measurable effects). Note that vaDeriveImage() failures are silently ignored as some drivers (the vdpau wrapper) support neither vaDeriveImage, nor EGL interop. In addition, we still probe whether we can map an image in the EGL interop code. This is important as it's the only way to determine whether EGL interop is supported at all. With respect to the driver bug mentioned above, it doesn't matter which format the test surface has. In vf_vavpp, also remove the rt_format guessing business. I think the existing logic was a bit meaningless anyway. It's not even a given that vavpp produces the same rt_format for output.	2016-04-11 22:03:26 +02:00
wm4	813372d6e9	d3d: fix Windows build Commit `f009d16f` accidentally broke it. Thanks to RiCON for noticing and testing.	2016-04-07 21:04:31 +02:00
wm4	f033481551	videotoolbox: change how videotoolbox format is managed The underlying intention of this code is to make changing --videotoolbox-format at runtime work. For this reason, the format can't just be statically setup, but must be read from the option at runtime. This means the format is not fixed anymore, and we have to make sure the renderer is property reinitialized if the format changes. There is currently no way to trigger reinit on this level, which is why the mp_image_params.hw_subfmt field was introduced. One sketchy thing remains: normally, the renderer is supposed to be involved with VO format negotiation, which would ensure that the VO can take the format at all. Since the hw_subfmt is not part of this format negotiation, it's implied the get_vt_fmt() callback only returns formats supported by the renderer. This is not necessarily clear because vo_opengl checks this with converted_imgfmt separately. None of this matters in practice though, because we know all formats are always supported. (This still requires somehow triggering decoder reinit to make the change effective.)	2016-04-07 19:54:58 +02:00

1 2 3 4 5 ...

452 Commits