Commit Graph

5520 Commits

Author SHA1 Message Date
Andreas Rheinhardt fbbe7729f0 avutil/aes_ctr: Avoid allocation of AVAES struct
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-12-08 14:14:00 +01:00
Andreas Rheinhardt 6c57e0b4a8 avutil/frame: Treat frame as uninitialized in get_frame_defaults()
Currently, it also tests whether extended_data points to something
different than the AVFrame's data array and frees extended_data
if it is different. Yet this is only necessary for one of its three
callers, namely av_frame_unref(); meanwhile the other two callers
took measures to avoid this (or rather, to make it to an av_free(NULL)).

This commit moves this chunk to av_frame_unref() (so that
get_frame_defaults() now treats its input as uninitialized)
and removes the now superfluous code in the other two callers.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-12-05 13:27:38 +01:00
Anton Khirnov 3a9861e22c lavu/frame: clarify doxy
AVFrame.data[] elements not used by the format should ALWAYS be null,
hwaccel formats are not an exception.
2021-12-04 14:29:06 +01:00
Anton Khirnov 3b8efec3c5 lavu/frame: drop mentions of non-refcounted frames
All frames we deal with should always be refcounted now.
2021-12-04 14:28:23 +01:00
nyanmisaka 64467cbca2 libavutil/hwcontext_qsv: fix a bug for mapping vaapi frame to qsv
The data stored in data[3] in VAAPI AVFrame is VASurfaceID while
the data stored in pair->first is the pointer of VASurfaceID, so
we need to do cast to make following commandline works:

ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128 \
-hwaccel_output_format vaapi -i input.264 \
-vf "hwmap=derive_device=qsv,format=qsv" -c:v h264_qsv output.264

Signed-off-by: nyanmisaka <nst799610810@gmail.com>
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2021-12-04 14:06:30 +01:00
Lynne b236ef0a59
lavu/avframe: add a time_base field
This adds a time_base field to AVFrame, as an analogue to the
AVPacket.time_base field.
2021-12-03 22:41:00 +01:00
Andreas Rheinhardt a4798a5d51 all: Use av_memdup() where appropriate
Reviewed-by: Nicolas George <george@nsup.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-12-03 16:07:02 +01:00
rcombs 350eb59f8c videotoolbox: add alpha support 2021-11-28 16:40:58 -06:00
rcombs d2229eca51 lavu/videotoolbox: add 422 and 444 pixel format mappings 2021-11-28 16:40:43 -06:00
rcombs b2cd1fb2ec lavu/pixfmt: add high-bit-depth semi-planar 4:2:2/4:4:4 formats
These are used by VideoToolbox hardware decoders.
2021-11-28 16:40:43 -06:00
Lynne c90b3661fa
hwcontext_vulkan: use correct return value for allocation failure 2021-11-27 04:46:41 +01:00
Wu Jianhua b3624069f0 avutil/hwcontext_vulkan: fully support customizable validation layers
Validation layer is an indispensable part of developing on Vulkan.

The following commands is on how to enable validation layers:

ffmpeg -init_hw_device vulkan=0,debug=1,validation_layers=VK_LAYER_LUNARG_monitor+VK_LAYER_LUNARG_api_dump

Signed-off-by: Wu Jianhua <jianhua.wu@intel.com>
2021-11-26 10:36:39 +01:00
Wu Jianhua 7e9e2cf93b avutil/hwcontext_vulkan: check if created before destroying the instance
Signed-off-by: Wu Jianhua <jianhua.wu@intel.com>
2021-11-24 11:09:49 +01:00
Wu Jianhua c2a356d583 avutil/hwcontext_vulkan: check if created before destroying the device
Signed-off-by: Wu Jianhua <jianhua.wu@intel.com>
2021-11-24 11:09:49 +01:00
Timo Rothenpieler 2de6cd4ba4 avutil/hwcontext_cuda: return more useful error codes from init functions 2021-11-22 23:03:21 +01:00
Timo Rothenpieler b1f1de0844 avutil/hwcontext_cuda: add option to use primary device context 2021-11-22 23:03:21 +01:00
Lynne b159975e80
hwcontext_vulkan: check for non-flagged transfer queue families
"All commands that are allowed on a queue that supports transfer
operations are also allowed on a queue that supports either
graphics or compute operations. Thus, if the capabilities of a
queue family include VK_QUEUE_GRAPHICS_BIT or VK_QUEUE_COMPUTE_BIT,
then reporting the VK_QUEUE_TRANSFER_BIT capability separately for
that queue family is optional."
2021-11-20 02:37:41 +01:00
Lynne 135e1c0adf
lavu/vulkan: check for initialization when freeing buffers
What happens on startup is that ffmpeg.c initializes the filter,
then frees it without feeding a single frame through. With no
input frame, the filter lacks a hardware device. The rest of the
uninit code checks if Vulkan objects exist, which they must if there's
a hardware device, but vk->DeviceWaitIdle does not require an object.
So, add a check for it.
2021-11-20 01:48:45 +01:00
Wu Jianhua ff82bd5a00 avutil/vulkan_glslang: fix compiling failure issue
Signed-off-by: Wu Jianhua <jianhua.wu@intel.com>
2021-11-19 16:47:48 +01:00
Lynne da72aca7b0
lavu/vulkan: add support for using libshaderc as a GLSL compiler
It's got a much better API that's actually maintained, it eliminates
race conditions, it comes with a pkg-config file by default, and
unfortunately isn't currently packaged by Debian or other large
distributions.
2021-11-19 16:47:30 +01:00
Lynne 1d06084171
vulkan: fix checkheaders 2021-11-19 16:47:28 +01:00
Lynne f6dd30df24
lavfi/vulkan: split off lavfi-specific code into vulkan_filter.c
The issue is that libavfilter depends on libavcodec, and when doing a
static build, if libavcodec also includes "libavfilter/vulkan.c", then
during link-time, compiling programs will fail as there would be multiple
definitions of the same symbols in both libavfilter and libavcodec's
object files.
Linkers are, however, more permitting if both files that include
a common file that's used as a template are one-to-one identical.
Hence, to make both files the same in the future, export all avfilter
specific functions to a separate file.
There is some work in progress to make templated files like this be
compiled only once, so this is not a long-term solution.

This also removes a macro that could be used to toggle SPIRV compilation
capability on #include-time, as this could cause the files to be different.
2021-11-19 16:47:26 +01:00
James Almer 67b92d68c6 x86/intmath: add VEX encoded versions of av_clipf() and av_clipd()
Prevents mixing inlined SSE instructions and AVX instructions when the compiler
generates the latter.

Signed-off-by: James Almer <jamrial@gmail.com>
2021-11-19 11:21:03 -03:00
Lynne b2aec70bd6
lavu/vulkan: add option to switch between shader compilers and cleanup glslang 2021-11-19 13:44:47 +01:00
Lynne d1133e8c44
lavu/vulkan: move common Vulkan code from libavfilter to libavutil 2021-11-19 13:44:45 +01:00
Soft Works daef8cbff7 avutil/frame: Document the possibility of negative line sizes
Signed-off-by: softworkz <softworkz@hotmail.com>
Signed-off-by: Marton Balint <cus@passwd.hu>
2021-11-18 20:40:24 +01:00
Andreas Rheinhardt 9181b9ec7c avutil/hwcontext_qsv: Remove redundant check
It has already been checked immediately before that said
AVDictionaryEntry exists; checking again is redundant.
Furthermore, av_hwdevice_find_type_by_name() requires its argument
to be non-NULL, so adding a codepath that automatically calls it
with that parameter is nonsense. The same goes for the argument
corresponding to %s.

Fixes Coverity issue 1491394.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-11-18 19:50:08 +01:00
Andreas Rheinhardt bd5ec3601f avutil/hwcontext_qsv: Fix leak of AVBuffer and AVBufferRef
This av_buffer_create() does nothing but leak an AVBuffer and an
AVBufferRef (except on allocation error).

Fixes Coverity issue 1491393.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-11-18 19:50:00 +01:00
Lynne 85a6b7f7b7
vulkan_loader: fix typo in error message 2021-11-18 06:40:52 +01:00
Derek Buitenhuis 54e65aa38a avutil: Add Dolby Vision RPU side data type
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2021-11-17 14:12:33 +00:00
Jonathan Wright 08b4716a9e aarch64: Add Armv8.5-A BTI support
Add Branch Target Identifiers (BTIs) to all functions defined in
AArch64 assembly files. Most of the BTI landing pads are added
automatically by the 'function' macro.

BTI support is turned on or off at compile time based on the presence
of the __ARM_FEATURE_BTI_DEFAULT feature macro.

A binary compiled with BTI support can be executed on an Armv8-A
processor without BTI support because the instructions are defined in
NOP space.

Signed-off-by: Jonathan Wright <jonathan.wright@arm.com>
Signed-off-by: Elijah Ahmad <elijah.ahmad@arm.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
2021-11-16 13:43:56 +02:00
Mark Reid c3502f4f75 libavutil/common: clip nan value to amin
Changes av_clipf to return amin if a is nan.
Before if a is nan av_clipf_c returned nan and
av_clipf_sse would return amax. Now the both
should behave the same.

This works because nan > amin is false.
The max(nan, amin) will be amin.

Signed-off-by: James Almer <jamrial@gmail.com>
2021-11-15 16:50:08 -03:00
Anton Khirnov db932241ee */version.h: define FF_API macros unconditionally
There is no reason to wrap them in #ifndef guards, they should only be
defined here and nowhere else. The define guards just add the
possibility to accidentally use the same FF_API name in different
libraries.
2021-11-15 16:24:58 +01:00
Timo Rothenpieler 2ece70090d avutil/hwcontext_vulkan: add support for exporting memory via Win32 Handles 2021-11-14 12:50:32 +01:00
Timo Rothenpieler fedf4ff85c avutil/vulkan: load win32 external memory functions 2021-11-14 12:50:32 +01:00
Soft Works 99a49f9147 avutil/opt: fix mis-alignment of option and constant values for filter help
Before:

overlay AVOptions:
  x                 <string>     ..FV....... set the x expression (default "0")
  y                 <string>     ..FV....... set the y expression (default "0")
  eof_action        <int>        ..FV....... Action to take when encountering EOF from secondary input  (from 0 to 2) (default repeat)
     repeat          0            ..FV....... Repeat the previous frame.
     endall          1            ..FV....... End both streams.
     pass            2            ..FV....... Pass through the main input.
  eval              <int>        ..FV....... specify when to evaluate expressions (from 0 to 1) (default frame)

After:
a
overlay AVOptions:
   x                 <string>     ..FV....... set the x expression (default "0")
   y                 <string>     ..FV....... set the y expression (default "0")
   eof_action        <int>        ..FV....... Action to take when encountering EOF from secondary input  (from 0 to 2) (default repeat)
     repeat          0            ..FV....... Repeat the previous frame.
     endall          1            ..FV....... End both streams.
     pass            2            ..FV....... Pass through the main input.
   eval              <int>        ..FV....... specify when to evaluate expressions (from 0 to 1) (default frame)

Signed-off-by: softworkz <softworkz@hotmail.com>
Signed-off-by: Marton Balint <cus@passwd.hu>
2021-11-13 19:55:32 +01:00
Soft Works fba4d6f72b avutil/hwcontext_dxva2: add ARGB format
Required for uploading frames with alpha for qsv_overlay
(v2: remove tab indent)

Signed-off-by: softworkz <softworkz@hotmail.com>
Signed-off-by: Marton Balint <cus@passwd.hu>
2021-11-13 19:22:57 +01:00
Lynne 7f6dc9b386
hwcontext_vaapi: don't use the generic mapping struct for DRM/VAAPI
Avoids a per-frame allocation since we don't need the flag field.
2021-11-13 15:13:03 +01:00
Lynne f388791ff9
hwcontext_vulkan: fix small memory leak when unmapping 2021-11-13 14:47:12 +01:00
Lynne 9dc544cdb4
hwcontext_vulkan: wait for semaphores when unmapping from VAAPI
We don't really want to do a full all-queue blocking wait here, since
this happens once per frame, and this could delay future frames.
2021-11-13 14:22:11 +01:00
Lynne 6a23a5597c
hwcontext_vulkan: print error information on queue submission failure
Makes it clearer what went wrong.
2021-11-13 14:21:36 +01:00
Lynne c96d1ee401
hwcontext_vulkan: fix DMABUF import format check call
VkExternalImageFormatProperties is required to be present in the .pNext
chain of VkImageFormatProperties2, or some drivers crash (RADV).
2021-11-13 11:12:50 +01:00
Lynne f74ceb358c
hwcontext_vulkan: improve CUDA error handling 2021-11-13 04:30:33 +01:00
Lynne 0d1992e025
hwcontext_vulkan: close exported memory FD on CUDA import error
Prevents resource leakage.
2021-11-13 00:40:46 +01:00
Lynne 015b487777
hwcontext_vulkan: do not dup() semaphore FDs for CUDA 2021-11-13 00:32:53 +01:00
Lynne fa28c1b2f9
hwcontext_vulkan: properly migrate queue families with DRM import/export 2021-11-13 00:03:58 +01:00
Lynne 549d91ae3a
hwcontext_vulkan: properly migrate between queue families on CUDA import/export
It's more correct.
2021-11-13 00:03:56 +01:00
Lynne 8449baf9aa
hwcontext_vulkan: properly error out if timeline semaphores are unsupported
Missing goto.
2021-11-13 00:03:51 +01:00
Lynne 296cb99d46
hwcontext_vulkan: fix CreateSemaphore conflict with synchapi.h
Include windows.h to fix it. Normally, it'd be better to include it in
vulkan_functions.h, but I'm reasonably confident nothing else that uses
the Vulkan code will need to include Windows functions and not windows.h.
2021-11-12 14:45:20 +01:00
Lynne 57e11321ea
hwcontext_vulkan: use vkDeviceWaitIdle instead of vkWaitSemaphores on uninit
To silence a possible validation layer bug, switch the function. It only gets
triggered by vf_libplacebo, which is odd.
2021-11-12 14:45:17 +01:00
Lynne 8478d60d5b
doc/APIchanges: update for Vulkan API changes 2021-11-12 05:23:41 +01:00
Lynne d05a18cdc7
lavu: move hwcontext_vulkan's function loader into separate files
This allows for the loader to be shared with libavcodec and libavfilter.
2021-11-12 05:23:40 +01:00
Lynne 1ffb59c056
hwcontext_vulkan: clean up extensions code and add additional defaults 2021-11-12 05:23:40 +01:00
Lynne bde1fc5386
hwcontext_vulkan: host wait on semaphores before freeing frame 2021-11-12 05:23:39 +01:00
Lynne f7f1613638
hwcontext_vulkan: report device that's used
Not sure why this wasn't done before.
2021-11-12 05:23:39 +01:00
Lynne 6bf9a6539e
vulkan: add support for encode and decode queues and refactor queue code
This simplifies and makes queue family picking simpler and more robust.
The requirements on the device context are relaxed. They made no sense
in the first place.

The video encode/decode extension is still in beta, at least on paper,
but I really doubt they'd change needing a separate queue family.
2021-11-12 05:23:36 +01:00
Lynne 09e4687b5b
hwcontext_vulkan: port CUDA interop to use timeline semaphores 2021-11-12 03:36:44 +01:00
Lynne 0370a580dc
hwcontext_vulkan: fix mapping from/to DRM/VAAPI frames 2021-11-12 03:36:42 +01:00
Lynne 00ef53c3ea
hwcontext_vulkan: switch to using timeline semaphores 2021-11-12 03:36:40 +01:00
Lynne 7f3878828d
hwcontext_vulkan: bump required Vulkan loader version to 1.2 2021-11-12 03:36:35 +01:00
Zhao Zhili 9fd2b39428 avutil/opt: handle whole range of int64_t in av_opt_get_int
Make get_int/set_int symetric. The int64_t to double to int64_t
conversion is unprecise for large value.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-11-11 13:14:58 +01:00
Limin Wang 8dc8c01d6c avutil/parseutils: add qhd(Quad HD) or wqhd(Wide Quad HD) for 1440p
Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
2021-11-03 21:38:37 +08:00
Limin Wang 6cab5206b0 avutil/hwcontext_videotoolbox: fix use of unknown builtin '__builtin_available'
OSX version: 10.11.6
Apple LLVM version 8.0.0 (clang-800.0.42.1)
Target: x86_64-apple-darwin15.6.0

Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
2021-11-03 21:20:47 +08:00
Michael Niedermayer e154353fdb avutil/mathematics: Document av_rescale_rnd() behavior on non int64 results
Reviewed-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-10-21 14:13:03 +02:00
Limin Wang 9997047a18 avutil/detection_bbox: Fix av_detection_bbox_alloc failed if nb_bboxes == 0
Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
2021-10-08 10:11:59 +08:00
Limin Wang e724004fd8 avutil/detection_bbox: use offsetof for bboxes_offset
Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
2021-10-08 10:11:59 +08:00
Andreas Rheinhardt 03a0dbaff3 avutil/md5: Avoid av_unused variable
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-10-02 17:13:57 +02:00
Andreas Rheinhardt ff80090374 avutil/utils: Remove racy check from avutil_version()
avutil_version() currently performs several checks before
just returning the version. There is a static int that aims
to ensure that these tests are run only once. The reason is that
there used to be a slightly expensive check, but it has been removed
in 92e3a6fdac. Today running only
once is unnecessary and can be counterproductive: GCC 10 optimizes
all the actual checks away, but the checks_done variable and the code
setting it has been kept. Given that this check is inherently racy
(it uses non-atomic variables), it is best to just remove it.

Reviewed-by: Paul B Mahol <onemda@gmail.com>
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-09-29 02:58:07 +02:00
Andreas Rheinhardt 4e135347a7 avutil/tests/opt: Set AVClass.version
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-09-27 05:51:44 +02:00
Andreas Rheinhardt 386a4989df avutil/opt: Remove outdated version check
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-09-27 05:42:48 +02:00
Manuel Stoeckl 0760d9153c lavu/pix_fmt: add pixel format for x2bgr10
The new format (given in big/little endian forms) matches the
existing X2RGB10 format, except with B and R channels switched.

AV_PIX_FMT_X2BGR10 data often is created by OpenGL programs
whose buffers use the GL_RGB10 internal format.

Signed-off-by: Manuel Stoeckl <code@mstoeckl.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-09-26 16:26:10 +02:00
Wenbin Chen f2891fbded libavutil/hwcontext_qsv: fix a bug for mapping qsv frame to vaapi
Command below failed.
ffmpeg -v verbose -init_hw_device vaapi=va:/dev/dri/renderD128
-init_hw_device qsv=qs@va -hwaccel qsv -hwaccel_device qs
-filter_hw_device va -c:v h264_qsv
-i 1080P.264 -vf "hwmap,format=vaapi" -c:v h264_vaapi output.264

Cause: Assign pair->first directly to data[3] in vaapi frame.
pair->first is *VASurfaceID while data[3] in vaapi frame is
VASurfaceID. I fix this line of code. Now the command above works.

Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
2021-09-23 22:59:11 -03:00
Andreas Rheinhardt 8d5de914d3 avutil/mem: Deprecate av_mallocz_array()
It does the same as av_calloc(), so one of them should be removed.
Given that av_calloc() has the shorter name, it is retained.

Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-09-20 01:04:09 +02:00
Andreas Rheinhardt 1ea3650823 Replace all occurences of av_mallocz_array() by av_calloc()
They do the same.

Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-09-20 01:03:52 +02:00
Andreas Rheinhardt 3df34e7bf7 avutil/opt: Simplify av_opt_set_dict2()
Make it clearer that the ordinary exit always returns 0.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-09-20 00:25:16 +02:00
Andreas Rheinhardt 3ba1bbf8d0 avutil/opt: Also warn for deprecated named constants
Intended for the "truncated" AVCodecContext flag.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-09-20 00:24:12 +02:00
Andreas Rheinhardt 4e0da7d311 avutil/buffer: Avoid allocation of AVBuffer when using buffer pool
Do this by putting an AVBuffer structure into BufferPoolEntry and
reuse it for all subsequent uses of said BufferPoolEntry.

Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-09-18 23:16:49 +02:00
James Almer ccfdef79b1 avutil/buffer: constify some function parameters
Reviewed-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2021-09-17 13:28:09 -03:00
Artem Galin 4f78711f9c libavutil/hwcontext_d3d11va: adding more texture information to the D3D11 hwcontext API
Microsoft VideoProcessor requires texture with D3DUSAGE_RENDERTARGET flag as output.
There is no way to allocate array of textures with D3D11_BIND_RENDER_TARGET flag
and .ArraySize > 2 by ID3D11Device_CreateTexture2D due to the Microsoft limitation.
Adding AVD3D11FrameDescriptors array to store array of single textures
instead of texture with multiple slices resolves this.

Signed-off-by: Artem Galin <artem.galin@intel.com>
2021-09-08 17:48:02 -03:00
Artem Galin f1cd1dc6ce libavutil/hwcontext_qsv: add usage child_device_type argument to explicitly select d3d11va/DX11 device type
UPD: Rebase of last patch set over current master and use DX9 as default device type.

Makes selection of dxva2/DX9 device type by default as before with explicit d3d11va/DX11 usage to cover more HW configurations.
Added warning message to expect changing default device type in the future.

Fixes TGL / AV1 decode as requires DX11 with explicit DX11 type
selection.

Add headless/multi adapter support and fixes:
    https://trac.ffmpeg.org/ticket/7511
    https://trac.ffmpeg.org/ticket/6827
    http://ffmpeg.org/pipermail/ffmpeg-trac/2017-November/041901.html
    https://trac.ffmpeg.org/ticket/7933
    338fbcd5bb
    https://github.com/jellyfin/jellyfin/issues/2626#issuecomment-602153952

Any other fixes are welcome including OpenCL interop patch since I don't have proper setup to validate this use case

Decoding, encoding, transcoding have been validated.

child_device_type option is responsible for d3d11va/dxva2 device selection

Usage examples:

DirectX 11:
    -init_hw_device qsv:hw,child_device_type=d3d11va
    -init_hw_device qsv:hw,child_device_type=d3d11va,child_device=0
OR
    -init_hw_device d3d11va=dx -init_hw_device qsv@dx

DirectX 9 is still supported but requires explicit selection:
    -init_hw_device qsv:hw,child_device_type=dxva2
OR
    -init_hw_device dxva2=dx -init_hw_device qsv@dx

Signed-off-by: Artem Galin <artem.galin@intel.com>
2021-09-08 17:42:53 -03:00
Artem Galin a08a5299ac libavutil/hwcontext_qsv: supporting d3d11va device type
This enables usage of non-powered/headless GPU, better HDR support.
Pool of resources is allocated as one texture with array of slices.

Signed-off-by: Artem Galin <artem.galin@intel.com>
2021-09-08 17:42:53 -03:00
Anton Khirnov fdc0bb78fe lavu/slicethread: return ENOSYS rather than EINVAL in the dummy func
EINVAL is the wrong error code here, since the arguments passed to the
function are valid. The error is that the function is not implemented in
the build, which corresponds to ENOSYS.
2021-08-29 18:45:04 +02:00
Andreas Rheinhardt 81b6186920 avutil/log: Reorder elements of AVClass to make it smaller
Putting child_next besides child_class_iterate is actually nicer.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-08-25 23:01:53 +02:00
Niklas Haas cf37c3fb6c avcodec/h264_slice: compute and export film grain seed
From SMPTE RDD 5-2006, the grain seed is to be computed from the
following definition of `pic_offset`:

> When decoding H.264 | MPEG-4 AVC bitstreams, pic_offset is defined as
> follows:
>   - pic_offset = PicOrderCnt(CurrPic) + (PicOrderCnt_offset << 5)
> where:
>   - PicOrderCnt(CurrPic) is the picture order count of the current frame,
>     which shall be derived from [the video stream].
>
>   - PicOrderCnt_offset is set to idr_pic_id on IDR frames. idr_pic_id
>     shall be read from the slice header of [the video stream]. On non-IDR I
>     frames, PicOrderCnt_offset is set to 0. A frame shall be classified as I
>     frame when all its slices are I slices, which may be optionally
>     designated by setting primary_pic_type to 0 in the access delimiter NAL
>     unit. Otherwise, PicOrderCnt_offset it not changed. PicOrderCnt_offset is
>     updated in decoding order.

Co-authored-by: James Almer <jamrial@gmail.com>
Signed-off-by: Niklas Haas <git@haasn.dev>
Signed-off-by: James Almer <jamrial@gmail.com>
2021-08-24 09:58:52 -03:00
Andreas Rheinhardt 8c53b14599 avutil/opt: Document actual behaviour of av_opt_copy a bit more
In particular, document that av_opt_copy() always disentangles
allocated options even on error; this guarantee is needed to e.g.
properly free duplicated thread contexts in libavcodec on error.

Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-08-17 19:11:57 +02:00
Nicolas George 1d8e1afc00 lavu/internal: add FF_FIELD_AT(). 2021-08-14 09:17:45 +02:00
Lynne 1c5610824a
hwcontext_vulkan: use GPU memcpy when copying to system RAM
This should speed it up significantly on systems where it matters.
2021-08-14 00:31:28 +02:00
Lynne d5de9965ef
imgutils: expose av_image_copy_plane_uc_from()
The reason why the generic av_image_copy_uc_from() doesn't really
fit in the case for Vulkan is because some planes may be copied via
other methods (such as mapping GPU memory), and if they don't satisfy
the strict alignment requirements, a gpu image->gpu buffer->cpu ram
copy is performed.

We need this for hwcontext_vulkan, and I think this will also be
useful to API users like libplacebo who would rather not write
a custom SIMD memcpy.
2021-08-14 00:27:43 +02:00
Andreas Rheinhardt 21c7df0d22 avutil/mem: Correct av_calloc() documentation
Incorrect since 4959f18a8e.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-08-12 15:25:58 +02:00
Andreas Rheinhardt f9126b62b6 avutil/mem: Reinline av_size_mult() internally
Since 580e168a94, av_size_mult() is no
longer inlined; on systems where interposing is a thing, this also
inhibits the compiler from inlining said function into the internal
callers of said function, although inlining such a small function is
typically beneficial: With GCC 10.3 on Ubuntu x64 and -O3 this decreases
the size of av_realloc_array from 91B to 23B, from 129B to 81B for
av_realloc_f and from 77B to 23B for each of av_malloc_array,
av_mallocz_array and av_calloc.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-08-12 15:25:58 +02:00
James Almer 35331aa266 avutil/tx: add a return at the end of non-void functions
Fixes compilation with GCC 11 when configured with --disable-optimizations

Signed-off-by: James Almer <jamrial@gmail.com>
2021-08-06 21:22:49 -03:00
Andreas Rheinhardt 2146b65553 avutil/internal: Move MAKE_ACCESSORS to its only user
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-08-05 20:05:54 +02:00
Andreas Rheinhardt 549502868d Move ff_tlog() from lavc/internal.h to lavu/internal.h
It is also used by libavfilter and it is only natural to define it
alongside ff_dlog().

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-08-05 20:02:35 +02:00
Andreas Rheinhardt 7ab0207d4b avutil/Makefile: Apply CFLAGS for compilation
Fixes "make tools/crypto_bench.o".

Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-08-04 12:59:42 +02:00
Andreas Rheinhardt 5088c7c733 avutil/error: Include macros.h for MKTAG
Up until now, including error.h alone does not make the AVERROR_* defines
usable, because they just expand to something involving MKTAG, but
without the header providing MKTAG. So include macros.h, the header
providing MKTAG.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-07-29 22:02:05 +02:00
Andreas Rheinhardt 2dd8acbe80 avutil/common, macros: Move several macros from common.h to macros.h
common.h currently contains several things: Math macros, UTF-8 macros,
other fundamental macros; furthermore it also contains miscellaneous
math functions and it (directly and indirectly) includes lots of other
headers.

This commit moves the "other fundamental macros" to macros.h which is
a more fitting place.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-07-29 22:02:05 +02:00
Jiaxun Yang a1cd62883f avutil/mips: Use $at as MMI macro temporary register
Some function had exceed 30 inline assembly register oprands limiation
when using LOONGSON2 version of MMI macros. We can avoid that by take
$at, which is register reserved for assembler, as temporary register.

As none of instructions used in these macros is pseudo, it is safe to
utilize $at here.

Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-07-28 23:31:48 +02:00
Jiaxun Yang b868272d7e avutil/mips: Use MMI_{L, S}QC1 macro in {SAVE, RECOVER}_REG
{SAVE,RECOVER}_REG will be available for Loongson2 again,
also comment about the magic.

Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-07-28 23:31:48 +02:00
James Almer e3b5ff17c2 avutil/film_grain_params: add support for H.274 Film Grain Characteristics
Used by codecs like H.264, HEVC, and VVC.

Signed-off-by: James Almer <jamrial@gmail.com>
2021-07-23 11:06:31 -03:00
Andreas Rheinhardt 2934a4b9a5 Remove unnecessary avassert.h inclusions
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-07-22 15:02:30 +02:00
Andreas Rheinhardt 4608f7cc6a Remove unnecessary mem.h inclusions
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-07-22 14:47:57 +02:00
Andreas Rheinhardt e7bd47e657 Remove obsolete version.h inclusions
These have mostly been added because of FF_API_*; yet when these were
removed, removing the header has been forgotten.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-07-22 14:34:31 +02:00
Andreas Rheinhardt 2c05ee092b avutil/internal, swresample/audioconvert: Remove cpu.h inclusions
These inclusions are not necessary, as cpu.h is already included
wherever it is needed (via direct inclusion or via the arch-specific
headers).
Also remove other unnecessary cpu.h inclusions from ordinary
non-headers.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-07-22 14:33:45 +02:00
Andreas Rheinhardt 69f120ead7 avcodec/avcodec: Don't include cpu.h
It is not used here at all; instead, add it where it is used without
including it or any of the arch-specific CPU headers.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-07-22 12:59:07 +02:00
J. Dekker c866a099b2 lavu/kperf: use ff_thread_once()
Signed-off-by: J. Dekker <jdek@itanimul.li>
2021-07-21 16:35:27 +02:00
James Almer f9d5050d28 avutil/macos_kperf: add missing header guards
Fixes fate-source

Signed-off-by: James Almer <jamrial@gmail.com>
2021-07-20 19:01:02 -03:00
J. Dekker 9a727235fd lavu/checkasm: add (private) kperf timing for macOS
Signed-off-by: J. Dekker <jdek@itanimul.li>
2021-07-20 19:40:03 +02:00
Thilo Borgmann c1bf56a526 lavu/cpu: Use av_cpu_ prefix 2021-07-20 10:31:41 +02:00
Aman Karmani 504c60660d avutil/hwcontext_videotoolbox: implement hwupload to convert AVFrame to CVPixelBuffer
Teach AV_HWDEVICE_TYPE_VIDEOTOOLBOX to be able to create AVFrames of type
AV_PIX_FMT_VIDEOTOOLBOX. This can be used to hwupload a regular AVFrame
into its CVPixelBuffer equivalent.

    ffmpeg -init_hw_device videotoolbox -f lavfi -i color=black:640x480 -vf hwupload -c:v h264_videotoolbox -f null -y /dev/null

Signed-off-by: Aman Karmani <aman@tmm1.net>
2021-07-18 12:01:16 -07:00
Lynne 997f9bdb99
x86/tx_float: correctly load the transform length
The field is a standard field, yet we were loading it as if it was
a quadword. This worked for forward transforms by chance, but broke
when the transform was inverse.
checkasm couldn't catch that because we only test forward transforms,
which are identical to inverse transforms but with a different revtab.
2021-07-18 15:04:57 +02:00
Thilo Borgmann 87951dcbe7 lavu/cpu.c: Add av_force_cpu_count() to override auto-detection. 2021-07-16 10:06:10 +02:00
Michael Niedermayer 85b883429f avutil/tx: avoid negative left shifts
Fixes: left shift of negative value -1
Fixes: 33736/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_SIREN_fuzzer-6657785795313664

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-06-18 18:58:25 +02:00
James Almer 8a6103326e avutil/samplefmt: don't add offsets to NULL pointers
Signed-off-by: James Almer <jamrial@gmail.com>
2021-06-13 16:10:37 -03:00
James Almer 7916c14713 avutil/samplefmt: remove outdated comment
av_samples_fill_arrays() has been returning the minimum required buffer size
for a while now.

Signed-off-by: James Almer <jamrial@gmail.com>
2021-06-13 16:10:37 -03:00
Matthieu Patou 163559ed62 avutil/tests/audio_ffio: add missing header
Needed for HAVE_BIGENDIAN

Suggested-by: ffmpeg@fb.com
Signed-off-by: James Almer <jamrial@gmail.com>
2021-06-13 13:39:57 -03:00
James Almer 75f0bc651f avutil/tests/lzo: remove timer macros
Suggested-by: ffmpeg@fb.com
Signed-off-by: James Almer <jamrial@gmail.com>
2021-06-13 13:39:57 -03:00
Anton Khirnov 580e168a94 lavu/mem: un-inline av_size_mult()
There seems to be no compelling reason for it to be inline.
2021-06-11 19:42:47 +02:00
Anton Khirnov c8778606b3 lavu/video_enc_params: make sure blocks are properly aligned 2021-06-10 16:59:50 +02:00
Lynne 08d933bf61
hwcontext_vulkan: fix typo in vulkan_device_init()
load_functions() did not load the device-level functions.
2021-06-10 12:24:04 +02:00
Andreas Rheinhardt 7e03d962a4 avutil/opt: Check directly for av_dict_copy() failure
av_dict_copy() returned void when this code was written.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-06-08 12:52:50 +02:00
Jin Bo fd5fd48659 libavcodec/mips: Fix build errors reported by clang
Clang is more strict on the type of asm operands, float or double
type variable should use constraint 'f', integer variable should
use constraint 'r'.

Signed-off-by: Jin Bo <jinbo@loongson.cn>
Reviewed-by: yinshiyou-hf@loongson.cn
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-06-03 13:44:00 +02:00
Valerii Zapodovnikov 6b1268f8c3 pixfmt: fixed wrong fix of comment
This mostly reverts 785bfb1d7b.
But I also added some clarifications so that nobody mixes primaries
with matrix again. SMPTE 240 and 170 primaires are the same, while
matrix coeff. are different, because 240 is derived from 170's new
primaries and white point while 170 uses BT.601 derived from BT.470
System M (yes, with Illuminant C) a.k.a. NTSC 1953. Some nits too.

Reviewed-by: Reto Kromer <lists@reto.ch>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-06-02 17:30:24 +02:00
James Almer baf5cc5b7a avutil/mem: use GCC builtins to check for overflow in av_size_mult()
Signed-off-by: James Almer <jamrial@gmail.com>
2021-05-31 09:02:06 -03:00
James Almer 918fc9a0ed avutil/mem: check for max_alloc_size in av_fast_malloc()
This puts av_fast_malloc*() in line with av_fast_realloc().

Signed-off-by: James Almer <jamrial@gmail.com>
2021-05-27 10:29:59 -03:00
James Almer 786be70e28 avutil/mem: make ff_fast_malloc() internal to mem.c
Signed-off-by: James Almer <jamrial@gmail.com>
2021-05-27 10:29:52 -03:00
James Almer be96f4b616 avutil/mem: make max_alloc_size an atomic type
Signed-off-by: James Almer <jamrial@gmail.com>
2021-05-23 11:26:22 -03:00
James Almer fc99d59553 avutil/imgutils: don't add offsets to NULL pointers
Signed-off-by: James Almer <jamrial@gmail.com>
2021-05-12 15:52:50 -03:00
Shiyou Yin ab04fedaaa mips: Fix potential illegal instruction error.
MSA2 optimizations are attached to MSA macros in generic_macros_msa.h.
It's difficult to do runtime check for them. Remove this part of code
can make it more robust. H264 1080p decoding: 5.13x==>5.12x.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-05-07 17:53:23 +02:00
Andreas Rheinhardt 8b83a4a885 avutil/mem: Also poison new av_realloc-allocated blocks
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-04-30 10:24:32 +02:00
Lynne cf17e2323f hwcontext_vulkan: dlopen libvulkan
While Vulkan itself went more or less the way it was expected to go,
libvulkan didn't quite solve all of the opengl loader issues. It's multi-vendor,
yes, but unfortunately, the code is Google/Khronos QUALITY, so suffers from
big static linking issues (static linking on anything but OSX is unsupported),
has bugs, and due to the prefix system used, there are 3 or so ways to type out
functions.

Just solve all of those problems by dlopening it. We even have nice emulation
for it on Windows.
2021-04-30 00:08:37 +02:00
Lynne 4a6581e968 hwcontext_vulkan: dynamically load functions
This patch allows for alternative loader implementations.
2021-04-30 00:08:37 +02:00
James Almer ffeeff4fbc avutil/hwcontext_vulkan: fix format specifiers for some printed variables
VkPhysicalDeviceLimits.optimalBufferCopyRowPitchAlignment and
VkPhysicalDeviceExternalMemoryHostPropertiesEXT.minImportedHostPointerAlignment are of type
VkDeviceSize (a typedef uint64_t).
VkPhysicalDeviceLimits.minMemoryMapAlignment is of type size_t.

Signed-off-by: James Almer <jamrial@gmail.com>
Reviewed-by: Lynne <dev@lynne.ee>
2021-04-29 14:04:02 -03:00
Lynne 3a3e8c35b6
hwcontext_vulkan: reorder structure fields and add spaces in between
We're in the middle of an ABI unstable period, so we're allowed to.
2021-04-28 18:18:05 +02:00
Anton Khirnov 85ba17f36d Bump major versions of all libraries.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-04-27 11:48:05 -03:00
James Almer 0bf3a7361d avutil: remove deprecated AVClass.child_class_next
Signed-off-by: James Almer <jamrial@gmail.com>
2021-04-27 11:48:04 -03:00
Andreas Rheinhardt d40bb518b5 avutil/cpu: Remove deprecated functions
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2021-04-27 10:43:13 -03:00
Andreas Rheinhardt ef6a9e5e31 avutil/buffer: Switch AVBuffer API to size_t
Announced in 14040a1d91.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2021-04-27 10:43:13 -03:00
Andreas Rheinhardt 985c0dac67 avutil/pixdesc: Remove deprecated AV_PIX_FMT_FLAG_PSEUDOPAL
Deprecated in d6fc031caf.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2021-04-27 10:43:13 -03:00
Andreas Rheinhardt 1eb3110115 avutil/frame: Remove deprecated getters and setters
Deprecated in 7df37dd319.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2021-04-27 10:43:13 -03:00
Andreas Rheinhardt a240097ecd avutil: Switch crypto APIs to size_t
Announced in e435beb1ea.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2021-04-27 10:43:13 -03:00
Andreas Rheinhardt 6e30b35b85 avutil/frame: Remove deprecated AVFrame.pkt_pts field
Deprecated in 32c8359093.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2021-04-27 10:43:13 -03:00
Andreas Rheinhardt 3b56fa85e8 avutil/frame: Remove deprecated AVFrame.error
Deprecated in 1aa24df74c.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2021-04-27 10:43:12 -03:00
Andreas Rheinhardt 0181162bb5 avutil/pixdesc: Remove deprecated off-by-one fields from pix fmt descs
Deprecated in 2268db2cd0.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2021-04-27 10:43:12 -03:00
Andreas Rheinhardt b8accd1175 avutil/frame: Remove AVFrame QP table API
Originally deprecated in 1296b1f6c0631ab79464e22d48a6a1548450b943;
scheduled again for removal in a991526832.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2021-04-27 10:43:12 -03:00
Andreas Rheinhardt ad524cb9ee avutil/pixfmt: Remove deprecated VAAPI pixel formats
Deprecated in 9f8e57efe4.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2021-04-27 10:43:12 -03:00
James Almer 7a6ea6ce2a x86/tx_float: remove ff_ prefix from external constant tables
Fixes compilation with some assemblers.

Reviewed-by: Lynne
Signed-off-by: James Almer <jamrial@gmail.com>
2021-04-25 18:42:38 -03:00
Lynne bb40f800bd
x86/tx_float: fix forgotten 2-argument mulps
Yasm *really* cannot deal with any omitted arguments at all.
2021-04-24 22:33:42 +02:00
Lynne e2cf0a1f68
x86/tx_float: use all arguments on vperm2f and vpermilps and reindent comments
Apparently even old nasm isn't required to accept incomplete instructions.
2021-04-24 22:21:13 +02:00
James Almer fddddc7ec2 x86/tx_float: Fixes compilation with old yasm
Use three operand format on some instructions, and lea to load effective
addresses of tables.

Signed-off-by: James Almer <jamrial@gmail.com>
2021-04-24 17:02:31 -03:00
Lynne e448a4b4ea
lavu/x86/tx_float: fix FMA3 implying AVX2 is available
It's the other way around - AVX2 implies FMA3 is available.
2021-04-24 19:00:27 +02:00
Lynne 119a3f7e8d
lavu/x86: add FFT assembly
This commit adds a pure x86 assembly SIMD version of the FFT in libavutil/tx.
The design of this pure assembly FFT is pretty unconventional.

On the lowest level, instead of splitting the complex numbers into
real and imaginary parts, we keep complex numbers together but split
them in terms of parity. This saves a number of shuffles in each transform,
but more importantly, it splits each transform into two independent
paths, which we process using separate registers in parallel.
This allows us to keep all units saturated and lets us use all available
registers to avoid dependencies.
Moreover, it allows us to double the granularity of our per-load permutation,
skipping many expensive lookups and allowing us to use just 4 loads per register,
rather than 8, or in case FMA3 (and by extension, AVX2), use the vgatherdpd
instruction, which is at least as fast as 4 separate loads on old hardware,
and quite a bit faster on modern CPUs).

Higher up, we go for a bottom-up construction of large transforms, foregoing
the traditional per-transform call-return recursion chains. Instead, we always
start at the bottom-most basis transform (in this case, a 32-point transform),
and continue constructing larger and larger transforms until we return to the
top-most transform.
This way, we only touch the stack 3 times per a complete target transform:
once for the 1/2 length transform and two times for the 1/4 length transform.

The combination algorithm we use is a standard Split-Radix algorithm,
as used in our C code. Although a version with less operations exists
(Steven G. Johnson and Matteo Frigo's "A modified split-radix FFT with fewer
arithmetic operations", IEEE Trans. Signal Process. 55 (1), 111–119 (2007),
which is the one FFTW uses), it only has 2% less operations and requires at least 4x
the binary code (due to it needing 4 different paths to do a single transform).
That version also has other issues which prevent it from being implemented
with SIMD code as efficiently, which makes it lose the marginal gains it offered,
and cannot be performed bottom-up, requiring many recursive call-return chains,
whose overhead adds up.

We go through a lot of effort to minimize load/stores by keeping as much in
registers in between construcring transforms. This saves us around 32 cycles,
on paper, but in reality a lot more due to load/store aliasing (a load from a
memory location cannot be issued while there's a store pending, and there are
only so many (2 for Zen 3) load/store units in a CPU).
Also, we interleave coefficients during the last stage to save on a store+load
per register.

Each of the smallest, basis transforms (4, 8 and 16-point in our case)
has been extremely optimized. Our 8-point transform is barely 20 instructions
in total, beating our old implementation 8-point transform by 1 instruction.
Our 2x8-point transform is 23 instructions, beating our old implementation by
6 instruction and needing 50% less cycles. Our 16-point transform's combination
code takes slightly more instructions than our old implementation, but makes up
for it by requiring a lot less arithmetic operations.

Overall, the transform was optimized for the timings of Zen 3, which at the
time of writing has the most IPC from all documented CPUs. Shuffles were
preferred over arithmetic operations due to their 1/0.5 latency/throughput.

On average, this code is 30% faster than our old libavcodec implementation.
It's able to trade blows with the previously-untouchable FFTW on small transforms,
and due to its tiny size and better prediction, outdoes FFTW on larger transforms
by 11% on the largest currently supported size.
2021-04-24 17:19:18 +02:00
Lynne 1978b143eb
checkasm: add av_tx FFT SIMD testing code
This sadly required making changes to the code itself,
due to the same context needing to be reused for both versions.
The lookup table had to be duplicated for both versions.
2021-04-24 17:19:17 +02:00
Lynne ff71671d88
lavu/tx: add parity revtab generator version
This will be used for SIMD support.
2021-04-24 17:17:30 +02:00
Lynne 18af1ea8d1
lavu: bump minor and add APIchanges entry for the lavu/tx changes 2021-04-24 17:17:28 +02:00
Lynne 0072a42388
lavu/tx: add full-sized iMDCT transform flag 2021-04-24 17:17:27 +02:00
Lynne aa6c757d50
lavu/tx: add unaligned flag to the API 2021-04-24 17:17:26 +02:00
Lynne 8c55c82583
lavu/tx: add a 9-point FFT and (i)MDCT 2021-04-24 17:17:25 +02:00
Lynne bd9ea917a3
lavu/tx: add a 7-point FFT and (i)MDCT 2021-04-24 17:17:23 +02:00
Lynne 89da62f2fc
lavu/tx: refactor power-of-two FFT
This commit refactors the power-of-two FFT, making it faster and
halving the size of all tables, making the code much smaller on
all systems.
This removes the big/small pass split, because on modern systems
the "big" pass is always faster, and even on older machines there
is no measurable speed difference.
2021-04-24 17:17:20 +02:00
Lynne aa910a7ecd
lavu/tx: minor code style improvements and additional comments 2021-04-24 17:17:15 +02:00
Andreas Rheinhardt 7368e5537d avutil/cpu: Schedule deprecated functions for removal
av_set_cpu_flags_mask() has been deprecated in the commit which merged
it: 6df42f98746be06c883ce683563e07c9a2af983f; av_parse_cpu_flags() has
been deprecated in 4b529edff8.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-04-19 14:34:19 +02:00
Andreas Rheinhardt f3c197b129 Include attributes.h directly
Some files currently rely on libavutil/cpu.h to include it for them;
yet said file won't use include it any more after the currently
deprecated functions are removed, so include attributes.h directly.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-04-19 14:34:10 +02:00
Brad Smith c8fb68ec52 avutil/cpu: Use HW_NCPUONLINE to detect # of online CPUs with OpenBSD
Signed-off-by: Brad Smith <brad@comstyle.com>
Signed-off-by: Marton Balint <cus@passwd.hu>
2021-04-18 22:51:14 +02:00
Guo, Yejun 0c7aef84a0 lavu/detection_bbox.h: use AV_NUM_DETECTION_BBOX_CLASSIFY to replace AV_NUM_BBOX_CLASSIFY 2021-04-18 10:41:17 +08:00
Lynne 6c65e49990
lavu/detection_bboxes: add missing space
Could at least maintainers with push access follow the code styles
we have?
2021-04-17 13:14:47 +02:00
Guo, Yejun f1bf465aa0 lavu: add side data AV_FRAME_DATA_DETECTION_BBOXES for object detection/classification 2021-04-17 17:27:02 +08:00
Andreas Rheinhardt 416cc012f6 avutil/frame: Return 0 on success in av_frame_ref()
av_frame_copy() is allowed to return values >= 0 on success, whereas
the documentation of av_frame_ref() states that the return value is 0 on
success. Ergo the latter must not just return the former's return value.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-04-05 18:36:51 +02:00
Andreas Rheinhardt a2a38b1606 avutil/cpu: Fix race condition in av_cpu_count()
av_cpu_count() intends to emit a debug message containing the number of
logical cores when called the first time. The check currently works with
a static volatile int; yet this does not help at all in case of
concurrent accesses by multiple threads. So replace this with an
atomic_int.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2021-04-02 19:12:43 +02:00
Andreas Rheinhardt b7565b65b8 avutil/pixdesc: Fix 1 << 32
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2021-04-01 14:52:18 +02:00
Andreas Rheinhardt bbf8431b1b avutil/base64: Fix undefined NULL + 0
Affected the base64 FATE test.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2021-04-01 14:47:00 +02:00
Michael Niedermayer 522a5259e9 avutil/common: Add FF_PTR_ADD()
Suggested-by: Andreas Rheinhardt
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-03-31 23:09:35 +02:00
Andreas Rheinhardt a77beea6c8 avutil/frame: Deprecate av_get_colorspace_name()
Contrary to av_color_space_name() it doesn't even support newer
colorspaces.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2021-03-24 08:00:57 +01:00
Michael Niedermayer c361fa9e21 Bump minor versions after release branch
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-03-20 01:02:11 +01:00
Michael Niedermayer c67d2a2875 Bump Versions before release/4.4 branch
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-03-20 01:01:12 +01:00
Andreas Rheinhardt e8c0bca6bd avutil/adler32: Switch av_adler32_update() to size_t on bump
av_adler32_update() is used by av_hash_update() which will be switched
to size_t at the next bump. So it also has to be made to use size_t.
This is also necessary for framecrcenc.c, because the size of side data
will become a size_t, too.

Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2021-03-19 04:19:53 +01:00
Andreas Rheinhardt 520754476d avutil/avstring: Check for memory allocation error in av_escape
av_bprint_finalize() can still fail even when it has been checked that
the AVBPrint is currently complete: Namely if the string was so short
that it fit into the AVBPrint's internal buffer.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2021-03-15 06:45:07 +01:00
Andreas Rheinhardt c2649d5196 avutil/avstring: Limit string length in av_escape to range of int
Otherwise the caller can't distinguish the return value from an error.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2021-03-15 06:44:03 +01:00
Michael Niedermayer c94875471e avutil/timecode: Avoid fps overflow
Fixes: Integer overflow and division by 0
Fixes: poc-202102-div.mov

Found-by: 1vanChen of NSFOCUS Security Team
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-03-14 23:29:51 +01:00
Christopher Degawa e93403b75f libavutil/timer: Fix clang reserved-user-defined-literal
clang errors when compiling with C++11 about needing spaces between
literal and identifier

Signed-off-by: Christopher Degawa <ccom@randomderp.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2021-03-13 11:47:43 -03:00
Andreas Rheinhardt 1ad628d2c2 avutil/buffer_internal: Include internal for buffer_size_t
Fixes checkheaders.

Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2021-03-11 14:07:15 +01:00
James Almer e36eb94048 avutil: use the buffer_size_t typedef where required
Signed-off-by: James Almer <jamrial@gmail.com>
2021-03-10 20:26:37 -03:00
James Almer dbd47b7990 avutil/frame: change av_frame_new_side_data() size parameter type to size_t
Signed-off-by: James Almer <jamrial@gmail.com>
2021-03-10 20:26:36 -03:00
James Almer 14040a1d91 avutil/buffer: change public function and struct size parameter types to size_t
Signed-off-by: James Almer <jamrial@gmail.com>
2021-03-10 20:26:36 -03:00
Stefano Sabatini 0f6bf94eb7 avutil/{avstring,bprint}: add XML escaping from ffprobe to avutil
Base escaping only escapes values required for base character data
according to part 2.4 of XML, and if additional flags are added
single and double quotes can additionally be escaped in order
to handle single and double quoted attributes.

Co-authored-by: Jan Ekström <jan.ekstrom@24i.com>
Signed-off-by: Jan Ekström <jan.ekstrom@24i.com>
2021-03-05 19:45:00 +02:00
Michael Niedermayer 5d7f17e885 avutil/parseutils: Check sign in av_parse_time()
Fixes: signed integer overflow: -9223372053736 * 1000000 cannot be represented in type 'long'
Fixes: 26910/clusterfuzz-testcase-minimized-ffmpeg_dem_CONCAT_fuzzer-6607924558430208

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-03-03 16:54:20 +01:00
Andreas Rheinhardt 514ee8770d avutil/spherical: Use av_strstart instead of strncmp
It makes the intent clearer and avoids calculating the length
separately.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2021-02-28 17:14:21 +01:00
Andreas Rheinhardt 13e98f1daf avutil/stereo3d: Use av_strstart instead of strncmp
It makes the intent clearer and avoids calculating the length
separately.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2021-02-28 17:14:21 +01:00
Andreas Rheinhardt 114603d0b9 avutil/pixdesc: Use av_strstart where appropriate
It makes the intent clearer and allows to avoid calculating the strlen
separately.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2021-02-28 17:14:21 +01:00
Lynne e20a39a375
lavu/tx: do not invert permutes on MDCTs 2021-02-27 05:01:17 +01:00
Lynne 8e94b7cff0
lavu/tx: invert permutation lookups
out[lut[i]] = in[i] lookups were 4.04 times(!) slower than
out[i] = in[lut[i]] lookups for an out-of-place FFT of length 4096.

The permutes remain unchanged for anything but out-of-place monolithic
FFT, as those benefit quite a lot from the current order (it means
there's only 1 lookup necessary to add to an offset, rather than
a full gather).

The code was based around non-power-of-two FFTs, so this wasn't
benchmarked early on.
2021-02-27 04:21:05 +01:00
Lynne 9ddaf0c9f0
lavu/tx: simplify in-place permute search function 2021-02-27 04:21:03 +01:00
Lynne 10341743d2
lavu/tx: require output argument to match input for inplace transforms
This simplifies some assembly code by a lot, by either saving a branch
or saving an entire duplicated function.
2021-02-26 05:42:24 +01:00
James Almer 45a2902976 avutil/buffer: free all pooled buffers immediately after uninitializing the pool
No buffer will be fetched from the pool after it's uninitialized, so there's
no benefit from waiting until every single buffer has been returned to it
before freeing them all.
This should free some memory in certain scenarios, which can be beneficial in
low memory systems.

Based on a patch by Jonas Karlman.

Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: James Almer <jamrial@gmail.com>
2021-02-24 10:45:30 -03:00
Andreas Rheinhardt c7c7aa85b7 avutil/tx: Fix declaration after statement
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2021-02-22 10:03:32 +01:00
Martin Storsjö ee040a7fc2 arm/aarch64: Use mach_absolute_time as timer on apple platforms
This is much less precise than the cycle counter register, but
the cycle counter register is not available on apple platforms
(and on linux, it requires a kernel module for allowing user mode
access).

Signed-off-by: Martin Storsjö <martin@martin.st>
2021-02-21 22:41:34 +02:00
Lynne 5ca40d6d94
lavu/tx: support in-place FFT transforms
This commit adds support for in-place FFT transforms. Since our
internal transforms were all in-place anyway, this only changes
the permutation on the input.

Unfortunately, research papers were of no help here. All focused
on dry hardware implementations, where permutes are free, or on
software implementations where binary bloat is of no concern so
storing dozen times the transforms for each permutation and version
is not considered bad practice.
Still, for a pure C implementation, it's only around 28% slower
than the multi-megabyte FFTW3 in unaligned mode.

Unlike a closed permutation like with PFA, split-radix FFT bit-reversals
contain multiple NOPs, multiple simple swaps, and a few chained swaps,
so regular single-loop single-state permute loops were not possible.
Instead, we filter out parts of the input indices which are redundant.
This allows for a single branch, and with some clever AVX512 asm,
could possibly be SIMD'd without refactoring.

The inplace_idx array is guaranteed to never be larger than the
revtab array, and in practice only requires around log2(len) entries.

The power-of-two MDCTs can be done in-place as well. And it's
possible to eliminate a copy in the compound MDCTs too, however
it'll be slower than doing them out of place, and we'd need to dirty
the input array.
2021-02-21 17:05:16 +01:00
Lynne aa34e99f88
lavu/tx: space out enum AVTXType values with newlines
Makes separation clearer.
2021-02-21 17:05:04 +01:00
Andreas Rheinhardt c9d9c60746 avutil/video_enc_params: Check for truncation before creating buffer
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2021-02-19 07:45:39 +01:00
Andreas Rheinhardt 39df279c74 avutil/video_enc_params: Combine overflow checks
This patch also fixes a -Wtautological-constant-out-of-range-compare
warning from Clang and a -Wtype-limits warning from GCC on systems
where size_t is 64bits and unsigned 32bits. The reason for this seems
to be that variable (whose value derives from sizeof() and can therefore
be known at compile-time) is used instead of using sizeof() directly in
the comparison.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2021-02-19 07:45:39 +01:00
Andreas Rheinhardt bd50e715a9 avutil/common: Move everything inside inclusion guards
libavutil/common.h is a public header that provides generic math
functions whereas libavutil/intmath.h is a private header that contains
plattform-specific optimized versions of said math functions. common.h
includes intmath.h (when building the FFmpeg libraries) so that the
optimized versions are used for them.

This interdependency sometimes causes trouble: intmath.h once contained
an inlined ff_sqrt function that relied upon av_log2_16bit. In case there
was no optimized logarithm available on this plattform, intmath.h needed
to include common.h to get the generic implementation and this has been
done after the optimized versions (if any) have been provided so that
common.h used the optimized versions; it also needed to be done before
ff_sqrt. Yet when intmath.h was included from common.h and if an ordinary
inclusion guard was used by common.h, the #include "common.h" in intmath.h
was a no-op and therefore av_log2_16bit was still unknown at the end of
intmath.h (and also in ff_sqrt) if no optimized version was available.

Before a955b59658 this was solved by
duplicating the #ifndef av_log2_16bit check after the inclusion of
common.h in intmath.h; said commit instead moved these checks to the
end of common.h, outside the inclusion guards and made common.h include
itself to get these unguarded defines. This is still the current
state of affairs.

Yet this is unnecessary since 9734b8ba56
as said commit removed ff_sqrt as well as the #include "common.h" from
intmath.h. Therefore this commit moves everything inside the inclusion
guards and makes common.h not include itself.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2021-02-11 09:07:10 +01:00