Commit Graph

5757 Commits

Author SHA1 Message Date
Andreas Rheinhardt 1a7efafd33 avutil/tx: Use proper deallocator
May fix the FATE failures on x64 Windows here:
https://fate.ffmpeg.org/report.cgi?slot=x86_64-msvc17-windows-native&time=20221125130443

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-11-25 15:54:33 +01:00
Lynne a56d7e0ca3
lavu/tx: add DCT-III implementation 2022-11-24 15:58:36 +01:00
Lynne 504b7bec1a
lavu/tx: add DCT-II implementation 2022-11-24 15:58:35 +01:00
Lynne 93c30bd6f0
lavu/tx: clarify stride for RDFT transforms 2022-11-24 15:58:35 +01:00
Lynne 43d285a40f
lavu/tx: fix last coefficient scaling for R2C transforms
This was a typo.
2022-11-24 15:58:35 +01:00
Lynne 8547123f3b
lavu/tx: generalize PFA FFTs
This commit permits any stacking of FFTs of any size.
2022-11-24 15:58:34 +01:00
Lynne 7f019e7758
lavu/tx: add length decomposition function
Rather than using a list of lengths supported, this goes a step beyond
and uses all registered codelets to come up with a good decomposition.
2022-11-24 15:58:34 +01:00
Lynne 87bae6b018
lavu/tx: refactor to explicitly track and convert lookup table order
Necessary for generalizing PFAs.
2022-11-24 15:58:34 +01:00
Lynne 1c8d77a2bf
lavu/tx: refactor and separate codelet list and prio code 2022-11-24 15:58:33 +01:00
Lynne 958b3760b5
lavu/tx: improve transform tree logging
Now prints the actual codelet size used, as well as the number of
allowed factors.
2022-11-24 15:58:33 +01:00
Lynne 6ddd10c3e2
lavu/tx: allow codelets to specify a minimum number of matching factors 2022-11-24 15:58:33 +01:00
Lynne dd77e61182
lavu/tx: add ff_tx_clear_ctx()
This function allows implementations to clean up a context after
successfully initializing subcontexts.
2022-11-24 15:58:32 +01:00
Lynne fab97faf02
x86/tx_float: implement striding in fft_15xM 2022-11-24 15:58:32 +01:00
Lynne 92100eee5b
x86/tx_float_init: properly specify the supported factors of 15xM FFTs
Only powers of two are currently supported.
2022-11-24 15:58:32 +01:00
Lynne cc1df4045e
x86/tx_float: add a standalone 15-point AVX2 transform
Enables its use everywhere else in the framework.
2022-11-24 15:58:31 +01:00
Lynne 877e575b5d
x86/tx_float: optimize and macro out FFT15 2022-11-24 15:58:31 +01:00
Lynne fbe4fd992f
lavu/tx: support output stride in naive transforms
Allows them to be used in general PFAs.
2022-11-24 15:58:31 +01:00
Lynne 68cabf8750
lavu/tx: add fft_inplace_small transforms
This is much faster than the loop.
2022-11-24 15:58:30 +01:00
Lynne d4e39cae2e
lavu/tx: drop requirement of input == output for in-place transforms
No longer necessary.
2022-11-24 15:58:30 +01:00
Lynne fff3e1d848
lavu/tx: support out-of-place transforms in fft_inplace
This makes testing easier, as a unified path can be used for in/out of
place transforms.
2022-11-24 15:58:30 +01:00
Lynne d260796f11
lavu/tx: make C ptwo transforms in+out of place
We assume that _all_ in-place transforms can operate out of place,
which isn't true, because the C ptwo transforms were always in-place (dst).
2022-11-24 15:58:29 +01:00
Lynne 37008dc402
lavu/tx: add naive_small FFT
The same as naive but with precomputed tables. Makes it more useful
for odd-factors we don't support yet.
2022-11-24 15:58:29 +01:00
Lynne e8a9b7b298
lavu/tx: list all odd-length FFT factors as regular codelets
Allows them to be picked just like any other transform.
2022-11-24 15:58:28 +01:00
Lynne 45bd4bf79f
lavu/tx: generalize single-factor transforms
Not that useful, but it gives us fast small odd-length transforms.
2022-11-24 15:58:28 +01:00
Lynne 79f11e2409
lavu/tx: make prime factor transforms truly in-place
They all overwrote in[0] and then used it as a DC.
2022-11-24 15:58:28 +01:00
Haihao Xiang 3dc8bceabe lavu/pixfmt: Update the description for AV_PIX_FMT_QSV
Since D3D11 was introduced for QSV in FFmpeg 5.0, there is an implied
API/ABI change for user-supplied frames [1], hence update the
description for AV_PIX_FMT_QSV.

[1] https://ffmpeg.org/pipermail/ffmpeg-devel/2021-December/290444.html

Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
2022-11-22 13:52:38 +08:00
Zhao Zhili b7a3f16957 avutil/hwcontext: verify hw_frames_ctx in transfer_data_alloc
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2022-11-21 23:57:03 +08:00
Zhao Zhili 2697f23f4e avutil/hwcontext_mediacodec: add ANativeWindow support
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2022-11-21 23:53:27 +08:00
James Almer 84fe53f6e1 avutil/hwcontext_cuda: fix compilation without Vulkan after last commit
Signed-off-by: James Almer <jamrial@gmail.com>
2022-11-12 15:54:53 -03:00
James Almer f4aa5c275f avutil/hwcontext_cuda: fix mixed declarations and code warning
Signed-off-by: James Almer <jamrial@gmail.com>
2022-11-12 15:52:10 -03:00
Andreas Rheinhardt c124981b79 avutil/cast5: Avoid undefined shift of uint32_t by 32 places
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-11-11 12:24:23 +01:00
James Almer 86157f5a25 avutil/tx: use llrintf() to convert a float into a 64 bit integer
Should fix fate failures on Windowx x86 targets, where long is 32 bits.

Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: James Almer <jamrial@gmail.com>
2022-11-08 14:24:49 -03:00
James Almer d5c7970a27 avutil/tx: use a lower log level for the debug messages
The amount of lines printed is too high for the verbose level, and the debug
level is a better fit for their content.

Signed-off-by: James Almer <jamrial@gmail.com>
2022-11-08 14:08:05 -03:00
Marvin Scholz 2508e846a8 avutil/dict: Improve documentation
Mostly consistent formatting and consistently ordering of
warnings/notes to be next to the description.

Additionally group the AV_DICT_* macros.

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2022-11-06 08:26:50 +01:00
Marvin Scholz 3101b8afb3 avutil/dict: Use av_dict_iterate in av_dict_get_string
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2022-11-06 08:26:50 +01:00
Marvin Scholz 3c2050b749 avutil/dict: Use av_dict_iterate in av_dict_copy
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2022-11-06 08:26:50 +01:00
Marvin Scholz 5f7c5a0bd7 avutil/dict: Use av_dict_iterate in av_dict_get
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2022-11-06 08:26:50 +01:00
Marvin Scholz 9dad237928 avutil/dict: Add av_dict_iterate
This is a more explicit iteration API rather than using the "magic"
av_dict_get(d, "", t, AV_DICT_IGNORE_SUFFIX) which is not really
trivial to grasp what it does when casually reading through code.

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2022-11-06 08:26:48 +01:00
James Darnley 0f252dfa95 avutil/tests/cpu: print the avx512icl flag 2022-11-04 19:37:46 +01:00
James Almer 6228ba141d avutil/channel_layout: add a 7.1(top) channel layout
Signed-off-by: James Almer <jamrial@gmail.com>
2022-11-03 19:39:45 -03:00
James Almer 83e918de71 avutil/channel_layout: add a cube channel layout
Signed-off-by: James Almer <jamrial@gmail.com>
2022-10-30 16:18:30 -03:00
Andreas Rheinhardt f8efd890bf avutil/tx_template: Move function pointers to const memory
This can be achieved by moving the AVOnce out of the structure
containing the function pointers; the latter can then be made
const.
This also has the advantage of eliminating padding in the structure
(sizeof(AVOnce) is four here) and allowing the AVOnces to be put
into .bss (dependening upon the implementation).

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-28 09:30:10 +02:00
Andreas Rheinhardt 188216581b avutil/tx_template: Avoid code duplication
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-28 09:30:10 +02:00
Andreas Rheinhardt 2af5f55b2e avutil/tx_template: Don't waste space for inexistent factors
It is possible to avoid the factors array for the power-of-two
tables for which said array is unused by using a different
structure for initialization for power-of-two tables than for
non-power-of-two-tables. This saves 3*15*16B from .data.

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-28 09:29:41 +02:00
Andreas Rheinhardt d7c3e52fbf avutil/integer: Use '|' instead of '+' where it is more natural
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-24 20:11:20 +02:00
Andreas Rheinhardt 9a6cdd1ba3 avutil/integer: Fix undefined left shifts of negative numbers
Affected the integers FATE-test.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-24 18:04:24 +02:00
Andreas Rheinhardt f49375f28f avutil/aes: Don't use out-of-bounds index
Up until now, av_aes_init() uses a->round_key[0].u8 + t
as dst of memcpy where it is intended for t to greater
than 16 (u8 is an uint8_t[16]); given that round_key itself
is an array, it is actually intended for the dst to be
in a latter round_key member. To do this properly,
just cast a->round_key to unsigned char*.

This fixes the srtp, aes, aes_ctr, mov-3elist-encrypted,
mov-frag-encrypted and mov-tenc-only-encrypted
FATE-tests with (Clang-)UBSan.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-24 16:28:14 +02:00
Andreas Rheinhardt 73930e4f93 avutil/aes: Don't use misaligned pointers
The AES code uses av_aes_block, a union consisting of
uint64_t[2], uint32_t[4], uint8_t[4][4] and uint8_t[16].
subshift() performs byte-wise manipulations of two av_aes_blocks,
but when encrypting, it does so with a shift of two bytes;
more precisely, it uses
"av_aes_block *s1 = (av_aes_block *) (s0[0].u8 - s)"
and lateron uses the uint8_t[16] member to access s0.
Yet av_aes_block requires to be suitably aligned for
the uint64_t[2] member, which s0[0].u8 - 2 is certainly
not. This is in violation of 6.3.2.3 (7) of C11. UBSan
reports this in the aes_ctr, mov-3elist-encrypted,
mov-frag-encrypted, mov-tenc-only-encrypted and srtp
tests.
Furthermore, there is another issue here: The pointer points
outside of s0; this works, because all the accesses lateron
use an index >= 3. (Clang-)UBSan reports this as
"runtime error: index -2 out of bounds for type 'uint8_t[16]'".

This commit fixes both of these issues: The latter issue
is fixed by applying an offset of "+ 3" during the cast
and subtracting this from the indices used lateron.
The former issue is solved by not casting to av_aes_block*
at all; instead simply cast to unsigned char*.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-24 16:28:14 +02:00
Anton Khirnov f66e794672 lavu/thread: add an internal function for setting thread name
Linux-only for now.
2022-10-24 02:00:31 +02:00
Carl Eugen Hoyos 882a17068f lavu/hwcontext_vaapi: Fix type specifier for uintptr_t
Fixes a format specifier warning on x86_32 Linux.
Fixes part of ticket #9986.
2022-10-23 20:51:42 +02:00