ffmpeg

Commit Graph

Author	SHA1	Message	Date
Rémi Denis-Courmont	b3825bbe45	riscv: test for assembler support This should fix the build on LLVM 16 and earlier, at the cost of turning all non-RVV optimisations off.	2023-12-08 17:21:09 +02:00
Alfred Wingate	e5ce473040	swscale/x86/rgb_2_rgb: Add opaque pointer to missed definitions of ff_nv12ToUV Opaque parameters were previously added to the original definition of ff_nv12ToUV, leading to gcc noticing a type mismatch with -Wlto-type-mismatch. `f2de911818` https://bugs.gentoo.org/907484 Signed-off-by: Alfred Wingate <parona@protonmail.com> Signed-off-by: Anton Khirnov <anton@khirnov.net>	2023-12-02 11:22:46 +01:00
xufuji456	cc86343b96	lavc/hevcdsp_qpel_neon: using movi.16b instead of movi.2d Building iOS platform with arm64, the compiler has a warning: "instruction movi.2d with immediate #0 may not function correctly on this CPU, converting to movi.16b" Signed-off-by: xufuji456 <839789740@qq.com> Signed-off-by: Martin Storsjö <martin@martin.st>	2023-11-28 15:54:49 +02:00
Rémi Denis-Courmont	6d60cc7baf	sws/rgb2rgb: fix unaligned accesses in R-V V YUYV to I422p In my personal opinion, we should not need to support unaligned YUY2 pixel maps. They should always be aligned to at least 32 bits, and the current code assumes just 16 bits. However checkasm does test for unaligned input bitmaps. QEMU accepts it, but real hardware does not. In this particular case, we can at the same time improve performance and handle unaligned inputs, so do just that. uyvytoyuv422_c: 104379.0 uyvytoyuv422_c: 104060.0 uyvytoyuv422_rvv_i32: 25284.0 (before) uyvytoyuv422_rvv_i32: 19303.2 (after)	2023-11-13 18:34:29 +02:00
Rémi Denis-Courmont	5b8b5ec9c5	sws/rgb2rgb: rework R-V V YUY2 to 4:2:2 planar This saves three scratch registers and three instructions per line. The performance gains are mostly negligible. The main point is to free up registers for further rework.	2023-11-13 18:34:29 +02:00
Niklas Haas	736284e7b9	swscale/yuv2rgb: fix sws_getCoefficients for colorspace=0 The documentation states that invalid entries default to SWS_CS_DEFAULT. A value of 0 is not a valid SWS_CS_*, yet the code incorrectly hard-codes it to BT.709 coefficients instead of SWS_CS_DEFAULT.	2023-11-09 12:53:35 +01:00
Niklas Haas	d043e5c54c	swscale: don't omit ff_sws_init_range_convert for high-bit This was a complete hack seemingly designed to work around a different bug, which was fixed in the previous commit. As such, there is no more reason not to do this, as it simply breaks changing color range in sws_setColorspaceDetails for no reason.	2023-11-09 12:53:35 +01:00
Niklas Haas	cedf589c09	swscale: fix sws_setColorspaceDetails after sws_init_context More commonly, this fixes the case of sws_setColorspaceDetails after sws_getContext, since the latter implies sws_init_context. The problem here is that sws_init_context sets up the range conversion and fast path tables based on the values of srcRange/dstRange at init time. This may result in locking in a "wrong" path (either using the unscaled fast path when range conversion is later required, or using the scaled slow path when range conversion is no longer required). There are two ways out: 1. Always initialize range conversion and unscaled converters, even if they will be unused, and extend the runtime check. 2. Re-do initialization if the values change after sws_setColorspaceDetails. I opted for approach 1 because it was simpler and easier to reason about. Reword the av_log message to make it clear that this special converter is not necessarily used, depending on whether or not there is range conversion or YUV matrix conversion going on.	2023-11-09 12:53:35 +01:00
Michael Niedermayer	47e784f881	Bump versions after 6.1 Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2023-10-29 16:19:14 +01:00
Michael Niedermayer	9d3a7d30c4	Bump versions prior to 6.1 Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2023-10-29 15:34:05 +01:00
Martin Storsjö	a76b409dd0	aarch64: Reindent all assembly to 8/24 column indentation libavcodec/aarch64/vc1dsp_neon.S is skipped here, as it intentionally uses a layered indentation style to visually show how different unrolled/interleaved phases fit together. Signed-off-by: Martin Storsjö <martin@martin.st>	2023-10-21 23:25:54 +03:00
Martin Storsjö	93cda5a9c2	aarch64: Lowercase UXTW/SXTW and similar flags Signed-off-by: Martin Storsjö <martin@martin.st>	2023-10-21 23:25:23 +03:00
Martin Storsjö	184103b310	aarch64: Consistently use lowercase for vector element specifiers Signed-off-by: Martin Storsjö <martin@martin.st>	2023-10-21 23:25:18 +03:00
Rémi Denis-Courmont	19baf4e009	swscale/rgb2rgb: R-V V deinterleaveBytes	2023-10-03 22:53:20 +03:00
Rémi Denis-Courmont	ede3215115	swscale/rgb2rgb: fix extra iteration in R-V V interleave There was an additional iteration doing nothing for each line, due to checking the selected vector length instead of the available vector length.	2023-10-03 22:53:20 +03:00
Rémi Denis-Courmont	d14130aea3	swscale/rgb2rgb: unroll R-V V interleave_bytes	2023-10-03 20:48:47 +03:00
Rémi Denis-Courmont	6269c4a440	swscale/rgb2rgb: unroll RISC-V V uyvytoyuv422	2023-10-03 20:48:39 +03:00
Rémi Denis-Courmont	e50f8e861b	swscale/rgb2rgb: avoid S-regs in RISC-V V uyvytoyuv422 We can make do with callee-clobbered registers only now. As an added bonus, this makes the code XLEN-independent.	2023-10-03 20:48:39 +03:00
Rémi Denis-Courmont	be37a2e364	swscale/rgb2rgb: rework RISC-V V uyvytoyuv422 This avoids using relatively slow register strides.	2023-10-03 20:48:39 +03:00
Rémi Denis-Courmont	1a4bd76ea5	swscale/rgb2rgb: remove R-V V shuffle_bytes_3012 This is slower than the Zbb version on real hardware due to register strides. Proper support for vector byte-swap requires the Zvbb extension, but it's much too early for me to worry about it.	2023-10-02 22:28:38 +03:00
Rémi Denis-Courmont	c4a144c29d	swscale/rgb2rgb: add R-V Zbb shuffle_bytes_3210	2023-10-02 22:28:25 +03:00
Paul B Mahol	29b673bdcf	swscale: add GBRAP14 format support	2023-09-28 19:37:58 +02:00
Andreas Rheinhardt	f8503b4c33	avutil/internal: Don't auto-include emms.h Instead include emms.h wherever it is needed. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2023-09-04 11:04:45 +02:00
L. E. Segovia	ddc1cd5cdd	configure: Set WIN32_LEAN_AND_MEAN at configure time Including winsock2.h or windows.h without WIN32_LEAN_AND_MEAN causes bzlib.h to parse as nonsense, due to an instance of #define char small in rpcndr.h. See: https://stackoverflow.com/a/27794577 Signed-off-by: L. E. Segovia <amy@amyspark.me> Signed-off-by: Martin Storsjö <martin@martin.st>	2023-08-14 22:57:28 +03:00
Rémi Denis-Courmont	c2b38619c0	swscale/rgb2rgb2: rework RISC-V V shuffle_bytes_{1230,3012} This avoids strided loads. Before: shuffle_bytes_1230_rvv_i32: 308.7 shuffle_bytes_3012_rvv_i32: 308.7 After: shuffle_bytes_1230_rvv_i32: 46.7 shuffle_bytes_3012_rvv_i32: 46.7	2023-07-21 22:18:02 +03:00
Rémi Denis-Courmont	15982554e6	swscale/rgb2rgb2: rework RISC-V V shuffle_bytes_{0321,2103} This avoids strided loads. Before: shuffle_bytes_0321_rvv_i32: 307.7 shuffle_bytes_2103_rvv_i32: 308.7 After: shuffle_bytes_0321_rvv_i32: 59.7 shuffle_bytes_2103_rvv_i32: 61.5	2023-07-21 22:18:02 +03:00
Rémi Denis-Courmont	d3948e4db5	swscale: inline ff_shuffle_bytes_3210_rvv No functional changes.	2023-07-21 22:18:02 +03:00
Rémi Denis-Courmont	b6585eb04c	lavu: add/use flag for RISC-V Zba extension The code was blindly assuming that Zbb or V implied Zba. While the former is practically always true, the latter broke some QEMU setups, as V was introduced earlier than Zba.	2023-07-19 19:29:35 +03:00
Khem Raj	a7b3c0203f	libswscale/riscv: fix syntax of vsetvli Add missing operand which clang complains about but GCC assumes it to be 'm1' if not specified. Works around build failure with Clang: | src/libswscale/riscv/rgb2rgb_rvv.S:88:25: error: operand must be e[8|16|32|64|128|256|512|1024],m[1|2|4|8|f2|f4|f8],[ta|tu],[ma|mu] | vsetvli t4, t3, e8, ta, ma | ^ Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>	2023-07-13 22:01:24 +03:00
Lynne	b3fb73af6b	swscale: bump minor for implementing support for the new pixfmts	2023-05-29 00:42:02 +02:00
Lynne	934525eae0	lsws: add in/out support for the new 12-bit 2-plane 422 and 444 pixfmts	2023-05-29 00:41:35 +02:00
Jin Bo	cb4ae8baee	swscale/la: Add following builtin optimized functions yuv420_rgb24_lsx yuv420_bgr24_lsx yuv420_rgba32_lsx yuv420_argb32_lsx yuv420_bgra32_lsx yuv420_abgr32_lsx ./configure --disable-lasx ffmpeg -i ~/media/1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -pix_fmt rgb24 -y /dev/null -an before: 184fps after: 207fps Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2023-05-25 21:05:15 +02:00
Lu Wang	4501b1dfd7	swscale/la: Optimize the functions of the swscale series with lsx. ./configure --disable-lasx ffmpeg -i ~/media/1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -s 640x480 -pix_fmt bgra -y /dev/null -an before: 91fps after: 160fps Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2023-05-25 21:05:08 +02:00
Lynne	a62a3930c2	swscale/ppc: remove hScale8To19_vsx Fails checkasm on a Power9 system.	2023-05-20 20:07:18 +02:00
Michael Niedermayer	47ac3e6065	version.h: Bump minor post 6.0 branch Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2023-02-19 18:37:36 +01:00
Michael Niedermayer	62efa096af	version.h: Bump minor for 6.0 branch Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2023-02-19 18:32:07 +01:00
James Almer	5bad485603	Bump major versions of all libraries Signed-off-by: James Almer <jamrial@gmail.com>	2023-02-09 15:35:14 +01:00
Tomas Härdin	a678b0c252	sws/utils.c: Do not uselessly call initFilter() when unscaling	2023-02-08 15:53:55 +01:00
Lynne	bbe95f7353	x86: replace explicit REP_RETs with RETs From x86inc: > On AMD cpus <=K10, an ordinary ret is slow if it immediately follows either > a branch or a branch target. So switch to a 2-byte form of ret in that case. > We can automatically detect "follows a branch", but not a branch target. > (SSSE3 is a sufficient condition to know that your cpu doesn't have this problem.) x86inc can automatically determine whether to use REP_RET rather than RET in most of these cases, so impact is minimal. Additionally, a few REP_RETs were used unnecessarily, despite the return being nowhere near a branch. The only CPUs affected were AMD K10s, made between 2007 and 2011, 16 years ago and 12 years ago, respectively. In the future, everyone involved with x86inc should consider dropping REP_RETs altogether.	2023-02-01 04:23:55 +01:00
Andreas Rheinhardt	1ff9c07fa6	swscale/utils: Fix indentation Forgotten after `c1eb3e7fec`. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-11-24 21:02:57 +01:00
Andreas Rheinhardt	b2d1a25816	swscale/utils: Derive range from YUVJ-pix-fmt only once Currently, it is done once per slice-thread, leading to one warning per slice-thread in case a YUVJ pixel format has been originally used. This also fixes the anomaly that said parameters are only updated for the user-facing context (whose values are retrievable via av_opt_get()) if slice-threading is not in use. Fixes ticket #9860. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-11-24 20:59:03 +01:00
Andreas Rheinhardt	ff39dcb129	swscale/utils: Move functions to avoid forward declarations Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-11-24 20:58:21 +01:00
Andreas Rheinhardt	baccc1c541	swscale/utils: Avoid calling ff_thread_once() unnecessarily Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-11-24 20:58:21 +01:00
Andreas Rheinhardt	8ee0711228	swscale/utils: Don't allocate AVFrames for slice contexts Only the parent context's AVFrames are ever used. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-11-24 20:58:21 +01:00
Andreas Rheinhardt	64ed1d40df	swscale/utils: Factor initializing single slice context out Initializing slice threads currently uses the function (sws_init_context()) that is also used for initializing user-facing contexts with the only difference being that nb_threads is set to one before initializing the slice contexts. Yet sws_init_context() also initializes lots of stuff that is not slice-dependent, i.e. (src|dst)Range. This currently only works because the code sets these fields to the same values for all slice contexts. This is not nice; even worse, it entails that log messages are printed once per slice context (and therefore fill the screen). This commit lays the groundwork to fix this. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-11-24 20:58:21 +01:00
Michael Niedermayer	ba209e3d51	swscale/input: Use more unsigned intermediates Same principle as previous commit, with sufficiently huge rgb2yuv table values this produces wrong results and undefined behavior. The unsigned produces the same incorrect results. That is probably ok as these cases with huge values seem not to occur in any real use case. Fixes: signed integer overflow Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2022-11-20 21:55:06 +01:00
Jeremy Dorfman	ce566281f9	swscale/input: Use unsigned intermediates in rgb64ToUV_c_template Large rgb2yuv tables and high pixel values cause the intermediate int32_t of ru*r + gu*g + bu*b to exceed INT_MAX, which is undefined behavior. This causes libswscale built with LLVM -fsanitize=undefined to assert. Using unsigned integers instead has defined behavior and produces identical results, and makes rgb64ToUV_c_template match rgb64ToY_c_template. Fixes: signed integer overflow Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2022-11-20 21:23:57 +01:00
Andreas Rheinhardt	b616b04704	swscale/utils: Remove obsolete 3DNow reference swscale does not use 3DNow any more since commit `608319a311`. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2022-11-09 17:39:00 +01:00
Michael Niedermayer	b74f89caae	swscale/output: Bias 16bps output calculations to improve non overflowing range for GBRP16/GBRPF32 Fixes: integer overflow Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2022-11-04 22:44:16 +01:00
Michael Niedermayer	0f0afc7fb5	swscale/output: Bias 16bps output calculations to improve non overflowing range Fixes: integer overflow Fixes: ./ffmpeg -f rawvideo -video_size 66x64 -pixel_format yuva420p10le -i ~/videos/overflow_input_w66h64.yuva420p10le -filter_complex "scale=flags=bicubic+full_chroma_int+full_chroma_inp+bitexact+accurate_rnd:in_color_matrix=bt2020:out_color_matrix=bt2020:in_range=full:out_range=full,format=rgba64[out]" -pixel_format rgba64 -map '[out]' -y overflow_w66h64.png Found-by: Drew Dunne <asdunne@google.com> Tested-by: Drew Dunne <asdunne@google.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2022-11-04 22:44:16 +01:00
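
The signed-overflow fixes listed above (ce566281f9, ba209e3d51) rely on a general property of C: signed integer overflow is undefined behavior, while unsigned arithmetic wraps modulo 2^32. The following is a minimal sketch of that idea only, not the libswscale code. The function name `weighted_sum_u32` and its parameters are hypothetical, and the sketch assumes a platform where `int` is 32 bits wide.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of FFmpeg): weighted sum of three 16-bit
 * samples. Each coefficient is converted to uint32_t before multiplying,
 * so every intermediate is unsigned and wraps modulo 2^32 instead of
 * triggering undefined behavior when the sum exceeds INT_MAX.
 */
static uint32_t weighted_sum_u32(int32_t cr, int32_t cg, int32_t cb,
                                 uint16_t r, uint16_t g, uint16_t b)
{
    return (uint32_t)cr * (uint32_t)r
         + (uint32_t)cg * (uint32_t)g
         + (uint32_t)cb * (uint32_t)b;
}
```

Negative coefficients still work because the conversion to `uint32_t` is modular: whenever the mathematically exact sum fits in 32 bits, the low 32 bits of the unsigned result match it.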

2505 Commits