ffmpeg/libswscale
Lynne bbe95f7353
x86: replace explicit REP_RETs with RETs
From x86inc:
> On AMD cpus <=K10, an ordinary ret is slow if it immediately follows either
> a branch or a branch target. So switch to a 2-byte form of ret in that case.
> We can automatically detect "follows a branch", but not a branch target.
> (SSSE3 is a sufficient condition to know that your cpu doesn't have this problem.)

x86inc can automatically determine whether to use REP_RET rather than
REP in most of these cases, so impact is minimal. Additionally, a few
REP_RETs were used unnecessary, despite the return being nowhere near a
branch.

The only CPUs affected were AMD K10s, made between 2007 and 2011, 16
years ago and 12 years ago, respectively.

In the future, everyone involved with x86inc should consider dropping
REP_RETs altogether.
2023-02-01 04:23:55 +01:00
..
aarch64 sw_scale: Add specializations for hscale 16 to 19 2022-11-01 15:24:58 +02:00
arm
loongarch swscale/la: Add output_lasx.c file. 2022-09-10 22:56:39 +02:00
ppc
riscv sws/rgb2rgb: RISC-V 64-bit V packed YUYV/UYVY to planar 4:2:2 2022-09-30 07:25:44 +02:00
tests
x86 x86: replace explicit REP_RETs with RETs 2023-02-01 04:23:55 +01:00
alphablend.c
bayer_template.c
gamma.c
half2float.c swscale/input: add rgbaf16 input support 2022-08-19 22:09:36 +02:00
hscale_fast_bilinear.c
hscale.c swscale: add opaque parameter to input functions 2022-08-19 22:09:36 +02:00
input.c swscale/input: Use more unsigned intermediates 2022-11-20 21:55:06 +01:00
libswscale.v
log2_tab.c
Makefile swscale/input: add rgbaf16 input support 2022-08-19 22:09:36 +02:00
options.c
output.c swscale/output: Bias 16bps output calculations to improve non overflowing range for GBRP16/GBRPF32 2022-11-04 22:44:16 +01:00
rgb2rgb_template.c
rgb2rgb.c sws/rgb2rgb: RISC-V V shuffle_bytes_xxxx functions 2022-09-30 07:24:09 +02:00
rgb2rgb.h sws/rgb2rgb: RISC-V V shuffle_bytes_xxxx functions 2022-09-30 07:24:09 +02:00
slice.c swscale/input: add rgbaf16 input support 2022-08-19 22:09:36 +02:00
swscale_internal.h swscale/la: Optimize hscale functions with lasx. 2022-09-10 22:56:38 +02:00
swscale_unscaled.c libswscale: force a minimum size of the slide for bayer sources 2022-10-14 12:19:13 +02:00
swscale.c swscale/la: Optimize hscale functions with lasx. 2022-09-10 22:56:38 +02:00
swscale.h swscale: document some missing arguments 2022-10-17 09:56:47 +02:00
swscaleres.rc
utils.c swscale/utils: Fix indentation 2022-11-24 21:02:57 +01:00
version_major.h
version.c
version.h swscale/output: add support for Y210LE and Y212LE 2022-09-10 12:29:12 -07:00
vscale.c
yuv2rgb.c swscale/la: Add yuv2rgb_lasx.c and rgb2rgb_lasx.c files 2022-09-10 22:56:38 +02:00