ffmpeg/libswscale
Lauri Kasanen ce92ee4b4f swscale/ppc: VSX-optimize non-full-chroma yuv2rgb_2
./ffmpeg -f lavfi -i yuvtestsrc=duration=1:size=1200x1440 -sws_flags fast_bilinear \
        -s 1200x720 -f null -vframes 100 -pix_fmt $i -nostats \
        -cpuflags 0 -v error -

32-bit mul, power8 only.

~2x speedup:

rgb24
  24431 UNITS in yuv2packed2,   16384 runs,      0 skips
  13783 UNITS in yuv2packed2,   16383 runs,      1 skips
bgr24
  24396 UNITS in yuv2packed2,   16384 runs,      0 skips
  14059 UNITS in yuv2packed2,   16384 runs,      0 skips
rgba
  26815 UNITS in yuv2packed2,   16383 runs,      1 skips
  12797 UNITS in yuv2packed2,   16383 runs,      1 skips
bgra
  27060 UNITS in yuv2packed2,   16384 runs,      0 skips
  13138 UNITS in yuv2packed2,   16384 runs,      0 skips
argb
  26998 UNITS in yuv2packed2,   16384 runs,      0 skips
  12728 UNITS in yuv2packed2,   16381 runs,      3 skips
bgra
  26651 UNITS in yuv2packed2,   16384 runs,      0 skips
  13124 UNITS in yuv2packed2,   16384 runs,      0 skips

This is a low speedup, but the x86 mmx version also gets only ~2x. The mmx version
is also heavily inaccurate, while the vsx version has high accuracy.
2019-04-11 09:08:51 +03:00
..
aarch64 sws/aarch64: add ff_yuv2planeX_8_neon 2016-04-11 16:27:19 +02:00
arm arm: swscale: Only compile the rgb2yuv asm if .dn aliases are supported 2018-03-31 21:54:56 +03:00
ppc swscale/ppc: VSX-optimize non-full-chroma yuv2rgb_2 2019-04-11 09:08:51 +03:00
tests Merge commit '0fd0d4fd0a518e30ff23972828ad7cf7f35cfb9d' 2017-10-30 12:34:40 -03:00
x86 swscale/x86/rgb2rgb.asm : add Ivo Van Poorten name to the top of the file 2018-10-18 21:43:19 +02:00
Makefile Merge commit '92db5083077a8b0f8e1050507671b456fd155125' 2017-05-04 19:59:30 -03:00
alphablend.c avutil: Rename FF_CEIL_COMPAT to AV_CEIL_COMPAT 2016-01-27 16:36:46 +00:00
bayer_template.c
gamma.c
hscale.c avutil: Rename FF_CEIL_COMPAT to AV_CEIL_COMPAT 2016-01-27 16:36:46 +00:00
hscale_fast_bilinear.c
input.c swscale : add support for YUVA444P12 and YUVA422P12 2018-11-24 16:24:47 +01:00
libswscale.v build: Change structure of the linker version script templates 2016-05-29 16:43:11 +02:00
log2_tab.c
options.c swscale/options: Use AV_OPT_TYPE_PIXEL_FMT 2016-11-20 13:00:22 +01:00
output.c swscale: Remove duplicated code 2019-03-27 09:00:06 +02:00
rgb2rgb.c swscale/rgb : move shuffle func shuffle_bytes_1230, shuffle_bytes_3012, shuffle_bytes_3210 in order to add SIMD 2018-03-24 20:22:02 +01:00
rgb2rgb.h swscale/rgb2rgb : cosmetic, move shuffle_bytes func declaration 2018-03-24 20:22:17 +01:00
rgb2rgb_template.c lsws/rgb2rgb_template: Do not compile unneeded shuffle functions on big-endian. 2018-06-10 03:22:59 +02:00
slice.c lsws/slice: Move a misplaced const. 2017-03-08 00:33:21 +01:00
swscale.c swscale/swscale : small cosmetic 2018-08-22 11:36:15 +02:00
swscale.h doxygen: Standardize root-level modules 2016-08-02 22:15:25 -07:00
swscale_internal.h swscale/ppc: Move VSX-using code to its own file 2018-12-04 02:59:07 +01:00
swscale_unscaled.c swscale/swscale_unscaled: Fix chroma slice height 2019-03-28 22:47:32 +01:00
swscaleres.rc
utils.c swscale : add support for YUVA444P12 and YUVA422P12 2018-11-24 16:24:47 +01:00
version.h Bump minor version for master after 4.1 branchpoint 2018-11-02 00:53:07 +01:00
vscale.c swscale: cleanup unused code 2016-03-31 16:36:16 -03:00
yuv2rgb.c swscale/yuv2rgb: Return a more specific error code from ff_yuv2rgb_c_init_tables() 2019-01-01 21:11:47 +01:00