ffmpeg/libavcodec/x86
Christophe Gisquet 9fa056ba75 pngdsp x86: use unaligned access
For test images manually generated to contain only up prediction,
timing results:
         8380x3032    255x185
before:   138635       1992
after:    139232       1996

When instead jumping to the aligned or unaligned version depending on the alignment:
8380x3032: 138767

A 0.5% speed improvement for gigantic images is not worth the code
duplication.

Fixes ticket #4148

Signed-off-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Tested-by: Benoit Fouet <benoit.fouet@free.fr>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-12-03 11:56:22 +01:00
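The commit message above concerns PNG "up" prediction, where each reconstructed byte is the filtered byte plus the byte directly above it, modulo 256. The following is a minimal, portable C sketch of that operation done with unaligned multi-byte loads and stores. It is not the assembly in pngdsp.asm. The function names, the `memcpy`-based helpers, and the 64-bit word size are assumptions chosen for illustration only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar reference: dst[i] = src1[i] + src2[i] (mod 256), as used when
 * undoing the PNG "up" filter (filtered row plus the row above). */
static void add_bytes_l2_ref(uint8_t *dst, const uint8_t *src1,
                             const uint8_t *src2, size_t w)
{
    for (size_t i = 0; i < w; i++)
        dst[i] = (uint8_t)(src1[i] + src2[i]);
}

/* Hypothetical helpers: memcpy makes the access valid for any pointer
 * alignment; compilers typically lower it to one unaligned move. */
static uint64_t load64u(const uint8_t *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

static void store64u(uint8_t *p, uint64_t v)
{
    memcpy(p, &v, sizeof v);
}

/* Eight bytes per iteration with no alignment requirement on any pointer.
 * The low 7 bits of each byte are added directly (no carry can leave the
 * byte), then the top bit is patched in with XOR. */
static void add_bytes_l2_unaligned(uint8_t *dst, const uint8_t *src1,
                                   const uint8_t *src2, size_t w)
{
    const uint64_t pb_7f = 0x7f7f7f7f7f7f7f7fULL;
    const uint64_t pb_80 = 0x8080808080808080ULL;
    size_t i = 0;

    for (; i + 8 <= w; i += 8) {
        uint64_t a = load64u(src1 + i);
        uint64_t b = load64u(src2 + i);
        store64u(dst + i, ((a & pb_7f) + (b & pb_7f)) ^ ((a ^ b) & pb_80));
    }
    /* Scalar tail for the remaining 0..7 bytes. */
    for (; i < w; i++)
        dst[i] = (uint8_t)(src1[i] + src2[i]);
}
```

A single code path like this accepts any pointer alignment, which is the trade-off the commit message describes: one routine instead of separate aligned and unaligned versions selected at run time.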
Makefile v210enc: Add SIMD optimised 8-bit and 10-bit encoders 2014-11-26 20:30:47 +01:00
ac3dsp.asm
ac3dsp_init.c
audiodsp.asm
audiodsp_init.c
blockdsp.asm
blockdsp_init.c
bswapdsp.asm
bswapdsp_init.c
cabac.h
cavsdsp.c x86/cavsdsp: fix buffer alignment in cavs_idct8_add_mmx() 2014-09-25 16:00:16 -03:00
constants.c x86/vp9: add AVX and AVX2 MC 2014-09-22 22:35:03 -03:00
constants.h
dcadsp.asm
dcadsp_init.c
dct-test.c Merge commit 'dcb7c868ec7af7d3a138b3254ef2e08f074d8ec5' 2014-08-27 21:09:30 +02:00
dct32.asm
dct_init.c
deinterlace.asm
dirac_dwt.c
dirac_dwt.h
diracdsp_mmx.c
diracdsp_mmx.h
diracdsp_yasm.asm
dnxhdenc.asm
dnxhdenc_init.c
dwt_yasm.asm
fdct.c
fdct.h
fdctdsp_init.c
fft.asm
fft.h
fft_init.c
flac_dsp_gpl.asm
flacdsp.asm x86/flacdsp: add SSE2 and AVX decorrelate functions 2014-11-13 13:47:55 -03:00
flacdsp_init.c x86/flacdsp: add SSE2 and AVX decorrelate functions 2014-11-13 13:47:55 -03:00
fmtconvert.asm avcodec/x86/fmtconvert: Fix operand size in ff_int32_to_float_fmul_array8_sse* 2014-09-28 19:04:06 +02:00
fmtconvert_init.c x86/fmtconvert: add ff_int32_to_float_fmul_array8_{sse,sse2} 2014-09-26 20:48:40 -03:00
fpel.asm
fpel.h
h263_loopfilter.asm
h263dsp_init.c
h264_chromamc.asm
h264_chromamc_10bit.asm
h264_deblock.asm
h264_deblock_10bit.asm
h264_i386.h h264_i386: Optimize decode_significance_8x8_x86 for 64 bit. 2014-11-22 14:06:48 +01:00
h264_idct.asm
h264_idct_10bit.asm
h264_intrapred.asm Merge commit '2d91abade29e43bb45c881d45909b8ee77e904e2' 2014-10-08 11:48:58 +02:00
h264_intrapred_10bit.asm
h264_intrapred_init.c
h264_qpel.c
h264_qpel_8bit.asm
h264_qpel_10bit.asm
h264_weight.asm
h264_weight_10bit.asm
h264chroma_init.c
h264dsp_init.c
hevc_deblock.asm
hevc_idct.asm
hevc_mc.asm x86/hevc: get rid of packusdw for ssse3 compatibility 2014-10-04 21:14:15 +02:00
hevc_res_add.asm x86/hevc_res_add: add missing guards to hevc_transform_add32_8_avx2 2014-09-04 23:34:01 -03:00
hevcdsp.h x86/hevc_res_add: add ff_hevc_transform_add32_8_avx2 2014-09-04 20:21:29 -03:00
hevcdsp_init.c x86/hevc_res_add: add ff_hevc_transform_add32_8_avx2 2014-09-04 20:21:29 -03:00
hpeldsp.asm
hpeldsp.h
hpeldsp_init.c
hpeldsp_rnd_template.c x86/hpeldsp: fix loop in {avg,avg_no_rnd}_pixels16_x2_mmx 2014-10-23 13:11:05 -03:00
huffyuvdsp.asm
huffyuvdsp_init.c
huffyuvencdsp_mmx.c
idctdsp.asm x86/idctdsp: port {put,add}_pixels_clamped to yasm 2014-09-24 21:52:13 -03:00
idctdsp.h lavc/x86/idctdsp.h: Fix make checkheaders. 2014-09-25 22:18:25 +02:00
idctdsp_init.c x86/idctdsp: port {put,add}_pixels_clamped to yasm 2014-09-24 21:52:13 -03:00
imdct36.asm
inline_asm.h
lossless_audiodsp.asm avcodec/x86/lossless_audiodsp: fix fallback code for 32bit 2014-11-22 21:08:38 +01:00
lossless_audiodsp_init.c
lossless_videodsp.asm
lossless_videodsp_init.c
lpc.c
mathops.h
me_cmp.asm Merge commit '9c12c6ff9539e926df0b2a2299e915ae71872600' 2014-11-24 12:13:00 +01:00
me_cmp_init.c Merge commit '9c12c6ff9539e926df0b2a2299e915ae71872600' 2014-11-24 12:13:00 +01:00
mlpdsp.asm x86/mlpdec: add ff_mlp_rematrix_channel_{sse4,avx2} 2014-10-02 22:11:55 -03:00
mlpdsp_init.c x86/mlpdec: add ff_mlp_rematrix_channel_{sse4,avx2} 2014-10-02 22:11:55 -03:00
mpegaudiodsp.c
mpegvideo.c
mpegvideodsp.c
mpegvideoenc.c
mpegvideoenc_qns_template.c
mpegvideoenc_template.c
mpegvideoencdsp.asm x86/mpegvideoencdsp: improve ff_pix_sum16_sse2 2014-10-01 13:07:22 -03:00
mpegvideoencdsp_init.c x86/mpegvideoencdsp: improve ff_pix_sum16_sse2 2014-10-01 13:07:22 -03:00
pixblockdsp.asm
pixblockdsp_init.c
pngdsp.asm pngdsp x86: use unaligned access 2014-12-03 11:56:22 +01:00
pngdsp_init.c
proresdsp.asm
proresdsp_init.c
qpel.asm
qpeldsp.asm
qpeldsp_init.c
rnd_template.c
rv34dsp.asm
rv34dsp_init.c
rv40dsp.asm
rv40dsp_init.c
sbrdsp.asm
sbrdsp_init.c
simple_idct.c Merge commit '95c0cec03acec0a80cc1c7db48f3b2355d9e767b' 2014-09-03 03:19:40 +02:00
simple_idct.h
snowdsp.c
svq1enc.asm avcodec/svq1enc: align buffer used by simd functions 2014-09-25 16:00:20 -03:00
svq1enc_init.c
ttadsp.asm
ttadsp_init.c
v210-init.c
v210.asm lavc/x86/v210: give cpuflag to INIT macro 2014-09-05 00:35:07 +02:00
v210enc.asm v210enc: Add SIMD optimised 8-bit and 10-bit encoders 2014-11-26 20:30:47 +01:00
v210enc_init.c v210enc: Add SIMD optimised 8-bit and 10-bit encoders 2014-11-26 20:30:47 +01:00
vc1dsp.asm
vc1dsp.h
vc1dsp_init.c
vc1dsp_mmx.c
videodsp.asm x86/videodsp: add ff_emu_edge_{hfix,hvar}_avx2 2014-09-24 16:12:55 -03:00
videodsp_init.c x86/videodsp: add ff_emu_edge_{hfix,hvar}_avx2 2014-09-24 16:12:55 -03:00
vorbisdsp.asm
vorbisdsp_init.c
vp3dsp.asm
vp3dsp_init.c
vp6dsp.asm
vp6dsp_init.c
vp8dsp.asm
vp8dsp_init.c
vp8dsp_loopfilter.asm
vp9dsp_init.c x86/vp9: add AVX and AVX2 MC 2014-09-22 22:35:03 -03:00
vp9intrapred.asm
vp9itxfm.asm
vp9lpf.asm avcodec/x86/vp9lpf: Always include x86util.asm 2014-09-17 23:37:46 +02:00
vp9mc.asm x86/vp9: add AVX and AVX2 MC 2014-09-22 22:35:03 -03:00
vp56_arith.h
w64xmmtest.c
xvididct.h Merge commit 'dcb7c868ec7af7d3a138b3254ef2e08f074d8ec5' 2014-08-27 21:09:30 +02:00
xvididct_init.c Merge commit '7a1d6ddd2c6b2d66fbc1afa584cf506930a26453' 2014-09-03 04:09:38 +02:00
xvididct_mmx.c avcodec/x86: use function pointers for {put,add}_pixels_clamped 2014-09-24 18:52:32 -03:00
xvididct_sse2.c avcodec/x86: use function pointers for {put,add}_pixels_clamped 2014-09-24 18:52:32 -03:00