ffmpeg/libavcodec/x86
Roland Scheidegger 82c71913e4 h264: new assembly version of get_cabac for x86_64 with PIC
This adds a hand-optimized assembly version of get_cabac, much like the
existing one, that also works when the table offsets are RIP-relative.
Compared to the non-RIP-relative version, it adds two lea instructions
and needs one extra register.
There is a surprisingly large performance improvement over the C version in
get_cabac itself (more than the generated assembly seems to suggest): I
measured get_cabac as roughly 40% faster on a K8. Overall, however, the
difference is smaller: I measured roughly 5% on a test clip on a K8 and a Core2.
Hopefully it still compiles on 32-bit x86...
Now that only one table is used, there's some chance that even Darwin's as
compiles this (apparently the label arithmetic used previously doesn't work
if it involves symbols defined in a different file; thanks to Ronald S. Bultje
for helping me with this).

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-04-28 20:02:27 +02:00
Makefile
ac3dsp.asm
ac3dsp_mmx.c
cabac.h h264: new assembly version of get_cabac for x86_64 with PIC 2012-04-28 20:02:27 +02:00
cavsdsp_mmx.c
dct32_sse.asm
deinterlace.asm
diracdsp_mmx.c
diracdsp_mmx.h
diracdsp_yasm.asm
dnxhd_mmx.c
dsputil_mmx.c lowres2 support. 2012-04-22 22:26:55 +02:00
dsputil_mmx.h
dsputil_mmx_avg_template.c
dsputil_mmx_qns_template.c
dsputil_mmx_rnd_template.c
dsputil_yasm.asm
dsputilenc_mmx.c
dsputilenc_yasm.asm
dwt.c
dwt.h
dwt_yasm.asm
fdct_mmx.c
fft.c
fft.h
fft_3dn.c
fft_3dn2.c
fft_mmx.asm
fft_sse.c
fmtconvert.asm
fmtconvert_mmx.c
h264_chromamc.asm
h264_chromamc_10bit.asm
h264_deblock.asm
h264_deblock_10bit.asm
h264_i386.h h264: new assembly version of get_cabac for x86_64 with PIC 2012-04-28 20:02:27 +02:00
h264_idct.asm
h264_idct_10bit.asm
h264_intrapred.asm
h264_intrapred_10bit.asm
h264_intrapred_init.c
h264_qpel_10bit.asm
h264_qpel_mmx.c
h264_weight.asm
h264_weight_10bit.asm
h264dsp_mmx.c
idct_mmx.c
idct_mmx_xvid.c
idct_sse2_xvid.c
idct_xvid.h
imdct36_sse.asm
lpc_mmx.c
mathops.h
mlpdsp.c
motion_est_mmx.c
mpegaudiodec_mmx.c
mpegvideo_mmx.c
mpegvideo_mmx_template.c
pngdsp-init.c
pngdsp.asm
proresdsp-init.c
proresdsp.asm
rv34dsp.asm
rv34dsp_init.c
rv40dsp.asm
rv40dsp_init.c
sbrdsp.asm
sbrdsp_init.c
simple_idct_mmx.c
snowdsp_mmx.c
v210-init.c
v210.asm
vc1dsp_mmx.c
vc1dsp_yasm.asm
vp3dsp.asm
vp8dsp-init.c
vp8dsp.asm
vp56_arith.h
vp56dsp.asm
vp56dsp_init.c
w64xmmtest.c