Commit Graph

2 Commits

Author SHA1 Message Date
Ronald S. Bultje 1acd7d594c h264: integrate clear_blocks calls with IDCT.
The non-intra-pcm branch in hl_decode_mb (simple, 8bpp) goes from 700
to 672 cycles, and the complete loop of decode_mb_cabac and hl_decode_mb
(in the decode_slice loop) goes from 1759 to 1733 cycles on the clip
tested (cathedral), i.e. almost 30 cycles per mb faster.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-02-19 16:25:50 +01:00
Ronald S. Bultje 7ff1a4b10f Add add_pixels4/8() to h264dsp, and remove add_pixels4 from dsputil.
These functions are mostly H264-specific (the only other user I can
spot is bink), and this allows us to special-case some functionality
for H264. Also remove the 16-bit-coeff with >8bpp versions (unused)
and merge the duplicate 32-bit-coeff for >8bpp (identical).

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-02-12 02:14:16 +01:00