Commit Graph

68 Commits

Author SHA1 Message Date
Michael Niedermayer 2b3649f656 Fix compilation with -O0.
Originally committed as revision 21308 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-18 23:41:12 +00:00
Michael Niedermayer bffe82f504 Rather call filter_mb_mbaff_edge*v() more often than do extra calculations
in the innerst loop. ~150 cpu cycles faster

Originally committed as revision 21299 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-18 21:22:09 +00:00
Michael Niedermayer 0fe674cb4a Use h->slice_num where possible.
Originally committed as revision 21292 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-18 20:13:53 +00:00
Michael Niedermayer bce6a1e7c7 Enable filter_mb_fast for CAVLC P slices.
Originally committed as revision 21291 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-18 19:45:56 +00:00
Michael Niedermayer 42ebca8551 PAFF CABAC P slices seem to work as well, so enable them for ff_h264_filter_mb_fast() too.
Originally committed as revision 21289 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-18 16:29:16 +00:00
Michael Niedermayer a8f4921595 Reenable filter_mb_fast for I slices and progressive CABAC P slices.
Originally committed as revision 21288 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-18 16:16:22 +00:00
Michael Niedermayer b6ef858ec7 Move CAVLC 8x8 DCT special case from ff_h264_filter_mb() to fill_caches
that way it is also available for ff_h264_filter_mb_fast().

Originally committed as revision 21283 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-18 13:09:53 +00:00
Michael Niedermayer 6d7e6b2657 Perform reference remapping at fill_cache() time instead of in the
loop filter. This removes one obstacle of getting ff_h264_filter_mb_fast()
bitexact. code is maybe 0.1% faster

Originally committed as revision 21280 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-18 05:15:31 +00:00
Michael Niedermayer 44a5e7b64c Move the qp check to skip the loop filter up.
Originally committed as revision 21274 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-18 00:20:44 +00:00
Michael Niedermayer b6303e6d2a Reorganize how values are stored in h->non_zero_count.
~1% faster

Originally committed as revision 21273 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-17 23:44:23 +00:00
Michael Niedermayer c988f97566 Rearchitecturing the stiched up goose part 1
Run loop filter per row instead of per MB, this also should make it
much easier to switch to per frame filtering and also doing so in a
seperate thread in the future if some volunteer wants to try.
Overall decoding speedup of 1.7% (single thread on pentium dual / cathedral sample)
This change also allows some optimizations to be tried that would not have
been possible before.

Originally committed as revision 21270 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-17 20:35:55 +00:00
Michael Niedermayer 7931bb2a0c Comment for() ; out
~200 bytes smaller ff_h264_filter_mb()
please everyone, NEVER add code with the assumtation that gcc will remove it
without checking gcc actually does. Chances are it does not.

Originally committed as revision 21251 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-16 17:41:40 +00:00
Michael Niedermayer ed3d7e2f65 Mark a few functions as noinline, this makes ff_h264_filter_mb() a bit smaller
and 5% faster.
ff_h264_filter_mb_fast() stay the same size as gcc decided not to inline these
functions there in the first place.

Originally committed as revision 21250 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-16 17:27:17 +00:00
Michael Niedermayer 183a86c958 Apply last 2 optimizations to similar code i forgot.
Originally committed as revision 21249 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-16 16:21:12 +00:00
Michael Niedermayer 3f55a651c4 Another microopt, 4 cpu cycles for avoidance of FFABS().
Originally committed as revision 21248 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-16 16:14:32 +00:00
Michael Niedermayer 26147d368b Minor (2 cpu cycles) optimization ||->|.
Originally committed as revision 21246 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-16 15:19:05 +00:00
Michael Niedermayer 2e36c931f0 Avoid wasting 4 cpu cycles per MB in redundantly calculating qp_thresh.
Originally committed as revision 21243 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-16 11:55:35 +00:00
Michael Niedermayer 082cf97106 Split h264 loop filter off h264.c.
No meassureable speed difference on pentium dual & cathedral sample.

Originally committed as revision 21159 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-12 06:01:55 +00:00