Commit Graph

64 Commits

Author SHA1 Message Date
Diego Biurrun ba87f0801d Remove explicit filename from Doxygen @file commands.
Passing an explicit filename to this command is only necessary if the
documentation in the @file block refers to a file different from the
one the block resides in.

Originally committed as revision 22921 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-04-20 14:45:34 +00:00
Måns Rullgård 4693b031a3 Move H264 dsputil functions into their own struct
This moves the H264-specific functions from DSPContext to the new
H264DSPContext.  The code is made conditional on CONFIG_H264DSP
which is set by the codecs requiring it.

The qpel and chroma MC functions are not moved as these are used by
non-h264 code.

Originally committed as revision 22565 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-03-16 01:17:00 +00:00
Måns Rullgård 84dc2d8afa Remove DECLARE_ALIGNED_{8,16} macros
These macros are redundant.  All uses are replaced with the generic
DECLARE_ALIGNED macro instead.

Originally committed as revision 22233 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-03-06 14:24:59 +00:00
Måns Rullgård 19769ece3b H264: use alias-safe macros
This eliminates all aliasing violation warnings in h264 code.
No measurable speed difference with gcc-4.4.3 on i7.

Originally committed as revision 21881 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-18 16:24:31 +00:00
Måns Rullgård 40d1122752 Use LOCAL_ALIGNED macro for local arrays
Originally committed as revision 21866 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-17 20:36:20 +00:00
Alexander Strange 78998bf217 h264: Remove unused variables.
Originally committed as revision 21815 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-13 21:09:38 +00:00
Michael Niedermayer 9873ae0d44 Fix CAVLC+8x8DCT+MBAFF loopfiltering.
Fixes issue1250

Originally committed as revision 21665 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-07 02:00:00 +00:00
Michael Niedermayer 37b2b0d6cd Get rid of a check in one direction that cant be true in it in that part
of the code.
No meassureable speed change.

Originally committed as revision 21566 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-31 02:05:26 +00:00
Michael Niedermayer 2646814897 Split first reference list comparission from mv comparission.
about 0.5% faster MBAFF loop filtering

Originally committed as revision 21552 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-30 20:07:37 +00:00
Michael Niedermayer 4e992796a9 Replace h->left_type[0] by the local variable for it we have.
No meassureable speed effect.

Originally committed as revision 21541 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-30 14:33:25 +00:00
Michael Niedermayer 012dbcce08 slightly faster bit trickery.
Originally committed as revision 21540 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-30 14:10:06 +00:00
Michael Niedermayer 77821e11b3 Replace ?: by branchless code.
about 0.5% faster loop filtering

Originally committed as revision 21539 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-30 13:40:20 +00:00
Michael Niedermayer 34032e26ab factorize first filter call out, this makes the code somewhat
smaller without any speed loss.

Originally committed as revision 21514 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-28 19:44:13 +00:00
Michael Niedermayer 592e03a8da Change wraper functions to always inline, they are faster now that way.
1% faster MBAFF decoding overall, maybe ~0.1% faster for the cathedral sample.

Originally committed as revision 21507 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-28 11:37:35 +00:00
Michael Niedermayer 5364db2893 indent
Originally committed as revision 21506 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-28 11:18:06 +00:00
Michael Niedermayer 2cf0d46d4c Restructure check_mv()
~20 cpu cycles faster loopfilter

Originally committed as revision 21505 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-28 11:12:46 +00:00
Michael Niedermayer fabd704b37 Restructure if() in check_mv()
quite a bit faster

Originally committed as revision 21504 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-28 10:38:43 +00:00
Michael Niedermayer ca7c784fdf Unroll loops in check_mv()
~6% faster (slow path) loopfilter (should be ~2% overall)

Originally committed as revision 21503 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-28 10:34:06 +00:00
Michael Niedermayer e814817b74 Factor mv/ref compare code out.
This is a hair slower (0.15% maybe) but i really dont want to have the
identical code duplicated 3 times because gcc adds odd threaded jumps with
register reshuffling and register safe/restore.

Originally committed as revision 21502 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-28 10:10:02 +00:00
Michael Niedermayer 3b84924516 Simplify first edge filter condition.
Originally committed as revision 21497 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-28 02:41:52 +00:00
Michael Niedermayer b6302d0c55 Cosmetics, mostly indention, 2 or so new fixme comments that i was to lazy
to split out

Originally committed as revision 21496 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-28 02:20:31 +00:00
Michael Niedermayer 0a32508d90 Make the fast loop filter path work with unavailable left MBs.
This prevents the issue with having to switch between slow and
fast code paths in each row.
0.5% faster loopfilter for cathedral

Originally committed as revision 21495 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-28 02:15:25 +00:00
Michael Niedermayer b304767301 get rid of the start variable.
a few cycles faster

Originally committed as revision 21494 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-28 01:31:06 +00:00
Michael Niedermayer 980bcc554d Unroll main loop so the edge==0 case is seperate.
This allows many things to be simplified away.
h264 decoder is overall 1% faster with a mbaff sample and
0.1% slower with the cathedral sample, probably because the slow loop
filter code must be loaded into the code cache for each first MB of each
row but isnt used for the following MBs.

Originally committed as revision 21493 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-28 01:24:25 +00:00
Michael Niedermayer 8670f84cf9 Update comment.
Originally committed as revision 21479 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-27 13:18:08 +00:00
Michael Niedermayer e470ef7641 Use table to speedup access to non_zero_count in MBAFF with differing interlacing.
~4 cpu cycles speedup

Originally committed as revision 21474 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-27 11:14:29 +00:00
Michael Niedermayer 16e5e39ab4 Optimize loop filtering of the left edge in MBAFF.
60 cpu cycles speedup

Originally committed as revision 21467 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-26 22:59:19 +00:00
Michael Niedermayer 6548c939ec remove unneeded check
Originally committed as revision 21460 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-26 15:34:21 +00:00
Michael Niedermayer 18ea2f933c Use left_mb_xy from fill_caches instead of recalculating it.
Originally committed as revision 21459 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-26 14:57:53 +00:00
Michael Niedermayer d5c30c86d0 Simplify loop filter a little by using top/left_type.
Originally committed as revision 21457 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-26 13:39:26 +00:00
Michael Niedermayer 50eb40a799 Remove all uses of slice_type* from the loop filter, also remove its
initialization befre the loop filter.

Originally committed as revision 21416 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-24 13:20:17 +00:00
Michael Niedermayer 0c32e19d58 Move +52 from the loop filter to the alpha/beta offsets in the context.
This should fix a segfault, also it might be faster on systems where the
+52 wasnt free.

Originally committed as revision 21406 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-23 18:05:30 +00:00
Michael Niedermayer 1cc2d21175 Set edges based on cbp and mv partitioning, not just skiped MBs.
This is faster for videos that have lots of MBs that fall in this category.

Originally committed as revision 21400 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-23 15:28:34 +00:00
Michael Niedermayer 6b3661b22d Optimize filter_mb_mbaff_edge*()
Originally committed as revision 21397 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-23 14:50:56 +00:00
Michael Niedermayer 933bea77e5 Optmize 8x8dct check used to skip some borders in the loop filter.
4 cpu cycles faster.

Originally committed as revision 21396 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-23 13:54:02 +00:00
Måns Rullgård c67278098d Move array specifiers outside DECLARE_ALIGNED() invocations
Originally committed as revision 21377 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-22 03:25:11 +00:00
Michael Niedermayer 258b60c224 Gcc idiocy fixes related to filter_mb_edge*.
Change order of operands as gcc uses a hardcoded register per operand it seems
even for static functions
thus reducing unneeded moved (now functions try to pass the same argument in
the same spot).
Change signed int to unsigned int for array indexes as signed requires signed
extension while unsigned is free.
move the +52 up and merge it where it will end as a lea instruction, gcc always
splits the 52 out there turning the free +52 into an expensive one otherwise.
The changed code becomes a little faster.

Originally committed as revision 21375 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-22 01:59:17 +00:00
Michael Niedermayer 31f6e3c19e Make calculation of mask_edge free of branches, faster of course but probably
little effect overall as this is not that often executed.

Originally committed as revision 21366 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-21 16:50:31 +00:00
Alexander Strange bec358d683 H.264: Declare bS with DECLARE_ALIGNED_8 for uint64_t casts.
Originally committed as revision 21345 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-20 03:28:57 +00:00
Michael Niedermayer 97775235ec Simplify/Optimize another of the mbaff loop filter cases.
Its faster but too rarely used to make a differnce.

Originally committed as revision 21344 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-20 03:00:08 +00:00
Michael Niedermayer 085d9d98e8 Only calculate the second chroma qp if it differs from the firstin the main
loop filter. (a little faster for the common case where they are equal)

Originally committed as revision 21342 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-20 01:49:24 +00:00
Michael Niedermayer 948180e7b1 Set bS with 64bits at a time.
Originally committed as revision 21341 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-20 01:38:32 +00:00
Michael Niedermayer 87df989ee3 Merge multiple IS_* macro uses where possible.
Originally committed as revision 21340 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-20 01:15:30 +00:00
Michael Niedermayer 55c54371c4 Simplify and optimize intra code in h264_loopfilter.c
Originally committed as revision 21339 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-20 00:44:03 +00:00
Michael Niedermayer 9528ce7b99 Sightly simplify initialization of int start.
No real speed change.

Originally committed as revision 21336 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-20 00:17:16 +00:00
Michael Niedermayer 655a1d57d5 Reenable ff_h264_filter_mb_fast() for all slices it supported before.
Originally committed as revision 21328 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-19 16:43:57 +00:00
Michael Niedermayer 2b3649f656 Fix compilation with -O0.
Originally committed as revision 21308 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-18 23:41:12 +00:00
Michael Niedermayer bffe82f504 Rather call filter_mb_mbaff_edge*v() more often than do extra calculations
in the innerst loop. ~150 cpu cycles faster

Originally committed as revision 21299 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-18 21:22:09 +00:00
Michael Niedermayer 0fe674cb4a Use h->slice_num where possible.
Originally committed as revision 21292 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-18 20:13:53 +00:00
Michael Niedermayer bce6a1e7c7 Enable filter_mb_fast for CAVLC P slices.
Originally committed as revision 21291 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-18 19:45:56 +00:00