Cosmetics to speed up finding sections of interest.

Originally committed as revision 11596 to svn://svn.ffmpeg.org/ffmpeg/trunk
Michael Niedermayer 2008-01-22 14:48:02 +00:00
parent 8f738eea43
commit 5e123bd359

@@ -1,6 +1,8 @@
 optimization Tips (for libavcodec):
+===================================
 What to optimize:
+-----------------
 If you plan to do non-x86 architecture specific optimizations (SIMD normally),
 then take a look in the i386/ directory, as most important functions are
 already optimized for MMX.
@@ -9,7 +11,9 @@ If you want to do x86 optimizations then you can either try to finetune the
 stuff in the i386 directory or find some other functions in the C source to
 optimize, but there aren't many left.
 Understanding these overoptimized functions:
+--------------------------------------------
 As many functions tend to be a bit difficult to understand because
 of optimizations, it can be hard to optimize them further, or write
 architecture-specific versions. It is recommened to look at older
@@ -23,7 +27,9 @@ and how they can be optimized.
 NOTE: If you still don't understand some function, ask at our mailing list!!!
 (http://lists.mplayerhq.hu/mailman/listinfo/ffmpeg-devel)
 What speedup justifies an optimizetion?
+---------------------------------------
 Normaly with clean&simple optimizations and widely used codecs a overall
 speedup of the affected codec of 0.1% is enough. These speedups accumulate
 and can make a big difference after a while ...
@@ -35,6 +41,7 @@ small and readable than to make it 1% faster.
 WTF is that function good for ....:
+-----------------------------------
 The primary purpose of that list is to avoid wasting time to optimize functions
 which are rarely used
@@ -145,9 +152,11 @@ The minimum guaranteed alignment is written in the .h files, for example:
 Links:
+======
 http://www.aggregate.org/MAGIC/
 x86-specific:
+-------------
 http://developer.intel.com/design/pentium4/manuals/248966.htm
 The IA-32 Intel Architecture Software Developer's Manual, Volume 2:
@@ -161,7 +170,7 @@ http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/22007.pd
 ARM-specific:
+-------------
 ARM Architecture Reference Manual (up to ARMv5TE):
 http://www.arm.com/community/university/eulaarmarm.html
@@ -176,7 +185,7 @@ Optimization guide for Intel XScale (used in Sharp Zaurus PDA):
 http://download.intel.com/design/intelxscale/27347302.pdf
 PowerPC-specific:
+-----------------
 PowerPC32/AltiVec PIM:
 www.freescale.com/files/32bit/doc/ref_manual/ALTIVECPEM.pdf
@@ -188,6 +197,7 @@ http://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/30B3520C93F437AB8725706
 http://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/9F820A5FFA3ECE8C8725716A0062585F/$file/CBE_Handbook_v1.1_24APR2007_pub.pdf
 SPARC-specific:
+---------------
 SPARC Joint Programming Specification (JPS1): Commonality
 http://www.fujitsu.com/downloads/PRMPWR/JPS1-R1.0.4-Common-pub.pdf
@@ -198,6 +208,7 @@ VIS Whitepaper (contains optimization guidelines)
 http://www.sun.com/processors/vis/download/vis/vis_whitepaper.pdf
 GCC asm links:
+--------------
 official doc but quite ugly
 http://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html