avfilter/vsrc_mandelbrot: change sin to sinf for color computation

lrintf is anyway used, suggesting we only care up to floating precision.
Rurthermore, there is a compat hack in avutil/libm for this function,
and it is used in avcodec/aacps_tablegen.h.

This yields a non-negligible speedup. Sample benchmark:
x86-64, Haswell, GNU/Linux:

old (draw_mandelbrot):
274635709 decicycles in draw_mandelbrot,     256 runs,      0 skips
300287046 decicycles in draw_mandelbrot,     512 runs,      0 skips
371819935 decicycles in draw_mandelbrot,    1024 runs,      0 skips
336663765 decicycles in draw_mandelbrot,    2048 runs,      0 skips
581851016 decicycles in draw_mandelbrot,    4096 runs,      0 skips

new (draw_mandelbrot):
269882717 decicycles in draw_mandelbrot,     256 runs,      0 skips
296359285 decicycles in draw_mandelbrot,     512 runs,      0 skips
370076599 decicycles in draw_mandelbrot,    1024 runs,      0 skips
331478354 decicycles in draw_mandelbrot,    2048 runs,      0 skips
571904318 decicycles in draw_mandelbrot,    4096 runs,      0 skips

Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
This commit is contained in:
Ganesh Ajjanagadde 2015-11-24 13:00:54 -05:00
parent e9c7493f19
commit 990619968a
1 changed files with 2 additions and 2 deletions

View File

@ -334,11 +334,11 @@ static void draw_mandelbrot(AVFilterContext *ctx, uint32_t *color, int linesize,
switch(s->outer){
case ITERATION_COUNT:
zr = i;
c = lrintf((sin(zr)+1)*127) + lrintf((sin(zr/1.234)+1)*127)*256*256 + lrintf((sin(zr/100)+1)*127)*256;
c = lrintf((sinf(zr)+1)*127) + lrintf((sinf(zr/1.234)+1)*127)*256*256 + lrintf((sinf(zr/100)+1)*127)*256;
break;
case NORMALIZED_ITERATION_COUNT:
zr = i + log2(log(s->bailout) / log(zr*zr + zi*zi));
c = lrintf((sin(zr)+1)*127) + lrintf((sin(zr/1.234)+1)*127)*256*256 + lrintf((sin(zr/100)+1)*127)*256;
c = lrintf((sinf(zr)+1)*127) + lrintf((sinf(zr/1.234)+1)*127)*256*256 + lrintf((sinf(zr/100)+1)*127)*256;
break;
case WHITE:
c = 0xFFFFFF;