Martin Vignali
|
f3df42e81d
|
avfilter/x86/vf_blend : add SIMD for 16 bit version of
grainextract
grainmerge
average
extremity
negation
|
2018-04-05 21:46:16 +02:00 |
Martin Vignali
|
8eb0bb1108
|
avfilter/x86/vf_blend : reorganize DIFFERENCE macro to reduce line duplication between 8 bit and 16 bit version
|
2018-04-05 21:46:11 +02:00 |
Martin Vignali
|
53a03b5c8c
|
avfilter/x86/vf_blend : add 16 bit version for BLEND_SIMPLE, phoenix, difference for SSE and AVX2 (x86_64)
|
2018-02-24 21:44:19 +01:00 |
Martin Vignali
|
3a230ce5fa
|
avfilter/x86/vf_blend : add AVX2 version for each func except divide
and optimize average, grainextract, multiply, screen, grainmerge
|
2018-01-28 20:21:32 +01:00 |
Paul B Mahol
|
f8d0689d3f
|
avfilter/vf_blend: rename addition128 and difference128 to grainmerge and grainextract
|
2017-08-24 14:45:52 +02:00 |
James Almer
|
d2ef9e6e7f
|
x86/vf_blend: use ABS2 macro
|
2017-06-27 20:45:55 -03:00 |
James Almer
|
0daa1cf073
|
x86/vf_blend: optimize difference and negation functions
Process more pixels per loop.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
|
2017-06-27 13:17:23 -03:00 |
James Almer
|
fa50d9360b
|
x86/vf_blend: add sse and ssse3 extremity functions
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
|
2017-06-27 13:17:23 -03:00 |
Timothy Gu
|
222e6da605
|
x86/vf_blend: Add SSE2 optimization for divide
4.5x faster than C float version with autovectorization
10x faster than C int version
25x faster than C float version without autovectorization
|
2016-02-28 08:19:09 -08:00 |
Timothy Gu
|
4574323973
|
vf_blend: Reduce number of arguments for kernel function
|
2016-02-14 08:58:41 -08:00 |
Timothy Gu
|
74f8d9aaef
|
x86/vf_blend: Add SSE2 optimization for screen
10x faster than C.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
|
2016-02-10 11:26:04 -08:00 |
Timothy Gu
|
c8b1612af0
|
x86/vf_blend: Move multiplying to a macro
Reviewed-by: Paul B Mahol <onemda@gmail.com>
|
2016-02-10 11:25:11 -08:00 |
Timothy Gu
|
253209ac44
|
vf_blend: Add SSE2 optimization for multiply
5 times faster than C, 3 times overall.
|
2016-02-08 13:35:24 -08:00 |
James Almer
|
8dba3fb8fd
|
x86/vf_blend: add sse2 versions of blend_difference and blend_negation
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
|
2015-12-24 13:05:27 -03:00 |
James Almer
|
02f428051a
|
x86/vf_blend: make all functions work on x86_32
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
|
2015-12-24 13:05:24 -03:00 |
James Almer
|
0988c68cf9
|
x86/vf_blend: simplify using macros
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
|
2015-12-24 13:05:21 -03:00 |
Paul B Mahol
|
624a1a0e69
|
avfilter/x86/vf_blend.asm: hardmix: do same with two pxor instructions less
Signed-off-by: Paul B Mahol <onemda@gmail.com>
|
2015-10-07 23:12:09 +02:00 |
Paul B Mahol
|
e999210cec
|
avfilter/x86/vf_blend.asm: 11th register is used, update functions
Signed-off-by: Paul B Mahol <onemda@gmail.com>
|
2015-10-07 22:53:54 +02:00 |
Paul B Mahol
|
0948ba3204
|
avfilter/x86/vf_blend.asm: add hardmix and phoenix sse2 SIMD
Signed-off-by: Paul B Mahol <onemda@gmail.com>
|
2015-10-07 22:50:15 +02:00 |
Paul B Mahol
|
9762554dd0
|
avfilter/vf_blend: add x86 SIMD for some modes
Signed-off-by: Paul B Mahol <onemda@gmail.com>
|
2015-10-03 21:26:17 +02:00 |