Commit Graph

206 Commits

Author SHA1 Message Date
Paul B Mahol c6c888e996 avfilter/vf_w3fdif: add >8 but <16 bit support
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2016-12-25 09:50:36 +01:00
James Almer a8e3833a61 x86/avf_showcqt: use the FMULADD_PS x86util macro
Signed-off-by: James Almer <jamrial@gmail.com>
2016-08-20 02:12:33 -03:00
Matthieu Bouron 9eb3da2f99 asm: FF_-prefix internal macros used in inline assembly
See merge commit '39d6d3618d48625decaff7d9bdbb45b44ef2a805'.
2016-06-27 17:21:18 +02:00
Hendrik Leppkes c142dc203e Merge commit 'dc40a70c5755bccfb1a1349639943e1f408bea50'
* commit 'dc40a70c5755bccfb1a1349639943e1f408bea50':
  Drop unnecessary libavutil/x86/asm.h #includes

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2016-06-26 15:53:00 +02:00
James Almer 172af20852 x86/showcqt: use three operand format for some instructions
Fixes failures with yasm 1.1.0 and older

Signed-off-by: James Almer <jamrial@gmail.com>
2016-06-08 19:37:08 -03:00
James Almer 7d7fdd6532 x86/showcqt: add missing preprocessor checks
Old yasm/nasm versions don't support some of these

Signed-off-by: James Almer <jamrial@gmail.com>
2016-06-08 19:34:43 -03:00
James Almer 99b899483e avutil/x86util: move haddps sse emulation from showcqt
Signed-off-by: James Almer <jamrial@gmail.com>
2016-06-08 14:18:00 -03:00
Muhammad Faiz 1e69ac9246 avfilter/avf_showcqt: cqt_calc optimization on x86
on x86_64:
        time    PSNR
plain   3.303   inf
SSE     1.649   107.087535
SSE3    1.632   107.087535
AVX     1.409   106.986771
FMA3    1.265   107.108437

on x86_32 (PSNR compared to x86_64 plain):
        time    PSNR
plain   7.225   103.951979
SSE     1.827   105.859282
SSE3    1.819   105.859282
AVX     1.533   105.997661
FMA3    1.384   105.885377

FMA4 test is not available

Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Muhammad Faiz <mfcc64@gmail.com>
2016-06-08 16:09:43 +07:00
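As a rough reading of the timings quoted in commit 1e69ac9246 above, each SIMD variant's speedup over plain C is the plain time divided by that variant's time. The short Python sketch below only does that arithmetic. The dictionary names and the `speedups` helper are illustrative and not part of the repository, and the numbers are copied from the commit message, which does not state their unit.

```python
# Timings copied from the commit message for 1e69ac9246 (unit not stated there).
TIMES_X86_64 = {"plain": 3.303, "SSE": 1.649, "SSE3": 1.632, "AVX": 1.409, "FMA3": 1.265}
TIMES_X86_32 = {"plain": 7.225, "SSE": 1.827, "SSE3": 1.819, "AVX": 1.533, "FMA3": 1.384}


def speedups(times):
    """Return each non-plain variant's speedup relative to the 'plain' entry."""
    base = times["plain"]
    return {name: base / t for name, t in times.items() if name != "plain"}


if __name__ == "__main__":
    for label, times in (("x86_64", TIMES_X86_64), ("x86_32", TIMES_X86_32)):
        for name, factor in speedups(times).items():
            print(f"{label} {name}: {factor:.2f}x")
```

On these figures FMA3 comes out at roughly 2.6x over plain C on x86_64 and roughly 5.2x on x86_32.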
Diego Biurrun dc40a70c57 Drop unnecessary libavutil/x86/asm.h #includes
2016-05-28 19:18:26 +02:00
Paul B Mahol 5b8faaad6c avfilter/vf_blend: fix incorrect Y variable when threading is used
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2016-05-23 21:49:15 +02:00
Ronald S. Bultje f4075767b2 vf_colorspace: use enums for bpp/subsampling array indices.
Also add some documentation for each function to colorspacedsp.h.
2016-05-10 08:37:56 -04:00
Ronald S. Bultje 9b26a8077f vf_colorspace: add const to yuv_stride[] argument in DSP functions.
2016-05-10 08:37:55 -04:00
Ronald S. Bultje 5ce703a6bf vf_colorspace: x86-64 SIMD (SSE2) optimizations.
2016-04-12 16:42:48 -04:00
Thomas Mundt d0a9114f99 avfilter/vf_bwdif: Add yadif base information to copyright header
Signed-off-by: Thomas Mundt <loudmax@yahoo.de>
Signed-off-by: James Almer <jamrial@gmail.com>
2016-03-16 00:05:45 -03:00
Thomas Mundt 5024a82e95 avfilter/vf_bwdif: add x86 SIMD
Signed-off-by: Thomas Mundt <loudmax@yahoo.de>
2016-03-13 10:06:21 +01:00
Timothy Gu 222e6da605 x86/vf_blend: Add SSE2 optimization for divide
4.5x faster than C float version with autovectorization
10  x faster than C int version
25  x faster than C float version without autovectorization
2016-02-28 08:19:09 -08:00
Timothy Gu 4574323973 vf_blend: Reduce number of arguments for kernel function
2016-02-14 08:58:41 -08:00
Timothy Gu 74f8d9aaef x86/vf_blend: Add SSE2 optimization for screen
10x faster than C.

Reviewed-by: Paul B Mahol <onemda@gmail.com>
2016-02-10 11:26:04 -08:00
Timothy Gu c8b1612af0 x86/vf_blend: Move multiplying to a macro
Reviewed-by: Paul B Mahol <onemda@gmail.com>
2016-02-10 11:25:11 -08:00
Timothy Gu 253209ac44 vf_blend: Add SSE2 optimization for multiply
5 times faster than C, 3 times overall.
2016-02-08 13:35:24 -08:00
Hendrik Leppkes 53ada3af62 x86/vf_w3fdif: 32-bit compatibility for w3fdif_simple_high
2016-01-08 11:56:43 +01:00
James Almer 35b0c7efda x86/vf_stereo3d: remove a few unnecessary movas
Signed-off-by: James Almer <jamrial@gmail.com>
2016-01-03 02:09:02 -03:00
James Almer 1817643d4f x86/vf_stereo3d: make ff_anaglyph_sse4 work on x86_32
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-12-28 17:20:24 -03:00
James Almer 6e243d17e9 x86/vf_stereo3d: optimize register usage
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-12-28 17:20:12 -03:00
James Almer 8dba3fb8fd x86/vf_blend: add sse2 versions of blend_difference and blend_negation
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-12-24 13:05:27 -03:00
James Almer 02f428051a x86/vf_blend: make all functions work on x86_32
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-12-24 13:05:24 -03:00
James Almer 0988c68cf9 x86/vf_blend: simplify using macros
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-12-24 13:05:21 -03:00
James Almer ce4c85de6a x86/vf_maskedmerge: make ff_maskedmerge8_sse2 work on x86_32
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-12-24 13:05:18 -03:00
Michael Niedermayer e42e0b11f1 avfilter/x86/vf_maskedmerge: Clear upper part of width
Fixes crash
Fixes: Ticket5055

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-12-23 22:38:15 +01:00
Paul B Mahol 45938f0301 avfilter/x86/vf_maskedmerge: move %define out of .nextrow
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2015-12-10 09:52:04 +01:00
James Almer d897d4c12d x86/vf_w3fdif: use aligned loads in w3fdif_complex_high
Found-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-10-27 01:49:22 -03:00
James Almer 224a529b44 x86/vf_w3fdif: use aligned loads in w3fdif_simple_high
Found-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-10-11 20:07:12 -03:00
James Almer e8903fbf8e x86/vf_w3fdif: simplify w3fdif_simple_high
Signed-off-by: James Almer <jamrial@gmail.com>
2015-10-11 20:04:54 -03:00
James Almer d2bf2d094e x86/vf_w3fdif: move pxor outside the loop in w3fdif_complex_low
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-10-11 14:23:21 -03:00
Paul B Mahol c3d312bb7f avfilter/x86/vf_w3fdif: add colons after labels
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2015-10-10 17:55:06 +02:00
Paul B Mahol 5740dc27e1 avfilter/vf_w3fdif: add x86 SIMD
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2015-10-10 17:33:43 +02:00
Andreas Cadhalpun 8d6625642d doc: fix spelling errors
Reviewed-by: Lou Logan <lou@lrcd.com>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
2015-10-09 22:09:08 +02:00
Paul B Mahol 624a1a0e69 avfilter/x86/vf_blend.asm: hardmix: do same with two pxor instructions less
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2015-10-07 23:12:09 +02:00
Paul B Mahol e999210cec avfilter/x86/vf_blend.asm: 11th register is used, update functions
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2015-10-07 22:53:54 +02:00
Paul B Mahol 0948ba3204 avfilter/x86/vf_blend.asm: add hardmix and phoenix sse2 SIMD
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2015-10-07 22:50:15 +02:00
Paul B Mahol ac74e857a2 avfilter/vf_stereo3d: add x86 SIMD for anaglyph outputs
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2015-10-06 21:01:24 +02:00
Michael Niedermayer fd9a528523 avfilter/vf_blend: Fix argument types, fix segfault in asm
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-10-03 21:59:24 +02:00
Paul B Mahol 9762554dd0 avfilter/vf_blend: add x86 SIMD for some modes
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2015-10-03 21:26:17 +02:00
Paul B Mahol 160556c9ad avfilter/vf_maskedmerge: add SIMD for maskedmerge with 8 bit depth input
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2015-10-02 17:40:57 +02:00
Paul B Mahol 0701ff2c32 avfilter/x86/vf_psnr.asm: fix typo
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2015-10-01 21:53:13 +02:00
Hendrik Leppkes 5d8e836d0e Replace all remaining occurances of step/depth_minus1 and offset_plus1
2015-09-08 17:10:48 +02:00
Ronald S. Bultje ad45121d56 options: mark av_get_{int,double,q} as deprecated.
Convert last users to av_opt_get_*() counterparts.
2015-08-18 12:05:17 -04:00
Henrik Gramner ab43beefab x86inc: Drop SECTION_TEXT macro
The .text section is already 16-byte aligned by default on all supported
platforms so `SECTION_TEXT` isn't any different from `SECTION .text`.

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2015-08-11 11:12:01 +02:00
Henrik Gramner f0b7882ceb x86inc: Drop SECTION_TEXT macro
The .text section is already 16-byte aligned by default on all supported
platforms so `SECTION_TEXT` isn't any different from `SECTION .text`.
2015-08-04 20:13:09 +02:00
James Almer d9e10af547 x86/vf_interlace: add missing colon to labels
Silences warnings with Nasm

Signed-off-by: James Almer <jamrial@gmail.com>
2015-07-26 02:50:50 -03:00