Commit Graph

41 Commits

Author SHA1 Message Date
Paul B Mahol 72acff9f59 avutil/x86/float_dsp: add fma3 for scalarproduct 2022-09-13 17:43:15 +02:00
Andreas Rheinhardt 2718a3be1f avutil/x86/float_dsp: Remove obsolete 3dnowext function
x64 always has MMX, MMXEXT, SSE and SSE2 and this means
that some functions for MMX, MMXEXT, SSE and 3dnow are always
overridden by other functions (unless one e.g. explicitly
disables SSE2). So given that the only systems which benefit
from ff_vector_fmul_window_3dnowext are truely ancient 32bit
AMD x86s it is removed.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-06-22 13:37:22 +02:00
James Almer 9d002d7818 x86/float_dsp: add ff_vector_dmul_{sse2,avx}
~3x to 5x faster.

Signed-off-by: James Almer <jamrial@gmail.com>
2018-09-14 12:54:42 -03:00
James Almer f1d80bc630 x86/float_dsp: add ff_vector_fmul_reverse_avx2
~20% faster than AVX.

Signed-off-by: James Almer <jamrial@gmail.com>
2017-04-11 21:35:35 -03:00
James Almer ed9b25a148 x86/float_dsp: add ff_vector_dmac_scalar_{sse2,avx,fma3} 2017-04-10 12:18:55 -03:00
Clément Bœsch f291a9a1ad Merge commit '99434f4df81b6801b2b535d5b9143305595784f6'
* commit '99434f4df81b6801b2b535d5b9143305595784f6':
  float_dsp: Have implementation match function pointer prototype

Merged-by: Clément Bœsch <cboesch@gopro.com>
2017-03-30 10:23:25 +02:00
Diego Biurrun 99434f4df8 float_dsp: Have implementation match function pointer prototype
libavutil/x86/float_dsp_init.c(144) : warning C4028: formal parameter 1 different from declaration
libavutil/x86/float_dsp_init.c(144) : warning C4028: formal parameter 2 different from declaration
2016-11-03 17:43:55 +01:00
James Almer 70d685a77f x86: use the new helper macros where useful
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: James Almer <jamrial@gmail.com>
2016-02-14 20:00:21 -03:00
James Almer c16e99e3b3 x86: check for AV_CPU_FLAG_AVXSLOW where useful
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-06-01 00:15:35 +02:00
James Almer d68c05380c x86: check for AV_CPU_FLAG_AVXSLOW where useful
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2015-05-31 12:07:11 +02:00
James Almer dcaf9660b6 x86/float_dsp: port vector_fmul_window to yasm
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-08 12:41:32 +02:00
James Almer 7d7487e85c x86/float_dsp: add ff_vector_{fmul_add, fmac_scalar}_fma3
~7% faster than AVX

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-13 04:34:05 +01:00
Michael Niedermayer 9d01bf7d66 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  Consistently use "cpu_flags" as variable/parameter name for CPU flags

Conflicts:
	libavcodec/x86/dsputil_init.c
	libavcodec/x86/h264dsp_init.c
	libavcodec/x86/hpeldsp_init.c
	libavcodec/x86/motion_est.c
	libavcodec/x86/mpegvideo.c
	libavcodec/x86/proresdsp_init.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-07-18 09:53:47 +02:00
Diego Biurrun 3ac7fa81b2 Consistently use "cpu_flags" as variable/parameter name for CPU flags 2013-07-18 00:31:35 +02:00
Michael Niedermayer 3c200aa693 Merge commit '1fda184a85178cfd7b98d9e308d18e1ded76a511'
* commit '1fda184a85178cfd7b98d9e308d18e1ded76a511':
  avutil: Add av_cold attributes to init functions missing them

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-05-05 12:53:50 +02:00
Diego Biurrun 1fda184a85 avutil: Add av_cold attributes to init functions missing them 2013-05-04 22:48:05 +02:00
Christophe Gisquet 566b7a20fd x86: float dsp: butterflies_float SSE
97c -> 49c
Some codecs could benefit from more unrolling, but AAC doesn't.
2013-05-03 08:08:02 +02:00
Christophe Gisquet 1a4007964c x86: float dsp: butterflies_float SSE
97c -> 49c
Some codecs could benefit from more unrolling, but AAC doesn't.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-04-17 00:03:25 +02:00
Michael Niedermayer 63a97d5674 Merge commit 'b6649ab5037fb55f78c2606f3d23cea0867cdeaa'
* commit 'b6649ab5037fb55f78c2606f3d23cea0867cdeaa':
  cosmetics: Remove unnecessary extern keywords from function declarations

Conflicts:
	libswscale/x86/swscale.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-03-28 11:20:41 +01:00
Diego Biurrun b6649ab503 cosmetics: Remove unnecessary extern keywords from function declarations 2013-03-27 14:21:45 +01:00
Michael Niedermayer 8102f27b5b Merge commit '73b704ac609d83e0be124589f24efd9b94947cf9'
* commit '73b704ac609d83e0be124589f24efd9b94947cf9':
  arm: Add some missing header #includes
  floatdsp: move scalarproduct_float from dsputil to avfloatdsp.

Conflicts:
	libavcodec/acelp_pitch_delay.c
	libavcodec/amrnbdec.c
	libavcodec/amrwbdec.c
	libavcodec/ra288.c
	libavcodec/x86/dsputil_mmx.c
	libavutil/x86/float_dsp.asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-23 14:31:55 +01:00
Michael Niedermayer 6e6e170898 Merge commit '42d324694883cdf1fff1612ac70fa403692a1ad4'
* commit '42d324694883cdf1fff1612ac70fa403692a1ad4':
  floatdsp: move vector_fmul_reverse from dsputil to avfloatdsp.

Conflicts:
	libavcodec/arm/dsputil_init_vfp.c
	libavcodec/arm/dsputil_vfp.S
	libavcodec/dsputil.c
	libavcodec/ppc/float_altivec.c
	libavcodec/x86/dsputil.asm
	libavutil/x86/float_dsp.asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-23 14:04:50 +01:00
Michael Niedermayer b1b870fbd7 Merge commit '55aa03b9f8f11ebb7535424cc0e5635558590f49'
* commit '55aa03b9f8f11ebb7535424cc0e5635558590f49':
  floatdsp: move vector_fmul_add from dsputil to avfloatdsp.

Conflicts:
	libavcodec/dsputil.c
	libavcodec/x86/dsputil.asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-23 13:54:34 +01:00
Ronald S. Bultje 42d3246948 floatdsp: move vector_fmul_reverse from dsputil to avfloatdsp.
Now, nellymoserenc and aacenc no longer depends on dsputil. Independent
of this patch, wmaprodec also does not depend on dsputil, so I removed
it from there also.
2013-01-22 11:55:42 -08:00
Ronald S. Bultje 55aa03b9f8 floatdsp: move vector_fmul_add from dsputil to avfloatdsp. 2013-01-22 11:55:42 -08:00
Ronald S. Bultje d56668bd80 floatdsp: move scalarproduct_float from dsputil to avfloatdsp.
This makes the aac decoder and all voice codecs independent of dsputil.
2013-01-22 11:55:42 -08:00
Michael Niedermayer 17596198ca Merge commit '80ac87c13dc8c6c063e26a464c5c542357c0583f'
* commit '80ac87c13dc8c6c063e26a464c5c542357c0583f':
  lavc: support ZenoXVID custom tag
  libcdio: support recent cdio-paranoia
  float_dsp: Add #ifdef HAVE_INLINE_ASM around vector_fmul_window
  theora: Skip zero-sized headers

Conflicts:
	configure

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-18 13:36:39 +01:00
Martin Storsjö 973b4d44f1 float_dsp: Add #ifdef HAVE_INLINE_ASM around vector_fmul_window
This fixes builds on 64bit MSVC.

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-01-17 19:07:35 +02:00
Michael Niedermayer 5c7e9e16c9 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  lavc: Move vector_fmul_window to AVFloatDSPContext
  rtpdec_mpeg4: Check the remaining amount of data before reading

Conflicts:
	libavcodec/dsputil.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-16 12:38:41 +01:00
Justin Ruggles e034cc6c60 lavc: Move vector_fmul_window to AVFloatDSPContext
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2013-01-16 10:45:45 +01:00
Michael Niedermayer 15784c2bab Merge commit '9d5c62ba5b586c80af508b5914934b1c439f6652'
* commit '9d5c62ba5b586c80af508b5914934b1c439f6652':
  lavu/opt: do not filter out the initial sign character except for flags
  eval: treat dB as decibels instead of decibytes
  float_dsp: add vector_dmul_scalar() to multiply a vector of doubles

Conflicts:
	libavutil/eval.c
	tests/ref/fate/eval

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-12-06 14:33:38 +01:00
Justin Ruggles ac7eb4cb20 float_dsp: add vector_dmul_scalar() to multiply a vector of doubles
Include x86-optimized versions for SSE2 and AVX.
2012-12-05 11:23:36 -05:00
Michael Niedermayer b4d4e51027 Merge commit '3c370f5abc55739a261534b9f9bdc739cedbbbb9'
* commit '3c370f5abc55739a261534b9f9bdc739cedbbbb9':
  riff: only warn on a bad INFO chunk code size instead of failing
  configure: Add separate list for libraries and use where appropriate
  x86: float_dsp: add SSE version of vector_fmul_scalar()

Conflicts:
	configure
	libavformat/riff.c
	libavutil/x86/float_dsp.asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-27 14:10:05 +01:00
Justin Ruggles 947f933687 x86: float_dsp: add SSE version of vector_fmul_scalar() 2012-11-26 11:30:19 -05:00
Michael Niedermayer 77aedc77ab Merge remote-tracking branch 'qatar/master'
* qatar/master:
  swscale: Provide the right alignment for external mmx asm
  x86: Replace checks for CPU extensions and flags by convenience macros
  configure: msvc: fix/simplify setting of flags for hostcc
  x86: mlpdsp: mlp_filter_channel_x86 requires inline asm

Conflicts:
	libavcodec/x86/fft_init.c
	libavcodec/x86/h264_intrapred_init.c
	libavcodec/x86/h264dsp_init.c
	libavcodec/x86/mpegaudiodec.c
	libavcodec/x86/proresdsp_init.c
	libavutil/x86/float_dsp_init.c
	libswscale/utils.c
	libswscale/x86/swscale.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-09-09 13:27:42 +02:00
Diego Biurrun e0c6cce447 x86: Replace checks for CPU extensions and flags by convenience macros
This separates code relying on inline from that relying on external
assembly and fixes instances where the coalesced check was incorrect.
2012-09-08 18:18:34 +02:00
Carl Eugen Hoyos a26789cf9f Fix compilation with yasm-0.6.2. 2012-09-01 10:59:16 +02:00
Michael Niedermayer cabbd271a5 Merge remote-tracking branch 'qatar/master'
* qatar/master: (24 commits)
  flvdec: remove incomplete, disabled seeking code
  mem: add support for _aligned_malloc() as found on Windows
  lavc: Extend the documentation for avcodec_init_packet
  flvdec: remove incomplete, disabled seeking code
  http: replace atoll() with strtoll()
  mpegts: remove unused/incomplete/broken seeking code
  af_amix: allow float planar sample format as input
  af_amix: use AVFloatDSPContext.vector_fmac_scalar()
  float_dsp: add x86-optimized functions for vector_fmac_scalar()
  float_dsp: Move vector_fmac_scalar() from libavcodec to libavutil
  lavr: Add x86-optimized function for flt to s32 conversion
  lavr: Add x86-optimized function for flt to s16 conversion
  lavr: Add x86-optimized functions for s32 to flt conversion
  lavr: Add x86-optimized functions for s32 to s16 conversion
  lavr: Add x86-optimized functions for s16 to flt conversion
  lavr: Add x86-optimized function for s16 to s32 conversion
  rtpenc: Support packetizing iLBC
  rtpdec: Add a depacketizer for iLBC
  Implement the iLBC storage file format
  mov: Support muxing/demuxing iLBC
  ...

Conflicts:
	Changelog
	configure
	libavcodec/avcodec.h
	libavcodec/dsputil.c
	libavcodec/version.h
	libavformat/movenc.c
	libavformat/mpegts.c
	libavformat/version.h
	libavutil/mem.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-19 20:53:27 +02:00
Justin Ruggles 82b2df9790 float_dsp: add x86-optimized functions for vector_fmac_scalar() 2012-06-18 18:01:14 -04:00
Michael Niedermayer 7e22514d98 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  float_dsp: ppc: add a separate header for Altivec function prototypes
  ARM: fix float_dsp breakage from d5a7229
  Add a float DSP framework to libavutil
  PPC: Move types_altivec.h and util_altivec.h from libavcodec to libavutil
  ARM: Move asm.S from libavcodec to libavutil
  vc1dsp: mark put/avg_vc1_mspel_mc() always_inline

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-08 23:59:09 +02:00
Justin Ruggles d5a7229ba4 Add a float DSP framework to libavutil
Move vector_fmul() from DSPContext to AVFloatDSPContext.
2012-06-08 13:14:38 -04:00