Commit Graph

41 Commits

Author SHA1 Message Date
Henrik Gramner c3d3f0e697 avutil/x86util: Fix broken pre-SSE4.1 PMINSD emulation
Fixes yadif-16 which allows FATE to pass.

Broken since 2904db9045 (2017).
2024-03-17 13:52:27 +01:00
Lynne bbe95f7353
x86: replace explicit REP_RETs with RETs
From x86inc:
> On AMD cpus <=K10, an ordinary ret is slow if it immediately follows either
> a branch or a branch target. So switch to a 2-byte form of ret in that case.
> We can automatically detect "follows a branch", but not a branch target.
> (SSSE3 is a sufficient condition to know that your cpu doesn't have this problem.)

x86inc can automatically determine whether to use REP_RET rather than
REP in most of these cases, so impact is minimal. Additionally, a few
REP_RETs were used unnecessary, despite the return being nowhere near a
branch.

The only CPUs affected were AMD K10s, made between 2007 and 2011, 16
years ago and 12 years ago, respectively.

In the future, everyone involved with x86inc should consider dropping
REP_RETs altogether.
2023-02-01 04:23:55 +01:00
Andreas Rheinhardt a05f22eaf3 swscale/x86/swscale: Remove obsolete and harmful MMX(EXT) functions
x64 always has MMX, MMXEXT, SSE and SSE2 and this means
that some functions for MMX, MMXEXT, SSE and 3dnow are always
overridden by other functions (unless one e.g. explicitly
disables SSE2). So given that the only systems that
benefit from these functions are truely ancient 32bit x86s
they are removed.

Moreover, some of the removed code was buggy/not bitexact
and lead to failures involving the f32le and f32be versions of
gray, gbrp and gbrap on x86-32 when SSE2 was not disabled.
See e.g.
https://fate.ffmpeg.org/report.cgi?time=20220609221253&slot=x86_32-debian-kfreebsd-gcc-4.4-cpuflags-mmx

Notice that yuv2yuvX_mmx is not removed, because it is used
by SSE3 and AVX2 as fallback in case of unaligned data and
also for tail processing. I don't know why yuv2yuvX_mmxext
isn't being used for this; an earlier version [1] of
554c2bc708 used it, but
the version that was eventually applied does not.

[1]: https://ffmpeg.org/pipermail/ffmpeg-devel/2020-November/272124.html

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-06-22 13:36:04 +02:00
James Almer 2904db9045 Merge commit '994c4bc10751e39c7ed9f67ffd0c0dea5223daf2'
* commit '994c4bc10751e39c7ed9f67ffd0c0dea5223daf2':
  x86util: Port all macros to cpuflags

See d5f8a642f6

Merged-by: James Almer <jamrial@gmail.com>
2017-10-21 12:15:57 -03:00
Diego Biurrun 994c4bc107 x86util: Port all macros to cpuflags
Also do some small cosmetic changes: Drop pointless _MMX suffix from ABSD2
macro name, drop pointless check for MMX support, we always assume MMX is
available in our SIMD code, fix spelling.
2017-03-14 17:23:32 +01:00
Michael Niedermayer f59750641a swscale: x86: Add some forgotten 12-bit planar YUV cases
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2016-10-12 17:39:30 +02:00
Clément Bœsch 8ef57a0d61 Merge commit '41ed7ab45fc693f7d7fc35664c0233f4c32d69bb'
* commit '41ed7ab45fc693f7d7fc35664c0233f4c32d69bb':
  cosmetics: Fix spelling mistakes

Merged-by: Clément Bœsch <u@pkh.me>
2016-06-21 21:55:34 +02:00
Vittorio Giovara 41ed7ab45f cosmetics: Fix spelling mistakes
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2016-05-04 18:16:21 +02:00
James Almer 345f2234d1 x86/scale: fix xmm register count for hscale*_sse2
xmm6 was being clobbered in ff_hscale8to{15,19}_8_sse2 on Win64

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-09 04:13:23 +02:00
Thilo Borgmann d814a839ac Reinstate proper FFmpeg license for all files. 2013-08-30 15:47:38 +00:00
Michael Niedermayer 9766d9c985 Merge commit '04581c8c77ce779e4e70684ac45302972766be0f'
* commit '04581c8c77ce779e4e70684ac45302972766be0f':
  x86: yasm: Use complete source path for macro helper %includes

Conflicts:
	Makefile

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-10-31 13:57:09 +01:00
Michael Niedermayer 3174616f59 Merge commit '6860b4081d046558c44b1b42f22022ea341a2a73'
* commit '6860b4081d046558c44b1b42f22022ea341a2a73':
  x86: include x86inc.asm in x86util.asm
  cng: Reindent some incorrectly indented lines
  cngdec: Allow flushing the decoder
  cngdec: Make the dbov variable have the right unit
  cngdec: Fix the memset size to cover the full array
  cngdec: Update the LPC coefficients after averaging the reflection coefficients
  configure: fix print_config() with broke awks

Conflicts:
	libavcodec/x86/ac3dsp.asm
	libavcodec/x86/dct32.asm
	libavcodec/x86/deinterlace.asm
	libavcodec/x86/dsputil.asm
	libavcodec/x86/dsputilenc.asm
	libavcodec/x86/fft.asm
	libavcodec/x86/fmtconvert.asm
	libavcodec/x86/h264_chromamc.asm
	libavcodec/x86/h264_deblock.asm
	libavcodec/x86/h264_deblock_10bit.asm
	libavcodec/x86/h264_idct.asm
	libavcodec/x86/h264_idct_10bit.asm
	libavcodec/x86/h264_intrapred.asm
	libavcodec/x86/h264_intrapred_10bit.asm
	libavcodec/x86/h264_weight.asm
	libavcodec/x86/vc1dsp.asm
	libavcodec/x86/vp3dsp.asm
	libavcodec/x86/vp56dsp.asm
	libavcodec/x86/vp8dsp.asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-10-31 13:43:33 +01:00
Diego Biurrun 04581c8c77 x86: yasm: Use complete source path for macro helper %includes
This is more consistent with the way we handle C #includes and
it simplifies the build system.
2012-10-31 00:37:42 +01:00
Diego Biurrun 6860b4081d x86: include x86inc.asm in x86util.asm
This is necessary to allow refactoring some x86util macros with cpuflags.
2012-10-31 00:37:42 +01:00
Michael Niedermayer a32032b508 sws/x86: add some forgotten 12bit planar yuv cases
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-07-05 04:38:57 +02:00
Michael Niedermayer ca19862d38 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  libxvid: remove disabled code
  qdm2: make a table static const
  qdm2: simplify bitstream reader setup for some subpacket types
  qdm2: use get_bits_left()
  build: Consistently handle conditional compilation for all optimization OBJS.
  avpacket, bfi, bgmc, rawenc: K&R prettyprinting cosmetics
  msrle: convert MS RLE decoding function to bytestream2.
  x86inc improvements for 64-bit

Conflicts:
	common.mak
	libavcodec/avpacket.c
	libavcodec/bfi.c
	libavcodec/msrledec.c
	libavcodec/qdm2.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-04-13 00:39:19 +02:00
Henrik Gramner 729f90e268 x86inc improvements for 64-bit
Add support for all x86-64 registers
Prefer caller-saved register over callee-saved on WIN64
Support up to 15 function arguments

Also (by Ronald S. Bultje)
Fix up our asm to work with new x86inc.asm.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Justin Ruggles <justin.ruggles@gmail.com>
2012-04-11 15:47:00 -04:00
Michael Niedermayer 4257ce112c Merge remote-tracking branch 'qatar/master'
* qatar/master:
  dxa: remove useless code
  lavf: don't select an attached picture as default stream for seeking.
  avconv: remove pointless checks.
  avconv: check for get_filtered_frame() failure.
  avconv: remove a pointless check.
  swscale: convert hscale() to use named arguments.
  x86inc: add *mp named argument support to DEFINE_ARGS.
  swscale: convert hscale to cpuflags().

Conflicts:
	ffmpeg.c
	libswscale/x86/scale.asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-03-16 01:36:49 +01:00
Ronald S. Bultje 45fdcc8e2d swscale: convert hscale() to use named arguments. 2012-03-14 20:09:53 -07:00
Ronald S. Bultje aba7a827aa swscale: convert hscale to cpuflags(). 2012-03-14 20:09:53 -07:00
Michael Niedermayer 6df42f9874 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  SBR DSP: fix SSE code to not use SSE2 instructions.
  cpu: initialize mask to -1, so that by default, optimizations are used.
  error_resilience: initialize s->block_index[].
  svq3: protect against negative quantizers.
  Don't use ff_cropTbl[] for IDCT.
  swscale: make filterPos 32bit.
  FATE: add CPUFLAGS variable, mapping to -cpuflags avconv option.
  avconv: add -cpuflags option for setting supported cpuflags.
  cpu: add av_set_cpu_flags_mask().
  libx264: Allow overriding the sliced threads option
  avconv: fix counting encoded video size.

Conflicts:
	doc/APIchanges
	doc/fate.texi
	doc/ffmpeg.texi
	ffmpeg.c
	libavcodec/h264idct_template.c
	libavcodec/svq3.c
	libavutil/avutil.h
	libavutil/cpu.c
	libavutil/cpu.h
	libswscale/swscale.c
	tests/Makefile
	tests/fate-run.sh
	tests/regression-funcs.sh

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-03-07 03:22:49 +01:00
Ronald S. Bultje 2254b559cb swscale: make filterPos 32bit.
Fixes overflows for large image sizes.

Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
2012-03-06 10:47:41 -08:00
Michael Niedermayer e37f161e66 Merge remote-tracking branch 'qatar/master'
* qatar/master: (71 commits)
  movenc: Allow writing to a non-seekable output if using empty moov
  movenc: Support adding isml (smooth streaming live) metadata
  libavcodec: Don't crash in avcodec_encode_audio if time_base isn't set
  sunrast: Document the different Sun Raster file format types.
  sunrast: Add a check for experimental type.
  libspeexenc: use AVSampleFormat instead of deprecated/removed SampleFormat
  lavf: remove disabled FF_API_SET_PTS_INFO cruft
  lavf: remove disabled FF_API_OLD_INTERRUPT_CB cruft
  lavf: remove disabled FF_API_REORDER_PRIVATE cruft
  lavf: remove disabled FF_API_SEEK_PUBLIC cruft
  lavf: remove disabled FF_API_STREAM_COPY cruft
  lavf: remove disabled FF_API_PRELOAD cruft
  lavf: remove disabled FF_API_NEW_STREAM cruft
  lavf: remove disabled FF_API_RTSP_URL_OPTIONS cruft
  lavf: remove disabled FF_API_MUXRATE cruft
  lavf: remove disabled FF_API_FILESIZE cruft
  lavf: remove disabled FF_API_TIMESTAMP cruft
  lavf: remove disabled FF_API_LOOP_OUTPUT cruft
  lavf: remove disabled FF_API_LOOP_INPUT cruft
  lavf: remove disabled FF_API_AVSTREAM_QUALITY cruft
  ...

Conflicts:
	doc/APIchanges
	libavcodec/8bps.c
	libavcodec/avcodec.h
	libavcodec/libx264.c
	libavcodec/mjpegbdec.c
	libavcodec/options.c
	libavcodec/sunrast.c
	libavcodec/utils.c
	libavcodec/version.h
	libavcodec/x86/h264_deblock.asm
	libavdevice/libdc1394.c
	libavdevice/v4l2.c
	libavformat/avformat.h
	libavformat/avio.c
	libavformat/avio.h
	libavformat/aviobuf.c
	libavformat/dv.c
	libavformat/mov.c
	libavformat/utils.c
	libavformat/version.h
	libavformat/wtv.c
	libavutil/Makefile
	libavutil/file.c
	libswscale/x86/input.asm
	libswscale/x86/swscale_mmx.c
	libswscale/x86/swscale_template.c
	tests/ref/lavf/ffm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-01-28 07:53:34 +01:00
Ronald S. Bultje 3b15a6d742 config.asm: change %ifdef directives to %if directives.
This allows combining multiple conditionals in a single statement.
2012-01-27 10:19:57 +08:00
Michael Niedermayer 7f83db3124 Merge remote-tracking branch 'qatar/master'
* qatar/master: (46 commits)
  mtv: Make sure audio_subsegments is not 0
  v4l2: use V4L2_FMT_FLAG_EMULATED only if it is defined
  avconv: add symbolic names for -vsync parameters
  flvdec: Fix compiler warning for uninitialized variables
  rtsp: Fix compiler warning for uninitialized variable
  ulti: convert to new bytestream API.
  swscale: Use standard multiple inclusion guards in ppc/ header files.
  Place some START_TIMER invocations in separate blocks.
  v4l2: list available formats
  v4l2: set the proper codec_tag
  v4l2: refactor device_open
  v4l2: simplify away io_method
  v4l2: cosmetics
  v4l2: uniform and format options
  v4l2: do not force interlaced mode
  avio: exit early in fill_buffer without read_packet
  vc1dec: fix invalid memory access for small video dimensions
  rv34: fix invalid memory access for small video dimensions
  rv34: joint coefficient decoding and dequantization
  avplay: Don't call avio_set_interrupt_cb(NULL)
  ...

Conflicts:
	Changelog
	avconv.c
	doc/APIchanges
	doc/indevs.texi
	libavcodec/adxenc.c
	libavcodec/dnxhdenc.c
	libavcodec/h264.c
	libavdevice/v4l2.c
	libavformat/flvdec.c
	libavformat/mtv.c
	libswscale/utils.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-01-05 02:03:12 +01:00
Ronald S. Bultje 6ea64339c5 swscale: split scale.asm.
scale.asm keeps horizontal scaling functions, whereas output.asm gets
the vertical scaling/output functions.
2012-01-03 20:02:07 -08:00
Michael Niedermayer e462257242 Merge remote-tracking branch 'qatar/master'
* qatar/master: (23 commits)
  applehttp: Properly clean up if unable to probe a segment
  applehttp: Avoid reading uninitialized memory
  fate: Replace misleading "aac" in the name of an ADTS test with "adts".
  fate: Drop pointless "-an" from pictor test command.
  fate: split off image codec FATE tests into their own file
  fate: split off WMA codec FATE tests into their own file
  fate: split off lossless video and audio FATE tests into their own files
  fate: split off qtrle codec FATE tests into their own file
  fate: split off Ut Video codec FATE tests into their own file
  fate: split off screen codec FATE tests into their own file
  fate: split off Real Inc. codec FATE tests into their own file
  fate: split off AC-3 codec FATE tests into their own file
  mpegvideo: remove abort() in ff_find_unused_picture()
  rv40: NEON optimised loop filter strength selection
  rv40: rearrange loop filter functions
  configure: cosmetics: sort some lists where appropriate
  swscale_mmx: drop no longer required parameters from VSCALEX macros
  swscale: Mark yuv2planeX_8_mmx as MMX2; it contains MMX2 instructions.
  build: conditionally compile x86 H.264 chroma optimizations
  v410 encoder and decoder
  ...

Conflicts:
	Changelog
	configure
	doc/developer.texi
	doc/general.texi
	libavcodec/arm/asm.S
	libavcodec/avcodec.h
	libavcodec/v410dec.c
	libavcodec/v410enc.c
	libavcodec/version.h
	libavcodec/x86/Makefile
	libavcodec/x86/dsputil_mmx.c
	libswscale/x86/swscale_mmx.c
	tests/Makefile
	tests/fate2.mak

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-12-14 23:58:10 +01:00
Diego Biurrun 52de07e1f1 swscale: Mark yuv2planeX_8_mmx as MMX2; it contains MMX2 instructions. 2011-12-14 11:58:46 +01:00
Michael Niedermayer 5f268ca5c5 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  lavf: pass options from AVFormatContext to avio.
  avformat: Use avio_open2, pass the AVFormatContext interrupt_callback onwards
  avio: add avio_open2, taking an interrupt callback and options
  avio: add support for passing options to protocols.
  avio: add and use ffurl_protocol_next().
  avformat: Pass the interrupt callback on to chained muxers/demuxers
  avio: Add an AVIOInterruptCB parameter to ffurl_open/ffurl_alloc
  avformat: Use ff_check_interrupt
  avio: Add an internal utility function for checking the new interrupt callback
  avio: Add AVIOInterruptCB
  texi2html: remove stray \n
  doc: prettyfy the texi2html documentation
  swscale: handle unaligned buffers in yuv2plane1

Conflicts:
	libavformat/avformat.h
	libavformat/avio.c
	libavformat/mov.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-11-14 00:33:39 +01:00
Ronald S. Bultje 8283f90a52 swscale: handle unaligned buffers in yuv2plane1
The issue had been introduced in
c435653627

Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2011-11-13 08:27:20 +01:00
Michael Niedermayer 13b7781ec8 Merge remote-tracking branch 'qatar/master'
* qatar/master: (23 commits)
  x86inc: use sse versions of common macros instead of sse2 when applicable
  doc/APIchanges: add missing dates and hashes
  lavf: don't return from void av_update_cur_dts()
  Changelog: add more entries.
  Changelog: update ffmpeg/avconv incompatibility list.
  avconv: remove some redundant temporary variables.
  avconv: fix broken indentation
  avconv: move copy_initial_nonkeyframes to the options context.
  avconv: use file:stream instead of file.stream in log messages.
  doc/avconv: elaborate on basic functionality.
  doc/avconv: -sample_fmts, not -help sample_fmts prints the sample formats
  openssl: Only use CRYPTO_set_id_callback on OpenSSL < 1.0.0
  Call avformat_network_init/deinit in the programs
  Remove leftover includes of strings.h
  avutil: Don't allow using strcasecmp/strncasecmp
  Replace all usage of strcasecmp/strncasecmp
  avstring: Add locale independent implementations of strcasecmp/strncasecmp
  avstring: Add locale independent implementations of toupper/tolower
  cosmetics: insert some spaces in explicit enum value assignments
  move 8SVX audio codecs to the audio codec list part on the next bump
  ...

Conflicts:
	avprobe.c
	doc/APIchanges
	ffplay.c
	ffserver.c
	libavcodec/avcodec.h
	libavdevice/bktr.c
	libavdevice/v4l.c
	libavdevice/v4l2.c
	libavformat/matroskaenc.c
	libavformat/wtv.c
	libavutil/avstring.c
	libavutil/avstring.h
	libavutil/avutil.h
	libswscale/x86/swscale_template.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-11-07 03:01:43 +01:00
Ronald S. Bultje c435653627 swscale: write yuv2plane1 MMX/SSE2/SSE4/AVX functions. 2011-11-05 20:48:14 -07:00
Michael Niedermayer 2b0cdb7364 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  Move id3v2 tag writing to a separate file.
  swscale: add missing colons to x86 assembly yuv2planeX.
  g722: split decoder and encoder into separate files
  cosmetics: remove extra spaces before end-of-statement semi-colons
  vorbisdec: check output buffer size before writing output
  wavpack: calculate bpp using av_get_bytes_per_sample()
  ac3enc: Set max value for mode options correctly
  lavc: move get_b_cbp() from h263.h to mpeg4videoenc.c
  mpeg12: move closed_gop from MpegEncContext to Mpeg1Context
  mpeg12: move full_pel from MpegEncContext to Mpeg1Context
  mpeg12: move Mpeg1Context from mpeg12.c to mpeg12.h
  mpegvideo: remove some unused variables from MpegEncContext.

Conflicts:
	libavcodec/mpeg12.c
	libavformat/mp3enc.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-10-24 01:01:21 +02:00
Ronald S. Bultje 9e66b892e8 swscale: add missing colons to x86 assembly yuv2planeX.
This fixes assembling using "nasm".
2011-10-23 09:44:03 -07:00
Michael Niedermayer f97faf6751 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  id3v2: fix doxy comment - 'machine byte order' makes no sense on char arrays
  VC1: restore mistakenly removed code
  twinvq: check output buffer size before decoding
  twinvq: return an error when the packet size is too small
  lavf: export some forgotten symbols with non-av prefixes.
  swscale: update altivec yuv2planeX asm to new per-plane API.
  swscale: make yuv2yuvX_10_sse2/avx 8/9/16-bits aware.
  yuv2planeX10 SIMD
  swscale: decide whether to use yuv2plane1/X on a per-plane basis.
  swscale: reintroduce full precision in 16-bit output.
  Split up yuv2yuvX functions
  Split out yuv2yuv1 luma and chroma in order to make them generic DSP functions
  lavc: replace references to deprecated AVCodecContext.error_recognition to use AVCodecContext.err_recognition
  lavc: translate non-flag-based er options into flag-based ef options at codec open
  add -err_filter AVOptions to access flag-based error recognition
  h264_weight: initialize "height" function argument properly.
  presets: spelling error in libvpx 1080p50_60
  avplay: fix fullscreen behaviour with SDL 1.2.14 on Mac OS X

Conflicts:
	ffplay.c
	libavformat/libavformat.v
	libswscale/swscale.c
	libswscale/x86/swscale_template.c
	tests/ref/lavfi/pixfmts_scale

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-10-23 05:13:56 +02:00
Ronald S. Bultje 6cacecdca3 swscale: make yuv2yuvX_10_sse2/avx 8/9/16-bits aware.
Also implement MMX/MMX2 versions and SSE4 versions.
2011-10-22 10:35:14 -07:00
Kieran Kunhya 7fbbf95293 yuv2planeX10 SIMD
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-10-22 10:35:14 -07:00
Michael Niedermayer b81f8880e0 Merge remote-tracking branch 'qatar/master'
* qatar/master: (23 commits)
  fix AC3ENC_OPT_MODE_ON/OFF
  h264: fix HRD parameters parsing
  prores: implement multithreading.
  prores: idct sse2/sse4 optimizations.
  swscale: use aligned move for storage into temporary buffer.
  prores: extract idct into its own dspcontext and merge with put_pixels.
  h264: fix invalid shifts in init_cavlc_level_tab()
  intfloat_readwrite: fix signed addition overflows
  mov: do not misreport empty stts
  mov: cosmetics, fix for and if spacing
  id3v2: fix NULL pointer dereference
  mov: read album_artist atom
  mov: fix disc/track numbers and totals
  doc: fix references to obsolete presets directories for avconv/ffmpeg
  flashsv: return more meaningful error value
  flashsv: fix typo in av_log() message
  smacker: validate channels and sample format.
  smacker: check buffer size before reading output size
  smacker: validate number of channels
  smacker: Separate audio flags from sample rates in smacker demuxer.
  ...

Conflicts:
	cmdutils.h
	doc/ffmpeg.texi
	libavcodec/Makefile
	libavcodec/motion_est_template.c
	libavformat/id3v2.c
	libavformat/mov.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-10-12 05:40:57 +02:00
Ronald S. Bultje 6aa3cac6bf swscale: use aligned move for storage into temporary buffer.
The intermediate buffer is always aligned.
2011-10-11 07:50:48 -07:00
Michael Niedermayer 1eb8014b49 swscale: add 14bit support to the "MMX/SSE2/SSSE3/SSE4 versions for horizontal scaling"
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2011-09-14 00:19:03 +02:00
Ronald S. Bultje e0c3e07387 sws: implement MMX/SSE2/SSSE3/SSE4 versions for horizontal scaling.
Speed: from 3.9x to 9.6x speed improvement over C, and some small
(up to 15%) speed improvements over existing MMX code (particularly
for bigger filters).
2011-09-13 09:53:42 -07:00