Commit Graph

331 Commits

Author SHA1 Message Date
Carl Eugen Hoyos 5c0068758f Fix compilation with --disable-yasm. 2011-04-12 17:40:18 +02:00
Oskar Arvidsson 8dbe585641 Adds 8-, 9- and 10-bit versions of some of the functions used by the h264 decoder.
This patch lets e.g. dsputil_init chose dsp functions with respect to
the bit depth to decode. The naming scheme of bit depth dependent
functions is <base name>_<bit depth>[_<prefix>] (i.e. the old
clear_blocks_c is now named clear_blocks_8_c).

Note: Some of the functions for high bit depth is not dependent on the
bit depth, but only on the pixel size. This leaves some room for
optimizing binary size.

Preparatory patch for high bit depth h264 decoding support.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2011-04-10 22:33:42 +02:00
Michael Niedermayer 3c8493074b Merge remote-tracking branch 'newdev/master'
* newdev/master:
  dsputil: allow to skip drawing of top/bottom edges.
  Split fate-psx-str-v3 into a video-only and audio-only test.

Conflicts:
	libavcodec/dsputil.c
	libavcodec/mpegvideo.c
	libavcodec/snow.c
	libavcodec/x86/dsputil_mmx.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-03-27 01:40:18 +01:00
Alexander Strange 1500be13f2 dsputil: allow to skip drawing of top/bottom edges. 2011-03-26 17:45:38 -04:00
Michael Niedermayer 2fd41c9067 Merge remote-tracking branch 'newdev/master'
* newdev/master:
  avio: make udp_set_remote_url/get_local_port internal.
  asfdec: also subtract preroll when reading simple index object
  matroskaenc: remove a variable that's unused after bc17bd9.
  avio: cosmetics - nicer vertical alignment.
  Remove unnecessary icc version checks
  Disable 'attribute "foo" ignored' warnings from icc
  rtsp: Don't use a locale dependent format string
  Add xd55 codec tag for XDCAM HD422 720p25 CBR files.
  configure: get libavcodec version from new version.h header
  lavc: move the version macros to a new installed header.
  matroskaenc: simplify get_aac_sample_rates by using ff_mpeg4audio_get_config
  Do not use format string "%0.3f" for RTSP Range field.
  Add apply_window_int16() to DSPContext with x86-optimized versions and use it in the ac3_fixed encoder.
  Document usage of import libraries created by dlltool
  configure: Set the correct lib target for arm/wince dlltool
  fate: simplify regression-funcs.sh
  fate: add support for multithread testing

Conflicts:
	libavformat/rtspdec.c
	libavutil/attributes.h
	libavutil/internal.h
	libavutil/mem.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-03-24 02:16:11 +01:00
Justin Ruggles e6e9823488 Add apply_window_int16() to DSPContext with x86-optimized versions and use it
in the ac3_fixed encoder.
2011-03-22 21:08:30 -04:00
Michael Niedermayer d375c10400 Fake-Merge remote-tracking branch 'ffmpeg-mt/master' 2011-03-22 22:36:57 +01:00
Michael Niedermayer d4a50a2100 Merge remote-tracking branch 'newdev/master'
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-03-21 03:33:28 +01:00
Mans Rullgard 0aded9484d Move dct and rdft definitions to separate files
This leaves fft.h with only the core FFT and MDCT definitions
thus making it more managable.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-03-20 17:15:33 +00:00
Mans Rullgard 2912e87a6c Replace FFmpeg with Libav in licence headers
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-03-19 13:33:20 +00:00
Justin Ruggles 0f999cfddb ac3enc: add float_to_fixed24() with x86-optimized versions to AC3DSPContext
and use in scale_coefficients() for the floating-point AC-3 encoder.
2011-03-17 16:46:48 -04:00
Justin Ruggles 79414257e2 mathops: fix MULL() when the compiler does not inline the function.
If the function is not inlined, an immmediate cannot be used for the
shift parameter, so the %cl register must be used instead in that case.

This fixes compilation for x86-32 using gcc with --disable-optimizations.
2011-03-15 20:49:37 -04:00
Justin Ruggles aaff3b312e mathops: change "g" constraint to "rm" in x86-32 version of MUL64().
The 1-arg imul instruction cannot take an immediate argument, only a register
or memory argument.
2011-03-15 13:43:47 -04:00
Justin Ruggles b181b8fb96 mathops: convert MULL/MULH/MUL64 to inline functions rather than macros.
This fixes unexpected name collisions that were occurring with variables
declared within the macros.
It also fixes the fate-acodec-ac3_fixed regression test on x86-32.
2011-03-15 13:43:47 -04:00
Justin Ruggles f1efbca5e9 ac3enc: add SIMD-optimized shifting functions for use with the fixed-point AC3 encoder. 2011-03-14 08:45:31 -04:00
Mans Rullgard a5444fee06 Add CONFIG_AC3DSP symbol to simplify makefiles
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-03-12 11:35:26 +00:00
Ronald S. Bultje bf6fa73245 dsputil_mmx.c: remove ff_vector128.
Remove ff_vector128, it is identical to ff_pb_80.
2011-02-19 10:51:15 -05:00
Ronald S. Bultje 12802ec060 dsputil: move VC1-specific stuff into VC1DSPContext. 2011-02-17 17:35:35 -05:00
Justin Ruggles 1f004fc512 ac3dsp: Change punpckhqdq to movhlps in ac3_max_msb_abs_int16().
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-02-16 14:08:34 -05:00
Justin Ruggles fbb6b49dab ac3enc: Add x86-optimized function to speed up log2_tab().
AC3DSPContext.ac3_max_msb_abs_int16() finds the maximum MSB of the absolute
value of each element in an array of int16_t.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-02-13 16:49:39 -05:00
Loren Merritt e6b1ed693a FFT: factor a shuffle out of the inner loop and merge it into fft_permute.
6% faster SSE FFT on Conroe, 2.5% on Penryn.

Signed-off-by: Janne Grunau <janne-ffmpeg@jannau.net>
2011-02-13 15:36:39 +01:00
Justin Ruggles dda3f0ef48 Add x86-optimized versions of exponent_min().
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-02-10 15:32:47 -05:00
Ronald S. Bultje 17cf7c68ed Fix ff_emu_edge_core_sse() on Win64.
Fix emu_edge_v_extend_15 to be <128 bytes on Win64, by being more strict
on the size of registers and which registers are being used for operations
where multiple are available. This fixes segfaults in emulated_edge()
function calls on Win64.
2011-02-08 18:25:12 -05:00
Justin Ruggles c73d99e672 Separate format conversion DSP functions from DSPContext.
This will be beneficial for use with the audio conversion API without
requiring it to depend on all of dsputil.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-02-02 02:44:53 +00:00
Alex Converse 770c410fbb Fix ff_imdct_calc_sse() on gcc-4.6
Gcc 4.6 only preserves the first value when using an array with an "m"
constraint.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-02-02 02:40:05 +00:00
Ronald S. Bultje 81f2a3f4ff Implement a SIMD version of emulated_edge_mc() for x86.
From ~550 cycles (C version) to 170 (SSE/x86-64), 206 (MMX/x86-32)
and 196 (SSE2/x86-32) cycles.
2011-01-31 20:55:56 -05:00
Justin Ruggles d19b744a36 cosmetics: indentation
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-01-31 20:30:15 +00:00
Justin Ruggles 80ba1ddb58 Remove unneeded add bias from 3 functions.
DSPContext.vector_fmul_window()
DCADSPContext.lfe_fir()
SynthFilterContext.synth_filter_float()

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-01-31 20:28:42 +00:00
Mans Rullgard 80944df720 x86: fix overflow in h264 8x8 planar prediction
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-01-24 23:24:28 +00:00
Justin Ruggles 6eabb0d3ad Change DSPContext.vector_fmul() from dst=dst*src to dest=src0*src1.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-01-22 17:53:27 +00:00
Justin Ruggles 1c189fc533 cosmetics related to LPC changes.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-01-21 19:59:08 +00:00
Justin Ruggles 77a78e9bdc Separate window function from autocorrelation.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-01-21 19:59:08 +00:00
Justin Ruggles 56f8952b25 Move lpc_compute_autocorr() from DSPContext to a new struct LPCContext.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-01-21 19:58:59 +00:00
Ronald S. Bultje b9c7f66e6d Fix horizontal/horizontal_up 8x8l intra prediction x86/simd functions.
The original functions did not work correctly for edge pixels, e.g.
when CODEC_FLAG_EMU_EDGE is set, leading to corrupt output in e.g. VLC.
Based on a patch by Daniel Kang <daniel d kang gmail com>.

Signed-off-by: Ronald S. Bultje <rsbultje gmail com>
2011-01-19 20:34:42 -05:00
Mans Rullgard ef4a65149d Replace ASMALIGN() with .p2align
This macro has unconditionally used .p2align for a long time and
serves no useful purpose.
2011-01-18 20:48:24 +00:00
Mans Rullgard ac3c9d0169 x86: remove VLA in ac3_downmix_sse 2011-01-18 20:48:24 +00:00
Janne Grunau 2c3589bfda consolidate .gitignore patters into a single file
Signed-off-by: Janne Grunau <janne-ffmpeg@jannau.net>
2011-01-18 21:32:05 +01:00
Janne Grunau 348b8218f7 convert svn:ignore properties to .gitignore files
Signed-off-by: Janne Grunau <janne-ffmpeg@jannau.net>
2011-01-17 15:50:14 +01:00
Ronald S. Bultje 1b3e43e4fd Fix overflow in pred16x16_plane x86 simd code. Fixes issue 2547.
Originally committed as revision 26381 to svn://svn.ffmpeg.org/ffmpeg/trunk
2011-01-15 22:00:44 +00:00
Ronald S. Bultje ec3233a855 Fix ff_pw_3 alignment.
Originally committed as revision 26344 to svn://svn.ffmpeg.org/ffmpeg/trunk
2011-01-14 23:26:34 +00:00
Jason Garrett-Glaser 19fb234e4a H.264: split luma dc idct out and implement MMX/SSE2 versions
About 2.5x the speed.

NOTE: the way that the asm code handles large qmuls is a bit suboptimal.
If x264-style dequant was used (separate shift and qmul values), it might
be possible to get some extra speed.

Originally committed as revision 26336 to svn://svn.ffmpeg.org/ffmpeg/trunk
2011-01-14 21:34:25 +00:00
Daniel Kang 004357a11f Fix compilation on x86-32 with --disable-optimizations,
fixes issue 2127.

Patch by Daniel Kang, daniel.d.kang at gmail

Originally committed as revision 26204 to svn://svn.ffmpeg.org/ffmpeg/trunk
2011-01-03 11:30:04 +00:00
Daniel Kang 0790caba60 Fix invalid reads in valgrind fate, patch by Daniel Kang <daniel dot d dot
kang at gmail com>, as part of Google's GCI 2010.

Originally committed as revision 26177 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-31 01:29:06 +00:00
Daniel Kang 536e9b2f58 Port pred8x8l_down_left_mmxext (H.264 intra prediction) from x264 (authors:
Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot d dot kang
at gmail com>, as part of Google's GCI 2010.

Originally committed as revision 26162 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 23:48:44 +00:00
Daniel Kang 720ea2d5b2 Port pred4x4_down_right_mmxext (H.264 intra prediction) from x264 (authors:
Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot d dot kang
at gmail com>, as part of Google's GCI 2010.

Originally committed as revision 26159 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 21:55:51 +00:00
Daniel Kang d0aebe23e2 Port pred4x4_vertical_right_mmxext (H.264 intra prediction) from x264 (authors:
Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot d dot kang
at gmail com>, as part of Google's GCI 2010.

Originally committed as revision 26158 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 21:52:41 +00:00
Daniel Kang 76497232ef Port pred4x4_horizontal_down_mmxext (H.264 intra prediction) from x264
(authors:Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot
d dot kang at gmail com>, as part of Google's GCI 2010.

Originally committed as revision 26157 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 21:49:57 +00:00
Daniel Kang e9c576a467 Port pred4x4_horizontal_up_mmxext (H.264 intra prediction) from x264 (authors:
Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot d dot kang
at gmail com>, as part of Google's GCI 2010.

Originally committed as revision 26156 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 21:42:33 +00:00
Daniel Kang 92f441ae86 Port pred4x4_vertical_left_mmxext (H.264 intra prediction) from x264 (authors:
Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot d dot kang
at gmail com>, as part of Google's GCI 2010.

Originally committed as revision 26155 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 21:35:34 +00:00
Ronald S. Bultje e8d98764cc Merge a few superfluous CONFIG_GPL checks.
Originally committed as revision 26154 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 21:30:47 +00:00