146 Commits

Author SHA1 Message Date
Michael Niedermayer d82d11397f avutil/arm/intmath: return int for uint8 / uint16 clip
The C functions return uint8_t/uint16_t, which after integer promotion is effectively int, not unsigned int.
Fixes fate-filter-tblend

We do not return uint8/16_t as that would require the compiler to truncate the
values, slowing it down.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-07-20 17:20:16 +02:00
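
The effect of the return type can be sketched in plain C. The helper names below are hypothetical and are not the FFmpeg functions; the sketch only shows that a clipped value typed as unsigned keeps later arithmetic unsigned, while a value typed as int (what a uint8_t result promotes to) does not.

```c
#include <assert.h>

/* Hypothetical helper: clip to 0..255 and return the result as int,
 * which is what a uint8_t return value promotes to in expressions. */
static int clip_uint8_as_int(int a)
{
    if (a < 0)   return 0;
    if (a > 255) return 255;
    return a;
}

/* Hypothetical helper: the same clip, but typed as unsigned int. */
static unsigned clip_uint8_as_unsigned(int a)
{
    return (unsigned)clip_uint8_as_int(a);
}

/* Difference of two clipped values using signed arithmetic. */
static long long diff_as_int(int a, int b)
{
    return clip_uint8_as_int(a) - clip_uint8_as_int(b);
}

/* Same difference using unsigned arithmetic: a negative result wraps
 * around to a large positive value before it is widened. */
static long long diff_as_unsigned(int a, int b)
{
    /* Assumption of this sketch: long long is wider than unsigned. */
    assert(sizeof(long long) > sizeof(unsigned));
    return clip_uint8_as_unsigned(a) - clip_uint8_as_unsigned(b);
}
```

With these helpers, diff_as_int(100, 200) is negative while diff_as_unsigned(100, 200) is a large positive number, which is the kind of mismatch a test comparing the C and ARM paths could expose.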
Andreas Cadhalpun 5bf84a584e arm: only enable setend on ARMv6
Without this check it causes SIGILL crashes on ARMv5.

Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
2015-06-05 17:14:10 +02:00
Michael Niedermayer 9a1884a10e Merge commit 'dcae2e32f7d8a1ca5fb8c1e4aa81313be854dd73'
* commit 'dcae2e32f7d8a1ca5fb8c1e4aa81313be854dd73':
  arm: Suppress tags about used cpu arch and extensions

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2015-03-07 19:30:51 +01:00
Martin Storsjö dcae2e32f7 arm: Suppress tags about used cpu arch and extensions
When all the codepaths using manually set .arch/.fpu code are
behind runtime detection, the ELF attributes should be suppressed.

This allows tools to know that the final built binary doesn't
strictly require these extensions.

Signed-off-by: Martin Storsjö <martin@martin.st>
2015-03-07 17:10:08 +02:00
Michael Niedermayer 1253091d6f Merge commit '76ce9bd8e26dcb3652240a1072840ff4011d7cdc'
* commit '76ce9bd8e26dcb3652240a1072840ff4011d7cdc':
  libavutil: Add ARM av_clip_intp2_arm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2015-02-21 11:15:32 +01:00
Peter Meerwald 76ce9bd8e2 libavutil: Add ARM av_clip_intp2_arm
add ARM code for implementing av_clip_intp2 using the ssat instruction

on Cortex-A8, av_clip_intp2_arm() is faster than av_clip_intp2_c() and
the generic av_clip(), about -19%

Signed-off-by: Peter Meerwald <pmeerw@pmeerw.net>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2015-02-21 00:54:40 +01:00
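
For reference, the operation being accelerated can be sketched portably. The function name below is hypothetical; the sketch assumes av_clip_intp2(a, p) clips to the signed range [-2^p, 2^p - 1], which is what saturating to p + 1 bits with ssat computes.

```c
#include <assert.h>

/* Portable sketch (hypothetical name): clip a to the signed range
 * [-(1 << p), (1 << p) - 1], i.e. saturate to p + 1 bits. */
static int clip_intp2_sketch(int a, int p)
{
    assert(p >= 0 && p < 31);   /* keep the shift well defined */
    int max = (1 << p) - 1;
    int min = -max - 1;
    if (a < min) return min;
    if (a > max) return max;
    return a;
}
```

For example, clip_intp2_sketch(300, 7) yields 127 and clip_intp2_sketch(-300, 7) yields -128.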
Michael Niedermayer 16e65419ed Merge commit 'f963f80399deb1a2b44c1bac3af7123e8a0c9e46'
* commit 'f963f80399deb1a2b44c1bac3af7123e8a0c9e46':
  arm: Use .data.rel.ro for const data with relocations

Conflicts:
	configure

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-12-09 11:58:13 +01:00
Martin Storsjö f963f80399 arm: Use .data.rel.ro for const data with relocations
Signed-off-by: Martin Storsjö <martin@martin.st>
2014-12-09 11:43:25 +02:00
jessejiang 29d208d5d4 avutil/arm/float_dsp_init_vfp: replace restrict by av_restrict
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-11-20 11:17:42 +01:00
Michael Niedermayer 1e519b9d40 avutil: turn arm setend into a cpuflag
this allows disabling and enabling it
it also prevents crashes if vfpv3 and neon are disabled which previously
would have enabled the flag

And last but not least, one can enable setend on CPUs like Cortex-A8, where
it is fast but disabled by default

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-13 14:50:15 +02:00
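
A rough sketch of the idea, with hypothetical flag names and bit values (not the real AV_CPU_FLAG_* constants): once setend has its own bit, it can be forced on or off independently, and disabling other features no longer changes it.

```c
#include <assert.h>

/* Hypothetical flag bits for illustration only. */
enum {
    SKETCH_FLAG_ARMV6  = 1 << 0,
    SKETCH_FLAG_NEON   = 1 << 1,
    SKETCH_FLAG_SETEND = 1 << 2,
};

/* Apply a user override: bits in force_on are set,
 * bits in force_off are cleared. */
static int apply_override(int detected, int force_on, int force_off)
{
    assert((force_on & force_off) == 0);  /* a bit cannot be forced both ways */
    return (detected | force_on) & ~force_off;
}

/* Code paths that use setend test the dedicated bit and nothing else. */
static int can_use_setend(int flags)
{
    return (flags & SKETCH_FLAG_SETEND) != 0;
}
```

In this sketch, clearing the NEON bit leaves the setend decision untouched, and a CPU where setend is off by default can opt in through force_on.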
Michael Niedermayer 7cdb3b2b79 Merge commit '6869612f5c7d4d2f20f69a5658328a761deadb1c'
* commit '6869612f5c7d4d2f20f69a5658328a761deadb1c':
  arm: Macroize the test for 'setend' CPU instruction support

Conflicts:
	libavcodec/arm/h264dsp_init_arm.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-22 12:46:13 +02:00
Ben Avison 6869612f5c arm: Macroize the test for 'setend' CPU instruction support
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2014-07-21 15:08:01 -07:00
Ben Avison 5a272190a0 armv6: Accelerate butterflies_float
I benchmarked the result by measuring the number of gperftools samples that
hit anywhere in the AAC decoder (starting from aac_decode_frame()) or
specifically in butterflies_float_c() / ff_butterflies_float_vfp() for the
same sample AAC stream:

                   Before          After
                   Mean   StdDev   Mean   StdDev  Confidence  Change
Audio decode       1542.8 43.7     1470.5 41.5    100.0%      +4.9%
butterflies_float  130.0  11.9     70.2   12.1    100.0%      +85.2%

Signed-off-by: Martin Storsjö <martin@martin.st>
2014-07-18 01:34:38 +03:00
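
The Change column in these benchmark tables appears to be the ratio of the Before and After mean sample counts, minus one. That reading is inferred from the numbers rather than stated in the commit message; the helper below is a hypothetical sketch of it.

```c
#include <assert.h>

/* Hypothetical helper: speedup in percent, taken as before/after - 1.
 * Fewer samples after the change gives a positive result. */
static double speedup_percent(double before, double after)
{
    assert(after > 0.0);
    return (before / after - 1.0) * 100.0;
}
```

Under that reading, speedup_percent(130.0, 70.2) is about 85.2 and speedup_percent(1542.8, 1470.5) is about 4.9, matching the table.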
Ben Avison 5edad2c4a1 armv6: Accelerate vector_fmul_window
I benchmarked the result by measuring the number of gperftools samples that
hit anywhere in the AAC decoder (starting from aac_decode_frame()) or
specifically in vector_fmul_window_c() / ff_vector_fmul_window_vfp() for the
same sample AAC stream:

                    Before          After
                    Mean   StdDev   Mean   StdDev  Confidence  Change
Audio decode        1598.2 47.4     1529.2 25.4    100.0%      +4.5%
vector_fmul_window  244.0  22.1     188.9  22.3    100.0%      +29.2%

Signed-off-by: Martin Storsjö <martin@martin.st>
2014-07-18 01:34:31 +03:00
Ben Avison 57641410d1 armv6: Accelerate butterflies_float
I benchmarked the result by measuring the number of gperftools samples that
hit anywhere in the AAC decoder (starting from aac_decode_frame()) or
specifically in butterflies_float_c() / ff_butterflies_float_vfp() for the
same sample AAC stream:

                   Before          After
                   Mean   StdDev   Mean   StdDev  Confidence  Change
Audio decode       1542.8 43.7     1470.5 41.5    100.0%      +4.9%
butterflies_float  130.0  11.9     70.2   12.1    100.0%      +85.2%

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-16 21:38:02 +02:00
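
As a point of reference, the scalar operation behind butterflies_float can be sketched as below. This assumes the usual in-place sum/difference butterfly over two float vectors; the function name is hypothetical and the sketch is not the VFP code from the commit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar sketch: replace v1 with the element-wise sum
 * and v2 with the element-wise difference of the two inputs. */
static void butterflies_float_sketch(float *v1, float *v2, size_t len)
{
    assert(v1 != NULL && v2 != NULL);
    for (size_t i = 0; i < len; i++) {
        float diff = v1[i] - v2[i];
        v1[i] += v2[i];   /* sum goes to the first vector */
        v2[i] = diff;     /* difference goes to the second */
    }
}
```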
Ben Avison 649c666137 armv6: Accelerate vector_fmul_window
I benchmarked the result by measuring the number of gperftools samples that
hit anywhere in the AAC decoder (starting from aac_decode_frame()) or
specifically in vector_fmul_window_c() / ff_vector_fmul_window_vfp() for the
same sample AAC stream:

                    Before          After
                    Mean   StdDev   Mean   StdDev  Confidence  Change
Audio decode        1598.2 47.4     1529.2 25.4    100.0%      +4.5%
vector_fmul_window  244.0  22.1     188.9  22.3    100.0%      +29.2%

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-16 21:37:41 +02:00
Michael Niedermayer 01983e50c0 Merge commit '7b0c7c9163fe3dd0081696befde28617119d2590'
* commit '7b0c7c9163fe3dd0081696befde28617119d2590':
  arm: Detect 32 bit cpu features on ARMv8 when running on a 64 bit kernel

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-28 21:31:18 +02:00
Martin Storsjö 7b0c7c9163 arm: Detect 32 bit cpu features on ARMv8 when running on a 64 bit kernel
When running on a 64 bit kernel, /proc/cpuinfo lists different
optional features than on 32 bit kernels (because some of them
are mandatory in the 64 bit implementations).

The kernel does list the old features properly if they are queried
via /proc/self/auxv though - however this file is not always readable
(e.g. on most android systems). The getauxval function could also
provide the same info as /proc/self/auxv even if this file isn't
readable, but this function is not always available (and thus would
need to be loaded with dlsym for compatibility with older android
versions).

The android cpufeatures library does this slightly differently,
by assuming that these are available if the "CPU architecture"
line is >= 8, see [1] for details.

It has been suggested to include the old, non-optional features in
/proc/cpuinfo as well, but that suggested patch never was merged.
See [2] for the discussion around this suggestion.

[1] https://android-review.googlesource.com/91380
[2] http://marc.info/?l=linux-arm-kernel&m=139087240101974

Signed-off-by: Martin Storsjö <martin@martin.st>
2014-06-28 22:16:59 +03:00
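
The "CPU architecture" heuristic described above can be sketched as a small parser over cpuinfo-style text. The function names are hypothetical and the sketch works on an in-memory string rather than reading /proc/cpuinfo; it is not the code from the commit.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return the number on the "CPU architecture" line of cpuinfo-style
 * text, or -1 if no such line is found. */
static int cpuinfo_arch_version(const char *text)
{
    static const char key[] = "CPU architecture";
    assert(text != NULL);
    for (const char *line = text; line; ) {
        const char *eol = strchr(line, '\n');
        if (strncmp(line, key, sizeof(key) - 1) == 0) {
            const char *colon = strchr(line, ':');
            /* Only accept a colon that belongs to this line. */
            if (colon && (!eol || colon < eol))
                return atoi(colon + 1);
        }
        line = eol ? eol + 1 : NULL;
    }
    return -1;
}

/* Assume the older 32 bit features are present when the reported
 * architecture version is 8 or higher. */
static int implies_armv8_features(const char *text)
{
    return cpuinfo_arch_version(text) >= 8;
}
```

A caller would still read the explicit feature list first and use this check only to fill in features that a 64 bit kernel no longer lists.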
Michael Niedermayer a40c338a00 Merge commit 'd5a55981986ac5d1a31aef3a8d16eaff8534a412'
* commit 'd5a55981986ac5d1a31aef3a8d16eaff8534a412':
  build: check if AS supports the '.func' directive

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-04 12:45:35 +02:00
Janne Grunau d5a5598198 build: check if AS supports the '.func' directive
Not supported by Clang's integrated assembler. Since it just adds
debug information, it can safely be omitted.
2014-06-03 14:23:03 +02:00
Michael Niedermayer 1c788eaca9 Merge commit '831a1180785a786272cdcefb71566a770bfb879e'
* commit '831a1180785a786272cdcefb71566a770bfb879e':
  Update dsputil- and SIMD-related comments to match reality more closely

Conflicts:
	libavcodec/x86/hpeldsp.asm
	libavutil/arm/float_dsp_init_arm.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-13 23:59:56 +01:00
Diego Biurrun 831a118078 Update dsputil- and SIMD-related comments to match reality more closely 2014-03-13 05:50:29 -07:00
Michael Niedermayer a74bab7079 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  arm: hpeldsp: prevent overreads in armv6 asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-05 21:35:30 +01:00
Janne Grunau cbddee1cca arm: hpeldsp: prevent overreads in armv6 asm
Based on a patch by Russel King <rmk+libav@arm.linux.org.uk>

Bug-Id: 646
CC: libav-stable@libav.org
2014-03-05 14:30:57 +01:00
Michael Niedermayer 53d11f7b2d Merge commit '543156d7518f5e5d731123da066d86278f9fa492'
* commit '543156d7518f5e5d731123da066d86278f9fa492':
  arm: Mark the stack as non-executable

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-19 14:07:26 +01:00
Martin Storsjö 543156d751 arm: Mark the stack as non-executable
If linking in an object file without this attribute set, the
linker will assume that an executable stack might be needed.

Signed-off-by: Martin Storsjö <martin@martin.st>
2014-02-19 09:57:19 +02:00
Michael Niedermayer a7574a36af Merge commit 'e3fec3f095ab5ea08ee662942d98526aaf5e3635'
* commit 'e3fec3f095ab5ea08ee662942d98526aaf5e3635':
  arm: Add EXTERN_ASM to the .func and .type declarations for exported symbols

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-08 00:49:28 +01:00
Martin Storsjö e3fec3f095 arm: Add EXTERN_ASM to the .func and .type declarations for exported symbols
This makes the generated assembly more internally consistent,
avoiding declaring two labels for the same function (for cases
where EXTERN_ASM is empty) and not declaring a separate unprefixed
label in other cases.

This also makes sure the .func and .type declarations have the same
prefix. They have previously not been used on the platforms
that have prefixed symbols on arm (iOS), but gas-preprocessor
has recently started using the .func declarations for adding
.thumb_func declarations for such functions.

Signed-off-by: Martin Storsjö <martin@martin.st>
2014-02-07 15:14:06 +02:00
Michael Niedermayer 9d5cc55f0f Merge remote-tracking branch 'qatar/master'
* qatar/master:
  arm: Add an option for making sure NEON registers aren't clobbered

Conflicts:
	configure

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-01-11 03:08:10 +01:00
Martin Storsjö 44a0a98f92 arm: Add an option for making sure NEON registers aren't clobbered
This is pretty much based on the same test for XMM registers.

Signed-off-by: Martin Storsjö <martin@martin.st>
2014-01-11 00:03:00 +02:00
Michael Niedermayer edba54630b Merge commit '5dae4872357613a0b51120b54a4c5221e0ec3f69'
* commit '5dae4872357613a0b51120b54a4c5221e0ec3f69':
  arm: Allow overriding the alignment set in the function macro

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-01-08 05:36:56 +01:00
Martin Storsjö 5dae487235 arm: Allow overriding the alignment set in the function macro
The function macro always sets .align 2 before declaring the
function label (since 5c5e1ea3) and always sets the section to
.text (since 278caa6a).

The .align 5 before certain functions, added in fc252eba, were added
before .text and .align were added to the function macro and thus
became useless/unused when the function macro got them.

This restores the original intention, to align the loop entry
points.

Signed-off-by: Martin Storsjö <martin@martin.st>
2014-01-07 19:29:56 +02:00
Thilo Borgmann d814a839ac Reinstate proper FFmpeg license for all files. 2013-08-30 15:47:38 +00:00
Michael Niedermayer 946f080b54 Merge commit '7ffda66fd5c81af4725bff7c2c4f207ba2aa0613'
* commit '7ffda66fd5c81af4725bff7c2c4f207ba2aa0613':
  arm: float_dsp: Propagate cpu_flags to vfp initialization function

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-08-29 16:05:04 +02:00
Michael Niedermayer 2a60666d1d Merge commit '8410d6e93c2e074881f1c7b7e4cdefd2e497d52e'
* commit '8410d6e93c2e074881f1c7b7e4cdefd2e497d52e':
  avutil: Refactor CPU extension availability macros

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-08-29 14:15:10 +02:00
Michael Niedermayer c83d794936 Merge commit 'b78b10c4b78b696927f2801cf2d9f193b4eff28b'
* commit 'b78b10c4b78b696927f2801cf2d9f193b4eff28b':
  avutil: Move internal CPU detection function declarations to private header

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-08-29 14:05:15 +02:00
Diego Biurrun 7ffda66fd5 arm: float_dsp: Propagate cpu_flags to vfp initialization function 2013-08-29 11:24:14 +02:00
Diego Biurrun 8410d6e93c avutil: Refactor CPU extension availability macros 2013-08-28 23:54:14 +02:00
Diego Biurrun b78b10c4b7 avutil: Move internal CPU detection function declarations to private header 2013-08-28 23:54:14 +02:00
Michael Niedermayer c88503e3f6 Merge commit '439902e0d68a0f0d800c21b5e6b598d5fa0c51da'
* commit '439902e0d68a0f0d800c21b5e6b598d5fa0c51da':
  Employ consistent LIBAV_COMPAT_ multiple inclusion guards in compat/

Conflicts:
	compat/aix/math.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-07-19 10:56:10 +02:00
Diego Biurrun 439902e0d6 Employ consistent LIBAV_COMPAT_ multiple inclusion guards in compat/
Also fix a comment and an #endif comment.
2013-07-18 18:12:38 +02:00
Michael Niedermayer b7c6d1ed90 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  arm: Only output eabi attributes if building for ELF
  fix scalarproduct_and_madd_int16_altivec() for orders > 16

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-05-27 08:55:24 +02:00
Martin Storsjö be7952b5c3 arm: Only output eabi attributes if building for ELF
This matches the other eabi attribute in the same file. This is
required in order to build for arm/hardfloat with other object
file formats than ELF.

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-05-27 00:55:33 +03:00
Michael Niedermayer 3c200aa693 Merge commit '1fda184a85178cfd7b98d9e308d18e1ded76a511'
* commit '1fda184a85178cfd7b98d9e308d18e1ded76a511':
  avutil: Add av_cold attributes to init functions missing them

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-05-05 12:53:50 +02:00
Diego Biurrun 1fda184a85 avutil: Add av_cold attributes to init functions missing them 2013-05-04 22:48:05 +02:00
Michael Niedermayer 3ccda2b02b Merge commit '375ef6528c9dd2db7f9881e232cb0ec3aa16970d'
* commit '375ef6528c9dd2db7f9881e232cb0ec3aa16970d':
  libfdk-aacenc: Actually check for upper bounds of cutoff
  arm: Fall back to runtime cpu feature detection via /proc/cpuinfo

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-02-12 12:41:09 +01:00
Martin Storsjö ab8f1a6989 arm: Fall back to runtime cpu feature detection via /proc/cpuinfo
On recent android versions, /proc/self/auxv is unreadable
(unless the process is running under the shell uid or
in debuggable mode, which makes it hard to notice). See
http://b.android.com/43055 and
https://android-review.googlesource.com/51271 for more information
about the issue.

This makes sure e.g. neon optimizations are enabled at runtime in
android apps even when built in release mode, if configured to
use the runtime detection.

CC: libav-stable@libav.org
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-02-11 17:15:15 +02:00
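
A fallback of this kind can be sketched as a scan of the "Features" line in cpuinfo-style text. The function name and the sample layout are assumptions for illustration; the sketch takes the text as a string instead of opening /proc/cpuinfo and is not the code from the commit.

```c
#include <assert.h>
#include <string.h>

/* Return 1 if `name` appears as a whole word on a "Features" line of
 * cpuinfo-style text, 0 otherwise. */
static int cpuinfo_has_feature(const char *text, const char *name)
{
    static const char key[] = "Features";
    size_t name_len;

    assert(text != NULL && name != NULL);
    name_len = strlen(name);

    for (const char *line = text; line; ) {
        const char *eol = strchr(line, '\n');
        if (strncmp(line, key, sizeof(key) - 1) == 0) {
            const char *end = eol ? eol : line + strlen(line);
            const char *p = strchr(line, ':');
            if (p && p < end) {
                p++;
                while (p < end) {
                    /* Skip separators, then measure one token. */
                    while (p < end && (*p == ' ' || *p == '\t'))
                        p++;
                    const char *tok = p;
                    while (p < end && *p != ' ' && *p != '\t')
                        p++;
                    if (name_len > 0 && (size_t)(p - tok) == name_len &&
                        memcmp(tok, name, name_len) == 0)
                        return 1;
                }
            }
        }
        line = eol ? eol + 1 : NULL;
    }
    return 0;
}
```

Matching whole tokens matters here: a plain substring search for "vfp" would also match "vfpv3".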
Michael Niedermayer 8102f27b5b Merge commit '73b704ac609d83e0be124589f24efd9b94947cf9'
* commit '73b704ac609d83e0be124589f24efd9b94947cf9':
  arm: Add some missing header #includes
  floatdsp: move scalarproduct_float from dsputil to avfloatdsp.

Conflicts:
	libavcodec/acelp_pitch_delay.c
	libavcodec/amrnbdec.c
	libavcodec/amrwbdec.c
	libavcodec/ra288.c
	libavcodec/x86/dsputil_mmx.c
	libavutil/x86/float_dsp.asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-23 14:31:55 +01:00
Michael Niedermayer 24604ebaf8 Merge commit '5959bfaca396ecaf63a8123055f499688b79cae3'
* commit '5959bfaca396ecaf63a8123055f499688b79cae3':
  floatdsp: move butterflies_float from dsputil to avfloatdsp.

Conflicts:
	libavcodec/dsputil.c
	libavcodec/dsputil.h
	libavcodec/imc.c
	libavcodec/mpegaudiodec.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-23 14:13:54 +01:00
Michael Niedermayer 6e6e170898 Merge commit '42d324694883cdf1fff1612ac70fa403692a1ad4'
* commit '42d324694883cdf1fff1612ac70fa403692a1ad4':
  floatdsp: move vector_fmul_reverse from dsputil to avfloatdsp.

Conflicts:
	libavcodec/arm/dsputil_init_vfp.c
	libavcodec/arm/dsputil_vfp.S
	libavcodec/dsputil.c
	libavcodec/ppc/float_altivec.c
	libavcodec/x86/dsputil.asm
	libavutil/x86/float_dsp.asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-23 14:04:50 +01:00