Commit Graph

158 Commits

Author SHA1 Message Date
James Almer d1ee6fb729 Merge commit '6a1ea4ec932f4fc9fdc00ec51ee070b298ddb35f'
* commit '6a1ea4ec932f4fc9fdc00ec51ee070b298ddb35f':
  arm: warn/error on movrelx usage problematic with PIC on ELF

Merged-by: James Almer <jamrial@gmail.com>
2017-04-04 16:04:29 -03:00
Clément Bœsch a0860b0a38 Merge commit '6f9e34baea4f6f484392e4e67f606a0835d07b73'
* commit '6f9e34baea4f6f484392e4e67f606a0835d07b73':
  arm: Check for support for the .fpu directive

Merged-by: Clément Bœsch <cboesch@gopro.com>
2017-02-02 11:22:04 +01:00
Janne Grunau 6a1ea4ec93 arm: warn/error on movrelx usage problematic with PIC on ELF
The warning has false positives but our asm does not trigger it. For
new code false positives can only be avoided by changing the register
allocation.
2016-11-24 21:26:22 +01:00
Martin Storsjö 86c5a23ee5 arm: Clear the gp register alias at the end of functions
We reset .Lpic_gp to zero at the start of each function, which means
that the logic within movrelx for clearing gp when necessary will
be missed.

This fixes using movrelx in different functions with a different
helper register.

This is cherry-picked from libav commit
824e8c2840.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2016-11-15 15:10:03 -05:00
Martin Storsjö 824e8c2840 arm: Clear the gp register alias at the end of functions
We reset .Lpic_gp to zero at the start of each function, which means
that the logic within movrelx for clearing gp when necessary will
be missed.

This fixes using movrelx in different functions with a different
helper register.

Signed-off-by: Martin Storsjö <martin@martin.st>
2016-11-10 14:01:04 +02:00
Martin Storsjö 6f9e34baea arm: Check for support for the .fpu directive
When targeting COFF (windows), clang doesn't support this
directive (while binutils supports it for all targets).

Signed-off-by: Martin Storsjö <martin@martin.st>
2016-07-21 12:52:10 +03:00
Timothy Gu 44304ae322 all: Add missing header guards 2016-01-28 19:49:48 -08:00
Hendrik Leppkes a04a9434b5 Merge commit '73c8c0341cce9e1a6c4169721f5123f97fc4be2f'
* commit '73c8c0341cce9e1a6c4169721f5123f97fc4be2f':
  arm: Fix vfp dead code elimination with have_vfp_vm

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2016-01-19 08:51:12 +01:00
Martin Storsjö 73c8c0341c arm: Fix vfp dead code elimination with have_vfp_vm
This fixes builds with --disable-vfp.

Checking for the armv6 cpu flag is incorrect, since vfpv2 isn't
armv6 specific.

Signed-off-by: Martin Storsjö <martin@martin.st>
2016-01-08 23:52:59 +02:00
Hendrik Leppkes e754c8e8ca Merge commit 'e2710e790c09e49e86baa58c6063af0097cc8cb0'
* commit 'e2710e790c09e49e86baa58c6063af0097cc8cb0':
  arm: add a cpu flag for the VFPv2 vector mode

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2016-01-02 11:01:29 +01:00
Janne Grunau e2710e790c arm: add a cpu flag for the VFPv2 vector mode
The vector mode was deprecated in ARMv7-A/VFPv3 and various cpu
implementations do not support it in hardware. Vector mode code will
depending the OS either be emulated in software or result in an illegal
instruction on cpus which does not support it. This was not really
problem in practice since NEON implementations of the same functions are
preferred. It will however become a problem for checkasm which tests
every cpu flag separately.

Since this is a cpu feature newer cpu do not support anymore the
behaviour of this flag differs from the other flags. It can be only
activated by runtime cpu feature selection.
2015-12-14 16:42:35 +01:00
James Almer 36e1665d3d avutil/attributes: add AV_GCC_VERSION_AT_MOST
Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-09-18 12:41:29 -03:00
Michael Niedermayer d82d11397f avutil/arm/intmath: return int for uint8 / uint16 clip
The C functions return uint8/16_t but that is effectively int not unsigned int
Fixes fate-filter-tblend

We do not return uint8/16_t as that would require the compiler to truncate the
values, slowing it down.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-07-20 17:20:16 +02:00
Andreas Cadhalpun 5bf84a584e arm: only enable setend on ARMv6
Without this check it causes SIGILL crashes on ARMv5.

Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
2015-06-05 17:14:10 +02:00
Michael Niedermayer 9a1884a10e Merge commit 'dcae2e32f7d8a1ca5fb8c1e4aa81313be854dd73'
* commit 'dcae2e32f7d8a1ca5fb8c1e4aa81313be854dd73':
  arm: Suppress tags about used cpu arch and extensions

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2015-03-07 19:30:51 +01:00
Martin Storsjö dcae2e32f7 arm: Suppress tags about used cpu arch and extensions
When all the codepaths using manually set .arch/.fpu code is
behind runtime detection, the elf attributes should be suppressed.

This allows tools to know that the final built binary doesn't
strictly require these extensions.

Signed-off-by: Martin Storsjö <martin@martin.st>
2015-03-07 17:10:08 +02:00
Michael Niedermayer 1253091d6f Merge commit '76ce9bd8e26dcb3652240a1072840ff4011d7cdc'
* commit '76ce9bd8e26dcb3652240a1072840ff4011d7cdc':
  libavutil: Add ARM av_clip_intp2_arm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2015-02-21 11:15:32 +01:00
Peter Meerwald 76ce9bd8e2 libavutil: Add ARM av_clip_intp2_arm
add ARM code for implementing av_clip_intp2 using the ssat instruction

on Cortex-A8, av_clip_intp2_arm() is faster than av_clip_intp2_c() and
the generic av_clip(), about -19%

Signed-off-by: Peter Meerwald <pmeerw@pmeerw.net>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2015-02-21 00:54:40 +01:00
Michael Niedermayer 16e65419ed Merge commit 'f963f80399deb1a2b44c1bac3af7123e8a0c9e46'
* commit 'f963f80399deb1a2b44c1bac3af7123e8a0c9e46':
  arm: Use .data.rel.ro for const data with relocations

Conflicts:
	configure

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-12-09 11:58:13 +01:00
Martin Storsjö f963f80399 arm: Use .data.rel.ro for const data with relocations
Signed-off-by: Martin Storsjö <martin@martin.st>
2014-12-09 11:43:25 +02:00
jessejiang 29d208d5d4 avutil/arm/float_dsp_init_vfp: replace restrict by av_restrict
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-11-20 11:17:42 +01:00
Michael Niedermayer 1e519b9d40 avutil: turn arm setend into a cpuflag
this allows disabling and enabling it
it also prevents crashes if vfpv3 and neon are disabled which previously
would have enabled the flag

And last but not least one can enable setend on cpus like cortex-a8 where
its fast but disabled by default

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-13 14:50:15 +02:00
Michael Niedermayer 7cdb3b2b79 Merge commit '6869612f5c7d4d2f20f69a5658328a761deadb1c'
* commit '6869612f5c7d4d2f20f69a5658328a761deadb1c':
  arm: Macroize the test for 'setend' CPU instruction support

Conflicts:
	libavcodec/arm/h264dsp_init_arm.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-22 12:46:13 +02:00
Ben Avison 6869612f5c arm: Macroize the test for 'setend' CPU instruction support
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2014-07-21 15:08:01 -07:00
Ben Avison 5a272190a0 armv6: Accelerate butterflies_float
I benchmarked the result by measuring the number of gperftools samples that
hit anywhere in the AAC decoder (starting from aac_decode_frame()) or
specifically in butterflies_float_c() / ff_butterflies_float_vfp() for the
same sample AAC stream:

                   Before          After
                   Mean   StdDev   Mean   StdDev  Confidence  Change
Audio decode       1542.8 43.7     1470.5 41.5    100.0%      +4.9%
butterflies_float  130.0  11.9     70.2   12.1    100.0%      +85.2%

Signed-off-by: Martin Storsjö <martin@martin.st>
2014-07-18 01:34:38 +03:00
Ben Avison 5edad2c4a1 armv6: Accelerate vector_fmul_window
I benchmarked the result by measuring the number of gperftools samples that
hit anywhere in the AAC decoder (starting from aac_decode_frame()) or
specifically in vector_fmul_window_c() / ff_vector_fmul_window_vfp() for the
same sample AAC stream:

                    Before          After
                    Mean   StdDev   Mean   StdDev  Confidence  Change
Audio decode        1598.2 47.4     1529.2 25.4    100.0%      +4.5%
vector_fmul_window  244.0  22.1     188.9  22.3    100.0%      +29.2%

Signed-off-by: Martin Storsjö <martin@martin.st>
2014-07-18 01:34:31 +03:00
Ben Avison 57641410d1 armv6: Accelerate butterflies_float
I benchmarked the result by measuring the number of gperftools samples that
hit anywhere in the AAC decoder (starting from aac_decode_frame()) or
specifically in butterflies_float_c() / ff_butterflies_float_vfp() for the
same sample AAC stream:

                   Before          After
                   Mean   StdDev   Mean   StdDev  Confidence  Change
Audio decode       1542.8 43.7     1470.5 41.5    100.0%      +4.9%
butterflies_float  130.0  11.9     70.2   12.1    100.0%      +85.2%

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-16 21:38:02 +02:00
Ben Avison 649c666137 armv6: Accelerate vector_fmul_window
I benchmarked the result by measuring the number of gperftools samples that
hit anywhere in the AAC decoder (starting from aac_decode_frame()) or
specifically in vector_fmul_window_c() / ff_vector_fmul_window_vfp() for the
same sample AAC stream:

                    Before          After
                    Mean   StdDev   Mean   StdDev  Confidence  Change
Audio decode        1598.2 47.4     1529.2 25.4    100.0%      +4.5%
vector_fmul_window  244.0  22.1     188.9  22.3    100.0%      +29.2%

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-16 21:37:41 +02:00
Michael Niedermayer 01983e50c0 Merge commit '7b0c7c9163fe3dd0081696befde28617119d2590'
* commit '7b0c7c9163fe3dd0081696befde28617119d2590':
  arm: Detect 32 bit cpu features on ARMv8 when running on a 64 bit kernel

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-28 21:31:18 +02:00
Martin Storsjö 7b0c7c9163 arm: Detect 32 bit cpu features on ARMv8 when running on a 64 bit kernel
When running on a 64 bit kernel, /proc/cpuinfo lists different
optional features than on 32 bit kernels (because some of them
are mandatory in the 64 bit implemenations).

The kernel does list the old features properly if they are queried
via /proc/self/auxv though - however this file is not always readable
(e.g. on most android systems). The getauxval function could also
provide the same info as /proc/self/auxv even if this file isn't
readable, but this function is not always available (and thus would
need to be loaded with dlsym for compatibility with older android
versions).

The android cpufeatures library does this slightly differently,
by assuming that these are available if the "CPU architecture"
line is >= 8, see [1] for details.

It has been suggested to include the old, non-optional features in
/proc/cpuinfo as well, but that suggested patch never was merged.
See [2] for the discussion around this suggestion.

[1] https://android-review.googlesource.com/91380
[2] http://marc.info/?l=linux-arm-kernel&m=139087240101974

Signed-off-by: Martin Storsjö <martin@martin.st>
2014-06-28 22:16:59 +03:00
Michael Niedermayer a40c338a00 Merge commit 'd5a55981986ac5d1a31aef3a8d16eaff8534a412'
* commit 'd5a55981986ac5d1a31aef3a8d16eaff8534a412':
  build: check if AS supports the '.func' directive

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-04 12:45:35 +02:00
Janne Grunau d5a5598198 build: check if AS supports the '.func' directive
Not supported by Clang's integrated assembler. Since it just adds
debug information it can safely omitted.
2014-06-03 14:23:03 +02:00
Michael Niedermayer 1c788eaca9 Merge commit '831a1180785a786272cdcefb71566a770bfb879e'
* commit '831a1180785a786272cdcefb71566a770bfb879e':
  Update dsputil- and SIMD-related comments to match reality more closely

Conflicts:
	libavcodec/x86/hpeldsp.asm
	libavutil/arm/float_dsp_init_arm.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-13 23:59:56 +01:00
Diego Biurrun 831a118078 Update dsputil- and SIMD-related comments to match reality more closely 2014-03-13 05:50:29 -07:00
Michael Niedermayer a74bab7079 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  arm: hpeldsp: prevent overreads in armv6 asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-05 21:35:30 +01:00
Janne Grunau cbddee1cca arm: hpeldsp: prevent overreads in armv6 asm
Based on a patch by Russel King <rmk+libav@arm.linux.org.uk>

Bug-Id: 646
CC: libav-stable@libav.org
2014-03-05 14:30:57 +01:00
Michael Niedermayer 53d11f7b2d Merge commit '543156d7518f5e5d731123da066d86278f9fa492'
* commit '543156d7518f5e5d731123da066d86278f9fa492':
  arm: Mark the stack as non-executable

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-19 14:07:26 +01:00
Martin Storsjö 543156d751 arm: Mark the stack as non-executable
If linking in an object file without this attribute set, the
linker will assume that an executable stack might be needed.

Signed-off-by: Martin Storsjö <martin@martin.st>
2014-02-19 09:57:19 +02:00
Michael Niedermayer a7574a36af Merge commit 'e3fec3f095ab5ea08ee662942d98526aaf5e3635'
* commit 'e3fec3f095ab5ea08ee662942d98526aaf5e3635':
  arm: Add EXTERN_ASM to the .func and .type declarations for exported symbols

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-08 00:49:28 +01:00
Martin Storsjö e3fec3f095 arm: Add EXTERN_ASM to the .func and .type declarations for exported symbols
This makes the generated assembly more internally consistent,
avoiding declaring two labels for the same function (for cases
where EXTERN_ASM is empty) and not declaring a separate unprefixed
label in other cases.

This also makes sure the .func and .type delcarations have the same
prefix. They have previously not been used on the platforms
that have prefixed symbols on arm (iOS), but gas-preprocessor
has recently started using the .func declarations for adding
.thumb_func declarations for such functions.

Signed-off-by: Martin Storsjö <martin@martin.st>
2014-02-07 15:14:06 +02:00
Michael Niedermayer 9d5cc55f0f Merge remote-tracking branch 'qatar/master'
* qatar/master:
  arm: Add an option for making sure NEON registers aren't clobbered

Conflicts:
	configure

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-01-11 03:08:10 +01:00
Martin Storsjö 44a0a98f92 arm: Add an option for making sure NEON registers aren't clobbered
This is pretty much based on the same test for XMM registers.

Signed-off-by: Martin Storsjö <martin@martin.st>
2014-01-11 00:03:00 +02:00
Michael Niedermayer edba54630b Merge commit '5dae4872357613a0b51120b54a4c5221e0ec3f69'
* commit '5dae4872357613a0b51120b54a4c5221e0ec3f69':
  arm: Allow overriding the alignment set in the function macro

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-01-08 05:36:56 +01:00
Martin Storsjö 5dae487235 arm: Allow overriding the alignment set in the function macro
The function macro always sets .align 2 before declaring the
function label (since 5c5e1ea3) and always sets the section to
.text (since 278caa6a).

The .align 5 before certain functions, added in fc252eba, were added
before .text and .align were added to the function macro and thus
became useless/unused when the function macro got them.

This restores the original intention, to align the loop entry
points.

Signed-off-by: Martin Storsjö <martin@martin.st>
2014-01-07 19:29:56 +02:00
Thilo Borgmann d814a839ac Reinstate proper FFmpeg license for all files. 2013-08-30 15:47:38 +00:00
Michael Niedermayer 946f080b54 Merge commit '7ffda66fd5c81af4725bff7c2c4f207ba2aa0613'
* commit '7ffda66fd5c81af4725bff7c2c4f207ba2aa0613':
  arm: float_dsp: Propagate cpu_flags to vfp initialization function

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-08-29 16:05:04 +02:00
Michael Niedermayer 2a60666d1d Merge commit '8410d6e93c2e074881f1c7b7e4cdefd2e497d52e'
* commit '8410d6e93c2e074881f1c7b7e4cdefd2e497d52e':
  avutil: Refactor CPU extension availability macros

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-08-29 14:15:10 +02:00
Michael Niedermayer c83d794936 Merge commit 'b78b10c4b78b696927f2801cf2d9f193b4eff28b'
* commit 'b78b10c4b78b696927f2801cf2d9f193b4eff28b':
  avutil: Move internal CPU detection function declarations to private header

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-08-29 14:05:15 +02:00
Diego Biurrun 7ffda66fd5 arm: float_dsp: Propagate cpu_flags to vfp initialization function 2013-08-29 11:24:14 +02:00
Diego Biurrun 8410d6e93c avutil: Refactor CPU extension availability macros 2013-08-28 23:54:14 +02:00