75970 Commits

Author SHA1 Message Date
Nedeljko Babic
de262d018d avcodec/mips/aaccoder_mips: Sync with the generic code
This patch fixes the build of the MIPS-optimized AAC encoder, which was broken
because some changes in the generic code were not propagated to the optimized code.

Also, some functions in the optimized code are basically duplicates of functions
from the generic code. Since they do not improve the optimized code enough to
justify their existence, they are removed (which improves maintainability of
the optimized code).

Optimizations disabled in 97437bd are enabled again.

Signed-off-by: Nedeljko Babic <nedeljko.babic@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-10-13 17:22:56 +02:00
Ronald S. Bultje
e578638382 vp9: use registers for constant loading where possible. 2015-10-13 11:06:01 -04:00
Ronald S. Bultje
408bb8556f vp9: refactor itx coefficients and share between 8 and 10/12bpp. 2015-10-13 11:06:01 -04:00
Ronald S. Bultje
eb4b5ff738 vp9: add itxfm_add eob shortcuts to 10/12bpp functions.
These aren't quite as helpful as the ones in 8bpp, since over there,
we can use pmulhrsw, but here the coefficients have too many bits to
be able to take advantage of pmulhrsw. However, we can still skip
cols for which all coefs are 0, and instead just zero the input data
for the row itx. This helps a few % on overall decoding speed.
2015-10-13 11:06:01 -04:00
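
The shortcut described in this commit message (skip columns whose coefficients are all zero and just zero the input of the row pass) can be sketched in scalar C. This is only an illustration under assumptions: `itx_1d` is a placeholder 4-point butterfly rather than the VP9 transform, the block is stored column-major, and the zero check scans the column, whereas a decoder would normally derive the same information from the eob value.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { N = 4 };

/* Placeholder 1-D transform (a 4-point Hadamard-style butterfly), NOT the
 * VP9 DCT/ADST. Like any linear transform it maps all-zero input to zero. */
static void itx_1d(const int32_t in[N], int32_t out[N])
{
    int32_t a = in[0] + in[2], b = in[0] - in[2];
    int32_t c = in[1] + in[3], d = in[1] - in[3];
    out[0] = a + c;
    out[1] = b + d;
    out[2] = b - d;
    out[3] = a - c;
}

static int column_is_zero(const int32_t col[N])
{
    for (int i = 0; i < N; i++)
        if (col[i])
            return 0;
    return 1;
}

/* Column pass over a block stored as coef[column][row].
 * Returns the number of columns that were actually transformed. */
static int column_pass(int32_t coef[N][N], int32_t tmp[N][N], int use_shortcut)
{
    int transformed = 0;

    assert(coef && tmp);
    for (int c = 0; c < N; c++) {
        if (use_shortcut && column_is_zero(coef[c])) {
            /* Skip the transform and only zero the row-pass input. */
            memset(tmp[c], 0, sizeof(tmp[c]));
            continue;
        }
        itx_1d(coef[c], tmp[c]);
        transformed++;
    }
    return transformed;
}
```

Calling `column_pass` with and without the shortcut on a block whose last columns are empty yields the same intermediate data while transforming fewer columns.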
Ronald S. Bultje
488fadebbc vp9: add 10/12bpp idct_idct_32x32 sse2 SIMD version. 2015-10-13 11:06:00 -04:00
Ronald S. Bultje
3d0ca2fe89 vp9: 10/12bpp sse2 SIMD for iadst16. 2015-10-13 11:06:00 -04:00
Ronald S. Bultje
0e80265b0a vp9: refactor 10/12bpp dc-only code in 4x4/8x8 and add to 16x16. 2015-10-13 11:06:00 -04:00
Ronald S. Bultje
1338fb79d4 vp9: add 10/12bpp sse2 SIMD version for idct_idct_16x16. 2015-10-13 11:06:00 -04:00
Ronald S. Bultje
cb054d061a vp9: add 10/12bpp sse2 SIMD versions of iadst8x8. 2015-10-13 11:05:59 -04:00
Ronald S. Bultje
e0610787b2 vp9: add 10/12bpp sse2 SIMD for idct_idct_8x8. 2015-10-13 11:05:59 -04:00
Ronald S. Bultje
a35f6bdb38 vp9: add 12bpp sse2 versions of iadst4. 2015-10-13 11:05:59 -04:00
Ronald S. Bultje
235e76aeb8 vp9: initial attempt at an idct_idct_4x4 12bpp x86 simd (sse2) impl.
The trouble with this function is that intermediates overflow 31+sign
bits, so I've added some helpers (that will also be used in 10/12bpp
8x8, 16x16 and 32x32) to make that easier, basically emulating a half-
assed pmaddqd using 2xpmaddwd. It's currently sse2-only, if anyone sees
potential in adding ssse3, I'd love to hear it.
2015-10-13 11:05:58 -04:00
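
One way to keep every intermediate product within 32 bits, in the spirit of the helpers mentioned in this commit message, is to split the wide input into a low and a high part and multiply each part separately. The scalar sketch below is an assumption about the general idea, not a transcription of the SIMD code: the 14-bit split, the function names and the rounding constant are chosen for illustration, and signed `>>` is assumed to be an arithmetic shift.

```c
#include <assert.h>
#include <stdint.h>

/* Computes (x * c + 8192) >> 14 using only products that fit in 32 bits.
 * Assumptions: 0 <= c < 2^14, |x| < 2^29, two's complement integers and
 * arithmetic right shift of negative values. */
static int32_t mul14_round_split(int32_t x, int32_t c)
{
    assert(c >= 0 && c < (1 << 14));

    int32_t lo = x & 0x3fff;             /* low 14 bits, in 0..16383   */
    int32_t hi = (x - lo) / (1 << 14);   /* exact: x = hi * 2^14 + lo  */

    /* hi * c * 2^14 is a multiple of 2^14, so it passes through the
     * rounding shift unchanged and only lo * c needs to be rounded. */
    return hi * c + ((lo * c + (1 << 13)) >> 14);
}

/* Reference version using a 64-bit intermediate. */
static int32_t mul14_round_ref(int32_t x, int32_t c)
{
    return (int32_t)(((int64_t)x * c + (1 << 13)) >> 14);
}
```

Both partial products are the kind of 16-bit by 16-bit multiply that pmaddwd provides, which is why a split of this sort maps onto two pmaddwd operations.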
Ronald S. Bultje
f76423d097 vp9: add x86 simd (sse2/ssse3) for iadst4 10bpp functions. 2015-10-13 11:05:58 -04:00
Ronald S. Bultje
6b579cf547 vp9: add 10bpp simd (mmxext/ssse3) for idct_idct_4x4. 2015-10-13 11:05:58 -04:00
Ronald S. Bultje
1c3be32533 vp9: add 10/12bpp mmxext-optimized iwht_iwht_4x4 function. 2015-10-13 11:05:57 -04:00
Christophe Gisquet
b6594a9605 x86: dct-test: add more idcts
In particular for 10 and 12 bits.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-10-13 16:03:04 +02:00
Michael Niedermayer
a745d1a9e4 avcodec/dct-test: Print failure notice below the failed *dct
This makes it easier to see where a failure happens

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-10-13 16:03:03 +02:00
Christophe Gisquet
7ece8b50b1 x86: simple_idct: 12bits versions
On 12 frames of a 444p 12 bits DNxHR sequence, _put function:
C:         78902 decicycles in idct,  262071 runs,     73 skips
avx:       32478 decicycles in idct,  262045 runs,     99 skips

Difference between the 2:
stddev:    0.39 PSNR:104.47 MAXDIFF:    2

This is unavoidable and due to the scale factors used in the x86
version, which cannot match the C ones.

In addition, the trick of adding an initial bias to the input of a
pass can overflow, as the input coefficients are already 15bits,
which is the maximum this function can handle.

Overall, however, the omse on 12-bit samples goes from 0.16916 to
0.16883. Reducing rowshift by 1 improves it to 0.0908, but causes
overflows.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-10-13 15:34:32 +02:00
Christophe Gisquet
4369b9dc7b x86: simple_idct(_put): 10bits versions
Modeled on the prores version. Clips to [0;1023] and is bitexact.
Bitexactness requires adding offsets in different places than in
prores or C, which makes the function approximately 2% slower.

For 16 frames of a DNxHD 4:2:2 10bits test sequence:

C:    60861 decicycles in idct, 1048205 runs,    371 skips
sse2: 27567 decicycles in idct, 1048216 runs,    360 skips
avx:  26272 decicycles in idct, 1048171 runs,    405 skips

The add version is not implemented, so the corresponding dsp
function is set to NULL to make this obvious to any code that tries to execute it.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-10-13 13:32:21 +02:00
Christophe Gisquet
e652f69b35 x86: simple_idct10_template: fix overflow in pass
When the input of a pass has 15 or 16 bits of precision (in particular
the column pass), the addition of a bias to W4 may lead to overflows
in the input to pmaddwd.

This requires postponing the addition of the bias until after the first
butterfly. To do so, the fact that m15 is zeroed but otherwise unused is
exploited. In case the pass is safe, an address can be directly used,
and the number of xmm regs can be decreased. Otherwise, the 32-bit bias
is loaded into it.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-10-13 12:51:10 +02:00
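
The overflow described in this commit message can be reproduced with scalar arithmetic. The sketch below is illustrative only: `W4` and `COL_SHIFT` are made-up values rather than the constants of the template, and the out-of-range conversion to `int16_t` is assumed to wrap, as it does on common two's complement targets.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants, not taken from the commit. */
enum { W4 = 16384, COL_SHIFT = 20 };

/* Bias folded into the 16-bit input before the multiply. The biased value
 * must still fit in 16 bits, because that is what pmaddwd consumes. */
static int32_t dc_bias_early(int16_t x)
{
    /* The trick only works when the bias is a multiple of W4. */
    assert((1 << (COL_SHIFT - 1)) % W4 == 0);

    int16_t biased = (int16_t)(x + (1 << (COL_SHIFT - 1)) / W4); /* may wrap */
    return (W4 * (int32_t)biased) >> COL_SHIFT;
}

/* Bias added as a 32-bit value after the multiply: no 16-bit overflow. */
static int32_t dc_bias_late(int16_t x)
{
    return (W4 * (int32_t)x + (1 << (COL_SHIFT - 1))) >> COL_SHIFT;
}
```

For small inputs both variants agree. For an input close to the 16-bit limit, the early variant wraps before the multiply, while the late variant stays correct.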
Ganesh Ajjanagadde
3b336ec2fb avfilter/af_sidechaincompress: replace FFABS with fabs 2015-10-13 09:37:18 +02:00
Ganesh Ajjanagadde
ac6b7c47cc avfilter/af_astats: replace FFABS with fabs 2015-10-13 09:34:39 +02:00
Ganesh Ajjanagadde
9ab98b580e avfilter/af_agate: replace FFABS with fabs 2015-10-13 09:31:16 +02:00
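
The three commit subjects above do not state a motivation. As a neutral illustration of how a comparison-based absolute value macro and the C library's `fabs` can differ on floating point input, the sketch below defines a local macro in the usual `FFABS` form (an assumption about its definition) and compares the two on negative zero.

```c
#include <assert.h>
#include <math.h>

/* Local copy of a comparison-based absolute value macro (assumed form). */
#define FFABS(a) ((a) >= 0 ? (a) : (-(a)))

static double abs_macro(double x)
{
    double r = FFABS(x);
    assert(!(r < 0));   /* the result never compares below zero */
    return r;
}

static double abs_libm(double x)
{
    return fabs(x);     /* clears the sign bit, including for -0.0 */
}
```

Both give the same magnitude for ordinary values. For `-0.0` the macro returns its argument unchanged, because `-0.0 >= 0` is true, so the sign bit survives. `fabs` returns `+0.0`.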
Christophe Gisquet
f1181e4660 fate: add 10bits YUV4:2:2 dnxhd test
It was useful to (accidentally?) spot an overflow in the column pass
of the x86 simple_idct10 implementation.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-10-13 04:04:02 +02:00
Christophe Gisquet
2fd14dd8eb avcodec/simple_idct10: improve precision
omse goes from 0.03060703 (which fails for dct-test) to 0.01663750.
This also improves the error, relative to FAANI, of decoding the sample
generated by fate-vsynth3-dnxhd1080i-10bit using simple_idct10, which
goes (when resampled to yuv422p) from:
stddev:    0.06 PSNR: 72.28 MAXDIFF:    1
to identical.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-10-13 02:10:51 +02:00
Christophe Gisquet
e9a68b0316 x86: prores: templatize 10 bits simple_idct
This should be reused for a generic simple_idct10 function.
Requires a bit of trickery to declare common constants in C.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-10-13 01:10:34 +02:00
Rostislav Pehlivanov
93e6b23c9f aacenc: shorten name of ff_aac_adjust_common_prediction
To keep it similar to the other functions which are all named *_pred.
2015-10-12 23:33:07 +01:00
Rostislav Pehlivanov
65f5b96dd8 aacenc: increase size of s->planar_samples[] from 6 to 8
Left out of last commit which added support for eight channel audio.
2015-10-12 23:25:45 +01:00
Christophe Gisquet
9f3bfe30dd mpegvideo: dnxhdenc: permute 10bits content
Dequant or encoding were trying to reverse a scan that hadn't been
applied...

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-10-13 00:01:39 +02:00
Michael Niedermayer
97437bd17a avcodec/mips/aaccoder_mips: Disable ff_aac_coder_init_mips() to prevent build failure
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-10-13 00:01:39 +02:00
Ricardo Constantino
6eaf97c289 avformat/webvttdec: Don't stop parsing on comments
Signed-off-by: Ricardo Constantino <wiiaboo@gmail.com>
2015-10-12 22:16:12 +02:00
Ricardo Constantino
a96dbdc14f fate/subtitles: Add a new test for WebVTT
Includes escapes that should now be supported and a few features not yet
fully supported, like comments, regions, classes, ruby, and lang.

All were tested with https://quuz.org/webvtt/ for validation, except
regions because the validator doesn't support them yet, and I couldn't
find any other way to validate WebVTT.

Signed-off-by: Ricardo Constantino <wiiaboo@gmail.com>
2015-10-12 22:14:44 +02:00
Ricardo Constantino
53886d6955 avcodec/webvttdec: Deal with WebVTT escapes
Bare ampersand characters are still accepted, even though out-of-spec.
Also fixes adjacent tags not being parsed.

Fixes trac #4915

Signed-off-by: Ricardo Constantino <wiiaboo@gmail.com>
2015-10-12 22:04:05 +02:00
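
A minimal sketch of the escape handling described in this commit message, with bare ampersands passed through unchanged. The function name and the set of escapes are assumptions made for illustration. This is not the decoder's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Decodes a small, assumed subset of WebVTT escapes from src into dst.
 * Unknown sequences and bare '&' characters are copied verbatim.
 * The output is always NUL-terminated and truncated if dst is too small. */
static void webvtt_unescape(char *dst, size_t dst_size, const char *src)
{
    static const struct { const char *from, *to; } map[] = {
        { "&amp;",  "&" },
        { "&lt;",   "<" },
        { "&gt;",   ">" },
        { "&nbsp;", "\xC2\xA0" },       /* U+00A0 in UTF-8 */
        { "&lrm;",  "\xE2\x80\x8E" },   /* U+200E in UTF-8 */
        { "&rlm;",  "\xE2\x80\x8F" },   /* U+200F in UTF-8 */
    };
    const size_t n = sizeof(map) / sizeof(map[0]);
    size_t o = 0;

    assert(dst && src);
    if (!dst_size)
        return;

    while (*src && o + 1 < dst_size) {
        size_t i;

        for (i = 0; i < n; i++)
            if (!strncmp(src, map[i].from, strlen(map[i].from)))
                break;

        if (i < n) {
            size_t tlen = strlen(map[i].to);
            if (o + tlen >= dst_size)
                break;                  /* no room left for the replacement */
            memcpy(dst + o, map[i].to, tlen);
            o   += tlen;
            src += strlen(map[i].from);
        } else {
            dst[o++] = *src++;          /* ordinary character or bare '&' */
        }
    }
    dst[o] = '\0';
}
```

With this sketch, the input `a &lt;b&gt; &amp; c & d` decodes to `a <b> & c & d`, and the final bare ampersand is kept as-is.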
Lou Logan
329bd25475 doc/filters: s/nb_inputs/inputs for stack filters
Signed-off-by: Lou Logan <lou@lrcd.com>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2015-10-12 10:56:13 -08:00
Derek Buitenhuis
1156b634c1 avcodec: Don't lock on init for codecs without an init function
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2015-10-12 15:25:51 -03:00
Rostislav Pehlivanov
ccd3b3df39 fate: increase fuzz on fate-aac-tns-encode test
Fails on SunOS and old GCC (<=4.6 is ancient) versions.
2015-10-12 17:15:30 +01:00
Rostislav Pehlivanov
e2749ef60a aacenc_utils: fit find_form_factor() below 80 chars per line 2015-10-12 17:14:50 +01:00
Rostislav Pehlivanov
0f4334df45 aacenc: add support for changing options based on a profile
This commit adds the ability for a profile to set the default
options, as well as for the user to override such options
by simply stating them in the command line while still keeping
the same profile, as long as those options are still permitted by
the profile.

Example: setting the profile to aac_low (the default) will turn
PNS and IS on. They can be disabled by -aac_pns 0 and -aac_is 0,
respectively. Turning on -aac_pred 1 will cause the profile to be
elevated to aac_main, as long as no options forbidding aac_main
have been entered (like AAC-LTP, which will be pushed soon).

A useful feature is that by setting the profile to mpeg2_aac_low,
all MPEG4 features will be disabled and if the user tries to enable
them then the program will exit with an error. This profile is
signalled with the same bitstream as aac_low (MPEG4) but some devices
and decoders will fail if any MPEG4 features have been enabled.
2015-10-12 16:57:56 +01:00
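
The interaction between profiles and options described in this commit message can be sketched as a small resolver. Everything here is hypothetical: the type names, the -1/0/1 encoding of "unset/off/on", the choice of PNS and prediction as the tools rejected under mpeg2_aac_low, and the defaults for anything the message does not spell out.

```c
#include <assert.h>

typedef enum {
    PROFILE_AAC_LOW,
    PROFILE_AAC_MAIN,
    PROFILE_MPEG2_AAC_LOW
} Profile;

/* Each tool flag: -1 = not set by the user, 0 = off, 1 = on. */
typedef struct ToolOpts {
    int pns, is, pred;
} ToolOpts;

/* Fills in defaults for unset tools and adjusts the profile if needed.
 * Returns 0 on success, -1 if the user requested a forbidden tool. */
static int apply_profile(Profile *profile, ToolOpts *o)
{
    assert(profile && o);

    if (*profile == PROFILE_MPEG2_AAC_LOW) {
        /* Assumption: PNS and prediction are the tools rejected here. */
        if (o->pns == 1 || o->pred == 1)
            return -1;
        o->pns  = 0;
        o->pred = 0;
        if (o->is < 0)
            o->is = 1;  /* assumption: same IS default as aac_low */
        return 0;
    }

    /* aac_low defaults: PNS and IS on, prediction off unless requested. */
    if (o->pns < 0)
        o->pns = 1;
    if (o->is < 0)
        o->is = 1;
    if (o->pred < 0)
        o->pred = (*profile == PROFILE_AAC_MAIN);

    /* Enabling prediction elevates the profile to aac_main. */
    if (o->pred == 1)
        *profile = PROFILE_AAC_MAIN;
    return 0;
}
```

In this sketch, the equivalent of `-aac_pns 0 -aac_pred 1` on top of aac_low ends up as aac_main with PNS off, while requesting PNS under mpeg2_aac_low is reported as an error.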
Alex Agranovsky
cf28490e56 avfilter/drawtext: allow to format pts with strftime
Signed-off-by: Alex Agranovsky <alex@sighthound.com>
2015-10-12 16:56:58 +02:00
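
As a rough illustration of formatting a timestamp with strftime, the sketch below converts a pts expressed in seconds to broken-down UTC time and formats it. The helper name and its parameters are hypothetical and unrelated to the filter's option syntax. `gmtime` is used for brevity even though it is not thread-safe.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Formats pts_seconds (seconds since the epoch, fraction dropped) with a
 * strftime format string. Returns the number of characters written, or 0
 * on failure, mirroring strftime itself. */
static size_t format_pts(char *buf, size_t size, double pts_seconds,
                         const char *fmt)
{
    assert(buf && fmt);

    time_t t = (time_t)pts_seconds;
    const struct tm *tm = gmtime(&t);   /* UTC, not thread-safe */

    if (!tm)
        return 0;
    return strftime(buf, size, fmt, tm);
}
```

For example, a pts of 3661.5 seconds formatted with `%H:%M:%S` gives `01:01:01`.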
Bela Bodecs
1f3a29e999 lavf/tee: allow multiple stream specifiers in select.
This makes it possible to put multiple stream specifiers into the select
option, separated by commas.
e.g. select=\'a:0,v\'

Signed-off-by: Bela Bodecs <bodecsb@vivanet.hu>
Signed-off-by: Nicolas George <george@nsup.org>
2015-10-12 16:56:58 +02:00
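
Handling a comma-separated list of specifiers amounts to testing each element in turn and accepting the stream if any of them matches. The sketch below uses a deliberately simplified, hypothetical specifier syntax (a type letter, optionally followed by `:` and an index). Real stream specifiers are richer and are matched by the library, not by code like this.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Matches one simplified specifier of the form "t" or "t:n" (hypothetical
 * syntax) against a stream of the given type and per-type index. */
static int match_one(const char *spec, size_t len, char type, int index)
{
    if (len == 0 || spec[0] != type)
        return 0;
    if (len == 1)
        return 1;                       /* type only: matches every index */
    if (len < 3 || spec[1] != ':')
        return 0;
    return atoi(spec + 2) == index;     /* atoi stops at ',' or the end */
}

/* Returns 1 if any comma-separated element of select matches the stream. */
static int select_matches(const char *select, char type, int index)
{
    assert(select);

    for (const char *p = select; *p; ) {
        size_t len = strcspn(p, ",");   /* length of the current element */

        if (match_one(p, len, type, index))
            return 1;
        p += len;
        if (*p == ',')
            p++;
    }
    return 0;
}
```

With `a:0,v`, this selects the first audio stream and every video stream, and nothing else.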
Rostislav Pehlivanov
b3deaece87 aacenc: add support for encoding 7.1 channel audio
This commit implements support for 7.1 channel audio. There are no
more predefined bitstream channel mappings, so going beyond 8 channels
(or encoding exactly 7 channels) will require programmable channel elements,
work on which is already underway.
2015-10-12 15:53:17 +01:00
Rostislav Pehlivanov
e679a1e65f aacenc_quantization: fix header description
Two guesses as to which file was used as boilerplate.
2015-10-12 15:41:50 +01:00
Claudio Freire
b629c67ddf AAC encoder: memoize quantize_band_cost
The bulk of calls to quantize_band_cost are replaced
by a call to a version that memoizes, greatly improving
performance, since during coefficient search there is
a great deal of repeat work.

Memoization cannot always be applied, so do this in a
different function, and leave the original as-is.
2015-10-12 03:56:22 -03:00
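
The memoization described in this commit message can be sketched as a table keyed by band and scalefactor index, with a generation counter so that the cache can be invalidated without clearing it. The names, the table dimensions and the placeholder cost function below are all hypothetical. Only the caching pattern is the point.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { MAX_BANDS = 128, MAX_SF = 256 };  /* illustrative table dimensions */

typedef struct CostEntry {
    uint32_t generation;    /* valid only if equal to the cache generation */
    float    cost;
} CostEntry;

typedef struct CostCache {
    uint32_t  generation;
    CostEntry entry[MAX_BANDS][MAX_SF];
} CostCache;

static int expensive_calls;  /* counts calls to the uncached function */

/* Hypothetical stand-in for an expensive per-band cost computation. */
static float band_cost(int band, int sf_idx)
{
    expensive_calls++;
    return (float)(band * 3 + sf_idx);
}

/* Invalidates all entries. Must be called once on a zeroed cache before
 * first use, and again whenever the underlying data changes. */
static void cost_cache_reset(CostCache *c)
{
    if (++c->generation == 0) {          /* counter wrapped: really clear */
        memset(c->entry, 0, sizeof(c->entry));
        c->generation = 1;
    }
}

/* Memoizing wrapper: computes each (band, sf_idx) once per generation. */
static float band_cost_cached(CostCache *c, int band, int sf_idx)
{
    assert(band >= 0 && band < MAX_BANDS && sf_idx >= 0 && sf_idx < MAX_SF);

    CostEntry *e = &c->entry[band][sf_idx];
    if (e->generation != c->generation) {
        e->cost       = band_cost(band, sf_idx);
        e->generation = c->generation;
    }
    return e->cost;
}
```

Keeping the uncached function separate, as the commit message describes, lets callers that cannot use memoization keep calling it directly.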
Michael Niedermayer
ce0834bdd6 avformat/flvdec: set broken_sizes for "metadatacreator : MEGA"
The 2nd size value is wrong for the sample file

Fixes: Ticket4903

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-10-12 05:36:39 +02:00
Claudio Freire
07b3b779a9 AAC encoder: fix assertion error re SF differences
Intermediate results can indeed violate SF delta. Instead of asserting
there, just make the code safe, and assert on the final result.

Also re-clamp SFs more often in short windows (which tend to violate
the restriction when encoding the switch from one window to the other)
2015-10-11 23:00:46 -03:00
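
Re-clamping scalefactors so that consecutive values never differ by more than an allowed delta can be sketched as a single forward pass. The helper names are hypothetical, and the limit is passed in as a parameter, not taken from the encoder.

```c
#include <assert.h>

static int clip_int(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Forces |sf[i] - sf[i-1]| <= max_diff for every i by clamping each value
 * against its (already clamped) predecessor. */
static void clamp_sf_deltas(int *sf, int n, int max_diff)
{
    assert(sf && max_diff >= 0);

    for (int i = 1; i < n; i++)
        sf[i] = clip_int(sf[i], sf[i - 1] - max_diff, sf[i - 1] + max_diff);
}
```

With a limit of 60, the sequence 100, 180, 90, 95 becomes 100, 160, 100, 95. Making the data safe this way and asserting only on the final result matches the approach the commit message describes.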
Rostislav Pehlivanov
d25c033ddd aaccoder_twoloop.h: simplify and comment ff_pns_bits() 2015-10-12 01:42:43 +01:00
Rostislav Pehlivanov
5f760da6b6 aacenc_utils: add 'inline' flag to find_form_factor, silence warning
Seems it was forgotten.
2015-10-12 01:12:43 +01:00
James Almer
224a529b44 x86/vf_w3fdif: use aligned loads in w3fdif_simple_high
Found-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-10-11 20:07:12 -03:00
James Almer
e8903fbf8e x86/vf_w3fdif: simplify w3fdif_simple_high
Signed-off-by: James Almer <jamrial@gmail.com>
2015-10-11 20:04:54 -03:00
Andreas Cadhalpun
ec0275843d avcodec: remove leftover iff_byterun1 decoder
It was merged with the iff_ilbm decoder in commit
929a24efff.

Define AV_CODEC_ID_IFF_BYTERUN1 as AV_CODEC_ID_IFF_ILBM for API
compatibility.

Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
2015-10-12 00:21:13 +02:00