Commit Graph

83259 Commits

Author SHA1 Message Date
Mark Thompson
32b3812b60 vaapi_vc1: Convert to use the new VAAPI hwaccel code
(cherry picked from commit 520fb77285)
2017-01-17 23:06:46 +00:00
Mark Thompson
71acbea112 vaapi_mpeg2: Convert to use the new VAAPI hwaccel code
(cherry picked from commit 102e13c353)
2017-01-17 23:06:45 +00:00
Mark Thompson
c8b26d5954 vaapi_h264: Convert to use the new VAAPI hwaccel code
(cherry picked from commit 2fe93244ab)
2017-01-17 23:06:45 +00:00
Mark Thompson
79307ae563 lavc: Rewrite VAAPI decode infrastructure
Moves much of the setup logic for VAAPI decoding into lavc; the user
now need only provide the hw_frames_ctx.

(cherry picked from commit 123ccd07c5)
(cherry picked from commit 5e879b54a3)
(cherry picked from commit 0aec37e625)
(cherry picked from commit cfa4eb4fba)
2017-01-17 23:06:45 +00:00
Mark Thompson
d07d01bcce vaapi_vc1: Remove redundant version check
The lowest supported VAAPI version is 0.34 (checked at configure
time), so this test is no longer needed.

(cherry picked from commit 5a667322f5)
2017-01-17 23:06:45 +00:00
Mark Thompson
845c2c140b vaapi_vc1: Constify pointers
(cherry picked from commit 01d6f84f49)
2017-01-17 23:06:45 +00:00
Mark Thompson
6bc2808c41 vaapi_mpeg2: Constify pointers
(cherry picked from commit ee9061293e)
2017-01-17 23:06:45 +00:00
Mark Thompson
d0897da924 vaapi_h264: Constify pointers
(cherry picked from commit 03adfe9130)
2017-01-17 23:06:45 +00:00
Michael Niedermayer
b05d8e7184 libavformat/mpegtsenc: support hevc with missing in-stream headers, as is done for h.264
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-17 20:36:34 +01:00
Kacper Michajłow
2064a3b8df configure: Don't disable SSA Optimizer on MSVC v19.00.24218+.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-17 17:55:34 +01:00
Matthieu Bouron
bdbbb8f11e Merge commit 'f450cc7bc595155bacdb9f5d2414a076ccf81b4a'
* commit 'f450cc7bc595155bacdb9f5d2414a076ccf81b4a':
  h264: eliminate decode_postinit()

Also includes fixes from 1f7b4f9abc and e344e65109.

The original patch replaces H264Context.next_output_pic (H264Picture *) with
H264Context.output_frame (AVFrame *). This change is discarded because it
is incompatible with the frame reconstruction and motion vector
display code, which needs the extra information from the H264Picture.

Merged-by: Clément Bœsch <u@pkh.me>
Merged-by: Matthieu Bouron <matthieu.bouron@gmail.com>
2017-01-17 14:38:48 +01:00
Matthieu Bouron
adf5dc90a9 avutil/tests: add aes_ctr, audio_fifo and imgutils to .gitignore
2017-01-17 10:08:05 +01:00
Carl Eugen Hoyos
e664730271 configure: Fix standalone compilation of aiff and caf muxers.
2017-01-16 12:03:21 +01:00
Clément Bœsch
9561de4183 lavc/h264dec: reconstruct and debug flush frames as well
2017-01-16 10:43:41 +01:00
Clément Bœsch
bd520e8569 lavc/h264_slice: drop redundant current_slice reset
It is done unconditionally in ff_h264_field_end()
2017-01-16 10:43:41 +01:00
Clément Bœsch
a91c265f39 lavc/pthread_frame: protect read state access in setup finish function
2017-01-16 10:43:41 +01:00
Paul B Mahol
591be9e384 avformat/aadec: use avio_get_str()
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-01-16 10:24:02 +01:00
Paul B Mahol
e0665d385e avformat/aadec: stop ignoring file metadata
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-01-16 10:24:01 +01:00
Paul B Mahol
40cf943714 avcodec: add SIPR parser
Fixes #2056.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-01-16 10:24:01 +01:00
Steve Lhomme
8fb4865901 dxva2: allow an empty array of ID3D11VideoDecoderOutputView
We can pick the correct slice index directly from the ID3D11VideoDecoderOutputView
cast from data[3].

Also added myself as maintainer for DXVA2 and D3D11VA.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-16 02:54:04 +01:00
Steve Lhomme
153b36fc62 dxva2: get the slice number directly from the surface in D3D11VA
There is no need to loop through the known surfaces; we'll use the requested
surface anyway.

The loop is only done for DXVA2.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-16 02:54:04 +01:00
Steve Lhomme
77742c75c5 dxva2: use a single macro to test if the DXVA context is valid
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-16 02:54:04 +01:00
Andreas Cadhalpun
367cac7827 libopenmpt: add missing avio_read return value check
This fixes heap-buffer-overflows in libopenmpt caused by interpreting
the negative size value as unsigned size_t.

Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
Reviewed-by: Jörn Heusipp <osmanx@problemloesungsmaschine.de>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-16 02:54:04 +01:00
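
The failure mode described in 367cac7827 can be illustrated with a minimal sketch. It does not reproduce the libopenmpt demuxer code: `fake_read`, `load_checked`, `as_unsigned_length` and the error value -5 are hypothetical names chosen for illustration. The only assumption carried over from the commit message is that the read call returns a byte count on success and a negative value on error, and that the negative value must not reach code that treats it as a `size_t`.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a read call: returns the number of bytes
 * read, or a negative error code when `fail` is nonzero. */
static int fake_read(uint8_t *dst, int size, int fail)
{
    if (fail)
        return -5;                  /* hypothetical error code */
    memset(dst, 0xAB, (size_t)size);
    return size;
}

/* Checked variant: a negative return is propagated as an error and is
 * never used as a length. */
static int load_checked(uint8_t *buf, int buf_size, int fail)
{
    int ret = fake_read(buf, buf_size, fail);
    if (ret < 0)
        return ret;
    return ret;                     /* safe to use as a byte count */
}

/* Shows what an unchecked negative return becomes once it is
 * converted to an unsigned size. */
static size_t as_unsigned_length(int ret)
{
    return (size_t)ret;
}
```

Without the `ret < 0` check, `as_unsigned_length(-5)` is the kind of value a consumer would see: a very large unsigned size rather than an error.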
Daniil Cherednik
c2500d62c6 dcaenc: Implementation of Huffman codes for DCA encoder
Reviewed-by: Rostislav Pehlivanov <atomnuker@gmail.com>
2017-01-15 18:17:12 +00:00
Daniil Cherednik
a6191d098a dcaenc: Reverse data layout to prevent data copies during Huffman encoding introduction
Reviewed-by: Rostislav Pehlivanov <atomnuker@gmail.com>
2017-01-15 18:16:31 +00:00
Rostislav Pehlivanov
e7dec52d4d matroskaenc: remove unofficial compliance on color information
When support for this was added the details weren't yet finalized.
This is no longer the case.
Fixes writing of mkv/webm files with HDR.

Reported-by: Kagami Hiiragi <kagami@genshiken.org>
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Reviewed-by: James Almer <jamrial@gmail.com>
2017-01-15 17:49:21 +00:00
Martin Storsjö
0ba0187535 aarch64: vp9mc: Fix a comment to refer to a register with the right name
This is cherrypicked from libav commit
85ad5ea72c.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:43 +01:00
Martin Storsjö
02cfb9a16e aarch64: vp9dsp: Fix vertical alignment in the init file
This is cherrypicked from libav commit
65074791e8.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:40 +01:00
Martin Storsjö
656d910981 arm: vp9mc: Fix vertical alignment of operands
This is cherrypicked from libav commit
c536e5e869.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:37 +01:00
Martin Storsjö
8b11a89c06 aarch64: vp9itxfm: Skip empty slices in the first pass of idct_idct 16x16 and 32x32
This work is sponsored by, and copyright, Google.

Previously all subpartitions except the eob=1 (DC) case ran with
the same runtime:

vp9_inv_dct_dct_16x16_sub16_add_neon:   1373.2
vp9_inv_dct_dct_32x32_sub32_add_neon:   8089.0

By skipping individual 8x16 or 8x32 pixel slices in the first pass,
we reduce the runtime of these functions like this:

vp9_inv_dct_dct_16x16_sub1_add_neon:     235.3
vp9_inv_dct_dct_16x16_sub2_add_neon:    1036.7
vp9_inv_dct_dct_16x16_sub4_add_neon:    1036.7
vp9_inv_dct_dct_16x16_sub8_add_neon:    1036.7
vp9_inv_dct_dct_16x16_sub12_add_neon:   1372.1
vp9_inv_dct_dct_16x16_sub16_add_neon:   1372.1
vp9_inv_dct_dct_32x32_sub1_add_neon:     555.1
vp9_inv_dct_dct_32x32_sub2_add_neon:    5190.2
vp9_inv_dct_dct_32x32_sub4_add_neon:    5180.0
vp9_inv_dct_dct_32x32_sub8_add_neon:    5183.1
vp9_inv_dct_dct_32x32_sub12_add_neon:   6161.5
vp9_inv_dct_dct_32x32_sub16_add_neon:   6155.5
vp9_inv_dct_dct_32x32_sub20_add_neon:   7136.3
vp9_inv_dct_dct_32x32_sub24_add_neon:   7128.4
vp9_inv_dct_dct_32x32_sub28_add_neon:   8098.9
vp9_inv_dct_dct_32x32_sub32_add_neon:   8098.8

I.e. in general a very minor overhead for the full subpartition case due
to the additional cmps, but a significant speedup for the cases when we
only need to process a small part of the actual input data.

This is cherrypicked from libav commits
cad42fadcd and
a0c443a398.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:32 +01:00
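
The idea behind 8b11a89c06 can be sketched in plain C, under loud assumptions. `transform_col` is a placeholder linear transform, not the VP9 IDCT; the block layout (row-major, slices of 8 columns) and the function names are hypothetical; and the caller passes the number of non-empty slices directly, whereas how the assembly decides which slices to skip is not shown here. The sketch only demonstrates why skipping is safe: a linear transform of an all-zero slice is all zero, so the skipped slices can simply be zero-filled.

```c
#include <stdint.h>
#include <string.h>

#define N     16   /* block is N x N coefficients, row-major (assumption) */
#define SLICE  8   /* columns per first-pass slice (assumption) */

/* Placeholder linear 1-D transform on one column. NOT the real IDCT. */
static void transform_col(const int32_t *in, int32_t *out, int stride)
{
    for (int i = 0; i < N; i++)
        out[i * stride] = in[i * stride] + in[((i + 1) % N) * stride];
}

/* First pass over every column, whether or not it holds any data. */
static void first_pass_full(const int32_t *in, int32_t *tmp)
{
    for (int c = 0; c < N; c++)
        transform_col(in + c, tmp + c, N);
}

/* First pass that only transforms the first `slices` slices; the
 * remaining columns are known to be zero and are zero-filled. */
static void first_pass_skip(const int32_t *in, int32_t *tmp, int slices)
{
    memset(tmp, 0, N * N * sizeof(*tmp));
    for (int c = 0; c < slices * SLICE; c++)
        transform_col(in + c, tmp + c, N);
}
```

When all nonzero coefficients lie in the first slice, `first_pass_skip(in, tmp, 1)` produces the same intermediate buffer as `first_pass_full` while doing half the column transforms.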
Martin Storsjö
388f6e6715 arm: vp9itxfm: Skip empty slices in the first pass of idct_idct 16x16 and 32x32
This work is sponsored by, and copyright, Google.

Previously all subpartitions except the eob=1 (DC) case ran with
the same runtime:

                                     Cortex A7       A8       A9      A53
vp9_inv_dct_dct_16x16_sub16_add_neon:   3188.1   2435.4   2499.0   1969.0
vp9_inv_dct_dct_32x32_sub32_add_neon:  18531.7  16582.3  14207.6  12000.3

By skipping individual 4x16 or 4x32 pixel slices in the first pass,
we reduce the runtime of these functions like this:

vp9_inv_dct_dct_16x16_sub1_add_neon:     274.6    189.5    211.7    235.8
vp9_inv_dct_dct_16x16_sub2_add_neon:    2064.0   1534.8   1719.4   1248.7
vp9_inv_dct_dct_16x16_sub4_add_neon:    2135.0   1477.2   1736.3   1249.5
vp9_inv_dct_dct_16x16_sub8_add_neon:    2446.7   1828.7   1993.6   1494.7
vp9_inv_dct_dct_16x16_sub12_add_neon:   2832.4   2118.3   2266.5   1735.1
vp9_inv_dct_dct_16x16_sub16_add_neon:   3211.7   2475.3   2523.5   1983.1
vp9_inv_dct_dct_32x32_sub1_add_neon:     756.2    456.7    862.0    553.9
vp9_inv_dct_dct_32x32_sub2_add_neon:   10682.2   8190.4   8539.2   6762.5
vp9_inv_dct_dct_32x32_sub4_add_neon:   10813.5   8014.9   8518.3   6762.8
vp9_inv_dct_dct_32x32_sub8_add_neon:   11859.6   9313.0   9347.4   7514.5
vp9_inv_dct_dct_32x32_sub12_add_neon:  12946.6  10752.4  10192.2   8280.2
vp9_inv_dct_dct_32x32_sub16_add_neon:  14074.6  11946.5  11001.4   9008.6
vp9_inv_dct_dct_32x32_sub20_add_neon:  15269.9  13662.7  11816.1   9762.6
vp9_inv_dct_dct_32x32_sub24_add_neon:  16327.9  14940.1  12626.7  10516.0
vp9_inv_dct_dct_32x32_sub28_add_neon:  17462.7  15776.1  13446.2  11264.7
vp9_inv_dct_dct_32x32_sub32_add_neon:  18575.5  17157.0  14249.3  12015.1

I.e. in general a very minor overhead for the full subpartition case due
to the additional loads and cmps, but a significant speedup for the cases
when we only need to process a small part of the actual input data.

In common VP9 content in a few inspected clips, 70-90% of the non-dc-only
16x16 and 32x32 IDCTs only have nonzero coefficients in the upper left
8x8 or 16x16 subpartitions respectively.

This is cherrypicked from libav commit
9c8bc74c2b.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:30 +01:00
Martin Storsjö
ecd343aa1f arm: vp9itxfm: Only reload the idct coeffs for the iadst_idct combination
This avoids reloading them if they haven't been clobbered, if the
first pass also was idct.

This is similar to what was done in the aarch64 version.

This is cherrypicked from libav commit
3c87039a40.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:27 +01:00
Martin Storsjö
37cb224e3e aarch64: vp9itxfm: Don't repeatedly set x9 when nothing overwrites it
This is cherrypicked from libav commit
2f99117f6f.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:25 +01:00
Martin Storsjö
f69dd26df5 arm: vp9itxfm: Rename a macro parameter to fit better
Since the same parameter is used for both input and output,
the name inout is more fitting.

This matches the naming used below in the dmbutterfly macro.

This is cherrypicked from libav commit
79566ec8c7.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:21 +01:00
Martin Storsjö
4a5874ea8d arm/aarch64: vp9itxfm: Fix indentation of macro arguments
This is cherrypicked from libav commit
721bc37522.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:19 +01:00
Martin Storsjö
a95e7de41d aarch64: vp9itxfm: Use w3 instead of x3 for the int eob parameter
The clobbering tests in checkasm are only invoked when testing
correctness, so this bug didn't show up when benchmarking the
dc-only version.

This is cherrypicked from libav commit
4d960a1185.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:16 +01:00
Janne Grunau
a71cd8439f arm: vp9itxfm: Simplify the stack alignment code
This is one instruction fewer for thumb, and leaves only
1/2 arm/thumb-specific instructions.

This is cherrypicked from libav commit
e5b0fc170f.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:12 +01:00
Janne Grunau
cb220eeef9 aarch64: vp9: loop filter: replace 'orr; cbn?z' with 'adds; b.{eq,ne}'
The latter is 1 cycle faster on a cortex-a53, and since the operands are
bytewise (or larger) bitmasks (impossible to overflow to zero), both are
equivalent.

This is cherrypicked from libav commit
e7ae8f7a71.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:10 +01:00
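
The equivalence claimed in cb220eeef9 can be checked with a small C sketch. It assumes 64-bit operands in which every byte is either 0x00 or 0xFF; the helper names are hypothetical, and the C comparisons only model the zero test, not the actual instructions or their flags. For such masks, the sum wraps to zero only when both operands are zero, so testing `a + b == 0` gives the same answer as testing `(a | b) == 0`.

```c
#include <stdint.h>

/* Expand an 8-bit selector into a 64-bit bytewise mask:
 * bit i set means byte i is 0xFF, otherwise 0x00. */
static uint64_t byte_mask(unsigned sel)
{
    uint64_t m = 0;
    for (int i = 0; i < 8; i++)
        if (sel & (1u << i))
            m |= (uint64_t)0xFF << (8 * i);
    return m;
}

/* Zero test modelled on OR-ing the operands. */
static int zero_by_or(uint64_t a, uint64_t b)
{
    return (a | b) == 0;
}

/* Zero test modelled on adding the operands (wraps modulo 2^64). */
static int zero_by_add(uint64_t a, uint64_t b)
{
    return (uint64_t)(a + b) == 0;
}

/* Returns 1 if both tests agree for every pair of bytewise masks. */
static int tests_agree_for_all_masks(void)
{
    for (unsigned i = 0; i < 256; i++)
        for (unsigned j = 0; j < 256; j++) {
            uint64_t a = byte_mask(i), b = byte_mask(j);
            if (zero_by_or(a, b) != zero_by_add(a, b))
                return 0;
        }
    return 1;
}
```

The restriction to masks matters: for arbitrary values such as 1 and `UINT64_MAX`, the sum wraps to zero although the OR is nonzero, so the two tests would disagree.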
Janne Grunau
62ea07d797 aarch64: vp9: use alternative returns in the core loop filter function
Since aarch64 has enough free general purpose registers, use them to
branch to the appropriate storage code. 1-2 cycles faster for the
functions using loop_filter 8/16, ... on a cortex-a53. Mixed results
(up to 2 cycles faster/slower) on a cortex-a57.

This is cherrypicked from libav commit
d7595de0b2.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:06 +01:00
Michael Bradshaw
3ac46a0a62 ffmpeg: Add -time_base option to hint the time base
Signed-off-by: Michael Bradshaw <mjbshaw@google.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 20:03:56 +01:00
Paul B Mahol
743052ec5b avcodec/cinepakenc: remove CVID from long description
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-01-14 16:56:47 +01:00
Carl Eugen Hoyos
935404923d Cosmetics: Reindent after last commit.
2017-01-14 06:07:06 +01:00
Carl Eugen Hoyos
c723108e25 lavf/matroskaenc: Do not write two CodecID elements for rawvideo.
Fixes ticket #6068.
2017-01-14 06:06:05 +01:00
Martin Vignali
1412e5a004 fate/psd : add test for bitmap and duotone
The duotone file is interpreted as gray

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 04:52:43 +01:00
Martin Vignali
31e722e9da libavcodec/psd : add test for channel depth/channel count in bitmap mode
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 04:52:43 +01:00
Matthieu Bouron
e109c54a69 swresample/arm: cosmetic fixes
2017-01-13 21:24:25 +01:00
Matthieu Bouron
0265aec565 swresample/aarch64: add ff_resample_common_apply_filter_{x4,x8}_{float,s16}_neon
2017-01-13 21:24:19 +01:00
Paul B Mahol
2eaee6e79b avcodec/qdrw: skip long comment for now
Fixes part of #5918.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-01-13 21:19:17 +01:00
Steinar H. Gunderson
d68d7198be speedhq: Align blocks variable properly.
Seemingly ff_clear_block_sse assumed that the block array is aligned,
so make sure it is.

Fixes ticket #6079

Signed-off-by: James Almer <jamrial@gmail.com>
2017-01-13 16:47:53 -03:00
James Almer
6596b34954 avcodec/lossless_videodsp: add missing call to ff_llviddsp_init_ppc()
Signed-off-by: James Almer <jamrial@gmail.com>
2017-01-12 22:56:50 -03:00