Commit Graph

43797 Commits

Author SHA1 Message Date
Alexandra Hájková e3f941cb03 checkasm: add a test for HEVC IDCT
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-10-11 18:15:40 +02:00
Martin Storsjö 9b2ccafb48 aarch64: Add missing sign extension in ff_h264_idct8_add_neon
Signed-off-by: Martin Storsjö <martin@martin.st>
2016-10-10 14:57:53 +03:00
Yogender Gupta cbd84b8a51 nvenc: Fix error log
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2016-10-09 20:58:10 +02:00
Yogender Gupta da2848375a nvenc: Force high_444 profile for 444 input
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2016-10-07 10:41:38 +02:00
Anton Khirnov e4128c08d7 Revert "hevc: x86: Refactor IDCT macro declarations"
This reverts commit d9dccc0389. There were
outstanding objections to this commit.
2016-10-06 15:24:04 +02:00
Diego Biurrun 5801f9ed24 h264_intrapred: x86: Update comments left behind in 95c89da36e
2016-10-06 12:32:34 +02:00
Diego Biurrun 20abcaa273 configure: #include stdint.h as part of libxavs test
Unfortunately the xavs.h API header is not self-sufficient and relies
on manual stdint.h inclusion by its users.
2016-10-06 12:32:34 +02:00
Diego Biurrun d9dccc0389 hevc: x86: Refactor IDCT macro declarations
2016-10-06 12:32:34 +02:00
Steve Lhomme be630b1e08 d3d11va: Use the proper decoding slice index
The decoding buffer index expected by D3D11VA is the one from the
ID3D11Texture2D, not the one from the ID3D11VideoDecoderOutputView array
in AVD3D11VAContext.

Otherwise, when providing decoder slices that do not start from 0,
pictures appear in bogus order. With an invalid index, crashes and
image corruption can occur.

Signed-off-by: Diego Biurrun <diego@biurrun.de>
2016-10-05 18:37:27 +02:00
Ronald S. Bultje 715f139c9b vp9lpf/x86: make filter_16_h work on 32-bit.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-10-04 10:54:09 +02:00
Ronald S. Bultje 8915320db9 vp9lpf/x86: make filter_48/84/88_h work on 32-bit.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-10-04 10:54:09 +02:00
Ronald S. Bultje 725a216481 vp9lpf/x86: make filter_44_h work on 32-bit.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-10-04 10:54:09 +02:00
Ronald S. Bultje 5bfa96c4b3 vp9lpf/x86: make filter_16_v work on 32-bit.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-10-04 10:54:09 +02:00
Ronald S. Bultje b905e8d2fe vp9lpf/x86: make filter_48/84_v work on 32-bit.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-10-04 10:54:08 +02:00
Ronald S. Bultje 37637e6590 vp9lpf/x86: make filter_88_v work on 32-bit.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-10-04 10:54:08 +02:00
Ronald S. Bultje be10834bd9 vp9lpf/x86: make filter_44_v work on 32-bit.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-10-04 10:54:08 +02:00
Ronald S. Bultje 7c62891efe vp9lpf/x86: save one register in SIGN_ADD/SUB.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-10-04 10:54:08 +02:00
Ronald S. Bultje c6375a83d1 vp9lpf/x86: store unpacked intermediates for filter6/14 on stack.
filter16 goes from 508 to 482 (h) or 346 to 314 (v) cycles; filter88
goes from 240 to 238 (h) or 174 to 165 (v) cycles, measured on TOS.

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-10-04 10:54:08 +02:00
Ronald S. Bultje 4ce8ba72f9 vp9lpf/x86: move variable assigned inside macro branch.
The value is not used outside the branch.

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-10-04 10:54:08 +02:00
Ronald S. Bultje e4961035b2 vp9lpf/x86: simplify ABSSUM_CMP by inverting the comparison meaning.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-10-04 10:54:08 +02:00
Ronald S. Bultje 683da2788e vp9lpf/x86: remove unused register from ABSSUB_CMP macro.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-10-04 10:54:08 +02:00
Ronald S. Bultje 6e74e9636b vp9lpf/x86: slightly simplify 44/48/84/88 h stores.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-10-04 10:54:08 +02:00
Ronald S. Bultje 6411c328a2 vp9lpf/x86: make cglobal statement more conservative in register allocation.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-10-04 10:54:08 +02:00
Ronald S. Bultje a6e288d624 vp9lpf/x86: save one register in loopfilter surface coverage.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-10-04 10:54:08 +02:00
Clément Bœsch 0ed21bdc9e vp9lpf/x86: add ff_vp9_loop_filter_[vh]_44_16_{sse2,ssse3,avx}.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-10-04 10:54:08 +02:00
Clément Bœsch f2e3d706a1 vp9lpf/x86: add ff_vp9_loop_filter_h_{48,84}_16_{sse2,ssse3,avx}().
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-10-04 10:54:08 +02:00
James Almer 92d47550ea vp9lpf/x86: add an SSE2 version of vp9_loop_filter_[vh]_88_16
Similar gains to the SSSE3 version once again.

Additional improvements by Clément Bœsch <u@pkh.me>.

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-10-04 10:54:08 +02:00
Clément Bœsch 6bea478158 vp9lpf/x86: add ff_vp9_loop_filter_[vh]_88_16_{ssse3,avx}.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-10-04 10:54:08 +02:00
James Almer 1f451eed60 vp9lpf/x86: add ff_vp9_loop_filter_[vh]_16_16_sse2().
Performance gains similar to the SSSE3 version.

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-10-04 10:54:08 +02:00
Clément Bœsch a692724c58 vp9lpf/x86: add x86 SSSE3/AVX SIMD for vp9_loop_filter_[vh]_16_16.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-10-04 10:54:08 +02:00
Ronald S. Bultje c935b54bd6 checkasm: add VP9 loopfilter tests.
The randomize_buffer() implementation assures that "most of the time",
we'll do a good mix of wide16/wide8/hev/regular/no filters for complete
code coverage. However, this is not mathematically assured because that
would make the code either much more complex, or much less random.

Some fixes and improvements by Rodger Combs <rodger.combs@gmail.com>

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-10-04 10:54:07 +02:00
Ronald S. Bultje a451324ddd vp9: ignore reference segmentation map if error_resilience flag is set.
Fixes ffvp9_fails_where_libvpx.succeeds.webm.

Bug-Id: ffmpeg/3849.

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2016-10-04 10:54:07 +02:00
Vittorio Giovara dc3fe45fca fate: Add test for rscc palette
2016-10-02 15:42:03 -04:00
Carl Eugen Hoyos c19830aa2c rscc: Support palette format
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
2016-10-02 15:42:03 -04:00
Vittorio Giovara b8d5070db6 avcodec: Document AV_PKT_DATA_PALETTE side data type
2016-10-02 15:42:03 -04:00
Vittorio Giovara 497c087939 avidec: Set palette alpha as fully opaque
Palette format is always in RGBA.
2016-10-02 15:42:03 -04:00
Vittorio Giovara bad4aad403 avidec: Do not special case palette on big-endian
This simplifies the code a bit, does not change output data in any way.
2016-10-02 15:42:03 -04:00
Vittorio Giovara 310c55f179 pixfmt: Document alternative names for smpte 431 and 432
2016-10-02 15:42:03 -04:00
Mark Thompson 5a5df90d9c vaapi_h265: Add main 10 encode support
2016-10-02 20:23:18 +01:00
Mark Thompson eaaaabf6c9 hwcontext_vaapi: Enable P010 support
This is required for 10-bit surfaces.
2016-10-02 20:23:18 +01:00
Mark Thompson b8cac1e830 vaapi_h265: Fix buffering parameters
A decoder may need this to be set correctly to output frames in the
right order.
2016-10-02 20:23:18 +01:00
Mark Thompson fc30a90898 vaapi_h265: Fix slice header writing
This was not observed earlier because the only syntax element which
it normally misses with the current setup is slice_qp_delta, but that
is always going to be zero (in IDR frames QP isn't varied on the
slice) which will always exp-golomb code as a single 1 bit.  The
immediately following part is the byte alignment, which is always a 1
bit followed by 0s which are ignored, so as long as the bitstream is
never aligned at that point we will never notice because the only
difference is that an ignored bit is a 1 instead of a 0.
2016-10-02 20:23:18 +01:00
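
The bit-level argument in the message above can be checked with a small sketch. The `BitWriter` type and the `bw_*` helpers below are hypothetical names invented for this illustration and are not taken from the encoder; the sketch only assumes the standard signed Exp-Golomb mapping and a byte alignment made of a single 1 bit followed by 0 bits.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical MSB-first bit writer, for illustration only. */
typedef struct BitWriter {
    uint8_t buf[16];
    size_t  bit_pos;            /* number of bits written so far */
} BitWriter;

static void bw_init(BitWriter *bw)
{
    memset(bw, 0, sizeof(*bw));
}

static void bw_put_bit(BitWriter *bw, int bit)
{
    if (bw->bit_pos >= 8 * sizeof(bw->buf))
        return;                 /* silently drop bits past the buffer */
    if (bit)
        bw->buf[bw->bit_pos >> 3] |= (uint8_t)(0x80 >> (bw->bit_pos & 7));
    bw->bit_pos++;
}

/* Unsigned Exp-Golomb: N zero bits, then (v + 1) written in N + 1 bits.
 * Assumes v < UINT32_MAX. */
static void bw_put_ue(BitWriter *bw, uint32_t v)
{
    uint32_t code = v + 1;
    int len = 0;

    for (uint32_t t = code; t > 1; t >>= 1)
        len++;
    for (int i = 0; i < len; i++)
        bw_put_bit(bw, 0);
    for (int i = len; i >= 0; i--)
        bw_put_bit(bw, (int)((code >> i) & 1));
}

/* Signed Exp-Golomb: v > 0 maps to 2v - 1, v <= 0 maps to -2v.
 * Assumes a small |v|, as is typical for a QP delta. */
static void bw_put_se(BitWriter *bw, int32_t v)
{
    uint32_t mapped = v > 0 ? 2u * (uint32_t)v - 1u
                            : 2u * (uint32_t)(-(int64_t)v);
    bw_put_ue(bw, mapped);
}

/* Byte alignment: one 1 bit, then 0 bits up to the next byte boundary. */
static void bw_put_byte_alignment(BitWriter *bw)
{
    bw_put_bit(bw, 1);
    while (bw->bit_pos & 7)
        bw_put_bit(bw, 0);
}
```

With three header bits already written, emitting `bw_put_se(bw, 0)` followed by the alignment gives the bits `1 1 0 0 0`, while skipping the zero delta gives `1 0 0 0 0`. The two streams have the same length and differ in a single bit, which is the situation the message describes.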
Mark Thompson ec17ab381e vaapi_h264: Write bitstream restriction fields
2016-10-02 20:23:18 +01:00
Mark Thompson 17a0f9481c vaapi_h264: Fix CFR mode with frame_rate set in AVCodecContext
2016-10-02 20:23:18 +01:00
Mark Thompson 314b421dd8 vaapi_encode: Decide on GOP setup before initialising sequence parameters
This was always too late; several fields related to it have been incorrectly
zero since the encoder was added.
2016-10-02 20:23:18 +01:00
Anton Khirnov 5cc0057f49 lavu: remove the custom atomic API
It has been replaced by C11 stdatomic.h and is now unused.
2016-10-02 19:35:55 +02:00
Anton Khirnov 59c7022740 pthread_frame: use atomics for frame progress
2016-10-02 19:35:46 +02:00
Anton Khirnov 64a31b2854 pthread_frame: use atomics for PerThreadContext.state
2016-10-02 19:35:34 +02:00
Anton Khirnov db2733256d pthread_frame: use a thread-safe way for signalling threads to die
Current code uses a plain int in a racy way, which is UB.
2016-10-02 19:35:23 +02:00
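
For readers unfamiliar with the problem, here is a minimal sketch of a race-free exit flag using C11 `stdatomic.h`. The `Worker` struct and `worker_*` functions are hypothetical and do not come from pthread_frame; this is also not necessarily the approach the commit takes, since guarding the flag with a mutex is an equally valid fix.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical worker state, for illustration only. */
typedef struct Worker {
    atomic_int die;             /* set to 1 to ask the worker to stop */
    int        jobs_done;       /* touched only by the worker itself */
} Worker;

static void worker_init(Worker *w)
{
    atomic_init(&w->die, 0);
    w->jobs_done = 0;
}

/* Called by the controlling thread. With a plain int this store would
 * race with the worker's read; atomic_store makes it well defined. */
static void worker_request_exit(Worker *w)
{
    atomic_store(&w->die, 1);
}

static bool worker_should_exit(Worker *w)
{
    return atomic_load(&w->die) != 0;
}

/* Worker loop body: counts up to max_jobs units of work, stopping
 * early once an exit has been requested. Returns the running total. */
static int worker_run(Worker *w, int max_jobs)
{
    while (w->jobs_done < max_jobs && !worker_should_exit(w))
        w->jobs_done++;
    return w->jobs_done;
}
```

In a real decoder the worker would usually sleep on a condition variable rather than poll, so the exit request also has to wake it up. The sketch leaves that part out.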
Anton Khirnov 8385ba53f1 mmaldec: convert to stdatomic
2016-10-02 19:35:12 +02:00