82636 Commits

Author SHA1 Message Date
Muhammad Faiz
06f94149c6 swresample/resample: optimize exact_rational=on:linear_interp=on case
Separate dsp.resample into dsp.resample_common and dsp.resample_linear,
and call the faster resample_common even when linear_interp=on,
provided that c->frac and c->dst_incr_mod are both zero.

This speeds up resampling when exact_rational and linear_interp are both
enabled, because exact_rational forces c->frac and c->dst_incr_mod to
be zero when soft compensation does not happen.

benchmark on exact_rational=on:linear_interp=on
        old     new
real    8.432s  5.097s
user    7.679s  4.989s
sys     0.125s  0.107s

Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Muhammad Faiz <mfcc64@gmail.com>
2016-11-25 03:22:04 +07:00
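Note on 06f94149c6: the selection rule in this message can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the commit. The struct `ResampleSketch`, the enum `ResamplePath` and the function `choose_path` are hypothetical names; only `frac`, `dst_incr_mod`, `linear_interp`, `resample_common` and `resample_linear` come from the message itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the resampler state. Only the
 * fields named in the commit message are modelled. */
typedef struct ResampleSketch {
    bool linear_interp; /* user asked for linear interpolation */
    int  frac;          /* fractional position, as named in the message */
    int  dst_incr_mod;  /* fractional step, as named in the message */
} ResampleSketch;

typedef enum { PATH_COMMON, PATH_LINEAR } ResamplePath;

/* Pick the linear path only when interpolation is requested AND there is
 * a non-zero fractional part to interpolate over; otherwise the cheaper
 * common path is assumed to give the same result. */
static ResamplePath choose_path(const ResampleSketch *c)
{
    assert(c);
    if (c->linear_interp && (c->frac || c->dst_incr_mod))
        return PATH_LINEAR;
    return PATH_COMMON;
}
```

With `linear_interp` set and both fractional fields at zero, the sketch selects the common path, which is the case the benchmark above measures.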
Muhammad Faiz
ebb4c783d0 fate/swresample: add resample exact_lin and exact_lin_async test
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Muhammad Faiz <mfcc64@gmail.com>
2016-11-25 03:21:56 +07:00
Wan-Teh Chang
048b46b4e2 avutil/tests: add cpu_init to .gitignore and tests/fate
This is a follow-up to commit d84a21207e,
which added the libavutil/tests/cpu_init.c.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2016-11-24 21:10:37 +01:00
Wan-Teh Chang
dceac9a4a7 avfilter/tests/.gitignore: add integral
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2016-11-24 21:10:37 +01:00
James Almer
d1de725bee cuda: check for cuda.h when enabled
Fixes make checkheaders on systems without the Cuda Toolkit, which
was broken after the dynlink changes.

Signed-off-by: James Almer <jamrial@gmail.com>
2016-11-24 13:50:43 -03:00
Paul B Mahol
8f5a2bed5e ffmpeg_filter: fix several logic failures
Move global thread variables to better place.
Use correct variable for simple and complex filtergraphs.

This makes number of threads set per filter work again.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2016-11-24 16:27:55 +01:00
Andreas Cadhalpun
995512328e pgssubdec: only set w/h/linesize when allocating data
Rects with positive w/h/linesize but no data are invalid.

Reviewed-by: Petri Hintukainen <phintuka@gmail.com>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
2016-11-24 01:48:43 +01:00
Moritz Barsnick
0700d02a69 lavfi/pan: allow negative gain parameters also for other inputs than the first named
Expands the parser to also accept the separator '-' in addition to
'+', and take the negative sign into consideration.

The optional sign for the first factor in the expression is already
covered by parsing for an integer.

Signed-off-by: Moritz Barsnick <barsnick@gmx.net>
Reviewed-by: Nicolas George <george@nsup.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2016-11-24 00:54:52 +01:00
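Note on 0700d02a69: a minimal sketch of the parsing idea, assuming a simplified gain syntax of the form `[gain*]cN` joined by '+' or '-'. The function `parse_gains`, the `MAX_CH` limit and the restriction to `cN` channel names are assumptions made for illustration. The real pan filter accepts more syntax and is not reproduced here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_CH 8 /* hypothetical channel limit for this sketch */

/* Parse e.g. "0.5*c0-0.25*c1+c2" into gains[].
 * Returns 0 on success, -1 on a syntax error. */
static int parse_gains(const char *s, double gains[MAX_CH])
{
    int first = 1;

    assert(s && gains);
    memset(gains, 0, sizeof(double) * MAX_CH);

    while (*s) {
        double sign = 1.0, gain = 1.0;
        int ch = -1, len = 0;

        /* Between terms, accept '-' as well as '+' and remember the sign. */
        if (!first) {
            if (*s == '-')
                sign = -1.0;
            else if (*s != '+')
                return -1;
            s++;
        }
        first = 0;

        /* Optional "gain*" prefix. For the first term, a leading sign is
         * consumed by the number parser itself. */
        if (sscanf(s, "%lf *%n", &gain, &len) == 1 && len > 0)
            s += len;
        else
            gain = 1.0;

        /* Mandatory channel name "cN". */
        len = 0;
        if (sscanf(s, "c%d%n", &ch, &len) != 1 || len == 0 ||
            ch < 0 || ch >= MAX_CH)
            return -1;
        s += len;

        gains[ch] += sign * gain;
    }
    return 0;
}
```

Under these assumptions, `"0.5*c0-0.25*c1+c2"` yields gains 0.5, -0.25 and 1.0 for channels 0, 1 and 2.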
Jun Zhao
584eea5bf3 lavc/vaapi_hevc: fix scaling list duplicate transfer issue.
The scaling list is already transferred to raster scan during head parsing,
so there is no need to transfer it again.

And after this fix, FATE test SLIST_A_Sony_4/SLIST_B_Sony_8/
SLIST_C_Sony_3/SLIST_D_Sony_9 will pass in i965/Skylake.

Signed-off-by: Wang, Yi A <yi.a.wamg@intel.com>
Signed-off-by: Jun Zhao <jun.zhao@intel.com>
Signed-off-by: Mark Thompson <sw@jkqxz.net>
2016-11-23 21:38:10 +00:00
Wan-Teh Chang
d84a21207e avutil/tests: Add cpu_init.c to check whether the one-time initialization in av_get_cpu_flags() has data races.
Co-author: Dmitry Vyukov of Google

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2016-11-23 22:35:25 +01:00
Wan-Teh Chang
29fb49194b avutil/cpu: remove the |checked| static variable
Remove the |checked| variable because the invalid value of -1 for
|flags| can be used to indicate the same condition. Also rename |flags|
to |cpu_flags| because there are a local variable and a function
parameter named |flags| in the same file.

Co-author: Dmitry Vyukov of Google

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2016-11-23 22:35:25 +01:00
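Note on 29fb49194b: the idea of letting the invalid value -1 double as the "not checked yet" marker might look roughly like this. It is a single-threaded sketch. `detect_cpu_flags`, `get_cpu_flags_sketch` and `detect_calls` are hypothetical names, the returned flag value is made up, and the data-race question that the cpu_init test targets is deliberately not modelled.

```c
#include <assert.h>

static int cpu_flags = -1; /* -1 is never a valid flag set: "not probed yet" */
static int detect_calls;   /* counts probes, for illustration only */

/* Hypothetical stand-in for the real hardware probe. */
static int detect_cpu_flags(void)
{
    detect_calls++;
    return 0x5; /* made-up flag bits */
}

/* No separate "checked" variable is needed: the sentinel in cpu_flags
 * already says whether the probe has run. */
static int get_cpu_flags_sketch(void)
{
    if (cpu_flags == -1) {
        int flags = detect_cpu_flags();
        assert(flags != -1); /* the probe must never return the sentinel */
        cpu_flags = flags;
    }
    return cpu_flags;
}
```

Calling `get_cpu_flags_sketch()` twice runs the probe once, which is the behaviour the removed variable used to guarantee.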
Philip Langdale
dd10e7253a avcodec/cuvid: Restore initialization of pixel format in init()
I moved this into the handle_video_sequence callback because that's
the earliest time you can make an accurate decision as to what the
format should be.

However, transcoding requires that the decision between using
the accelerated PIX_FMT_CUDA vs a normal pix format happen at init()
time. There is enough information available to make that decision
and things work out with the underlying format only being discovered
in the sequence callback.
2016-11-23 13:23:34 -08:00
Paul B Mahol
b96a6e2024 avfilter/vf_zscale: add support for some recent new additions
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2016-11-23 19:02:20 +01:00
James Almer
42ae9c6654 fate: update fate-source ref file
Signed-off-by: James Almer <jamrial@gmail.com>
2016-11-23 00:55:01 -03:00
Sam Hocevar
3115550abe doc/examples/muxing: Fix av_frame_make_writable usage
This patch moves the av_frame_make_writable() call from fill_yuv_image
to get_video_frame so that its argument can be the actual frame that
will be sent to the encoder.

This fixes data corruption issues in codecs that keep references on
one or several previous frames.

Signed-off-by: Sam Hocevar <sam@hocevar.net>
Reviewed-by: wm4
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2016-11-23 03:28:04 +01:00
Michael Niedermayer
69f7dd3524 avcodec/options_table: make channel_layouts uint64
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2016-11-23 02:01:05 +01:00
Michael Niedermayer
2f935baa7d avutil/opt: Add AV_OPT_TYPE_UINT64
Requested-by: wm4 ([FFmpeg-devel] [PATCH] avutil/opt: Support max > INT64_MAX in write_number() with AV_OPT_TYPE_INT64)
Requested-by: ronald ([FFmpeg-devel] [PATCH] avutil/opt: Support max > INT64_MAX in write_number() with AV_OPT_TYPE_INT64)
Reviewed-by: Andreas Cadhalpun <andreas.cadhalpun@googlemail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2016-11-23 02:01:05 +01:00
Andreas Cadhalpun
dbefbb61b7 sbgdec: prevent NULL pointer access
Reviewed-by: Josh de Kock <josh@itanimul.li>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
2016-11-23 01:16:42 +01:00
Andreas Cadhalpun
de4ded0636 rmdec: validate block alignment
This fixes division by zero crashes.

Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
2016-11-23 00:57:10 +01:00
Andreas Cadhalpun
946ecd19ea smacker: limit recursion depth of smacker_decode_bigtree
This fixes segmentation faults due to stack-overflow caused by too deep
recursion.

Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
2016-11-23 00:57:10 +01:00
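Note on 946ecd19ea: limiting recursion depth generally means threading a depth counter through the recursive call and failing once it passes a bound. The sketch below shows that pattern on a plain binary tree. `Node`, `count_leaves` and the value of `MAX_DEPTH` are hypothetical and unrelated to the actual smacker decoder structures.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPTH 32 /* hypothetical bound, chosen only for this sketch */

typedef struct Node {
    struct Node *left, *right;
} Node;

/* Count leaves, but refuse to descend deeper than MAX_DEPTH.
 * Returns the leaf count, or -1 if the tree is too deep. */
static int count_leaves(const Node *n, int depth)
{
    int left, right;

    assert(depth >= 0);
    if (depth > MAX_DEPTH)
        return -1; /* bail out instead of exhausting the stack */
    if (!n)
        return 0;
    if (!n->left && !n->right)
        return 1;

    left = count_leaves(n->left, depth + 1);
    if (left < 0)
        return -1;
    right = count_leaves(n->right, depth + 1);
    if (right < 0)
        return -1;
    return left + right;
}
```

A crafted input can then produce an error return at most, since the stack use is bounded by `MAX_DEPTH` frames.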
Michael Niedermayer
4e5049a230 avformat/mpeg: Adjust vid probe threshold to correct mis-detection
Fixes: _ij.mp3

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2016-11-23 00:54:45 +01:00
Andreas Cadhalpun
fdb8c455b6 mxfdec: fix NULL pointer dereference in mxf_read_packet_old
Metadata streams have priv_data set to NULL.

Reviewed-by: Josh de Kock <josh@itanimul.li>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
2016-11-23 00:40:52 +01:00
Alex Converse
3ee59939a1 libvpxenc: Support targeting a VP9 level
Levels are specified at https://www.webmproject.org/vp9/levels/
2016-11-22 11:31:48 -08:00
Philip Langdale
81147b5596 avcodec/cuvid: Add support for P010/P016 as an output surface format
The nvidia 375.xx driver introduces support for P016 output surfaces,
for 10bit and 12bit HEVC content (it's also the first driver to support
hardware decoding of 12bit content).

The cuvid api, as far as I can tell, only declares one output format
that they appear to refer to as P016 in the driver strings. Of course,
10bit content in P016 is identical to P010, and it is useful for
compatibility purposes to declare the format to be P010 to work with
other components that only know how to consume P010 (and to avoid
triggering swscale conversions that are lossy when they shouldn't be).

For simplicity, this change does not maintain the previous ability
to output dithered NV12 for 10/12 bit input video - the user will need
to update their driver to decode such videos.
2016-11-22 10:09:30 -08:00
Philip Langdale
8d6c358ea8 libavutil/hwcontext_cuda: Support P010 and P016 formats
CUVID is now capable of returning 10bit and 12bit decoded content
in P010/P016. Let's support transferring those formats.
2016-11-22 10:09:14 -08:00
Philip Langdale
237421f149 avutil: add P016 pixel format
P016 is the 16-bit variant of NV12 (planar luma, packed chroma), using
two bytes per component.

It may, and in fact is most likely to, be used in situations where
there are less than 16 bits of data. It is the responsibility of
the writer to zero out any unused LSBs.
2016-11-22 10:07:43 -08:00
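Note on 237421f149: the requirement that unused LSBs be zero implies that samples narrower than 16 bits sit in the most significant bits of each two-byte component. A minimal sketch of that packing follows. The helper name `pack_msb_aligned` is hypothetical, and byte order is ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Place a `depth`-bit sample in the top bits of a 16-bit word.
 * The low (16 - depth) bits come out as zero, as the format requires. */
static uint16_t pack_msb_aligned(unsigned sample, int depth)
{
    assert(depth >= 1 && depth <= 16);
    assert(sample < (1u << depth));
    return (uint16_t)(sample << (16 - depth));
}
```

For 10-bit content this is a left shift by 6, which is why 10-bit data stored this way can be treated as P010.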
Timo Rothenpieler
5ea8f70623 avcodec/libx264: fix forced_idr logic
Currently, it forces IDR frames for both true and false.
Not entirely sure what the original idea behind the tri-state bool
option is.

Reviewed-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2016-11-22 16:35:08 +01:00
Miroslav Slugen
10db40f374 avcodec/cuvid: allow setting number of used surfaces
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
2016-11-22 10:34:27 +01:00
Miroslav Slugeň
de2faec2fa avcodec/nvenc: better surface allocation algorithm, fix rc_lookahead
User-selectable surfaces are not working correctly: if you set the number
of surfaces on the command line, it will always use a minimum of 32 or 48,
depending on the selected resolution, but nvenc does not need so many
surfaces.

So from now on you can define as few as 1 surface and nvenc will still
work; it will of course lower GPU memory usage by 95% and async_delay to zero.

That was the easy part, now a little bit more...

The next part of this patch is to always treat rc_lookahead as more
important for the number of surfaces than the user-defined surfaces value.
The maximum rc_lookahead from the nvidia documentation is 32, but it could
increase in future generations, so there is no limit for this yet. The
async_depth value is still accepted and preferred over rc_lookahead.

There was also a bug when you requested rc_lookahead > 31: it would
always set a maximum of 31, because the surface number recalculation came
after setting lookahead. This is now fixed.

Results:
If you set -rc_lookahead 32 and -bf 3 it will now use only 40 surfaces
and lower GPU memory usage by 20%, also it will now increase PSNR by 0.012dB

Two more comments:

1. From my internal tests, I don't understand the addition of 4 more
surfaces when lookahead is calculated. I didn't use this and everything
works as with those 4 extra surfaces. Does anybody know what is going on
there? It looks like it was used for B frames, which are calculated
separately, because the B frame maximum is 4.

2. rc_lookahead defaults to -1, but the test condition if
(ctx->rc_lookahead), which sets lookahead, will then always be true. I
don't know if this is intended behavior, so by default lookahead is
always on!

This is the default condition when rc_lookahead is -1 (not defined on the
command line), which is maybe something that is not intended:
ctx->encode_config.rcParams.enableLookahead = 1;
ctx->encode_config.rcParams.lookaheadDepth  = 0;
ctx->encode_config.rcParams.disableIadapt   = 0;
ctx->encode_config.rcParams.disableBadapt   = 0;

Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
2016-11-22 10:34:27 +01:00
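Note on de2faec2fa, comment 2: in C any non-zero integer is true, so a default of -1 passes a bare `if (ctx->rc_lookahead)` test. The sketch below contrasts that check with a positive-only check. Both function names are hypothetical, and the positive-only variant is shown purely for comparison. It is not claimed to be what the encoder does or should do.

```c
#include <assert.h>

/* Mirrors a bare truthiness test: -1 counts as "enabled". */
static int lookahead_on_as_written(int rc_lookahead)
{
    return rc_lookahead ? 1 : 0;
}

/* Hypothetical alternative: only a positive depth enables lookahead. */
static int lookahead_on_if_positive(int rc_lookahead)
{
    return rc_lookahead > 0;
}

/* The default value -1 separates the two checks. */
static void check_default_case(void)
{
    assert(lookahead_on_as_written(-1) == 1);
    assert(lookahead_on_if_positive(-1) == 0);
}
```

The two checks agree for 0 and for positive depths, and differ only for the -1 default described in the message.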
Miroslav Slugeň
c4aca65a42 avcodec/nvenc: maximum usable surfaces are limited to maximum registered frames
Maximum usable surfaces is limited to MAX_REGISTERED_FRAMES constant in
nvenc.h

Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
2016-11-22 10:34:27 +01:00
Timo Rothenpieler
8228b714be configure: cuda is no longer nonfree, enable and autodetect by default
2016-11-22 10:34:27 +01:00
Timo Rothenpieler
0faf3c3a25 avfilter/vf_hwupload_cuda: check ff_formats_ref for errors
2016-11-22 10:34:27 +01:00
Timo Rothenpieler
b0ca90d7cb avfilter/vf_hwupload_cuda: use new hwdevice allocation API
2016-11-22 10:34:27 +01:00
Timo Rothenpieler
a66835bcb1 avcodec/nvenc: use dynamically loaded CUDA
2016-11-22 10:34:27 +01:00
Timo Rothenpieler
a0c9e76942 avfilter/vf_scale_npp: use dynamically loaded CUDA
2016-11-22 10:34:27 +01:00
Timo Rothenpieler
d9ad18f3b4 avcodec/cuvid: use dynamically loaded CUDA/CUVID
And remove the now obsolete compat headers.
2016-11-22 10:34:27 +01:00
Timo Rothenpieler
e6464a44ed avutil/hwcontext_cuda: use dynamically loaded CUDA
2016-11-22 10:34:27 +01:00
Timo Rothenpieler
5c02d2827b compat/cuda: add dynamic loader
2016-11-22 10:34:27 +01:00
Steven Liu
d316b21dba avformat/flvenc: add no_metadata to flvflags
Some FLV streams have no metadata;
with this flag, ffmpeg output can match the source FLV stream.

Signed-off-by: Steven Liu <lingjiujianke@gmail.com>
2016-11-22 10:18:23 +08:00
James Almer
0b8df0ce48 avformat/utils: add missing brackets around arguments in av_realloc() call
Found-by: Neil Birkbeck <neil.birkbeck@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2016-11-21 23:02:20 -03:00
Mark Thompson
f242e0a0ff vaapi_encode: Fix format specifier for bitrate logging
Same as e0df56f25d.  This was accidentally
reintroduced while merging c8241e730f.
2016-11-21 22:59:58 +00:00
Jun Zhao
e72662e131 lavc/vaapi_encode_h264: fix poc incorrect issue after meeting idr frame.
When meeting an IDR frame, the vaapi_encode_h264 POC number is not reset;
fix this based on the H.264 spec. Some decoders don't care about this case,
but this fix improves the encoder's behaviour. Before this fix, the POC
number was negative in some cases.

Reviewed-by: Jun Zhao <jun.zhao@intel.com>
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Signed-off-by: Mark Thompson <sw@jkqxz.net>
2016-11-21 22:37:02 +00:00
Mark Thompson
30ebabca7c vaapi_h265: Fix buffering parameters
A decoder may need this to be set correctly to output frames in the
right order.

(cherry picked from commit b8cac1e830)
2016-11-21 22:13:41 +00:00
Mark Thompson
ae0230cc3e vaapi_h265: Fix slice header writing
This was not observed earlier because the only syntax element which
it normally misses with the current setup is slice_qp_delta, but that
is always going to be zero (in IDR frames QP isn't varied on the
slice) which will always exp-golomb code as a single 1 bit.  The
immediately following part is the byte alignment, which is always a 1
bit followed by 0s which are ignored, so as long as the bitstream is
never aligned at that point we will never notice because the only
difference is that an ignored bit is a 1 instead of a 0.

(cherry picked from commit fc30a90898)
2016-11-21 22:13:41 +00:00
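Note on ae0230cc3e: the claim that a zero slice_qp_delta codes as a single 1 bit follows from the signed Exp-Golomb mapping. The sketch below writes the code as a string of '0' and '1' characters so the bit pattern is visible. `ue_bits` and `se_bits` are hypothetical helper names, not functions from the encoder, and only small magnitudes are handled.

```c
#include <assert.h>
#include <stddef.h>

/* Unsigned Exp-Golomb: (len - 1) zero bits, then value + 1 in len bits.
 * Writes '0'/'1' characters plus a terminating NUL; returns the bit count. */
static int ue_bits(unsigned value, char *out, size_t out_size)
{
    unsigned code, tmp;
    int len = 0, n = 0, i;

    assert(out && value < 65535u); /* keep the sketch to small values */
    code = value + 1;
    for (tmp = code; tmp; tmp >>= 1)
        len++;
    assert((size_t)(2 * len) <= out_size); /* 2*len - 1 bits plus NUL */

    for (i = 0; i < len - 1; i++)
        out[n++] = '0';
    for (i = len - 1; i >= 0; i--)
        out[n++] = ((code >> i) & 1) ? '1' : '0';
    out[n] = '\0';
    return n;
}

/* Signed Exp-Golomb: v > 0 maps to 2v - 1, v <= 0 maps to -2v. */
static int se_bits(int value, char *out, size_t out_size)
{
    unsigned mapped;

    assert(value > -32768 && value < 32768);
    mapped = value > 0 ? 2u * (unsigned)value - 1u : 2u * (unsigned)(-value);
    return ue_bits(mapped, out, out_size);
}
```

A value of 0 maps to code number 0, whose code word is the lone bit `1`, the same bit that starts the byte-alignment pattern described in the message.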
Mark Thompson
6796e6ea84 vaapi_h264: Write bitstream restriction fields
(cherry picked from commit ec17ab381e)
2016-11-21 22:13:41 +00:00
Mark Thompson
658c5afaa0 vaapi_h264: Fix CFR mode with frame_rate set in AVCodecContext
(cherry picked from commit 17a0f9481c)
2016-11-21 22:13:41 +00:00
Mark Thompson
ded1859df1 vaapi_encode: Decide on GOP setup before initialising sequence parameters
This was always too late; several fields related to it have been incorrectly
zero since the encoder was added.

(cherry picked from commit 314b421dd8)
2016-11-21 22:13:41 +00:00
Mark Thompson
ee1d04f970 vaapi_h264: Set max_num_ref_frames to 1 when not using B frames
(cherry picked from commit 956a54129d)
2016-11-21 22:13:41 +00:00
Mark Thompson
94f446c628 vaapi_encode: Sync to input surface rather than output
While outwardly bizarre, this change makes the behaviour consistent
with other VAAPI encoders which sync to the encode /input/ picture in
order to wait for /output/ from the encoder.  It is not harmful on
i965 (because synchronisation already happens in vaRenderPicture(),
so it has no effect there), and it allows the encoder to work on
mesa/gallium which assumes this behaviour.

(cherry picked from commit 086e4b58b5)
2016-11-21 22:13:41 +00:00
Mark Thompson
478a4b7e6d vaapi_encode: Check packed header capabilities
This improves behaviour with drivers which do not support packed
headers, such as AMD VCE on mesa/gallium.

(cherry picked from commit 892bbbcdc1)
2016-11-21 22:13:41 +00:00