107 Commits

Author SHA1 Message Date
Ronald S. Bultje 9ebcf7699b vp8: fix segmentation race during frame-threading.
Fixes occasional failure of make fate-vp8-test-vector-010 with
frame-multithreading enabled.
2011-05-31 07:13:34 -07:00
Mans Rullgard 4276112277 vp8: use av_clip_uintp2() where possible
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-05-29 02:10:05 +01:00
Mans Rullgard 1550f45a89 Add av_clip_uintp2() function
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-05-13 16:45:24 -04:00
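
For readers unfamiliar with the helper named in the two commits above: av_clip_uintp2(a, p) clamps a signed integer to the unsigned p-bit range [0, 2^p - 1]. The sketch below shows that behaviour under a hypothetical name, clip_uintp2_sketch; it is a plain illustration and not the libavutil implementation, which may be written differently for speed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for av_clip_uintp2(): clamp a signed value to the
 * unsigned p-bit range [0, 2^p - 1]. Not the libavutil code. */
static inline unsigned clip_uintp2_sketch(int a, int p)
{
    assert(p >= 1 && p <= 30);        /* keep (1 << p) well defined for int */
    const int max = (1 << p) - 1;     /* largest value representable in p bits */

    if (a < 0)
        return 0;                     /* negative inputs clamp to zero */
    if (a > max)
        return (unsigned)max;         /* too-large inputs clamp to 2^p - 1 */
    return (unsigned)a;               /* already in range */
}
```

With p = 8 this is the usual clamp to [0, 255]; smaller p values cover cases such as 6-bit filter levels.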
Oskar Arvidsson 19a0729b4c Adds 8-, 9- and 10-bit versions of some of the functions used by the h264 decoder.
This patch lets e.g. dsputil_init choose dsp functions with respect to
the bit depth to decode. The naming scheme of bit depth dependent
functions is <base name>_<bit depth>[_<prefix>] (i.e. the old
clear_blocks_c is now named clear_blocks_8_c).

Note: Some of the functions for high bit depth are not dependent on the
bit depth, but only on the pixel size. This leaves some room for
optimizing binary size.

Preparatory patch for high bit depth h264 decoding support.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-05-10 07:24:36 -04:00
Ronald S. Bultje 4773d90421 vp8: frame-multithreading.
Tested on a Mac Pro, 2 CPUs, 2 cores each, OSX 10.6.6:

time ./ffmpeg -v 0 -vsync 0 -threads [1234] -i \
  ~/Downloads/sintel_trailer_1080p_vp8_vorbis.webm \
  -f null -vcodec rawvideo -an -
1: 0m14.630s (89.9 fps)
2: 0m8.056s (163.2 fps)
3: 0m5.882s (223.6 fps)
4: 0m4.952s (265.6 fps)

time ./ffmpeg -v 0 -vsync 0 -threads [1234] -i \
  ~/Downloads/Elephants_Dream-720p-Stereo.webm \
  -f null -vcodec rawvideo -an -
1: 1m12.962s (215.1 fps)
2: 0m44.682s (351.2 fps)
3: 0m31.183s (503.2 fps)
4: 0m25.284s (620.6 fps)

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2011-05-02 17:03:31 +02:00
Stefano Sabatini 975a1447f7 Replace deprecated FF_*_TYPE symbols with AV_PICTURE_TYPE_*.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2011-05-02 12:18:44 +02:00
Alexander Strange 66f608a6aa vp8.c: rename EDGE_* to VP8_EDGE_*.
2011-03-24 21:48:18 -04:00
Mans Rullgard 2912e87a6c Replace FFmpeg with Libav in licence headers
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-03-19 13:33:20 +00:00
Jason Garrett-Glaser 81a131312d VP8: fix other function declaration
Was missed in 3efbe137.
2011-03-12 15:36:15 -08:00
Jason Garrett-Glaser 1eeca88691 VP8: optimize VP8Context struct ordering
Shaves at least 3KB off code size on x86, should improve cache utilization.
This would probably be useful to do for other decoders/encoders as well.
2011-03-12 03:43:42 -08:00
Jason Garrett-Glaser 3efbe13739 VP8: fix function declaration
2011-03-12 03:41:39 -08:00
Jason Garrett-Glaser 628b48db85 VP8: use a goto to break out of two loops
A break statement was supposed to break out of two loops, but only broke out of one.
Didn't affect output, just could have been marginally slower.
2011-03-12 03:41:33 -08:00
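
The pattern described in the commit above can be sketched as follows. In C, break leaves only the innermost loop, so a nested search that should stop entirely needs a goto (or a flag). The function first_nonzero and its grid layout are hypothetical and are not taken from vp8.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return the flat index of the first nonzero entry in a
 * row-major rows x cols grid, or -1 if every entry is zero. */
static int first_nonzero(const int *grid, int rows, int cols)
{
    int found = -1;

    assert(grid != NULL);
    for (int y = 0; y < rows; y++) {
        for (int x = 0; x < cols; x++) {
            if (grid[y * cols + x]) {
                found = y * cols + x;
                goto done;  /* a plain break would only leave the inner loop */
            }
        }
    }
done:
    return found;
}
```

If the scan only needs to know whether some match exists, as the commit message suggests was the case, a plain break yields the same answer but the outer loop keeps scanning the remaining rows, which costs time without changing output.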
Jason Garrett-Glaser 891b1f15a7 VP8: init one less near_mv
This one didn't actually need to be initialized.
2011-02-17 15:25:28 -08:00
Jason Garrett-Glaser bcf4568f18 VP8: split out declarations to new header
2011-02-17 15:25:16 -08:00
Jason Garrett-Glaser 7634771e70 VP8: faster MV clipping
2011-02-17 15:23:53 -08:00
Reinhard Tartler 737eb5976f Merge libavcore into libavutil
It is pretty hopeless to expect that other sizeable projects will adopt
libavutil on its own. Projects that need a small footprint are better
off with more specialized libraries such as gnulib, or with simply
copying the parts that they need. With this in mind, nobody
is helped by having libavutil and libavcore split. In order to ease
maintenance inside and around FFmpeg and to reduce confusion where to
put common code, avcore's functionality is merged (back) to avutil.

Signed-off-by: Reinhard Tartler <siretart@tauware.de>
2011-02-15 16:18:21 +01:00
Mans Rullgard a7878c9f73 VP8: ARM optimised decode_block_coeffs_internal
Approximately 5% faster on Cortex-A8.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-02-11 15:48:11 +00:00
Jason Garrett-Glaser f3d09d44b7 VP8: optimized mv prediction and decoding
Merge find_near_mvs and mv bitstream decoding: don't do prediction steps
until absolutely necessary.
2011-02-10 16:18:16 -08:00
Jason Garrett-Glaser 62457f9052 VP8: idct_mb optimizations
Currently uses AV_RL32 instead of AV_RL32A, as the latter doesn't exist yet.
2011-02-08 15:59:24 -08:00
Jason Garrett-Glaser 8a2c99b486 VP8: slightly faster loopfilter sharpness logic
2011-02-04 04:51:22 -08:00
Jason Garrett-Glaser 79dec1541b VP8: faster deblock strength calculation
Convert hev_thresh logic to a LUT, simplify mbedge_lim calculation.
2011-02-04 04:51:18 -08:00
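
A rough sketch of the "convert to a LUT" idea from the commit above: a chain of comparisons on the filter level is evaluated once per possible input and stored in a table, so the per-macroblock cost becomes a single load. The threshold values below follow my reading of the VP8 bitstream specification and should be checked against it; the names hev_thresh_branchy, hev_lut and init_hev_lut are hypothetical, and the actual table in vp8.c may be laid out differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical table: hev_lut[keyframe][filter_level], filter_level in 0..63. */
static uint8_t hev_lut[2][64];

/* Branchy reference version; thresholds assumed from the VP8 specification. */
static int hev_thresh_branchy(int level, int keyframe)
{
    assert(level >= 0 && level < 64);
    if (keyframe) {
        if (level >= 40) return 2;
        if (level >= 15) return 1;
        return 0;
    }
    if (level >= 40) return 3;
    if (level >= 20) return 2;
    if (level >= 15) return 1;
    return 0;
}

/* Fill the table once so that later lookups replace the comparisons. */
static void init_hev_lut(void)
{
    for (int key = 0; key < 2; key++)
        for (int level = 0; level < 64; level++)
            hev_lut[key][level] = (uint8_t)hev_thresh_branchy(level, key);
}
```

After init_hev_lut() has run, the per-macroblock code would read hev_lut[keyframe][filter_level] directly. A production decoder would more likely store the table as a static const array so that no initialisation step is needed.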
Jason Garrett-Glaser a1b227bb53 VP8: faster filter_level clip
2011-02-03 19:55:06 -08:00
Jason Garrett-Glaser dd18c9a050 VP8: simplify lf_delta mb mode logic
2011-02-03 19:55:02 -08:00
Jason Garrett-Glaser 64233e702a VP8: merge chroma MC calls
Adds some duplicated code, but avoids duplicate edge checks and similar.
~0.5% faster overall on Parkjoy test sample.
2011-01-31 20:46:54 -08:00
Jason Garrett-Glaser 73be29b0c4 Slightly simplify VP8 inter_predict
Merge an if and a switch.
2011-01-30 12:12:02 -08:00
Ronald S. Bultje 2e27959879 Move ff_emulated_edge_mc() into DSPContext.
2011-01-28 22:13:26 -05:00
Ronald S. Bultje 9d4bdcb714 Fix VP8 aliasing problems.
Replace * (uint32_t *) buf accesses with AV_WN32A/AV_COPY32.
2011-01-28 10:20:00 -05:00
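
Some background on the aliasing fix above: reading or writing a uint8_t buffer through a cast such as * (uint32_t *) buf violates C's strict-aliasing rules, so an optimising compiler may reorder or drop the access. The sketch below shows one portable alternative based on memcpy. The helpers write_u32 and copy_u32 are hypothetical stand-ins for illustration only; they are not the AV_WN32A/AV_COPY32 macros, whose definitions live in libavutil and may use other mechanisms.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store a 32-bit value into a byte buffer in native
 * byte order without casting the buffer pointer to uint32_t *. */
static inline void write_u32(uint8_t *dst, uint32_t v)
{
    assert(dst != NULL);
    memcpy(dst, &v, sizeof(v));   /* compilers typically emit a single store */
}

/* Hypothetical helper: copy four bytes between two non-overlapping buffers. */
static inline void copy_u32(uint8_t *dst, const uint8_t *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, sizeof(uint32_t));
}
```

With optimisation enabled, a fixed-size memcpy like this usually compiles to the same single load or store as the cast, while keeping the behaviour defined.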
Diego Elio Pettenò d36beb3f69 Add ff_ prefix to data symbols of encoders, decoders, hwaccel, parsers, bsf.
None of these symbols should be accessed directly, so declare them as
hidden.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-01-26 16:08:45 +00:00
Ronald S. Bultje 44002d8323 Don't do edge emulation unless the edge pixels will be used in MC.
Do not emulate larger edges than we will actually use for this round of
MC. Decoding goes from avg+SE 29.972+/-0.023sec to 29.856+/-0.023, i.e.
0.12sec or ~0.4% faster.
2011-01-25 13:50:16 -05:00
Ronald S. Bultje 7148da489e Fix valgrind invalid read on top MB rows with CODEC_FLAG_EMU_EDGE set.
Originally committed as revision 26168 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-30 14:33:21 +00:00
Ronald S. Bultje ee555de7dd Support CODEC_FLAG_EMU_EDGE in VP8 decoder.
Originally committed as revision 26117 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-28 17:37:19 +00:00
Stefano Sabatini e16f217ceb Use new imgutils.h API names, fix deprecation warnings.
Originally committed as revision 25058 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-07 19:15:29 +00:00
Jason Garrett-Glaser 2b476e02e1 Remove some stray +s in VP8
Originally committed as revision 24791 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-13 02:02:07 +00:00
Pascal Massimino aa93c52c21 remove b4_stride/mb_stride.
correct mb_xy to use mb_width.
tighten allocations.
reduce the amount of zeroing.

Originally committed as revision 24760 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-11 08:27:38 +00:00
Pascal Massimino ccf13f9e20 fix over-allocation. confused b4_stride with mb_width.
Originally committed as revision 24758 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-11 05:24:19 +00:00
Stefano Sabatini 6ce9b4310c Remove use of the deprecated function avcodec_check_dimensions(), use
av_check_image_size() instead.

Originally committed as revision 24711 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-06 09:37:04 +00:00
Jason Garrett-Glaser 7e13022a4d VP8: fix bug in prefetch
Motion vectors in VP8 are qpel, not fullpel.

Originally committed as revision 24707 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-05 20:03:54 +00:00
Jason Garrett-Glaser 905ef0d064 VP5/6/8: eliminate CABAC dependency
Create a custom table for VP5/6/8's renorm to avoid depending on H.264's.
Saves one instruction in the arithmetic decoder as well.

Originally committed as revision 24701 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-04 23:04:05 +00:00
Jason Garrett-Glaser 1e73967950 VP8: partially inline decode_block_coeffs
Avoids a function call in the case of empty DCT blocks (most of the time).

Originally committed as revision 24691 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-04 02:23:25 +00:00
Jason Garrett-Glaser ffbf0794f9 Fix 100L in r24689
Accidentally committed some timing code.

Originally committed as revision 24690 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-04 01:40:58 +00:00
Jason Garrett-Glaser afb54a85c3 VP8: simplify decode_block_coeffs to avoid having to track nonzero coeffs
Slightly faster.

Originally committed as revision 24689 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-04 01:38:08 +00:00
Jason Garrett-Glaser b0d5879513 VP8: slightly faster DCT coefficient probability update
Originally committed as revision 24687 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-03 23:21:47 +00:00
Jason Garrett-Glaser 476be414a4 VP8: make another RAC call branchy
1-2 clocks faster.

Originally committed as revision 24683 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-03 11:34:24 +00:00
Jason Garrett-Glaser 0908f1b945 VP8: unroll partition type decoding tree
~34% faster partition type decoding.

Originally committed as revision 24681 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-03 11:10:58 +00:00
Jason Garrett-Glaser c5dec7f137 VP8: unroll splitmv decoding tree
Much faster splitmv mode decoding.

Originally committed as revision 24680 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-03 10:37:14 +00:00
Jason Garrett-Glaser 23117d69c1 VP8: unroll MB mode decoding tree
~50% faster MB mode decoding, plus eliminate a costly switch.

Originally committed as revision 24679 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-03 10:24:28 +00:00
Jason Garrett-Glaser 370b622a45 VP8: eliminate a dereference in coefficient decoding
Originally committed as revision 24671 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-02 22:48:38 +00:00
Jason Garrett-Glaser f311208cf1 VP8: much faster DC transform handling
A lot of the time the DC block is empty: don't do the WHT in this case.
A lot of the rest of the time, there's only one coefficient: make a special
DC-only transform for that case.
When the block is empty, don't incorrectly mark luma DCT blocks as having DC
coefficients.

Originally committed as revision 24670 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-02 20:57:03 +00:00
Jason Garrett-Glaser 827d43bb9d VP8: move zeroing of luma DC block into the WHT
Lets us do the zeroing in asm instead of C.
Also makes it consistent with the way the regular iDCT code does it.

Originally committed as revision 24668 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-02 20:18:09 +00:00
Pascal Massimino d2840fa49c only store intra prediction modes on the boundary for keyframes, not as a plane.
inter-frame behaviour unchanged.

Originally committed as revision 24664 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-02 09:44:53 +00:00