Commit Graph

87 Commits

Author SHA1 Message Date
Anton Khirnov 23e85be58f h264: add a parameter to the CHROMA444 macro.
This way it does not look like a constant.
2013-03-21 10:21:02 +01:00
Anton Khirnov e962bd08ee h264: add a parameter to the CHROMA422 macro.
This way it does not look like a constant.
2013-03-21 10:20:58 +01:00
Anton Khirnov 6d2b6f21eb h264: add a parameter to the CABAC macro.
This way it does not look like a constant.
2013-03-21 10:20:52 +01:00
Anton Khirnov 7fa00653a5 h264: add a parameter to the FIELD_PICTURE macro.
This way it does not look like a constant.
2013-03-21 10:20:44 +01:00
Anton Khirnov 7bece9b22f h264: add a parameter to the FRAME_MBAFF macro.
This way it does not look like a constant.
2013-03-21 10:20:39 +01:00
Anton Khirnov da6be8fcec h264: add a parameter to the MB_FIELD macro.
This way it does not look like a constant.
2013-03-21 10:20:35 +01:00
Anton Khirnov 82313eaa34 h264: add a parameter to the MB_MBAFF macro.
This way it does not look like a constant.
2013-03-21 10:20:30 +01:00
Anton Khirnov 759001c534 lavc decoders: work with refcounted frames. 2013-03-08 07:38:30 +01:00
Diego Biurrun c242bbd8b6 Remove unnecessary dsputil.h #includes 2013-02-26 00:51:34 +01:00
Ronald S. Bultje 7ebfb466ae h264: Don't store intra pcm samples in h->mb
Instead, keep them in the bitstream buffer until we read them verbatim,
this saves a memcpy() and a subsequent clearing of the target buffer.
decode_cabac+decode_mb for a sample file (CAPM3_Sony_D.jsv) goes from
6121.4 to 6095.5 cycles, i.e. 26 cycles faster.

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-02-19 22:34:14 +02:00
Anton Khirnov 2c54155407 h264: deMpegEncContextize
Most of the changes are just trivial are just trivial replacements of
fields from MpegEncContext with equivalent fields in H264Context.
Everything in h264* other than h264.c are those trivial changes.

The nontrivial parts are:
1) extracting a simplified version of the frame management code from
   mpegvideo.c. We don't need last/next_picture anymore, since h264 uses
   its own more complex system already and those were set only to appease
   the mpegvideo parts.
2) some tables that need to be allocated/freed in appropriate places.
3) hwaccels -- mostly trivial replacements.
   for dxva, the draw_horiz_band() call is moved from
   ff_dxva2_common_end_frame() to per-codec end_frame() callbacks,
   because it's now different for h264 and MpegEncContext-based
   decoders.
4) svq3 -- it does not use h264 complex reference system, so I just
   added some very simplistic frame management instead and dropped the
   use of ff_h264_frame_start(). Because of this I also had to move some
   initialization code to svq3.

Additional fixes for chroma format and bit depth changes by
Janne Grunau <janne-libav@jannau.net>

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2013-02-15 16:35:16 +01:00
Diego Biurrun 88bd7fdc82 Drop DCTELEM typedef
It does not help as an abstraction and adds dsputil dependencies.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2013-01-22 18:32:56 -08:00
Ronald S. Bultje f6f7d15041 h264: don't touch H264Context->ref_count[] during MB decoding
The variable is copied to subsequent threads at the same time, so this
may cause wrong ref_count[] values to be copied to subsequent threads.

This bug was found using TSAN.

Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2012-10-05 02:49:45 +02:00
Diego Biurrun 1218777ffd avcodec: Convert some commented-out printf/av_log instances to av_dlog 2012-10-01 10:24:28 +02:00
Diego Biurrun 6f6b0311a3 avcodec: Drop some silly commented-out av_log() invocations 2012-10-01 10:24:28 +02:00
Mans Rullgard 0b6f973635 h264: use asm cabac reader under a generic condition
This removes a dependency on implementation details from generic
code and allows easy addition of the equivalent optimisation for
other architectures than x86.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-06-23 22:14:21 +01:00
Roland Scheidegger 9b9df1cdff h264: new assembly version of get_cabac for x86_64 with PIC
This adds a hand-optimized assembly version for get_cabac much like the
existing one, but it works if the table offsets are RIP-relative.
Compared to the non-RIP-relative version this adds 2 lea instructions
and it needs one extra register. get_cabac() gets about 40% faster, for
an overall speedup of about 5%.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-04-28 09:43:25 -07:00
Roland Scheidegger 14e9ffc1e4 h264: use one table instead of several for cabac functions
The reason is this is easier for PIC code (in particular on darwin...).
Keep the old names as pointers (static in cabac_functions.h so gcc
knows these are just immediate offsets) so the c code can nicely stay the same
(alternatively could use offsets directly in the functions needing the
tables). This should produce the same code as before with non-pic and better
code (confirmed) with pic.

The assembly uses the new table but still won't work for PIC case.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-04-28 08:26:12 -07:00
Diego Biurrun 0becb07842 h264: Factorize declaration of mb_sizes array. 2012-04-05 17:17:22 +02:00
Ronald S. Bultje 63a1b481f6 h264: fix cabac-on-stack after safe cabac reader. 2012-03-28 16:35:42 -07:00
Ronald S. Bultje d1604b3de9 h264: prevent overreads in intra PCM decoding.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
2012-02-29 13:17:34 -08:00
Ronald S. Bultje 45b7bd7c53 h264: disallow constrained intra prediction modes for luma.
Conversion of the luma intra prediction mode to one of the constrained
("alzheimer") ones can happen by crafting special bitstreams, causing
a crash because we'll call a NULL function pointer for 16x16 block intra
prediction, since constrained intra prediction functions are only
implemented for chroma (8x8 blocks).

Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
2012-02-09 22:57:01 -08:00
Diego Biurrun 55b9ef18e4 cabac: split cabac.h into declarations and function definitions
This fixes standalone compilation of some decoders with --disable-optimizations.
cabac.h defines some inline functions that use symbols from cabac.c.  Without
optimizations these inline functions are not eliminated and linking fails with
references to non-existing symbols.

Splitting the inline functions off into their own header and only #including
it in the places where the inline functions are used allows #including cabac.h
from anywhere without ill effects.
2012-01-12 23:08:23 +01:00
Martin Storsjö 676a9ee1d2 x86: Fix constraints for decode_significance*_x86
Originally, prior to 8742a4ff8, the caller code was compiled
within this condition:

ARCH_X86 && HAVE_7REGS && HAVE_EBX_AVAILABLE && !defined(BROKEN_RELOCATIONS)

Since HAVE_7REGS is defined as
(ARCH_X86_64 || (HAVE_EBX_AVAILABLE && HAVE_EBP_AVAILABLE))
the subcondition HAVE_7REGS && HAVE_EBX_AVAILABLE is equal
to HAVE_7REGS (for 32 bit at least). The correct simplification
of the original condition thus is HAVE_7REGS, not
HAVE_EBX_AVAILABLE.

This fixes compilation in some cases where HAVE_EBP_AVAILABLE = 0
and HAVE_EBX_AVAILABLE = 1.

Signed-off-by: Martin Storsjö <martin@martin.st>
2011-12-27 09:05:14 +02:00
Diego Biurrun 6fdb2ce34a x86: Tighten register constraints for decode_significance*_x86.
On 32-bit OS X with gcc 4.0/4.2 and shared libraries enabled, the ebx register
is not available, but required to assemble the functions.

This reverts commit 8742a4f to a simplified version of the original constraints.
2011-12-21 12:06:37 +01:00
Diego Biurrun 8742a4ff87 h264_cabac: synchronize decode_significance_*_x86 conditionals
The definition and the call site where under different #ifdefs.
2011-12-21 09:04:25 +01:00
Luca Barbato 5bf2ac2b37 error_resilience: use the ER_ namespace
Add the namespace to {AC_,DC_,MV_}{END,ERROR} macros

Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2011-12-13 16:20:58 +01:00
Diego Biurrun 58c42af722 doxygen: misc consistency, spelling and wording fixes 2011-12-12 23:06:23 +01:00
Baptiste Coudurier 76741b0e56 h264: 4:2:2 intra decoding support
Signed-off-by: Diego Biurrun <diego@biurrun.de>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-10-21 01:00:41 -07:00
Jason Garrett-Glaser 6c32576548 H.264: optimize CABAC x86 asm for Atom 2011-07-28 13:06:13 -07:00
Diego Biurrun 657ccb5ac7 Eliminate FF_COMMON_FRAME macro.
FF_COMMON_FRAME holds the contents of the AVFrame structure and is also copied
to struct Picture.  Replace by an embedded AVFrame structure in struct Picture.
2011-07-11 00:19:00 +02:00
Jason Garrett-Glaser 99b6d2c065 H.264: use fill_rectangle in CABAC decoding 2011-07-08 16:12:39 -07:00
Jason Garrett-Glaser 556f8a066c H.264: template left MB handling
Faster H.264 decoding with ALLOW_INTERLACE off.
2011-07-03 15:06:00 -07:00
Jason Garrett-Glaser 3b7ebeb4d5 H.264: faster write_back_*
Avoid aliasing, unroll loops, and inline more functions.
2011-07-03 15:05:55 -07:00
Jason Garrett-Glaser c90b94424c 4:4:4 H.264 decoding support
Note: this is 4:4:4 from the 2007 spec revision, not the previous (now deprecated) 4:4:4 mode in H.264.
2011-06-13 21:16:30 -07:00
Jason Garrett-Glaser 504811baea Roll back 4:4:4 H.264 for now
Needs some ARM/PPC asm modifications.
2011-06-13 13:38:46 -07:00
Jason Garrett-Glaser c9c493872c 4:4:4 H.264 decoding support
Note: this is 4:4:4 from the 2007 spec revision, not the previous (now deprecated) 4:4:4 mode in H.264.
2011-06-13 12:21:39 -07:00
Oskar Arvidsson fcc0224e4f Add support for higher QP values in h264.
In high bit depth, the QP values may now be up to (51 + 6*(bit_depth-8)).

Preparatory patch for high bit depth h264 decoding support.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-05-10 07:24:35 -04:00
Oskar Arvidsson 6e3ef511d7 Add the notion of pixel size in h264 related functions.
In high bit depth the pixels will not be stored in uint8_t like in the
normal case, but in uint16_t. The pixel size is thus 1 in normal bit
depth and 2 in high bit depth.

Preparatory patch for high bit depth h264 decoding support.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-05-10 07:24:33 -04:00
Stefano Sabatini 975a1447f7 Replace deprecated FF_*_TYPE symbols with AV_PICTURE_TYPE_*.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2011-05-02 12:18:44 +02:00
Mans Rullgard 2912e87a6c Replace FFmpeg with Libav in licence headers
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-03-19 13:33:20 +00:00
Ronald S. Bultje 66c6b5e2a5 Revert 2a1f431d38, it broke H264 lossless. 2011-01-20 17:24:44 -05:00
Ronald S. Bultje 8bcfe7f7fd Set gray (128) U/V planes for chroma-less samples. Fixes two fate samples
when played with -flags emu_edge.
2011-01-20 17:24:44 -05:00
Jason Garrett-Glaser b9af15402d Remove evil timers that snuck their way into r26375.
Originally committed as revision 26377 to svn://svn.ffmpeg.org/ffmpeg/trunk
2011-01-15 18:14:36 +00:00
Jason Garrett-Glaser fb2734c8a6 Fix r26375 on non-x86.
Originally committed as revision 26376 to svn://svn.ffmpeg.org/ffmpeg/trunk
2011-01-15 18:13:40 +00:00
Jason Garrett-Glaser f14bdd8e75 H.264: Partially inline CABAC residual decoding
Improves CABAC performance about ~1.2%.

Trick originates from x264 and has also been used in ffvp8.  It's useful because
coded block flags are usually zero, so it helps to have the early termination
inlined into the main function.

Originally committed as revision 26375 to svn://svn.ffmpeg.org/ffmpeg/trunk
2011-01-15 17:52:48 +00:00
Jason Garrett-Glaser 2a1f431d38 H.264/SVQ3: make chroma DC work the same way as luma DC
No speed improvement, but necessary for some future stuff.
Also opens up the possibility of asm chroma dc idct/dequant.

Originally committed as revision 26349 to svn://svn.ffmpeg.org/ffmpeg/trunk
2011-01-15 01:10:46 +00:00
Jason Garrett-Glaser 5657d14094 H.264: switch to x264-style tracking of luma/chroma DC NNZ
Useful so that we don't have to run the hierarchical DC iDCT if there aren't
any coefficients.  Opens up some future opportunities for optimization as well.

Originally committed as revision 26337 to svn://svn.ffmpeg.org/ffmpeg/trunk
2011-01-14 21:36:16 +00:00
Jason Garrett-Glaser 19fb234e4a H.264: split luma dc idct out and implement MMX/SSE2 versions
About 2.5x the speed.

NOTE: the way that the asm code handles large qmuls is a bit suboptimal.
If x264-style dequant was used (separate shift and qmul values), it might
be possible to get some extra speed.

Originally committed as revision 26336 to svn://svn.ffmpeg.org/ffmpeg/trunk
2011-01-14 21:34:25 +00:00
Diego Biurrun ba87f0801d Remove explicit filename from Doxygen @file commands.
Passing an explicit filename to this command is only necessary if the
documentation in the @file block refers to a file different from the
one the block resides in.

Originally committed as revision 22921 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-04-20 14:45:34 +00:00