About 30% faster on 32 bit Atom, 120% faster on 64 bit Phenom2.
This is interesting because supporting P16 is easier in e.g.
OpenGL (can misuse support for any 2-component 8 bit format),
whereas supporting p9/p10 without conversion needs a texture
format with at least 14 bits actual precision.
The shiftonly == 0 case is not optimized since the code is more
complex and the speed gain less obvious.
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
* qatar/master:
swscale: make yuv2yuv1 use named registers.
h264: mark h264_idct_add8_10 with number of XMM registers.
swscale: fix V plane memory location in bilinear/unscaled RGB/YUYV case.
vp8: always update next_framep[] before returning from decode_frame().
avconv: estimate next_dts from framerate if it is set.
avconv: better next_dts usage.
avconv: rename InputStream.pts to last_dts.
avconv: reduce overloading for InputStream.pts.
avconv: rename InputStream.next_pts to next_dts.
avconv: rework -t handling for encoding.
avconv: set encoder timebase for subtitles.
pva-demux test: add -vn
swscale: K&R formatting cosmetics for SPARC code
apedec: allow the user to set the maximum number of output samples per call
apedec: do not unnecessarily zero output samples for mono frames
apedec: allocate a single flat buffer for decoded samples
apedec: use sizeof(field) instead of sizeof(type)
swscale: split C output functions into separate file.
swscale: Split C input functions into separate file.
bytestream: Add bytestream2 writing API.
The avconv changes are due to massive regressions and bugs not merged yet.
Conflicts:
ffmpeg.c
libavcodec/vp8.c
libswscale/swscale.c
libswscale/x86/swscale_template.c
tests/fate/demux.mak
tests/ref/lavf/asf
tests/ref/lavf/avi
tests/ref/lavf/mkv
tests/ref/lavf/mpg
tests/ref/lavf/nut
tests/ref/lavf/ogg
tests/ref/lavf/rm
tests/ref/lavf/ts
tests/ref/seek/lavf_avi
tests/ref/seek/lavf_mkv
tests/ref/seek/lavf_rm
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
FATE: add tests for targa
ARM: fix Thumb-mode simple_idct_arm
ARM: 4-byte align start of all asm functions
rgb2rgb: rgb12to15()
swscale-test: fix stack overread.
swscale: fix invalid conversions and memory problems.
cabac: split cabac.h into declarations and function definitions
cabac: Mark ff_h264_mps_state array as static, it is only used within cabac.c.
cabac: Remove ff_h264_lps_state array.
Conflicts:
libswscale/rgb2rgb.h
libswscale/swscale_unscaled.c
tests/fate/image.mak
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Fixes problems where rgbToRgbWrapper() is called even though it doesn't
support this particular conversion (e.g. converting from RGB444 to
anything). Thirdly, fixes issues where rgbToRgbWrapper() is called for
non-native endiannness conversions (e.g. RGB555BE on a LE system).
Fourthly, fixes crashes when converting from e.g. monowhite to
monowhite, which calls planarCopyWrapper() and overwrites/reads because
n_bytes != n_pixels.
* qatar/master:
fate: split off vqf/twinvq FATE tests into their own file
fate: split off mpc FATE tests into their own file
fate: split off libavcodec FATE tests into their own file
fate: split off Microsoft codec FATE tests into their own file
fate: group all VP* codec FATE tests together in one file
swscale: prevent invalid writes in packed_16bpc_bswap
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
h264: clear trailing bits in partially parsed NAL units
vc1: Handle WVC1 interlaced stream
xl: Fix overreads
mpegts: rename payload_index to payload_size
segment: introduce segmented chain muxer
lavu: add AVERROR_BUG error value
avplay: clear pkt_temp when pkt is freed.
qcelpdec: K&R formatting cosmetics
qcelpdec: cosmetics: drop some pointless parentheses
x86: conditionally compile dnxhd encoder optimizations
Revert "h264: skip start code search if the size of the nal unit is known"
swscale: fix formatting and indentation of unscaled conversion routines.
h264: skip start code search if the size of the nal unit is known
cljr: fix buf_size sanity check
cljr: Check if width and height are positive integers
Conflicts:
libavcodec/cljr.c
libavcodec/vc1dec.c
libavformat/Makefile
libavformat/mpegtsenc.c
libavformat/segment.c
libswscale/swscale_unscaled.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
get_bits: remove A32 variant
avconv: support stream specifiers in -metadata and -map_metadata
wavpack: Fix 32-bit clipping
wavpack: Clip samples after shifting
h264: don't drop B-frames after next keyframe on POC reset.
get_bits: remove useless pointer casts
configure: refactor lists of tests and components into variables
rv40: NEON optimised weak loop filter
mpegts: replace some magic numbers with the existing define
swscale: add unscaled packed 16 bit per component endianess conversion
Conflicts:
libavcodec/get_bits.h
libavcodec/h264.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Although gcc guarantees 16 byte stack alignment, threads under WinXP
don't appear to be guaranteed to start stack aligned. So fix the
alignment.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
libswscale/swscale_unscaled.c:915:9: warning: new qualifiers in middle of multi-level non-const cast are unsafe
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
libswscale/swscale_unscaled.c:805:5: warning: passing argument 1 of ‘check_image_pointers’ from incompatible pointer type
libswscale/swscale_unscaled.c:774:12: note: expected ‘uint8_t **’ but argument is of type ‘const uint8_t * const*’
libswscale/swscale_unscaled.c:809:5: warning: passing argument 1 of ‘check_image_pointers’ discards qualifiers from pointer target type
libswscale/swscale_unscaled.c:774:12: note: expected ‘uint8_t **’ but argument is of type ‘uint8_t * const*’
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
lavc: Deprecate unused FF_ER_VERY_AGGRESSIVE
x11grab: add show_region AVOption.
x11grab: add follow_mouse AVOption.
Do not convert RGB buffer at once when stride does not fit exact samples.
Conflicts:
libswscale/swscale_unscaled.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
When converting RGB format to RGB format with the same bits per sample,
unscaled path performs conversion on the whole buffer at once. For
non-multiple-of-16 BGR24 to RGB24 conversion it means that padding at the
end of line will be converted too. Since it may be of arbitrary length
(e.g. 8 bytes), operating on the whole buffer produces obviously wrong
results.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>