Commit Graph

1005 Commits

Author SHA1 Message Date
Janne Grunau 6e9bb5aa3e swscale: prevent invalid writes in packed_16bpc_bswap
Writes past the end of the destination buffer were occuring when its
stride was smaller than the stride of the source. Fixes Bug #183.
2011-12-26 15:50:17 +01:00
Anton Khirnov 131609dc2a sws: readd PAL8 to isPacked()
Fixes PAL8 to YUV conversion.
2011-12-22 11:01:28 +01:00
Nathan Adil Maxson 7b3894bee9 swscale: fix formatting and indentation of unscaled conversion routines.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-12-18 15:32:08 -08:00
Ronald S. Bultje d49352c7cc swscale: fix overflows in vertical scaling at top/bottom edges.
This fixes integer multiplication overflows in RGB48 output
(vertical) scaling as detected by IOC. What happens is that for
certain types of filters (lanczos, spline, bicubic), the
intermediate sum of coefficients in the middle of a filter can
be larger than the fixed-point equivalent of 1.0, even if the
final sum is 1.0. This is fine and we support that.

However, at frame edges, initFilter() will merge the coefficients
for the off-screen pixels into the top or bottom pixel, such as
to emulate edge extension. This means that suddenly, a single
coefficient can be larger than the fixed-point equivalent of
1.0, which the vertical scaling routines do not support.

Therefore, remove the merging of coefficients for edges for
the vertical scaling filter, and instead add edge detection
to the scaler itself so that it copies the pointers (not data)
for the edges (i.e. it uses line[0] for line[-1] as well), so
that a single coefficient is never larger than the fixed-point
equivalent of 1.0.
2011-12-18 08:27:43 -08:00
Ronald S. Bultje 72dafea0fc swscale: fix overflow in gray16 vertical scaling.
This fixes the same overflow as in the RGB48/16-bit YUV scaling;
some filters can overflow both negatively and positively (e.g.
spline/lanczos), so we bias a signed integer so it's "half signed"
and "half unsigned", and can cover overflows in both directions
while maintaining full 31-bit depth.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-12-17 22:41:53 +00:00
Mans Rullgard 77d88b872d swscale: fix integer overflows in RGB pixel writing.
We're shifting individual components (8-bit, unsigned) left by 24,
so making them unsigned should give the same results without the
overflow.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-12-17 18:59:24 +00:00
Janne Grunau b4dc68803b swscale: add endian conversion for RGB555 and RGB444 pixel formats
Add a macro to shorten the if condition.
2011-12-17 19:52:19 +01:00
Ronald S. Bultje be1bafc303 swscale: fix overflows in output of RGB48 pixels.
For certain types of filters where the intermediate sum of coefficients
can go above the fixed-point equivalent of 1.0 in the middle of a filter,
the sum of a 31-bit calculation can overflow in both directions and can
thus not be represented in a 32-bit signed or unsigned integer. To work
around this, we subtract 0x40000000 from a signed integer base, so that
we're halfway signed/unsigned, which makes it fit even if it overflows.
After the filter finishes, we add the scaled bias back after a shift.

We use the same trick for 16-bit bpc YUV output routines.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-12-17 18:36:20 +00:00
Janne Grunau ed46a3d842 swscale: add rgb565 endianess conversion 2011-12-17 19:35:16 +01:00
Ronald S. Bultje 4391805916 swscale: fix overflows in RGB rounding constants.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-12-17 14:36:09 +00:00
Janne Grunau 72cb904453 swscale: add unscaled packed 16 bit per component endianess conversion 2011-12-16 01:41:33 +01:00
Diego Biurrun 3c62a71486 swscale_mmx: drop no longer required parameters from VSCALEX macros 2011-12-14 12:00:44 +01:00
Diego Biurrun 52de07e1f1 swscale: Mark yuv2planeX_8_mmx as MMX2; it contains MMX2 instructions. 2011-12-14 11:58:46 +01:00
Mans Rullgard 878dda5db1 build: move inclusion of subdir.mak to main subdir loop
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-12-13 14:26:49 +00:00
Diego Biurrun 58c42af722 doxygen: misc consistency, spelling and wording fixes 2011-12-12 23:06:23 +01:00
Mans Rullgard 373211d828 Remove extraneous semicolons
These semicolons cause invalid empty top-level declarations.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-12-11 17:23:24 +00:00
Reinhard Tartler 5089ce1b5a swscale: #include "libavutil/mathematics.h"
this file uses the M_PI macro since
4e74187db2, so include the correct header
directly.

Signed-off-by: Reinhard Tartler <siretart@tauware.de>
2011-12-01 19:12:26 +01:00
Mans Rullgard 7c5ce99bd9 swscale: fix signed overflow in yuv2mono_X_c_template
As old bits are shifted out of the accumulator, they cause signed
overflows when they reach the end.  Making the variable unsigned fixes
this.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-11-26 22:53:47 +00:00
Martin Storsjö f32dfad9dc swscale: Readd #define _SVID_SOURCE
This was removed erroneously in
046f081b46. This define still is
necessary for getting MAP_ANONYMOUS defined on linux/glibc,
despite the define reshuffling done in that commit.

Without MAP_ANONYMOUS defined, the mprotect calls for setting the
generated mmx2 scaler code pages executable are left out, causing
crashes if that codepath is chosen.

This patch fixes scaling from 192x144 to 320x240 with
-sws_flags fast_bilinear, which crashes on linux at the
moment.

Signed-off-by: Martin Storsjö <martin@martin.st>
2011-11-25 19:59:15 +02:00
Ronald S. Bultje f7f1835258 swscale: fix failing fate tests.
isGray() is left as a FIXME for later.
2011-11-24 12:21:03 -08:00
Ronald S. Bultje 185655c601 swscale: add support for planar RGB input. 2011-11-24 10:40:05 -08:00
Ronald S. Bultje 6b0768e202 Clean up swscale pixfmt macros using av_pix_fmt_descriptors[]. 2011-11-24 08:24:55 -08:00
John Stebbins 09d243ddd0 swscale: Fix stack alignment for SSE
Although gcc guarantees 16 byte stack alignment, threads under WinXP
don't appear to be guaranteed to start stack aligned.  So fix the
alignment.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-11-19 09:55:07 -08:00
Ronald S. Bultje 8283f90a52 swscale: handle unaligned buffers in yuv2plane1
The issue had been introduced in
c435653627

Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2011-11-13 08:27:20 +01:00
Sean McGovern 124e56454d swscale: add padding to conversion buffer.
Altivec does unaligned reads from this buffer in
hscale_altivec_real(), and can thus read up to 16 bytes beyond
the end of the buffer. Therefore, add an extra 16 bytes of
padding at the end of the conversion buffer.

This fixes fate-lavfi-pixfmts_scale on AltiVec-enabled builds
under valgrind.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-11-11 07:44:35 -08:00
Ronald S. Bultje c435653627 swscale: write yuv2plane1 MMX/SSE2/SSE4/AVX functions. 2011-11-05 20:48:14 -07:00
Ronald S. Bultje 1deb08fcb6 swscale: align vertical filtersize by 2 on x86.
The vertical scaler handles 2 rows at a time and thus requires
alignment by 2, or else it'll read invalid memory and result in
corrupt output.
2011-11-05 07:06:38 -07:00
Ronald S. Bultje 9e66b892e8 swscale: add missing colons to x86 assembly yuv2planeX.
This fixes assembling using "nasm".
2011-10-23 09:44:03 -07:00
Ronald S. Bultje f48b12e0a6 swscale: update altivec yuv2planeX asm to new per-plane API. 2011-10-22 10:35:14 -07:00
Ronald S. Bultje 6cacecdca3 swscale: make yuv2yuvX_10_sse2/avx 8/9/16-bits aware.
Also implement MMX/MMX2 versions and SSE4 versions.
2011-10-22 10:35:14 -07:00
Kieran Kunhya 7fbbf95293 yuv2planeX10 SIMD
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-10-22 10:35:14 -07:00
Ronald S. Bultje 109f62e8f8 swscale: decide whether to use yuv2plane1/X on a per-plane basis. 2011-10-22 10:35:14 -07:00
Ronald S. Bultje f99654d470 swscale: reintroduce full precision in 16-bit output. 2011-10-22 10:35:14 -07:00
Kieran Kunhya ff7913aef1 Split up yuv2yuvX functions
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-10-22 10:35:13 -07:00
Kieran Kunhya 34e8d147b3 Split out yuv2yuv1 luma and chroma in order to make them generic DSP functions
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-10-22 10:35:13 -07:00
Mans Rullgard 41ac093f7e swscale: fix signed shift overflows in ff_yuv2rgb_c_init_tables()
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-10-21 20:56:59 +01:00
Ronald S. Bultje dc49bf1270 sws/pixfmt/pixdesc: add support for yuv420p9le/be. 2011-10-21 00:58:01 -07:00
Ronald S. Bultje 8305041e13 swscale: prevent overflow in coefficient calculation. 2011-10-21 00:14:11 -07:00
Ronald Bultje d1d421cbc0 swscale: prevent overflow during initialization
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2011-10-18 10:29:49 +02:00
Anton Khirnov 145f741e11 AVOptions: rename FF_OPT_TYPE_* => AV_OPT_TYPE_* 2011-10-12 16:51:16 +02:00
Anton Khirnov 04de1569cd sws: support yuv444p9/10 output. 2011-10-12 08:27:30 +02:00
Ronald S. Bultje 6aa3cac6bf swscale: use aligned move for storage into temporary buffer.
The intermediate buffer is always aligned.
2011-10-11 07:50:48 -07:00
Mans Rullgard d853e571ad ppc: fix some pointer to integer casts
Use uintptr_t instead of plain int.  Without this change, the
comparisons will come out wrong for pointers in certain ranges.
Fixes random failures on ppc64.  Also fixes some compiler warnings.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-09-25 18:33:38 +01:00
Kieran Kunhya 4d4d0e8176 Fix unnecessary shift with 9/10bit vertical scaling
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2011-09-23 02:13:30 +02:00
Ronald S. Bultje ea540401d6 swscale: fix byte overreads in SSE-optimized hscale().
SSE-optimized hScale() scales up to 4 pixels at once, so we need to
allocate up to 3 padding pixels to prevent overreads. This fixes
valgrind errors in various swscale-tests on fate.
2011-09-15 07:30:46 -07:00
Ronald S. Bultje e0c3e07387 sws: implement MMX/SSE2/SSSE3/SSE4 versions for horizontal scaling.
Speed: from 3.9x to 9.6x speed improvement over C, and some small
(up to 15%) speed improvements over existing MMX code (particularly
for bigger filters).
2011-09-13 09:53:42 -07:00
Anton Khirnov fb4ca26bdb lavf,lavc,sws: add {avcodec,avformat,sws}_get_class() functions. 2011-09-03 20:53:35 +02:00
Ronald S. Bultje 3f04ab4fcd swscale: split hScale() function pointer into h[cy]Scale().
This allows using more specific implementations for chroma/luma, e.g.
we can make assumptions on filterSize being constant, thus avoiding
that test at runtime.
2011-08-17 20:56:06 -07:00
Luca Barbato 3304a1e69a swscale: add dithering to yuv2yuvX_altivec_real
It just does that part in scalar form, I doubt using a vector store
over 2 array would speed it up particularly.

The function should be written to not use a scratch buffer.
2011-08-13 00:06:04 +02:00
Ronald S. Bultje 28c1115a91 swscale: use 15-bit intermediates for 9/10-bit scaling. 2011-08-12 11:54:25 -07:00