mpv/DOCS/tech/colorspaces.txt

109 lines
4.7 KiB
Plaintext

In general
==========
There are planar and packed modes.
- Planar mode means: you have 3 separated image, one for each component,
each image 8 bits/pixel. To get the real colored pixel, you have to
mix the components from all planes. The resolution of planes may differ!
- Packed mode means: you have all components mixed/interleaved together,
so you have small "packs" of components in a single, big image.
There are RGB and YUV colorspaces.
- RGB: Read, Green and Blue components. Used by analog VGA monitors.
- YUV: Luminance (Y) and Chrominance (U,V) components. Used by some
video systems, like PAL. Also most m(j)peg/dct based codecs use this.
With YUV, they used to reduce the resolution of U,V planes:
The most common YUV formats:
fourcc: bpp: IEEE: plane sizes: (w=width h=height of original image)
444P 24 YUV 4:4:4 Y: w * h U,V: w * h
YUY2,UYVY 16 YUV 4:2:2 Y: w * h U,V: (w/2) * h [MJPEG]
YV12,I420 12 YUV 4:2:0 Y: w * h U,V: (w/2) * (h/2) [MPEG, h263]
411P 12 YUV 4:1:1 Y: w * h U,V: (w/4) * h [DV-NTSC, CYUV]
YVU9,IF09 9 YUV 4:1:0 Y: w * h U,V: (w/4) * (h/4) [Sorenson, Indeo]
The YUV a:b:c naming style means: for <a> samples of Y there are <b> samples
of UV in odd lines and <c> samples of UV in even lines.
conversion: (some cut'n'paste from www and maillist)
RGB to YUV Conversion:
Y = (0.257 * R) + (0.504 * G) + (0.098 * B) + 16
Cr = V = (0.439 * R) - (0.368 * G) - (0.071 * B) + 128
Cb = U = -(0.148 * R) - (0.291 * G) + (0.439 * B) + 128
YUV to RGB Conversion:
B = 1.164(Y - 16) + 2.018(U - 128)
G = 1.164(Y - 16) - 0.813(V - 128) - 0.391(U - 128)
R = 1.164(Y - 16) + 1.596(V - 128)
In both these cases, you have to clamp the output values to keep them in
the [0-255] range. Rumour has it that the valid range is actually a subset
of [0-255] (I've seen an RGB range of [16-235] mentioned) but clamping the
values into [0-255] seems to produce acceptable results to me.
Julien (sorry, I can't call back his surname) suggests that there are
problems with the above formula and suggests the following instead:
Y = 0.299R + 0.587G + 0.114B
Cb = U'= (B-Y)*0.565
Cr = V'= (R-Y)*0.713
with reciprocal versions:
R = Y + 1.403V'
G = Y - 0.344U' - 0.714V'
B = Y + 1.770U'
note: this formula doesn't contain the +128 offsets of U,V values!
Conclusion:
Y = luminance, the weighted average of R G B components. (0=black 255=white)
U = Cb = blue component (0=green 128=grey 255=blue)
V = Cr = red component (0=green 128=grey 255=red)
Huh. The planar YUV modes.
==========================
The most misunderstood thingie...
In MPlayer, we usually have 3 pointers to the Y, U and V planes, so it
doesn't matter what is the order of the planes in the memory:
for mp_image_t and libvo's draw_slice():
planes[0] = Y = luminance
planes[1] = U = Cb = blue
planes[2] = V = Cr = red
Note: planes[1] is ALWAYS U, and planes[2] is V, the fourcc
(YV12 vs. I420) doesn't matter here! So, every codecs using 3 pointers
(not only the first one) normally supports YV12 and I420 (=IYUV) too!
But there are some codecs (vfw, dshow) and vo drivers (xv) ignoring the 2nd
and 3rd pointer, and use only a single pointer to the planar yuv image. In
this case we must know the right order and alignment of planes in the memory!
from the webartz fourcc list:
YV12: 12 bpp, full sized Y plane followed by 2x2 subsampled V and U planes
I420: 12 bpp, full sized Y plane followed by 2x2 subsampled U and V planes
IYUV: the same as I420
YVU9: 9 bpp, full sized Y plane followed by 4x4 subsampled V and U planes
Huh 2. RGB vs. BGR ?
====================
The 2nd most misunderstood thingie...
You know, there are Intel and Motorola, and they use different byteorder.
There are also others, like MIPS or Alpha, they all follow either Intel
or Motorola byteorder.
Unfortunately, the packed colorspaces depend on CPU byteorder. So, RGB
on Intel and Motorola means different order of bytes.
In MPlayer, we have constants IMGFMT_RGBxx and IMGFMT_BGRxx.
Unfortunately, some codecs and vo drivers follow Intel, some follow Motorola
byteorder, so they are incompatible. We had to find a stable base, so long
time ago I've chosen OpenGL, as it's a wide-spreaded standard, and it well
defines what RGB is and what BGR is. So, MPlayer's RGB is compatible with
OpenGL's GL_RGB on all platforms, and the same goes for BGR - GL_BGR.
Unfortunately, most of the x86 codecs call our BGR to RGB, so it sometimes
confuse developers.
If you are unsure, try the OpenGL driver (-vo gl:manyfmts). There is at least
software OpenGL implementation for all major platforms and OS's, but you will
need OpenGL version >= 1.2 for most formats.