Remove the vaguely defined plane_bits and component_bits fields from
struct mp_imgfmt_desc. Add weird replacements for existing uses. Remove
the bytes[] field, replace uses with bpp[].
Fix some potential alignment issues in existing code. As a compromise,
split mp_image_pixel_ptr() into 2 functions, because I think it's a bad
idea to implicitly round, but for some callers being slightly less
strict is convenient.
This shouldn't really change anything. In fact, it's a 100% useless
change. I'm just cleaning up what I started almost 8 years ago (see
commit 00653a3eb0). With this I've decided to keep mp_imgfmt_desc,
just removing the weird parts, and keeping the saner parts.
For whatever it's worth. Instead of doing a pretty stupid conversion to
float, just blend it directly. This works for most RGB formats that are
8 bits per component or below (the latter because we expand packed
fringe RGB formats for simplicity). For higher bit depth RGB this would
need extra code.
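
For reference, the direct 8-bit blend amounts to something like this
(illustrative only, not the literal draw_bmp.c code; assumes the source is
already premultiplied by alpha):

    #include <stdint.h>

    // Blend one premultiplied source component onto a destination component,
    // both 8 bits, alpha in [0,255]: dst = src + dst * (255 - a) / 255.
    // The +127 gives rounded division by 255.
    static uint8_t blend_px8(uint8_t src, uint8_t dst, uint8_t a)
    {
        return (uint8_t)((src * 255u + dst * (255u - a) + 127u) / 255u);
    }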
This checked params->color.space for being RGB. If the colorspace is
unset, this did dumb things because even if the imgfmt was an RGB one,
the colorspace was not set to RGB. This actually also happened to the
tests.
(Short-cutting RGB like this is actually wrong, since RGB could still
have strange gamma or primaries, which would warrant a full conversion.
So you'd need to check for these other parameters as well. To be fixed
later.)
Maybe this is useful for some of the lesser VOs. It's preferable to
bad ad-hoc solutions based on the more complex sub_bitmap data
structures (as observed e.g. in vo_vaapi.c), and does not use that much
more code since draw_bmp already created such an overlay internally.
But I still wanted something that avoids having to upload/render a full
screen-sized overlay if for example there's only a tiny subtitle line on
the bottom of the screen. So the new API can return a list of modified
pixels (for upload) and non-transparent pixels (for display). The way
these pixel rectangles are computed is a bit dumb and returns dumb
results, but it should be usable, and the implementation can change.
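
To illustrate why the rectangle lists matter: a VO can limit its upload to
the modified regions instead of re-uploading a screen-sized overlay. A rough
sketch (the rect convention mirrors x0/y0/x1/y1; the upload target and the
function are hypothetical):

    #include <stdint.h>
    #include <stddef.h>
    #include <string.h>

    struct rect { int x0, y0, x1, y1; };

    // Copy only the modified rectangles of the overlay into a mapped upload
    // buffer (both assumed to be packed pixels of 'bpp' bytes each).
    static void upload_dirty(uint8_t *dst, ptrdiff_t dst_stride,
                             const uint8_t *src, ptrdiff_t src_stride, int bpp,
                             const struct rect *rcs, int num_rcs)
    {
        for (int i = 0; i < num_rcs; i++) {
            const struct rect *r = &rcs[i];
            for (int y = r->y0; y < r->y1; y++) {
                memcpy(dst + y * dst_stride + (ptrdiff_t)r->x0 * bpp,
                       src + y * src_stride + (ptrdiff_t)r->x0 * bpp,
                       (size_t)(r->x1 - r->x0) * bpp);
            }
        }
    }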
They are sort of confusing, and they hide the fact that they have an
alpha component. Using the actual formats directly is no problem, since
these were useful only for big endian systems, something we can't test
anyway.
draw_bmp.c is the software blender for subtitles and OSD. It's used by
encoding mode (burning subtitles), and some VOs, like vo_drm, vo_x11,
vo_xv, and possibly more.
This changes the algorithm from upsampling the video to 4:4:4 and then
blending, to downsampling the OSD and then blending it directly into the
video.
This has far-reaching consequences for its internals, and results in an
effective rewrite.
Since I wanted to avoid un-premultiplying, all blending is done with
premultiplied alpha. That's actually the sane thing to do. The old code
just didn't do it, because it's very weird in YUV fixed point.
Essentially, you'd have to compensate for the chroma centering constant
by subtracting src_alpha/255*128. This seemed so hairy (especially with
correct rounding and high bit depths involved) that I went for using
float.
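
For reference, the premultiplied blend itself is a single multiply-add per
component, and in float the chroma offset problem can be sidestepped by
re-centering chroma around 0 before blending. A simplified sketch (values
normalized to [0,1]; not the actual mpv code):

    // src_pm is the premultiplied source value; for chroma it is centered
    // around 0, so the blend needs no offset compensation.
    static float blend_pm(float src_pm, float dst, float src_alpha)
    {
        return src_pm + dst * (1.0f - src_alpha);
    }

    // For chroma planes, shift the stored value to be centered around 0,
    // blend, and shift back:
    static float blend_chroma(float src_pm, float dst_stored, float src_alpha)
    {
        return blend_pm(src_pm, dst_stored - 0.5f, src_alpha) + 0.5f;
    }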
I think it turned out mostly OK, although it's more complex and less
maintainable than before. reinit() is certainly a bit too long. While it
should be possible to optimize the RGB path more (for example by
blending directly instead of doing the stupid float conversion), overall
the new code is probably slower than before. vo_xv users probably lose the
most, because it takes the slowest path (due to subsampling requirements
and using YUV).
Why this rewrite? Nobody knows. I simply forgot the reason. But you'll
have it anyway. Whether or not this would have required a full rewrite,
at least it supports target alpha now (you can for example hard sub
transparent PNGs, if you ever wanted to use mpv for this).
Remove the check in vf_sub. The new draw_bmp.c is not as reliant on
libswscale anymore (mostly uses repack.c now), and osd.c now shows an
error message on missing support instead.
Formats with chroma subsampling of 4 are not supported, because FFmpeg
doesn't provide pixfmt definitions for alpha variants. We could provide
those ourselves (relatively trivial), but why bother.
Making OSD/subtitle bitmaps refcounted was planned a long time ago,
e.g. the sub_bitmaps.packed field (which refcounts the subtitle bitmap
data) was added in 2016. But nothing benefited much from it, because
struct sub_bitmaps was usually stack allocated, and there was this weird
callback stuff through osd_draw().
Make it possible to get actually refcounted subtitle bitmaps on the OSD
API level. For this, we just copy all subtitle data other than the
bitmaps with sub_bitmaps_copy(). At first, I had planned some fancy
refcount shit, but when that was a big mess and hard to debug and just
boiled down to emulating malloc(), I made it a full allocation+copy. This
affects mostly the parts array. With crazy ASS subtitles, this parts
array can get pretty big (thousands of elements or more), in which case
the extra alloc/copy could become performance relevant. But then again
this is just pure bullshit, and I see no need to care. In practice, this
extra work most likely gets drowned out by libass murdering a single
core (while mpv is waiting for it) anyway. So fuck it.
I just wanted this so draw_bmp.c requires only a single call to render
everything. VOs also can benefit from this, because the weird callback
shit isn't necessary anymore (simpler code), but I haven't done anything
about it yet. In general I'd hope this will work towards simplifying the
OSD layer, which is a prerequisite for making actual further improvements.
I haven't tested some cases such as the "overlay-add" command. Maybe it
crashes now? Who knows, who cares.
In addition, it might be worthwhile to reduce the code duplication
between all the things that output subtitle bitmaps (with repacking,
image allocation, etc.), but that's orthogonal.
UB sanitizer complains that aval<<24 (if >=128) cannot be represented as
int. Indeed, we would shift a bit into the sign of an int, which is
probably UB or implementation defined (I can't even remember, but the
stupidity of it burns). So technically, ubsan might be right.
Change aval to uint32_t, which I don't think has a chance of getting
promoted to int. Change the other *val to uint32_t too for cosmetic
symmetry.
So we have to obscure the intention of the code (*val can take only 8
bits) because of language stupidity. How nice. (What a shitty language.)
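
The change itself is tiny; a minimal sketch of the pattern (names follow
the text, not the exact mpv code):

    #include <stdint.h>

    // With a narrower type, aval is promoted to (signed) int, and
    // aval << 24 with aval >= 128 shifts into the sign bit. With uint32_t
    // the whole expression stays unsigned and the shift is well defined.
    static uint32_t pack_argb(uint32_t aval, uint32_t rval,
                              uint32_t gval, uint32_t bval)
    {
        return (aval << 24) | (rval << 16) | (gval << 8) | bval;
    }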
First we shift the values up to the actual amount of bits in draw_ass,
so that they will be drawn correctly when using formats with more than
8 bpc. (draw_rgba is already correct w.r.t. RGB formats with 9 or more
bpc)
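
The shift in question is just the usual 8-bit-to-N-bit expansion; roughly
(assuming 8-bit libass values and a target depth of 9 to 16 bits):

    #include <stdint.h>

    // Scale an 8-bit component to a format with 'bits' bits per component.
    // A plain shift is enough for blending; bit replication
    // ((v << (bits - 8)) | (v >> (16 - bits))) would map 255 to the exact
    // maximum instead.
    static uint16_t expand_8_to_bits(uint8_t v, int bits)
    {
        return (uint16_t)(v << (bits - 8));
    }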
Then, in scale_sb_rgba, set the number of bits per channel used for
planar RGB formats (formats are always planar at this point in draw_bmp)
to match the source for 9 to 16 bpc sources (in effect, all the various
GBRP formats). This lets these formats hit the special case in chroma_up
and chroma_down that needs no conversion (as long as the source itself is
a planar format), so we write directly to the combined dst/src buffer.
This in turn works around a bug (incorrect colors) in libswscale when
scaling between GBRP formats with 9 or more bpc. Additionally, this
should be more efficient, since we skip up- and down-conversion and
temporary buffers.
There are two reasons for this:
1. I tend to add new fields to this metadata, and every time I've done
so I've consistently forgotten to update all of the dozens of places in
which this colorimetry metadata might end up getting used. While most
usages don't really care about most of the metadata, sometimes the
intent was simply to “copy” the colorimetry metadata from one struct to
another. With this being inside a substruct, those lines of code can now
simply read a.color = b.color without having to care about added or
removed fields.
2. It makes the type definitions nicer for upcoming refactors.
In going through all of the usages, I also expanded a few where I felt
that omitting the “young” fields was a bug.
This covers source files which were added in mplayer2 and mpv times
only, and where all code is covered by LGPL relicensing agreements.
There are probably more files to which this applies, but I'm being
conservative here.
A file named ao_sdl.c exists in MPlayer too, but the mpv one is a
complete rewrite, and was added some time after the original ao_sdl.c
was removed. The same applies to vo_sdl.c, for which the SDL2 API is
radically different in addition (MPlayer supports SDL 1.2 only).
common.c contains only code written by me. But common.h is a strange
case: although it originally was named mp_common.h and exists in MPlayer
too, by now it contains only definitions written by uau and me. The
exceptions are the CONTROL_ defines - thus not changing the license of
common.h yet.
codec_tags.c contained once large tables generated from MPlayer's
codecs.conf, but all of these tables were removed.
From demux_playlist.c I'm removing a code fragment from someone who was
not asked; this probably could be done later (see commit 15dccc37).
misc.c is a bit complicated to reason about (it was split off from
mplayer.c and thus contains random functions from that file), but actually all
functions have been added post-MPlayer. Except get_relative_time(),
which was written by uau, but closely resembles the 3 different versions
of similar code in the Unix/win32/OSX timer source files. I'm
not sure what that means in regards to copyright, so I've just moved it
into another still-GPL source file for now.
screenshot.c once had some minor parts of MPlayer's vf_screenshot.c, but
they're all gone.
Merge blend_src8_alpha and blend_src16_alpha into blend_src_alpha, and
the same for blend_const_alpha.
One thing that changes is that the vertical loop is now shared for both
code paths.
I think this is slightly easier to read, and it's a bit shorter as well.
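
The resulting shape is roughly the following: one shared vertical loop, with
the per-component-size handling inside it (illustrative only; the blend math
here is the generic src*a + dst*(255-a) form, not necessarily the exact mpv
code):

    #include <stdint.h>
    #include <stddef.h>

    static void blend_src_alpha(void *dst, ptrdiff_t dst_stride,
                                const void *src, ptrdiff_t src_stride,
                                const uint8_t *srca, ptrdiff_t srca_stride,
                                int w, int h, int bytes_per_comp)
    {
        // Shared vertical loop for both the 8 bit and the 16 bit path.
        for (int y = 0; y < h; y++) {
            const uint8_t *a = srca + y * srca_stride;
            if (bytes_per_comp == 1) {
                uint8_t *d = (uint8_t *)dst + y * dst_stride;
                const uint8_t *s = (const uint8_t *)src + y * src_stride;
                for (int x = 0; x < w; x++)
                    d[x] = (uint8_t)((s[x] * a[x] + d[x] * (255 - a[x]) + 127) / 255);
            } else {
                uint16_t *d = (uint16_t *)((uint8_t *)dst + y * dst_stride);
                const uint16_t *s =
                    (const uint16_t *)((const uint8_t *)src + y * src_stride);
                for (int x = 0; x < w; x++)
                    d[x] = (uint16_t)((s[x] * a[x] + d[x] * (255 - a[x]) + 127) / 255);
            }
        }
    }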
This removes the need to define IMGFMT_GBRAP, which fixes compilation
with the current Libav release.
This also makes it automatically pick up a GBRP format with the same bit
width. (Unfortunately, it seems libswscale does not support conversion
to AV_PIX_FMT_GBRAP16, so our code falls back to 8 bit, removing
precision for video covered by subtitles in cases where this code is used.)
Also, when the source video is e.g. 10 bit YUV, upsample to 16 bit.
Whether this is good or bad, it fixes behavior with alpha. Although I'm
not sure if the alpha range is really correct ([0,2^16-1] vs.
[0,255*256]). Keep in mind that libswscale doesn't even agree with the
way we do it.
This actually treats destination alpha correctly, and gives much better
results than before. I don't know if this is perfectly correct yet,
though. Slight difference with vo_opengl behavior suggests it might not
be.
Note that this does not affect VOs with true alpha support. vo_opengl
does not use this code at all, and does the alpha calculations in OpenGL
instead.
This has no reason to be there. Put the functionality into another
function instead. While we're at it, also adjust for possible accuracy
issues with high bit depth YUV (matters for rendering subtitles into
screenshots only).
Because gcc (and clang) is a goddamn PITA and unnecessarily warns if
the universal initializer for structs is used (like mp_image x = {})
and the first member of the struct is also a struct, move the w/h
fields to the top.
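
A small illustration of the difference (hypothetical structs; which compiler
versions warn, and for which initializer spelling, varies):

    struct inner { int a, b; };

    // First member is a struct: an initializer like "struct bad x = {0};"
    // can trigger a missing-braces style warning, since the 0 is taken as
    // the initializer of the nested struct's first member.
    struct bad  { struct inner i; int w, h; };

    // First member is a plain int: "struct good y = {0};" initializes that
    // int directly, so there is nothing to warn about.
    struct good { int w, h; struct inner i; };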
They are redundant. They were used by draw_bmp.c only, and only in a
special code path that 1. used fixed image formats, and 2. had image
sizes perfectly aligned to chroma boundaries (so computing the chroma
width/height is trivial).
There was a somewhat obscure optimization in the OSD and subtitle
rendering path: if only the position of the sub-images changed, and not
the actual image data, uploading of the image data could be skipped. In
theory, this could speed up things like scrolling subtitles.
But it turns out that even in the rare cases where subtitles have such scrolls
or axis-aligned movement, modern libass rarely signals this kind of
change. Possibly this is because of sub-pixel handling and such, which
break this.
As such, it's a worthless optimization and just introduces additional
complexity and subtle bugs (especially in cases libass does the
opposite: incorrectly signaling a position change only, which happened
before). Remove this optimization, and rename bitmap_pos_id to
change_id.
Although the line count increases, this is better for making sure
everything is handled consistently for all users of the mp_csp_params
stuff.
This also makes sure mp_csp_params is always initialized with
MP_CSP_PARAMS_DEFAULTS (for consistency).
Silences a Coverity warning.
Also, drop the assert(); although it should be pretty much guaranteed
that things happen this way, it's still a bit fuzzy.
Until now, failure to allocate image data resulted in a crash (i.e.
abort() was called). This was intentional, because it's pretty silly to
degrade playback, and in almost all situations, the OOM will probably
kill you anyway. (And then there's the standard Linux overcommit
behavior, which also will kill you at some point.)
But I changed my opinion, so here we go. This change does not affect
_all_ memory allocations, just image data. Now in most failure cases,
the output will just be skipped. For video filters, this coincidentally
means that failure is treated as EOF (because the playback core assumes
EOF if nothing comes out of the video filter chain). In other
situations, output might be in some way degraded, like skipping frames,
not scaling OSD, and such.
Functions whose return values changed semantics:
mp_image_alloc
mp_image_new_copy
mp_image_new_ref
mp_image_make_writeable
mp_image_setrefp
mp_image_to_av_frame_and_unref
mp_image_from_av_frame
mp_image_new_external_ref
mp_image_new_custom_ref
mp_image_pool_make_writeable
mp_image_pool_get
mp_image_pool_new_copy
mp_vdpau_mixed_frame_create
vf_alloc_out_image
vf_make_out_image_writeable
glGetWindowScreenshot
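
What this means for callers, roughly: these functions can now fail
non-fatally, so results have to be checked. A minimal sketch of the pattern
(the surrounding filter context is hypothetical; mp_image_alloc returns NULL
on failure):

    // Inside a filter: skip the output instead of aborting on OOM.
    struct mp_image *out = mp_image_alloc(imgfmt, w, h);
    if (!out)
        return NULL; // treated like EOF / dropped output by the caller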
Image areas with subtitles are upsampled to 4:4:4 in order to render
the subtitles (makes blending easier). Try not to copy the Y plane on
upsampling. The libswscale API would require copying it, but this commit
works around that by scaling the chroma planes separately as
AV_PIX_FMT_GRAY8. The
Y plane is not touched at all. This is done for 420p8 only, which is the
most commonly needed case. For other formats, the old way is used.
Seems to make ASS rendering about 15% faster in the setup I tested.
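
The trick boils down to scaling each chroma plane as a standalone gray
image; roughly like this (error handling omitted; a sketch, not the actual
code):

    #include <stdint.h>
    #include <libswscale/swscale.h>

    // Upsample one chroma plane (cw x ch) to luma size (w x h) by treating
    // it as an AV_PIX_FMT_GRAY8 image; the Y plane is never touched.
    static void upsample_chroma_plane(uint8_t *dst, int dst_stride, int w, int h,
                                      const uint8_t *src, int src_stride,
                                      int cw, int ch)
    {
        struct SwsContext *sws = sws_getContext(cw, ch, AV_PIX_FMT_GRAY8,
                                                w, h, AV_PIX_FMT_GRAY8,
                                                SWS_BILINEAR, NULL, NULL, NULL);
        const uint8_t *src_p[4] = {src};
        int src_s[4] = {src_stride};
        uint8_t *dst_p[4] = {dst};
        int dst_s[4] = {dst_stride};
        sws_scale(sws, src_p, src_s, 0, ch, dst_p, dst_s);
        sws_freeContext(sws);
    }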
Allocate it even if it's not needed. The work actually done is almost the
same, except that the code is a bit simpler. May need more memory at
once for RGB subs that use more than one part, which is rare.
In order to support OSD redrawing for vo_xv and vo_x11, draw_bmp.c
included an awkward "backup" mechanism to copy and restore image
regions that have been changed by OSD/subtitles.
Replace this by a much simpler mechanism: keep a reference to the
original image, and use that to restore the Xv/X framebuffers.
In the worst case, this may increase cache pressure and memory usage,
even if no OSD or subtitles are rendered. In practice, it seems to be
always faster.
Even though #ifdef ACCURATE is removed, the result should be about the
same. The fallback is only used by packed YUV formats (YUYV, NV12), and
doing 16 bit for them instead of 8 bit is not useful.
A side effect is that Y8 (gray) is not converted when drawing subs, and for
alpha formats, the alpha plane is not removed. This means the number of
planes after upsampling can be 1-4 (1: gray, 2: gray+alpha, 3: planar,
4: planar+alpha). The code has to be adjusted accordingly to work on the
color planes only. Also remove the workaround for the chroma shift 31
hack.
mp_image_alloc_planes() allocated images with minimal stride, even if
the resulting stride was unaligned. It was the responsibility of
vf_get_image() to set an image's width to something larger than
required to get an aligned stride, and then crop it. Always allocate
with aligned strides instead.
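
The stride computation this boils down to (the alignment value in the
example is illustrative; mpv has its own constant for it):

    #include <stddef.h>

    // Round a minimal stride up to the next multiple of 'align' (a power
    // of two), so every line of every plane starts on an aligned address.
    static size_t aligned_stride(size_t min_stride, size_t align)
    {
        return (min_stride + align - 1) & ~(align - 1);
    }
    // e.g. aligned_stride(1918 * 3, 64) == 5760 instead of the minimal 5754.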
Get rid of IMGFMT_IF09 special handling. This format is not used
anymore. (IF09 has 4x4 chroma sub-sampling, and that is what it was
mainly used for - this is still supported.) Get rid of swapped chroma
plane allocation. This is not used anywhere, and VOs like vo_xv,
vo_direct3d and vo_sdl do their own swapping.
Always round chroma width/height up instead of down. Consider 4:2:0 and
an uneven image size. For luma, the size was left uneven, and the chroma
size was rounded down. This doesn't make sense, because chroma would be
missing for the bottom/right border.
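
In other words, the chroma plane size is now computed like this (shift being
the chroma shift, e.g. 1 for 4:2:0):

    // Round the chroma size up instead of down, so an uneven luma size
    // still gets chroma samples for the bottom/right border.
    static int chroma_size(int luma_size, int shift)
    {
        return (luma_size + (1 << shift) - 1) >> shift;
    }
    // e.g. 4:2:0, width 7: chroma_size(7, 1) == 4 (rounding down gave 3).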
Remove mp_image_new_empty() and mp_image_alloc_planes(), they were not
used anymore, except in draw_bmp.c. (It's still allowed to setup
mp_images manually, you just can't allocate image data with them
anymore - this is also done in draw_bmp.c.)
Setting the size of a mp_image must be done with mp_image_set_size()
now. Do this to guarantee that the redundant fields (like chroma_width)
are updated consistently. Replacing the redundant fields by function
calls would probably be better, but there are too many uses of them,
and it would be a bit less convenient.
Most code actually called mp_image_setfmt(), which did this as well.
This commit just makes things a bit more explicit.
Warning: the video filter chain still sets up mp_images manually,
and vf_get_image() is not updated.
Based on a patch by qyot27. Add the missing parts in mp_get_chroma_shift(),
which allow allocation of such images, and which make vo_opengl
automatically accept the new formats. Change the IMGFMT_IS_YUVP16_LE/BE
macros to properly report IMGFMT_444P14 as supported: this pixel format
has the highest numerical bit width identifier (0x55), which is not
covered by the mask ~0xfc. Remove 1 bit from the mask (makes it 0xf8) so
that IMGFMT_IS_YUVP16(IMGFMT_444P14) is 1. This is slightly risky, as
the organization of the image format IDs (actually FourCCs + mplayer
internal IDs) is messy at best, but it should be ok.
As pointed out in commit ed01df, the quality loss due to frequent
conversion between RGB and YUV is too much when drawing OSD and
subtitles.
Fix this by staying in the same colorspace when drawing subtitles.
Render directly to RGB, without converting to YUV first.
The bad thing about packed RGB is that there are many pixel formats,
which would all require special code for blending. It's also completely
incompatible with planar YUV. Use planar RGB instead, which allows us to
reuse all code originally written for planar YUV. The only thing that
needs to be changed is the color conversion in the libass case. (In
exchange for simpler code, the image has to be copied, but this is
still much better than converting to YUV.)
Unfortunately, libswscale doesn't support planar RGB output. Add a hack
to sws_utils.c to handle conversion to planar RGB. In the common case,
when converting 32 bit per pixel RGB, calling swscale can be avoided
entirely.
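
For the common 32 bpp case this is just a per-pixel unpack into the three
planes; a simplified sketch (BGRA byte order and a shared plane stride
assumed for illustration):

    #include <stdint.h>
    #include <stddef.h>

    // Convert packed 8-bit BGRA to planar GBRP (plane order G, B, R as in
    // AV_PIX_FMT_GBRP); the alpha byte is dropped. No swscale call needed.
    static void unpack_bgra_to_gbrp(uint8_t *g, uint8_t *b, uint8_t *r,
                                    ptrdiff_t plane_stride,
                                    const uint8_t *src, ptrdiff_t src_stride,
                                    int w, int h)
    {
        for (int y = 0; y < h; y++) {
            const uint8_t *s = src + y * src_stride;
            uint8_t *gp = g + y * plane_stride;
            uint8_t *bp = b + y * plane_stride;
            uint8_t *rp = r + y * plane_stride;
            for (int x = 0; x < w; x++) {
                bp[x] = s[4 * x + 0];
                gp[x] = s[4 * x + 1];
                rp[x] = s[4 * x + 2];
            }
        }
    }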
The change in mp_image.c is needed to allocate GBRP images correctly.
(The issue with vo_x11 could be easily solved by always backing up the
same bounding box as the bitmap drawing RGB<->YUV conversion does, but
this commit is probably the better fix.)