e97819f88e corrected a special case
condition that lead to an out of bounds access if the total width
happened to be an integer multiple of SLICE_W (256) which could cause a
crash in software VOs. However, it turns out that the functions in this
file evaluate quite differently when using encoding mode (and presumably
libmpv as well according to reports although I could not independently
verify it).
The logic here gets complicated but what ends up happening is that, in
blend_overlay_with_video, the value of x + w can be greater than p->w in
certain cases in encoding mode. The x is the positional value of the
slice which remained unchanged from before, but w can take the full
value of SLICE_W (256) which is not necessarily correct. The width of
the final slice here should be the total remaining width. We can handle
this in mark_rect by simply always adjusting x1 of the last slice to be
equal to total width - SLICE_W * x so it can never extend beyond where
it should be. In practice, this value should be the maximum allowed
here. I'm not sure if the existing x1 value can possibly already be
lower than SLICE_W, but just MPMIN it to be on the safe side.
Fixes#10908.
This has been a long standing annoyance - ffmpeg is removing
sizeof(AVPacket) from the API which means you cannot stack-allocate
AVPacket anymore. However, that is something we take advantage of
because we use short-lived AVPackets to bridge from native mpv packets
in our main decoding paths.
We don't think that switching these to `av_packet_alloc` is desirable,
given the cost of heap allocation, so this change takes a different
approach - allocating a single packet in the relevant context and
reusing it over and over.
That's fairly straight-forward, with the main caveat being that
re-initialising the packet is unintuitive. There is no function that
does exactly what we need (what `av_init_packet` did). The closest is
`av_packet_unref`, which additionally frees buffers and side-data.
However, we don't copy those things - we just assign them in from our
own packet, so we have to explicitly clear the pointers before calling
`av_packet_unref`. But at least we can make a wrapper function for
that.
The weirdest part of the change is the handling of the vtt subtitle
conversion. This requires two packets, so I had to pre-allocate two in
the context struct. That sounds excessive, but if allocating the
primary packet is too expensive, then allocating the secondary one for
vtt subtitles must also be too expensive.
This change is not conditional as heap allocated AVPackets were
available for years and years before the deprecation.
It turns out, even xy-VSFilter and XySubFilter do not
mangle colours if the video is native RGB regardless of
the sub's YCbCr header. libass' docs were also updated
to reflect this.
Commit 740b701 introduced handling for subtitles with unknown duration,
but it came with a minor flaw where a track event that shares identical
start time with following track event will has its Duration value set
to 0, we don't want this to happen since it will prevent simultaneous
rendering of multiple track events.
This commit aims to address this issue.
When the width is exactly a multiple of SLICE_W (currently 256),
heap buffer overflow is reported by Address Sanitizer. So adjust
the maximum index for the line array accordingly.
ASS must only automatically break at ASCII spaces (\x20), but other
subtitle formats might expect more breaking oppurtinities.
Especially non-ASS subs in scripts, which typically do not use (ASCII)
spaces to seperate words, like e.g. CJK, might overflow the screen
if the conversion didn't insert additional linebreaks (ffmpeg does not).
Thus try to enable Unicode linebreaking for converted subs and the OSD
if libass is new enough. The feature may still be unavailable at runtime
if libass wasn't build with Unicode linebreaking support.
TL;DR: previously a JavaScript VM was created + destroyed whenever
a sub track was initialized, even if no jsre filter was set.
Now a JS VM is created only if jsre filters were set.
Sub filters are initialized once when a subtitle track is chosen, and
then whenever the sub track changes or when some sub options change.
Sub filters init is synchronous - playback is suspended till it ends.
A filter can abort init early (get disabled) depending on conditions
specific to each filter. The regex and jsre filters aborted early
if the filter is disabled (default is enabled) or if the track is not
ass (relativey rare, e.g. bitmap subs).
The init then iterates over the filter strings, and if the result is
empty (common - no filter was added, but also if all strings failed
regex init) then it's also aborted during init.
While this iteration step is cheap with filter regex, with jsre it
requires instanciating the JS VM (mujs) in advance in order to parse
the filter strings at the list, and the VM is then destroyed if the
list ends up empty.
This VM create+destroy is fast but measurable (0.2 - 0.7 ms, slowest
measured on 2010 MacBook Air), but can be avoided altogether if we
check that the filter list is not empty before we create the VM.
So now we do just that.
this field is used only in a special vo draining edge case.
switching to an atomic reduces osd->lock contention
between the mpv core (in write_video) and vo threads
which are managing osd rendering manually (such as vo_rpi).
Signed-off-by: Aman Karmani <aman@tmm1.net>
Previously, the sub-visibility option changed the visibility of all
subtitles including secondary ones. This meant that it was not possible
to only display secondary subtitles while hiding the primary ones. This
modifies the sub-visibility option so that it only affects primary
subtitles which allows only secondary subtitles to be displayed.
This reverts commit 04f0b0abe4.
It's not a good idea to unify the names only for visibility, while
keeping secondary-* for everything else.
This needs a bit more thought before we allow secondary sub to be
visible on its own.
Adds --sub-visibility choices 'primary-only' for only displaying the
primary subtitle track, and 'secondary-only' for only displaying
secondary subtitle track.
Removes --secondary-sub-visibility and displays a message telling the
user to use --sub-visibility=yes/primary-only instead.
These changes make it so that the default 'sub-visibility' bind 'v'
cycles through all the 'sub-visibility' choices, 'no', 'yes',
'primary-only', and 'secondary-only'.
Since libavcodec major version 59, the requested "ass" format became
the default as the old timing-included format was disabled starting
with that version. Additionally, this option by itself has since
been deprecated as it no longer serves any purpose.
References:
FFmpeg/FFmpeg@1f63665ca5FFmpeg/FFmpeg@176b8d785bFixes#9413
This is an artificial background box with the original colors,
rendered as a plain rectangle instead of libass's opaque-box.
Fixes#9134 (together with the previous commit)
This breaks the rendering in various ways, so first workaround is to
simply disable the opaque box for the bar.
Next commit will add an artificial background box.
29e15e6248 prefixed youtube-dl's subs url with an edl prefix to not
download them until they're selected, which is useful when there are
many sub languages. But this prefix broke the alignment of secondary
subs, which would overlap the primary subs instead of always being
placed at the top. This can be tested with
--sub-file='edl://!no_clip;!delay_open,media_type=sub;secondary_sub.srt'
When a sub is added, sub.c:reinit_sub() is called. This calls in
init_subdec() -> dec_sub.c:sub_create() -> init_decoder() ->
sd_ass:init(). Then reinit_sub() calls
sub_control(track->d_sub, SD_CTRL_SET_TOP, &(bool){!!order}) which sets
sd_ass_priv.on_top = true for secondary subs.
But for EDL subs the real sub is initialized again when in
dec_sub.c:sub_read_packets() is_new_segment() returns true and
update_segment() is called, or when sub_get_bitmaps() calls
update_segment(). update_segment() then calls init_decoder(), which
calls sd_ass:init(), so sd_ass_priv is reinitialized, and its on_top
property is left false. This commit sets it to true again.
For URLs that need to be downloaded it seems that the update_segment()
call that reinitializes sd_ass_priv is always the one in
sub_read_packets(), but with local subs sub_get_bitmaps() is usually
called earlier (though there shouldn't be a reason to use the EDL URL
for local subs), so I added the order parameter to sub_create(), rather
than adding it to all of update_segment(), sub_read_packets() and
sub_get_bitmaps().
Also removes the cast to bool in the other sub_control call, since
sub/sd_ass.c:control already casts arg to bool when cmd is
SD_CTRL_SET_TOP.
Using --sub-filter-regex-plain (default:no)
The ass-to-plaintext functionality already existed at sd_ass.c, but
it's internal and uses a private buffer type, so a trivial utility
wrapper was added with standard char*/bstr interface.
The plaintext can be multi-line, and the multi-line regexp flag is now
always set, but only affects plaintext (the ASS source is one line).
Pretty much identical to filter-regex but with JS expressions and
requires only JS support. Shares the filter-regex-* control options.
The target audience is Windows users - where filter-regex doesn't
work due to missing APIs, but mujs builds cleanly on Windows, and JS
is usually enabled in 3rd party Windows mpv builds.
Lua could have been used with similar effort, however, the JS regex
syntax is more extensive and also much more similar to POSIX.
1. On a pathological case where event_format is NULL, previously the
filter was trying to use it with each new sub - and re-failed. Now
the filter gets disabled on init (event_format doesn't change).
2. Previously, if the filter didn't modify the text or if the text
could not be extracted - it still allocated a new packet with same
content. Now it returns the original, saving a whole lot of no-ops
(there are still few allocations in this case though).
1 above is preparation for the next commit, and 2 was trivial, but
there's more to do if anyone cares (NIH string functions instead of
bstr, unused arguments, messages could be improved, and more).
Add two stand-alone function to help with the text-extraction task
which ass filters need. Makes it easier to add new filters without
cargo-culting this functionality.
Currently, on malformed event (which shouldn't happen), a warning is
printed when a filter tries to extract the text, so if few filters
are enabled, we'll get multiple warnings (like before) - not critical.
The regex filter now uses these utils, the SDH filter not yet.
This allows users to control whether full dialogue subtitles are displayed
with an audio track already in their preferred subtitle language.
Additionally, this improves handling for the forced flag, automatically
selecting between forced and unforced subtitle streams based on the user's
settings and the selected audio.
Seems like this is requested all the time.
It seems libass allows out of range values, but does allows the subtitle
to go out of the screen at the bottom (only when moving it to the top
it's "clamped"). Too bad, don't do that then. The bitmap sub rendering
code on the other hand is under our control, and will not move a
subtitle out of the screen.
Fixes: #7986
Options like --sub-ass-force-style and others could not be changed at
runtime (the changes didn't take any effect). Fix this by using the
brutal approach, and completely reinit the subtitle state when this
happens. Maybe a bit clunky, but for now I'd rather not put more effort
into this.
Fixes: #7689
libass recently switched the default from 1 to 0 for compatibility
with ASS scripts that rely on the historical/VSFilter default of 0.
libass does attempt to detect and avoid breaking scripts that rely
on the historic libass-only default of 1, but it doesn't cover tracks
created directly through the API, so set the header explicitly.
Fixes https://github.com/mpv-player/mpv/issues/7900.
Remove the vaguely defined plane_bits and component_bits fields from
struct mp_imgfmt_desc. Add weird replacements for existing uses. Remove
the bytes[] field, replace uses with bpp[].
Fix some potential alignment issues in existing code. As a compromise,
split mp_image_pixel_ptr() into 2 functions, because I think it's a bad
idea to implicitly round, but for some callers being slightly less
strict is convenient.
This shouldn't really change anything. In fact, it's a 100% useless
change. I'm just cleaning up what I started almost 8 years ago (see
commit 00653a3eb0). With this I've decided to keep mp_imgfmt_desc,
just removing the weird parts, and keeping the saner parts.
See manpage additions. This was requested, sort of. Although what has
been requested might be something completely different. So this is
speculative.
This also changes sub_get_text() to return an allocated copy, because
the buffer shit was too damn messy.
Whatever it's worth. Instead of doing a pretty stupid conversion to
float, just blend it directly. This works for most RGB formats that are
8 bits per component or below (the latter because we expand packed
fringe RGB formats for simplicity). For higher bit depth RGB this would
need extra code.
This checked params->color.space for being RGB. If the colorspace is
unset, this did dumb things because even if the imgfmt was a RGB one,
the colorspace was not set to RGB. This actually also happened to the
tests.
(Short-cutting RGB like this is actually wrong, since RGB could still
have strange gamma or primaries, which would warrant a full conversion.
So you'd need to check for these other parameters as well. To be fixed
later.)
Maybe this is useful for some of the lesser VOs. It's preferable over
bad ad-hoc solutions based on the more complex sub_bitmap data
structures (as observed e.g. in vo_vaapi.c), and does not use that much
more code since draw_bmp already created such an overlay internally.
But I still wanted something that avoids having to upload/render a full
screen-sized overlay if for example there's only a tiny subtitle line on
the bottom of the screen. So the new API can return a list of modified
pixels (for upload) and non-transparent pixels (for display). The way
these pixel rectangles are computed is a bit dumb and returns dumb
results, but it should be usable, and the implementation can change.
They are sort of confusing, and they hide the fact that they have an
alpha component. Using the actual formats directly is no problem, sicne
these were useful only for big endian systems, something we can't test
anyway.
draw_bmp.c is the software blender for subtitles and OSD. It's used by
encoding mode (burning subtitles), and some VOs, like vo_drm, vo_x11,
vo_xv, and possibly more.
This changes the algorithm from upsampling the video to 4:4:4 and then
blending to downsampling the OSD and then blending directly to video.
This has far-reaching consequences for its internals, and results in an
effective rewrite.
Since I wanted to avoid un-premultiplying, all blending is done with
premultiplied alpha. That's actually the sane thing to do. The old code
just didn't do it, because it's very weird in YUV fixed point.
Essentially, you'd have to compensate for the chroma centering constant
by subtracting src_alpha/255*128. This seemed so hairy (especially with
correct rounding and high bit depths involved) that I went for using
float.
I think it turned out mostly OK, although it's more complex and less
maintainable than before. reinit() is certainly a bit too long. While it
should be possible to optimize the RGB path more (for example by
blending directly instead of doing the stupid float conversion), this is
probably slower. vo_xv users probably lose in this, because it takes the
slowest path (due to subsampling requirements and using YUV).
Why this rewrite? Nobody knows. I simply forgot the reason. But you'll
have it anyway. Whether or not this would have required a full rewrite,
at least it supports target alpha now (you can for example hard sub
transparent PNGs, if you ever wanted to use mpv for this).
Remove the check in vf_sub. The new draw_bmp.c is not as reliant on
libswscale anymore (mostly uses repack.c now), and osd.c shows an
error message on missing support instead now.
Formats with chroma subsampling of 4 are not supported, because FFmpeg
doesn't provide pixfmt definitions for alpha variants. We could provide
those ourselves (relatively trivial), but why bother.
The OSD is passed to VOs via struct sub_bitmaps, which has a change_id
field. This field is incremented whenever there is a (potential) change
to the other struct contents. If not, the VO can rely on it not having
changed. This must include for example sub_bitmap.x and sub_bitmap.dw.
If these two fields (and y equivalents) change, change_id must change,
even if the subtitle bitmap data might still be the same.
sd_lavc.c stopped respecting this at some unknown point. It could
sometimes cause problems, though usually only with bad and old VOs which
somehow relied on this more than vo_gpu. (I've actually encountered this
before with sd_lavc subtitle scaling, as indicated by a nasty comment,
though probably didn't track this down, since said old VOs can die in a
fire.)
Fix this by maintaining the change_id explicitly. Unfortunately adds
even more code. Instead of comparing the result we could track property
changes, but I think this is better. The number of parts is always very
low with this subtitle decoder, so there's no actual performance issue
to worry about.
This could be triggered by scaling changes (video-zoom etc.), but
probably also changing bitmap subtitle position or scaling.
Should be somewhat helpful. (All VOs are full of code trying to
compensate for this, more or less, and this will allow simplifying
some code later. Maybe.)
The screen size is mostly for robustness checks.
Making OSD/subtitle bitmaps refcounted was planend a longer time ago,
e.g. the sub_bitmaps.packed field (which refcounts the subtitle bitmap
data) was added in 2016. But nothing benefited much from it, because
struct sub_bitmaps was usually stack allocated, and there was this weird
callback stuff through osd_draw().
Make it possible to get actually refcounted subtitle bitmaps on the OSD
API level. For this, we just copy all subtitle data other than the
bitmaps with sub_bitmaps_copy(). At first, I had planned some fancy
refcount shit, but when that was a big mess and hard to debug and just
boiled to emulating malloc(), I made it a full allocation+copy. This
affects mostly the parts array. With crazy ASS subtitles, this parts
array can get pretty big (thousands of elements or more), in which case
the extra alloc/copy could become performance relevant. But then again
this is just pure bullshit, and I see no need to care. In practice, this
extra work most likely gets drowned out by libass murdering a single
core (while mpv is waiting for it) anyway. So fuck it.
I just wanted this so draw_bmp.c requires only a single call to render
everything. VOs also can benefit from this, because the weird callback
shit isn't necessary anymore (simpler code), but I haven't done anything
about it yet. In general I'd hope this will work towards simplifying the
OSD layer, which is prerequisite for making actual further improvements.
I haven't tested some cases such as the "overlay-add" command. Maybe it
crashes now? Who knows, who cares.
In addition, it might be worthwhile to reduce the code duplication
between all the things that output subtitle bitmaps (with repacking,
image allocation, etc.), but that's orthogonal.