There are 3 packet reading functions in the demux API, which all
function completely differently. One of them, demux_read_packet(), has
only 1 caller, which is in dec_sub.c. Change this caller to use
demux_read_packet_async() instead. Since it really wants to do a
blocking call, setup some proper waiting. This uses mp_dispatch_queue,
because even though it's overkill, it needs the least code.
In practice, waiting actually never happens. This code is only called on
code paths where everything is already read into memory (libavformat's
subtitle demuxers simply behave this way). It's still a bit of a
"coincidence", so implement it properly anyway.
If suubtitle decoder init fails, we still need to unset the demuxer
wakeup callback. Add a sub_destroy() call to the failure path. This also
happens to fix a missed pthread_mutex_destroy() call (in practice this
was a nop, or a memory leak on BSDs).
UB-sanitizer complains that we shift bits into the sign (when a is
used). Change it to unsigned, which in theory is more correct and
silences the warning.
Doesn't matter in practice, both the "bug" and the fix have 0 impact.
It would always autodetect it based on the passed style block,
but as we are defining it - we might as well define it always.
(As far as I can see all decoders in libavcodec utilize 4+ style
blocks)
Manual changes done:
* Merged the interface-changes under the already master'd changes.
* Moved the hwdec-related option changes to video/decode/vd_lavc.c.
Only printable ASCII characters were considered to be valid texts. Make
it possible that UTF-8 contents are also considered valid.
This does not make the SDH subtitle filter support non-English
languages. This just prevents the filter from blindly marking lines that
have only UTF-8 characters as empty.
Fixes#6502
libavcodec normally drops subtitle lines that fail a check for invalid
UTF-8 (their check is slightly broken too, by the way). This was always
annoying and inconvenient, but now there is a mechanism to prevent
it from doing this. Requires newst libavcodec.
Remove them from the big MPOpts struct and move them to their sub
structs. In the places where their fields are used, create a private
copy of the structs, instead of accessing the semi-deprecated global
option struct instance (mpv_global.opts) directly.
This actually makes accessing these options finally thread-safe. They
weren't even if they should have for years. (Including some potential
for undefined behavior when e.g. the OSD font was changed at runtime.)
This is mostly transparent. All options get moved around, but most users
of the options just need to access a different struct (changing sd.opts
to a different type changes a lot of uses, for example).
One thing which has to be considered and could cause potential
regressions is that the new option copies must be explicitly updated.
sub_update_opts() takes care of this for example.
Another thing is that writing to the option structs manually won't work,
because the changes won't be propagated to other copies. Apparently the
only affected case is the implementation of the sub-step command, which
tries to change sub_delay. Handle this one explicitly (osd_changed()
doesn't need to be called anymore, because changing the option triggers
UPDATE_OSD, and updates the OSD as a consequence). The way the option
value is propagated is rather hacky, but for now this will do.
It was split at least across osd.c and sd_ass.c/sd_lavc.c. sd_lavc.c
actually ignored most of the more obscure subtitle timing things.
There's no reason for this - just move it all to dec_sub.c (mostly from
sd_ass.c, because it has some of the most complex stuff).
Now timestamps are transformed as they enter or leave dec_sub.c.
There appear to have been some subtle mismatches about how subtitle
timestamps were transformed, e.g. sd_functions.accepts_packet didn't
apply the subtitle speed to the timestamp. This patch should fix them,
although it's not clear if they caused actual misbehavior.
The semantics of SD_CTRL_SUB_STEP are slightly changed, which is the
reason for the changes in command.c and sd_lavc.c.
I've decided that MP_TRACE means “noisy spam per frame”, whereas
MP_DBG just means “more verbose debugging messages than MSGL_V”.
Basically, MSGL_DBG shouldn't create spam per frame like it currently
does, and MSGL_V should make sense to the end-user and provide mostly
additional informational output.
MP_DBG is basically what I want to make the new default for --log-file,
so the cut-off point for MP_DBG is if we probably want to know if for
debugging purposes but the user most likely doesn't care about on the
terminal.
Also, the debug callbacks for libass and ffmpeg got bumped in their
verbosity levels slightly, because being external components they're a
bit less relevant to mpv debugging, and a bit too over-eager in what
they consider to be relevant information.
I exclusively used the "try it on my machine and remove messages from
MSGL_* until it does what I want it to" approach of refactoring, so
YMMV.
The OpenType Font File specification recommends that "Collection fonts
that use CFF or CFF2 outlines should have an .OTC extension." mpv
should accept .otc as a fallback extension for font detection should
the mimetype detection fail.
IETF RFC8081 added the "font" top-level media type,
including font/ttf, font/otf, font/sfnt, and also
font/collection. These font formats are all supported
by mpv/libass but they are not accepted as valid
Matroska mime types. mpv can load them via file extension
and they work as expected, so files using the new types
should not trigger a warning from mpv.
The current invocation of bstr_cut is as good as no cutting at all.
Almost the entire header is reread in every iteration of the loop.
I don't know how many styles libavcodec tends to generate, but if
(now or in the future) it generates many, then this loop is slow
for no good reason. If anything, the code would be more clear and
have the same performance if it didn't call bstr_cut at all.
The intention here (and the sensible thing regardless) seems to be
to skip the part of the string that bstr_find has already looked
through and found nothing. This commit additionally skips the whole
substring, because overlapping matches are impossible.
In some cases, demux_mkv will detect a start time slightly above 0, but
there might still be a subtitle starting at exactly 0. When the player
rebases the timestamps to assumed start time, the subtitle will have a
slightly negative timestamp in the end. libavcodec's subtitle converter
turns this into a larger number due to underflow. Fix by clamping
subtitles always to 0, which may or may not be what you want.
At least it fixes#5047.
The new_segment field was used to track the decoder data flow handler of
timeline boundaries, which are used for ordered chapters etc. (anything
that sets demuxer_desc.load_timeline). This broke seeking with the
demuxer cache enabled. The demuxer is expected to set the new_segment
field after every seek or segment boundary switch, so the cached packets
basically contained incorrect values for this, and the decoders were not
initialized correctly.
Fix this by getting rid of the flag completely. Let the decoders instead
compare the segment information by content, which is hopefully enough.
(In theory, two segments with same information could perhaps appear in
broken-ish corner cases, or in an attempt to simulate looping, and such.
I preferred the simple solution over others, such as generating unique
and stable segment IDs.)
We still add a "segmented" field to make it explicit whether segments
are used, instead of doing something silly like testing arbitrary other
segment fields for validity.
Cached seeking with timeline stuff is still slightly broken even with
this commit: the seek logic is not aware of the overlap that segments
can have, and the timestamp clamping that needs to be performed in
theory to account for the fact that a packet might contain a frame that
is always clipped off by segment handling. This can be fixed later.
If a VO-area option changes, gl_video_resize() is called
unconditionally. This function does something even if the size does not
change (at least it discards buffered frames for interpolation), which
can lead to stutter when you keep firing option change events during
playback.
Check for an actual resize, and if nothing changes, exit early.
Lua scripts can call osd_set_external() early (before the VO window is
created and obj->vo_res is filled), in which case the PlayResX field
would be set to nonsense, and libass would print a pointless warning.
There's an easy and a hard fix: either just go on and pass dummy values
to libass (basically like before, just clamp them to avoid the values
which make libass print the warning). Or attempt to update the PlayRes
field to correct values on rendering (since at rendering time, you
always know the screen size and the correct values). Do the latter.
Since various things use PlayRes for scaling things, this might still
not be fully ideal. This is a general problem with the async scripting
interface.
This API isn't deprecated (yet?), but it's still inferior and harder to
use than avcodec_free_context().
Leave the call only in 1 case in af_lavcac3enc.c, where we apparently
seriously close and reopen the encoder for whatever reason.
List of changes:
1. Rename `signfs` to `scale`, to better match what it actually does
(force --sub-scale to apply to ASS subtitles), and fix the blatantly
wrong documentation (it actually specifically does *not* apply to
signs)
2. Rename `--sub-ass-style-override` to `--sub-ass-override` to help
reduce confusion between it and `--sub-ass-force-style`, as well as
pointing out that it doesn't necessarily actually override styles.
(The new `scale` option, for example, only sets
ASS_OVERRIDE_BIT_FONT_SIZE, but not ASS_OVERRIDE_BIT_STYLE)
3. Mention that `--sub-ass-override` is generally sort of smart about
only overriding dialog, not signs.
All contributors of the code used for these files agreed to the LGPL
relicensing.
There are some unaccounted contributors, but all of their code was
completely removed before. (The only exception is one contributor whose
only line left was "#include <string.h>". I don't know if that's
copyrightable, but it wasn't needed anyway, so just remove it.)
These files started out as libvo/sub.* (renamed to sub/sub.*, then
renamed again to sub/osd.*). They used to contain code for rendering
the OSD (as in, actual pixel manipulation and text layouting). But
later all this code was dropped, and libass was used to render the OSD
instead. Actual subtitle rendering was reimplemented in other files
(the old subtitle rendering path is completely gone).
One potential problem are the option declarations, which makes this
harder, as these options involve more history. But it turns out most of
them were reimplemented since 80270218cb, rather than taken from old
code. (Although not all - but the rest covered by relicensing
agreements.)
This also affects osd_state.h, which was apparently incorrectly implied
to be LGPL.
All contributors have agreed.
Compared to sd_ass.c, this has a pretty simple history:
av_sub.c -> sub/av_sub.c -> sub/sd_lavc.c
At one point, some code from spudec.c was added to it, but it was
removed again later.
All contributors of the code used for sd_ass.c agreed to the LGPL
relicensing. Some code has a very chaotic history, due to MPlayer
subtitle handling being awful, chaotic, and having been refactored a
dozen of times. Most of the subtitle code was actually rewritten from
scratch (a few times), and the initial sd_ass.c was pretty tiny. So we
should be fine, but it's still a good idea to look at this closely.
Potentially problematic cases of old code leaking into sd_ass.c are
mentioned below.
Some code originates from demux_mkv. Most of this was added by eugeni,
and later moved into mplayer.c or mpcommon.c. The old demux_mkv ASS/SSA
subtitle code is somewhat dangerous from a legal perspective, because it
involves 2 patches by a certain Tristan/z80, who disagreed with LGPL,
and who told us to "rewrite" parts we need. These patches were for
converting the ASS packet data to the old MPlayer text subtitle data
structures. None of that survived in the current code base.
Moving the subtitle handling out of demux_mkv happened in the following
commits: bdb6a07d2a, de73d4dd97, 61e4a80191. The code by
z80 was removed in b44202b69f.
At this time, the z80 code was located in mplayer.c and subreader.c.
The code was fully removed, being unnecessary due to the entire old
subtitle rendering code being removed. This adds a ass_to_plaintext(),
function, which replaces the old ASS tag stripping code in
sub_add_text(), which was based on the z80 code. The new function was
intended to strip ASS tags in a correct way, instead of somehow
dealing with other subtitle types (like HTML-style SRT tags), so it
was written from scratch.
Another potential issue is the --sub-fix-timing option is based on
-overlapsub added in d459e64463. But the implementation is new, and
no code from that commit is used in sd_ass.c. The new implementation
started out in 64b1374a44. (The following commit, bd45eb468c
removes the original code that was replaced.) The code was later
moved into sd_ass.c.
The --sub-fps option has a similar history.
Somewhat chaostic history: libass/ass_mp.* -> ass_mp.* -> sub/ass_mp.*
As far as I can tell, everyone who ever touched these files has agreed
to the relicensing.
When the format of the subtitle bitmaps changes, such as with taking
screenshots with vo_vaapi (RGBA for the VO vs. Y8 for screenshots), the
cache image obviously needs to be recreated.
Fixes#4325.
When doing harder filtering not require a space after : results
in lines with a clock (like 10:05) to be taken as a speaker label.
So require a space after : even when doing harder filtering as
missing space is very uncommon.
Some like to add text in parentheses in the speaker label,
like XXX (loud): or just (loud):
allow parentheses when doing harder filtering
Add subtitle filter to remove additions for deaf or hard-of-hearing
(SDH). This is for English, but may in part work for others too.
This is an ASS filter and the intention is that it can always be
enabled as it by default do not remove parts that may be normal text.
Harder filtering can be enabled with an additional option.
Signed-off-by: wm4 <wm4@nowhere>
This means the subtitles will show as "intended".
For some weird reason, --sub-ass-style-override is the option that
controls style override, which implies it's specific to ASS. While that
seems weird and doesn't always reflect reality, I don't care about that
now.
To make it easier for the eyes, multi line subtitles should
be left justified (for most languages).
This adds an option to define how subtitles are to be justified
inpendently of how they are aligned.
Also add option to enable --sub-justify to be applied on ASS subtitles.
A hacky, convoluted, half-working mess that attempts to cut off overlong
playlists.
It does so by relying on the ASS formatting rule that the font size is
specified in the virtual PlayResY resolution. This means we can
(normally) easily tell how many lines fit on the screen. On the other
hand, this does not work if the text is wrapped.
This as a kludge until a Better™ solution is available.
This core of this heuristic was once copied from MPlayer's spudec.c. I
think it was meant for the case when the resolution field was missing or
so.
I couldn't find a file for which this actually does something. On the
other hand, there are samples which actually have a smaller resolution
than 720x576, and which are broken by this old hack.
For subtitles that set no resolution (I'm not sure which codec/container
that would be), there's still the fallback on video resolution.
Just get rid of this hack. Also cleanup a bit. SD_CTRL_GET_RESOLUTION
hasn't been used since DVD menu removal. get_resolution() is left with 1
call site, and would be quite awkward to keep, so un-inline it.
The previous commit means subtitles were reinitialized on every seek
(even within a segment). This commit restores the old behavior.
To check whether the segment changed at all, we don't reset the current
start/end values. This assumes the decoder wrapper is always fed by a
stream which doesn't mix segment and non-segment packets, which is
currently always true.