Over the years, we've accumulated several secondary subtitle-related
options and properties, but the implementation was not really consistent
and it wasn't clear what the right process for adding more should be. So
to make things nicer, let's refactor all of the subtitle options with
secondary variants (sub-delay, sub-pos, and sub-visibility) and split
them off to a new, separate struct. All of the underlying values are
stored in an array instead for simplicity. Additionally, the
implementation of some secondary-sub-* properties was slightly changed
so there would be less redundancy.
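As a rough illustration of the "values stored in an array" idea, here is a minimal sketch. The struct, enum, and function names are hypothetical and are not mpv's actual definitions; only the three option names come from the text above.

```c
#include <stdbool.h>

// Hypothetical names throughout; this is not mpv's actual struct layout.
enum {
    SKETCH_SUB_PRIMARY = 0,
    SKETCH_SUB_SECONDARY = 1,
    SKETCH_SUB_COUNT = 2,
};

// One array slot per subtitle "order", instead of separate primary and
// secondary fields for every option.
struct sketch_shared_sub_opts {
    double delay[SKETCH_SUB_COUNT];     // sub-delay / secondary-sub-delay
    float pos[SKETCH_SUB_COUNT];        // sub-pos / secondary-sub-pos
    bool visibility[SKETCH_SUB_COUNT];  // sub-visibility / secondary-sub-visibility
};

// Callers index by order rather than branching on primary vs. secondary.
static double sketch_get_delay(const struct sketch_shared_sub_opts *o, int order)
{
    return o->delay[order];
}
```

With a layout like this, adding another option with a secondary variant would mean adding one more array member rather than a pair of fields plus special cases.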
In the sub seek code path, there was an arbitrary small offset added to
the pts before the seek. However when seeking backwards, the offset was
an additional subtraction. de6eace6e9
added this logic 10 years ago and perhaps it made sense then, but the
additional subtraction when seeking backwards causes the subtitle seek
to go too far to the previous subtitle if the durations overlap. This
should always be an addition to work correctly. Additionally, the sub
stepping code path also could use this offset for the same reason
(duration overlaps). However, it is only applicable to sd_ass not
sd_lavc. sd_lavc has step_sub support but on a sample it didn't even
work anyway. Perhaps it only works for certain kinds of subtitles
(patches welcome).
Anyways instead of keeping this offset as a magic number, we can define
it in sd.h which is handy for this. For sd_ass, we add the offset when
sub stepping, and the offset is always added for sub seeking like it was
before. Update the comment to be a little more relevant to what actually
happens today. Fixes #11445.
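A minimal sketch of the "always add the offset" rule described above. The macro name, its value, and the helper function are assumptions for illustration; the text only states that the constant is defined in sd.h.

```c
// Assumed name and value; the commit only says the constant lives in sd.h.
#define SKETCH_SUB_SEEK_OFFSET 0.01

// Return the pts to hand to the subtitle seek/step code. The offset is
// added regardless of direction, so a backwards seek does not overshoot
// into an earlier subtitle whose duration overlaps.
static double sketch_adjust_seek_pts(double pts, int direction)
{
    (void)direction; // the direction intentionally does not flip the sign
    return pts + SKETCH_SUB_SEEK_OFFSET;
}
```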
First of all, this never worked. Or if it ever did, it was in some
select few scenarios. c9474dc9ed is what
originally added support for the auto choice. However, that commit
worked by propagating a value to a fake option used internally. This
shouldn't have ever worked because the underlying m_config_cache was
never updated so the value shouldn't have been preserved when accessed
in sd_lavc. And indeed with some testing, the value there is always 0
unsurprisingly.
This was later rewritten in ba7cc07106
along with a lot of other sub changes, but with that, it was still
mostly broken. The reason is because one of the key parts of having to
hit this logic (prefer_forced) required `--no-subs-with-matching-audio`
to be set. If the audio language matches the subtitle language (the
requirement also excludes forced subs), the option makes no subtitle
selection in the first place so pick->forced_only_def is not set to true
and nothing even happens. Another way around this would be to attempt to
change your OS language (like with the LANG environment variable) so
that the subtitle track gets selected but then audio_matches mistakenly
becomes false because it compares the OS language to the audio language
which then makes preferred_forced 0, so nothing happens. I don't think
there's a scenario where pick->forced_only_def is actually set to true
(thus meaning `auto` is useless), but maybe someone could contrive
something very strange. Regardless, it's definitely not something even
remotely common.
fbe8f99194 changed track selection again
but didn't consider this particular case. The net result is that DVD/PGS
subs become equivalent to --sub-forced-only being yes, so this is a change
in behavior and probably not a good one. Note that I wasn't able to
actually observe any difference in a PGS sample. It still displayed
subtitles fine but that sample probably didn't have the right flags to
hit the sub-forced-only logic.
Anyways, the auto feature is extremely questionable at best and in my
view, not actually worth it. It is meant to be used with
`--no-subs-with-matching-audio` to display forced pictures in subtitle
tracks that are not marked as forced, but that contradicts that
particular option's purpose and description in the manual (secretly
selecting a track under certain conditions even though it says not to).
Instead of trying to shove all this logic into select_default_track
which is already insanely complicated as it is, recognize that this is a
trivial lua script. If you absolutely want to turn --sub-forced-only on
under these certain conditions (DVD/PGS subtitles, matching audio and
subtitle languages, etc.), just look at the current-tracks property and
do your thing. The very, very niche behavior that this option tried to
accomplish basically never worked, no user even knows what this option
does, and well it's just not worth supporting in core mpv code. Drop
all this code for sanity's sake and change --sub-forced-only back to a
bool.
This way we receive such minor details as the profile (necessary for
ARIB captions, among others) during init. This enables decoders
to switch between ARIB caption profile A and profile C streams.
Using --sub-filter-regex-plain (default:no)
The ass-to-plaintext functionality already existed at sd_ass.c, but
it's internal and uses a private buffer type, so a trivial utility
wrapper was added with standard char*/bstr interface.
The plaintext can be multi-line, and the multi-line regexp flag is now
always set, but only affects plaintext (the ASS source is one line).
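To show what a multi-line flag changes for multi-line plaintext, here is a small sketch using POSIX regexes. It assumes the flag in question corresponds to POSIX REG_NEWLINE; the helper name is hypothetical.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

// Returns true if the extended regex `pattern` matches somewhere in `text`.
// REG_NEWLINE makes ^ and $ match at line boundaries inside `text`, which
// only matters when the text actually contains newlines.
static bool sketch_match_multiline(const char *pattern, const char *text)
{
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB | REG_NEWLINE) != 0)
        return false; // treat an invalid pattern as "no match"
    bool matched = regexec(&re, text, 0, NULL, 0) == 0;
    regfree(&re);
    return matched;
}
```

For single-line input (such as the ASS source), anchors behave the same with or without the flag, which is consistent with the remark that only plaintext is affected.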
Pretty much identical to filter-regex but with JS expressions and
requires only JS support. Shares the filter-regex-* control options.
The target audience is Windows users - where filter-regex doesn't
work due to missing APIs, but mujs builds cleanly on Windows, and JS
is usually enabled in 3rd party Windows mpv builds.
Lua could have been used with similar effort, however, the JS regex
syntax is more extensive and also much more similar to POSIX.
Add two stand-alone functions to help with the text-extraction task
which ass filters need. Makes it easier to add new filters without
cargo-culting this functionality.
Currently, on a malformed event (which shouldn't happen), a warning is
printed when a filter tries to extract the text, so if a few filters
are enabled, we'll get multiple warnings (like before) - not critical.
The regex filter now uses these utils, the SDH filter not yet.
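For readers unfamiliar with the text-extraction task, a minimal sketch follows. It assumes events arrive in the libass chunk layout (ReadOrder, Layer, Style, Name, MarginL, MarginR, MarginV, Effect, Text), so the text starts after the 8th comma. The function name is hypothetical and is not one of the utilities this commit adds.

```c
#include <stddef.h>
#include <string.h>

// Assumed event layout:
//   ReadOrder,Layer,Style,Name,MarginL,MarginR,MarginV,Effect,Text
// Returns a pointer to the Text field inside `event`, or NULL if the event
// is malformed (fewer than 8 commas), which is where a caller would warn.
static const char *sketch_ass_event_text(const char *event)
{
    const char *p = event;
    for (int n = 0; n < 8; n++) {
        p = strchr(p, ',');
        if (!p)
            return NULL;
        p++; // step past the comma
    }
    return p;
}
```

Commas inside the text itself are harmless, because only the first 8 are consumed.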
See manpage additions. This was requested, sort of. Although what has
been requested might be something completely different. So this is
speculative.
This also changes sub_get_text() to return an allocated copy, because
the buffer shit was too damn messy.
Making OSD/subtitle bitmaps refcounted was planned a longer time ago,
e.g. the sub_bitmaps.packed field (which refcounts the subtitle bitmap
data) was added in 2016. But nothing benefited much from it, because
struct sub_bitmaps was usually stack allocated, and there was this weird
callback stuff through osd_draw().
Make it possible to get actually refcounted subtitle bitmaps on the OSD
API level. For this, we just copy all subtitle data other than the
bitmaps with sub_bitmaps_copy(). At first, I had planned some fancy
refcount shit, but when that was a big mess and hard to debug and just
boiled down to emulating malloc(), I made it a full allocation+copy. This
affects mostly the parts array. With crazy ASS subtitles, this parts
array can get pretty big (thousands of elements or more), in which case
the extra alloc/copy could become performance relevant. But then again
this is just pure bullshit, and I see no need to care. In practice, this
extra work most likely gets drowned out by libass murdering a single
core (while mpv is waiting for it) anyway. So fuck it.
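A minimal sketch of the "full allocation+copy" approach for the parts array. The struct and function names are hypothetical stand-ins, not mpv's struct sub_bitmaps or sub_bitmaps_copy(), and the refcounted bitmap data itself is left out entirely.

```c
#include <stdlib.h>
#include <string.h>

// Hypothetical stand-ins for a bitmap part and the container holding them.
struct sketch_part { int x, y, w, h; };

struct sketch_bitmaps {
    struct sketch_part *parts;
    int num_parts;
};

// Returns a newly allocated copy that owns its own parts array, or NULL on
// allocation failure. Release it with sketch_bitmaps_free().
static struct sketch_bitmaps *sketch_bitmaps_copy(const struct sketch_bitmaps *in)
{
    struct sketch_bitmaps *out = malloc(sizeof(*out));
    if (!out)
        return NULL;
    *out = *in;
    out->parts = NULL;
    if (in->num_parts > 0) {
        size_t size = (size_t)in->num_parts * sizeof(*in->parts);
        out->parts = malloc(size);
        if (!out->parts) {
            free(out);
            return NULL;
        }
        memcpy(out->parts, in->parts, size);
    }
    return out;
}

static void sketch_bitmaps_free(struct sketch_bitmaps *b)
{
    if (b)
        free(b->parts);
    free(b);
}
```

The cost is one allocation and one memcpy proportional to the number of parts, which is the overhead the paragraph above decides not to worry about.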
I just wanted this so draw_bmp.c requires only a single call to render
everything. VOs also can benefit from this, because the weird callback
shit isn't necessary anymore (simpler code), but I haven't done anything
about it yet. In general I'd hope this will work towards simplifying the
OSD layer, which is prerequisite for making actual further improvements.
I haven't tested some cases such as the "overlay-add" command. Maybe it
crashes now? Who knows, who cares.
In addition, it might be worthwhile to reduce the code duplication
between all the things that output subtitle bitmaps (with repacking,
image allocation, etc.), but that's orthogonal.
Works as ad-filter. I had some more plans, for example replacing
matching text with different text, but for now it's dropping matches
only. There's a big warning in the manpage that I might change
semantics. For example, I might turn it into a primitive sed.
In a sane world, you'd probably write a simple script that processes
downloaded subtitles before giving them to mpv, and avoid all this
complexity. But we don't live in a sane world, and the sooner you learn
this, the happier you will be. (But I also want to run this on muxed
subtitles.)
This is pretty straightforward. We use POSIX regexes, which are readily
available without additional pain or dependencies. This also means it's
(apparently) not available on win32 (MinGW). The regex list is because I
hate big monolithic regexes, and this makes it slightly better.
Very superficially tested.
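The "list of regexes, drop on match" behavior can be sketched as below. This is only an illustration with hypothetical names; in particular it compiles each pattern on every call for brevity, whereas a real filter would presumably compile the list once when the filter is set up.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

// Returns true if `text` matches any pattern in `patterns` (POSIX extended
// syntax), meaning the subtitle event should be dropped. Patterns that
// fail to compile are skipped.
static bool sketch_should_drop(const char *text, const char *const *patterns,
                               size_t num_patterns)
{
    for (size_t i = 0; i < num_patterns; i++) {
        regex_t re;
        if (regcomp(&re, patterns[i], REG_EXTENDED | REG_NOSUB) != 0)
            continue;
        int rc = regexec(&re, text, 0, NULL, 0);
        regfree(&re);
        if (rc == 0)
            return true;
    }
    return false;
}
```

Keeping several small patterns in a list, rather than one alternation, means each entry can be read and tested on its own.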
Until now, filter_sdh was simply a function that was called by sd_ass
directly (if enabled).
I want to add another filter, so it's time to turn this into a somewhat
more general subtitle filtering infrastructure.
I pondered whether to reuse the audio/video filtering stuff - but better
not. Also, since subtitles are horrible and tend to refuse proper
abstraction, it's still messed into sd_ass, instead of working on the
dec_sub.c level. Actually mpv used to have subtitle "filters" and even
made subtitle converters part of it, but it was fairly horrible, so
don't do that again.
In addition, make runtime changes possible. Since this was supposed to
be a quick hack, I just decided to put all subtitle filter options into
a separate option group (=> simpler change notification), to manually
push the change through the playloop (like it was sort of before for OSD
options), and to recreate the sub filter chain completely in every
change. Should be good enough.
One strangeness is that due to prefetching and such, most subtitle
packets (or those some time ahead) are actually done filtering when we
change, so the user still needs to manually seek to actually refresh
everything. And since subtitle data is usually cached in ASS_Track (for
other terrible but user-friendly reasons), we also must clear the
subtitle data, but of course only on seek, since otherwise all subtitles
would just disappear. What a fucking mess, but such is life. We could
trigger a "refresh seek" to make this more automatic, but I don't feel
like it currently.
This is slightly inefficient (lots of allocations and copying), but I
decided that it doesn't matter. Could matter slightly for crazy ASS
subtitles that render with thousands of events.
Not very well tested. Still seems to work, but I didn't have many test
cases.
Remove them from the big MPOpts struct and move them to their sub
structs. In the places where their fields are used, create a private
copy of the structs, instead of accessing the semi-deprecated global
option struct instance (mpv_global.opts) directly.
This actually makes accessing these options finally thread-safe. They
weren't, even though they should have been for years. (Including some potential
for undefined behavior when e.g. the OSD font was changed at runtime.)
This is mostly transparent. All options get moved around, but most users
of the options just need to access a different struct (changing sd.opts
to a different type changes a lot of uses, for example).
One thing which has to be considered and could cause potential
regressions is that the new option copies must be explicitly updated.
sub_update_opts() takes care of this for example.
Another thing is that writing to the option structs manually won't work,
because the changes won't be propagated to other copies. Apparently the
only affected case is the implementation of the sub-step command, which
tries to change sub_delay. Handle this one explicitly (osd_changed()
doesn't need to be called anymore, because changing the option triggers
UPDATE_OSD, and updates the OSD as a consequence). The way the option
value is propagated is rather hacky, but for now this will do.
Add subtitle filter to remove additions for deaf or hard-of-hearing
(SDH). This is for English, but may in part work for others too.
This is an ASS filter and the intention is that it can always be
enabled as by default it does not remove parts that may be normal text.
Harder filtering can be enabled with an additional option.
Signed-off-by: wm4 <wm4@nowhere>
The accepts_packet callback is supposed to deal with subtitle
decoders which have only a small queue of current subtitle events (i.e.
sd_lavc.c), in case feeding it too many packets would discard events
that are still needed.
Normally, the number of subtitles that need to be preserved is estimated
by the rendering pts (get_bitmaps() argument). Rendering lags behind
decoding, so normally the rendering pts is smaller than the next video
frame pts, and we simply discard all subtitle events until the rendering
pts.
This breaks down in some annoying corner cases. One of them is seeking
backwards: the VO will still try to render the old PTS during seeks,
which passes a high PTS to the subtitle renderer, which in turn would
discard more subtitles than it should. There is a similar issue with
forward seeks. Add hacks to deal with those issues.
There should be a better way to deal with the essentially unknown
"rendering position", which is made worse by screenshots or rendering
with vf_sub. At the very least, we could handle seeks better, and e.g.
either force the VO not to re-render subs after seeks (ugly), or
introduce seek sequence numbers to distinguish attempts to render
earlier subtitles when a seek is done.
The intention is to let mp_ass_packer_pack() produce different output
for the RGBA and LIBASS formats. VOs (or whatever generates the OSD)
currently do not signal a preferred format, and this mechanism just
exists to switch between RGBA and LIBASS formats correctly, preferring
LIBASS if the VO supports it.
Subtitles can be preloaded, which means they're fully read and copied
into ASS_Track. This in turn is mainly for the sake of being able to do
subtitle seeking (when it comes down to it, subtitle seeking is the
cause for most trouble here).
Commit a714f8e92 broke preloaded subtitles which have events with
unknown duration, such as some MicroDVD samples. The event list gets
cleared on every seek, so the property of being preloaded obviously gets
lost.
Fix this by moving most of the preloading logic to dec_sub.c. If the
subtitle list gets cleared, they are not considered preloaded anymore,
and the logic for demuxed subtitles is used.
As another minor thing, preloading subtitles neither disabled the
demux stream, nor did it discard packets. Thus you could get queue
overflows in theory (harmless, but annoying). Fix this by explicitly
discarding packets in preloaded mode.
In summary, now the only differences between preloaded and normal
demuxing are:
1. a seek is issued, and all packets are read on start
2. during playback, discard the packets instead of feeding them to the
subtitle decoder
This is still pretty annoying. It would be nice if the subtitle index
(and maybe a subtitle packet cache for instant subtitle presentation
when seeking back) could be maintained in the demuxer
instead. Half of all file formats with interleaved subtitles have
this anyway (mp4, mkv muxed with newer mkvmerge).
Commit 8d4a179c made subtitle decoders pick up fonts strictly from the
same source file (i.e. the same demuxer).
It breaks some fucked up use-case, and 2 people on this earth complained
about the change because of this. Add it back.
This copies all attached fonts on each subtitle init. I considered
converting attachments to use refcounting, but it'd probably be much
more complex.
Since it's slightly harder to get a list of active demuxers with
duplicates removed, the prev_demuxer variable serves as a hack to achieve
almost the same thing, except in weird corner cases. (In which fonts
could be added twice.)
This is mainly a refactor. I'm hoping it will make some things easier
in the future due to cleanly separating codec metadata and stream
metadata.
Also, declare that the "codec" field can not be NULL anymore. demux.c
will set it to "" if it's NULL when added. This gets rid of a corner
case everything had to handle, but which rarely happened.
Just simplify by removing parts not needed anymore. This includes
merging dec_sub allocation and initialization (since things making
initialization complicated were removed), or format support queries (it
simply tries to create a decoder, and if that fails, tries the next
one).
So that the video FPS is not required at initialization, and can be set
later.
(As for whether this MicroDVD crap is worth the trouble to handle it
"correctly": MicroDVD files are unfortunately still around, and in at
least one case using the video FPS seemed to help indeed.)
Keeping ASS_Renderers around for a potentially large number of subtitle
tracks could lead to excessive memory usage, especially since the libass
cache is broken (caches even unneeded data), and might consume up to
~500MB of memory for no reason.
This includes the case of switching ordered chapter boundaries. It will
now be recreated on each timeline part switch. This shouldn't be much of
a problem with modern libass. (Older libass versions use fontconfig for
memory fonts, and will be very slow to reinitialize memory fonts.)
Apparently, this was replaced by the SD_CTRL_SET_VIDEO_PARAMS set
dimensions. But I can't find out when this happened - possibly, these
fields were never used by sd_lavc.c, and only by the (long removed)
MPlayer dvdsub decoder.
It was stupid. The only thing that still effectively used it was
sd_lavc_conv - all other "filters" were the subtitle decoder/renderers
for text (sd_ass) and bitmap (sd_lavc) subtitles.
While having a subtitle filter chain was interesting (and actually
worked in almost the same way as the audio/video ones), I didn't
manage to use it in a meaningful way, and I couldn't e.g. factor
secondary features like fixing subtitle timing into filters.
Refactor the shit and drop unneeded things as it goes.
This affects non-ASS text subtitles (those which go through libavcodec's
subtitle converter), which are muxed with video/audio. (Typically srt
subs in mkv.)
The problem is that seeking in the file can send a subtitle packet to
the decoder multiple times. These packets are interleaved with video,
and thus can't be all read when opening the file. Rather, subtitle
packets can essentially be randomly skipped or repeated (by seeking).
Until recently, this was solved by scanning the libass event list for
duplicates. Then our builtin srt-to-ass converter was removed, and
the problem was handled by fully clearing the subtitle list on each
seek.
This resulted in sub-seek not working properly for this type of file.
Since the subtitle list was cleared on seek, it was not possible to
do e.g. sub-seeks to subtitles before the current playback position.
Fix this by not clearing the list, and instead explicitly rejecting
duplicate packets. We use the packet file position as unique ID for
subtitles; this is confirmed working for most file formats (although
it is slightly risky - new demuxers may not necessarily set the file
position to something unique, or at all).
The list of seen packets is sorted, and the lookup uses binary search.
This is to avoid quadratic complexity when subtitles are added in
bulk, such as when opening a text subtitle file.
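The duplicate rejection described above can be sketched as follows: a sorted array of seen file positions, with a binary search for both lookup and insertion point. Struct and function names are hypothetical, and aborting on allocation failure is a simplification for the sketch.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical container for the file positions of packets seen so far.
struct sketch_seen {
    int64_t *pos;   // kept sorted in ascending order
    size_t num;
    size_t cap;
};

// Returns true if `pos` was already recorded (the packet is a duplicate and
// should be rejected). Otherwise records it, keeping the array sorted, and
// returns false.
static bool sketch_check_seen(struct sketch_seen *s, int64_t pos)
{
    // Binary search; on a miss, `lo` ends up at the insertion index.
    size_t lo = 0, hi = s->num;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (s->pos[mid] == pos)
            return true;
        if (s->pos[mid] < pos)
            lo = mid + 1;
        else
            hi = mid;
    }
    if (s->num == s->cap) {
        size_t new_cap = s->cap ? s->cap * 2 : 16;
        int64_t *new_pos = realloc(s->pos, new_cap * sizeof(*new_pos));
        if (!new_pos)
            abort(); // simplification: no error path in this sketch
        s->pos = new_pos;
        s->cap = new_cap;
    }
    // Shift the tail up by one slot and insert at the sorted position.
    memmove(&s->pos[lo + 1], &s->pos[lo], (s->num - lo) * sizeof(*s->pos));
    s->pos[lo] = pos;
    s->num++;
    return false;
}
```

When packets arrive in increasing file position, as when a whole subtitle file is read at once, the insertion index is always the end of the array, so nothing is shifted and each check costs only the O(log n) search.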
In some places, the code has to be adjusted to pass through the packet
file position correctly.
With the FFmpeg subtitle decoder used for _all_ non-ASS text subtitle
formats, this code is simply unused now.
Ironically, the FFmpeg subtitle decoder does not handle things correctly
in a bunch of cases. Should it turn out they actually matter, they will
have to be hacked back.
The extend_event one is a candidate, although even though there were
allegedly files which need it, I couldn't get samples from the user who
originally reported such files. As such, extend_event was only confirmed
to handle trailing events with no (endless) duration like with MicroDVD
and LRC, but FFmpeg "fudges" these anyway, so no special handling is
needed.
This code also had logic to handle seeking with muxed srt subtitles,
which made the sub-seek command work. But this has been broken before
this commit already. Currently, seeking with muxed srt subs will clear
all subtitles, as the broken FFmpeg ASS format output by the libavcodec
subtitle converters does not check for duplicates. Since the subtitles
are all cleared, ass_step_sub() can not work properly and sub-seek can
not seek to already seen subtitles.
I feel like it's better there. Note that there is no reduced
functionality, as bitmap subs (i.e. not handled by sd_ass.c) were never
fully read on init, and thus never went through sub_read_all_packets().
On the other hand, this might lead to confusion, as --sub-fps etc. will
now also affect muxed subtitles (which makes not much sense).
Until now, feeding packets to the decoder in advance was done for text
subtitles only. This was possible because libass buffers all subtitle
data anyway (in ASS_Track). sd_lavc, responsible for bitmap subs, does
not do this. But it can buffer a small number of subtitle frames ahead.
Enable this.
Repurpose the sub_accept_packets_in_advance(). Instead of "can take all
packets" it means "can take 1 packet" now. (The old meaning is still
needed locally in dec_sub.c; keep it there.) It asks the decoder whether
there is place for at least 1 subtitle packet. sd_lavc implements it and
returns true if its internal fixed-size subtitle queue still has a free
slot. (The implementation of this in dec_sub.c isn't entirely clean.
For one, decode_chain() ignores this mechanism, so it's implied that
bitmap subtitles do not use the subtitle filter chain in any advanced
way.)
Also fix 2 bugs in the sd_lavc queue handling. Subtitles must be checked
in reverse, because the first entry will often have endpts==NOPTS, which
would always match. alloc_sub() must cycle the queue buffer, because it
reuses memory allocations (like sub.imgs) by design.
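The reverse-order check can be illustrated with the sketch below. Everything here is an assumption made for illustration: the queue size, the sentinel value, the idea that slot 0 holds the entry that often lacks an end time, and the exact matching condition. It is not sd_lavc's actual code.

```c
#include <stddef.h>

#define SKETCH_NOPTS (-1e300)   // stand-in for a "no timestamp" marker
#define SKETCH_MAX_QUEUE 4      // assumed fixed queue size

struct sketch_sub {
    int valid;
    double pts;
    double endpts;  // SKETCH_NOPTS means the end time is unknown
};

// Return the index of the queued subtitle to show at `pts`, or -1 if none.
// The queue is walked from the last slot down to slot 0, so an open-ended
// entry in slot 0 is only chosen if no other entry covers `pts`.
static int sketch_find_current(const struct sketch_sub *q, double pts)
{
    for (int n = SKETCH_MAX_QUEUE - 1; n >= 0; n--) {
        const struct sketch_sub *s = &q[n];
        if (!s->valid)
            continue;
        if (pts >= s->pts && (s->endpts == SKETCH_NOPTS || pts < s->endpts))
            return n;
    }
    return -1;
}
```

Walking forward from slot 0 instead would return the open-ended entry whenever it has started, even if another queued entry with a known end time also covers the requested pts.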
Each subtitle track gets its own decoder instance (sd_ass). But they use
a shared ASS_Renderer. This is done mainly because of fontconfig.
Initializing fontconfig is very slow when using it with memory fonts, so
there's a practical need to cache this memory font state, which is done
by not creating separate ASS_Renderers. This is very dirty and very
evil, but we probably can't get rid of it any time soon.
The shared ASS_Renderer was not properly synchronized. While the program
logic guarantees that only one sd_ass instance is visible at a time,
there are other interactions that require synchronization. In
particular, I suspect concurrent execution of mp_ass_configure_fonts()
and sd_ass.get_bitmaps cause issues in a newer libass development
branch.
So here's a shitty hack that hopefully fixes things, hopefully only
until libass becomes less dependent on fontconfig.
Instead of parsing the ASS file in demux_libass.c and trying to pass the
ASS_Track to the subtitle renderer, just read all file data in
demux_libass.c, and let the subtitle renderer pass the file contents to
ass_process_codec_private(). (This happens to parse full files too.)
Makes the code simpler, though it also relies harder on the (messy)
probe logic in demux_libass.c.
The mplayer decoder (spudec.c) actually handled this. There was explicit
code for binary palettes (16 32 bit values), and the subtitle resolution
was handled by video resolution coincidentally matching the subtitle
resolution.
Whoever puts vobsub into mp4 should be punished.
Fixes the sample gundam_sample.mp4, closes github issue #547.
This means the direct libass usage can be removed from command.c, and no
weird hacks for retrieving the ASS_Track are needed.
Also fix a bug when using this feature with ordered chapters.
When e.g. converting SRT to ASS, we certainly don't want them stretched
by video aspect ratio, even if that's necessary for native ASS
subtitles.
Annoying weird details...
This mirrors commit "sub: remove check_duplicate_plaintext_event()".
That code was basically duplicated. In general, this code is still
needed when doing conversion during demuxing (mostly because you can
seek during demuxing, which will cause duplicate events by replaying).
This allows using some formats that were not supported until now, like
WebVTT.
We still prefer the internal subtitle reader (subreader.c), because
1. Libav, and 2. random things which we probably want to keep, such as
control over formatting, codepage stuff, or various mysterious
postprocessing done in that code.
This means subassconvert.c is split in sd_srt.c and sd_microdvd.c. Now
this code is involved in the sub conversion chain like sd_movtext is.
The invocation of the converter in sd_ass.c is removed.
This requires some other changes to make the new sub converter code work
with loading external subtitles. Until now, subtitles loaded via
subreader.c were assumed to be in plaintext, or for some formats, in ASS
(except in -no-ass mode). Then these were added to an ASS_Track. Change
this so that subtitles are always in their original format (as far as
decoders/converters for them are available), and turn every sub event
read by subreader.c into a packet for the dec_sub.c subtitle chain.
This removes differences between external/demuxed and -ass/-no-ass code
paths further.
Add a basic infrastructure for subtitle converters. These converters
work sort-of like decoders, except that they produce packets instead
of subtitle bitmaps. They are put in front of actual decoders.
Start with sd_movtext. 4 lines of code are blown up to a 55 lines file,
but fortunately this is not going to be that bad for the following
converters.
Make the sub decoder stuff independent from sh_sub (except for
initialization of course). Sub decoders now access a struct sd only,
instead of getting access to sh_sub. The glue code in dec_sub.c is
similarly independent from osd.
Some simplifications are made. For example, the switch_id stuff is
unneeded: the frontend code just has to make sure to call osd_changed()
any time subtitles are switched.
This is also preparation for introducing subtitle converters. It's much
cleaner to completely separate demuxer header/renderer glue/decoders
for this purpose, especially since sub converters might completely
change how demuxer headers have to be interpreted.
Also pass data as demux_packets. Currently, this doesn't help much, but
libavcodec converters might need scary stuff like packet side data, so
it's perhaps better to go with passing packets.
The -no-ass switch used to disable any use of libass for text subtitles.
This is not really the case anymore, because libass is now always
involved when rendering text. The only remaining use of -no-ass is
disabling styling or showing subtitles on the terminal. On the other
hand, the old subtitle rendering path is a big reason why the subtitle
code is still a big mess with an awful number of obscure special cases.
In order to simplify it, remove the old subtitle rendering code, and
always go through sd_ass.c. Basically, we use ASS_Track as central data
structure for storing text subtitles instead of struct sub_data. This
also makes libass mandatory for all text subs, even if they are printed
to the terminal in -no-video mode. (We could add something like sd_text
to avoid this, but it's not worth the trouble.)
struct sub_data and subreader.c are still around, even its ASS/SSA
reader. But struct sub_data is freed right after converting it to
ASS_Track. The internal ASS reader actually can handle some obscure
cases libass can't, like files encoded in UTF-16.