Should we actually get into trouble for improper handling of
frame-based subtitle formats, this might be the simplest way to
work around it. It is also a bit more intuitive than -subfps, which
might use an unknown, misdetected, or nonsensical video FPS.
Still pretty silly, though.
Before this commit, SRT demuxing and display actually happened to work
on Libav. But it was using the libavcodec srt converter (which is
essentially unmaintained in Libav), and timing postprocessing didn't
work. For some background explanations see sd_lavf_srt.c.
Until now, timing and charset recoding postprocessing was applied on
packets as they were output by the demuxer, and then passed to the
decoders. Make it so that postprocessing can happen after some decoders
in special situations.
This code was once part of subreader.c, then traveled to libass, and now
made its way back to the fork of the fork of the original code, MPlayer.
It works pretty much the same as subreader.c, except that we have to
concatenate some packets to do auto-detection. This is rather annoying,
but for all we know the actual source file could be a binary format.
Unlike subreader.c, the iconv context is reopened on each packet. This
is simpler, and with respect to multibyte encodings, more robust.
Reopening is probably not very fast, but I suspect subtitle charset
conversion is not an operation that happens often or has to be fast.
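A minimal sketch of the per-packet approach, assuming POSIX iconv and a
UTF-8 target. recode_packet() is a hypothetical name, not the function
used in the actual code. The context is opened and closed inside the
call, so no conversion state survives from one packet to the next:

```c
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

// Convert one packet from codepage cp to UTF-8.
// Returns a malloc'ed, 0-terminated string, or NULL on failure.
static char *recode_packet(const char *cp, const char *in, size_t in_len)
{
    iconv_t cd = iconv_open("UTF-8", cp);   // reopened for every packet
    if (cd == (iconv_t)-1)
        return NULL;
    size_t out_size = in_len * 4 + 1;       // generous bound for UTF-8 output
    char *out = malloc(out_size);
    if (!out) {
        iconv_close(cd);
        return NULL;
    }
    char *ip = (char *)in;
    char *op = out;
    size_t ileft = in_len, oleft = out_size - 1;
    size_t r = iconv(cd, &ip, &ileft, &op, &oleft);
    iconv_close(cd);
    if (r == (size_t)-1) {                  // invalid or truncated input
        free(out);
        return NULL;
    }
    *op = '\0';
    return out;
}
```

A broken multibyte sequence in one packet then only fails that packet.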
Also, this auto-detection is disabled for microdvd - this is the only
format we know that has binary data in its packets, but is actually
decoded to text. FFmpeg doesn't really allow us to solve this properly,
because a) the input packets can be binary, and b) the output will be
checked whether it's UTF-8, and if it's not, the output is thrown away
and an error message is printed. We could just recode the decoded
subtitles before sd_ass if it weren't for that.
demux_libass.c allows us to make subtitle format detection part of the
normal file loading process. libass has no probe function, but trying to
load the start of a file (the first 4 KB) is good enough. Hope that
libass can even handle random binary input gracefully without printing
stupid log messages, and that the libass parser doesn't accept too many
non-ASS files as input.
This doesn't handle the -subcp option correctly yet. This will be fixed
later.
subreader.c (before this commit renamed to demux_subreader.c) was
special cased to the -sub option. The plan is using the normal demuxer
codepath for all subtitle formats (so we can prefer libavformat demuxers
for most formats).
There are some subtle changes. The probe size is restricted to 32 KB
(instead of unlimited + giving up after 100 lines of input). For
formats like MicroDVD, the video FPS isn't used anymore, because it's
not available on the subtitle demuxer level. Instead, hardcode it to
23.976 FPS (libavformat seems to do the same). The user can probably
still use -subfps to fix the timing. Checking the file extension for
".utf"/".utf8"/".utf-8" is simply removed (seems worthless, was in the
way, and I've never seen this anywhere).
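To illustrate the hardcoded FPS: a frame-based format only stores frame
numbers, so some rate is needed to get timestamps. A rough sketch
(frame_to_seconds() and FALLBACK_FPS are hypothetical names, not taken
from the code):

```c
#include <stdint.h>

// Fallback rate used when the video FPS is not known at the demuxer level.
#define FALLBACK_FPS 23.976

// Convert a frame number to seconds. A non-positive fps selects the fallback.
static double frame_to_seconds(int64_t frame, double fps)
{
    if (fps <= 0)
        fps = FALLBACK_FPS;
    return (double)frame / fps;
}
```

If the real video runs at a different rate, all timestamps are off by
the ratio of the two rates, which is what the FPS override corrects.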
Actually check the newly added text for whitespace, and not the
uninitialized buffer after it. Also, if an event is only whitespace,
don't add it at all.
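The intended check, roughly (a sketch with hypothetical names and a
simplified buffer layout, not the actual code). The important part is
that the whitespace test starts at buf + len, where the new text begins,
and not at the end of the new text:

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

static bool is_only_whitespace(const char *text, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (!isspace((unsigned char)text[i]))
            return false;
    }
    return true;
}

// Append add to buf (capacity cap, current length len).
// Whitespace-only text is dropped. Returns the new length.
static size_t append_event_text(char *buf, size_t cap, size_t len,
                                const char *add)
{
    size_t add_len = strlen(add);
    if (len + add_len + 1 > cap)
        return len;                         // doesn't fit, ignore
    memcpy(buf + len, add, add_len + 1);
    // Check the text that was just added, not what follows it.
    if (is_only_whitespace(buf + len, add_len)) {
        buf[len] = '\0';
        return len;
    }
    return len + add_len;
}
```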
sd_ass contains some code that treats subtitle events with duration 0
specially, and adjusts their duration so that they will disappear with
the next event.
This is most likely not needed anymore. Some subtitle formats allow
omitting the duration so that the event is visible until the next one,
but both subreader.c as well as libavformat subtitle demuxers already
handle this.
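For reference, the special-casing in question amounts to something like
the following sketch (struct sub_event and extend_zero_durations() are
hypothetical, and the events are assumed to be sorted by start time):

```c
#include <stddef.h>

struct sub_event {
    double start;
    double duration;
};

// Make every zero-duration event last until the start of the next event.
// The last event is left alone, since there is nothing to extend it to.
static void extend_zero_durations(struct sub_event *ev, size_t n)
{
    for (size_t i = 0; i + 1 < n; i++) {
        if (ev[i].duration == 0)
            ev[i].duration = ev[i + 1].start - ev[i].start;
    }
}
```

Note that this cannot tell an omitted duration from an intentional
duration of 0.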
Subtitles embedded in mp4 files (movtext) used to trigger this code. But
these files appear to export subtitle duration correctly (at least
libavcodec's movtext decoder is using this assumption). Since commit
6dbedd2 changed demux_lavf to actually copy the packet duration field,
the code removed with this commit isn't needed anymore for correct
display of movtext subtitles. (The change in sd_movtext is for dropping
empty subtitle events, which would now be "displayed" - libavcodec does
the same.)
On the other hand, this code incorrectly displayed hidden events in .srt
subtitles. See for example the first event in SubRip_capability_tester.srt
(part of FFmpeg's FATE). These intentionally have a duration of 0, and
should not be displayed. (As of this commit, they are still
displayed in external .srt subs because of subreader.c hacks.)
However, we can't be 100% sure that this code is really unneeded, so
just comment out the code. Hopefully it can be removed if there are no
regressions after some weeks or months.
Currently, we are filtering libavformat style ASS packets by checking
whether they are prefixed "Dialogue: ". Unfortunately, comment packets
are demuxed too. These start with "Comment: ", so they are not caught.
Change the filtering, and use the codec ID instead. libavformat uses
"ssa" as codec ID for ASS subtitles, while mpv uses "ass". Also, at
least FFmpeg will change the ASS packet format to the same format mpv
and Matroska use, and identify these with "ass" as codec ID, so this
works out nicely.
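The difference between the two checks, as a sketch (both function names
are made up for illustration, and the packet payloads are shortened):

```c
#include <stdbool.h>
#include <string.h>

// Old check: look at the packet contents. Misses "Comment: " packets.
static bool has_dialogue_prefix(const char *packet)
{
    return strncmp(packet, "Dialogue: ", 10) == 0;
}

// New check: look at the codec ID only. "ssa" means libavformat style
// packets, "ass" means the format mpv and Matroska use.
static bool is_lavf_style_ass(const char *codec)
{
    return strcmp(codec, "ssa") == 0;
}
```

With the codec ID, the decision no longer depends on what an individual
packet starts with.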
Some of this (fixing timing) is now done in dec_sub.c (although it's
not active for subreader.c code yet - this will be fixed when
subreader.c subs are read through a demuxer wrapper).
Another reason to remove this is that this code doesn't do much good
anymore. libass does handle overlap, and trying to fold overlapping
lines into single subtitle events will prevent libass from handling
this properly.
This fixes the -subfps option (which unfortunately is still useful),
and fixes minor annoying timing errors (which unfortunately still
happen).
Note that none of these affect ASS or image subtitles. ASS is specially
handled: libass loads subtitles as ASS_Track. There are no actual
packets passed around, and sd_ass just uses the ASS_Track.
Disable the --sub-no-text-pp option. It's misleading now and always was
completely useless.
If a subtitle is external, read it completely and add all subtitle
events in advance when the subtitle track is selected. This is done
for text subtitles only. (Note that subreader.c and subtitles loaded
with libass are different and don't have anything to do with this
commit.)
Seems like a completely unnecessary complication. Instead, always add a
1 byte padding (could be extended if a caller needs it), and clear it.
Also add some documentation. There was some, but it was outdated and
incomplete.
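The padding idea as a standalone sketch (copy_with_padding() is a
hypothetical helper, not the function this commit changes):

```c
#include <stdlib.h>
#include <string.h>

// Copy len bytes into a new buffer that always has 1 extra byte,
// which is set to 0. Returns NULL if allocation fails.
static unsigned char *copy_with_padding(const void *data, size_t len)
{
    unsigned char *buf = malloc(len + 1);
    if (!buf)
        return NULL;
    if (len)
        memcpy(buf, data, len);
    buf[len] = 0;   // cleared padding byte
    return buf;
}
```

Callers that treat the data as text can rely on the terminating 0
without having to ask for it.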
This function was called in various places. Most of the time, it was used
before a seek. In other cases, the purpose was apparently resetting
the EOF flag. As far as I can see, this makes no sense anymore. At
least the stream_reset() calls paired with stream_seek() are completely
pointless. A seek will either seek inside the buffer (and reset the
EOF flag), or do an actual seek and reset all state.
Both converters can output \pos and deal with font sizes, so they assume
a specific script resolution (PlayResX/PlayResY). The implicit
assumption was that a specific resolution was guaranteed. The
MP_ASS_FONT_PLAYRESY constant is connected to this.
Better make it explicit, so that the implicit dependency on
MP_ASS_FONT_PLAYRESY is removed. (Unfortunately, libavcodec sub
converters still don't set PlayResX/PlayResY explicitly, so the value
set by that constant can't be declared as arbitrary yet.)
PlayResY=288 is most likely the SSA natural script resolution (or
something like this?), as well as the libass and VSFilter default.
PlayResX=384 is the fallback value set by libass if PlayResY is set to
288, and PlayResX is unset.
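The 384/288 pair is consistent with a 4:3 fallback. The sketch below
assumes exactly that and nothing more. It is not a copy of what libass
does, and fill_playres() is a hypothetical name:

```c
struct script_res {
    int x, y;
};

// Fill in missing PlayResX/PlayResY values (<= 0 means unset),
// assuming a 4:3 script and 384x288 if nothing is set at all.
static struct script_res fill_playres(int x, int y)
{
    if (x <= 0 && y <= 0) {
        x = 384;
        y = 288;
    } else if (x <= 0) {
        x = y * 4 / 3;
    } else if (y <= 0) {
        y = x * 3 / 4;
    }
    return (struct script_res){x, y};
}
```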
The default style is added by mp_ass_default_track(), but not by
ass_new_track(). Considering this, the previous condition at this point
didn't make much sense anymore: the actual (converted) subtitle format
doesn't matter much for what styling should be applied. What matters is
if the subtitle was originally ASS, or if it was converted to it.
Change the code such that the default style is added if there aren't
any, even after reading sub extradata. (The extradata contains the ASS
header, including the style section.) This might change behavior with
scripts that don't define any styles. The change is either with this
commit or with an earlier commit in this branch, depending on the
situation - there are multiple places where default styles are added
in libass API functions, and it's all a big mess.
Other than with very old or broken files (where different behavior
doesn't matter much), the current code should be pretty safe, though.
Audio and video had their own (very similar) functions to initialize an
AVPacket (ffmpeg's packet struct) from a demux_packet (mplayer's packet
struct). Add a common function for these.
Also use this function for sd_lavc_conv. This is actually a functional
change, as some libavformat subtitle demuxers add weird out-of-band
stuff as side-data.
When e.g. converting SRT to ASS, we certainly don't want them stretched
by video aspect ratio, even if that's necessary for native ASS
subtitles.
Annoying weird details...
This mirrors commit "sub: remove check_duplicate_plaintext_event()".
That code was basically duplicated. In general, this code is still
needed when doing conversion during demuxing (mostly because you can
seek during demuxing, which will cause duplicate events by replaying).
Normally, libavcodec subtitle converters will output a style header like
this as part of the extradata:
Style: Default,Arial,16,&Hffffff,&Hffffff,&H0,&H0,0,0,0,1,1,0,2,10,10,10,0,0
We don't want that, so use some bruteforce to get rid of them.
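The brute force approach could look like this sketch, which drops every
line of the header that starts with "Style: Default," (the function name
is hypothetical, and the real code may match differently):

```c
#include <string.h>

// Remove, in place, all lines starting with "Style: Default,".
static void strip_default_style(char *header)
{
    const char *prefix = "Style: Default,";
    size_t plen = strlen(prefix);
    char *src = header;
    char *dst = header;
    while (*src) {
        char *eol = strchr(src, '\n');
        // Length of the current line, including the newline if present.
        size_t line_len = eol ? (size_t)(eol - src) + 1 : strlen(src);
        if (strncmp(src, prefix, plen) != 0) {
            memmove(dst, src, line_len);    // keep this line
            dst += line_len;
        }
        src += line_len;
    }
    *dst = '\0';
}
```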
Otherwise this could happily open decoders for image subtitles or even
audio/video decoders. AV_CODEC_PROP_TEXT_SUB is a preprocessor symbol,
but it's still better to detect this properly instead of using #ifdef,
because these flags might as well be changed into enums sooner or later.
This allows using some formats that were not supported until now, like
WebVTT.
We still prefer the internal subtitle reader (subreader.c), because
1. Libav, and 2. random things which we probably want to keep, such as
control over formatting, codepage stuff, or various mysterious
postprocessing done in that code.
This means subassconvert.c is split in sd_srt.c and sd_microdvd.c. Now
this code is involved in the sub conversion chain like sd_movtext is.
The invocation of the converter in sd_ass.c is removed.
This requires some other changes to make the new sub converter code work
with loading external subtitles. Until now, subtitles loaded via
subreader.c were assumed to be in plaintext, or for some formats, in ASS
(except in -no-ass mode). Then these were added to an ASS_Track. Change
this so that subtitles are always in their original format (as far as
decoders/converters for them are available), and pass every sub event
read by subreader.c as a packet to the dec_sub.c subtitle chain.
This removes differences between external/demuxed and -ass/-no-ass code
paths further.
Add a basic infrastructure for subtitle converters. These converters
work sort-of like decoders, except that they produce packets instead
of subtitle bitmaps. They are put in front of actual decoders.
Start with sd_movtext. 4 lines of code are blown up to a 55-line file,
but fortunately this is not going to be that bad for the following
converters.
Make the sub decoder stuff independent from sh_sub (except for
initialization of course). Sub decoders now access a struct sd only,
instead of getting access to sh_sub. The glue code in dec_sub.c is
similarly independent from osd.
Some simplifications are made. For example, the switch_id stuff is
unneeded: the frontend code just has to make sure to call osd_changed()
any time subtitles are switched.
This is also preparation for introducing subtitle converters. It's much
cleaner to completely separate demuxer header/renderer glue/decoders
for this purpose, especially since sub converters might completely
change how demuxer headers have to be interpreted.
Also pass data as demux_packets. Currently, this doesn't help much, but
libavcodec converters might need scary stuff like packet side data, so
it's perhaps better to go with passing packets.
Subtitle files are opened in mplayer.c, not using the demuxer
infrastructure in general. Pretend that this is not the case (outside of
the loading code) by opening a pseudo demuxer that does nothing. One
advantage is that the initialization code is now the same, and there's
no confusion about what the difference between track->stream,
track->sh_sub and mpctx->sh_sub is supposed to be.
This is a bit stupid, and it would be much better if there were proper
subtitle demuxers (there are many in recent FFmpeg, but not Libav). So
for now this is just a transition to a more proper architecture. Look
at demux_sub like an artificial limb: it's ugly, but don't hate it - it
helps you to get on with your life.
This was broken with 84829a4 "Merge branch 'osd_changes' into master".
The new OSD/subtitle code never respected the --sub-forced-only option,
and the old code containing the code for this was removed in fd5c4a1.
This unifies the subtitle rendering path. Now all subtitle rendering
goes through sd_ass.c/sd_lavc.c/sd_spu.c.
Before that commit, the spudec.h functions were used directly in
mplayer.c, which introduced many special cases. Add sd_spu.c, which is
just a small wrapper connecting the new subtitle render API with the
dusty old vobsub decoder in spudec.c.
One detail that changes is that we always pass the palette as extra
data, instead of passing the libdvdread palette as pointer to spudec
directly. This is a bit roundabout, but actually makes the code simpler
and more elegant: the difference between DVD and non-DVD dvdsubs is
reduced.
Ideally, we would just delete spudec.c and use libavcodec's DVD sub
decoder. However, DVD playback with demux_mpg produces packets
incompatible with lavc. There are incompatibilities the other way around
as well: packets from libavformat's vobsub demuxer are incompatible with
spudec.c. So we define a new subtitle codec name for demux_mpg subs,
"dvd_subtitle_mpg", which only sd_spu can decode.
There is actually code in spudec.c to "assemble" fragments into complete
packets, but using the whole spudec.c is easier than trying to move this
code into demux_mpg to fix subtitle packets.
As additional complication, Libav 9.x can't decode DVD subs correctly,
so use sd_spu in that case as well.
This was once needed to handle subtitle packets coming from a demuxer,
where seeking back might repeat previous events. This doesn't happen
anymore, and this code is used to convert complete files. So if there
are any duplicate lines, they must have been duplicated in the file,
and the old subtitle renderer would have shown them twice as well.
Today checking for duplicate events happens in sd_ass.c (and has been
for a while). There's no reason to keep this code, and it actually
causes trouble. Loading big subtitle files is extremely slow because
this makes adding n subtitles O(n^2).
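Where the O(n^2) comes from, as a sketch (struct event and
add_with_dup_check() are hypothetical stand-ins for the removed code).
Each added event is compared against every event already in the list:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct event {
    double start, duration;
    const char *text;
};

static bool same_event(const struct event *a, const struct event *b)
{
    return a->start == b->start && a->duration == b->duration &&
           strcmp(a->text, b->text) == 0;
}

// Add ev to list (which must have room) unless it is a duplicate.
// Counts comparisons in *compares. Returns the new number of events.
static size_t add_with_dup_check(struct event *list, size_t n,
                                 struct event ev, size_t *compares)
{
    for (size_t i = 0; i < n; i++) {
        (*compares)++;
        if (same_event(&list[i], &ev))
            return n;
    }
    list[n] = ev;
    return n + 1;
}
```

Adding n distinct events this way takes n * (n - 1) / 2 comparisons in
total.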
The -no-ass switch used to disable any use of libass for text subtitles.
This is not really the case anymore, because libass is now always
involved when rendering text. The only remaining use of -no-ass is
disabling styling or showing subtitles on the terminal. On the other
hand, the old subtitle rendering path is a big reason why the subtitle
code is still a big mess with an awful number of obscure special cases.
In order to simplify it, remove the old subtitle rendering code, and
always go through sd_ass.c. Basically, we use ASS_Track as central data
structure for storing text subtitles instead of struct sub_data. This
also makes libass mandatory for all text subs, even if they are printed
to the terminal in -no-video mode. (We could add something like sd_text
to avoid this, but it's not worth the trouble.)
struct sub_data and subreader.c are still around, even its ASS/SSA
reader. But struct sub_data is freed right after converting it to
ASS_Track. The internal ASS reader actually can handle some obscure
cases libass can't, like files encoded in UTF-16.
These were found by the cppcheck and scan-build static analyzers. Most
of these aren't interesting (the 2 previous commits fix some interesting
cases found by these analyzers), and they don't nearly fix all warnings.
(Most of the unfixed warnings are spam, things MPlayer never cared
about, or false positives.)
"%[,.:]" conversion was used with a buffer that could be shorter than
the matched string. Suppress assignment of the conversion since the
value wasn't used anyway, and also limit match length to 1 as it
doesn't look like the intent was to match longer runs of the
characters.
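A sketch of the fixed conversion on a simplified timestamp format
(parse_timestamp() is a hypothetical helper, and the real format strings
are longer). "*" suppresses the assignment, so no buffer is needed, and
the width of 1 limits the match to a single separator character:

```c
#include <stdio.h>

// Parse "H:M:S<sep>MS", where <sep> is exactly one of ',', '.' or ':'.
// Returns 1 on success, 0 otherwise.
static int parse_timestamp(const char *s, int *h, int *m, int *sec, int *ms)
{
    // The suppressed conversion doesn't count towards the return value,
    // so 4 assigned fields means a full match.
    return sscanf(s, "%d:%d:%d%*1[,.:]%d", h, m, sec, ms) == 4;
}
```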
Merged from mplayer2 commit 5cb9aac. Note that the other half of the
mplayer2 commit is already part of the mpv commit d98e61e. (I'm not
sure why. The mplayer2 commit date precedes mpv's, but was pushed long
after the mpv change was pushed; either one of the dates is wrong, or
we did the same work twice - in that case, thanks a lot...)