We certainly don't want to maintain and improve the internal converter,
but we still need the internal one for Libav. (In the Libav case,
demux_subreader.c will be used to read the MicroDVD file.)
While I'm not very fond of "const", it's important for declarations
(it decides whether a symbol is emitted in a read-only or read/write
section). Fix all these cases, so we have writeable global data only
when we really need.
Instead of parsing the ASS file in demux_libass.c and trying to pass the
ASS_Track to the subtitle renderer, just read all file data in
demux_libass.c, and let the subtitle renderer pass the file contents to
ass_process_codec_private(). (This happens to parse full files too.)
Makes the code simpler, though it also relies harder on the (messy)
probe logic in demux_libass.c.
The mplayer decoder (spudec.c) actually handled this. There was explicit
code for binary palettes (16 32 bit values), and the subtitle resolution
was handled by video resolution coincidentally matching the subtitle
resolution.
Whoever puts vobsub into mp4 should be punished.
Fixes the sample gundam_sample.mp4, closes github issue #547.
The plan is to make the whole OSD thread-safe, and we start with this.
We just put locks on all entry points (fortunately, dec_sub.c and all
sd_*.c decoders are very closed off, and only the entry points in
dec_sub.h let you access it). I think this is pretty ugly, but at least
it's very simple.
There's a special case with sub_get_bitmaps(): this function returns
pointers to decoder data (specifically, libass images). There's no way
to synchronize this internally, so expose sub_lock/sub_unlock functions.
To make things simpler, and especially because the lock is sort-of
exposed to the outside world, make the locks recursive. Although the
only case where this is actually needed (although trivial) is
sub_set_extradata().
One corner case are ASS subtitles: for some reason, we keep a single
ASS_Renderer instance for subtitles around (probably to avoid rescanning
fonts with ordered chapters), and this ASS_Renderer instance is not
synchronized. Also, demux_libass.c loads ASS_Track objects, which are
directly passed to sd_ass.c. These things are not synchronized (and
would be hard to synchronize), and basically we're out of luck. But I
think for now, accesses happen reasonably serialized, so there is no
actual problem yet, even if we start to access OSD from other threads.
Subtitle formats with frame based timing require using the video FPS to
compute proper subtitle timestamps. But it looks like the calculation to
do that was inversed.
Since m_option.h and options.h are extremely often included, a lot of
files have to be changed.
Moving path.c/h to options/ is a bit questionable, but since this is
mainly about access to config files (which are also handled in
options/), it's probably ok.
Use sh_stream over sh_sub. Use dec_sub (and mpctx->d_sub) instead of the
stream header. This aligns the subtitle code with the recent audio and
video refactoring.
sh_sub still has the decoder context, though. This is because we want to
avoid reinit when switching segments with ordered chapters. (Reinit is
fast, except for creating the ASS_Renderer, which in turn triggers
fontconfig.) Not sure how much this matters, though, because the initial
segment switch will lazily initialize the decoder anyway.
The configure followed 5 different convetions of defines because the next guy
always wanted to introduce a new better way to uniform it[1]. For an
hypothetic feature 'hurr' you could have had:
* #define HAVE_HURR 1 / #undef HAVE_DURR
* #define HAVE_HURR / #undef HAVE_DURR
* #define CONFIG_HURR 1 / #undef CONFIG_DURR
* #define HAVE_HURR 1 / #define HAVE_DURR 0
* #define CONFIG_HURR 1 / #define CONFIG_DURR 0
All is now uniform and uses:
* #define HAVE_HURR 1
* #define HAVE_DURR 0
We like definining to 0 as opposed to `undef` bcause it can help spot typos
and is very helpful when doing big reorganizations in the code.
[1]: http://xkcd.com/927/ related
Broken UTF-8 in this context means we treat it as UTF-8, but we also
interpret broken UTF-8 sequences as Latin1.
Also, run our own UTF-8 check function before the charset detectors.
This prevents from ENCA's UTF-8 check possibly messing up (like
detecting 7-bit clean UTF-8 as ASCII, or other things). It also takes
care of UTF-8 detection if no charset detector (ENCA, libguess) is
compiled in, and it lets us deal better with cut-off UTF-8 sequences.
The fix_overlaps_and_gaps() function in dec_sub.c fixes small gaps or
overlaps between subtitle events. However, sometimes it could happen
that the corrected subtitle events could overlap by 1ms due to bad
rounding, making libass shift subtitles to reduce collisions. (The
second subtitle will be shown above the previous one, even if both
subtitles are visible only for 1ms.)
sd_ass.c rounds the timestamps when converting to integers for unknown
reasons. I think it would work fine without that rounding, but since
I have no clue why it rounds, and since it could be needed to ensure
correct timestamps with ASS subtitles demuxed from Matroska, I'd rather
not touch it. So the solution is to use already rounded timestamps to
calculate the new subtitle duration in fix_overlaps_and_gaps().
See github issue #182.
Partial packet reads were needed because the video/audio parsers were
working on top of them. So it could happen that a parser read a part of
a packet, and returned that to the decoder. With libavformat/libavcodec,
packets are already parsed, and everything is much simpler.
Most of the simplifications in ad_spdif could have been done earlier.
Remove some other stuff as well, like the questionable slave mode start
time reporting (could be replaced by proper code, but we don't bother).
Remove the unused skip_audio_frame() functionality as well (it was used
by old demuxers). Some functions become private to demux.c, like
demux_fill_buffer(). Introduce new packet read functions, which have
simpler semantics. Packets returned from them are owned by the caller,
and all packets in the demux.c packet queue are considered unread.
Remove special code that dropped subtitle packets with size 0. This
used to be needed because it caused special cases in the old code.
Delete demux_avi, demux_asf, demux_mpg, demux_ts. libavformat does
better than them (except in rare corner cases), and the demuxers have
a bad influence on the rest of the code. Often they don't output
proper packets, and require additional audio and video parsing. Most
work only in --no-correct-pts mode.
Remove them to facilitate further cleanups.
This means the direct libass usage can be removed from command.c, and no
weird hacks for retrieving the ASS_Track are needed.
Also fix a bug when using this feature with ordered chapters.
Should we actually get into trouble for unproper handling of
frame-based subtitle formats, this might be the simplest way to
work this around. Also is a bit more intuitive than -subfps, which
might use an unknown, misdetected, or non-sense video FPS.
Still pretty silly, though.
Before this commit, SRT demuxing and display actually happened to work
on Libav. But it was using the libavcodec srt converter (which is
essentially unmaintained in Libav), and timing postprocessing didn't
work. For some background explanations see sd_lavf_srt.c.
Until now, timing and charset recoding postprocessing was applied on
packets as they were output by the demuxer, and then passed to the
decoders. Make it so that postprocessing can happen after some decoders
in special situations.
This code was once part of subreader.c, then traveled to libass, and now
made its way back to the fork of the fork of the original code, MPlayer.
It works pretty much the same as subreader.c, except that we have to
concatenate some packets to do auto-detection. This is rather annoying,
but for all we know the actual source file could be a binary format.
Unlike subreader.c, the iconv context is reopened on each packet. This
is simpler, and with respect to multibyte encodings, more robust.
Reopening is probably not a very fast, but I suspect subtitle charset
conversion is not an operation that happens often or has to be fast.
Also, this auto-detection is disabled for microdvd - this is the only
format we know that has binary data in its packets, but is actually
decoded to text. FFmpeg doesn't really allow us to solve this properly,
because a) the input packets can be binary, and b) the output will be
checked whether it's UTF-8, and if it's not, the output is thrown away
and an error message is printed. We could just recode the decoded
subtitles before sd_ass if it weren't for that.
subreader.c (before this commit renamed to demux_subreader.c) was
special cased to the -sub option. The plan is using the normal demuxer
codepath for all subtitle formats (so we can prefer libavformat demuxers
for most formats).
There are some subtle changes. The probe size is restricted to 32 KB
(instead of unlimitted + giving up after 100 lines of input). For
formats like MicroDVD, the video FPS isn't used anymore, because it's
not available on the subtitle demuxer level. Instead, hardcode it to
23.976 FPS (libavformat seems to do the same). The user can probably
still use -sub-fps to fix the timing. Checking the file extension for
".utf"/".utf8"/".utf-8" is simply removed (seems worthless, was in the
way, and I've never seen this anywhere).
This fixes the -subfps option (which unfortunately is still useful),
and fixes minor annoying timing errors (which unfortunately still
happen).
Note that none of these affect ASS or image subtitles. ASS is specially
handled: libass loads subtitles as ASS_Track. There are no actual
packets passed around, and sd_ass just uses the ASS_Track.
Disable the --sub-no-text-pp option. It's misleading now and always was
completely useless.
If a subtitle is external, read it completely and add all subtitle
events in advance when the subtitle track is selected. This is done
for text subtitles only. (Note that subreader.c and subtitles loaded
with libass are different and don't have anything to do with this
commit.)
When e.g. converting SRT to ASS, we certainly don't want them stretched
by video aspect ratio, even if that's necessary for native ASS
subtitles.
Annoying weird details...
This mirrors commit "sub: remove check_duplicate_plaintext_event()".
That code was basically duplicated. In general, this code is still
needed when doing conversion during demuxing (mostly because you can
seek during demuxing, which will cause duplicate events by replaying).
This allows using some formats that were not supported until now, like
WebVTT.
We still prefer the internal subtitle reader (subreader.c), because
1. Libav, and 2. random things which we probably want to keep, such as
control over formatting, codepage stuff, or various mysterious
postprecessing done in that code.
This means subassconvert.c is split in sd_srt.c and sd_microdvd.c. Now
this code is involved in the sub conversion chain like sd_movtext is.
The invocation of the converter in sd_ass.c is removed.
This requires some other changes to make the new sub converter code work
with loading external subtitles. Until now, subtitles loaded via
subreader.c was assumed to be in plaintext, or for some formats, in ASS
(except in -no-ass mode). Then these were added to an ASS_Track. Change
this so that subtitles are always in their original format (as far as
decoders/converters for them are available), and turn every sub event
read by subreader.c as packet to the dec_sub.c subtitle chain.
This removes differences between external/demuxed and -ass/-no-ass code
paths further.
Add a basic infrastructure for subtitle converters. These converters
work sort-of like decoders, except that they produce packets instead
of subtitle bitmaps. They are put in front of actual decoders.
Start with sd_movtext. 4 lines of code are blown up to a 55 lines file,
but fortunately this is not going to be that bad for the following
converters.
Make the sub decoder stuff independent from sh_sub (except for
initialization of course). Sub decoders now access a struct sd only,
instead of getting access to sh_sub. The glue code in dec_sub.c is
similarily independent from osd.
Some simplifications are made. For example, the switch_id stuff is
unneeded: the frontend code just has to make sure to call osd_changed()
any time subtitles are switched.
This is also preparation for introducing subtitle converters. It's much
cleaner to completely separate demuxer header/renderer glue/decoders
for this purpose, especially since sub converters might completely
change how demuxer headers have to be interpreted.
Also pass data as demux_packets. Currently, this doesn't help much, but
libavcodec converters might need scary stuff like packet side data, so
it's perhaps better to go with passing packets.
This unifies the subtitle rendering path. Now all subtitle rendering
goes through sd_ass.c/sd_lavc.c/sd_spu.c.
Before that commit, the spudec.h functions were used directly in
mplayer.c, which introduced many special cases. Add sd_spu.c, which is
just a small wrapper connecting the new subtitle render API with the
dusty old vobsub decoder in spudec.c.
One detail that changes is that we always pass the palette as extra
data, instead of passing the libdvdread palette as pointer to spudec
directly. This is a bit roundabout, but actually makes the code simpler
and more elegant: the difference between DVD and non-DVD dvdsubs is
reduced.
Ideally, we would just delete spudec.c and use libavcodec's DVD sub
decoder. However, DVD playback with demux_mpg produces packets
incompatible to lavc. There are incompatibilities the other way around
as well: packets from libavformat's vobsub demuxer are incompatible to
spudec.c. So we define a new subtitle codec name for demux_mpg subs,
"dvd_subtitle_mpg", which only sd_spu can decode.
There is actually code in spudec.c to "assemble" fragments into complete
packets, but using the whole spudec.c is easier than trying to move this
code into demux_mpg to fix subtitle packets.
As additional complication, Libav 9.x can't decode DVD subs correctly,
so use sd_spu in that case as well.
The -no-ass switch used to disable any use of libass for text subtitles.
This is not really the case anymore, because libass is now always
involved when rendering text. The only remaining use of -no-ass is
disabling styling or showing subtitles on the terminal. On the other
hand, the old subtitle rendering path is a big reason why the subtitle
code is still a big mess with an awful number of obscure special cases.
In order to simplify it, remove the old subtitle rendering code, and
always go through sd_ass.c. Basically, we use ASS_Track as central data
structure for storing text subtitles instead of struct sub_data. This
also makes libass mandatory for all text subs, even if they are printed
to the terminal in -no-video mode. (We could add something like sd_text
to avoid this, but it's not worth the trouble.)
struct sub_data and subreader.c are still around, even its ASS/SSA
reader. But struct sub_data is freed right after converting it to
ASS_Track. The internal ASS reader actually can handle some obscure
cases libass can't, like files encoded in UTF-16.
Get rid of the 1-char subtitle type field. Use sh_stream->codec instead
just like audio and video do. Use codec names as defined by libavcodec
for simplicity, even if they're somewhat verbose and annoying.
Note that ffmpeg might switch to "ass" as codec name for ASS, so we
don't bother with the current silly "ssa" name.