Commit Graph

1332 Commits

Author SHA1 Message Date
wm4 1d53a8f2e7 demux_edl: warn if no_clip is used with multiple parts
Well whatever.
2020-02-15 15:25:12 +01:00
wm4 921f316281 demux_edl: allow a redundant new_stream at the beginning
Normally, the first sub-stream is implicitly created. This change lets
the user use more orthogonal syntax, and use a new_stream header for
every sub-stream, instead of having to skip the header for the first
one.
2020-02-15 15:22:05 +01:00
wm4 64c4c59770 demux_edl: accept protocol entries in EDL entries again
Accidentally broken by commit 99700bc52c. mp_path_join() does not
check for this, because it's supposed to work on filesystem strings (and
e.g. "http://fubar" is a valid relative path in UNIX).
2020-02-15 15:11:05 +01:00
wm4 96ef62161a demux_edl: improve parsing slightly
Add a mp_log context to the parse_edl() function, and report some
errors. Previously, this just told you that something was wrong. Add
some error reporting to make it slightly less evil.

Put all parameters in a list before processing them. This makes adding
parameters for special headers easier, and we can report parameters that
were not understood. (For "compatibility", which probably doesn't matter
at all, still don't count them as errors; as before.)
2020-02-15 15:07:52 +01:00
wm4 121bc6ad62 demux_timeline: fix another cursed memory management issue
The timeline stuff has messed up memory management because there are no
clear ownership rules between a some demuxer instances (master or
demux_timeline) and the timeline object itself.

This is another subtle problem that happened: apparently,
demux_timeline.open is supposed to take over ownership of timeline, but
only on success. If it fails, it's not supposed to free it. It didn't
follow this, which lead to a double-free if demux_timeline.open failed.
The failure path in demux.c calls both timeline_destroy() and
demux_timeline.close on failure.
2020-02-15 14:07:46 +01:00
wm4 64a03b03ea demux_timeline: fix a comment 2020-02-15 13:55:54 +01:00
wm4 d5de8ddb65 demux_timeline: reorder some functions
Move them around in the source code to get rid of the forward
declarations. Other than rearranging the lines and removing the 2
forward declarations, there are no other changes at all.
2020-02-15 13:55:37 +01:00
wm4 27d5d32020 demux: add option to disable "sharing" between back and forward buffers
As requested. I guess option name and manpage text could be better and
clearer.

Closes: #7442
2020-02-07 15:58:13 +01:00
wm4 cbee577d0a cue: tolerate NBSP as whitespace
Apparently such .cue files exist. They fail both probing and parsing. To
make it worse, the sample at hand was encoded as Latin1.

One part of this is replacing bstr_lstrip() with a version that supports
NBSP. One could argue that bstr_lstrip() should always do this, but I
don't want to overdo it. There are many more unicode abomination which
it could be said it's supposed to handle, so it will stay ASCII instead
of going down this rabbit hole. I'm just assuming this cue sheet was
generated by some stupid software that inexplicably liked NBSPs (which
is how we justify a one-off fix). The new lstrip_whitespace() doesn't
look particularly efficient, but it doesn't have to be.

The second part is dealing with the fact that the charset is not
necessarily UTF-8. We don't want to do conversion before probing thinks
it knows it's a cue sheet (would probably make it more fragile all
around), so just make it work with Latin1 by assuming invalid code
points are Latin1. This fallback is part of why lstrip_whitespace() is
sort of roundabout.

(You could still rewrite it as much more efficient state machine,
instead of using a slow and validating UTF-8 parser that is called per
codepoint. Starting to overthink this.)

Multimedia is terrible. Legacy charsets are terrible. Everything is
terrible.

Fixes: #7429
2020-02-03 19:13:44 +01:00
wm4 1b283f6b60 libarchive: some shitty hack to make opening slightly faster
See manpage additions. The libarchive behavior mentioned in the last
paragraph there is technically unrelated, but makes this new option
mostly pointless.

See: #7182
2020-01-04 19:56:09 +01:00
wm4 5016a1e4a6 demux: add per-demuxer sub-options
Until now, they were all just added to options.c (e.g. demux_mkv_conf).
This adds a mechanism which can be used to add future options in a
(very) slightly more elegant way.
2020-01-04 19:47:36 +01:00
wm4 04bde06095 stream_libarchive: some more hacks to improve multi-volume archives
Instead of opening every volume on start just to see if it's there, all
all volumes that could possibly exist, and "handle" it on opening. This
requires working around some of libarchive's amazing stupidity and using
some empirically determined behavior. Will possibly break if libarchive
changes some of this behavior.

See: #7182
2020-01-04 18:59:23 +01:00
wm4 99700bc52c demux_edl: restore relative path resolution
Playing e.g. "dir/f.edl" should make all non-absolute paths in f.edl
relative to "dir".

Commit 0e98b2ad8e accidentally broke this.
2020-01-02 23:31:02 +01:00
wm4 3b6c4e7be1 demux: make track switching instant with certain mpegts files
When switching tracks, the data for the new track is missing by the
amount of data prefetched. This is because all demuxers return
interleaved data, and you can't just seek the switched track alone.
Normally, this would mean that the new track simply gets no data for a
while (e.g. silence if it's an audio track). To avoid this, mpv performs
a special "refresh seek" in the demuxer, which tries to resume demuxing
from an earlier position, in a way that does not disrupt decoding for
the non-changed tracks. (Could write a lot about the reasons for doing
something so relatively complex, and the alternatives and their
weaknesses, but let's not.)

This requires that the demuxer can tell whether a packet after a seek
was before or after a previously demuxed packet, sort of like an unique
ID. The code can use the byte position (pos) and the DTS for this. The
DTS is normally strictly monotonically increasing, the position in most
sane file formats too (notably not mp4, in theory).

The file at hand had DTS==NOPTS packets (which is fine, usually this
happens when PTS can be used instead, but the demux.c code structure
doesn't make it easy to use this), and pos==-1 at the same time. The
latter is what libavformat likes to return when the packet was produced
by a "parser" (or in other words, packets were split or reassembled),
and the packet has no real file position. That means the refresh seek
mechanism has no packet position and can't work.

Fix this by making up a pseudo-position if it's missing. This needs to
set the same value every time, which is why it does not work for
keyframe packets (which, by definition, could be a seek target).

Fixes: #7306 (sort of)
2019-12-31 00:17:46 +01:00
wm4 f0d0822595 demux: fix --stream-record runtime change handling
Well, if that wasn't particularly dumb.
2019-12-29 20:18:35 +01:00
wm4 582f3f7cc0 playlist: change from linked list to an array
Although a linked list was ideal at first, there are cases where it
sucks, and became increasingly awkward (with the mpv command API
preferring integer indexes to access the list). In future, we probably
want to add more playlist-related functionality, so better change it to
an array now.

An array isn't always ideal either. Since playlist entries are still
separate objects (because in some cases you need a stable "iterator" to
it), but you still need to efficiently get the next/previous playlist
entry, there's a pl_index field, that needs to be maintained. E.g.
adding an entry at the start of the playlist => update the pl_index
field for all other entries. Well, it's not really worth to do something
more complicated to avoid these things.

This commit is probably buggy as shit. It's not like I bothered to test
everything. That's _your_ role.
2019-12-28 21:32:15 +01:00
wm4 36da3325a3 demux: stop setting dummy stream on demux_close_stream()
Demuxers can call demux_close_stream() to close the underlying stream if
it's not needed anymore. (Useful to release "heavy" resources like FDs
and sockets. Plus merely keeping a file open can have visible side
effects such as inability to unmount a filesystem or, on Windows, to do
anything with the file.)

Until now, this set demuxer->stream to a dummy stream, because most code
used to assume that the stream field is non-NULL. But this requirement
disappeared (in some cases, the stream field is already NULL), so stop
doing that. demux_lavf.c, one of the demuxers which calls this function,
still had some of this, though.
2019-12-23 11:09:42 +01:00
wm4 cd09ea92be demux_mf: use stream API to open list files
mf:// has an obscure feature that lets you pass a list of filenames
separated by newlines. Who knows whether anyone is using that. It opened
these listfiles with fopen(), so the recent stream origin bullshit
doesn't operate on it. Fix this by using the mpv internal stream API
instead. Unfortunately there is no fgets(), so write an ad-hoc one. (An
implementation of line reading via "stream" is still in demux_playlist,
but it's better to keep it quarantined there.)
2019-12-23 11:01:29 +01:00
wm4 9e15e3ad8f demux: remove debug abort()
WHAT THE FUCK

Fixes: #7279
2019-12-22 04:57:18 +01:00
wm4 8448fe0b62 demux: add an option to control tag charset
Fucking gross that you need this in almost-2020.

Fixes: #7255
2019-12-20 13:00:39 +01:00
wm4 0e98b2ad8e edl: accept arbitrary paths
Until now, .edl files accepted only "simple" filenames, i.e. no relative
or absolute paths, no URLs. Now that the origin bullshit is a bit
cleaned up and enforced in the EDL code, there's absolutely no reason to
keep this.

The new code behaves somewhat similar to playlists. (Although playlists
are special because they're not truly recursively opened.)
2019-12-20 13:00:39 +01:00
wm4 1cb9e7efb8 stream, demux: redo origin policy thing
mpv has a very weak and very annoying policy that determines whether a
playlist should be used or not. For example, if you play a remote
playlist, you usually don't want it to be able to read local filesystem
entries. (Although for a media player the impact is small I guess.)

It's weak and annoying as in that it does not prevent certain cases
which could be interpreted as bad in some cases, such as allowing
playlists on the local filesystem to reference remote URLs. It probably
barely makes sense, but we just want to exclude some other "definitely
not a good idea" things, all while playlists generally just work, so
whatever.

The policy is:
- from the command line anything is played
- local playlists can reference anything except "unsafe" streams
  ("unsafe" means special stream inputs like libavfilter graphs)
- remote playlists can reference only remote URLs
- things like "memory://" and archives are "transparent" to this

This commit does... something. It replaces the weird stream flags with a
slightly clearer "origin" value, which is now consequently passed down
and used everywhere. It fixes some deviations from the described policy.

I wanted to force archives to reference only content within them, but
this would probably have been more complicated (or required different
abstractions), and I'm too lazy to figure it out, so archives are now
"transparent" (playlists within archives behave the same outside).

There may be a lot of bugs in this.

This is unfortunately a very noisy commit because:
- every stream open call now needs to pass the origin
- so does every demuxer open call (=> params param. gets mandatory)
- most stream were changed to provide the "origin" value
- the origin value needed to be passed along in a lot of places
- I was too lazy to split the commit

Fixes: #7274
2019-12-20 13:00:39 +01:00
wm4 572c32abbe libarchive: prefix entry names in archive URLs with '/'
This has the advantage that playlists within the archive will work as
expected, because demux_playlist will correctly join the archive base
URL and entry name. Before this change, it could skip before the "|",
resulting in a broken URL.
2019-12-20 08:35:08 +01:00
wm4 0bf0efd6d3 demux_edl: fix reusing segment source files
EDL files can have multiple segments taken from the same source file. In
this case, the source file is supposed to be opened only once. This
stopped working, and it created a new demuxer instance for every single
segment entry. This made it slow and made it use much more memory than
needed.

This was because it tried to iterate over the array of source files, but
the array count (num_parts) was only set to a non-0 value later. Fix
this by maintaining the count correctly.

In addition, the actual code for checking whether a source can be reused
(in open_source()) regressed and stopped working correctly. d->stream
could be NULL. Use demuxer.filename instead; I'm not entirely sure
whether this is always correct, but fortunately we have a distributed
almost-AI driven test suite (called "users") which will probably find
and report such cases.

Probably broke with commit a09396ee60 or something close, but didn't
check closer.

Fixes: #7267
2019-12-17 01:57:42 +01:00
wm4 d60bbd86e3 demux_lavf: export demuxer_id for more formats which have it
See previous commit. libavformat exports this information as AVStream.id
field.

The big problem is that the libavformat field is simply 0 if it's
unknown (i.e. the demuxer never sets it). So it needs to remain a
whitelist. Just add more formats which are known to have a meaningful
ID.

I considered exporting IDs for all formats, and then either leaving the
values as they are, or filtering duplicate values (and choosing
arbitrary but unique different IDs). But then again, I think it's sort
of mpv's job to filter FFmpeg's absurd bullshit API, and it should make
an effort to hide it rather than to reflect it.

See: #7211
2019-12-03 21:15:40 +01:00
wm4 370ed5777c demux: do not make up demuxer_id
The demuxer_id (exported in as "src-id" property) is supposed to be the
native stream ID, as it exists in the file, or -1 if that does not exist
(actually any negative value), or if it is unknown.

Until now, an ID was made up if it was missing. That seems like strange
non-sense, and I can't find the reason why it was done. But it was
probably for convenience by the EDL stuff or so.

Stop doing this. Fortunately, the src-id property was documented as
being unavailable if the ID is not known. Even the code for this was
present, it was just inactive until now. Extend input.rst with some
explanations.

Also fixing 3 other places where negative demuxer_id was ignored or
avoided.
2019-12-03 21:04:53 +01:00
wm4 1cb085a82e options: get rid of GLOBAL_CONFIG hack
Just an implementation detail that can be cleaned up now. Internally,
m_config maintains a tree of m_sub_options structs, except for the root
it was not defined explicitly. GLOBAL_CONFIG was a hack to get access to
it anyway. Define it explicitly instead.
2019-11-29 12:14:43 +01:00
Aman Gupta dbb5dd7c33 demux_lavf: log packet read errors
Signed-off-by: Aman Gupta <aman@tmm1.net>
2019-11-22 12:56:46 -08:00
wm4 c7487cebd1 demux_mf: fix backward seeking behavior
If SEEK_FORWARD is set, a demuxer should skip to the next frame if the
timestamp does not fall on the start of a frame. If that flag is not
set, it should always seek to the first frame before the target
timestamp (or the first frame in the file).
2019-11-17 02:11:45 +01:00
wm4 b6413f82b2 demux_lavf: fight ffmpeg API some more and get the timeout set
It sometimes happens that HLS streams freeze because the HTTP server is
not responding for a fragment (or something similar, the exact
circumstances are unknown). The --timeout option didn't affect this,
because it's never set on HLS recursive connections (these download the
fragments, while the main connection likely nothing and just wastes a
TCP socket).

Apply an elaborate hack on top of an existing elaborate hack to somehow
get these options set. Of course this could still break easily, but hey,
it's ffmpeg, it can't not try to fuck you over. I'm so fucking sick of
ffmpeg's API bullshit, especially wrt. HLS.

Of course the change is sort of pointless. For HLS, GET requests should
just aggressively retried (because they're not "streamed", they're just
actual files on a CDN), while normal HTTP connections should probably
not be made this fragile (they could be streamed, i.e. they are backed
by some sort of real time encoder, and block if there is no data yet).
The 1 minute default timeout is too high to save playback if this
happens with HLS.

Vaguely related to #5793.
2019-11-16 13:15:45 +01:00
wm4 8d4e012bfa demux_playlist: fix previous commit
This just froze, due to obvious stupidity (I forgot to deal with all
semantic changes done to the the former stream_skip()).

Fixes: ac7f67b3f2
2019-11-15 12:10:01 +01:00
wm4 5a99015acf stream_lavf: set --network-timeout to 60 seconds by default
Until now, we've made FFmpeg use the default network timeout - which is
apparently infinite. I don't know if this was changed at some point,
although it seems likely, as I was sure there was a more useful default.

For most use cases, a smaller timeout is more useful (for example
recording something in the background), so force a timeout of 1 minute.

See: #5793
2019-11-14 13:46:03 +01:00
wm4 ac7f67b3f2 demux_mkv, stream: attempt to improve behavior in unseekable streams
stream_skip() semantics were kind of bad, especially after the recent
change to the stream code. Forward stream_skip() calls could still
trigger a seek and fail, even if it was supposed to actually skip data.
(Maybe the idea that stream_skip() should try to seek is worthless in
the first place.)

Rename it to stream_seek_skip() (takes absolute position now because I
think that's better), and make it always skip if the stream is marked as
forward.

While we're at it, make EOF detection more robust. I guess s->eof
shouldn't exist at all, since it's valid only "sometimes". It should be
removed... but not today. A 1-byte stream_read_peek() call is good to
get the s->eof flag set to a correct value.
2019-11-14 12:59:14 +01:00
wm4 19becc8ea9 stats, demux: log byte level stream seeks 2019-11-07 22:53:13 +01:00
wm4 e5a9b792ec stream: replace STREAM_CTRL_GET_SIZE with a proper entrypoint
This is overlay convoluted as a stream control, and important enough to
warrant "first class" functionality.
2019-11-07 22:53:13 +01:00
wm4 12d1761064 stream: remove eof getter
demux_mkv was the only thing using this, and everything else accessed it
directly. No need to keep the indirection wrapper around.

(Funny how this getter was in the initial commit of MPlayer.)
2019-11-07 22:53:10 +01:00
wm4 f37f4de849 stream: turn into a ring buffer, make size configurable
In some corner cases (see #6802), it can be beneficial to use a larger
stream buffer size. Use this as argument to rewrite everything for no
reason.

Turn stream.c itself into a ring buffer, with configurable size. The
latter would have been easily achievable with minimal changes, and the
ring buffer is the hard part. There is no reason to have a ring buffer
at all, except possibly if ffmpeg don't fix their awful mp4 demuxer, and
some subtle issues with demux_mkv.c wanting to seek back by small
offsets (the latter was handled with small stream_peek() calls, which
are unneeded now).

In addition, this turns small forward seeks into reads (where data is
simply skipped). Before this commit, only stream_skip() did this (which
also mean that stream_skip() simply calls stream_seek() now).

Replace all stream_peek() calls with something else (usually
stream_read_peek()). The function was a problem, because it returned a
pointer to the internal buffer, which is now a ring buffer with
wrapping. The new function just copies the data into a buffer, and in
some cases requires callers to dynamically allocate memory. (The most
common case, demux_lavf.c, required a separate buffer allocation anyway
due to FFmpeg "idiosyncrasies".) This is the bulk of the demuxer_*
changes.

I'm not happy with this. There still isn't a good reason why there
should be a ring buffer, that is complex, and most of the time just
wastes half of the available memory. Maybe another rewrite soon.

It also contains bugs; you're an alpha tester now.
2019-11-06 21:36:02 +01:00
wm4 48fc642e0c demux: unconditionally reposition stream to start before opening
The old code made it depend on ->seekable. If it isn't seekable, and
something discarded the data, then it'll just show an error message,
which will at least be somewhat informative. If no data was discarded,
the seek call is always a no-op.

There's a weird "timeline" condition in the old code; this doesn't
matter anymore, because timeline stuff does not pass streams down to
nested demuxers anymore.
2019-11-06 21:35:32 +01:00
wm4 5189ea4696 demux: reduce log level for cache index resizing
Now that I probably removed all bugs in this (?), this is uninteresting.
2019-11-01 01:54:12 +01:00
wm4 67aa7b0439 demux_mkv: reduce log level of mkvinfo part to debug
demux_mkv has lots of logging that shows information about the file. It
sort of reminds of mkvinfo output. While this is sometimes interesting,
it's too much for verbose mode, and should be in debug log level.
2019-11-01 01:37:09 +01:00
wm4 6d92e55502 Replace uses of FFMIN/MAX with MPMIN/MAX
And remove libavutil includes where possible.
2019-10-31 11:24:20 +01:00
wm4 a267452b00 stream: move stream_read_line to demux_playlist.c
demux_playlist.c is the only remaining user of this. Not sure if it
should stay this way, but for now I'll say yes.
2019-10-31 11:05:48 +01:00
wm4 bc2058fcd4 demux_mkv: add V_MPEG4/MS/V3 mapping
Fixes: #6547
2019-10-24 13:52:09 +02:00
wm4 9565ff522b build: add --enable-ffmpeg-strict-abi option
This can be used by distros to disable all known FFmpeg ABI violations.

Currently only 1 is known, in demux_lavf.c. In addition to if-defing out
the access to the private FFmpeg field, this disables the possibly
fragile nested open callbacks, which make sense only if the
aforementioned field can be accessed.
2019-10-21 01:38:25 +02:00
wm4 60ab82df32 video, demux: rip out unused spherical metadata code
This was preparation into something that never happened.

Spherical video is a shit idea anyway.
2019-10-17 22:49:26 +02:00
wm4 5cbbd25090 demux_timeline, demux_edl: correctly enable cache in pseudo-DASH mode
In pseudo-DASH mode, we may have no real streams opened until the
demuxer layer is fully loaded and playback actually starts. The only
hint that the stream is from network is, at that point, the init
segment, which is only opened as stream, and then separately as demuxer
(which is dumb but happened to fit the internal architecture better).

So just propagate the flags from the init segment stream. Seems like an
annoyance, but doesn't hurt that much I guess. (Until someone gets the
idea to pass the init segment data inline or so, but nothing does that.)

The sample link in the linked issue will probably soon switch to another
format, because that service always does this after recent uploads or
so.

Fixes: #7038
2019-10-08 23:55:05 +02:00
wm4 1f77102ee8 demux_edl: better selection of part which defines the track layout
Someone crazy is trying to mix images with videos in EDL files. Putting
an image as first thing into the EDL disabled audio, because the first
EDL entry was used to define the layout.

Change this. Make it user-configurable, and use a "better" heuristic to
select the default otherwise.

In theory, EDL could be easily extended to specify track layout and
mapping of parts to virtual EDL tracks manually and in great detail. But
I don't think it's worth it - who would bother using it?

Fixes: #6764
2019-10-06 23:35:02 +02:00
wm4 1c63869d0a demux: restore some of the DVD/BD/CDDA interaction layers
This partially reverts commit a9d83eac40
("Remove optical disc fancification layers").

Mostly due to the timestamp crap, this was never really going to work.
The playback layer is sensitive to timestamps, and derives the playback
time directly from the low level packet timestamps. DVD/BD works
differently, and libdvdnav/libbluray do not make it easy at all to
compensate for this. Which is why it never worked well, but not doing it
at all is even more awful.

demux_disc.c tried this and rewrote packet timestamps from low level TS
to playback time. So restore demux_disc.c, which should bring behavior
back to the old often non-working but slightly better state.

I did not revert anything that affects components above the demuxer
layer. For example, the properties for switching DVD angles or listing
disc titles are still gone. (Disc titles could be reimplemented as
editions. But not by me.)

This commit modifies the reverted code a bit; this can't be avoided,
because the internal API changed quite a bit. The old seek resync in
demux_lavf.c (which was a hack) is replaced with a hack. SEEK_FORCE and
demux_params.external_stream are new additions.

Some of this could/should be further cleaned up. If you don't want
"proper" DVD/BD support to disappear, you should probably volunteer.

Now why am I wasting my time for this? Just because some idiot users are
too lazy to rip their ever-wearing out shitty physical discs? Then why
should I not be lazy and drop support completely? They won't even be
thankful for me maintaining this horrible garbage for no compensation.
2019-10-03 00:22:18 +02:00
wm4 86c229fede demux_lavf: remove recently added author name from license header
This was added in 585f9ff42f by @bbarenblat (github handle). We
don't do this. This file alone probably has multiple dozen of authors (I
didn't count, but it has a history of 15 years). If everyone added their
names with each small change, this project would have giant lists of
contributing authors on every source file.

Neither copyright law nor any of the used licenses require listing
authors in the license header. Authorship is recorded in the git log.

So don't start with this, and remove this recent case to avoid setting a
precedent.

Some files still have an author in the header. These cases are
grandfathered, and usually are the actual authors of the original code.
2019-10-01 22:51:46 +02:00
wm4 07d9ca5ee3 demux_mkv: better behavior/warnings on partial files/unseekable streams
demux_mkv may seek to the end of the file to read certain headers (which
should probably be called "footers", but in theory they are just headers
that have been placed at the end of the file unfortunately).

This commit changes behavior not to seek if the stream is not marked as
seekable. Before this, it only checked whether the stream size was
unknown (end negative). In practice it doesn't make much of a
difference, since seekable usually equals known stream size.

Also improve the wording, and distinguish between actual incomplete
files, and unseekable ones.
2019-10-01 21:27:25 +02:00
wm4 5a9046222b demux: make --record-file/cache dump command work with disabled streams
This passed all streams to mp_recorder_create(), even disabled ones. The
disabled streams never get packets, so recorder.c eventually errors out
with unrelated-looking errors. The reason is that recorder.c waits for
packets to appear on other streams, which in turn is because libavformat
refuses to mux empty streams anyway.

recorder.c could call demux_stream_is_selected(), which would have made
the patch much smaller. But this feels like a bad idea, since recorder.c
should use sh_stream only for metadata (and not in an "active" way), nor
should it care what demux.c is currently doing with it. So make the API
user (demux.c) pass only the streams it really wants.

Fixes: #6999
2019-09-29 02:36:52 +02:00
wm4 a604dc12be recorder: don't use a magic index for mp_recorder_get_sink()
Although this was sort of elegant, it just seems to complicate things
slightly. Originally, the API meant that you cache mp_recorder_sink
yourself (which would avoid the mess of passing an index around), but
that too seems slightly roundabout.

In a later change, I want to change the set of streams passed to
mp_recorder_create(), and then I'd have to keep track of the index for
each stream, which would suck. With this commit, I can just pass the
unambiguous sh_stream to it, and it will be guaranteed to match the
correct stream.

The disadvantages are barely worth discussing. It's a new linear search
per packet, but usually only 2 to 4 streams are active at a time. Also,
in theory a user could want to write 2 streams using the same sh_stream
(same metadata, just writing different packets or so), but in practice
this is never done.
2019-09-29 01:41:19 +02:00
Philip Sequeira a7158ceec0 demux: sort filenames naturally when playing a directory / archive 2019-09-29 01:13:00 +03:00
wm4 68ce36a2db demux: force reading packets again after seeks
in->eof is used as an indicator whether reading packets still makes
sense. (Without this, the prefetcher would obviously burn CPU by
retrying reading even though everything has been read.)

This was not reset properly after seeks were performed. It led to
getting stuck in at least one corner case: when enabling a track, the
demuxer would seek backwards to get new packets from the current
playback position ("refresh seeks"). But if playback was paused, and EOF
was previously reached, it would not try to read packers again due to
in->eof being false. There was not anything else that would make it
retry reading, so it was stuck in a weird underrun/buffering state.

Fixes: #6986
2019-09-24 19:06:59 +02:00
Gunnar Marten d2a9e3fb34 demux: remove redundant seek range update
This was a leftover from commit b2752321 which fixed #6522 but after
the recent demux refactoring this fix is superseded by commit 0f6cda4ab.
Remove the redundant update call.
2019-09-24 17:14:25 +02:00
wm4 cbff8a5862 demux_lavf: fix seeking in ogg audio streams
This detected the first packet demuxed after a seek as timestamp
discontinuity. Obviously this is non-sense. Since the OGG radio streams
for which this feature was introduced are normally unseekable, it's
simple to fix this: simply disable it (if in auto mode, the default) as
soon as a seek is performed. This code is never called if the stream is
considered unseekable, unless the user forced it.

There's still a chance this linearization is performed before a seek
happens. This will be a bit awkward, but no worse than without this
feature, since seeking with timestamp resets is inherently broken in
both mpv and libavformat.

Fixes: #6974
Fixes: 27fcd4d
2019-09-22 20:52:37 +02:00
wnoun 1c43920fb8 demux_cue: auto-detect CUE sheet charset 2019-09-21 15:18:20 +02:00
wm4 94bfe83355 demux: propagate streaming flag through demux_timeline
Before this commit, EDL or CUE files did not properly enable the cache
if they were on "slow" media (stream->streaming==true). This happened
because the stream is unset for demux_timeline, so the streaming flag
could not be queried anymore.

Fix this by adding this flag to struct demuxer, and propagate it exactly
like the is_network flag. is_network is not used for checking the cache
options anymore, and its main function seems to be something else.
Normal http streams set the streaming flag already.

This should fix #6958.
2019-09-20 17:01:35 +02:00
wm4 d75bdf070f demux_lavf: document intentional FFmpeg API violation
This field is documented as internal, so an API user should not
access it. However, this is the only way to get some read statistics
without replacing FFmpeg's entire HLS demuxer. (Using custom I/O as
workaround doesn't work: the HLS code uses some weird internal APIs
that cannot be provided by FFmpeg API users; I even made the author
of the relevant patch to provide a public API, but which was shot
down by another FFmpeg developer. So I take this as my right to
access this field.)

Mention this explicitly, as it affects ABI and API compatibility, and
I don't want that anyone claims this was a "mistake". Add some
explanations.
2019-09-19 20:37:05 +02:00
wm4 389f1b0ef3 packet: fix theoretical UB if called on "empty" packets
In theory, a 0 size allocation could have made it memset() on a NULL
pointer (with a non-0 size, which makes it crash in addition to
theoretical UB).

This should never happen, since even packets with size 0 should have an
associated allocation, as FFmpeg currently does. But avoiding this makes
the API slightly more orthogonal and less tricky, I guess.
2019-09-19 20:37:05 +02:00
wm4 9a7a6958ca Revert "demux/packet: fix demux_packet_shorten"
This reverts commit 95636c65e7.

This change shouldn't be needed, and in fact it's wrong. The FFmpeg API
function could do anything it wants with the packet, including changing
the packet data pointer. Likewise, it's not guaranteed that the
referenced packet's fields mirror the current state of the mpv packet
struct (the AVPacket is only kept for the AVBuffer and the side data
stuff).
2019-09-19 20:37:05 +02:00
wm4 4c5df406a8 demux: fix another incorrect BOF cache flag issue 2019-09-19 20:37:05 +02:00
wm4 82f2613ade command, demux: add AB-loop keyframe cache align command
Helper for the ab-loop-dump-cache command, see manpage additions.

This is kind of shit. Not only is this a very "special" feature, but it
also vomits more messy code into the big and already bloated demux.c,
and the implementation is sort of duplicated with the dump-cache code.
(Except it's different.) In addition, the results sort of depend what a
video player would do with the dump-cache output, or what the user wants
(for example, a user might be more interested in the range of output
audio, instead of the video).

But hey, I don't actually need to justify it. I'm only justifying it for
fun.
2019-09-19 20:37:05 +02:00
wm4 023b5964b0 demux, command: add a third stream recording mechanism
That's right, and it's probably not the end of it. I'll just claim that
I have no idea how to create a proper user interface for this, so I'm
creating multiple partially-orthogonal, of which some may work better in
each of its special use cases.

Until now, there was --record-file. You get relatively good control
about what is muxed, and it can use the cache. But it sucks that it's
bound to playback. If you pause while it's set, muxing stops. If you
seek while it's set, the output will be sort-of trashed, and that's by
design.

Then --stream-record was added. This is a bit better (especially for
live streams), but you can't really control well when muxing stops or
ends. In particular, it can't use the cache (it just dumps whatever the
underlying demuxer returns).

Today, the idea is that the user should just be able to select a time
range to dump to a file, and it should not affected by the user seeking
around in the cache. In addition, the stream may still be running, so
there's some need to continue dumping, even if it's redundant to
--stream-record.

One notable thing is that it uses the async command shit. Not sure
whether this is a good idea. Maybe not, but whatever. Also, a user can
always use the "async" prefix to pretend it doesn't.

Much of this was barely tested (especially the reinterleaving crap),
let's just hope it mostly works. I'm sure you can tolerate the one or
other crash?
2019-09-19 20:37:05 +02:00
wm4 226e050b83 demux: move packet cache reading to a function
Useful for a following commit.
2019-09-19 20:37:05 +02:00
wm4 b55b9cb98c demux: move a seek helper to a separate function
It makes some slight sense and helps with one of the following commits.

Also rename that other function to make it sound less similar to
find_seek_target().
2019-09-19 20:37:05 +02:00
wm4 7893ab5a7e demux: minor simplification for backward cache size option
Always set max_bytes_bw to 0 if seekable cache is disabled, instead at
the place of its use. This is the only use of it, so the commit should
not change any behavior.

(Alternatively, this could drop the max_bytes_bw variable, use the
option directly, and keep the old code that resets it on use of the
cache is disabled.)
2019-09-19 20:37:05 +02:00
wm4 5c0a626dee demux: allow backward cache to use unused forward cache
Until now, the following could happen: if you set a 1GB forward cache,
and a 1GB backward cache, and you opened a 2GB file, it would prune away
the data cached at the start as playback progressed past the 50% mark.

With this commit, nothing gets pruned, because the total memory usage
will still be 2GB, which equals the total allowed memory usage of 1GB +
1GB.

There are no explicit buffers (every packet is malloc'ed and put into a
linked list), so it all comes down to buffer size computations. Both
reader and prune code use these sizes to decide whether a new packet
should be read / an old packet discarded. So just add the remaining free
"space" from the forward buffer to the available backward buffer. Still
respect if the back buffer is set to 0 (e.g. unseekable cache where it
doesn't make sense to keep old packets).

We need to make sure that the forward buffer can always append, as long
as the forward buffer doesn't exceed the set size, even if the back
buffer "borrows" free space from it. For this reason, always keep 1 byte
free, which is enough to allow it to read a new packet. Also, it's now
necessary to call pruning when adding a packet, to get back "borrowed"
space that may need to be free'd up after a packet has been added.

I refrained from doing the same for forward caching (making forward
cache use unused backward cache). This would work, but has a
disadvantage. Assume playback starts paused. Demuxing will stop once the
total allowed low total cache size is reached. When unpausing, the
forward buffer will slowly move to the back buffer. That alone will not
change the total buffer size, so demuxing remains stopped. Playback
would need to pass over data of the size of the back buffer until
demuxing resume; consider this unacceptable. Live playback would break
(or rather, would not resume in unintuitive ways), even normal streaming
may break if the server invalidates the URL due to inactivity. As an
alternative implementation, you could prune the back buffer immediately,
so the forward buffer can grow, but then the back buffer would never
grow. Also makes no sense.

As far as the user interface is concerned, the idea is that the limits
on their own aren't really meaningful, the purpose is merely to vaguely
restrict the cache memory usage. There could be just a single option to
set the total allowed memory usage, but the separate backward cache
controls the default ratio of backward/forward cache sizes. From that
perspective, it doesn't matter if the backward cache uses more of the
total buffer than assigned, if the forward buffer is complete.
2019-09-19 20:37:05 +02:00
wm4 e6911f82a5 demux: don't clobber internal demuxer EOF state in cache seeks
The last_eof field is the last known EOF state from the underlying
demuxer. Normally, seeks reset it, because obviously if seek back into
the middle of the file, you don't want last_eof to have a "wrong" value
for a short time window (until a packet is read, which would reset the
field to its correct value).

This shouldn't happen during cache seeks, because you don't touch the
underlying demuxer state.

At first, I made this change because some other work in progress
required it. It turned out that it was unnecessary, but keep the change
anyway, since it's still correct and makes the logic cleaner.
2019-09-19 20:37:05 +02:00
wm4 11027e99f2 packet: change memory estimation heuristics
Determining how much memory something uses is very hard, especially in
high level code (yes we call code using malloc high level). There's no
way to get an exact amount, especially since the malloc arena is shared
with the entire process anyway. So the demuxer packet cache tries to get
by with an estimate using a number of rough guesses.

It seems this wasn't quite good. In some ways, it was too optimistic, in
others it seemed to account for too much data. Try to get it closer to
what malloc and ta probably do. In particular, talloc adds some
singificant overhead (using talloc for mass-data was a mistake, and it's
even my fault). The result appears to match better with measured memory
usage. This is still extremely dependent on malloc implementation and so
on.

The effect is that you may need to adjust the demuxer cache limits to
cache as much data as it did before this commit. In any case, seems to
be better for me.
2019-09-19 20:37:05 +02:00
wm4 3cea180cc0 packet: free some unnecessary memory in disk cache case
If the disk cache is used, the AVPacket is not used anymore and is
completely deallocated when the packet is written to disk. As a minor
bug, the AVPacket allocation itself was not freed (although it wasn't a
memory leak, since talloc still automatically freed it when the entire
demux_packet was freed). For very large caches, this could easily add up
to over hundred MB, so actually free the unneeded allocation.
2019-09-19 20:37:05 +02:00
wm4 739cd99881 demux: honor seek discontinuities with --stream-record
Do the same thing --record-file does when seeks happen.
2019-09-19 20:37:05 +02:00
wm4 b945952e0d demux: runtime option changing for cache and stream recording
Make most of the demuxer options runtime-changeable. This includes the
cache options and stream recording. The manpage documents some of the
possibly weird issues related to this.

In particular, the disk cache isn't shuffled around if the setting
changes at runtime.
2019-09-19 20:37:05 +02:00
wm4 8a48a277ed demux: enable --stream-record for things using timeline
Although this is not useful in general, it makes --stream-record work
with a certain video streaming service by a large dystopian company.

In the general case, this fails because normal muxing can, quite
obviously, not handle the segmented metadata in the packets. (There
isn't even a file format which could handle these, except possibly mp4.)
On the other hand, ytdl merely uses timeline/EDL to emulate DASH
streaming (unfortunately), which does not use the segmented stuff, and
stream recording will actually work.
2019-09-19 20:37:05 +02:00
wm4 7b382f3acd demux_mkv: add hacks to avoid a single warning
It prints "Unexpected end of file (no clusters found)" when opening a
webm init fragment. The warning is correct, but unwanted in this case.
Add tons of kludges to avoid it.

(Actually it prints that twice, for audio and video each.)

Also, suppress another warning about a seek head entry that points
exactly to the end of the file. This is a MATROSKA_ID_CUES, which is
harmless, and, very strangely, doesn't point at any cues when you
concatenate the init fragment with a media fragment. No idea what that
crap is supposed to be.
2019-09-19 20:37:05 +02:00
wm4 c4dc600f1f demux: make webm dash work by using init fragment on all demuxers
Retarded webshit streaming protocols (well, DASH) chop a stream into
small fragments, and move unchanging header parts to an "init" fragment
to save some bytes (in the case at hand about 300 bytes for each
fragment that is 100KB-200KB, sure was worth it, fucking idiots).

Since mpv uses an even more retarded hack to inefficiently emulate DASH
through EDL, it opens a new demuxer for every fragment. Thus the
fragment needs to be virtually concatenated with the init fragment. (To
be fair, I'm not sure whether the alternative, reusing the demuxer and
letting it see a stream of byte-wise concatenated fragmenmts, would
actually be saner.)

demux_lavc.c contained a hack for this. Unfortunately, a certain shitty
streaming site by an evil company, that will bestow dytopia upon us soon
enough, sometimes serves webm based DASH instead of the expected mp4
DASH. And for some reason, libavformat's mkv demuxer can't handle the
init fragment or rejects it for some reason. Since I'd rather eat
mushrooms grown in Chernobyl than debugging, hacking, or (god no)
contributing to FFmpeg, and since Chernobyl is so far away, make it work
with our builtin mkv demuxer instead.

This is not hard. We just need to copy the hack in demux_lavf.c to
demux_mkv.c. Since I'm not _that_ much of a dumbfuck to actually do
this, remove the shitty gross demux_lavf.c hack, and replace it by a
slightly less bad generic implementation (stream_concat.c from the
previous commit), and use it on all demuxers. Although this requires
much more code, this frees demux_lavf.c from a hack, and doesn't require
adding a duplicated one to demux_mkv.c, so to the naive eye this seems
to be a much better outcome.

Regarding the code, for some reason stream_memory_open() is never meant
to fail, while stream_concat_open() can in extremely obscure situations,
and (currently) not in this case, but we handle failure of it anyway.
Yep.
2019-09-19 20:37:05 +02:00
wm4 2b37f9a984 demux: never set demux->stream for timeline mess
Timeline (demux_timeline, for EDL and mkv ordered chapters) are a mess,
because it's the only nested demuxer case. Part of the mess comes from
shared struct stream pointers. This makes no sense, because the wrapper
(demux_timeline) doesn't have any business setting it.

Try to lessen it by not passing down streams. Instead, pass down NULL.
This prevents unintended interference, and tightens the ownership rules.
Now a demuxer always owns its stream.

On the other hand, demuxer->stream can now be NULL. This was never the
case before, and consequently there will be new bugs. At least they will
be spotted, because they've been bugs before.

struct stream is also used to access stream properties (such as whether
something is considered a network stream). Most of these have been
mirrored in struct demuxer (because the frontend has been forbidden to
access struct stream because of threading). But during initialization
was still used, so introduce an awkward struct parent_stream_info, which
unifies these.

Commit e0419fb181 changed demux_is_network_cached() to use
demuxer->stream->streaming instead of demuxer->is_network. To enable
timeline stuff to use the cache anyway, change it so that both flags can
contribute to it. The stream NULL-check is obviously due to changes in
this commit.
2019-09-19 20:37:05 +02:00
wm4 e40885d963 stream: create memory streams in more straightforward way
Instead of having to rely on the protocol matching, make a function that
creates a stream from a stream_info_t directly. Instead of going through
a weird indirection with STREAM_CTRL, add a direct argument for non-text
arguments to the open callback. Instead of creating a weird dummy
mpv_global, just pass an existing one from all callers. (The latter one
is just an artifact from the past, where mpv_global wasn't available
everywhere.)

Actually I just wanted a function that creates a stream without any of
that bullshit. This goal was slightly missed, since you still need this
heavy "constructor" just to setup a shitty struct with some shitty
callbacks.
2019-09-19 20:37:05 +02:00
wm4 de3ecc60cb demux_playlist: extend maximum line size
Raise it from 8KB to 512KB.

Do this because ytdl_hook.lua generated a 40KB EDL file (from 80KB
youtube-dl JSON output), and putting it into a .m3u file for easier
debugging failed due to the size limit.
2019-09-19 20:37:05 +02:00
wm4 4291329d56 demux: fix backward demuxing not grabbing all audio packets
The previous commit broke audio playback (maybe this is what 4. was
about?). But it wasn't the fault of the commit; it just exposed
pre-existing issues.

If the packet queue search can't get all packets, it checked
queue->is_bof to see whether there could be previous packets. But
obviously, is_bof can be set, even if the search start packet wasn't the
first one.

This was especially observable with audio, because audio by default
needs preroll packets, and plays artifacts if they're missing.

Fix by using the BOF playback condition for this purpose too.
2019-09-19 20:37:05 +02:00
wm4 f2cee22311 demux: another questionable backwards playback mud party
In theory, backward demuxing does way too much work by doing a full
cache seek every time you need to step back through a packet queue. In
theory, it would be exceedingly more efficient to just iterate backwards
through the queue, but this is not possible because I'm too stingy to
add 8 bytes per packet for backlinks. (In theory, you could probably
come up with some sort of deque, that'd allow efficient iteration into
any direction, but other requirements make this tricky, and I'm
currently too dumb/lazy to do this. For example, the queue can grow to
millions of elements, all while packet pointers need to stay valid.)

Another possibility is to "locally" seek the queue. This still has less
overhead than a full seek.

Also, it just so happens that, as a side effect, this avoids performing
range merging, which commit f4b0e7e942 "broke". That wasn't a bug at
all, but since range joining is relatively slow, avoiding it is good.
This is really just a coincidental side effect, I'm not even quite sure
why it happens this way.

There are 4 ugly things about this change:

1. To get a keyframe "before" a certain one, we recompute the target
PTS, and then subtract 0.001 as arbitrary number to "fudge" it. This
isn't the first place where this is done, and hey, it wasn't my damn
idea that MPlayer should use floats for timestamps. (At first, it even
used 32 bit timestamps.)

2. This is the first time reader_head is reset to an earlier position
outside of the seek code. This might cause conceptual problems since
this code is now "duplicated".

3. In theory, find_backward_restart_pos() needs to be run again after
the backstep. Normally, the seek code calls it explicitly. We could call
it right in the new code, but then the damn function would be recursive.
We could shuffle the code around to make it a loop, but even then
there'd be an offchance into running into an unexpected corner case (aka
subtle bug), where it would loop forever. To avoid refactoring the code
and having to think too hard about it, make it deferred - add some new
state and the check_backward_seek() function for this. Great, even more
subtle mutable state for this backwards shit.

4. I forgot this one, but I can assure you, it's bad.

Without doubt someone will have to clean up this slightly in the future
(or rip it out), but it won't be me.
2019-09-19 20:37:05 +02:00
wm4 2f64c84b44 demux: remove some redundancy in backward playback code
This code tries to determine the "current" position, which is used as
base for the seek target when it needs to seek back more (the point is
to prevent seeking back too far). But compute_keyframe_times() almost
computes the same thing, so use that. Unfortunately needs a forward
declaration.

("Almost", because it differs in some details that should not really
matter.)
2019-09-19 20:37:05 +02:00
wm4 c13bfd271c demux_mkv: fix subtitle preroll in some cases
Subtitle packets with a timestamp before the seek target may overlap
with the seek target anyway. This is why this subtitle preroll crap
exists: it needs to return packets before the seek target to ensure that
the subtitle is displayed at the seek target.

This didn't always work. Maybe it's a regression, but it must have been
an old one. The breakage is triggered by heuristic that is to prevent
excessive queuing of packets in garbage files (this heuristic apparently
became immediately necessary when this preroll mechanism was
implemented).

If a video keyframe packet was found, but no audio packet yet, then
subtitle_preroll was set to 0, and since a_skip_to_keyframe was still 0,
the subtitle packet was discarded. The dumb thing is that subtitle and
video seeking is finished at this point, so the preroll crap should not
be applied at all.

Fix this by moving the preoll overflow code into the block that handles
preroll.
2019-09-19 20:37:05 +02:00
wm4 021ccb9644 demux: turn some redundant assignments into asserts
demux_packet.next should not be used outside of demux.c, and in this
case it's a packet that was just passed to demux.c from the outside.

demux_packet.stream is already set by the demuxer, and this is assured
by the add_packet_locked() caller.
2019-09-19 20:37:05 +02:00
wm4 e320f02187 demux: move a function
The new location makes equally much sense (or more, since it's close to
its per-stream companion function), and we don't need a forward
declaration.
2019-09-19 20:37:05 +02:00
wm4 95c97cd66e demux: disable backward demuxing if it fatally fails
We don't care much about this case, because backward playback can fail
terribly without a good way to detect it, so this was fine.

However, this froze in certain situations. Reading from a subtitle file
for which backward demuxing failed could make it get stuck in
demux_read_packet_async() in unthreaded mode. (That we don't support
backwards subtitle decoding anyway doesn't matter for this.)

So aggressively disable backward demuxing to prevent worse in these
situations. The behavior will still be awful, because the frontend is
still in backwards playback mode, but at least it won't freeze.
2019-09-19 20:37:05 +02:00
wm4 17da9071a4 demux: add a on-disk cache
Somewhat similar to the old --cache-file, except for the demuxer cache.
Instead of keeping packet data in memory, it's written to disk and read
back when needed.

The idea is to reduce main memory usage, while allowing fast seeking in
large cached network streams (especially live streams). Keeping the
packet metadata on disk would be rather hard (would use mmap or so, or
rewrite the entire demux.c packet queue handling), and since it's
relatively small, just keep it in memory.

Also for simplicity, the disk cache is append-only. If you're watching
really long livestreams, and need pruning, you're probably out of luck.
This still could be improved by trying to free unused blocks with
fallocate(), but since we're writing multiple streams in an interleaved
manner, this is slightly hard.

Some rather gross ugliness in packet.h: we want to store the file
position of the cached data somewhere, but on 32 bit architectures, we
don't have any usable 64 bit members for this, just the buf/len fields,
which add up to 64 bit - so the shitty union aliases this memory.

Error paths untested. Side data (the complicated part of trying to
serialize ffmpeg packets) untested.

Stream recording had to be adjusted. Some minor details change due to
this, but probably nothing important.

The change in attempt_range_joining() is because packets in cache
have no valid len field. It was a useful check (heuristically
finding broken cases), but not a necessary one.

Various other approaches were tried. It would be interesting to list
them and to mention the pros and cons, but I don't feel like it.
2019-09-19 20:37:05 +02:00
wm4 2e3d3bbfc8 demux: move comment to slightly better location 2019-09-19 20:37:05 +02:00
wm4 e8ff816ccd demux: fix excessive backwards seeking with backwards playback
Backwards demuxing usually seeks back back by a "random" amount (set by
a user option) when it needs new preceding packets. It turns out a past
change made these backwards seek amounts add up when it didn't need to
(i.e. subtracting the amount from the seek pos without properly
resetting it), which could possibly slow down playback as it went on.

The reason for this was that back_seek_pos was set for every stream on
every seek. This made the reset not affect other streams (in particular
streams which weren't used and never were reset, or which didn't reset
that often). But as the commit adding it showed, this is needed only to
set the initial position. So do that.

Fixes: "demux: fix initial backward demuxing state in some cases"
2019-09-19 20:37:05 +02:00
wm4 73a48ff47b demux: fix minor seek_preroll consistency issue
When packet appending sets the start of the range, it adjusts the range
by seek_preroll. Do this when packets are pruned from the start of the
range too.

(Yeah, seek_preroll handling is probably broken in some other cases. It
was halfhearted to begin with.)
2019-09-19 20:37:05 +02:00
wm4 0f6cda4ab1 demux: mess with seek range updates and pruning
The main thing this commit does is removing demux_packet.kf_seek_pts. It
gets rid of 8 bytes per packet. Which doesn't matter, but whatever.

This field was involved with much of seek range updating and pruning,
because it tracked the canonical seek PTS (i.e. start PTS) of a packet
range. We have to deal with timestamp reordering, and assume the start
PTS is the lowest PTS across all packets (not necessarily just the first
packet). So knowing this PTS requires looping over all packets of a
range (no, the demuxer isn't going to tell us, that would be too sane).

Having this as packet field was perfectly fine. I'm just removing it
because I started hating extra packet fields recently.

Before this commit, this value was cached in the kf_seek_pts field (and
computed "incrementally" when adding packets). This commit computes the
value on demand (compute_keyframe_times()) by iterating over the placket
list. There is some similarity with the state before 10d0963d85,
where I introduced the kf_seek_pts field - maybe I'm just moving in
circles. The commit message claims something about quadratic complexity,
but if the code before that had this problem, this new commit doesn't
reintroduce it, at least. (See below.)

The pruning logic is simplified (I think?) - there is no "incremental"
cached pruning decision anymore (next_prune_target is removed), and
instead it simply prunes until the next keyframe like it's supposed to.
I think this incremental stuff was only there because of very old code
that got refactored away before. I don't even know what I was thinking
there, it just seems complex. Now the seek range is updated when a
keyframe packet is removed.

Instead of using the kf_seek_pts field, queue->seek_start is used to
determine the stream with the lowest timestamp, which should be pruned
first. This is different, but should work well. Doing the same as the
previous code would require compute_keyframe_times(), which would
introduce quadratic complexity.

On the other hand, it's fine to call compute_keyframe_times() when the
seek range is recomputed on pruning, because this is called only once
per removed keyframe packet. Effectively, this will iterate over the
packet list twice instead of once, and with some locality. The same
happens when packets are appended - it loops over the recently added
packets once again. (And not more often, which would go above linear
complexity.)

This introduces some "cleverness" with avoiding calling
update_seek_ranges() even when keyframe packets added/removed, which is
not really tightly coupled to the new code, and could have been in a
separate commit.

Removing next_prune_target achieves the same as commit b275232141,
which is hereby reverted (stale is_bof flags prevent seeking before the
current range, even if the beginning of the file was pruned). The seek
range is now strictly computed after at least one packet was removed,
and stale state should not be possible anymore.

Range joining may over-allocate the index a little. It tried hard to
avoid this before by explicitly freeing the old index before creating a
new one. Now it iterates over the old index while adding the entries to
the new one, which is simpler, but may allocate twice the memory in the
worst case. It's not going to matter for anything, though.

Seeking will be slightly slower. It needs to compute the seek PTS values
across all packets in the vicinity of the seek target. The previous code
also iterated over these packets, but now it iterates one packet range
more.

Another minor detail is that the special seeking code for SEEK_FORWARD
goes away. The seeking code will now iterate over the very last packet
range too, even if it's incomplete (i.e. packets are still being
appended to it). It's fine that it touches the incomplete range, because
the seek_end fields prevent that anything particularly incorrect can
happen. On the other hand, SEEK_FORWARD can now consider this as seek
target, which the deleted code had to do explicitly, as kf_seek_pts was
unset for incomplete packet ranges.
2019-09-19 20:37:05 +02:00
wm4 e8ec271859 demux: fix a comment
Obviously doesn't sense with this order. The git history shows that this
comment was touched multiple times, without ever fixing it. It was
originally added in 2016, where the "for" was missing. Later, the "for"
was added, but to the wrong position.

What the fuck?
2019-09-19 20:37:05 +02:00
wm4 628abf53d1 demux: cache a value
Just for readability purposes. Although the field is mutable, it never
changes within the function.
2019-09-19 20:37:05 +02:00
wm4 aa03ee7300 demux: redo timed metadata
The old implementation didn't work for the OGG case. Discard the old
shit code (instead of fixing it), and write new shit code. The old code
was already over a year old, so it's about time to rewrite it for no
reason anyway.

While it's true that the old code appears to be broken, the main reason
to rewrite this is to make it simpler. While the amount of code seems to
be about the same, both the concept and the actual tag handling are
simpler. The result is probably a bit more correct.

The packet struct shrinks by 8 byte. That fact that it wasted 8 bytes
per packet for a rather obscure use case was the reason I started this
at all (and when I found that OGG updates didn't work). While these 8
bytes aren't going to hurt, the packet struct was getting too bloated.
If you buffer a lot of data, these extra fields will add up. Still quite
some effort for 8 bytes. Fortunately, it's not like there are any
managers that need to be convinced whether it's worth doing. The freedom
to waste time on dumb shit.

The old implementation attached the current metadata to each packet.
When the decoder read the packet, the packet's metadata was made
current. The new implementation stores metadata as separate list, and
requires that the player frontend tells it the current playback time,
which will be used to find the currently valid metadata. In both cases,
the objective was to correctly update metadata even if a lot of data is
buffered ahead (and to update them correctly when seeking within the
demuxer cache).

The new implementation is actually slightly more correct, because it
uses the playback time for the metadata lookup. Consider if you have an
audio filter which buffers 15 seconds (unfortunately such a filter
exists), then the old code would update the current title 15 seconds too
early, while the new one does it correctly.

The new code also simplifies mixing the 3 metadata sources (global, per
stream, ICY). We assume these aren't mixed in a meaningful way. The old
code tried to be a bit more "exact". I didn't bother to look how the old
code did this, but the new code simply always "merges" with the previous
metadata, so if a newer tag removes a field, it's going to stick around
anyway.

I tried to keep it simple. Other approaches include making metadata a
special sh_stream with metadata packets. This would have been
conceptually clean, but the implementation would probably have been
unnatural (and doesn't match well with libavformat's API anyway). It
would have been nice to make the metadata updates chapter points (makes
a lot of sense for the intended use case, web radio current song
information), but I don't think it would have been a good idea to make
chapters suddenly so dynamic. (Still an idea to keep in mind; the new
code actually makes it easier to work towards this.)

You could mention how subtitles are timed metadata, and actually are
implemented as sparse packet streams in some formats. mp4 implements
chapters as special subtitle stream, AFAIK. (Ironically, this is very
not-ideal for files. It would be useful for streaming like web radio,
but mp4 is extremely bad for streaming by design for other reasons.)

bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla
bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla
bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla
bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla
bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla
bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla
bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla
bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla
bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla
bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla
bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla
bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla
bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla
bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla
2019-09-19 20:37:05 +02:00
wm4 27fcd4ddc6 demux_lavf: compensate timestamp resets for OGG web radio streams
Some OGG web radio streams use timestamp resets when a new song starts
(you can find those Xiph's directory - other streams there don't show
this behavior). Basically, the OGG stream behaves like concatenated OGG
files, and "of course" the timestamps will start at 0 again when the
song changes. This is very inconvenient, and breaks the seekable demuxer
cache. In fact, any kind of seeking will break

This is more time wasted in Xiph's bullshit. No, having timestamp resets
by design is not reasonable, and fuck you. I much prefer the awful
ICY/mp3 streaming mess, even if that's lower quality and awful. Maybe it
wouldn't be so bad if libavformat could tell us WHERE THE FUCK THE RESET
HAPPENS. But it doesn't, and the randomly changing timestamps is the
only thing we get from its API.

At this point, demux_lavf.c is like 90% hacks. But well, if libavformat
applies this strange mixture of being clever for us vs. giving us
unfiltered garbage (while pretending it abstracts everything, and hiding
_useful_ implementation/low level details), not much we can do.

This timestamp linearizing would, in general, probably be better done
after the decoder, because then we wouldn't need to deal with timestamp
resets. But the main purpose of this change is to fix seeking within the
demuxer cache, so we have to do it on the lowest level.

This can probably be applied to other containers and video streams too.
But that is untested. Some further caveats are explained in the manpage.
2019-09-19 20:37:05 +02:00
wm4 cb82a206a9 demux_lavf: add per-stream state
Seems like this will be useful later.
2019-09-19 20:37:05 +02:00
wm4 91abd7a4f7 demux_lavf: use common mpv/ffmpeg timestamp conversion function
Probably doesn't change anything, other than looking slightly better. In
theory, the common function has some stuff that makes it more likely
that timestamps round-trip through conversions properly, but I didn't
confirm that.
2019-09-19 20:37:05 +02:00
wm4 fae31f39c7 demux: refactor cache range init/deinit
Remove the duplicated creation of the first range. Explicitly destroy
ranges, including the last one on final deinit.

It looks like this also fixes a leak of removed range structs, which was
never noticed because they're so small, and were freed on final deinit
due to having the demuxer as talloc parent.

This improves upon the previous commit too (that change should have
been part of it I guess). Sub-demuxers (demux_timeline only) now
automatically don't use the cache (like it was intended by the previous
commit). The cache is "initialized" (or disabled) last in the recursive
call chain, which is messy, but this sub demuxer stuff FUCKING SUCKS, as
mentioned in the previous commit message. This would be no problem if
the caching layer and actual demuxer implementations were separate.

Most of this change has no purpose. Might make (de-)initialization of
further cache exerpiments simpler.
2019-09-19 20:37:05 +02:00
wm4 e8147843fc demux: really disable cache for sub-demuxers
It seems the so called demuxer cache wasn't really disabled for
sub-demuxers (timeline stuff). This was relatively harmless, since the
actual packet data was shared anyway via refcounting. But with the
addition of a mmap cache backend, this may change a lot.

So strictly disable any caching for sub-demuxers. This assumes that
users of sub-demuxers (only demux_timeline.c by now?) strictly use
demux_read_any_packet(), since demux_read_packet_async() will require
some minor read-ahead if a low level packet read returned a packet for a
different stream.

This requires some awkward messing with this fucking heap of trash. The
thing that is really wrong here is that the demuxer API mixes different
concepts, and sub-demuxers get the same API as decoders, and use the
cache code.
2019-09-19 20:37:05 +02:00
wm4 5d6b7c39ab demux: handle accounting for index size differently
The demuxer cache tries to track the number of bytes allocated for the
cache. In addition to the packet queue, the seek index is another data
structure that roughly depends on the amount of packets cached. So the
index size should somehow be part of the total number of bytes tracking.

Until now, this was handled with KF_SEEK_ENTRY_WORST_CASE, basically a
shitty heuristic. It was a guess (and probably rather an upper bound
than a lower bound). The implementation details made it annoying, and it
was conceptually inaccurate too.

Change this, and instead simply add the index size to the total cache
size. This essentially makes it part of the backbuffer. It's nice that
this cleanly decouples it from the packet size tracking itself.

Since it's part of the backbuffer number of bytes now, packet pruning
can't necessarily free enough space in the backbuffer anymore. Before
this commit, the backbuffer consisted of packets only, so it was
possible to reduce its size to 0 by pruning all packets until the
decoder reader position, at which point a packet was accounted as
forward buffered. Now the index is added to this, and it can't be
pruned. Replace the assert() because of this changed invariant.
2019-09-19 20:37:05 +02:00