Until now, the audio chain could handle both little endian and big
endian formats. This actually doesn't make much sense, since the audio
API and the HW will most likely prefer native formats. Or at the very
least, it should be trivial for audio drivers to do the byte swapping
themselves.
From now on, the audio chain contains native-endian formats only. All
AOs and some filters are adjusted. af_convertsignendian.c is now wrongly
named, but the filter name is adjusted. In some cases, the audio
infrastructure was reused on the demuxer side, but that is relatively
easy to rectify.
This is a quite intrusive and radical change. It's possible that it will
break some things (especially if they're obscure or not Linux), so watch
out for regressions. It's probably still better to do it the bulldozer
way, since slow transition and researching foreign platforms would take
a lot of time and effort.
IEC 61937 frames should always be little endian (little endian 16 bit
words). I don't see any apparent need why the audio chain should handle
swapped-endian formats.
It could be that some audio outputs might want them (especially on big
endian architectures). On the other hand, it's not clear how that works
on these architectures, and it's not even known whether the current code
works on big endian at all. If something should break, and it should
turn out that swapped-endian spdif is needed on any platform/AO,
swapping still could be done in-place within the affected AO, and
there's no need for the additional complexity in the rest of the player.
Note that af_lavcac3enc outputs big endian spdif frames for unknown
reasons. Normally, the resulting data is just pulled through an auto-
inserted conversion filter and turned into little endian. Maybe this was
done as a trick so that the code didn't have to byte-swap the actual
audio frame. In any case, just make it output little endian frames.
All of this is untested, because I have no receiver hardware.
libavcodec/libavformat now handles gapless audio better. In theory, this
could be implemented with ad_mpg123 too, but since libavformat strips
metadata from mp3 files and passes pure mp3 packets to the decoders
only, this can't work by itself. Instead, the player must pass this
metadata separately. libav* do this relatively transparently over packet
"side data" (attached to AVPacket).
It might also be possible to let libmpg123 handles all this by
implementing it as demuxer that outputs PCM, but that would have other
problems, and I think it's better to make libavformat work correctly.
libmpg123 can still be used with '--ad=mpg123:mp3'.
Also see issue #1101.
bstr.c doesn't really deserve its own directory, and compat had just
a few files, most of which may as well be in osdep. There isn't really
any justification for these extra directories, so get rid of them.
The compat/libav.h was empty - just delete it. We changed our approach
to API compatibility, and will likely not need it anymore.
Use OPT_KEYVALUELIST() for all places where AVOptions are directly set
from mpv command line options. This allows escaping values, better
diagnostics (also no more "pal"), and somehow reduces code size.
Remove the old crappy option parser (av_opts.c).
It probably happens relatively often that the first packet (or even the
first N packets) of a stream will fail to decode, but decoding will
eventually succeed at a later point. Before commit 261506e3, this was
handled by an explicit retry loop (although this was also for other
purposes), but with then was changed to abort on the first error. This
makes it impossible to decode some audio streams.
Change this so that errors are ignored for the first 50 packets, which
should make it equivalent to the old code.
This commit makes audio decoding non-blocking. If e.g. the network is
too slow the playloop will just go to sleep, instead of blocking until
enough data is available.
For video, this was already done with commit 7083f88c. For audio, it's
unfortunately much more complicated, because the audio decoder was used
in a blocking manner. Large changes are required to get around this.
The whole playback restart mechanism must be turned into a statemachine,
especially since it has close interactions with video restart. Lots of
video code is thus also changed.
(For the record, I don't think switching this code to threads would
make this conceptually easier: the code would still have to deal with
external input while blocked, so these in-between states do get visible
[and thus need to be handled] anyway. On the other hand, it certainly
should be possible to modularize this code a bit better.)
This will probably cause a bunch of regressions.
Accidentally broken in b6af44d3. For ad_lavc (and in general), the PTS
was not updated correctly when filtering only parts of audio frames,
and for ad_mpg123 and ad_spdif the PTS was additionally offset by the
frame size.
This could lead to incorrect time display, and possibly broken A/V sync.
Execute the format change based on whether we logically detected EOF
(after filters), instead of when the decode buffer was drained. It's
slightly cleaner. (The requirement of len>0 existed before.)
Don't return an EOF code if there's still buffered data.
Also, don't call demux_stream_eof() in the playloop. There's probably
nothing wrong with it, but it's cleaner not to use it.
Also give AD_EOF its own value, so that a decoding error doesn't drain
audio by causing an EOF condition.
Move a function call, which does not change semantics.
Write the extra buffer sample count in a more straight-forward way; the
old code was not meaningful in any way (anymore).
It's true that the decoder can successfully decode, but return no data
(for various reasons). We don't need to handle this specially, though.
We just let the decoder decode some more data. This doesn't increase the
danger of an endless loop either, because audio_decode() already calls
this function until enough is decoded.
This commit mainly moves the initial decoding of data (done to probe the
audio format) to generic code. This will make it easier to make audio
decoding non-blocking in a later commit.
This commit also changes how decoders return data: instead of having
them write the data into a prepared buffer, they return a reference to
an internal buffer (by setting dec_audio.decoded). This makes it
significantly easier to handle audio format changes, since the decoders
don't really need to care anymore.
If the decoder didn't set a samplerate, it was initialized from the
container samplerate.
This probably didn't make much sense, because it's passed to the
decoder on initialization (so it could definitely use it). It's an
artifact from commit 66a9eb57 (which removed some Matroska-specific non-
sense), and I've never seen it actually happen since it was made into a
warning. Just get rid of it.
In most places where af_fmt2bits is called to get the bits/sample, the
result is immediately converted to bytes/sample. Avoid this by getting
bytes/sample directly by introducing af_fmt2bps.
The i_bps members of the sh_audio and dev_video structs are mostly used
for displaying the average audio and video bitrates. Keeping them in
bits-per-second avoids truncating them to bytes-per-second and changing
them back lateron.
Also remove MSGL_SMODE and friends.
Note: The indent in options.rst was added to work around a bug in
ReportLab that causes the PDF manual build to fail.
This collects statistics and other things. The option dumps raw data
into a file. A script to visualize this data is included too.
Litter some of the player code with calls that generate these
statistics.
In general, this will be helpful to debug timing dependent issues, such
as A/V sync problems. Normally, one could argue that this is the task of
a real profiler, but then we'd have a hard time to include extra
information like audio/video PTS differences. We could also just
hardcode all statistics collection and processing in the player code,
but then we'd end up with something like mplayer's status line, which
was cluttered and required a centralized approach (i.e. getting the data
to the status line; so it was all in mplayer.c). Some players can
visualize such statistics on OSD, but that sounds even more complicated.
So the approach added with this commit sounds sensible.
The stats-conv.py script is rather primitive at the moment and its
output is semi-ugly. It uses matplotlib, so it could probably be
extended to do a lot, so it's not a dead-end.
Set refcounted_frames, because in some versions of libavcodec mixing the
new AVFrame API and non-refcounted decoding could cause memory
corruption. Likewise, it's probably still required to unref a frame
before calling the decoder.
request_channels has been deprecated for years (request_channel_layout
is the replacement), but it appears it's still needed despite the
deprecation at least on older libavcodec versions.
So still set request_channels, but to it with the avoption API, which
hides the deprecation warning. This should also prevent mpv getting
trashed when libavcodec happens to bump its major version.
Since m_option.h and options.h are extremely often included, a lot of
files have to be changed.
Moving path.c/h to options/ is a bit questionable, but since this is
mainly about access to config files (which are also handled in
options/), it's probably ok.
The tmsg stuff was for the internal gettext() based translation system,
which nobody ever attempted to use and thus was removed. mp_gtext() and
set_osd_tmsg() were also for this.
mp_dbg was once enabled in debug mode only, but since we have log level
for enabling debug messages, it seems utterly useless.
This can be reproduced with:
mpv short.wav -af 'lavfi="aecho=0.8:0.9:5000|6800:0.3|0.25"'
An audio file that is just 1-2 seconds long should play for 8-9 seconds,
which audible echo towards the end.
The code assumes that when playing with AF_FILTER_FLAG_EOF, the filter
will either produce output, or has all remaining data flushed. I'm not
really sure whether this really works if there are multiple filters with
EOF handling in the chain. To handle it correctly, af_lavfi should retry
filtering if 1. EOF flag is set, 2. there were input samples, and 3. no
output samples were produced. But currently it seems to work well enough
anyway.
The new signature is actually closer to how it actually works, and
someone who is not familiar to the API and how it works might make fewer
fatal mistakes with the new signature than the old one. Pretty weird.
Do this to sneak in a flags parameter, which will later be used to flush
remaining data of at least vf_lavfi.
Normally, audio decoder don't have a decoder delay, so the code was
fine. But FFmpeg supports multithreaded decoding for some audio codecs,
which introduces such a delay.
The delay means that we won't get decoded audio for the first few
packets, and that we need to do something to get the trailing audio
still buffered in the decoder when reaching EOF.
Two changes are needed to deal with the delay:
- If EOF is reached, pass a "flush" packet to the decoder to return the
buffered audio. Such a flush packet is automatically setup when
calling mp_set_av_packet() with a NULL packet.
- Use the PTS returned by the decoder, instead of the packet's. This is
important to get correct timestamps for decoded audio. Ignoring this
would result into offsetting the audio playback time by the decoder
delay. Note that we can still use the timestamp of the first packet
to get the timestamp for the start of the audio.
If the timebase is set, it's used for converting the packet timestamps.
Otherwise, the previous method of reinterpret-casting the mpv style
double timestamps to libavcodec style int64_t timestamps is used.
Also replace the kind of awkward mp_get_av_frame_pkt_ts() function by
mp_pts_from_av(), which simply converts timestamps in a way the old
function did. (Plus it takes a timebase parameter, similar to the
addition to mp_set_av_packet().)
Note that this should not change anything yet. The code in ad_lavc.c and
vd_lavc.c passes NULL for the timebase parameters. We could set
AVCodecContext.pkt_timebase and use that if we want to give libavcodec
"proper" timestamps.
This could be important for ad_lavc.c: some codecs (opus, probably mp3
and aac too) have weird requirements about doing decoding preroll on the
container level, and thus require adjusting the audio start timestamps
in some cases. libavcodec doesn't tell us how much was skipped, so we
either get shifted timestamps (by the length of the skipped data), or we
give it proper timestamps. (Note: libavcodec interprets or changes
timestamps only if pkt_timebase is set, which by default it is not.)
This would require selecting a timebase though, so I feel uncomfortable
with the idea. At least this change paves the way, and will allow some
testing.
These used the suffix _resync_stream, which is a bit misleading. Nothing
gets "resynchronized", they really just reset state.
(Some audio decoders actually used to "resync" by reading packets for
resuming playback, but that's not the case anymore.)
Also move the function in dec_video.c to the top of the file.
This includes the case when lavc decodes audio with more than 8
channels, which our audio chain currently does not support.
the changes in ad_lavc.c are just simplifications. The code tried to
avoid overriding global parameters if it found something invalid, but
that is not needed anymore.
Apparently just 5 packets is not enough for the initial audio decode
(which is needed to find the format). The old code (before the recent
refactor) appeared to use 5 packets, but there were apparently other
code paths which in the end amounted to more than 5 packets being read.
The sample that failed (see github issue #368) needed 9 packets.
Fixes#368.
This used to be needed to access the generic stream header from the
specific headers, which in turn was needed because the decoders had
access only to the specific headers. This is not the case anymore, so
this can finally be removed again.
Also move the "format" field from the specific headers to sh_stream.
sh_audio is supposed to contain file headers, not whatever was decoded.
Fix this, and write the decoded format to separate fields in the decoder
context, the dec_audio.decoded field. (Note that this field is really
only needed to communicate the audio format from decoder driver to the
generic code, so no other code accesses it.)
Move all state that basically changes during decoding or is needed in
order to manage decoding itself into a new struct (dec_audio).
sh_audio (defined in stheader.h) is supposed to be the audio stream
header. This should reflect the file headers for the stream. Putting the
decoder context there is strange design, to say the least.