Commit 5e25a3d2 broke handling of the initial frame (the one decoded
with initial_audio_decode()). It didn't update the pts_offset field,
leading to a shift in timestamps by one audio frame.
Fix by calling the actual decode function in a single place. This
requires slightly more changes than what would be necessary to fix the
bug, but it also somewhat simplifies the data flow.
The goal is switching the whole audio chain to using refcounted frames.
This brings the architecture closer to FFmpeg, enables better
integration with libavfilter, will reduce useless copying somewhat, and
will probably allow better timestamp tracking.
For now, every filter goes through a semi-awful wrapper in
af_do_filter(), though. This will be fixed step by step, and the wrapper
should eventually be removed. Another thing that will have to be done is
improving the timestamp handling and avoiding extra copies for the AO.
Some of the new code is rather similar to the video filter code (the
core filter code basically just has types replaced). Such code
duplication is normally very unwanted, but in this case there's probably
no other choice. On the other hand, this code is pretty simple (even if
somewhat tricky). Maybe there will be unified filter code in the future,
but this is still far away.
This rewrites the audio decode loop to some degree. Audio filters don't
do refcounted frames yet, so af.c contains a hacky "emulation".
Remove some of the weird heuristic-heavy code in dec_audio.c. Instead of
estimating how much audio we need to filter, we always filter full
frames. Maybe this should be adjusted later: in case filtering increases
the volume of the audio data, we should try not to buffer too much
filter output by reducing the input that is fed at once.
For ad_spdif.c and ad_mpg123.c, we don't avoid extra copying yet - it
doesn't seem worth the trouble.
Use a pseudo-filter when changing speed with resampling, instead of
somehow changing a samplerate somewhere. This uses the same underlying
mechanism, but is a bit more structured and cleaner. It also makes some
of the following changes easier.
Since we now always use filters to change audio speed, move most of the
work set_playback_speed() does to recreate_audio_filters().
This gets rid of this warning:
Could not update timestamps for skipped samples.
This required an API addition to FFmpeg (otherwise we would instead be
doing arithmetic on the timestamps ourselves), so whether it works
depends on the FFmpeg version.
There's no real reason why audio_init_filter() should exist. Just use
af_init or af_reinit directly. (We lose a useless message; the same
information is printed in a quite close place with more details.)
Requires less code, and the way the filter chain is marked as having
failed to initialize allows just switching off audio instead of
crashing if trying to insert a volume filter in mixer.c fails, and
recreating the old filter chain fails too.
Let codec_tags.c do the messy mapping.
In theory we could simplify further by making demux_mkv.c directly use
codec names instead of the MPlayer-inherited "internal FourCC" business,
but I'd rather not touch this - it would just break things.
For a while, we used this to transfer PCM from demuxer to the filter
chain. We had a special "codec" that mapped what MPlayer used to do
(MPlayer passes the AF sample format over an extra field to ad_pcm,
which specially interprets it).
Do this by providing a mp_set_pcm_codec() function, which describes a
sample format in a generic way, and sets the appropriate demuxer header
fields so that libavcodec interprets it correctly. We use the fact that
libavcodec has separate PCM decoders for each format. These are
systematically named, so we can easily map them.
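
As a rough illustration of the systematic naming, the mapping could look
like the sketch below. The helper pcm_codec_name() and its parameters
are hypothetical and are not the actual mp_set_pcm_codec() signature;
only the "pcm_<type><bits><endian>" naming pattern is taken from
libavcodec's PCM decoders.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

// Hypothetical helper (not the real mp_set_pcm_codec()): build a libavcodec
// PCM decoder name such as "pcm_s16le" from a generic format description.
// Returns the number of characters written, or -1 if the buffer is too small.
int pcm_codec_name(char *buf, size_t size, bool is_signed, bool is_float,
                   int bits, bool is_be)
{
    assert(buf && size > 0);
    const char *type = is_float ? "f" : (is_signed ? "s" : "u");
    // 8 bit formats have no byte order, so they carry no endian suffix.
    const char *endian = bits == 8 ? "" : (is_be ? "be" : "le");
    int n = snprintf(buf, size, "pcm_%s%d%s", type, bits, endian);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

The caller would then store the resulting name in whatever demuxer
header field selects the decoder.
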
This has the advantage that we can change the audio filter chain as we
like, without losing features from the "rawaudio" demuxer. In fact, this
commit also gets rid of the audio filter chain formats completely.
Instead have an explicit list of PCM formats. (We could even just have
the user pass libavcodec PCM decoder names directly, but that would be
annoying in other ways.)
Before this commit, there was AF_FORMAT_AC3 (the original spdif format,
used for AC3 and DTS core), and AF_FORMAT_IEC61937 (used for AC3, DTS
and DTS-HD), which was handled as some sort of superset for
AF_FORMAT_AC3. There also was AF_FORMAT_MPEG2, which used
IEC61937-framing, but still was handled as something "separate".
Technically, all of them are pretty similar, but may use different
bitrates. Since digital passthrough pretends to be PCM (just with
special headers that wrap digital packets), this is easily detectable by
the higher samplerate or higher number of channels, so I don't know why
you'd need a separate "class" of sample formats (AF_FORMAT_AC3 vs.
AF_FORMAT_IEC61937) to distinguish them. Actually, this whole thing is
just a mess.
Simplify this by handling all these formats the same way.
AF_FORMAT_IS_IEC61937() now returns 1 for all spdif formats (even MP3).
All AOs just accept all spdif formats now - whether that works or not is
not really clear (seems inconsistent due to earlier attempts to make
DTS-HD work). But on the other hand, enabling spdif requires manual user
interaction, so it doesn't matter much if initialization fails in
slightly less graceful ways if it can't work at all.
At a later point, we will support passthrough with ao_pulse. Since it
seems the PulseAudio API wants to know the codec type (or maybe not -
feeding it DTS while telling it it's AC3 works), add separate formats
for each codec. While this is reminiscent of the earlier chaos, it's
stricter, and most code just uses AF_FORMAT_IS_IEC61937().
Also, modify AF_FORMAT_TYPE_MASK (renamed from AF_FORMAT_POINT_MASK) to
include special formats, so that it always describes the fundamental
sample format type. This also ensures valid AF formats are never 0 (this
was probably broken in one of the earlier commits from today).
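
To illustrate why a type mask that covers special formats keeps valid
formats non-zero, here is a minimal sketch. The FMT_* names and bit
positions are assumptions made up for this example and do not reflect
the real AF_FORMAT_* values.

```c
#include <stdbool.h>

// Hypothetical bit layout, NOT mpv's actual AF_FORMAT_* definitions.
#define FMT_TYPE_INT      (1 << 9)   // integer PCM
#define FMT_TYPE_FLOAT    (2 << 9)   // floating point PCM
#define FMT_TYPE_SPECIAL  (4 << 9)   // spdif and other non-PCM data
#define FMT_TYPE_MASK     (7 << 9)   // covers all three types

// True for every special (passthrough) format, whatever codec it carries.
bool fmt_is_special(int fmt)
{
    return (fmt & FMT_TYPE_MASK) == FMT_TYPE_SPECIAL;
}

// Every valid format has exactly one type set, so it can never be 0.
bool fmt_is_valid(int fmt)
{
    int type = fmt & FMT_TYPE_MASK;
    return type == FMT_TYPE_INT || type == FMT_TYPE_FLOAT ||
           type == FMT_TYPE_SPECIAL;
}
```
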
Until now, the audio chain could handle both little endian and big
endian formats. This actually doesn't make much sense, since the audio
API and the HW will most likely prefer native formats. Or at the very
least, it should be trivial for audio drivers to do the byte swapping
themselves.
From now on, the audio chain contains native-endian formats only. All
AOs and some filters are adjusted. af_convertsignendian.c is now wrongly
named, but the filter name is adjusted. In some cases, the audio
infrastructure was reused on the demuxer side, but that is relatively
easy to rectify.
This is a quite intrusive and radical change. It's possible that it will
break some things (especially if they're obscure or not Linux), so watch
out for regressions. It's probably still better to do it the bulldozer
way, since slow transition and researching foreign platforms would take
a lot of time and effort.
IEC 61937 frames should always be little endian (little endian 16 bit
words). I don't see any apparent reason why the audio chain should
handle swapped-endian formats.
It could be that some audio outputs might want them (especially on big
endian architectures). On the other hand, it's not clear how that works
on these architectures, and it's not even known whether the current code
works on big endian at all. If something should break, and it should
turn out that swapped-endian spdif is needed on any platform/AO,
swapping still could be done in-place within the affected AO, and
there's no need for the additional complexity in the rest of the player.
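
If an AO ever did need swapped-endian spdif, the in-place swap could be
as small as the sketch below. The function name is hypothetical, and
the code only assumes a buffer of 16 bit words.

```c
#include <stddef.h>
#include <stdint.h>

// Swap the byte order of 16 bit words in place, e.g. right before an AO
// hands the buffer to a device that expects the other byte order.
void swap_16bit_words(uint16_t *data, size_t num_words)
{
    for (size_t i = 0; i < num_words; i++)
        data[i] = (uint16_t)((data[i] << 8) | (data[i] >> 8));
}
```
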
Note that af_lavcac3enc outputs big endian spdif frames for unknown
reasons. Normally, the resulting data is just pulled through an auto-
inserted conversion filter and turned into little endian. Maybe this was
done as a trick so that the code didn't have to byte-swap the actual
audio frame. In any case, just make it output little endian frames.
All of this is untested, because I have no receiver hardware.
libavcodec/libavformat now handles gapless audio better. In theory, this
could be implemented with ad_mpg123 too, but since libavformat strips
metadata from mp3 files and passes pure mp3 packets to the decoders
only, this can't work by itself. Instead, the player must pass this
metadata separately. libav* do this relatively transparently over packet
"side data" (attached to AVPacket).
It might also be possible to let libmpg123 handle all this by
implementing it as a demuxer that outputs PCM, but that would have other
problems, and I think it's better to make libavformat work correctly.
libmpg123 can still be used with '--ad=mpg123:mp3'.
Also see issue #1101.
bstr.c doesn't really deserve its own directory, and compat had just
a few files, most of which may as well be in osdep. There isn't really
any justification for these extra directories, so get rid of them.
The compat/libav.h was empty - just delete it. We changed our approach
to API compatibility, and will likely not need it anymore.
Use OPT_KEYVALUELIST() for all places where AVOptions are directly set
from mpv command line options. This allows escaping values, better
diagnostics (also no more "pal"), and somehow reduces code size.
Remove the old crappy option parser (av_opts.c).
It probably happens relatively often that the first packet (or even the
first N packets) of a stream will fail to decode, but decoding will
eventually succeed at a later point. Before commit 261506e3, this was
handled by an explicit retry loop (although this was also for other
purposes), but it was then changed to abort on the first error. This
makes it impossible to decode some audio streams.
Change this so that errors are ignored for the first 50 packets, which
should make it equivalent to the old code.
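
A minimal sketch of such a tolerance window follows. The struct, the
function and the constant name are hypothetical; only the limit of 50
packets comes from the text above.

```c
#include <stdbool.h>

#define MAX_INITIAL_ERRORS 50  // leading packets whose errors are ignored

// Hypothetical per-decoder state; the field name is illustrative only.
struct dec_state {
    int packets_seen;  // packets fed to the decoder so far
};

// Account for one decoded packet. 'ok' is the decoder's result for it.
// Returns true if a failure should abort decoding, false if decoding
// should simply continue with the next packet.
bool decode_error_is_fatal(struct dec_state *s, bool ok)
{
    s->packets_seen++;
    if (ok)
        return false;
    return s->packets_seen > MAX_INITIAL_ERRORS;
}
```
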
This commit makes audio decoding non-blocking. If e.g. the network is
too slow the playloop will just go to sleep, instead of blocking until
enough data is available.
For video, this was already done with commit 7083f88c. For audio, it's
unfortunately much more complicated, because the audio decoder was used
in a blocking manner. Large changes are required to get around this.
The whole playback restart mechanism must be turned into a statemachine,
especially since it has close interactions with video restart. Lots of
video code is thus also changed.
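
Very roughly, the state machine idea looks like the sketch below. The
state names and transition conditions are assumptions for illustration
and are not the ones used by the player.

```c
#include <stdbool.h>

// Hypothetical states for restarting audio playback.
enum audio_status {
    AUDIO_SYNCING,   // looking for the first usable audio data
    AUDIO_FILLING,   // buffering until enough audio is available
    AUDIO_READY,     // audio is buffered, waiting for video
    AUDIO_PLAYING,   // normal playback
};

// Advance by at most one state per playloop iteration and never block. If
// a condition is not met yet, the state is unchanged and the caller can
// go to sleep until new data arrives.
enum audio_status audio_restart_step(enum audio_status st, bool have_data,
                                     bool buffer_full, bool video_ready)
{
    switch (st) {
    case AUDIO_SYNCING: return have_data ? AUDIO_FILLING : st;
    case AUDIO_FILLING: return buffer_full ? AUDIO_READY : st;
    case AUDIO_READY:   return video_ready ? AUDIO_PLAYING : st;
    case AUDIO_PLAYING: return st;
    }
    return st;
}
```
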
(For the record, I don't think switching this code to threads would
make this conceptually easier: the code would still have to deal with
external input while blocked, so these in-between states do become
visible [and thus need to be handled] anyway. On the other hand, it
certainly should be possible to modularize this code a bit better.)
This will probably cause a bunch of regressions.
Accidentally broken in b6af44d3. For ad_lavc (and in general), the PTS
was not updated correctly when filtering only parts of audio frames,
and for ad_mpg123 and ad_spdif the PTS was additionally offset by the
frame size.
This could lead to incorrect time display, and possibly broken A/V sync.
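
The PTS update for partially consumed frames boils down to the
arithmetic sketched below. The function name is hypothetical, and the
PTS is assumed to be in seconds.

```c
#include <assert.h>

// Advance a PTS (in seconds) past 'samples' consumed samples. When only
// part of a decoded frame is filtered, the remainder starts this much later.
double pts_after_samples(double pts, int samples, int samplerate)
{
    assert(samplerate > 0);
    return pts + (double)samples / samplerate;
}
```
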
Execute the format change based on whether we logically detected EOF
(after filters), instead of when the decode buffer was drained. It's
slightly cleaner. (The requirement of len>0 existed before.)
Don't return an EOF code if there's still buffered data.
Also, don't call demux_stream_eof() in the playloop. There's probably
nothing wrong with it, but it's cleaner not to use it.
Also give AD_EOF its own value, so that a decoding error doesn't drain
audio by causing an EOF condition.
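
A sketch of why the distinct value matters follows. The enum values
and the helper are assumptions; only the AD_EOF name comes from the
text above.

```c
#include <stdbool.h>

// Hypothetical status codes; the values are chosen only so that each
// condition is distinguishable from the others.
enum ad_status {
    AD_OK  = 0,   // data was decoded
    AD_ERR = -1,  // decoding failed, but more input may still follow
    AD_EOF = -2,  // the stream really ended
};

// Only a real EOF should make the player drain the audio output.
bool should_drain_audio(enum ad_status status)
{
    return status == AD_EOF;
}
```
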
Move a function call, which does not change semantics.
Write the extra buffer sample count in a more straight-forward way; the
old code was not meaningful in any way (anymore).
It's true that the decoder can successfully decode, but return no data
(for various reasons). We don't need to handle this specially, though.
We just let the decoder decode some more data. This doesn't increase the
danger of an endless loop either, because audio_decode() already calls
this function until enough is decoded.
This commit mainly moves the initial decoding of data (done to probe the
audio format) to generic code. This will make it easier to make audio
decoding non-blocking in a later commit.
This commit also changes how decoders return data: instead of having
them write the data into a prepared buffer, they return a reference to
an internal buffer (by setting dec_audio.decoded). This makes it
significantly easier to handle audio format changes, since the decoders
don't really need to care anymore.
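
The "return a reference" idea can be sketched as follows. All types
and the fake_decode() stand-in are hypothetical and heavily simplified;
only the name of the decoded field is taken from the text above.

```c
#include <stddef.h>

#define MAX_SAMPLES 4096

// Hypothetical, simplified description of decoded audio.
struct audio_frame {
    const float *data;  // points into decoder-owned memory, not a copy
    int samples;        // samples per channel
    int channels;
    int samplerate;
};

struct dec_audio {
    float storage[MAX_SAMPLES];  // internal buffer owned by the decoder
    struct audio_frame decoded;  // describes the most recent output
};

// Stand-in for a real decoder: produce 'samples' silent samples in the
// given format and publish them through da->decoded. Returns 0 on
// success, -1 if the request does not fit the internal buffer.
int fake_decode(struct dec_audio *da, int samples, int channels, int rate)
{
    if (samples < 0 || channels < 1 ||
        (size_t)samples * (size_t)channels > MAX_SAMPLES)
        return -1;
    for (int i = 0; i < samples * channels; i++)
        da->storage[i] = 0.0f;
    da->decoded = (struct audio_frame){da->storage, samples, channels, rate};
    return 0;
}
```

Because the caller reads the format from the decoded field after every
call, a mid-stream format change needs no special handling on the
decoder side.
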
If the decoder didn't set a samplerate, it was initialized from the
container samplerate.
This probably didn't make much sense, because it's passed to the
decoder on initialization (so it could definitely use it). It's an
artifact from commit 66a9eb57 (which removed some Matroska-specific
nonsense), and I've never seen it actually happen since it was made into
a warning. Just get rid of it.
In most places where af_fmt2bits is called to get the bits/sample, the
result is immediately converted to bytes/sample. Avoid this by getting
bytes/sample directly by introducing af_fmt2bps.
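
In spirit, the new helper is just the sketch below. The enum and
function names are hypothetical stand-ins; the real af_fmt2bits and
af_fmt2bps operate on AF_FORMAT_* values.

```c
// Hypothetical format IDs standing in for the real AF_FORMAT_* values.
enum sample_format {
    FMT_U8, FMT_S16, FMT_S24, FMT_S32, FMT_FLOAT, FMT_DOUBLE
};

// Bits per sample for one channel.
int fmt2bits(enum sample_format fmt)
{
    switch (fmt) {
    case FMT_U8:     return 8;
    case FMT_S16:    return 16;
    case FMT_S24:    return 24;
    case FMT_S32:    return 32;
    case FMT_FLOAT:  return 32;
    case FMT_DOUBLE: return 64;
    }
    return 0;
}

// Bytes per sample, so callers no longer divide by 8 themselves.
int fmt2bps(enum sample_format fmt)
{
    return fmt2bits(fmt) / 8;
}

// Typical caller: size in bytes of one sample across all channels.
int frame_size_bytes(enum sample_format fmt, int channels)
{
    return fmt2bps(fmt) * channels;
}
```
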
The i_bps members of the sh_audio and dev_video structs are mostly used
for displaying the average audio and video bitrates. Keeping them in
bits-per-second avoids truncating them to bytes-per-second and changing
them back later on.
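
The truncation being avoided is plain integer division, as the sketch
below shows. The function name is hypothetical.

```c
// Convert a bitrate to bytes per second and back, which is what storing
// the value in bytes per second amounts to. Integer division drops up to
// 7 bits per second.
int bitrate_via_bytes(int bits_per_second)
{
    int bytes_per_second = bits_per_second / 8;
    return bytes_per_second * 8;
}
```
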
Also remove MSGL_SMODE and friends.
Note: The indent in options.rst was added to work around a bug in
ReportLab that causes the PDF manual build to fail.
This collects statistics and other things. The option dumps raw data
into a file. A script to visualize this data is included too.
Litter some of the player code with calls that generate these
statistics.
In general, this will be helpful to debug timing dependent issues, such
as A/V sync problems. Normally, one could argue that this is the task of
a real profiler, but then we'd have a hard time including extra
information like audio/video PTS differences. We could also just
hardcode all statistics collection and processing in the player code,
but then we'd end up with something like mplayer's status line, which
was cluttered and required a centralized approach (i.e. getting the data
to the status line; so it was all in mplayer.c). Some players can
visualize such statistics on OSD, but that sounds even more complicated.
So the approach added with this commit sounds sensible.
The stats-conv.py script is rather primitive at the moment and its
output is semi-ugly. It uses matplotlib, so it could probably be
extended to do a lot, so it's not a dead-end.
Set refcounted_frames, because in some versions of libavcodec mixing the
new AVFrame API and non-refcounted decoding could cause memory
corruption. Likewise, it's probably still required to unref a frame
before calling the decoder.
request_channels has been deprecated for years (request_channel_layout
is the replacement), but it appears it's still needed despite the
deprecation at least on older libavcodec versions.
So still set request_channels, but do it with the AVOption API, which
hides the deprecation warning. This should also prevent mpv getting
trashed when libavcodec happens to bump its major version.
Since m_option.h and options.h are extremely often included, a lot of
files have to be changed.
Moving path.c/h to options/ is a bit questionable, but since this is
mainly about access to config files (which are also handled in
options/), it's probably ok.
The tmsg stuff was for the internal gettext() based translation system,
which nobody ever attempted to use and thus was removed. mp_gtext() and
set_osd_tmsg() were also for this.
mp_dbg was once enabled in debug mode only, but since we have a log
level for enabling debug messages, it seems utterly useless.