Move all state that basically changes during decoding or is needed in
order to manage decoding itself into a new struct (dec_audio).
sh_audio (defined in stheader.h) is supposed to be the audio stream
header. This should reflect the file headers for the stream. Putting the
decoder context there is strange design, to say the least.
The AF control commands used an elaborate and unnecessary organization
for the command constants. Get rid of all that and convert the
definitions to a simple enum. Also remove the control commands that
were not really needed, because they were not used outside of the
filters that implemented them.
And by "cleanup", I mean "remove". Actually, only remove the parts that
are redundant and doxygen noise. Move useful parts to the comment above
the function's implementation in the C source file.
When the decoder detects a format change, it overwrites the values
stored in sh_audio (this affects the members sample_format, samplerate,
channels). In the case when the old audio data still needs to be
played/filtered, the audio format as identified by sh_audio and the
format used for the decoder buffer can mismatch. In particular, they
will mismatch in the very unlikely but possible case the audio chain is
reinitialized while old data is draining during a format change.
Or in other words, sh_audio might contain the new format, while the
audio chain is still configured to use the old format.
Currently, the audio code (player/audio.c and init_audio_filters) access
sh_audio to get the current format. This is in theory incorrect for the
reasons mentioned above. Use the decoder buffer's format instead, which
should be correct at any point.
Commit 22b3f522 not only redid major aspects of audio decoding, but also
attempted to fix audio format change handling. Before that commit, data
that was already decoded but not yet filtered was thrown away on a
format change. After that commit, data was supposed to finish playing
before rebuilding filters and so on.
It was still buggy, though: the decoder buffer was initialized to the
new format too early, triggering an assertion failure. Move the reinit
call below filtering to fix this.
ad_mpg123.c needs to be adjusted so that it doesn't decode new data
before the format change is actually executed.
Add some more assertions to af_play() (audio filtering) to make sure
input data and configured format don't mismatch. This will also catch
filters which don't set the format on their output data correctly.
Regression due to planar_audio branch.
Simulate proper handling of AOPLAY_FINAL_CHUNK. Print when underruns
occur (i.e. running out of data). Add some options that control
simulated buffer and outburst sizes.
All this is useful for debugging and self-documentation. (Note that
ao_null always was supposed to simulate an ideal AO, which is the reason
why it fools people who try to use it for benchmarking video.)
This should allow it to select better fallback formats, instead of
picking the first encoder sample format if ao->format is not equal to
any of the encoder sample formats.
Not sure what is supposed to happen if the encoder provides no
compatible sample format (or no sample format list at all), but in this
case ao_lavc.c still fails gracefully.
The added function af_format_conversion_score() can be used to select
the best sample format to convert to in order to reduce loss and extra
conversion work.
It calculates a "loss" score when going from one format to another, and
for each conversion that needs to be done a certain score is subtracted.
Thus, if you have to convert from one format to a set of other formats,
you can calculate the score for each conversion, and pick the one with
the highest score.
Conversion between int and float is considered the worst case. One odd
consequence is that when converting from s32 to u8 or float, u8 will be
picked.
Test program used to develop this follows:
#define MAX_FMT 200
struct entry {
const char *name;
int score;
};
static int compentry(const void *px1, const void *px2)
{
const struct entry *x1 = px1;
const struct entry *x2 = px2;
if (x1->score > x2->score)
return 1;
if (x1->score < x2->score)
return -1;
return 0;
}
int main(int argc, char *argv[])
{
for (int n = 0; af_fmtstr_table[n].name; n++) {
struct entry entry[MAX_FMT];
int entries = 0;
for (int i = 0; af_fmtstr_table[i].name; i++) {
assert(i < MAX_FMT);
entry[entries].name = af_fmtstr_table[i].name;
entry[entries].score =
af_format_conversion_score(af_fmtstr_table[i].format,
af_fmtstr_table[n].format);
entries++;
}
qsort(&entry[0], entries, sizeof(entry[0]), compentry);
for (int i = 0; i < entries; i++) {
printf("%s -> %s: %d \n", af_fmtstr_table[n].name,
entry[i].name, entry[i].score);
}
}
}
These must be written even if there was no "final frame", e.g. due to
the player being exited with "q".
Although the issue is mostly of theoretical nature, as most audio codecs
don't need the final encoding calls with NULL data. Maybe will be more
relevant in the future.
This used to be in bytes, now it's in samples. Divide the value by 8
(assuming a typical audio format, float samples with 2 channels).
Fix some editing mistake or non-sense about the extra buffering added
(1<<x instead of x<<5).
Also sneak in a s/MPlayer/mpv/.
This changes option parsing as well as filter defaults slightly. The
default is now to encode to spdif (this is way more useful than writing
raw AC3 - what was this even useful for, other than writing broken ac3
-in-wav files?). The bitrate parameter is now always in kbps.
Apparently this was completely broken after commit 22b3f522. Basically,
this locked up immediately completely while decoding the first packet.
The reason was that the buffer calculations confused bytes and number of
samples. Also, EOF reporting was broken (wrong return code).
The special-casing of ad_mpg123 and ad_spdif (with DECODE_MAX_UNIT) is a
bit annoying, but will eventually be solved in a better way.
ao_null should simulate a "perfect" AO, but framestepping behaved quite
badly with it. Framstepping usually exposes problems with AOs dropping
their buffers on pause, and that's what happened here.
Most libavcodec decoders output non-interleaved audio. Add direct
support for this, and remove the hack that repacked non-interleaved
audio back to packed audio.
Remove the minlen argument from the decoder callback. Instead of
forcing every decoder to have its own decode loop to fill the buffer
until minlen is reached, leave this to the caller. So if a decoder
doesn't return enough data, it's simply called again. (In future, I
even want to change it so that decoders don't read packets directly,
but instead the caller has to pass packets to the decoders. This fits
well with this change, because now the decoder callback typically
decodes at most one packet.)
ad_mpg123.c receives some heavy refactoring. The main problem is that
it wanted to handle format changes when there was no data in the decode
output buffer yet. This sounds reasonable, but actually it would write
data into a buffer prepared for old data, since the caller doesn't know
about the format change yet. (I.e. the best place for a format change
would be _after_ writing the last sample to the output buffer.) It's
possible that this code was not perfectly sane before this commit,
and perhaps lost one frame of data after a format change, but I didn't
confirm this. Trying to fix this, I ended up rewriting the decoding
and also the probing.
libav* is generally freaking horrible, and might do bad things if the
data pointer passed to it are not aligned. One way to be sure that the
alignment is correct is allocating all pointers using av_malloc().
It's possible that this is not needed at all, though. For now it might
be better to keep this, since the mp_audio code is intended to replace
another buffer in dec_audio.c, which is currently av_malloc() allocated.
The original reason why this uses av_malloc() is apparently because
libavcodec used to directly encode into mplayer buffers, which is not
the case anymore, and thus (probably) doesn't make sense anymore.
(The commit subject uses the word "cargo cult", after all.)
Remove the awkward planarization. It had to be done because the AC3
encoder requires planar formats, but now we support them natively.
Try to simplify buffer management with mp_audio_buffer.
Improve checking for buffer overflows and out of bound writes. In
theory, these shouldn't happen due to AC3 fixed frame sizes, but being
paranoid is better.
The format negotiation is the same, except don't confusingly copy the
input format into af->data, just to overwrite it later. af->data should
alwass contain the output format, and the existing code was just a very
misguided use of the af_test_output() helper function.
Just set af->data to the output format immediately, and modify the input
format properly.
Also, if format negotiation fails (and needs another iteration), don't
initialize the libavcodec encoder.
Before this commit, the af_instance->mul/delay values were in bytes.
Using bytes is confusing for non-interleaved audio, so switch mul to
samples, and delay to seconds. For delay, seconds are more intuitive
than bytes or samples, because it's used for the latency calculation.
We also might want to replace the delay mechanism with real PTS
tracking inside the filter chain some time in the future, and PTS
will also require time-adjustments to be done in seconds.
For most filters, we just remove the redundant mul=1 initialization.
(Setting this used to be required, but not anymore.)
Since ao_openal simulates multi-channel audio by placing a bunch of
mono-sources in 3D space, non-interleaved audio is a perfect match for
it. We just have to remove the interleaving code.
ALSA supports non-interleaved audio natively using a separate API
function for writing audio. (Though you have to tell it about this on
initialization.) ALSA doesn't have separate sample formats for this,
so just pretend to negotiate the interleaved format, and assume that
all non-interleaved formats have an interleaved companion format.
Replace the code that used a single buffer with mp_audio_buffer. This
also enables non-interleaved output operation, although it's still
disabled, and no AO supports it yet.
Implementation wise, this could be much improved, such as using a
ringbuffer that doesn't require copying data all the time. This is
why we don't use mp_audio directly instead of mp_audio_buffer.
This comes with two internal AO API changes:
1. ao_driver.play now can take non-interleaved audio. For this purpose,
the data pointer is changed to void **data, where data[0] corresponds to
the pointer in the old API. Also, the len argument as well as the return
value are now in samples, not bytes. "Sample" in this context means the
unit of the smallest possible audio frame, i.e. sample_size * channels.
2. ao_driver.get_space now returns samples instead of bytes. (Similar to
the play function.)
Change all AOs to use the new API.
The AO API as exposed to the rest of the player still uses the old API.
It's emulated in ao.c. This is purely to split the commits changing all
AOs and the commits adding actual support for outputting N-I audio.
Allocate af_instance->data in generic code before filter initialization.
Every filter needs af->data (since it contains the output
configuration), so there's no reason why every filter should allocate
and free it.
Remove RESIZE_LOCAL_BUFFER(), and replace it with mp_audio_realloc_min().
Interestingly, most code becomes simpler, because the new function takes
the size in samples, and not in bytes. There are larger change in
af_scaletempo.c and af_lavcac3enc.c, because these had copied and
modified versions of the RESIZE_LOCAL_BUFFER macro/function.
No AO can handle these, so it would be a problem if they get added
later, and non-interleaved formats get accepted erroneously. Let them
gracefully fall back to other formats.
Most AOs actually would fall back, but to an unrelated formats. This is
covered by this commit too, and if possible they should pick the
interleaved variant if a non-interleaved format is requested.
Based on earlier work by Stefano Pigozzi.
There are 2 changes:
1. Instead of mp_audio.audio, mp_audio.planes[0] must be used.
2. mp_audio.len used to contain the size of the audio in bytes. Now
mp_audio.samples must be used. (Where 1 sample is the smallest unit
of audio that covers all channels.)
Also, some filters need changes to reject non-interleaved formats
properly.
Nothing uses the non-interleaved features yet, but this is needed so
that things don't just break when doing so.
This affects 64 bit floats and big endian integer PCM variants
(basically crap nobody uses). Possibly not all MS-muxed files work, but
I couldn't get or produce any samples.
Remove a bunch of format tags that are not needed anymore. Most of these
were used by demux_mov, which is long gone. Repurpose/abuse 'twos' as
mpv-internal tag for dealing with the PCM variants mentioned above.
Now to shift audio pts when outputting to e.g. avi, you need an explicit
facility to insert/remove initial samples, to avoid initial regions of
the video to be sped up/slowed down.
One such facility is the delay filter in libavfilter.