Until recently, ao_lavc and vo_lavc started encoding whenever the core
happened to send them data. Since audio and video are not initialized at
the same time, and the muxer was not necessarily opened when the first
encoder started to produce data, the resulting packets were put into a
queue. As soon as the muxer was opened, the queue was flushed.
Change this so that the core waits to send data until all encoders are
initialized. This has the advantage that we don't need to queue up the
packets.
The main change is that we delay opening the muxer ("writing headers")
until we have data from all streams. This fixes race conditions at init
due to broken assumptions in the old code.
This also changes a lot of other stuff. I found and fixed a few API
violations (often things for which better mechanisms were invented, and
the old ones are not valid anymore). I try to get away from the public
mutex and shared fields in encode_lavc_context. For now it's still
needed for some timestamp-related fields, but most are gone. It also
removes some bad code duplication between audio and video paths.
Print underruns as a warning.
Note that there may be cases where an underrun occurs without being a
bad condition. This could possibly happen e.g. if the last chunk is
written, and playback resumes some time after that. Eventually I want
to add more code to avoid such spurious warnings.
There is a dedicated thread for feeding audio to the ALSA API from a
buffer with a larger size, so there is little reason to also use such a
large device buffer.
One can now set the number of buffers and the buffer size.
This can reduce CPU usage, while the total latency stays mostly the same.
Since there are sync mechanisms, A/V sync stays intact and working.
It also modifies the 6.1 channel order, as per the OpenAL spec,
and adds AOPLAY_FINAL_CHUNK support.
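For context, a hedged sketch of what "number of buffers" and "buffer size"
mean at the OpenAL level (example values and names, not the actual ao_openal
code): the AO generates that many AL buffers and fills each with one chunk of
PCM before queuing it on the source.

#include <AL/al.h>

#define NUM_BUF    4        /* example value: number of queued AL buffers   */
#define CHUNK_SIZE 2048     /* example value: bytes filled into each buffer */

static ALuint buffers[NUM_BUF];

static void queue_initial_buffers(ALuint source, const char *pcm,
                                  ALenum format, ALsizei srate)
{
    alGenBuffers(NUM_BUF, buffers);
    for (int n = 0; n < NUM_BUF; n++) {
        /* fill each buffer with one chunk of PCM and queue it on the source */
        alBufferData(buffers[n], format, pcm + n * CHUNK_SIZE, CHUNK_SIZE, srate);
        alSourceQueueBuffers(source, 1, &buffers[n]);
    }
}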
OpenAL Soft's AL_SOFT_source_latency extension allows one to correctly
query the device output latency, facilitating synchronization with
video.
Also added a simpler generic fallback that does not take the device
latency into account.
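A hedged sketch of how the latency query works (the function pointer comes
from alGetProcAddress(), error handling is omitted, and this is not the
actual ao_openal code):

#include <AL/al.h>
#include <AL/alext.h>

static double get_output_latency(ALuint source, LPALGETSOURCEDVSOFT alGetSourcedvSOFT)
{
    ALdouble v[2];  /* v[0] = playback offset in seconds, v[1] = latency in seconds */
    alGetSourcedvSOFT(source, AL_SEC_OFFSET_LATENCY_SOFT, v);
    return v[1];
}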
Uses OpenAL Soft's AL_DIRECT_CHANNELS_SOFT extension, which can be
controlled through a new CLI option, --openal-direct-channels.
This allows one to send the audio data directly to the desired channel
without effects applied.
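The mechanism itself is just a per-source property; roughly (illustrative,
not the actual code):

#include <AL/al.h>
#include <AL/alext.h>

static void enable_direct_channels(ALuint source)
{
    /* route multichannel data straight to the matching output channels,
     * bypassing spatialization and effects */
    alSourcei(source, AL_DIRECT_CHANNELS_SOFT, AL_TRUE);
}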
Although half (non-fast track at the sink rate) or one third (non-fast
track not at the sink rate) of the buffer size of the created AudioTrack
instance is basically enough for dropout-free playback when used as the
SL Enqueue buffer size, only using the full size avoids stutter upon
(re)start of playback.
Here are the various buffer sizes for different track/sink rates with
Bluetooth audio on Android O:
aptX @ 48kHz:
  Sink rate: 48000 Hz
  44100 Hz:  10632 frames (241.09 ms)
  48000 Hz:  11544 frames (240.50 ms)
  88200 Hz:  21216 frames (240.54 ms)
  96000 Hz:  23088 frames (240.50 ms)
  176400 Hz: 42384 frames (240.27 ms)
  192000 Hz: 46128 frames (240.25 ms)
SBC/AAC/aptX @ 44.1kHz:
  Sink rate: 44100 Hz
  44100 Hz:  10776 frames (244.35 ms)
  48000 Hz:  11748 frames (244.75 ms)
  88200 Hz:  21552 frames (244.35 ms)
  96000 Hz:  23448 frames (244.25 ms)
  176400 Hz: 43056 frames (244.08 ms)
  192000 Hz: 46848 frames (244.00 ms)
The above results were produced with the following code:
import android.media.AudioAttributes;
import android.media.AudioFormat;
import android.media.AudioTrack;

class AudioInfo {
    public static void main(String[] args) {
        // 3 == AudioManager.STREAM_MUSIC
        int nosr = AudioTrack.getNativeOutputSampleRate(3);
        System.out.printf("Sink rate: %d Hz\n", nosr);
        int[] rates = {44100, 48000, 88200, 96000, 176400, 192000};
        for (int rate : rates) {
            // 256 == AudioAttributes.FLAG_LOW_LATENCY
            AudioAttributes aa = new AudioAttributes.Builder().setFlags(256).build();
            AudioFormat af = new AudioFormat.Builder().setSampleRate(rate).build();
            // 4-byte buffer request, mode 1 == AudioTrack.MODE_STREAM, session id 0
            AudioTrack at = new AudioTrack(aa, af, 4, 1, 0);
            int sr = at.getSampleRate();
            int bs = at.getBufferSizeInFrames();
            float ms = bs * (float) 1000 / sr;
            at.release();
            System.out.printf("%d Hz: %d frames (%.2f ms)\n", sr, bs, ms);
        }
    }
}
Therefore bumping the device buffer size to 250ms.
If you set desired.samples to 0, SDL will return a default buffer size
in obtained.samples. This was broken, because ceil_power_of_two(0)
returns 1. Since 0 is usually not considered a power of two, that is
probably correct behavior for the helper, but we still want to set
desired.samples to 0 in this case.
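A minimal sketch of the intended behavior (the variable name "request" is
illustrative; ceil_power_of_two() is the helper mentioned above):

SDL_AudioSpec desired = {0};
/* request == 0 means "let SDL pick its default buffer size" */
desired.samples = request ? ceil_power_of_two(request) : 0;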
You can use --audio-buffer=0 to minimize the audio buffer size. But if
the AO reports no device buffer size (as e.g. ao_jack does), then the
buffer size is actually 0, and playback can never work properly.
Make it fall back to a size of 1, which is still unlikely to work
properly, but at least you get what you asked for instead of a freeze.
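Roughly, with illustrative variable names (not the literal code):

/* soft_buffer can be 0 with --audio-buffer=0, and device_buffer can be 0 if
 * the AO doesn't report one; never let the total drop below 1 sample */
int buffer = device_buffer + soft_buffer;
if (buffer < 1)
    buffer = 1;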
While the soft buffer size already defaults to 200ms, that is not enough to guarantee dropout-free playback on Bluetooth audio. Bumping the device buffer size to the same value seems to suffice.
Always make the hw params dump function use MSGL_DEBUG, and remove the
MSGL_V use. That means you need -v -v to see them. The detailed
information is usually not very interesting, so this reduces the log
noise.
The af_get_best_sample_formats() function had an argument of
int[AF_FORMAT_COUNT], which is slightly incorrect, because the array is
0-terminated and should in theory have AF_FORMAT_COUNT+1 entries. It
won't actually write this many formats (since some formats are
fundamentally incompatible), but it still feels annoying and incorrect.
So fix it, and require that callers pass an AF_FORMAT_COUNT+1 array.
Note that the array size has no meaning in C function arguments (just
another issue with C static arrays being weird and stupid), so get rid
of it completely.
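A hedged sketch of prototype and call site after the change (parameter names
are illustrative):

/* out_formats is 0-terminated and may hold up to AF_FORMAT_COUNT formats,
 * so callers must provide AF_FORMAT_COUNT + 1 entries */
void af_get_best_sample_formats(int src_format, int *out_formats);

int formats[AF_FORMAT_COUNT + 1];
af_get_best_sample_formats(src_fmt, formats);
for (int n = 0; formats[n]; n++) {
    /* try formats[n] */
}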
Not changing the af_lavcac3enc use, since that is rewritten in another
branch anyway.
This commit eliminates the following clang warning:
warning: macro expansion producing 'defined' has undefined behavior [-Wexpansion-to-defined]
Going by the clang commit message, this seems to be explicitly specified
as UB by the standard, and they added this warning because MSVC
apparently results in different behavior. Whatever, we can just avoid
the warning with some small changes.
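For illustration (not the actual mpv code), the problematic pattern is a
macro whose expansion produces 'defined':

#define HAVE_FOO_OR_BAR (defined(HAVE_FOO) || defined(HAVE_BAR))

#if HAVE_FOO_OR_BAR     /* 'defined' appears via macro expansion: UB per the standard */
/* ... */
#endif

One way to avoid the warning is to make such macros expand to a plain 0/1
value instead.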
stdatomic.h defines no atomic_float typedef. We can't just use _Atomic
unconditionally, because we support compilers without C11 atomics. So
just create a custom atomic_float typedef in the wrapper, which uses
_Atomic in the C11 code path.
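Roughly (a sketch; the config macro name is illustrative):

#if HAVE_STDATOMIC              /* C11 code path */
#include <stdatomic.h>
typedef _Atomic float atomic_float;
#else
/* legacy wrapper: define atomic_float via whatever emulation is in use */
#endif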
This does what af_volume used to do. Since we couldn't relicense it,
just rewrite it. Since we don't have a new filter mechanism yet, and
libavfilter is too inconvenient, apply the volume gain in ao.c directly.
This is done before handing the audio data to the driver.
Since push.c runs a separate thread, and pull.c is called asynchronously
from the audio driver's thread, the volume value needs to be
synchronized. There's no existing central mutex, so do some shit with
atomics. Since there's no atomic_float type predefined (which is at
least needed when using the legacy wrapper), do some nonsense about
reinterpret casting the float value to an int for the purpose of atomic
access. Not sure if using memcpy() is undefined behavior, but for now I
don't care.
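A sketch of the trick (illustrative names, not the exact code): the float's
bit pattern is copied into an int, and the int is what actually gets stored
and loaded atomically. This assumes sizeof(float) == sizeof(int).

#include <stdatomic.h>
#include <string.h>

static atomic_int gain_bits;    /* holds the bit pattern of a float */

static void set_gain(float g)
{
    int b;
    memcpy(&b, &g, sizeof(b));
    atomic_store(&gain_bits, b);
}

static float get_gain(void)
{
    int b = atomic_load(&gain_bits);
    float g;
    memcpy(&g, &b, sizeof(g));
    return g;
}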
The advantage of not using a filter is lower complexity (no filter auto
insertion), and lower latency (gain processing is done after our
internal audio buffer of at least 200ms).
Disadvantages include the inability to use native volume control
_before_ other filters with custom filter chains, and the need to add
new processing for each new sample type.
Since this doesn't reuse any of the old GPL code, nor does it indirectly
rely on it, volume and replaygain handling now works in LGPL mode.
How to process the gain is inspired by libavfilter's af_volume (LGPL).
In particular, we use exactly the same rounding, and we quantize
processing for integer sample types by 256 steps. Some of libavfilter's
copyright may or may not apply, but I think not, and it's the same
license anyway.
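For s16 samples the processing is roughly the following (a sketch following
af_volume's approach, not the literal mpv code):

#include <math.h>
#include <stdint.h>

static int16_t apply_gain_s16(int16_t sample, float gain)
{
    int vol = lrintf(gain * 256.0f);    /* gain quantized to 1/256 steps */
    int v = (sample * vol + 128) >> 8;  /* fixed-point multiply with rounding */
    if (v > INT16_MAX) v = INT16_MAX;   /* clip to the sample range */
    if (v < INT16_MIN) v = INT16_MIN;
    return (int16_t)v;
}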
Looks like this is covered by LGPL relicensing agreements now.
Notes about contributors who could not be reached or who didn't agree:
Commit 7fccb6486e has tons of mp_msg changes that look like they are not
copyrightable (even if they were, all mp_msg calls were rewritten in
mpv times again). The additional play() change looks suspicious, but
the function was rewritten several times anyway (first time after that
commit in 4f40ec312).
Commit 89ed1748ae was rewritten in commit 325311af3 and then again
several times after that. Basically all this code is unnecessary in
modern mpv and has been removed.
No code survived from the following commits: 4d31c3c53, 61ecf838f2,
d38968bd, 4deb67c3f. At least two cosmetic typo fixes are also not
considered.
Commit 22bb046ad is reverted (this wasn't a valid warning anyway, just
a C++-ism icc applied to C). Using the constants is nicer, but at least
I don't have to decide whether that change was copyrightable.
I _think_ this confuses Coverity and it thinks there is uninitialized
data to be read. Initialize the array to change/remove the warning, or
if there's a real problem, to make it easier to detect. (Basically apply
defensive coding.)
This should actually cover all of them, if you take into account that
some unchanged GPL source files include header files with such checks.
Also this was done already for the libaf derived code.
This is only for "safety" and to avoid misunderstandings.
Just reimplement it in some way, as mp_audio is GPL-only.
Actually I wanted to get rid of audio_buffer.c completely (and instead
have a list of mp_aframes), but to do so would require rewriting some
more player core audio code. So to get this LGPL relicensing over
quickly, just do some extra work.
Completely untested (rsound dev libs unavailable on my system). Trivial
enough that it's very likely that it'll just work. No port selection,
but could be added by parsing it as part of the device name.
Should fix #4714.
This is pretty pointless, but I believe it allows us to claim that the
new code is not affected by the copyright of the old code. This is
needed, because the original mp_audio struct was written by someone who
has disagreed with LGPL relicensing (it was called af_data at the time,
and was defined in af.h).
The "GPL'ed" struct contents that surive are pretty trivial: just the
data pointer, and some metadata like the format, samplerate, etc. - but
at least in this case, any new code would be extremely similar anyway,
and I'm not really sure whether it's OK to claim different copyright. So
what we do is we just use AVFrame (which of course is LGPL with 100%
certainty), and add some accessors around it to adapt it to mpv
conventions.
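The rough shape of the wrapper (names are illustrative; the real accessors
differ):

#include <libavutil/frame.h>

/* an opaque audio frame type backed by an AVFrame, with mpv-style accessors */
struct mp_aframe {
    AVFrame *av_frame;
    /* extra mpv-specific metadata (e.g. the mpv sample format) lives here */
};

static int mp_aframe_get_rate(struct mp_aframe *frame)
{
    return frame->av_frame->sample_rate;
}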
Also, this gets rid of some annoying conventions of mp_audio, like the
struct fields that require using an accessor to write to them anyway.
For the most part, this change is only dumb replacements of mp_audio
related functions and fields. One minor actual change is that you can't
allocate the new type on the stack anymore.
Some code still uses mp_audio. All audio filter code will be deleted, so
it makes no sense to convert this code. (Audio filters which are LGPL
and which we keep will have to be ported to a new filter infrastructure
anyway.) player/audio.c uses it because it interacts with the old filter
code. push.c has some complex use of mp_audio and mp_audio_buffer, but
this and pull.c will most likely be rewritten to do something else.
Any bad HRESULTs should have been printed already, and lots of failure
modes don't have an HRESULT, leading to awkward hr = E_FAIL business.
This also checks the return value of GetBufferSize in the align hack. A
final fatal message is added if either of the retry hacks fails.
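The added check is roughly this (helper and field names are mpv-style but
illustrative, not guaranteed to match the actual code):

hr = IAudioClient_GetBufferSize(state->pAudioClient, &state->bufferFrameCount);
if (FAILED(hr)) {
    MP_FATAL(ao, "Error getting buffer size: %s\n", mp_HRESULT_to_str(hr));
    return hr;
}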