This commit tries to prepare for better underrun reporting. The goal is
to report underruns relatively immediately. Until now, this happened
only when play() was called. Change this, and abuse the fact that
get_delay() is called "relatively often" - in practice, this reports the
underrun immediately.
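For illustration, a minimal sketch of the kind of check this enables (raw ALSA API, not mpv's actual code; "pcm" and the sample rate are assumed to come from the AO state):
#include <stdio.h>
#include <alsa/asoundlib.h>
static double get_delay_with_underrun_check(snd_pcm_t *pcm, int samplerate)
{
    // Report an underrun right away instead of waiting for the next play().
    if (snd_pcm_state(pcm) == SND_PCM_STATE_XRUN) {
        fprintf(stderr, "warning: audio device underrun detected\n");
        return 0;
    }
    snd_pcm_sframes_t frames = 0;
    if (snd_pcm_delay(pcm, &frames) < 0 || frames < 0)
        return 0;
    return (double)frames / samplerate;
}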
Background:
In commit 81e51a15f7 (and also e38b0b245e), we were quite confused
about ALSA underrun handling. The commit message showed uncertainty
about how case 3 happened, but it's blindingly obvious and simple.
Actually reading the code shows that ALSA does not have a concept of a
"final chunk" (or we don't use it). It's obvious we never pass the
AOPLAY_FINAL_CHUNK flag along to the ALSA API in any way. The only thing
we do is simply write a partial fragment. Of course this will cause an
underrun. Doing a partial write saves us the trouble of padding the last
frame with silence, or so.
The main reason why the underrun message was avoided was that play() was
never called with a non-0 sample count again (except if reset() was
called before that). That was OK; at least the goal of avoiding the
unwanted message was reached. (And the original "bogus" message at the
end of playback was perfectly correct, as far as ALSA goes.)
If the network stalls, play() will be called again only once new data
is available. Obviously, this could take a long time, so reporting the
underrun only at that point is too late.
It turns out that case 2) mentioned in the previous commit happened
quite often when playback ended normally.
There is probably a legitimate underrun with normal buffer sizes (100
ms, 4 fragments, gapless audio in "weak" mode). This is a result of the
player waiting for video to end, and/or the time needed to kill the
video window. The former case means that it depends on your test case
whether it happens (a file where video ends slightly before audio is
less likely to trigger it).
This in turn is due to how gapless playback works. Avoiding a "gap"
requires queuing the audio of the next file without playing a partial
chunk (as AOPLAY_FINAL_CHUNK would do). The partial chunk is then played
as part of the first chunk played from the next file. But if the player
detects "later" that there is no next file, it still needs to get rid of
the last fragment with AOPLAY_FINAL_CHUNK. At this point it's too late,
and an underrun may have actually happened. Since the player uninits and
reinits the entire playback engine for the next file in a "serial"
manner, it cannot know in advance whether this will work.
This is the reason why the idiot who added the underrun exception for
the last chunk in play() was wrong (I wrote that btw., before you accuse
me of being rude). Yes, it's a real underrun, and you could probably
hear it.
This XRUN (aka underrun) message was printed in the following
situations:
1) legitimate underrun during playback
2) legitimate underrun when playing final chunk
3) bogus underrun when playing final chunk
The old underrun check (in play()) fires in cases 1) and 2) as well,
but not in 3). It appears 3) is indeed something that happens, although
it's not known for sure. It's still pretty annoying, so remove the new
XRUN message.
When testing, care should be taken to play with buffer sizes, video
versus no video, and gapless enabled/disabled. Also, suspending the
player with Ctrl+Z in the terminal (SIGSTOP) and then resuming is a good
way to trigger a "normal" underrun.
ioctl(..., SNDCTL_DSP_CHANNELS, &nchannels) for an unsupported
nchannels does not return an error, and instead sets nchannels to
the default value.
Instead of failing with no audio, fall back to stereo.
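A rough sketch of the workaround (not the actual ao_oss code; "fd" is assumed to be the opened OSS device, error handling trimmed):
#include <sys/ioctl.h>
#include <sys/soundcard.h>
static int set_channels_or_stereo(int fd, int wanted)
{
    int nchannels = wanted;
    // The ioctl may "succeed" and silently rewrite nchannels, so compare
    // the result instead of trusting the return value alone.
    if (ioctl(fd, SNDCTL_DSP_CHANNELS, &nchannels) < 0)
        return -1;
    if (nchannels != wanted) {
        nchannels = 2; // retry with plain stereo instead of failing with no audio
        if (ioctl(fd, SNDCTL_DSP_CHANNELS, &nchannels) < 0 || nchannels != 2)
            return -1;
    }
    return nchannels;
}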
This flag makes mpv continue using the PulseAudio driver even if the
sink is suspended.
This can be useful if JACK is running with PulseAudio in bridge mode and
the sink-input assigned to mpv is the one JACK controls, thus being
suspended.
By forcing mpv to still use PulseAudio in this case, the user can now
adjust the sink to an unsuspended one.
According to the ALSA docs, EPIPE is a synonym for SND_PCM_STATE_XRUN,
and that is a state we should attempt to recover from automatically.
In case recovery fails, log an error and return zero.
A warning message will still be output for each XRUN, since those
are not something we should generally be receiving.
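The recovery pattern looks roughly like this (a sketch, not the exact mpv code; "pcm" is the opened handle):
#include <errno.h>
#include <stdio.h>
#include <alsa/asoundlib.h>
static int write_with_recovery(snd_pcm_t *pcm, const void *data,
                               snd_pcm_uframes_t frames)
{
    snd_pcm_sframes_t res = snd_pcm_writei(pcm, data, frames);
    if (res == -EPIPE) {
        fprintf(stderr, "warning: ALSA XRUN\n");
        // EPIPE corresponds to the XRUN state; try to recover automatically.
        if (snd_pcm_recover(pcm, res, 1) < 0) {
            fprintf(stderr, "error: XRUN recovery failed\n");
            return 0;
        }
        return 0; // nothing written; the caller retries with the same data
    }
    return res < 0 ? 0 : (int)res;
}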
This has been a long time coming, and it took me way too long to notice
that a whole lot of ao_alsa functions do an early return if the AO is
paused.
For the STATE_SETUP case, I reproduced this once, and never since.
Still, it seems we can start calling this function before the ALSA
device has been fully initialized, so we might as well exit early in
that case.
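The guard amounts to something like this (illustrative names, not mpv's actual fields):
#include <stdbool.h>
#include <alsa/asoundlib.h>
struct ao_state {
    snd_pcm_t *pcm;
    bool paused;
};
static int get_space_guarded(struct ao_state *st)
{
    // Bail out early if the AO is paused or the device hasn't left the
    // SETUP state yet; there is nothing sensible to report.
    if (st->paused || snd_pcm_state(st->pcm) == SND_PCM_STATE_SETUP)
        return 0;
    snd_pcm_sframes_t avail = snd_pcm_avail(st->pcm);
    return avail < 0 ? 0 : (int)avail;
}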
ao->device_buffer will only affect the enqueue size if the latter
is not specified. In other words, its intended purpose will solely
be setting/guarding the soft buffer size.
This guarantees that the soft buffer size will be consistent no
matter whether a specific enqueue size is set or not. (In the past it
would drop to the default of the generic audio-buffer option.)
opensles-frames-per-buffer has been renamed to
opensles-frames-per-enqueue, as it was never intended to set the soft
buffer size. It will only make sure the size is never smaller than
itself, just as before.
opensles-buffer-size-in-ms is introduced to allow easy tuning of
the relative (i.e. in time) soft buffer size (and enqueue size,
unless the aforementioned option is set). As "device buffer" never
really made sense in this AO, this option OVERRIDES audio-buffer
whenever its value (including the default) is larger than 0.
Setting opensles-buffer-size-in-ms to 1 conveniently equates the soft
buffer size with the absolute enqueue size set with
opensles-frames-per-enqueue (unless that is less than 1 ms).
When both are set to 0, audio-buffer will be the ultimate fallback.
If audio-buffer is also 0, the AO errors out.
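The resulting precedence can be summarized with a sketch like this (variable names are made up, not the actual ao_opensles fields):
// Returns the soft buffer size in frames, or -1 if nothing usable is set.
static int pick_buffer_frames(int samplerate,
                              int buffer_size_in_ms,    // opensles-buffer-size-in-ms
                              int frames_per_enqueue,   // opensles-frames-per-enqueue
                              double audio_buffer_secs) // generic audio-buffer
{
    double secs = 0;
    if (buffer_size_in_ms > 0)
        secs = buffer_size_in_ms / 1000.0; // overrides audio-buffer
    else if (audio_buffer_secs > 0)
        secs = audio_buffer_secs;          // ultimate fallback
    else
        return -1;                         // AO errors out
    int frames = (int)(secs * samplerate);
    // frames-per-enqueue only acts as a lower bound, as before.
    if (frames_per_enqueue > frames)
        frames = frames_per_enqueue;
    return frames;
}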
Fixes a bug with alsa dmix on Fedora 29. After several minutes,
audio suddenly becomes bad and muted.
Actually, I don't know what causes this. Probably this is a bug in alsa.
In any case, since snd_pcm_status() returns not only 'avail' but also
other fields such as tstamp, htstamp, etc., this can be considered a
good simplification, as only 'avail' is required for this function.
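In ALSA terms the simplification boils down to this (a sketch, not the literal diff):
#include <alsa/asoundlib.h>
static int get_space(snd_pcm_t *pcm)
{
    // Ask only for the available frame count instead of filling a whole
    // snd_pcm_status_t just to read its 'avail' field.
    snd_pcm_sframes_t avail = snd_pcm_avail(pcm);
    return avail < 0 ? 0 : (int)avail;
}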
Until recently, ao_lavc and vo_lavc started encoding whenever the core
happened to send them data. Since audio and video are not initialized at
the same time, and the muxer was not necessarily opened when the first
encoder started to produce data, the resulting packets were put into a
queue. As soon as the muxer was opened, the queue was flushed.
Change this to make the core wait with sending data until all encoders
are initialized. This has the advantage that we don't need to queue up
the packets.
The main change is that we wait with opening the muxer ("writing
headers") until we have data from all streams. This fixes race
conditions at init due to broken assumptions in the old code.
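In libavformat terms, the idea is roughly the following (an illustrative sketch, not mpv's encode_lavc code):
#include <stdbool.h>
#include <libavformat/avformat.h>
struct mux {
    AVFormatContext *fmt;
    bool header_written;
    bool *stream_has_data; // one flag per stream
};
static bool ready_to_write_header(struct mux *m)
{
    for (unsigned i = 0; i < m->fmt->nb_streams; i++) {
        if (!m->stream_has_data[i])
            return false;
    }
    return true;
}
static int mux_packet(struct mux *m, AVPacket *pkt)
{
    m->stream_has_data[pkt->stream_index] = true;
    if (!m->header_written) {
        if (!ready_to_write_header(m))
            return AVERROR(EAGAIN); // caller holds on to the packet
        if (avformat_write_header(m->fmt, NULL) < 0)
            return -1;
        m->header_written = true;
    }
    return av_interleaved_write_frame(m->fmt, pkt);
}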
This also changes a lot of other stuff. I found and fixed a few API
violations (often things for which better mechanisms were invented, and
the old ones are not valid anymore). I try to get away from the public
mutex and shared fields in encode_lavc_context. For now it's still
needed for some timestamp-related fields, but most are gone. It also
removes some bad code duplication between audio and video paths.
Print underruns as a warning.
Note that there may be some cases where an underrun occurs without
being a bad condition. This could possibly happen e.g. if the last
chunk is written, and playback then resumes some time after that.
Eventually I want to add more code to avoid such spurious warnings.
There is a dedicated thread for feeding audio to the ALSA API from a
buffer with a larger size. There is little reason to have such a large
device buffer.
One can now set the number of buffers and the buffer size.
This can reduce CPU usage while the total latency stays mostly the same.
Since there are sync mechanisms, A/V sync remains intact and working.
It also modifies the 6.1 channel order, as per the OpenAL spec,
and adds AOPLAY_FINAL_CHUNK support.
OpenAL Soft's AL_SOFT_source_latency extension allows one to correctly
get the device output latency, facilitating synchronization with
video.
Also added a simpler generic fallback that does not take the latency
of the device into account.
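Querying it looks roughly like this (a sketch; the function pointer must be loaded through alGetProcAddress, and "source" is assumed to be a valid source with buffers queued):
#include <AL/al.h>
#include <AL/alext.h>
static double get_output_latency(ALuint source)
{
    static LPALGETSOURCEDVSOFT alGetSourcedvSOFT;
    if (!alGetSourcedvSOFT)
        alGetSourcedvSOFT =
            (LPALGETSOURCEDVSOFT)alGetProcAddress("alGetSourcedvSOFT");
    ALdouble v[2]; // v[0] = playback offset (s), v[1] = device latency (s)
    alGetSourcedvSOFT(source, AL_SEC_OFFSET_LATENCY_SOFT, v);
    return v[1];
}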
Uses OpenAL Soft's AL_DIRECT_CHANNELS_SOFT extension and can be controlled through
a new CLI option, --openal-direct-channels.
This allows one to send the audio data directly to the desired channel
without effects applied.
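What the option toggles is essentially this (a sketch):
#include <stdbool.h>
#include <AL/al.h>
#include <AL/alext.h>
static void set_direct_channels(ALuint source, bool enable)
{
    // Bypass OpenAL's virtualization/effects and route channels directly,
    // if the extension is available.
    if (alIsExtensionPresent("AL_SOFT_direct_channels"))
        alSourcei(source, AL_DIRECT_CHANNELS_SOFT, enable ? AL_TRUE : AL_FALSE);
}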
Although using half (non-fast track at the sink rate) or one third (non-fast track not at the sink rate) of the buffer size of the created AudioTrack instance as the SL enqueue buffer size is basically enough for dropout-free playback, only using the full size avoids stutter upon (re)start of playback.
Here are the various buffer sizes at different track/sink rates, on Bluetooth audio on Android O:
aptX @ 48kHz:
Sink rate: 48000 Hz
44100 Hz: 10632 frames (241.09 ms)
48000 Hz: 11544 frames (240.50 ms)
88200 Hz: 21216 frames (240.54 ms)
96000 Hz: 23088 frames (240.50 ms)
176400 Hz: 42384 frames (240.27 ms)
192000 Hz: 46128 frames (240.25 ms)
SBC/AAC/aptX @ 44.1kHz:
Sink rate: 44100 Hz
44100 Hz: 10776 frames (244.35 ms)
48000 Hz: 11748 frames (244.75 ms)
88200 Hz: 21552 frames (244.35 ms)
96000 Hz: 23448 frames (244.25 ms)
176400 Hz: 43056 frames (244.08 ms)
192000 Hz: 46848 frames (244.00 ms)
The above results were produced with the following code:
import android.media.AudioAttributes;
import android.media.AudioFormat;
import android.media.AudioTrack;

class AudioInfo {
    public static void main(String[] args) {
        // 3 == AudioManager.STREAM_MUSIC
        int nosr = AudioTrack.getNativeOutputSampleRate(3);
        System.out.printf("Sink rate: %d Hz\n", nosr);
        int[] rates = {44100, 48000, 88200, 96000, 176400, 192000};
        for (int rate : rates) {
            // 256 == AudioAttributes.FLAG_LOW_LATENCY
            AudioAttributes aa = new AudioAttributes.Builder().setFlags(256).build();
            AudioFormat af = new AudioFormat.Builder().setSampleRate(rate).build();
            // minimal buffer request; 1 == AudioTrack.MODE_STREAM, session id 0
            AudioTrack at = new AudioTrack(aa, af, 4, 1, 0);
            int sr = at.getSampleRate();
            int bs = at.getBufferSizeInFrames();
            float ms = bs * (float) 1000 / sr;
            at.release();
            System.out.printf("%d Hz: %d frames (%.2f ms)\n", sr, bs, ms);
        }
    }
}
Therefore bumping the device buffer size to 250ms.
If you set desired.samples to 0, SDL will return a default buffer size
on obtained.samples. This was broken, because ceil_power_of_two(0)
returns 1. Since 0 is usually not considered a power of two, this is
probably correct, but we still want to set desired.samples to 0 in this
case.
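The fix amounts to something like this (an illustrative sketch, not the literal ao_sdl code):
#include <SDL.h>
static void fill_desired_samples(SDL_AudioSpec *desired, int requested_frames)
{
    // Keep 0 so SDL picks its own default buffer size; only round up to a
    // power of two when a size was actually requested.
    desired->samples = 0;
    if (requested_frames > 0) {
        Uint16 samples = 1;
        while (samples < requested_frames && samples < 32768)
            samples *= 2;
        desired->samples = samples;
    }
}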
You can use --audio-buffer=0 to minimize the audio buffer size. But if
the AO reports no device buffer size (like e.g. ao_jack does), then the
buffer size is actually 0, and playback can never work properly.
Make it fall back to a size of 1, which is unlikely to work properly,
but at least you get what you asked for, instead of a freeze.
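The guard is trivial (illustrative, not the exact code):
static int effective_buffer_samples(int computed_samples)
{
    // Never let the soft buffer size drop to 0, even if 1 sample will not
    // actually play properly.
    return computed_samples < 1 ? 1 : computed_samples;
}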
While the soft buffer size is already 200 ms by default, that is not enough to guarantee dropout-free playback on Bluetooth audio. Bumping the device buffer size to the same value seems to suffice.