ALSA API reports -EPIPE even when the the device is lost, which we
currently always assume to be an XRUN. If we assumed XRUN 10 times and
didn't manage to recover, just consider the device lost and try to
reconnect. Allows ao_alsa to recover from alsa server being killed then
reinitialized. And even in the worst case, this should be better than
the status quo of mpv attempting to prepare a PCM device indefinitely
until the user restarts mpv.
This is admittedly not ideal, and I don't think the -EPIPE hack is
necessary anymore, but I can only test on my setup and removing the
'assume -EPIPE is an XRUN' hack might break some setups for whatever
mysterious reasons.
During AO init, snd_pcm_open() is called, which calls snd_config_update()
to allocate a global config node and stores it in the snd_config global
variable. This is never freed on uninit.
Fix this by freeing the global config node on uninit.
c784820454 introduced a bool option type
as a replacement for the flag type, but didn't actually transition and
remove the flag type because it would have been too much mundane work.
ao-volume is represented in the code with a `struct ao_control_vol_t`
which contains volumes for two channels, left and right.
However the code implementing this property in command.c never treats
these values individually. They are always averaged together.
On the other hand the code in the AOs handling these values also has to
handle the case where *not* exactly two channels are handled.
So let's remove the `struct ao_control_vol_t` and replace it with a
simple float.
This makes the semantics clear to AO authors and allows us to drop some code from the AOs and command.c.
In debug mode the macro causes an assertion failure.
In release mode it works differently and tells the compiler that it can
assume the codepath will never execute. For this reason I was conversative
in replacing it, e.g. in mpv-internal code that exhausts all valid values
of an enum or when a condition is clear from directly preceding code.
Set pcm state to SND_PCM_STATE_XRUN in case -EPIPE is received,
and handle this state as per the usual logic.
This way snd_pcm_prepare gets called, and the loop continued.
Inspired by a patch posted by malc_ on #mpv.
It is now the AO's responsibility to handle period size alignment. The
ao->period_size alignment field is unused as of the recent audio
refactor commit. Remove it.
It turns out that ao_alsa shows extremely inefficient behavior as a
consequence of the removal of period size aligned writes in the
mentioned refactor commit. This is because it could get into a state
where it repeatedly wrote single samples (as small as 1 sample), and
starved the rest of the player as a consequence. Too bad. Explicitly
align the size in ao_alsa. Other AOs, which need this, should do the
same.
One reason why it broke so badly with ao_alsa was that it retried the
write() even if all reported space could be written. So stop doing that
too. Retry the write only if we somehow wrote less.
I'm not sure about ao_pulse.
Instead of the relatively subtle underflow handling, simply signal
whether the stream is in a playing state. Should make it more robust.
Should affect ao_alsa and ao_pulse only (and ao_openal, but it's
broken).
For ao_pulse, I'm just guessing. How the hell do you query whether a
stream is playing? Who knows. Seems to work, judging from very
superficial testing.
This affects "pull" AOs only: ao_alsa, ao_pulse, ao_openal, ao_pcm,
ao_lavc. There are changes to the other AOs too, but that's only about
renaming ao_driver.resume to ao_driver.start.
ao_openal is broken because I didn't manage to fix it, so it exits with
an error message. If you want it, why don't _you_ put effort into it? I
see no reason to waste my own precious lifetime over this (I realize the
irony).
ao_alsa loses the poll() mechanism, but it was mostly broken and didn't
really do what it was supposed to. There doesn't seem to be anything in
the ALSA API to watch the playback status without polling (unless you
want to use raw UNIX signals).
No idea if ao_pulse is correct, or whether it's subtly broken now. There
is no documentation, so I can't tell what is correct, without reverse
engineering the whole project. I recommend using ALSA.
This was supposed to be just a simple fix, but somehow it expanded scope
like a train wreck. Very high chance of regressions, but probably only
for the AOs listed above. The rest you can figure out from reading the
diff.
The recent change to the common code removed all calls to ->drain. It's
currently emulated via a timed sleep and polling ao_eof_reached(). That
is actually fallback code for AOs which lacked draining. I could just
readd the drain call, but it was a bad idea anyway. My plan to handle
this better is to require the AO to signal a underrun, even if
AOPLAY_FINAL_CHUNK is not set. Also reinstate not possibly waiting for
ao_lavc.c. ao_pcm.c did not have anything to handle this; whatever.
Change all OPT_* macros such that they don't define the entire m_option
initializer, and instead expand only to a part of it, which sets certain
fields. This requires changing almost every option declaration, because
they all use these macros. A declaration now always starts with
{"name", ...
followed by designated initializers only (possibly wrapped in macros).
The OPT_* macros now initialize the .offset and .type fields only,
sometimes also .priv and others.
I think this change makes the option macros less tricky. The old code
had to stuff everything into macro arguments (and attempted to allow
setting arbitrary fields by letting the user pass designated
initializers in the vararg parts). Some of this was made messy due to
C99 and C11 not allowing 0-sized varargs with ',' removal. It's also
possible that this change is pointless, other than cosmetic preferences.
Not too happy about some things. For example, the OPT_CHOICE()
indentation I applied looks a bit ugly.
Much of this change was done with regex search&replace, but some places
required manual editing. In particular, code in "obscure" areas (which I
didn't include in compilation) might be broken now.
In wayland_common.c the author of some option declarations confused the
flags parameter with the default value (though the default value was
also properly set below). I fixed this with this change.
This commit tries to prepare for better underrun reporting. The goal is
to report underruns relatively immediately. Until now, this happened
only when play() was called. Change this, and abuse that get_delay() is
called "relatively often" - this reports the underrun immediately in
practice.
Background:
In commit 81e51a15f7 (and also e38b0b245e), we were quite confused
about ALSA underrun handling. The commit message showed uncertainty how
case 3 happened, but it's blindingly obvious and simple.
Actually reading the code shows that ALSA does not have a concept of a
"final chunk" (or we don't use it). It's obvious we never pass the
AOPLAY_FINAL_CHUNK flag along to the ALSA API in any way. The only thing
we do is simply writing a partial fragment. Of course this will cause an
underrun. Doing a partial write saves us the trouble to pad the last
frame with silence, or so.
The main reason why the underrun message was avoided was that play() was
never called with a non-0 sample count again (except if reset() was
called before that). That was OK, at least the goal of avoiding the
unwanted message was reached. (And the original "bogus" message at end
of playback was perfectly correct, as far as ALSA goes.)
If network stalls, play() will called again only once new data is
available. Obviously, this could take a long time, thus it's too late.
It turns out that case 2) mentioned in the previous commit happened
quite often when playback ended normally.
There is probably a legitimate underrun with normal buffer sizes (100
ms, 4 fragments, gapless audio in "weak" mode). This is a result of the
player waiting for video to end, and/or the time needed to kill the
video window. The former case means that it depends on your test case
whether it happens (a file where video ends slightly before audio is
less likely to trigger it).
This in turn is due to how gapless playback works. Achieving not having
a "gap" requires queuing the audio of the next file without playing a
partial chunk (as AOPLAY_FINAL_CHUNK would do). The partial chunk is
then played as part of the first chunk played from the next file. But if
it detects "later" that there is no next file, it still needs to get rid
of the last fragment with AOPLAY_FINAL_CHUNK. At this point it's too
late, and an underrun may have actually happened. The way the player
uninits and reinits the entire playback engine for the next file in a
"serial" manner means it cannot know in advance whether this works.
This is the reason why the idiot who added the underrun exception for
the last chunk in play() was wrong (I wrote that btw., before you accuse
me of being rude). Yes, it's a real underrun, and you could probably
hear it.
This XRUN (aka underrun) message was printed in the following
situations:
1) legitimate underrun during playback
2) legitimate underrun when playing final chunk
3) bogus underrun when playing final chunk
The old underrun case (in play()) happens in cases 1) and 2) as well,
but 3) did not happen. It appears 3) is indeed something that happens,
although it's not known for sure. It's still pretty annoying, so remove
the new XRUN message.
When testing, care should be taken to play with buffer sizes, video
versus no video, and gapless enabled/disabled. Also, suspending the
player with Ctrl+Z in the terminal (SIGSTOP) and then resuming is a good
way to trigger a "normal" underrun.
According to ALSA doxy, EPIPE is a synonym to SND_PCM_STATE_XRUN,
and that is a state that we should attempt to automatically recover
from. In case recovery fails, log an error and return zero.
A warning message will still be output for each XRUN since those
are not something we should generally be receiving.
This has been way too long coming, and for me to notice that a
whole lot of ao_alsa functions do an early return if the AO is
paused.
For the STATE_SETUP case, I had this reproduced once, and never
since. Still, seems like we can start calling this function before
the ALSA device has been fully initialized so we might as well
early exit in that case.
Fixes a bug with alsa dmix on Fedora 29. After several minutes,
audio suddenly becomes bad and muted.
Actually, I don't know what causes this. Probably this is a bug in alsa.
In any case, as snd_pcm_status() returns not only 'avail', but also other
fields such as tstamp, htstamp, etc, this could be considered a good
simplification, as only avail is required for this function.
Print them as a warning.
Note that there may be some cases where it underruns, without being a
bad condition. This could possibly happen e.g. if the last chunk is
written, and then it resumes playback some time after that. Eventually I
want to add more code to avoid such spurious warnings.
There is a dedicated thread for feeding audio to the ALSA API from a
buffer with a larger size. There is little reason to have such a large
device buffer.
Always make the hw params dump function use MSGL_DEBUG, and remove the
MSGL_V use. That means you need -v -v to see them. The detailed
information is usually not very interesting, so this reduces the log
noise.
The af_get_best_sample_formats() function had an argument of
int[AF_FORMAT_COUNT], which is slightly incorrect, because it's 0
terminated and should in theory have AF_FORMAT_COUNT+1 entries. It won't
actually write this many formats (since some formats are fundamentally
incompatible), but it still feels annoying and incorrect. So fix it, and
require that callers pass an AF_FORMAT_COUNT+1 array.
Note that the array size has no meaning in C function arguments (just
another issue with C static arrays being weird and stupid), so get rid
of it completely.
Not changing the af_lavcac3enc use, since that is rewritten in another
branch anyway.
This commit eliminates the following clang warning:
warning: macro expansion producing 'defined' has undefined behavior [-Wexpansion-to-defined]
Going by the clang commit message, this seems to be explicitly specified
as UB by the standard, and they added this warning because MSVC
apparently results in different behavior. Whatever, we can just avoid
the warning with some small changes.
Looks like this is covered by LGPL relicensing agreements now.
Notes about contributors who could not be reached or who didn't agree:
Commit 7fccb6486e has tons of mp_msg changes look like they are not
copyrightable (even if they were, all mp_msg calls were rewritten in
mpv times again). The additional play() change looks suspicious, but
the function was rewritten several times anyway (first time after that
commit in 4f40ec312).
Commit 89ed1748ae was rewritten in commit 325311af3 and then again
several times after that. Basically all this code is unnecessary in
modern mpv and has been removed.
No code survived from the following commits: 4d31c3c53, 61ecf838f2,
d38968bd, 4deb67c3f. At least two cosmetic typo fixes are not
considered as well.
Commit 22bb046ad is reverted (this wasn't a valid warning anyway, just
a C++-ism icc applied to C). Using the constants is nicer, but at least
I don't have to decide whether that change was copyrightable.
This should actually cover all of them, if you take into account that
some unchanged GPL source files include header files with such checks.
Also this was done already for the libaf derived code.
This is only for "safety" and to avoid misunderstandings.
Instead of the infrastructure added in the previous commit to do the
conversion within the AO.
If this is used, and snd_pcm_status_get_avail() returns more frames than
snd_pcm_write*() actually accepts, you will get some nice audio
corruption.
Also, this mutates the data passed via play(), which is rather fishy,
but sort of doesn't matter for now. Surely this will cause unintended
bugs and WTFs.
Before this change, AOs could have internal alignment, and play() would
not consume the trailing data if the size passed to it is not aligned.
Change this to require AOs to report their alignment (via period_size),
and make sure to always send aligned data.
The buffer reported by get_space() now always has to be correct and
reliable. If play() does not consume all data provided (which is bounded
by get_space()), an error is printed.
This is preparation for potential further AO changes.
I casually checked alsa/lavc/null/pcm, the other AOs might or might not
work.
It appears some device can be missing if we filter too many. In
particular, I've seen devices starting with "front" and "sysdefault"
being mapped to different hardware. I conclude that it's not sane trying
to present a nice device list to users in ALSA. It's fucked. (Although
kodi appears to attempt some intense "beautification" of the device
list, which includes parsing parameters from the device name and such.
Well, let's not.)
No other audio API requires such ridiculous acrobatics.
Apparently POLLERR can be set if poll is called while the device is in
the SND_PCM_STATE_PREPARED state. So assume that we can simply call
snd_pcm_status() to check whether the error is because the device went
away (i.e. we expect it to return ENODEV if this happened).
This avoids sporadic device lost warnings and AO reloads. The actual
device lost case is untested.
Long planned. Leads to some sanity.
There still are some rather gross things. Especially g_groups is ugly,
and a hack that can hopefully be removed. (There is a plan for it, but
whether it's implemented depends on how much energy is left.)
Until now, this was only implemented for ao_alsa and AOs not using
push.c. ao_alsa.c relied on enabling funny underrun semantics for
avoiding resets on lower levels, while other AOs using push.c didn't do
anything.
Change this and at least make push.c copy silent data to the AO. This
still isn't perfect as keeping track of how much silence was played when
seems complex, so we don't do it. The consequence is that frame-stepping
will essentially randomize the A/V offset (it'll recover immediately
when unpausing, but still ugly). Also, in order to empty the currently
buffered audio on seeks etc., we still call ao_driver->reset and so on,
so the AO driver will still need to handle this specially.
The intent is to make behavior with ALSA less weird (for one we can
remove the code in ao_alsa.c that tries to trigger an initial
underflow). Also might help with #3754.
The "default" entry (which is and always was mpv/mplayer's default) does
not have a description set in the ALSA API. (While "sysdefault"
strangely has.)
Instead of an empty description, this should show something nice, so
reuse the ao.c code for naming default devices (see previous commit).
It's still a bit ugly that audio-device-list will have a default entry
for "Autoselect device" and "Default (alsa)", but then again we probably
want to allow the user to force ALSA (i.e. prevent fallbacks to other
AOs) just because ALSA is so flaky and makes this a legitimate feature.
This happens when ALSA gives us more channels than we asked for, for
whatever reasons. It looks like this wasn't handled correctly. The mpv
and ALSA channel counts could mismatch, which would lead to UB.
I couldn't actually trigger this case, though. I'm fairly sure that
drivers or plugins exist that do it anyway. (Inofficial ALSA motto: if
it can be broken, then why not break it?)
If the input is already mono or stereo, or if channel map selection
results in mono or stereo, then disable further use of the champ ALSA
API (or rather, stop trusting its results). Then we behave like a simple
application that only wants to output mono or stereo.
See #3045 and #2905. I couldn't actually test these cases, but this
commit is supposed to fix them.