The previous RING_BUFFER_COUNT value, 64, would have ao_wasapi buffer
64 frames of audio in the ring buffer; at 20ms per frame, that is a
delay of 1280ms, which is clearly overkill for everything. A value of 8
buffers only 8 frames, for a total of 160ms.
When get_space was converted to returning samples instead of bytes, a
unit type mismatch in get_delay's calculation returned bogus values. Fix
by converting get_space's value back to bytes.
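Roughly, the fix looks like this (a minimal sketch with illustrative
names, not the actual ao_wasapi code):

  #include <stdint.h>

  /* Once get_space() returns samples, get_delay() must convert that
   * value back to bytes before mixing it with byte-based buffer sizes. */
  static double get_delay_fixed(int64_t free_samples, int bytes_per_frame,
                                int64_t total_buffer_bytes,
                                int bytes_per_second)
  {
      int64_t free_bytes = free_samples * bytes_per_frame; /* samples -> bytes */
      int64_t buffered_bytes = total_buffer_bytes - free_bytes;
      return (double)buffered_bytes / bytes_per_second;    /* bytes -> seconds */
  }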
Fixes playback with ao_wasapi when reaching EOF, or seeking past it.
This can be reproduced with:
mpv short.wav -af 'lavfi="aecho=0.8:0.9:5000|6800:0.3|0.25"'
An audio file that is just 1-2 seconds long should play for 8-9 seconds,
with audible echo towards the end.
The code assumes that when playing with AF_FILTER_FLAG_EOF, the filter
will either produce output, or has all remaining data flushed. I'm not
really sure whether this really works if there are multiple filters with
EOF handling in the chain. To handle it correctly, af_lavfi should retry
filtering if 1. EOF flag is set, 2. there were input samples, and 3. no
output samples were produced. But currently it seems to work well enough
anyway.
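Sketched out, the retry rule would look like this (filter_step() and
output_samples() are hypothetical stand-ins for the af_lavfi internals;
only AF_FILTER_FLAG_EOF is a real name from this code):

  #include <stdbool.h>

  #define AF_FILTER_FLAG_EOF 1  /* real flag name; value illustrative */

  struct af_ctx;
  int filter_step(struct af_ctx *af, bool have_input, int flags);
  int output_samples(struct af_ctx *af);

  static int filter_with_retry(struct af_ctx *af, bool have_input, int flags)
  {
      int err = filter_step(af, have_input, flags);
      if (err < 0)
          return err;
      /* Retry if: 1. the EOF flag is set, 2. there were input samples,
       * and 3. no output samples were produced. */
      if ((flags & AF_FILTER_FLAG_EOF) && have_input && output_samples(af) == 0)
          err = filter_step(af, false, flags); /* flush buffered data */
      return err;
  }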
The new signature is actually closer to how it really works, and
someone who is not familiar with the API and how it works might make
fewer fatal mistakes with the new signature than with the old one.
Pretty weird.
Do this to sneak in a flags parameter, which will later be used to flush
remaining data of at least af_lavfi.
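Roughly the shape of the change (a sketch; the exact mpv declaration
may differ):

  struct af_instance;
  struct mp_audio;

  /* Old: the callback returned the filtered audio and took no flags:
   *   struct mp_audio *(*play)(struct af_instance *af, struct mp_audio *data);
   * New: an int return for errors, plus a flags argument (e.g.
   * AF_FILTER_FLAG_EOF) that can request flushing buffered data: */
  typedef int (*af_filter_fn)(struct af_instance *af, struct mp_audio *data,
                              int flags);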
This will make af_channels output a channel layout that is compatible
with any destination layout. Not sure if that's a good idea though,
since the way the AO chooses a layout is perhaps less predictable. On the
other hand, using the old MPlayer standard layouts doesn't make much
sense either. We'll see whether this improves or breaks someone's use
case.
Apparently this stopped working after some planar changes (broken format
negotiation). Radically change option parsing in an incompatible way.
Suggest alternatives to this filter, since it barely has any importance
anymore.
Normally, audio decoders don't have a decoder delay, so the code was
fine. But FFmpeg supports multithreaded decoding for some audio codecs,
which introduces such a delay.
The delay means that we won't get decoded audio for the first few
packets, and that we need to do something to get the trailing audio
still buffered in the decoder when reaching EOF.
Two changes are needed to deal with the delay:
- If EOF is reached, pass a "flush" packet to the decoder to return the
buffered audio. Such a flush packet is automatically set up when
calling mp_set_av_packet() with a NULL packet.
- Use the PTS returned by the decoder, instead of the packet's. This is
important to get correct timestamps for decoded audio. Ignoring this
would result in offsetting the audio playback time by the decoder
delay. Note that we can still use the timestamp of the first packet
to get the timestamp for the start of the audio. (A sketch of both
points follows below.)
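A minimal sketch of both points, using the helper names from this
commit plus the libavcodec decode API of the time; the function itself
is illustrative, not the verbatim ad_lavc.c code:

  #include <libavcodec/avcodec.h>

  struct demux_packet;  /* mpv's demuxer packet (opaque here) */
  /* mpv helpers named in this commit; prototypes per the description: */
  void mp_set_av_packet(AVPacket *dst, struct demux_packet *mpkt,
                        AVRational *tb);
  double mp_pts_from_av(int64_t av_pts, AVRational *tb);
  #define MP_NOPTS_VALUE (-1e300)  /* mpv's "no timestamp" value */

  /* mpkt == NULL means EOF: mp_set_av_packet() then sets up a flush
   * packet, so the decoder returns its remaining buffered audio. */
  static double decode_one(AVCodecContext *avctx, AVFrame *frame,
                           struct demux_packet *mpkt)
  {
      AVPacket pkt;
      mp_set_av_packet(&pkt, mpkt, NULL); /* NULL tb: reinterpret mode */
      int got_frame = 0;
      if (avcodec_decode_audio4(avctx, frame, &got_frame, &pkt) < 0 ||
          !got_frame)
          return MP_NOPTS_VALUE;
      /* Use the decoder-returned PTS, not the packet's, so the decoder
       * delay does not shift the audio timestamps. */
      return mp_pts_from_av(frame->pkt_pts, NULL);
  }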
If the timebase is set, it's used for converting the packet timestamps.
Otherwise, the previous method of reinterpret-casting the mpv style
double timestamps to libavcodec style int64_t timestamps is used.
Also replace the kind of awkward mp_get_av_frame_pkt_ts() function by
mp_pts_from_av(), which simply converts timestamps the way the old
function did. (Plus it takes a timebase parameter, similar to the
addition to mp_set_av_packet().)
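Sketched out, the conversion rule looks like this (a sketch matching
the description above, not the verbatim mpv code):

  #include <string.h>
  #include <libavutil/avutil.h>    /* AV_NOPTS_VALUE */
  #include <libavutil/rational.h>  /* AVRational, av_q2d() */

  #define MP_NOPTS_VALUE (-1e300)  /* mpv's "no timestamp" value */

  static double sketch_mp_pts_from_av(int64_t av_pts, AVRational *tb)
  {
      if (av_pts == AV_NOPTS_VALUE)
          return MP_NOPTS_VALUE;
      if (tb)                           /* timebase set: convert properly */
          return av_pts * av_q2d(*tb);
      double d;                         /* timebase unset: reinterpret the */
      memcpy(&d, &av_pts, sizeof(d));   /* bits, like the old code did */
      return d;
  }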
Note that this should not change anything yet. The code in ad_lavc.c and
vd_lavc.c passes NULL for the timebase parameters. We could set
AVCodecContext.pkt_timebase and use that if we want to give libavcodec
"proper" timestamps.
This could be important for ad_lavc.c: some codecs (opus, probably mp3
and aac too) have weird requirements about doing decoding preroll on the
container level, and thus require adjusting the audio start timestamps
in some cases. libavcodec doesn't tell us how much was skipped, so we
either get shifted timestamps (by the length of the skipped data), or we
give it proper timestamps. (Note: libavcodec interprets or changes
timestamps only if pkt_timebase is set, which by default it is not.)
This would require selecting a timebase though, so I feel uncomfortable
with the idea. At least this change paves the way, and will allow some
testing.
There are some use cases for this. For example, you can use it to set
defaults of automatically inserted filters (like af_lavrresample). It's
also useful if you have a non-trivial VO configuration, and want to use
--vo to quickly change between the drivers without repeating the whole
configuration in the --vo argument.
This is needed so that new processes (created with fork+exec) don't
inherit open files, which can be important for a number of reasons.
Since O_CLOEXEC is relatively new (POSIX.1-2008, before that Linux
specific), we #define it to 0 in io.h to prevent compilation errors on
older/crappy systems. At least this is the plan.
input.c creates a pipe. For that, add a mp_set_cloexec() function (which
is based on Weston's code in vo_wayland.c, but more correct). We could
use pipe2() instead, but that is Linux specific. Technically, we have a
race condition, but it won't matter.
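The plausible shape of mp_set_cloexec() (FD_CLOEXEC via fcntl() is
standard POSIX; this is a sketch, not mpv's verbatim code). The race is
the window between creating the fd and the fcntl() call, during which
another thread could fork+exec:

  #include <fcntl.h>

  #ifndef O_CLOEXEC
  #define O_CLOEXEC 0  /* fallback for old systems, as described above */
  #endif

  static void mp_set_cloexec(int fd)
  {
      if (fd >= 0) {
          int flags = fcntl(fd, F_GETFD);
          if (flags != -1)
              fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
      }
  }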
This partially reverts commit 7d152965. It turns out that at least some
ALSA drivers (at least snd-hda-intel) report incorrect audio delay with
non-native sample rates, even if the sample rate is only very slightly
different from the native one.
For example, 48000Hz is fine on my hda-intel system, while both 8000Hz
and 47999Hz lead to a delay off by 40ms (according to mpv's A/V
difference display), which suggests that something in ALSA is
calculating the delay using the wrong sample rate.
As an additional problem, with ALSA resampling enabled, using
48001Hz/float/2ch fails, while 49000Hz/float/2ch or 48001Hz/s16/2ch
work. With resampling disabled, all these cases obviously work, because
our own resampler doesn't simply refuse any of these formats.
Since some people want to use the ALSA resampler (because it's highly
configurable, supports multiple backends, etc.), we still allow enabling
ALSA resampling with an ao_alsa suboption.
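For example (assuming the suboption is simply called "resample"):
mpv --ao=alsa:resample input.mkv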
The previous code used the values of the AudioChannelLabel enum directly
to create the channel bitmap. While this was quite smart, it was pretty
unreadable and fragile (what if Apple changes the values of those
enums?).
Change it to use a 'dumb' conversion table.
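An illustrative excerpt of such a table (the CoreAudio labels are real;
the mpv speaker IDs in the comments are assumed names, and the table
layout is a sketch, not the verbatim ao_coreaudio code):

  #include <CoreAudio/CoreAudioTypes.h>

  static const struct {
      AudioChannelLabel label;
      int speaker;  /* mpv speaker ID */
  } chmap_table[] = {
      {kAudioChannelLabel_Left,          0},  /* MP_SPEAKER_ID_FL  */
      {kAudioChannelLabel_Right,         1},  /* MP_SPEAKER_ID_FR  */
      {kAudioChannelLabel_Center,        2},  /* MP_SPEAKER_ID_FC  */
      {kAudioChannelLabel_LFEScreen,     3},  /* MP_SPEAKER_ID_LFE */
      {kAudioChannelLabel_LeftSurround,  4},  /* MP_SPEAKER_ID_BL  */
      {kAudioChannelLabel_RightSurround, 5},  /* MP_SPEAKER_ID_BR  */
      /* ...each label mapped explicitly, instead of relying on the
       * numeric values of the enum. */
  };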
These used the suffix _resync_stream, which is a bit misleading. Nothing
gets "resynchronized", they really just reset state.
(Some audio decoders actually used to "resync" by reading packets for
resuming playback, but that's not the case anymore.)
Also move the function in dec_video.c to the top of the file.
The code stopped at kAudioChannelLabel_TopBackRight and missed mappings
for 5 more channel labels. These are in a completely different order
than the mpv ones, so they must be mapped manually.
This includes the case when lavc decodes audio with more than 8
channels, which our audio chain currently does not support.
The changes in ad_lavc.c are just simplifications. The code tried to
avoid overriding global parameters if it found something invalid, but
that is not needed anymore.
Resampling with non-ancient ALSA setups works fine, so there is no
need to keep this around. Furthermore, as of writing, the default
builtin resampler used by many ALSA setups (taken from libspeex)
actually has higher quality than the default resampling modes of
avresample and swresample.
Apparently just 5 packets is not enough for the initial audio decode
(which is needed to find the format). The old code (before the recent
refactor) appeared to use 5 packets, but there were apparently other
code paths which in the end amounted to more than 5 packets being read.
The sample that failed (see github issue #368) needed 9 packets.
Fixes #368.
This used to be needed to access the generic stream header from the
specific headers, which in turn was needed because the decoders had
access only to the specific headers. This is not the case anymore, so
this can finally be removed again.
Also move the "format" field from the specific headers to sh_stream.
sh_audio is supposed to contain file headers, not whatever was decoded.
Fix this, and write the decoded format to a separate field in the decoder
context, the dec_audio.decoded field. (Note that this field is really
only needed to communicate the audio format from decoder driver to the
generic code, so no other code accesses it.)