Until recently, the AO was reinitialized strictly only on decoder format
changes. But the commit for simplifying audio format negotiation removed
this. Now the AO is recreated for any format change.
This is sort of annoying if you change playback speed. The
insertion/removal of af_scaletempo can change the sample format. For
example, the acompressor filter will convert output to double, so
toggling scaletempo will force the format back to float. This recreates
the AO under the --gapless-audio=weak default. This likely affects a lot
of other filters too.
Work around this by allowing sample format changes, and keeping the
current AO format in these cases. This is probably not a big problem.
Most audio APIs force the output format to float anyway.
This means you actually have to worry about what the default gapless
mode does to your audio. If you start with a file that uses 8 bit per
sample, and then continue playing a 24 bit FLAC, it will be converted
down to 8 bit per sample. (Assuming they are played in a way that uses
the gapless logic.)
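The rule described above can be sketched as a small predicate. This is only an illustration under assumptions, not mpv's code: the class AoFormatSketch, the Format type, its fields, and both method names are hypothetical, and the sample format is modelled as a plain string.

```java
// Hypothetical sketch; none of these names come from mpv.
final class AoFormatSketch {
    static final class Format {
        final String sampleFormat; // e.g. "8-bit", "24-bit", "float"
        final int sampleRate;
        final int channels;

        Format(String sampleFormat, int sampleRate, int channels) {
            this.sampleFormat = sampleFormat;
            this.sampleRate = sampleRate;
            this.channels = channels;
        }
    }

    // True if the AO has to be recreated. A change that touches only the
    // sample format is tolerated: the AO keeps its current format.
    static boolean needsAoReinit(Format ao, Format next) {
        return ao.sampleRate != next.sampleRate || ao.channels != next.channels;
    }

    // Format the AO ends up using once audio in format `next` arrives.
    static Format formatAfterChange(Format ao, Format next) {
        return needsAoReinit(ao, next) ? next : ao;
    }

    public static void main(String[] args) {
        Format ao = new Format("8-bit", 44100, 2);
        Format flac = new Format("24-bit", 44100, 2);
        // Same rate and channels, so the AO stays at 8 bit per sample.
        System.out.println(formatAfterChange(ao, flac).sampleFormat);
    }
}
```

With these inputs the AO format stays "8-bit", which is the quality loss the paragraph above warns about.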
Print underruns as a warning.
Note that there may be some cases where it underruns, without being a
bad condition. This could possibly happen e.g. if the last chunk is
written, and then it resumes playback some time after that. Eventually I
want to add more code to avoid such spurious warnings.
The audio format negotiation code was pretty complicated, although the
idea was simple: when the format changes (or on the first audio frame),
filter only the new frame through the entire filter chain, discard the
resulting frame, but use the format to initialize the AO.
This was useful for "fudging" the channel remix behavior (upmix or
downmix), and moving it before other filters. Apparently this was useful
for things like DRC filters, which might work better in stereo, and
which also can only achieve the desired volume levels by doing it before
a downmix, which would modify the volume. This mechanism was introduced
in commit 60048b7eb9 (which the commit message also describes as
"idiotic heuristic"). Knowing the output format is inherently necessary
for this, because otherwise we can't know what the hell the user defined
filters will do.
There were problems with robustness. Some filters needed more than one
frame. Resampling in particular would discard initial audio at high
resampling ratios. Some filters might drop audio intentionally (like
clipping data on timestamp ranges). There were also allegations that
some decoders output 0 length frames (although that is invalid in
libavcodec). The state machine was excessively complex and hard to
understand too.
There are 3 things that could have been done:
1. Fix robustness problems by doing more heuristics, like repeating
audio frames or simply decoding several frames. Since filters can
behave differently, this would have added lots of complexity.
2. Make use of libavfilter's format negotiation, and add the same to
mpv builtin filters. This is sort of annoying, because the format
negotiation in libavfilter changes the state of the filters. It also
reports only some parameters (mostly all for audio, but a lot of
holes for video). It would remove some of the state machine, but not
all.
3. Drop the channel remix fudging, and do the same as the video chain.
This would not require format negotiation, but instead you can just
filter the audio frames, and look what comes out of it. If nothing
comes out, simply never create an AO.
This commit selects option 3. It removes the remix fudging, which means
the loss of a feature. Users can instead add "--af=format=channels=2"
before their DRC filter, or something. I'm also considering changing the
default for --audio-channels back to stereo, and downmix in the decoder
or at the start of the filter chain, which would give the same results,
except requiring more configuration.
Implementation-wise, this is still a bit different from the video path.
The VO always remains the same instance, while the AO might have to be
recreated on configuration changes. This still requires explicit format
change handling + draining old data, but by putting it into
f_autoconvert, not much new code is needed.
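A minimal sketch of the idea behind option 3, assuming a filter chain that may or may not return a frame for each input. All names here (LazyAoSketch, Frame, Ao, feed) are hypothetical and do not correspond to mpv's internals; draining of old data is only mentioned in a comment.

```java
import java.util.Optional;
import java.util.function.Function;

// Hypothetical sketch; none of these names come from mpv.
final class LazyAoSketch {
    static final class Frame {
        final int sampleRate;
        final int channels;

        Frame(int sampleRate, int channels) {
            this.sampleRate = sampleRate;
            this.channels = channels;
        }
    }

    // Stand-in for an audio output; it only remembers its format.
    static final class Ao {
        final int sampleRate;
        final int channels;

        Ao(int sampleRate, int channels) {
            this.sampleRate = sampleRate;
            this.channels = channels;
        }

        boolean matches(Frame f) {
            return sampleRate == f.sampleRate && channels == f.channels;
        }
    }

    private final Function<Frame, Optional<Frame>> filterChain;
    private Ao ao;          // stays null until the chain outputs a frame
    private int aosCreated; // number of times an AO was (re)created

    LazyAoSketch(Function<Frame, Optional<Frame>> filterChain) {
        this.filterChain = filterChain;
    }

    // Filter one decoded frame and look at what comes out.
    void feed(Frame decoded) {
        Optional<Frame> out = filterChain.apply(decoded);
        if (!out.isPresent())
            return; // nothing came out: no AO is created for this frame
        Frame f = out.get();
        if (ao == null || !ao.matches(f)) {
            // A real player would drain data queued in the old AO first.
            ao = new Ao(f.sampleRate, f.channels);
            aosCreated++;
        }
        // Playing f on ao is omitted.
    }

    Ao currentAo() { return ao; }
    int aosCreated() { return aosCreated; }

    public static void main(String[] args) {
        // A chain that downmixes everything to 2 channels.
        LazyAoSketch player = new LazyAoSketch(f -> Optional.of(new Frame(f.sampleRate, 2)));
        player.feed(new Frame(48000, 6));
        player.feed(new Frame(48000, 6));
        System.out.println(player.currentAo().channels + " channels, AOs created: "
                + player.aosCreated());
    }
}
```

The AO format is taken from the filtered frame rather than negotiated up front, so a chain that never outputs anything never opens an AO.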
This tried to avoid running the audio/video functions depending on
whether any of the audio or video related format restrictions were
called (so the filter would show an error if a mismatching media type
was passed in). It was a shit idea anyway, so fuck it.
There is a dedicated thread for feeding audio to the ALSA API from a
buffer with a larger size. There is little reason to have such a large
device buffer.
Some shittily muxed files (by a certain HandBrake+libavformat combo)
contain a SeekHead pointing to a SeekHead at the end of the file, which
in turn points to track headers (also at the end of the file). This
failed because the demuxer didn't bother to actually read the elements
listed by the second SeekHead, so no track headers were read, and
playback broke.
Somehow commit 6fe75c38 broke this for no reason. It adds a "needed"
field, which seems completely pointless and replaced the "parsed" flag
in an incomplete way. In particular, the "needed" field was not set when
a _recursive_ SeekHead was read, so those elements were not read. Just
get rid of the field and use "parsed" instead.
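The failure case can be illustrated with a toy model. This is a sketch under assumptions, not the demuxer's code: the "file" is a map from byte position to element, and the Element class, its fields, and readAll are hypothetical names. Only the idea of a single "parsed" flag is taken from the text above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model; none of these names come from mpv's Matroska demuxer.
final class SeekHeadSketch {
    static final class Element {
        final String type;    // "SeekHead", "Tracks", ...
        final long[] targets; // positions listed by a SeekHead, else empty
        boolean parsed = false;

        Element(String type, long... targets) {
            this.type = type;
            this.targets = targets;
        }
    }

    // Reads the element at `start` plus everything reachable through
    // SeekHeads, including SeekHeads found via other SeekHeads.
    // Returns the element types in the order they were read.
    static List<String> readAll(Map<Long, Element> file, long start) {
        List<String> read = new ArrayList<>();
        Deque<Long> pending = new ArrayDeque<>();
        pending.add(start);
        while (!pending.isEmpty()) {
            Element e = file.get(pending.poll());
            if (e == null || e.parsed)
                continue; // nothing there, or already handled
            e.parsed = true;
            read.add(e.type);
            for (long target : e.targets)
                pending.add(target); // entries of nested SeekHeads count too
        }
        return read;
    }

    public static void main(String[] args) {
        Map<Long, Element> file = new HashMap<>();
        file.put(0L, new Element("SeekHead", 1000L));    // start of file
        file.put(1000L, new Element("SeekHead", 1100L)); // end of file
        file.put(1100L, new Element("Tracks"));          // end of file
        System.out.println(readAll(file, 0L));
    }
}
```

In this sketch the same flag that marks an element as read also keeps a SeekHead that points back at an earlier one from being processed twice.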
The CUDA dynamic loader was broken out of ffmpeg into its own repo
and package. This gives us an opportunity to re-use it in mpv and
remove our custom loader logic.
One can now set the number of buffers and the buffer size.
This can reduce CPU usage while the total latency stays mostly the same.
Since there are sync mechanisms, A/V sync continues to work.
It also modifies the 6.1 channel order, as per the OpenAL spec,
and adds AOPLAY_FINAL_CHUNK support.
OpenAL Soft's AL_SOFT_source_latency extension allows one to correctly
get the device output latency, facilitating the synchronization with
video.
Also added a simpler generic fallback that does not take into account
latency of the device.
Uses OpenAL Soft's AL_DIRECT_CHANNELS_SOFT extension and can be controlled through
a new CLI option, --openal-direct-channels.
This allows one to send the audio data directly to the desired channel without
effects applied.
Quickly tested by a person who had FFmpeg linked with libaom.
Seems as simple as the VP9 mappings, where there is no extradata/
initialization data out-of-band, and just stuff in the packets
themselves.
Do note that the AV1 video format itself at this point is still
not frozen, so what you might produce one day might not be
decodable the following day.
When dump's argument is an array, it was displaying <VISITED> for all
the array's object elements (objects, arrays, etc), regardless of whether
they were actually visited.
The reason is that we try to stringify twice: once normally, which may
throw (on cycles), and a second time while excluding visited items. The
second attempt is indicated by binding the replacer to an empty array,
which holds the visited items; the replacer tests whether its 'this' is
an array and acts accordingly.
However, its "this" may also be an array even if we don't bind it to one,
because its "normal" this is the main stringified object. So the test of
Array.isArray(this) is true when the top object is an array, and the
object items are indeed in it, so the replacer considers them visited.
Fix by binding to null on the first attempt such that "this" is an array
only when we want it to test for visited items and not when the argument
itself is an array.
Due to earlier misinterpretation of the Lua docs as if mp.register_idle
registers a one-shot callback, the JS docs suggested to use setTimeout.
But the behavior and Lua docs are such that it's a repeating callback
which fires just before the script thread goes to sleep.
Implement it for JS too.
Although half (non-fast track on sink rate) or one-third (non-fast track
not on sink rate) of the buffer size of the created AudioTrack instance
is basically enough as the SL Enqueue buffer size for dropout-free
playback, only using the full size can avoid stutter upon (re)start of
playback.
Here are the buffer sizes for different track and sink rates when on
Bluetooth audio on Android O:
aptX @ 48kHz:
Sink rate: 48000 Hz
44100 Hz: 10632 frames (241.09 ms)
48000 Hz: 11544 frames (240.50 ms)
88200 Hz: 21216 frames (240.54 ms)
96000 Hz: 23088 frames (240.50 ms)
176400 Hz: 42384 frames (240.27 ms)
192000 Hz: 46128 frames (240.25 ms)
SBC/AAC/aptX @ 44.1kHz:
Sink rate: 44100 Hz
44100 Hz: 10776 frames (244.35 ms)
48000 Hz: 11748 frames (244.75 ms)
88200 Hz: 21552 frames (244.35 ms)
96000 Hz: 23448 frames (244.25 ms)
176400 Hz: 43056 frames (244.08 ms)
192000 Hz: 46848 frames (244.00 ms)
The above results were produced with the following code:
import android.media.AudioAttributes;
import android.media.AudioFormat;
import android.media.AudioTrack;

class AudioInfo {
    public static void main(String[] args) {
        // 3 is AudioManager.STREAM_MUSIC
        int nosr = AudioTrack.getNativeOutputSampleRate(3);
        System.out.printf("Sink rate: %d Hz\n", nosr);
        int[] rates = {44100, 48000, 88200, 96000, 176400, 192000};
        for (int rate : rates) {
            // 256 is AudioAttributes.FLAG_LOW_LATENCY
            AudioAttributes aa = new AudioAttributes.Builder().setFlags(256).build();
            AudioFormat af = new AudioFormat.Builder().setSampleRate(rate).build();
            // 4: requested buffer size in bytes, 1: AudioTrack.MODE_STREAM,
            // 0: AudioManager.AUDIO_SESSION_ID_GENERATE
            AudioTrack at = new AudioTrack(aa, af, 4, 1, 0);
            int sr = at.getSampleRate();
            int bs = at.getBufferSizeInFrames();
            float ms = bs * (float) 1000 / sr;
            at.release();
            System.out.printf("%d Hz: %d frames (%.2f ms)\n", sr, bs, ms);
        }
    }
}
Therefore, bump the device buffer size to 250 ms.
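The arithmetic behind the 250 ms figure can be checked without an Android device. The sketch below reuses the frame counts from the 44.1 kHz sink table above; the class and method names are made up for illustration, and rounding the frame count up is an assumption of this sketch.

```java
// Illustrative helper; names are not taken from mpv or Android.
final class BufferSizeSketch {
    // Duration in milliseconds of `frames` audio frames at `rate` Hz.
    static double framesToMs(long frames, int rate) {
        return frames * 1000.0 / rate;
    }

    // Number of frames needed to hold `ms` milliseconds at `rate` Hz,
    // rounded up to a whole frame.
    static long msToFrames(int ms, int rate) {
        return ((long) ms * rate + 999) / 1000;
    }

    public static void main(String[] args) {
        int[] rates = {44100, 48000, 88200, 96000, 176400, 192000};
        // AudioTrack buffer sizes measured with a 44100 Hz sink.
        int[] frames = {10776, 11748, 21552, 23448, 43056, 46848};
        for (int i = 0; i < rates.length; i++) {
            System.out.printf("%d Hz: measured %.2f ms, 250 ms = %d frames%n",
                    rates[i], framesToMs(frames[i], rates[i]),
                    msToFrames(250, rates[i]));
        }
    }
}
```

Every measured size in both tables is below 250 ms (the largest is 244.75 ms), so a 250 ms device buffer covers the full AudioTrack buffer in all listed cases.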
On machines with multiple GPUs, /dev/dri/renderD128 isn't guaranteed
to point to a valid vaapi device. This just adds the option to specify
what path to use.
The old fallback /dev/dri/card0 is gone, but that's not a loss as it's
a legacy interface no longer accepted as valid by libva.
Fixes #4320
libavcodec normally drops subtitle lines that fail a check for invalid
UTF-8 (their check is slightly broken too, by the way). This was always
annoying and inconvenient, but now there is a mechanism to prevent
it from doing this. Requires newest libavcodec.
There was a "generic" function to run a hook and to wait for its
completion, yet there were two duplicated functions doing the same
anyway. Replace them with a single function.
They differed in how stop_play was handled, but it was broken anyway.
stop_play is set when playback is stopped due to quitting or changing
the playlist entry - but we still can't stop hook processing, because
that would mean asynchronously doing something else while the user hook
code is still busy and might still have the expectation that running the
hook stops everything else. So not waiting until the hook ends properly
is against the whole hook idea. That this was done inconsistently is
even worse. (Though it could be argued that when quitting the player,
everything should just be stopped violently. But I still think that's
up to the hook handler.)
process_hooks() does not return anything, since hook processing doesn't
really have a result (it's all about blocking and letting some other
code synchronously do something). Just let the caller check whether
loading was aborted in the meantime.
Also change the potentially misleading name of mp_hook_run().
As it turns out, there are multiple libmpv users who saw a need to
use the hook API. The API is kind of shitty and was never meant to be
actually public (it was mostly a hack for the ytdl script).
Introduce a proper API and deprecate the old one. The old one will
probably continue to work for a few releases, but will be removed
eventually.
There are some slight changes to the old API, but if a user followed
the manual properly, it won't break.
Mostly untested. Appears to work with ytdl_hook.
Move all of this stuff to a common function. This makes the error
messages less specific, but I don't think anyone will miss it.
The OSD flag handling is annoying, but it's nothing that should be
changed with this commit.
I think this will help with reducing code duplication (see following
commit). The error message loses the multiplication factor, but the
error message will be replaced by a generic one in the following commit
anyway.
Hardware decoding things often need access to additional handles from
the windowing system, such as the X11 or Wayland display when using
vaapi. The opengl-cb had nothing dedicated for this, and used the weird
GL_MP_MPGetNativeDisplay GL extension (which was mpv specific and not
officially registered with OpenGL).
This was awkward, and a pain due to having to emulate GL context
behavior (like needing a TLS variable to store context for the pseudo GL
extension function). In addition (and not inherently due to this), we
could pass only one resource from mpv builtin context backends to
hwdecs. It was also all GL specific.
Replace this with a newer mechanism. It works for all RA backends, not
just GL. The API user can explicitly pass the objects at init time via
mpv_render_context_create(). Multiple resources are naturally possible.
The API uses MPV_RENDER_PARAM_* defines, but internally we use strings.
This is done for 2 reasons: 1. trying to leave libmpv and internal
mechanisms decoupled, 2. not having to add public API for some of the
internal resource types (especially D3D/GL interop stuff).
To remain sane, drop support for obscure half-working opengl-cb things,
like the DRM interop (was missing necessary things), the RPI window
thing (nobody used it), and obscure D3D interop things (not needed with
ANGLE, others were undocumented). In order not to break ABI and the C
API, we don't remove the associated structs from opengl_cb.h.
The parts which are still needed (in particular DRM interop) need to be
ported to the render API.