There are situations where the resampler is destroyed and recreated
during playback. If recreating the resampler unexpectedly fails, the
filter function is supposed to return an error. This wasn't done
correctly, because get_out_samples() accessed the resampler before the
check. Move the check up to fix this.
The touched code is for seek resets and such - we simply want to reset
the entire resample state. But I noticed after a seek a tiny bit of
audio is missing (mpv's audio sync code inserted silence to compensate).
It turns out swr_drop_output() either does not reset some internal state
as we expect, or it's designed to drop not only buffered samples, but
also future samples.
On the other hand, libavresample's avresample_read(), does not have this
problem. (It is also pretty explicit in what it does - return/skip
buffered data, nothing else.)
Is the libswresample behavior a bug? Or a feature? Does nobody even
know? Who cares - use the hammer to unfuck the situation. Destroy and
deallocate the libswresample context and recreate it. On every seek.
The algorithm and functionality is the same, but the code becomes much
simpler and easier to follow.
The assumption that there is only 1 conversion filter (lavrresample)
helps with the simplification, but the main change is to use the same
code for format/channels/rate. Get rid of the different AF_CONTROL_SET_*
controls, and change the af->data parameters directly. (af->data is
badly named, but essentially is a placeholder for the output format.)
Also, instead of trying to use the af_reinit() loop to init inserted
conversion filters or filters with changed output formats, do it inline,
and move the common code to a filter_reinit() function. This gets rid of
the awful retry variable.
In general, this should not change any runtime behavior.
Remove flc-frc <-> sl<->sr. This was just plain wrong, and a mistaken
change to make 7.1 work properly on CoreAudio with 7.1(rear) layout.
Also see the following commit.
Add br-br <-> sl<->sr, because we decided that it makes sense.
Note that this "fudging" is applied only if the channel pairs are
replaced, i.e. they would get dropped and be replaced with silence. This
is done to compensate for libswresample's default rematrixing (which
takes care of some more common cases).
This is probably the 3rd time the user-visible behavior changes. This
time, switch back because not normalizing seems to be the more expected
behavior from users.
Prevents channels from being dropped, e.g. when going 7.1 -> 7.1(wide)
and similar cases. The reasoning here is that channel layouts over HDMI
don't work anyway, and not dropping a channel and playing it on a
slightly "wrong" (but expected) speaker is preferable to playing silence
on these speakers.
Do this to remove issues with ao_coreaudio. Frankly I'm not sure whether
our mapping (between CA and mpv/FFmpeg speakers) is correct, but on the
other hand due to the reasons stated above it's not all that meaningful.
Of course, only FFmpeg has av_clipd(), while Libav does not. (Nevermind
that it doesn't do much more than the mpv MPCLAMP() macro. Supposedly,
libavutil can provide optimized platform-specific versions for av_clip*,
but of course nothing actually does for av_clipf() or av_clipd().)
libswresample doesn't do it - although it should, but the patch is stuck
in limbo.
Probably reduces problems with artifacts on downmixing in some cases.
Just set the ratio directly by working around the intended semantics of
the API function. The silly rounding stuff we had isn't needed anymore
(and not entirely correct anyway).
Note that since the compensation is virtually active forever, we need to
reset if it's not needed. So always run this code to be sure to reset
it.
Also note that libswresample itself had a precision issue, until it
was fixed in FFmpeg commit 351e625d.
This reverts commit 4e358a9636.
Testing shows the channel pairs must indeed be swapped (details see
commit message of the reverted commit). Making the downmix code move
sl/sr to sdl/sdr is not an appropriate solution anymore, and it's
better to fix the unusual channel layout in ao_alsa.c directly.
(Not reverting the change in chmap.c; this is still correct.)
ao_alsa: attempt to fix 7.1 over HDMI
The last 2 channels of 7.1 (RLC/RRC in ALSA) were exported as sdl/sdr
instead of sl/sr (I don't even know why I chose sdl/sdr, but SL/SR
and RLC/RRC are different in the ALSA API). libsw/avresample do not
move the sl/sr channels to sdl/sdr when rematrixing, so silence was
sent for 2 channels. If my selection of sdl/sdr is essentially API
abuse, there's no reason why they should do this differently.
The mess here is really that ALSa doesn't map the HDMI layouts cleanly.
Most ALSA drivers export 7.1 in a way compatible to our expectations,
but Intel HDA/HDMI does not:
mpv/ffmpeg: fl-fr-fc-lfe-bl-br-sl-sr
ALSA/generic: FL FR FC LFE RL RR SL SR [1]
ALSA/HDMI: FL FR LFE FC RL RR RLC RRC [2]
The HDMI layout is layout 0x13 (going by CEA-861-B). The comment in
the kernel code has to be correct too. The early standard defines only
1 other layout, which replaces RLC/RRC with FRC/FLC - this probably
corresponds to what we call "7.1(wide)".
So it appears when ALSA requests RLC/RRC, we should feed it sl/sr.
To make it more complicated, Kodi/xbmc apparently also have to deal with
ALSA being special, but instead of sending sl/sr to RLC/RRC, they swap
the last two pairs of the layout, and send sl/sr to RL/RR and bl/br to
RLC/RRC. Or I might have misunderstood their code. I don't have a
7.1-capable A/V receiver, so I can't test this.
For now, go with the simpler solution, and wait until someone tests it.
If the speakers end up swapped, a completely different solution will be
needed.
[1] https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/sound/core/pcm_lib.c?id=refs/tags/v4.3#n2434
[2] https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/sound/pci/hda/patch_hdmi.c?id=refs/tags/v4.3#n307
av_get_default_channel_layout() fails with channel counts larger than 8.
The channel layout doesn't need to make sense, so pick an arbitrary
fallback.
libswresample also has options for setting the channel counts directly,
but better not introduce new concepts in the code. Also, libavresample
doesn't have these options.
Small adjustments to the playback speed use swr_set_compensation()
to stretch the audio as it is required. But since large adjustments
are now handled by actually reinitializing libswresample, the small
adjustments get rounded off completely with typical frame sizes.
Compensate for this by accounting for the rounding error and keeping
track of fractional samples that should have been output to achieve
the correct ratio.
This fixes display sync mode behavior, which requires these adjustments
to be relatively accurate.
swr/avresample_set_compensation() was made for small speed adjustments.
Non-documentation says it should be used for changes not larger than 1%,
so reinitialize the sampler if the change is larger than that.
swr_set_compensation() changes the apparent sample rate on the fly (who
would have guessed). It is thus very well-suited for adjusting audio
speed on the fly during playback (like needed by the display-sync mode).
It skips the relatively slow resampler reinitialization.
If this doesn't work (libswresample soxr backend), then fall back to the
old method.
Not sure why struct af_resample_opts even exists. It seems useful to
group the fields set by user options. But storing the current format
conversion parameters doesn't seem very elegant, and having a separate
instance in the "ctx" field isn't helpful either.
This was a minor optimization to potentially avoid resampler
reconfiguration when the filter is reinitialized. But filter
reinitialization is a rare event, and the case when no reconfiguration
is needed is even rarer. As such, this is an unnecessary micro-
optimization and only adds potential for bugs.
This message bloats verbose log output if e.g. audio speed is frequently
readjusted, such as when syncing audio to video. So don't print the
message if only speed is changed. (This case requires reconfiguration,
but can't change the input/output channel maps.)
Also do not print the message if no remixing is done at all.
Replace all the check macros with function calls. Give them all the
same case and naming schema.
Drop af_fmt2bits(). Only af_fmt2bps() survives as af_fmt_to_bytes().
Introduce af_fmt_is_pcm(), and use it in situations that used
!AF_FORMAT_IS_SPECIAL. Nobody really knew what a "special" format
was. It simply meant "not PCM".
This avoids keeping "bad" state from previous reconfig calls, such as
the internal_sample_format option (which is set only on the first
reconfig call).
There's no advantage to keeping the resample contexts around anyway.
Basically, af_fix_format_conversion() behaves stupid you insert a
conversion filter that won't work, and adding back the conversion test
function is the simplest fix to it.
This attempted to find a minimal filter graph for a format conversion
involving multiple conversion filters. With the last 2 commits it
becomes dead code - remove it.
Now af_lavrresample can output 24 bit samples directly, by doing the
conversion "inline". Luckily, S32->S24 can be done in-place, so this
isn't too much work. But the output conversion logic (which seems to be
adding up) gets slightly more complicated again.
Normally this is done by af_convert24. But having multiple conversion
filters complicates some aspects of the filter chain. S24 output is the
only thing the code for multiple conversion filters is still needed for,
and getting rid of that is preferable.
If the code path for additional output conversion is active,
reorder_planes() is always called, even if the reorder_out array wasn't
filled. This is obviously wrong - always fill this array.
Until now, we didn't do this, because it required some effort, and
didn't seem to be necessary. It probably still isn't, but it sounds
like a good idea not to output arbitrary data on these channels.
The situation is complicated by the fact that just adding new channels
to a planar frame would require messing with buffers. So we would have
to allocate new buffers and add them to the frame. We could have to
maintain an extra buffer pool for this. Avoid this by being "clever",
and just allocate a frame with enough channels in the first place.
libav/swresample won't know about these channels and won't write to
them, but we can grab them in reorder_planes() and use them for the
NA channels.
This is better, because now we call swr_get_delay() with the output
samplerate, instead of with the input samplerate and then multiplying it
with the ratio and rounding it up.
This manually retrieved the remaining audio from the resampler. It
subtly missed a conversion which could leave to an unsubtle crash.
This could happen if reorder_planes() was supposed to insert NA
channels, and the resampler/actual output format were different.
Simplify it by reusing the normal drain path. One oddness is that
the filter will add an output frame outside of normal filtering,
but that should be fine.
Some audio APIs explicitly require you to add dummy channels. These are
not rendered, and only exist for the sake of the audio API or hardware
strangeness. At least ALSA, Sndio, and CoreAudio seem to have them.
This commit is preparation for using them with ao_coreaudio.
The result is a bit messy. libavresample/libswresample don't have good
API for this; avresample_set_channel_mapping() is pretty useless.
Although in theory you can use it to add and remove channels, you
can't set the channel counts. So we do the ordering ourselves by making
sure the audio data is planar, and by swapping the plane pointers. This
requires lots of messiness to get the conversions in place. Also, the
input reordering is still done with the "old" method, and doesn't
support padded channels - hopefully this will never be needed. (I tried
to come up with cleaner solutions, but compared to my other attempts,
the final commit is not that bad.)
libswresample doesn't normalize when remixing to a float format. This
will cause clipping due to float samples being out of the allowed range.
Fortunately this extremely bad default can be changed.
This does not happen with libavresample: it normalizes by default.
Fixes#1752.
Although the libraries we use for resampling (libavresample and
libswresample) do not support changing sampelrate on the fly, this makes
it easier to make sure no audio buffers are implicitly dropped. In fact,
this commit adds additional code to drain the resampler explicitly.
Changing speed twice without feeding audio in-between made it crash
with libavresample inc ertain cases (libswresample is fine). This is
probably a libavresample bug. Hopefully this will be fixed, and also I
attempted to workaround the situation that crashes it. (It seems to
point in direction of random memory corruption, though.)
The purpose of this function was to filter only as much audio input as
needed to produce a certain amount of audio output. This could (in
theory) avoid excessive buffering when e.g. changing playback speed with
resampling.
Use of this was already removed in commit 5fd8a1e0. No problems were
experienced, so let's assume this feature is practically worthless.
(Though it's possible that it was quite useful over a decade ago, or in
some cornercases with evil files.)