I can't explain this, but it seems to be a similar case to the ALSA HDMI
one. I find it hard to tell because of the slightly different names and
conventions in use in libavcodec, WAVEEXT channel masks, decoders, codec
specifications, HDMI, and platform audio APIs.
The fix is the same as the one for ao_alsa (see commit be49da72). This
should fix at least playing 7.1 sources on OSX with 7.1(rear) selected
in Audio MIDI Setup. The ao_alsa commit mentions XBMC, but I couldn't
find out where it does that or if it also does that for CoreAudio. It's
worth noting that PHT (essentially an old XBMC fork) also exhibited the
incorrect behavior (i.e. side and back speakers were swapped).
Remove the flc-frc <-> sl-sr replacement. This was just plain wrong, and a mistaken
change to make 7.1 work properly on CoreAudio with 7.1(rear) layout.
Also see the following commit.
Add a bl-br <-> sl-sr replacement, because we decided that it makes sense.
Note that this "fudging" is applied only if the channel pairs are
replaced, i.e. they would get dropped and be replaced with silence. This
is done to compensate for libswresample's default rematrixing (which
takes care of some more common cases).
Will be helpful for the coming filter support. I planned on merging
audio/video decoding, but this will have to wait a bit longer, so only
remove the duplicate status codes.
This also makes it refcounted, i.e. the new AVFrame will reference the
mp_audio buffers, instead of potentially forcing the consumer of the
AVFrame to copy the data.
All the extra code is for handling the >8 channels case, which requires
very messy dealing with the extended_ fields (not our fault).
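As a rough sketch (the mpv-side names are assumptions), the >8 channels
case boils down to:

    AVFrame *frame = av_frame_alloc();
    frame->nb_samples = mpa->samples;
    frame->channels = mpa->channels.num;
    if (frame->channels > AV_NUM_DATA_POINTERS) {
        // extended_data must be allocated separately in this case
        frame->extended_data = av_mallocz_array(frame->channels,
                                    sizeof(frame->extended_data[0]));
    } else {
        frame->extended_data = frame->data;
    }
    for (int n = 0; n < mpa->num_planes; n++)
        frame->extended_data[n] = mpa->planes[n];  // reference, no copy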
Correctly avoid a reload if the current device was specified by the user through
--audio-device. Previously, we only recognized if the user had specified
--ao=wasapi:device=.
This seems generally easier when using libmpv (and was already requested
and implemented before: see commit 327a779a; it was reverted some time
later).
With the weird internal logic we have to deal with, in particular the
--softvol=no case (using system volume), and using the audio API's mixer
(--softvol=auto on some systems), we still can't avoid all glitches and
corner cases that complicate this issue so much. The API user is recommended
either to use --softvol=yes or auto, or to watch the new mixer-active
property and to assume the volume/mute properties have meaningful values
only while the mixer is active.
Remaining glitches:
- changing the volume/mute properties has no effect if no internal mixer
is used (--softvol=no) and the mixer is not active; the actual mixer
controls do not change, only the property values
- --volume/--mute do not have an effect on the volume/mute properties
before mixer initialization (the options are strictly applied only
during mixer init)
- volume-max is 100 while the mixer is not active
Notably, the address of the enumerator->count member is passed to
IMMDeviceCollection::GetCount(), which expects a UINT variable, not an int. How
did this ever work?
Previously, if the enumerator found no devices, attempting to get the default
device with IMMDeviceEnumerator::GetDefaultAudioEndpoint would result in the
cryptic (and undocumented) E_PROP_ID_UNSUPPORTED. This way, the user is given a
better indication of what exactly is wrong, and any other possible
triggers for this error are isolated.
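Roughly, both fixes together (the logging call and error label are
placeholders):

    HRESULT hr;
    UINT count;  // not int: IMMDeviceCollection::GetCount() takes a UINT*
    hr = IMMDeviceCollection_GetCount(collection, &count);
    if (FAILED(hr))
        goto exit_label;
    if (count == 0) {
        // fail with a clear message instead of the cryptic
        // E_PROP_ID_UNSUPPORTED from GetDefaultAudioEndpoint
        mp_err(log, "no audio endpoints found\n");
        goto exit_label;
    }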
Similar to the video path. dec_audio.c now handles decoding only. It
also looks very similar to dec_video.c, and actually contains some of
the rewritten code from it. (A further goal might be unifying the
decoders, I guess.)
High potential for regressions.
This is probably the 3rd time the user-visible behavior has changed. This
time, switch back, because not normalizing seems to be what users expect.
Seems useless.
This only helped in one case: one audio stream in the sample
av_find_best_stream_fails.ts had AC3 packets which couldn't be
decoded, and for which avcodec_decode_audio4() returned 0 forever. In
this specific case, playback will now not start, and you have to
deselect audio manually.
(If someone complains, the old behavior might be restored, but
differently.)
Also remove the stale "bitrate" field.
While the situation is not really clear for the other rewritten
coreaudio code, it's very clear for the channel mapping code. It was all
written by us. (MPlayer doesn't even have any channel map handling.)
This covers source files which were added in mplayer2 and mpv times
only, and where all code is covered by LGPL relicensing agreements.
There are probably more files to which this applies, but I'm being
conservative here.
A file named ao_sdl.c exists in MPlayer too, but the mpv one is a
complete rewrite, and was added some time after the original ao_sdl.c
was removed. The same applies to vo_sdl.c, for which the SDL2 API is
radically different in addition (MPlayer supports SDL 1.2 only).
common.c contains only code written by me. But common.h is a strange
case: although it originally was named mp_common.h and exists in MPlayer
too, by now it contains only definitions written by uau and me. The
exceptions are the CONTROL_ defines - thus not changing the license of
common.h yet.
codec_tags.c contained once large tables generated from MPlayer's
codecs.conf, but all of these tables were removed.
From demux_playlist.c I'm removing a code fragment from someone who was
not asked; this probably could be done later (see commit 15dccc37).
misc.c is a bit complicated to reason about (it was split off mplayer.c
and thus contains random functions out of this file), but actually all
functions have been added post-MPlayer. Except get_relative_time(),
which was written by uau, but looks similar to 3 different versions of
something similar in each of the Unix/win32/OSX timer source files. I'm
not sure what that means in regards to copyright, so I've just moved it
into another still-GPL source file for now.
screenshot.c once had some minor parts of MPlayer's vf_screenshot.c, but
they're all gone.
Previously, the opt_exclusive option was used to decide which volume control
code to run. This might not always reflect the actual state, for example if passthrough
is used. Admittedly, none of the volume controls will work anyway with
passthrough, but this is the right thing to do.
Prevents channels from being dropped, e.g. when going 7.1 -> 7.1(wide)
and similar cases. The reasoning here is that channel layouts over HDMI
don't work anyway, and not dropping a channel and playing it on a
slightly "wrong" (but expected) speaker is preferable to playing silence
on these speakers.
Do this to remove issues with ao_coreaudio. Frankly I'm not sure whether
our mapping (between CA and mpv/FFmpeg speakers) is correct, but on the
other hand due to the reasons stated above it's not all that meaningful.
This is mainly a refactor. I'm hoping it will make some things easier
in the future due to cleanly separating codec metadata and stream
metadata.
Also, declare that the "codec" field can not be NULL anymore. demux.c
will set it to "" if it's NULL when added. This gets rid of a corner
case everything had to handle, but which rarely happened.
Note that hresult_to_str() (coming from wasapi_explain_err()) is mostly
wasapi-specific, but since HRESULT error codes are unique, it can be
extended for any other use.
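The idea, sketched (the exact set of mapped codes here is illustrative):

    static const char *hresult_to_str(HRESULT hr) {
        switch (hr) {
        case S_OK:                         return "S_OK";
        case E_FAIL:                       return "E_FAIL";
        case AUDCLNT_E_DEVICE_INVALIDATED: return "AUDCLNT_E_DEVICE_INVALIDATED";
        // HRESULT values are unique, so non-WASAPI codes can be added freely
        default:                           return "<unknown HRESULT>";
        }
    }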
This is another attempt at making files with sparse video frames work
better.
The problem is that you generally can't know whether a jump in video
timestamps is just a (very) long video frame, or a timestamp reset. Due
to the existence of files with sparse video frames (new frame only every
few seconds or longer), every heuristic will be arbitrary (in general,
at least).
But we can use the fact that if video is continuous, audio should also
be continuous. Audio discontinuities can be easily detected, and if that
happens, reset some of the playback state.
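In pseudo-C, the heuristic amounts to something like this (the threshold
and helper names are made up):

    // A jump in audio PTS beyond some tolerance indicates a timestamp
    // reset, so reset the playback state (decoders included).
    double expected = last_audio_pts + last_audio_duration;
    if (last_audio_pts != MP_NOPTS_VALUE &&
        fabs(frame_pts - expected) > DISCONTINUITY_TOLERANCE)
        reset_playback_state(mpctx);  // hypothetical helper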
The way the playback state is reset is rather radical (resets decoders
as well), but it's just better not to cause too much obscure stuff to
happen here. If the A/V sync code were to be rewritten, it should
probably strictly use PTS values (not this strange time_frame/delay
stuff), which would make it much easier to detect such situations and
to react to them.
It existed for XP-compatibility only. There was also a time where
ao_wasapi caused issues, but we're relatively confident that ao_wasapi
works better or at least as good as ao_dsound on Windows Vista and
later.
Normally, PulseAudio accepts any combination of sample format, sample
rate, channel count/map. Sometimes it does not. For example, the sample
rate or channel count have fixed maximum values. We should not fail
fatally in such cases, but attempt to fall back to a working format.
We could just pass an "unset" format to Pulse, but this is not too
attractive. Pulse could use a format which we do not support, and also
doing so much for an obscure corner case is not reasonable. So just pick
a format that is very likely supported.
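A sketch of such a fallback against libpulse's documented limits (the
concrete fallback values here are assumptions):

    #include <pulse/sample.h>

    static void fix_ss(pa_sample_spec *ss) {
        if (ss->rate > PA_RATE_MAX)
            ss->rate = 48000;            // very likely supported
        if (ss->channels > PA_CHANNELS_MAX)
            ss->channels = 2;
        if (!pa_sample_spec_valid(ss))
            ss->format = PA_SAMPLE_FLOAT32NE;
    }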
This still could fail at runtime (the stream could fail instead of going
to the ready state), but this sounds also too complicated. In
particular, it doesn't look like pulse will tell us the cause of the
stream failure. (Or maybe it does - but I didn't find anything.)
Last but not least, our fallback could be less dumb, and e.g. try to fix
only one of samplerate or channel count first to reduce the loss, but
this is also not particularly worth the effort.
Fixes #2654.
Given 5.1(side), this lets it pick 5.1 from [5.1, 7.1]. Which was
probably the original intention of this replacement stuff. Until now,
the opposite was done in some cases.
Keep the old heuristic if the replacement is not perfect. This would
mean that a subset of the channel layout is an inexact equivalent, but
not all of it.
(My conclusion is that audio output APIs should be designed to simply
take any channel layout, like the PulseAudio API does.)
Unify and clean up listing and selection. Use common enumerator code for both
operations to avoid duplication or inconsistencies.
Maintain, but significantly simplify, manual device selection by id, name or
number. This actually fixes loading by name which didn't really work before
since the "name" displayed by --audio-device=help differed from that used to
match the selection, which used the device "description" instead.
Save the selected deviceID in the private structure for later loading. This will
permit moving the device selection into the main thread in a future commit.
Apparently it's only Wine where the qpc_position returned by
IAudioClock_GetPosition can be overflowed. So actually do the rescaling
correctly, but throw away the result if it looks unreasonable.
This fixes a regression in commit 5afa68835a.
Make sure that subtraction of performance counters is done correctly.
Follow the *exact* instructions for converting a performance counter value to
something comparable to the qpc_position returned by IAudioClock::GetPosition:
https://msdn.microsoft.com/en-us/library/windows/desktop/dd370889%28v=vs.85%29.aspx
Also make sure that subtraction of unsigned integers is stored into a signed
integer to avoid nastiness. Also be more careful about overflow in the
conversion of the device position into number of samples.
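A sketch of the overflow-safe scaling, following the MSDN recipe of
splitting off whole seconds before converting to 100ns units:

    // QueryPerformanceCounter ticks -> 100ns units, without overflowing
    // the 64 bit intermediate for large tick values.
    static INT64 qpc_to_100ns(INT64 ticks, INT64 freq) {
        INT64 seconds = ticks / freq;
        INT64 frac    = ticks % freq;
        return seconds * 10000000 + frac * 10000000 / freq;
    }

    // store the difference of unsigned positions in a signed variable
    INT64 diff = qpc_to_100ns(perf_count, perf_freq) - (INT64)qpc_position;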
Avoid casting mp_time_us() to a double, and use llrint to convert the
double precision delay_us back to integer for ao_read_data.
Finally, actually check the return value of ao_read_data and add a verbose
message if it is not the expected value. Unfortunately,
there is no way to tell WASAPI when this happens since the frame_count in
ReleaseBuffer must match GetBuffer.
Do not try to set/get the master volume in exclusive mode if there is no
hardware support. This would just uselessly change the master slider,
but have no effect on the actual volume.
Furthermore, if getting hardware volume support information fails, assume
there is none.
It was complicated and not even very intuitive to the user.
If you are controlling the master volume, you just have to be
prepared to deal with the consequences.
A manually added af_volume could lead to muted audio when switching to a
new file. af_volume keeps the last volume set by AF_CONTROL_SET_VOLUME
to return it with AF_CONTROL_GET_VOLUME, but the initial value is 0. So
the mixer volume was forced to 0 when uninitializing the filter chain and
reading back the previously set volume.
If there were many AO drivers without device selection, this added a
"Default" entry for each AO. These entries were not distinguishable, as
the device list feature is meant not to require to display the "raw"
device name in GUIs.
Disambiguate them by adding the driver name. If the AO is the first, the
name will remain just "Default". (The condition checks "num > 1",
because the very first entry is the dummy for AO autoselection.)
Of course, only FFmpeg has av_clipd(), while Libav does not. (Nevermind
that it doesn't do much more than the mpv MPCLAMP() macro. Supposedly,
libavutil can provide optimized platform-specific versions for av_clip*,
but of course nothing actually does for av_clipf() or av_clipd().)
libswresample doesn't do it - although it should, but the patch is stuck
in limbo.
Probably reduces problems with artifacts on downmixing in some cases.
Remove known useless device entries from the --audio-device list (and
corresponding property). Do this because the list is supposed to be a
high level list of devices the user can select. ALSA does not provide
such a list (in a usable manner), and ao_alsa.c is still in the best
position to improve the situation somewhat.
The ALSA doxygen says:
    IOID - input / output identification ("Input" or "Output"), NULL
    means both
This bug was blatantly introduced with commit cf94fce4.
Apparently, some audio drivers do not support the DTS subtype, but
passthrough works anyway if the AC3 subtype is set. Just retry with
AC3 if the proper format doesn't work. The audio device which
exposed this behavior reported itself as
"M601d-A3/A3R (Intel(R) Display Audio)".
xbmc/kodi even always passes DTS as AC3.
Just set the ratio directly by working around the intended semantics of
the API function. The silly rounding stuff we had isn't needed anymore
(and not entirely correct anyway).
Note that since the compensation is virtually active forever, we need to
reset it if it's not needed. So always run this code, to be sure it gets
reset.
Also note that libswresample itself had a precision issue, until it
was fixed in FFmpeg commit 351e625d.
Deal with jittering Matroska crap timestamps. This reuses the mechanism
that is needed for frames without PTS, and adds a heuristic to it. If
the interpolated timestamp is less than 1ms away from the real one, it
might be due to Matroska timestamp rounding (or other file formats with
such rounding, or files remuxed from Matroska).
While there actually isn't much of a need to do this (audio PTS
jittering by such a low amount doesn't negatively influence much), it
helps with identifying jitter from other sources.
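Roughly (variable names are illustrative):

    // Prefer the smooth PTS interpolated from the sample count if the
    // demuxer PTS is within 1ms of it - likely just timestamp rounding.
    if (demux_pts != MP_NOPTS_VALUE &&
        fabs(demux_pts - interpolated_pts) < 0.001)
        pts = interpolated_pts;
    else
        pts = demux_pts;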
Instead of requiring the decoder to set the PTS directly on the
dec_audio context (including handling absence of PTS etc.), transfer the
packet PTS to the decoded audio frame. Marginally simpler, and gives
more control to the generic code.
Essentially we'd use something random, just because it's part of the set
of traditionally used ALSA channel mappings. But each driver can do its
own things.
This doesn't let me sleep at night, so remove it.
This could accidentally change some spdif formats to AAC (because AAC is
the first on the list and will match first). spdif formats are
inherently uninterchangeable, so treat them as their own class of
formats (like int vs. float).
Might fix some issues with ao_wasapi.c.
Actually, it didn't really require that before (most work was avoided),
but some bits had to be run anyway. Separate the speed change into a
light-weight function, which merely updates already created filters, and
a heavy-weight one which messes with filter insertion.
This also happens to fix the case where the filters would "forget" the
current speed (force resampling, change speed, hit a volume control to
force af_volume insertion - it will reset speed and desync).
Since we now always run the light-weight function, remove the
af_scaletempo verbose message that is printed on speed setting. Other
than that, all setters are cheap.
For some reason, the encoder didn't like that the AVPacket already had
fields set. I'm not quite sure, but this might just be invalid API
usage. Do it as it's recommended.
We need to effectively swap the last channel pair. See commits 4e358a96
and 5a18c5ea for details.
Doing this seems rather strange, as 7.1 just extends 5.1 with 2 new
speakers, and 5.1 doesn't need this change. Going by the HDMI standard
and the Intel HDA sources (cited in the referenced commits), it also
looks like 7.1 should simply append two channels to 5.1 as well. But
swapping them is apparently correct. This is also what XBMC does. (I
didn't find any other applications doing 7.1 PCM using the ALSA channel
map API. VLC seems to ignore the 7.1 case.) Testing reveals that at
least the end result is correct.
"Normal" ALSA 7.1 is unaffected by this, as it reports a different
(and saner) channel layout.
Instead of constructing an ALSA channel map from mpv ones from scratch,
try to find the original ALSA channel map again. The result is that we
need to convert channel maps only in one direction. If we need to map
a mp_chmap to ALSA, we fetch the device's channel map list, convert
each entry to mp_chmap, and find the first one which fits.
This seems helpful for the following commit. For now, this only gets rid
of mapping back the trivial MONO mapping, which alone would still be
acceptable, but with other channel layout mogrifications it gets messy
fast. While we need to do something awkward to keep our channel map
reordering for VAR chmaps (which basically gives nicer output and
possibly slightly better performance), this is still the better
solution.
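Schematically, with mp_chmap_from_alsa() as a stand-in name for the
ALSA -> mp_chmap conversion:

    snd_pcm_chmap_query_t **maps = snd_pcm_query_chmaps(pcm);
    snd_pcm_chmap_t *alsa_map = NULL;
    for (int i = 0; maps && maps[i]; i++) {
        struct mp_chmap entry;
        if (mp_chmap_from_alsa(&entry, &maps[i]->map) &&
            mp_chmap_equals(&entry, &requested)) {
            alsa_map = &maps[i]->map;  // first fitting entry wins
            break;
        }
    }
    // ... use alsa_map, then snd_pcm_free_chmaps(maps)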
This reverts commit 4e358a9636.
Testing shows the channel pairs must indeed be swapped (details see
commit message of the reverted commit). Making the downmix code move
sl/sr to sdl/sdr is not an appropriate solution anymore, and it's
better to fix the unusual channel layout in ao_alsa.c directly.
(Not reverting the change in chmap.c; this is still correct.)
ao_alsa: attempt to fix 7.1 over HDMI
The last 2 channels of 7.1 (RLC/RRC in ALSA) were exported as sdl/sdr
instead of sl/sr (I don't even know why I chose sdl/sdr, but SL/SR
and RLC/RRC are different in the ALSA API). libsw/avresample do not
move the sl/sr channels to sdl/sdr when rematrixing, so silence was
sent for 2 channels. If my selection of sdl/sdr is essentially API
abuse, there's no reason why they should do this differently.
The mess here is really that ALSA doesn't map the HDMI layouts cleanly.
Most ALSA drivers export 7.1 in a way compatible to our expectations,
but Intel HDA/HDMI does not:
    mpv/ffmpeg:   fl-fr-fc-lfe-bl-br-sl-sr
    ALSA/generic: FL FR FC LFE RL RR SL SR   [1]
    ALSA/HDMI:    FL FR LFE FC RL RR RLC RRC [2]
The HDMI layout is layout 0x13 (going by CEA-861-B). The comment in
the kernel code has to be correct too. The early standard defines only
1 other layout, which replaces RLC/RRC with FRC/FLC - this probably
corresponds to what we call "7.1(wide)".
So it appears when ALSA requests RLC/RRC, we should feed it sl/sr.
To make it more complicated, Kodi/xbmc apparently also have to deal with
ALSA being special, but instead of sending sl/sr to RLC/RRC, they swap
the last two pairs of the layout, and send sl/sr to RL/RR and bl/br to
RLC/RRC. Or I might have misunderstood their code. I don't have a
7.1-capable A/V receiver, so I can't test this.
For now, go with the simpler solution, and wait until someone tests it.
If the speakers end up swapped, a completely different solution will be
needed.
[1] https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/sound/core/pcm_lib.c?id=refs/tags/v4.3#n2434
[2] https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/sound/pci/hda/patch_hdmi.c?id=refs/tags/v4.3#n307
These calls actually can leave the ALSA configuration space empty (how
very useful), which is why snd_pcm_hw_params() can fail. An earlier
change intended to make this non-fatal, but it didn't work for this
reason.
Backup the old parameters, so we can retry with the non-empty
configuration space. (It has to be non-empty, because the previous
setters didn't fail.)
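The retry pattern, roughly (set_buffer_params() stands in for the actual
buffer setters):

    snd_pcm_hw_params_t *params_old;
    snd_pcm_hw_params_alloca(&params_old);
    snd_pcm_hw_params_copy(params_old, params);  // backup

    if (set_buffer_params(pcm, params) < 0) {
        // the setters can leave the configuration space empty; retry
        // with the backed-up, known non-empty space
        snd_pcm_hw_params_copy(params, params_old);
    }
    err = snd_pcm_hw_params(pcm, params);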
Note that the buffer settings are not very important to us. They're
a leftover from MPlayer, which needed to write enough data to the
audio device to not underrun while decoding and displaying a video
frame. In mpv, most of these things happen asynchronously, _and_
there is a dedicated thread just for feeding the audio device, so
we should be pretty immune even against extreme buffer settings. But
I suppose it's still useful to prevent PulseAudio from making the
buffer too large, so still keep this code.
Again, this could have bad access, is unlikely, and has no bad
consequences. It's noteworthy that vlc and the ALSA PCM example both do
this first, even if they set the sample rate later.
I'm worried that not restricting the access type before restricting the
format will cause problems. While it's unlikely, it might prevent
failures in some corner cases. Also, since we by default always use
interleaved access (because of buggy ALSA plugins), this will have no
effect at all.
If the API doesn't list padded channel maps, but the final device
channel map is padded, and if unpadded output is not possible (unlike in
the somewhat similar dmix case), then we shouldn't apply the channel
count mismatch fallback in the beginning. Do it after channel map
negotiation instead.
Doesn't matter much; effectively this prevents just log spam in some
cases where the map is legitimately padded. Normally this is really
only needed for the dmix ALSA case. (See git blame for details.)
av_free_packet() got finally deprecated. Use av_packet_unref() instead,
which has almost the same semantics, has existed for a while, and is
available in all FFmpeg and Libav versions we support.
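The change is mechanical:

    av_free_packet(&pkt);   // deprecated
    av_packet_unref(&pkt);  // replacement with almost the same semantics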
Until recently, the channel layout code happened to catch this, but now
an explicit check is needed. Otherwise, it'd try to pad the missing
channels with NA in the channel map fallback code.
This is intended for the case when CoreAudio returns only unknown
channel layouts, or no channel layout matches the number of channels the
CoreAudio device forces. Assume that outputting stereo or mono to the
first channels is safe, and that it's better than outputting nothing.
It's notable that XBMC/kodi falls back to a static channel layout in
this case. For some messed up reason, the layout it uses happens to
match with the channel order in ALSA's/mpv's "7.1(alsa)" layout.
Share some code between ca_init_chmap() and ca_get_active_chmap(), which
also makes it look slightly nicer. No functional changes, other than the
additional log message.
If no channel layouts were determined (which can actually happen with
some "strange" devices), the selection code was falling back to mono,
because mono is always added as a fallback. This doesn't seem quite
right.
Allow a fallback to stereo too, if no channel layout could be retrieved
at all. So we always assume that mono and stereo work, if no other
layouts are available.
(I still don't know what the CoreAudio stereo layout is supposed to do.
It could be used to swap left and right channels. It could also be used
to pad/move the channels, but I have never seen that. And it can be set
to non-stereo channels, which breaks mpv. Whatever.)
The main reason is that ao_coreaudio_exclusive needs this for some OSX
devices. They want packed audio, and special-casing this in the
coreaudio code would be too much of a pain.
The maximum of channels we can support is 64 (because FFmpeg uses 64 bit
masks for channel layouts), but since struct mp_audio can get pretty
big (has static allocations of 2 pointers for each channel for planar
mode), it's less wasteful to stay lower for now.
av_get_default_channel_layout() fails with channel counts larger than 8.
The channel layout doesn't need to make sense, so pick an arbitrary
fallback.
libswresample also has options for setting the channel counts directly,
but better not introduce new concepts in the code. Also, libavresample
doesn't have these options.
Change it so that it will always return a bitmask with the correct
number of channels set if an unknown channel map is passed. This didn't
work for channel counts larger than 8, as there are not any standard
channel layouts defined with more than 8 channels (both in mpv and
FFmpeg). Instead, it returned 0.
This will help when raising the maximum allowed channel count in mpv.
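For unknown maps, the result is now simply a mask with the correct number
of bits set, roughly:

    // Unknown channel map: return an arbitrary set of speakers with the
    // correct count, instead of 0 for counts > 8.
    if (mp_chmap_is_unknown(chmap))
        return chmap->num == 64 ? UINT64_MAX : (1ULL << chmap->num) - 1;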
Some code in af_lavrresample relies on it, more or less.
One change is that unknown channel maps won't result in lavc standard
channel layouts anymore, just a set of random speakers. This should be
fine, as the caller of mp_chmap_to_lavc_unchecked() should handle these
cases. For mp_chmap_reorder_to_lavc() this is not so clear anymore, but
should also be ok.
For normal channel maps, simply dropping NA channels is still the
correct and wanted behavior.
Currently, the mpv maximum channel count is 8. This commit is
preparation for raising this limit.
mNumberChannelDescriptions being 0 is pretty much an error, but if it
can happen, then the code checking the chmap below will trigger UB, as
chmap is not initialized at all.
Also, simplify the code a little: we never change the number of
channels, so this is just fine.
This code removes filters which cannot take spdif input. This was made
so that PCM filters are transparently dropped in spdif mode.
This entered an endless loop with:
    --af=lavcac3enc:::2 --audio-channels=5.1
The forced number of output channels is incompatible with spdif. It's
trying to insert af_lavrresample as conversion filter to compensate for
it. Of course this doesn't work, which triggers the PCM filter removal.
Then it goes on normally - since the new state is exactly as before, it
will try the same thing again, forever.
Fix by reusing the retry counter, which is a very dumb but very
effective measure against these cases of filter negotiation failure. We
could try to be more clever (for example, if the removed filter is a
conversion filter, we can be sure this won't work, and error out
immediately). But better keep it simple and robust.
CoreAudio gives us a channel map with all entries set to
kAudioChannelLabel_Unknown. This is translated to a mpv channel map with
all channels set to NA, which has special meaning: it's an "unknown"
channel map, which acts as wildcard and can be converted from/to any
channel layout. Not really what we want.
I've got this with USB audio, playing stereo. The multichannel layout
consisted of 2 unknown channels, while the stereo channel map was
stereo (as expected).
Note that channel maps with _some_ NA entries are not affected by this,
and must still work.
If the device returns an unexpected number of channels instead of the
requested count on init, don't immediately error out. Instead, look if
there's a channel map with the given number of channels.
If there isn't, still error out, because we don't want to guess the
channel layout.
Reportedly fixes operation with "USB connected Parasound ZDAC v.2". (OSX
and USB audio sure is not nice at all.)
This might be perceived as a hang by some users, so it's quite possible
that this will have to be adjusted again somehow.
Fixes #2409.
Small adjustments to the playback speed use swr_set_compensation()
to stretch the audio as it is required. But since large adjustments
are now handled by actually reinitializing libswresample, the small
adjustments get rounded off completely with typical frame sizes.
Compensate for this by accounting for the rounding error and keeping
track of fractional samples that should have been output to achieve
the correct ratio.
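The bookkeeping is a simple error accumulator (the field name is made
up):

    // Carry the fractional part of the ideal output sample count over
    // to the next frame, so rounding errors don't accumulate.
    double ideal_out = in_samples / playback_speed + s->missing_samples;
    int out_samples = (int)ideal_out;
    s->missing_samples = ideal_out - out_samples;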
This fixes display sync mode behavior, which requires these adjustments
to be relatively accurate.
swr/avresample_set_compensation() was made for small speed adjustments.
Non-documentation says it should be used for changes not larger than 1%,
so reinitialize the sampler if the change is larger than that.
swr_set_compensation() changes the apparent sample rate on the fly (who
would have guessed). It is thus very well-suited for adjusting audio
speed on the fly during playback (as needed by the display-sync mode).
It skips the relatively slow resampler reinitialization.
If this doesn't work (libswresample soxr backend), then fall back to the
old method.
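A sketch of the decision, with the 1% threshold from above
(reinit_resampler() is a stand-in):

    double ratio = new_speed / cur_speed;
    if (fabs(ratio - 1.0) > 0.01) {
        reinit_resampler(s, new_speed);   // large change: full reinit
    } else if (swr_set_compensation(s->avrctx,
                    (int)((ratio - 1.0) * out_rate), out_rate) < 0) {
        reinit_resampler(s, new_speed);   // e.g. soxr backend: fall back
    }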
The previous commit handled not falling back to normal decoding if the
AO was reloaded (I think...), and this tries to re-engage spdif pass-
through if it was previously falling back to normal decoding (e.g.
because it temporarily switched to an audio device incapable of
passthrough).
The manpage entry explains this.
(Maybe this option could be always enabled and removed. I don't quite
remember what valid use-cases there are for just disabling audio
entirely, other than that this is also needed for audio decoder init
failure.)
Make the code a bit more uniform. Always build a "dummy" audio output
list before probing, which means that opening preferred devices and
pure auto-probing is done with the same code. We can drop the second
ao_init() call.
This also makes the next commit easier, which wants to selectively
fallback to ao_null. This could have been implemented by passing a
different requested audio output list (instead of reading it from
MPOptions), but I think it's better if this rather special feature
is handled internally in the AO code. This also makes sure the AO
code can handle its own options (such as the audio output list) in
a self-contained way.
This can happen with USB audio. There was already code for this, but
something in mpv and ALSA changed - and now the old code is not
necessarily triggered anymore. It probably depends on the exact
situation.
This could sometimes cause crashes in hotplug events. (Apparently in
cases when CoreAudio changes its state asynchronously, or such.)
CA_GET_STR() does not set the string if there was an error, so errors
have to be strictly checked before using it.
This flag was used by some filters and made sure none of these filters
were inserted twice. This triggers only if the user explicitly tries to
add multiple filters (and not e.g. due to auto-insertion), so at best
this warned the user against doing something potentially pointless. At
worst, it blocked some (mildly) legitimate use-cases. Get rid of it.
Also see #2322.
The reason MPlayer traditionally duplicated them all over the place is
that it wanted every component to be a self-contained library (e.g.
audio filters were in "libaf"). But this is not necessarily helpful, and
this change makes the following commit a bit simpler.
* (de)planarize: -1
* pad 1 byte: -8
* truncate 1 byte: -1024
* float -> int: -1048576 * (8 - dst_bytes)
* int -> float: -512
Now the score is negative if and only if the conversion is lossy
(previously, e.g. s24 -> float was given a negative (lossy) score).
However, int -> float is still considered bad
(s16 -> float is worse than s16 -> s32).
This penalizes any loss of precision more than performance / bandwidth hits.
For example, previously s24->s16p was considered equal to s24->u8.
Finally, we penalize padding more than (de)planarizing as this will
increase the output size for example with ao_lavc.
This is just a refactor, which makes it use the previously introduced
function, and allows us to make af_format_conversion_score() private.
(We drop 2 unlikely warning messages too... who cares.)
This mixed up the returned score for some interleaved/non-interleaved
comparisons. Changing interleaving subtracted 1 point, while extending
sample size by 1 byte also subtracted 1 point.
(This scoring system is not ideal - it'd be much cleaner to do a 3-way
sample format comparison instead, and sort the formats according to the
comparison instead of the score.)
Not sure why struct af_resample_opts even exists. It seems useful to
group the fields set by user options. But storing the current format
conversion parameters doesn't seem very elegant, and having a separate
instance in the "ctx" field isn't helpful either.
Some users still use this filter, so the filter was going to be kept.
But I overlooked that libavfilter provides this filter. Remove the
redundant wrapper from mpv. Something like --af=lavfi=bs2b should work
and give exactly the same results.
All of these filters are considered not useful anymore by us. Some have
replacements in libavfilter (useable through af_lavfi).
af_center, af_extrastereo, af_karaoke, af_sinesuppress, af_sub,
af_surround, af_sweep: pretty simple and useless filters which probably
nobody ever wants.
af_ladspa: has a replacement in libavfilter.
af_hrtf: the algorithm doesn't work properly on most sources, and the
implementation was buggy and complicated. (The filter was inherited from
MPlayer; but even in mpv times we had to apply fixes that fixed major
issues with added noise.) There is a ladspa filter if you still want to
use it.
af_export: I'm not even sure what this is supposed to do. Possibly it
was meant for GUIs rendering audio visualizations, but it couldn't
really work well. For example, the size of the audio depended on the
samplerate (fixed number of samples only), and it couldn't retrieve the
complete audio, only fragments. If this is really needed for GUIs, mpv
should add native visualization, or a proper API for it.
So snd_device_name_get_hint() return values do in fact have to be freed.
Also, change listing semantics slightly: if io==NULL, skip the entry,
instead of assuming it's an output device.
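The listing pattern, with the freeing that was missing (add_device() is a
stand-in; error handling trimmed):

    void **hints;
    if (snd_device_name_hint(-1, "pcm", &hints) == 0) {
        for (void **n = hints; *n; n++) {
            char *name = snd_device_name_get_hint(*n, "NAME");
            char *io   = snd_device_name_get_hint(*n, "IOID");
            // io == NULL: skip, instead of assuming it's an output
            if (name && io && strcmp(io, "Output") == 0)
                add_device(list, name);
            free(name);  // the get_hint() results must be freed
            free(io);
        }
        snd_device_name_free_hint(hints);
    }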
Revert "win32: more wchar_t -> WCHAR replacements"
Revert "win32: replace wchar_t with WCHAR"
Doing a "partial" port of this makes no sense anymore from my
perspective. Revert the changes, as they're confusing without
context, maintenance, and progress. These changes were a bit
premature anyway, and might actually cause other issues
(locale neutrality etc. as it was pointed out).
This was essentially missing from commit 0b52ac8a.
Since L"..." string literals have the type wchar_t[], we can't use them
for UTF-16 strings. Use C11 u"..." string literals instead. These have
the type char16_t[], but we simply assume char16_t is the same
underlying type as WCHAR. In practice, they're both unsigned short.
For this reason use -std=c11 on Windows. Since Windows is a "special"
environment (we require either MinGW or Cygwin), we don't need to worry
too much about compiler compatibility.
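For example:

    #include <uchar.h>

    // u"..." has type char16_t[]; in C (unlike C++), char16_t is a
    // typedef, and on Windows both it and WCHAR are unsigned short.
    static const WCHAR classname[] = u"mpv";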
WCHAR is more portable: while at least MinGW, Cygwin, and MSVC actually
use a 16 bit wchar_t, Midipix will have a 32 bit wchar_t, whereas WCHAR
is 16 bit on all of them.
This affects only non-MinGW parts, so not all uses of wchar_t need to
be changed. For example, terminal-win.c won't be used on Midipix at
all. (Most of io.c won't either, so the search & replace here is more
than necessary, but also not harmful.)
(Midipix is not useable yet, so this is just preparation.)
This was a minor optimization to potentially avoid resampler
reconfiguration when the filter is reinitialized. But filter
reinitialization is a rare event, and the case when no reconfiguration
is needed is even rarer. As such, this is an unnecessary micro-
optimization and only adds potential for bugs.
This message bloats verbose log output if e.g. audio speed is frequently
readjusted, such as when syncing audio to video. So don't print the
message if only speed is changed. (This case requires reconfiguration,
but can't change the input/output channel maps.)
Also do not print the message if no remixing is done at all.
Some filter chains require a huge number of auto-inserted conversion
filters. There is an overly stupid safeguard against infinite filter
insertions, which counts the number of conversion filters inserted. This
triggered accidentally in this case. Fix by resetting this counter after
a non-conversion filter was successfully configured.
ao_coreaudio (using AudioUnit) accounted only for part of the latency -
move the code in ao_coreaudio_exclusive to utils, and use that for the
AudioUnit code.
(There's still the question why CoreAudio and AudioUnit require you to
jump through hoops this much, but apparently that's how it is.)
Until now, this was for AC3 only. For PCM, we used AudioUnit in
ao_coreaudio, and the only reason ao_coreaudio_exclusive exists
is that there is no other way to passthrough AC3.
PCM support is actually rather simple. The most complicated
issue is that modern OS X versions do not support passing
the data through untouched; instead everything must go
through float. So we have to deal with the virtual and
physical formats being different, which causes some complications.
This possibly also doesn't support some other things correctly.
For one, if the device allows non-interleaved output only, we
will probably fail. (I couldn't test it, so I don't even know
what is required. Supporting it would probably be rather
simple, and we already do it with AudioUnit.)
Mapping of spdif formats was imperfect. Since the first format on the
list is somehow AAC, it was returned first, which is confusing, because
CoreAudio calls all spdif formats AC3. And since the mapping of spdif
formats is rather arbitrary anyway, reverse mapping the formats didn't
actually work either. Fix by explicitly ignoring these when spdif is used.
Also, don't forget to set the samplerate in ca_asbd_to_mpformat(), or it
will work only in some cases.
May help with (supposedly) bad drivers, which can put the device into
some sort of broken state when trying to set a different physical
format. When the previous format is restored, it apparently recovers.
This might make the change-physical-format suboption more robust.
We can be pretty sure that AudioUnit will remix for us.
Before this commit, we usually upmixed to stereo, because the
stereo and multichannel layouts were the only whitelisted ones.
Replace all the check macros with function calls. Give them all the
same case and naming schema.
Drop af_fmt2bits(). Only af_fmt2bps() survives as af_fmt_to_bytes().
Introduce af_fmt_is_pcm(), and use it in situations that used
!AF_FORMAT_IS_SPECIAL. Nobody really knew what a "special" format
was. It simply meant "not PCM".
Audio formats used a semi-clever schema to encode the properties of the
PCM encoding as bitfields into the format integer value.
The af_fmt_change_bits() implementation becomes a bit weird, but it's
an improvement to the rest of the code.
(I've always disliked it, so why not get rid of it.)
This may or may not fix some issues with the format switching
code. Actually, it seems somewhat unlikely, but then checking
the stream type isn't incorrect either, and is probably
something the API user should always be doing.
Originally, this was written for comparing the sample format only, but
ca_change_physical_format_sync() actually expects that the full format
is compared. (For all other uses it doesn't matter.)
The speaker replacement nonsense sometimes made blatantly incorrect
decisions. In this case, it preferred a 7.1(rear) upmix over outputting
5.1(side) as 5.1, which makes no sense at all. This happened because 5.1
and 7.1(rear) appeared equivalent to the final selection, as both of
them lose the sl-sr channels. The old code was too stupid to select the
one with the lower number of channels as well.
Redo this. There's really no reason why there should be a separate final
decision, so move the speaker replacement logic into the
mp_chmap_is_better() function.
Improve some other details. For example, we never should compare the
plain number of channels for deciding upmix/downmix, because due to NA
channels this is essentially meaningless. Remove the NA channels when
doing this comparison. Also, explicitly handle exact matches.
Conceptually this is not necessary, but it avoids that we have to
needlessly shuffle audio data around.
This avoids keeping "bad" state from previous reconfig calls, such as
the internal_sample_format option (which is set only on the first
reconfig call).
There's no advantage to keeping the resample contexts around anyway.
Basically, af_fix_format_conversion() behaves stupidly if you insert a
conversion filter that won't work, and adding back the conversion test
function is the simplest fix for it.
So apparently, this essentially happens when the kernel driver doesn't
implement write accesses in the channel map control. Which doesn't
necessarily mean that the channel map is unsupported, or that there is a
bug - it's just laziness and a consequence of the terrible ALSA kernel
API for the channel mapping stuff.
In these cases, the channel count implicitly selects the channel map,
and snd_pcm_set_chmap() always fails with ENXIO.
I'm actually not sure what happens if dmix is on top of e.g. HDMI, which
actually lets you change the channel mapping.
I'm also not sure why commit d20e24e5d1614354e9c8195ed0b11fe089c489e4
(alsa-lib git repository) does not take care of this.
MPlayer traditionally had completely separate sh_ structs for
audio/video/subs, without a good way to share fields. This meant that
fields shared across all these headers had to be duplicated. This commit
deduplicates essentially the last remaining duplicated fields.
This attempted to find a minimal filter graph for a format conversion
involving multiple conversion filters. With the last 2 commits it
becomes dead code - remove it.
Now af_lavrresample can output 24 bit samples directly, by doing the
conversion "inline". Luckily, S32->S24 can be done in-place, so this
isn't too much work. But the output conversion logic (which seems to be
adding up) gets slightly more complicated again.
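In-place S32 -> S24 works because the write index never catches up with
the read index (little endian sketch, dropping the low byte):

    static void s32_to_s24_inplace(void *data, int num_samples) {
        uint32_t *src = data;
        uint8_t *dst = data;
        for (int i = 0; i < num_samples; i++) {
            uint32_t v = src[i];       // read before overwriting
            dst[3 * i + 0] = v >> 8;
            dst[3 * i + 1] = v >> 16;
            dst[3 * i + 2] = v >> 24;
        }
    }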
Normally this is done by af_convert24. But having multiple conversion
filters complicates some aspects of the filter chain. S24 output is the
only thing the code for multiple conversion filters is still needed for,
and getting rid of that is preferable.
If the code path for additional output conversion is active,
reorder_planes() is always called, even if the reorder_out array wasn't
filled. This is obviously wrong - always fill this array.
They are useless. Not only are they rarely used; libavcodec doesn't even
output them, as it has no such sample formats for decoded audio.
Even if it should happen that we actually still need them (e.g. for
direct hardware output), there are better solutions. Swapping the sign
is a fast and lossless operation and can be done in-place, so an AO
actually needing it could do this directly.
If you wonder why we keep U8 instead of S8: because libavcodec does it.
Channel maps reported by the device as SND_CHMAP_TYPE_VAR can be freely
reordered. We don't use this much (out of laziness), but in this case
it's a simple way to reduce necessary reordering (which would be an
extra libavresample invocation), and to make debug output more readable.
Until now, we didn't do this, because it required some effort, and
didn't seem to be necessary. It probably still isn't, but it sounds
like a good idea not to output arbitrary data on these channels.
The situation is complicated by the fact that just adding new channels
to a planar frame would require messing with buffers. So we would have
to allocate new buffers and add them to the frame. We could have to
maintain an extra buffer pool for this. Avoid this by being "clever",
and just allocate a frame with enough channels in the first place.
libav/swresample won't know about these channels and won't write to
them, but we can grab them in reorder_planes() and use them for the
NA channels.
This is just a conceptual issue, since for now every channel count has
an associated standard layout.
But should the max. channel count ever be bumped, some things would stop
functioning if mp_chmap_from_channels() refused to work for any channel
count within the allowed range.
In the AVFrame-style system (which we increasingly map our internal data
structures onto), buffers and plane pointers don't necessarily have a 1:1
correspondence. For example, a single buffer could cover 2 or more
planes, all while other planes are covered by a second buffer, and so
on. They don't need to be ordered in the same way.
Change mp_audio_get_allocated_size() to retrieve the maximum size all
planes provide. This also considers the case of planes not pointing to
buffer start.
Change mp_audio_realloc() to reset all planes, even if corresponding
buffers are not reallocated. (The caller has to be careful anyway if it
wants to be sure the contents are preserved on realloc calls.)
If you try to play surround with dmix, it will advertise surround and
lets you set more than 2 channels, but will report a stereo channel map,
with the extra channels identified as NA. We could handle this now, but
we don't want to (because it's excessively stupid).
Do it only if the channel map is not what we requested, instead of acting
whenever it contains NA entries at all. This avoids hurting ourselves in
the unlikely but possible case that we actually have to use
channel maps with NA entries.
If the audio API takes a while for starting the audio callback, the
current heuristic can be off. In particular, with very short files, it
can happen that the audio callback is not called before playback is
stopped, so no audio is output at all.
Change draining so that it essentially waits for the ringbuffer to
empty. The assumption is that once the audio API has read the data
via the callback, it will always output it, even if the audio API
is stopped right after the callback has returned.
If a frame could only be partially filled with real audio data, the
silence wasn't written at the correct offset. It could have happened
that the remainder of the frame contained garbage.
(This didn't happen in the more common case of playing dummy silence.)
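The fix amounts to writing the silence at the correct byte offset
(af_fill_silence() is the existing helper; the surrounding names are
assumptions):

    // 'filled' valid samples at the start of the frame; pad the rest.
    char *dst = (char *)buf + filled * sstride;
    af_fill_silence(dst, (frame_samples - filled) * sstride, format);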
This provides a new method for enabling spdif passthrough. The old
method via --ad (--ad=spdif:ac3 etc.) is deprecated. The deprecated
method will probably stop working at some point.
This also supports PCM fallback. One caveat is that it will lose at
least 1 audio packet in doing so. (I don't care enough to prevent this.)
(This is named after the old S/PDIF connector, because it uses the same
underlying technology as far as the higher level protocol is concerned.
Also, the user should be reminded that passthrough is backwards.)
This deprecates the --ad-spdif-dtshd option, and replaces it with a
pseudo decoder. This means ad_spdif will report two decoders, "dts" and
"dts-hd", of which the second simply enables what the option did.
The --ad-spdif-dtshd option will actually be deprecated in the next
commit.
This is better, because now we call swr_get_delay() with the output
samplerate, instead of with the input samplerate and then multiplying it
with the ratio and rounding it up.
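I.e., roughly:

    // delay directly in output-samplerate units
    int64_t delay = swr_get_delay(s->avrctx, out_rate);
    // instead of: ceil(swr_get_delay(s->avrctx, in_rate) * ratio)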
Listening to kAudioDevicePropertyDeviceHasChanged does not send any
property change notifications when the device dies. Makes no sense,
but I suppose in CoreAudio logic a dead/removed device can't send
any notifications.
This caused the player to essentially pause playback if the audio
device was removed during playback.
Fix by listening to the kAudioHardwarePropertyDevices property too,
which will actually be sent in this specific case. Then, if
querying the already dead device fails, we know we have to reload.
In short, instead of letting the coreaudio property listener set atomic
flags (which are then polled), make the property listeners actually
active.
The format change listener used during audio output now simply calls
ao_request_reload() on its own. All code involved is thread-safe, so
there's no need to do it during this audio callback (we assumed the
callback was never run concurrently with itself).
The listener installed temporarily during ca_change_format() is changed
to post a semaphore. Get rid of the weird retry logic and replace it
with a flat loop + timeout. It appears the maximum wait time could be
2500ms; reduce the total timeout to 500ms instead.
This manually retrieved the remaining audio from the resampler. It
subtly missed a conversion, which could lead to an unsubtle crash.
This could happen if reorder_planes() was supposed to insert NA
channels, and the resampler/actual output format were different.
Simplify it by reusing the normal drain path. One oddness is that
the filter will add an output frame outside of normal filtering,
but that should be fine.
This brings the volume control closer to what is perceived as a linear
volume change.
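The commit doesn't name the exact curve; a cubic mapping is a common
choice for perceptually linear volume and illustrates the idea:

    // assumed curve, for illustration: 100 -> gain 1.0, 0 -> silence
    float gain = powf(volume / 100.0f, 3.0f);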
Adjust the --softvol-max default to roughly the old maximum (roughly
doubles the gain).
Now --volume takes an absolute volume, meaning it doesn't depend on
--softvol-max. 0 is still silence, and 100 now always means unchanged
volume. The OSD and the "volume" property are changed accordingly.
Also raise the minimum value of --softvol-max. A value below 100 makes
no sense and breaks the OSD.
Apparently some A/V receivers do not behave well if "normal" DTS is
passed through using the high bitrate spdif format normally used for
DTS-HD (other receivers are fine with it).
Parse the first packet passed to ad_spdif by decoding it with libavcodec
in order to get the profile. Ignore --ad-spdif-dtshd if the stream is not
DTS-HD. (If the codec profile changes midstream, the user is out of
luck. But this is probably an insignificant corner case.)
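A sketch of the probe (error checks trimmed; the libavcodec profile
constants are the real ones):

    AVCodec *dec = avcodec_find_decoder(AV_CODEC_ID_DTS);
    AVCodecContext *avctx = avcodec_alloc_context3(dec);
    avcodec_open2(avctx, dec, NULL);
    AVFrame *frame = av_frame_alloc();
    int got = 0;
    avcodec_decode_audio4(avctx, frame, &got, first_pkt);
    bool dts_hd = avctx->profile == FF_PROFILE_DTS_HD_HRA ||
                  avctx->profile == FF_PROFILE_DTS_HD_MA;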
I thought about parsing the bitstream, but let's not. While it probably
wouldn't be that much effort, we are trying to keep it down on codec
details here - otherwise we could just do our own spdif framing instead
of using libavformat's spdif pseudo-muxer.
Another possibility, using the codec parameters signalled by
libavformat, is disregarded. Our builtin Matroska decoder doesn't do
this, and also we do not want on the demuxer having to decode some
packets in order to retrieve codec params (as libavformat does).
Fixes #1949.
There is not much of a reason to have these wrappers around. Use POSIX
standard functions directly, and use a separate utility function to take
care of the timespec calculations. (Curse POSIX for using this weird
format for time values.)
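The utility function is straightforward (the name is illustrative):

    #include <time.h>

    // Relative wait in microseconds -> the absolute timespec that
    // pthread_cond_timedwait() wants.
    static struct timespec timespec_after_us(int64_t wait_us) {
        struct timespec ts;
        clock_gettime(CLOCK_REALTIME, &ts);
        ts.tv_sec  += wait_us / 1000000;
        ts.tv_nsec += (wait_us % 1000000) * 1000;
        if (ts.tv_nsec >= 1000000000) {
            ts.tv_nsec -= 1000000000;
            ts.tv_sec  += 1;
        }
        return ts;
    }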