This is not done automatically by CoreAudio. I am told that it would be a PITA
to have to switch back the format manually on the device (especially if the
same device is used for lpcm output).
b2f9e0610 introduced this functionality with code that was quite 'monolithic'.
Split the functionality over several functions and use the new macros to get
array properties.
Introduce some macros to deal with properties. These make it possible to
work around the limitation of CoreAudio's API being `void **` based. The
macros keep their clients' code DRY, by not asking for the size and other
details that can be derived by the macro itself. I have no idea why Apple
didn't design their API like this in the first place.
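The idea, as a minimal sketch (illustrative names, not the actual code):

    #include <CoreAudio/CoreAudio.h>

    // The macro derives the property data size from the pointed-to type,
    // so callers don't have to pass sizeof(...) around by hand.
    #define CA_GET(id, sel, pval) ca_get(id, sel, sizeof(*(pval)), pval)

    static OSStatus ca_get(AudioObjectID id, AudioObjectPropertySelector sel,
                           UInt32 size, void *data)
    {
        AudioObjectPropertyAddress addr = {
            .mSelector = sel,
            .mScope    = kAudioObjectPropertyScopeGlobal,
            .mElement  = kAudioObjectPropertyElementMaster,
        };
        return AudioObjectGetPropertyData(id, &addr, 0, NULL, &size, data);
    }

A caller can then write `CA_GET(device, kAudioDevicePropertyNominalSampleRate,
&rate)` with a plain `Float64 rate;` and no size bookkeeping.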
* ao_coreaudio_utils: contains several utility functions
* ao_coreaudio_properties: contains functions to set and get audio object
properties.
The condition was wrongly checked on asbd, which is the input format
description. This led to the condition always being true, thus selecting
lpcm streams for digital input.
The initialization is split more clearly between the compressed and lpcm
cases.
For the compressed case, format selection is simplified a lot and negotiation
removed. The way it was written, it just passed the originally requested
format back to the core, not what was actually available on the hardware.
Since negotiation is most likely useless for the compressed case, I didn't
bother with it. In the future I'd like to split this AO in two: one that only
uses the AUHAL, and another with direct access to the hardware, so that even
passthrough of lpcm is possible. This would decrease latency; audiophiles
would like that.
Split out some utility functions that use the CoreAudio API but are not
related to the main task of the AOs (which is to move data correctly to
the ringbuffer). These are mainly needed because of the verbosity of the
CoreAudio API, and they just obscure the 'real' code.
Read only the amount requested by the AUHAL (instead of all the buffered
data). No idea what the deal was with pausing the audio units if there is
no audio to play; maybe to avoid underruns of some sort. Anyway, in my
tests this condition never occurred, so I'm removing it all.
Make the VF/VO/AO option parser available to audio filters. No audio
filter uses this yet, but it's still quite an intrusive change.
In particular, the commands for manipulating filters at runtime
completely change. We delete the old code, and use the same
infrastructure as for video filters. (This forces complete
reinitialization of the filter chain, which hopefully isn't a problem
for any use cases. The old code forced reinitialization too, but it
could potentially allow a filter to cache things; e.g. consider loaded
ladspa plugins and such.)
This code is supposed to run if dynamic filter insertion (such as when
inserting a volume filter in mixer.c) fails. Then it removes all filters
and recreates the default list of filters. But the code just blew up and
entered an endless loop, because it removed even the sentinel in/out
filters. This could happen when trying to use softvol controls while
using spdif, but also other situations. Fix it by calling the correct
code.
Also remove those obnoxious Yoda conditions.
MSDN tells me to multiply the sample rates by 4 (for setting up the S/PDIF
signal frequency), but doesn't mention that I'm only supposed to do it
on the new, NT6.1+ IEC 61937 structs. Works on my Realtek Digital Output,
but as I can't connect any hardware to it I can't hear the result.
Also, always ask for little-endian AC3. I'm not sure if this is supposed
to be LE or NE, but Windows is LE on all platforms, so we go with LE.
Entirely untested as this troper has no S/PDIF hardware.
Refuse to try any other format if we can't use passthrough; otherwise we
would end up sending white noise at the user.
Do an strstr match against the device description and, if we have only
a single match, take it. This works as long as the devices in the system
don't change, but it's not supposed to be reliable; if one wants
reliability, one uses the device ID string.
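Roughly what the matching does (sketch; names assumed):

    // accept the substring only if it matches exactly one description;
    // ambiguity means we refuse to guess
    static int find_device(char **descs, int count, const char *search)
    {
        int match = -1;
        for (int i = 0; i < count; i++) {
            if (strstr(descs[i], search)) {
                if (match >= 0)
                    return -1;      // more than one match: give up
                match = i;
            }
        }
        return match;               // -1 if nothing matched
    }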
Formatting.
This could turn valid parameters into syntax errors by the mere presence
or absence of a device (e.g. USB audio devices), so don't do that.
We do validate that, if the parameter is an integer, it is not negative.
We also respond to the "help" parameter, which does the same as the "list"
suboption but exits after listing.
Demote the validation logging to MSGL_DBG2.
Validates by trying to pick the device using the device enumerator and
aborting with out of range on failure.
Refactors find_and_load_device to not use the wasapi_state; it might be
called during validation. Adds missing CoInitialize/CoUninitialize calls.
Remove unused variables (the SAFE_RELEASE macros keep them referenced, so
compiler warnings don't help find them...).
Remove the IMMDeviceEnumerator from the wasapi_state, it's only needed
during initialization and initialization is now well factored enough to
get rid of it.
Try and connect to unplugged devices as well when using the device ID
string.
Omit "{0.0.0.00000000}." on devices that start with that substring,
re-add when searching for devices by ID.
Log the device ID of the default device.
Log the friendly name of the used device.
Consistently refer to endpoints/devices as devices, as this matches mpv
terminology.
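The ID shortening mentioned above is essentially this (sketch; wide
strings, since WASAPI device IDs are LPWSTR):

    static const wchar_t prefix[] = L"{0.0.0.00000000}.";
    size_t plen = wcslen(prefix);
    if (!wcsncmp(id, prefix, plen))
        id += plen;    // omit for display; prepended again for ID lookups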
Use WASAPI in shared mode by default, and add an :exclusive flag to choose
exclusive mode (duh). WASAPI works somewhat differently in shared mode:
the OS suggests the sample format to use, and the GetBuffer call is
done slightly differently.
The shared mode driver does not consume audio as fast as it notifies
the thread; we need to check how much we're allowed to write. Not doing
this correctly results in spamming the console with
AUDCLNT_E_BUFFER_TOO_LARGE errors.
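The check looks roughly like this (sketch with assumed variable names;
requires COBJMACROS for the C-style COM calls):

    // ask how many frames are still queued and only request the remainder;
    // requesting more than the free space is what triggered the errors
    UINT32 padding = 0;
    HRESULT hr = IAudioClient_GetCurrentPadding(audio_client, &padding);
    if (SUCCEEDED(hr)) {
        UINT32 frames_free = buffer_frame_count - padding;
        BYTE *data;
        hr = IAudioRenderClient_GetBuffer(render_client, frames_free, &data);
        if (SUCCEEDED(hr)) {
            // ... fill data with frames_free frames of audio ...
            IAudioRenderClient_ReleaseBuffer(render_client, frames_free, 0);
        }
    }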
When guessing formats for exclusive mode, try several sample size and
sample rate combinations instead of just falling back to s16le@44100hz.
If none of the rates are accepted, try remixing >6 channels to 5.1
channels. Failing that, try remixing to stereo. Failing everything,
including the CD Red Book format, what else is left to test?
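The guessing loop is roughly this shape (sketch; set_waveformat is an
assumed helper that fills a WAVEFORMATEXTENSIBLE):

    static const int try_bits[]  = {32, 24, 16};
    static const int try_rates[] = {192000, 96000, 88200, 48000, 44100};
    int found = 0;
    for (int b = 0; b < 3 && !found; b++) {
        for (int r = 0; r < 5 && !found; r++) {
            set_waveformat(&wformat, try_bits[b], try_rates[r], channels);
            // in exclusive mode the closest-match argument must be NULL
            if (IAudioClient_IsFormatSupported(client,
                    AUDCLNT_SHAREMODE_EXCLUSIVE, &wformat.Format,
                    NULL) == S_OK)
                found = 1;
        }
    }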
Calculate buffer_block_size based on the configured channels and bytes
per sample; MSDN docs say nBlockAlign is not guaranteed to be set for
anything but integer PCM formats.
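I.e. (sketch, assumed struct names):

    // don't trust nBlockAlign outside integer PCM; compute it ourselves
    int bytes_per_sample  = wformat.Format.wBitsPerSample / 8;
    int buffer_block_size = wformat.Format.nChannels * bytes_per_sample;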
Adds the :list suboption to ao_wasapi0, which enumerates the audio endpoints
in the system.
Adds the :device=<n> suboption, which either takes an ID string (as output by
list) or a device number and uses the requested device instead of the system
default.
These two options were supported by ALSA and OSS only. Further, their
values were specific to the respective audio systems, so it doesn't make
sense to keep them as top-level options.
This changes how device names are handled. Before this commit, device
names were mangled in strange ways to avoid clashing with the option
parser syntax. "." was replaced with ",", and "=" with ":" (the user had
to do the inverse to get the correct device name).
The "new" option parser has multiple ways to escape option strings, so
we don't need this confusing hack anymore.
Add an explicit note to the manpage as well.
Seeking calls thread_reset, but doesn't call thread_play. thread_reset
would disable WASAPI events, but they would never get re-enabled unless
the user paused and then unpaused.
Keep track of whether the stream is paused or not (there already was a
field for that, but it was apparently unused), and if it's not paused,
call thread_play after thread_reset. Fixes mpv freezing after seeks.
Fixes format specifiers that assume Windows typedefs are as long as they
look like they are.
Remove calls to _beginthreadex and _endthreadex, which are only present in
Microsoft's C runtimes. Replace them with the otherwise identical
CreateThread and ExitThread calls.
This actually requires fixes to devicetopology.h, but the problem has been
(kinda) reported to mingw-w64:
<Kovensky> I see that those KSJACK* structs are supposedly declared in
devicetopology.h itself, but for some reason (some of?) the decls that use
them aren't seeing them?
<Kovensky> ok, it seems that it expects ks.h and ksmedia.h to declare those
structs, but it doesn't
<Kovensky> the included files declare KDATAFORMAT, KSIDENTIFIER and LUID (and
the associated pointer typedefs)
<Kovensky> but everything else is essentially inside #if 0
<Kovensky> changing the #ifndef _KS_ to only include KDATAFORMAT, KSIDENTIFIER
and LUID (and putting the KSJACK stuff outside that #ifndef) makes the
header compile
<Kovensky> it solves my immediate problem, but if that happened to begin with
there's probably something more wrong with the ks headers :S
Matroska has an output sample rate (OutputSamplingFrequency), which in
theory should be forced instead of whatever the decoder outputs. But it
appears no software (other than mplayer2 and mpv until now) actually
respects this. Even worse, there were broken files around, which played
correctly with (in theory) broken software, but not mplayer2/mpv. Hacks
were added to our code to play these files correctly, but they didn't
catch all cases.
Simplify this by doing what everyone else does, and always use the
decoder's sample rate instead. In particular, we try to handle all
sample rate issues like libavformat's Matroska demuxer does.
It turns out that some code that was removed earlier was still needed.
avcodec_decode_audio4() can decode packets "partially". In that case,
you have to "slice" the packet and call the decode function again.
Codecs that need this are obscure and few in number. One sample that
needs it is here:
rsync://fate-suite.ffmpeg.org/fate-suite/lossless-audio/luckynight-partial.shn
(This one decodes in rather small increments.)
The new code is much simpler than what has been removed earlier,
though. The fact that we own the packet returned by the demuxer helps
a lot.
Not sure what should happen if avcodec_decode_audio4() returns 0.
Currently, we throw away the packet in this case. We don't want to be
stuck in an endless loop (could happen if the decoder produces no
output either).
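For reference, the slicing loop is roughly (sketch; pkt is a local copy of
the demuxer packet's data/size, which we own):

    while (pkt.size > 0) {
        int got_frame = 0;
        // returns the number of input bytes consumed, possibly < pkt.size
        int ret = avcodec_decode_audio4(avctx, frame, &got_frame, &pkt);
        if (ret < 0)
            break;              // decode error: drop the rest of the packet
        if (ret == 0 && !got_frame)
            break;              // no progress; avoid an endless loop
        pkt.data += ret;        // "slice": skip the consumed part
        pkt.size -= ret;
        if (got_frame) {
            // ... push the decoded frame to the output buffer ...
        }
    }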
This is not directly related to the handling of format changes itself,
but playing audio normally after the change. This was broken: the output
byte rate was not recalculated, so audio-video sync was simply broken.
Fix this by calculating the byte rate on the fly, instead of storing it
in sh_audio.
Format changes are relatively common (switches between stereo and 5.1
in TV recordings), so this fixes a somewhat critical bug.
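The calculation itself is trivial (sketch, hypothetical names):

    // derive the byte rate from the current decoded format every time,
    // instead of caching it where it goes stale on format changes
    int byte_rate = samplerate * channels * sample_size;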
pts_bytes can't just be changed at the end. It must be offset to the pts
value, which is reset with each packet read from the demuxer. Make sure
the pts_bytes field is always reset after receiving a new PTS, i.e.
increment it after actually writing to the output buffer.
Flush the AVFormatContext's write buffer, because otherwise the audio
PTS will jump around too much: the calculation doesn't use the exact
output buffer size if there's still data in the avio buffer.
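For the flush, avio_flush() on the muxer's IO context does the job
(sketch; context names assumed):

    av_write_frame(lavf_ctx, &pkt);   // mux one spdif frame
    avio_flush(lavf_ctx->pb);         // drain lavf's write buffer, so the
                                      // output byte count is exact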
Partial packet reads were needed because the video/audio parsers were
working on top of them. So it could happen that a parser read a part of
a packet, and returned that to the decoder. With libavformat/libavcodec,
packets are already parsed, and everything is much simpler.
Most of the simplifications in ad_spdif could have been done earlier.
Remove some other stuff as well, like the questionable slave mode start
time reporting (could be replaced by proper code, but we don't bother).
Remove the unused skip_audio_frame() functionality as well (it was used
by old demuxers). Some functions become private to demux.c, like
demux_fill_buffer(). Introduce new packet read functions, which have
simpler semantics. Packets returned from them are owned by the caller,
and all packets in the demux.c packet queue are considered unread.
Remove special code that dropped subtitle packets with size 0. This
used to be needed because it caused special cases in the old code.
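The new read functions have roughly this shape (illustrative prototypes,
not necessarily the exact signatures):

    // Return the next packet of the stream, or NULL if none is available.
    // The returned packet is owned by the caller, who must free it.
    struct demux_packet *demux_read_packet(struct sh_stream *sh);
    void free_demux_packet(struct demux_packet *dp);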
We don't need to deal with partial packet reads, manually using an audio
parser, or having to call the libavcodec decoder multiple times per
packet.
Actually, I'm not sure about the last point. ffplay still does this, but
the ffmpeg demuxing.c example doesn't.
The audio parser was needed only by the "old" demuxers, and
demux_rawaudio. All other demuxers output already parsed packets.
demux_rawaudio is for raw audio, so using a parser with it doesn't
usually make sense. But you can also force it to read
compressed formats with fixed packet sizes, in which case the parser
would have been used. This use case is probably broken now, but you
will be able to do the same thing with libavformat demuxers.
Delete demux_avi, demux_asf, demux_mpg, demux_ts. libavformat does
better than them (except in rare corner cases), and the demuxers have
a bad influence on the rest of the code. Often they don't output
proper packets, and require additional audio and video parsing. Most
work only in --no-correct-pts mode.
Remove them to facilitate further cleanups.
The core didn't use these fields, and use of them was inconsistent
across AOs. Some didn't use them at all. Some only set them; the values
were completely unused by the core. Some made full use of them.
Remove these fields. In places where they are still needed, make them
private AO state.
Remove the --abs option. It set the buffer size for ao_oss and ao_dsound
(being ignored by all other AOs), and was already marked as obsolete. If
it turns out that it's still needed for ao_oss or ao_dsound, their
default buffer sizes could be adjusted, and if even that doesn't help,
AO suboptions could be added in these cases.
Some still do, because they use the value elsewhere in the init
function. ao_portaudio is tricky and reads ao->bps in the stream
thread, which might be started on initialization (not sure about that,
but better safe than sorry).
Currently every single AO implements its own ringbuffer, many times
with slightly different semantics. This is an attempt to fix the problem.
I stole some good ideas from ao_portaudio's ringbuffer and went from there.
The main difference is this one stores wpos and rpos which are absolute
positions in an "infinite" buffer. To find the actual position for writing /
reading just apply modulo size.
The producer only modifies wpos while the consumer only modifies rpos. This
makes it pretty easy to reason about and make the operations thread safe by
using barriers (thread safety is guaranteed only in the Single-Producer/Single-
Consumer case).
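In sketch form (a minimal illustration of the scheme, not the actual code):

    struct ringbuffer {
        unsigned char *buffer;
        size_t size;        // capacity in bytes
        uint64_t wpos;      // total bytes ever written; producer-owned
        uint64_t rpos;      // total bytes ever read; consumer-owned
    };

    // both sides may read both positions, each side writes only its own
    static size_t rb_used(struct ringbuffer *rb)
    { return rb->wpos - rb->rpos; }
    static size_t rb_free(struct ringbuffer *rb)
    { return rb->size - rb_used(rb); }

    static void rb_write(struct ringbuffer *rb, const unsigned char *src,
                         size_t len)
    {
        // caller ensures len <= rb_free(rb); the copy may wrap at the end
        for (size_t i = 0; i < len; i++)
            rb->buffer[(rb->wpos + i) % rb->size] = src[i];
        // a write barrier belongs here, before publishing the new wpos
        rb->wpos += len;
    }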
Also adapted ao_coreaudio to use this ringbuffer.
This is hopefully the start of something good. ca_ringbuffer_read and
ca_ringbuffer_write can probably be cleaned up from all the NULL checks
once ao_coreaudio.c gets simplified.
Whatever this was supposed to be originally, it doesn't have much value
anymore. It just forced ad_mpg123 to upmix mono to stereo by default
(the audio chain can do that). As an option, it was mostly useless and
misleading, so get rid of it.
This was overlooked with commit 32a898f, because OSS4 volume control is
typically not available on Linux. BSD does have this feature, so the
broken code broke compilation there.
Fixes crashes when playing with certain numbers of channels. The core
assumes AOs accept data aligned on channels * samplesize, and ao_jack's
play() function broke that assumption:
mpv: core/mplayer.c:2348: fill_audio_out_buffers: Assertion `played % unitsize == 0' failed.
Fix by aligning the buffer and chunk sizes as needed.
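I.e. something like (sketch; unitsize as in the assertion above):

    int unitsize = ao->channels * sample_size;   // bytes per audio frame
    len -= len % unitsize;                       // only accept whole frames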
Audio and video had their own (very similar) functions to initialize an
AVPacket (ffmpeg's packet struct) from a demux_packet (mplayer's packet
struct). Add a common function for these.
Also use this function for sd_lavc_conv. This is actually a functional
change, as some libavformat subtitle demuxers add weird out-of-band
stuff as side-data.
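The shared helper is essentially this (sketch; the function name and the
demux_packet field names are assumed):

    static void mp_set_av_packet(AVPacket *dst, struct demux_packet *mpkt)
    {
        av_init_packet(dst);
        dst->data = mpkt ? mpkt->buffer : NULL;   // no copy, just wrapping
        dst->size = mpkt ? mpkt->len : 0;
        if (mpkt && mpkt->keyframe)
            dst->flags |= AV_PKT_FLAG_KEY;
    }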
GetTimer() is generally replaced with mp_time_us(). Both calls return
microseconds, but the latter uses int64_t, is defined to never wrap,
and never returns 0 or negative values.
GetTimerMS() has no direct replacement. Instead the other functions are
used.
For some code, switch to mp_time_sec(), which returns the time as a double
float value in seconds. The returned time is offset to program start
time, so there is enough precision left to deliver microsecond
resolution for at least 100 years. Unless it's cast to a float
(or the CPU reduces precision), which is why we still use mp_time_us()
out of paranoia in places where precision is clearly needed.
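Typical usage after this change (illustrative):

    int64_t start_us = mp_time_us();   // int64_t microseconds; never wraps
    double  start    = mp_time_sec();  // seconds since program start
    /* ... do work ... */
    double elapsed = mp_time_sec() - start;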
Always switch to the correct type. The whole point of the new timer
calls is that they don't wrap, and storing microseconds in unsigned int
variables would negate this.
In some cases, remove wrap-around handling for time values.
The ALSA device was not closed when initialization failed.
The ALSA error handler (set with snd_lib_error_set_handler()) was not
unset when closing ao_alsa. If this is not done, the handler will still
be called when other libraries using ALSA cause errors, even though
ao_alsa was long closed. Since these messages were prefixed with
"[AO_ALSA]", they were misleading and implying ao_alsa was still used.
For some reason, our error handler is still called even after doing
snd_lib_error_set_handler(NULL), which should be impossible. Checking
with a debugger, inserting printf(), and reading the alsa-lib source
code all suggest our error handler should not be called, but it still
happens. It's a complete mystery.
Mostly copied from vf_lavfi. The parts that could be shared are minor,
because most code is about setting up audio and video, which are too
different.
This won't work with Libav. I used ffplay.c as a guide, and noticed too
late that their setup methods are incompatible with Libav's. Trying to
make it work with both would be too much effort. The configure test for
av_opt_set_int_list() should disable af_lavfi gracefully when compiling
with Libav.
Due to option parser chaos, you currently can't have a "," as part of
the filter graph string - not even with quoting or escaping. This will
probably be fixed later.
The audio filter chain is not PTS aware. So we have to do some hacks
to make up a fake PTS, and we have to map the output PTS back to the
filter chain's method of tracking PTS changes and buffering, by
adjusting af->delay.
FFmpeg (as well as Libav) has two layouts called "6.1":
AV_CH_LAYOUT_6POINT1 and AV_CH_LAYOUT_6POINT1_BACK. We call them "6.1"
and "6.1(back)". Change the default layout for 7 channels as well to
return the same layout as av_get_default_channel_layout(). (Looks a bit
questionable, but for now it's better to follow FFmpeg.)
It turns out that ALSA's 4 channel layout is different from mpv's and
ffmpeg's 4.0 layout. Thus trying to do 4 channel output led to incorrect
remixing via lib{av,sw}resample.
Fix the default layouts for the internal filter chain as well, although
I'm not sure if it matters at all.
The libavresample version of the current Libav stable release lacks the
avresample_set_channel_mapping() function. (FFmpeg's libswresample seems
to be fine, because they added swr_set_channel_mapping() first.)
Add a cheap/slow workaround to do channel reordering on our own. We
don't use the recently removed MPlayer code (see commit 586b75a),
because that is not generic enough.
The functionality should be the same as with full-featured
libavresample, and any differences are bugs. It's probably slower,
though.
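The workaround amounts to a naive per-frame copy (sketch; float samples
assumed for brevity, and map[] gives the source channel for each
destination channel):

    static void reorder_channels(float *dst, const float *src,
                                 const int *map, int channels, int frames)
    {
        for (int f = 0; f < frames; f++)
            for (int c = 0; c < channels; c++)
                dst[f * channels + c] = src[f * channels + map[c]];
    }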
af_reinit() is responsible for inserting automatic conversion filters
for channel remixing, format conversion, and resampling. We don't
require that a single filter can do all these (even though
af_lavrresample does nearly all of this, sometimes af_format has to be
used instead for format conversions). This makes setting up the chain
more complicated, and a way is needed to prevent endless appending of
conversion filters if a conversion is not possible.
Until now, this used a stupidly simple yet robust static retry limit to
detect failure. This is perfectly fine, and the limit (20) was good
enough to handle about 5 filters. But with more filters, and if each
filter requires 3 additional conversion filters, this would fail. So
raise the limit to 4 retries per filter. This is still stupidly simple
and robust, but won't arbitrarily fail if the filter count is too large.
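In other words (sketch):

    // scale the retry budget with the chain length instead of hardcoding 20
    int max_retries = 4 * filter_count;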
To make this easier, get rid of the direct mapping of the
AF_FORMAT_BITS_MASK bit field to number of bytes. This way we can throw
away the unused AF_FORMAT_48BIT and don't have to add ..._56BIT.
The snd_pcm_hw_params_test_format() call actually crashes in alsa-lib if
called with SND_PCM_FORMAT_UNKNOWN, so the already existing fallback
code won't work in this case.
Make all AOs use what has been introduced in the previous commit.
Note that even AOs which can handle all possible layouts (like ao_null)
use the new functions. This might be important if in the future
ao_select_chmap() possibly honors global user options about downmixing
and so on.
The point is selecting a minimal fallback. The AOs will call this
through the AO API, so it will be possible to add options affecting
the general channel layout selection.
It provides the following mechanisms to AOs (a usage sketch follows the
list):
- forcing the correct channel order
- downmixing to stereo if no layout is available
- allowing 5.1 <-> 5.1(side) fallback
- handling "unknown" channel layouts
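A hypothetical AO-side usage (names modeled on the channel map selector;
treat them as illustrative):

    struct mp_chmap_sel sel = {0};
    // register every layout the hardware accepts
    for (int n = 0; n < num_hw_layouts; n++)
        mp_chmap_sel_add_map(&sel, &hw_layouts[n]);
    // let the helper pick/adjust; fails only if nothing is usable
    if (!ao_chmap_sel_adjust(ao, &sel, &ao->channels))
        return -1;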
This is quite weak and lots of code/complexity for little gain. All AOs
already made sure the channel order was correct, and the fallback is of
little value and could perhaps be done in the frontend instead, like
stereo downmixing with --channels=2 is handled. But I'm not really sure
how this stuff should _really_ work, and the new code will hopefully
provide enough flexibility to make radical changes to channel layout
negotiation easier.