Doesn't work quite right, and will pause for the latency duration after
seeking. Some users use --ao=null to disable audio (even though they
should probably use --no-audio), and this use-case is broken by this
issue too.
CC: @mpv-player/stable
Previous code was completely wrong. This still doesn't report the device
latency, but we report the buffer latency (as before the AO refactoring) and
the AudioUnit's latency (this is a new 'feature').
Apparently we can also report the device's actual latency and we should also
calculate the actual sample rate of the audio device instead of using the
nominal sample rate, but I'll leave this for a later commit.
The mplayer1/2/mpv CoreAudio audio output historically contained both usage
of AUHAL APIs (these go through the CoreAudio audio server) and the Device
based APIs (used only for output of compressed formats in exclusive mode).
The latter is a very unwieldy and low level API and pretty much forces us to
write a lot of code for little work. Also, with the widespread adoption of HDMI,
the actual need for outputting compressed audio directly to the device is getting
lower (it was very useful with S/PDIF, whose bandwidth constraints don't allow
a higher number of channels transmitted in LPCM).
Considering how invasive it is (uses hog/exclusive mode), the new AO
(`ao_coreaudio_device`) is not going to be autoprobed but the user will have
to select it.
Something like "char *s = ...; isdigit(s[0]);" triggers undefined
behavior, because char can be signed, and thus s[0] can be a negative
value. The is*() functions require unsigned char _or_ EOF. EOF is a
special value outside of unsigned char range, thus the argument to the
is*() functions can't be a char.
This undefined behavior can actually trigger crashes if the
implementation of these functions e.g. uses lookup tables, which are
then indexed with out-of-range values.
Replace all <ctype.h> uses with our own custom mp_is*() functions added
with misc/ctype.h. As a bonus, these functions are locale-independent.
(Although currently, we _require_ C locale for other reasons.)
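A minimal sketch of what such a replacement looks like (illustrative only;
the real misc/ctype.h may differ in details):

static inline int mp_isdigit(char c)
{
    // Compare the value directly instead of passing a possibly negative
    // char to isdigit(), which is undefined behavior.
    return c >= '0' && c <= '9';
}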
rgain is not an additive value. It's a multiplier/gain.
Previous behaviour produced negative level values in some cases
(when rgain < 1.0), which caused the volume to get louder when its value
was lowered.
CC: @mpv-player/stable
Signed-off-by: Mohammad Alsaleh <CE.Mohammad.AlSaleh@gmail.com>
Signed-off-by: wm4 <wm4@nowhere>
While I'm not very fond of "const", it's important for declarations
(it decides whether a symbol is emitted in a read-only or read/write
section). Fix all these cases, so we have writeable global data only
when we really need it.
So the device buffer can be refilled quickly. Fixes dropouts in certain
cases: if all data is moved from the soft buffer to the audio device
buffer, the waiting code thinks it has to enter the mode in which it
waits for new data from the decoder. This doesn't work, because the
get_space() logic tries to keep the total buffer size down. get_space()
will return 0 (or a very low value) because the device buffer is full,
and the decoder can't refill the soft buffer. But this means if the AO
buffer runs out, the device buffer can't be refilled from the soft
buffer. I guess this mess happened because the code is trying to deal
with both AOs with proper event handling, and AOs with arbitrary
behavior.
Unfortunately this increases latency, as the total buffered audio
becomes larger. There are other ways to fix this again, but not today.
Fixes #818.
Apparently this can happen. So actually only return from waiting if ALSA
explicitly signals that new output is available, or if we are woken up
externally.
This did not flush remaining audio in the buffer correctly (in case an
AO has an internal block size). So we have to make the audio feed thread
write the remaining audio, and wait until it's done.
Checking the avoid_ao_wait variable should be enough to be sure that all
data that can be written was written to the AO driver.
This code handles buggy AOs (even if all AOs are bug-free, it's good for
robustness). Move handling of it to the AO feed thread. Now this check
doesn't require magic numbers and does exactly what it's supposed to do.
Until now, we've always calculated a timeout based on a heuristic when
to refill the audio buffers. Allow AOs to do it completely event-based
by providing wait and wakeup callbacks.
This also shuffles around the heuristic used for other AOs, and there is
a minor possibility that behavior slightly changes in real-world cases.
But in general it should be much more robust now.
ao_pulse.c now makes use of event-based waiting. It already did before,
but the code for time-based waiting was also involved. This commit also
removes one awkward artifact of the PulseAudio API out of the generic
code: the callback asking for more data can be reentrant, and thus
requires a separate lock for waiting (or a recursive mutex).
There were subtle and minor race conditions in the pull.c code, and AOs
using it (jack, portaudio, sdl, wasapi). Attempt to remove these.
There was at least a race condition in the ao_reset() implementation:
mp_ring_reset() was called concurrently to the audio callback. While the
ringbuffer uses atomics to allow concurrent access, the reset function
wasn't concurrency-safe (and can't easily be made to).
Fix this by stopping the audio callback before doing a reset. After
that, we can do anything without needing synchronization. The callback
is resumed when resuming playback at a later point.
Don't call driver->pause, and make driver->resume and driver->reset
start/stop the audio callback. In the initial state, the audio callback
must be disabled.
JackAudio of course is different. Maybe there is no way to suspend the
audio callback without "disconnecting" it (what jack_deactivate() would
do), so I'm not trying my luck, and implemented a really bad hack doing
active waiting until we get the audio callback into a state where it
won't interfere. Once the callback goes from AO_STATE_WAIT to NONE, we
can be sure that the callback doesn't access the ringbuffer or anything
else anymore. Since both sched_yield() and pthread_yield() apparently
are not always available, use mp_sleep_us(1) to avoid burning CPU during
active waiting.
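Roughly, the active waiting amounts to something like this (state and function
names follow the description above; the actual ao_jack.c code may differ):

// Spin until the callback has moved on from the WAIT state; after that it is
// guaranteed not to touch the ringbuffer anymore.
while (atomic_load(&p->state) == AO_STATE_WAIT)
    mp_sleep_us(1);   // sleep instead of sched_yield()/pthread_yield()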
The ao_jack.c change also removes a race condition: apparently we didn't
initialize _all_ ao fields before starting the audio callback.
In ao_wasapi.c, I'm not sure whether reset really waits for the audio
callback to return. Kovensky says it's not guaranteed, so disable the
reset callback - for now the behavior of ao_wasapi.c is like with
ao_jack.c, and active waiting is used to deal with the audio callback.
In most places where af_fmt2bits is called to get the bits/sample, the
result is immediately converted to bytes/sample. Avoid this by getting
bytes/sample directly by introducing af_fmt2bps.
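In other words, call sites change roughly like this (sketch):

int bps_old = af_fmt2bits(format) / 8;   // old pattern: bits, then convert to bytes
int bps_new = af_fmt2bps(format);        // new helper: bytes per sample directly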
The i_bps members of the sh_audio and dev_video structs are mostly used
for displaying the average audio and video bitrates. Keeping them in
bits-per-second avoids truncating them to bytes-per-second and changing
them back later on.
In my opinion, we shouldn't use atomics at all, but ok.
This switches the mpv code to use C11 stdatomic.h, and for compilers
that don't support stdatomic.h yet, we emulate the subset used by mpv
using the builtins commonly provided by gcc and clang.
This supersedes an earlier similar attempt by Kovensky. That attempt
unfortunately relied on a big copypasted freebsd header (which also
depended on much more highly compiler-specific functionality, defined
reserved symbols, etc.), so it had to be NIH'ed.
Some issues:
- C11 says default initialization of atomics "produces a valid state",
but it's not sure whether the stored value is really 0. But we rely on
this.
- I'm pretty sure our use of the __atomic... builtins is/was incorrect.
We don't use atomic load/store intrinsics, and access stuff directly.
- Our wrapper actually does stricter typechecking than the stdatomic.h
implementation by gcc 4.9. We make the atomic types incompatible with
normal types by wrapping them into structs. (The FreeBSD wrapper does
the same.)
- I couldn't test on MinGW.
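A rough sketch of the kind of emulation meant here, for compilers without
stdatomic.h (illustrative names only; the real wrapper is more complete and
also covers the __atomic builtins):

// Wrapping the value in a struct makes the atomic type incompatible with
// plain integers, giving the stricter typechecking mentioned above.
typedef struct { volatile unsigned long v; } emul_atomic_ulong;

#define emul_atomic_load(p)         __sync_fetch_and_add(&(p)->v, 0)
#define emul_atomic_fetch_add(p, x) __sync_fetch_and_add(&(p)->v, (x))
#define emul_atomic_store(p, x) \
    do { __sync_synchronize(); (p)->v = (x); __sync_synchronize(); } while (0)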
Use the time as returned by mp_time_us() for mpthread_cond_timedwait(),
instead of calculating the struct timespec value based on a timeout.
This (probably) makes it easier to wait for a specific deadline.
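Sketch of the resulting call pattern (signature assumed from the description
above):

int64_t deadline = mp_time_us() + timeout_us;        // absolute target time
mpthread_cond_timedwait(&wakeup, &lock, deadline);   // wait until that deadline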
This didn't quite work. The main issue was that get_space tries to be
clever to reduce overall buffering, so it will cause the playloop to
decode and queue only as much audio as is needed to refill the AO in
reasonable time. Also, even if ignoring the problem, the logic of the
previous commit was slightly broken. (This required a few retries,
because I couldn't reproduce the issue on my own machine.)
When the audio buffer went low, but could not be refilled yet, it could
happen that the AO playback thread and the decode thread could enter a
wakeup feedback loop, causing up to 100% CPU usage doing nothing. This
happened because the decoder thread would wake up the AO thread when
writing 0 bytes of newly decoded data, and the AO thread in reaction
wakes up the decoder thread after writing 0 bytes to the AO buffer.
Fix this by waking up the decoder thread only if data was actually
played or queued. (This will still cause some redundant wakeups, but
will eventually settle down, reducing CPU usage close to ideal.)
I don't think this is really a very good idea, because it is conceptually
incorrect, but other prominent multimedia programs use this approach
(VLC and xbmc), and it seems to make the conversion more robust in certain
cases.
For example, it has been reported that configuring a receiver that can output
7.1 to output 5.1 will make CoreAudio report 8 channel descriptions, and the
last 2 descriptions will be tagged kAudioChannelLabel_Unknown.
Fixes #737
The code was falling back to the full waveext chmap_sel when less than 2
channels were detected. This new code is slightly more correct since it only
fills the chmap_sel with the stereo or mono chmap in the fallback case.
CoreAudio supports 3 kinds of layouts: bitmap based, tag based, and speaker
description based (using either channel labels or positional data).
Previously we tried to convert everything to bitmap based channel layouts,
but it turns out description based ones are the most generic and there are
built-in CoreAudio APIs to perform the conversion in this direction.
Moreover description based layouts support waveext extensions (like SDL and
SDR), and are easier to map to mp_chmaps.
Changing --softvol-max and then resuming would change the volume level
on resume to something different than the original volume. This is
because the user volume setting is always between 0-100, and 100
corresponds to --softvol-max gain.
Avoid the situation where changing --softvol-max and resuming an older file
could lead to a too loud volume level, by refusing to restore if
--softvol-max changed.
The comment says that it wakes up the main thread if 50% has been
played, but in reality the value was 0.74/2 => 37.5%. Correct this. This
probably changes little, because it's a very fuzzy heuristic in the
first place.
Also move down the min_wait calculation to where it's actually used.
Also remove MSGL_SMODE and friends.
Note: The indent in options.rst was added to work around a bug in
ReportLab that causes the PDF manual build to fail.
This avoids too many realloc() calls if the caller is appending to an
audio buffer. This case is actually quite noticeable when using something
that buffers a large amount of audio.
For some reason, the buffered_audio variable was used to "cache" the
ao_get_delay() result. But I can't really see any reason why this should
be done, and it just seems to complicate everything.
One reason might be that the value should be checked only if the AO
buffers have been recently filled (as otherwise the delay could go low
and trigger an accidental EOF condition), but this didn't work anyway,
since buffered_audio is set from ao_get_delay() anyway at a later point
if it was unset. And in both cases, the value is used _after_ filling
the audio buffers anyway.
Simplify it. Also, move the audio EOF condition to a separate function.
(Note that ao_eof_reached() probably could/should check whether the last
ao_play() call had AOPLAY_FINAL_CHUNK set to avoid accidental EOF on
underflows, but for now let's keep the code equivalent.)
This was reported with PulseAudio 2.1. Apparently it still has problems
with reporting the correct delay. Since ao_pulse.c still has our custom
get_delay implementation, there's a possibility that this is our fault,
but this seems unlikely, because it's full of workarounds for issues
like this. It's also possible that this problem doesn't exist on
PulseAudio 5.0 anymore (I didn't explicitly retest it).
The check is general and works for all push based AOs. For pull based
AOs, this can't happen as pull.c implements all the logic correctly.
This should probably be an AO function, but since the playloop still has
some strange stuff (using the buffered_audio variable instead of calling
ao_get_delay() directly), just leave it and make it more explicit.
This collects statistics and other things. The option dumps raw data
into a file. A script to visualize this data is included too.
Litter some of the player code with calls that generate these
statistics.
In general, this will be helpful to debug timing dependent issues, such
as A/V sync problems. Normally, one could argue that this is the task of
a real profiler, but then we'd have a hard time to include extra
information like audio/video PTS differences. We could also just
hardcode all statistics collection and processing in the player code,
but then we'd end up with something like mplayer's status line, which
was cluttered and required a centralized approach (i.e. getting the data
to the status line; so it was all in mplayer.c). Some players can
visualize such statistics on OSD, but that sounds even more complicated.
So the approach added with this commit sounds sensible.
The stats-conv.py script is rather primitive at the moment and its
output is semi-ugly. It uses matplotlib, so it could probably be
extended to do a lot; it's not a dead-end.
Same change as in e2184fcb, but this time for pull based AOs. This is
slightly controversial, because it will make a fast syscall from e.g.
ao_jack. And according to JackAudio developers, syscalls are evil and
will destroy realtime operation. But I don't think this is an issue at
all.
Still avoid locking a mutex. I'm not sure what jackaudio does in the
worst case - but if they set the jackaudio thread (and only this thread)
to realtime, we might run into deadlock situations due to priority
inversion and such. I'm not quite sure whether this can happen, but I'll
readily follow the cargo cult if it makes jack happy.
I'm not quite sure why ao_pulse needs this. It was broken when a thread
to fill audio buffers was added to AO - the pulseaudio callback was
waking up the playback thread, not the audio thread. But nobody noticed,
so it can't be very important. In any case, this change makes it wake up
the audio thread instead (which in turn wakes up the playback thread if
needed).
And also add a function ao_need_data(), which AO drivers can call if
their audio buffer runs low.
This change intends to make it easier for the playback thread: instead
of making the playback thread calculate a timeout at which the audio
buffer should be refilled, make the push.c audio thread wake up the core
instead.
ao_need_data() is going to be used by ao_pulse, and we need to
workaround a stupid situation with pulseaudio causing a deadlock because
its callback still holds the internal pulseaudio lock.
For AOs that don't call ao_need_data(), the deadline is calculated by
the buffer fill status and latency, as before.
I hate tabs.
This replaces all tabs in all source files with spaces. The only
exception is old-makefile. The replacement was made by running the
GNU coreutils "expand" command on every file. Since the replacement was
automatic, it's possible that some formatting was destroyed (but perhaps
only if it was assuming that the end of a tab does not correspond to
aligning the end to multiples of 8 spaces).
Also fix a format string mistake in a log call using it.
I wonder if this code shouldn't use FormatMessage, but it looks kind
of involved [1], so: no, thanks.
[1] http://support.microsoft.com/kb/256348/en-us
This was accidentally broken in commit b72ba3f7. I somehow made the
wild assumption that replaygain adjusted the volume relative to 0%
instead of 100%.
The detach suboption was similarly broken.
Set refcounted_frames, because in some versions of libavcodec mixing the
new AVFrame API and non-refcounted decoding could cause memory
corruption. Likewise, it's probably still required to unref a frame
before calling the decoder.
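A hedged sketch of the pattern (libavcodec API of that era; the exact
placement in the decoder wrappers may differ):

avctx->refcounted_frames = 1;    // opt into refcounted frames; set before avcodec_open2()
...
av_frame_unref(frame);           // drop the previous reference before decoding
avcodec_decode_audio4(avctx, frame, &got_frame, &pkt);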
Maybe this should be default. On the other hand, this filter does
something even if the volume is neutral: it clips samples against the
allowed range, should the decoder or a previous filter output garbage.
Currently, both replaygain adjustment and user volume control (if
softvol is enabled) share the same variable. Sharing the variable would
cause problems, especially if --volume is used; then the replaygain volume would
always be overwritten.
Now both gain values are simply added right before doing filtering.
This adds the options replaygain-track and replaygain-album. If either is set,
the replaygain track or album gain will be automatically read from the track
metadata and the volume adjusted accordingly.
This only supports reading REPLAYGAIN_(TRACK|ALBUM)_GAIN tags. Other formats
like LAME's info header would probably require support from libav.
The main incompatibility was that Libav didn't have av_opt_set_int_list.
But since that function is excessively ugly and idiotic (look how it
handles types), I'm not missing it much. Use an aformat filter instead
to handle the functionality that was indirectly provided by it. This is
similar to how vf_lavfi works.
The other incompatibility was channel handling. Libav consistently uses
channel layouts only, while FFmpeg still requires messing with channel
counts to some degree. Get rid of most channel count uses (and hope
channel layouts are "exact" enough). Only in one case FFmpeg fails with
a runtime check if we feed it AVFrames with channel count unset.
Another issue were AVFrame accessor functions. FFmpeg introduced these
for ABI compatibility with Libav. I refuse to use them, and it's not my
problem if FFmpeg doesn't manage to provide a stable ABI for fields
provided both by FFmpeg and Libav.
The volume controls in mpv now affect the session's volume (the
application's volume in the mixer). Since we do not request a
non-persistent session, the volume and mute status persist across mpv
invocations and system reboots.
In exclusive mode, WASAPI doesn't have access to a mixer so the endpoint
(sound card)'s master volume is modified instead. Since by definition
mpv is the only thing outputting audio in exclusive mode, this causes no
conflict, and ao_wasapi restores the last user-set volume when it's
uninitialized.
Due to the COM Single-Threaded Apartment model, the thread owning the
objects will still do all the actual method calls (in the form of
message dispatches), but at least this will be COM's problem rather than
having to set up several handles and adding extra code to the event
thread.
Since the event thread still needs to own the WASAPI handles to avoid
waiting on another thread to dispatch the messages, the init and uninit
code still has to run in the thread.
This also removes a broken drain implementation and removes unused
headers from each of the files split from the original ao_wasapi.c.
ao_wasapi.c was almost entirely init code mixed with option code and
occasionally actual audio handling code. Split most things to
ao_wasapi_utils.c and keep the audio handling code in ao_wasapi.c.
Gets rid of the internal ring buffer and get_buffer. Corrects an
implementation error in thread_reset.
There is still a possible race condition on reset, and a few refactors
left to do. If feasible, the thread that handles everything
WASAPI-related will be made to only handle feed events.
Assume obtained.samples contains the number of samples the SDL audio
callback will request at once. Then make sure ao.c will set the buffer
size to at least 3 times that value.
Might help with bad SDL audio backends like ESD, which supposedly uses a
500ms buffer.
In general, we don't need to have a large hw audio buffer size anymore,
because we can quickly fill it from the soft buffer.
Note that this probably doesn't change much anyway. On my system (dmix
enabled), the buffer size is only 170ms, and ALSA won't give more. Even
when using a hardware device the buffer size seems to be limited to
341ms.
This AO pretended to support volume operations when in spdif passthrough
mode, but actually did nothing. This is wrong: at least the GET
operations must write their argument. Signal that volume is unsupported
instead.
This was probably a hack to prevent insertion of volume filters or so,
but it didn't work anyway, while recovering after failed volume filter
insertion does work, so this is not needed at all.
Since the addition of the AO feed thread, 200ms of latency (MIN_BUFFER)
was added to all push-based AOs. This is not so nice, because even with AOs
that have relatively small buffering (e.g. ao_alsa on my system with ~170ms
of buffer size), the additional latency becomes noticeable when e.g.
toggling mute with softvol.
Fix this by trying to keep not only 200ms minimum buffer, but also 200ms
maximum buffer. In other words, never buffer beyond 200ms in total. Do
this by estimating the AO's buffer fill status using get_space and the
initially known AO buffer size (the get_space return value on
initialization, before any audio was played). We limit the maximum
amount of data written to the soft buffer so that soft buffer size and
audio buffer size together equal 200ms (MIN_BUFFER).
To avoid weird problems with weird AOs, we buffer beyond MIN_BUFFER if
the AO's get_space requests more data than that, and as long as the soft
buffer is large enough.
Note that this is just a hack to improve the latency. When the audio
chain gains the ability to refilter data, this won't be needed anymore,
and instead we can introduce some sort of buffer replacement function in
order to update data in the soft buffer.
It is possible to have ao->reset() called between ao->pause() and
ao->resume() when seeking during the pause. If the underlying PCM
supports pausing, resuming an already reset PCM will produce an error.
Avoid that by explicitly checking PCM state before calling
snd_pcm_pause().
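The check boils down to something like this (sketch, assuming p->alsa is the
opened snd_pcm_t handle):

// Only resume if the PCM is actually paused; calling snd_pcm_pause() on a
// PCM that was reset in the meantime would return an error.
if (snd_pcm_state(p->alsa) == SND_PCM_STATE_PAUSED)
    snd_pcm_pause(p->alsa, 0);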
Signed-off-by: wm4 <wm4@nowhere>
The uint64_t math would cause overflow at long enough system uptimes
(...such as 3 days), and any precision error given by the double math will
be under one millisecond.
One strange issue is that we apparently can't stop the audio API on
audio reset (ao_driver.reset). We could use SDL_PauseAudio, but that
doesn't specify whether remaining audio is dropped. We also could use
SDL_LockAudio, but holding that over a long time will probably be bad,
and it probably doesn't drop audio. This means we simply play silence
after a reset, instead of stopping the callback completely. (The
existing code ran into an underrun in this situation.)
The delay estimation works about the same. We simply assume that the
callback is locked to audio timing (like ao_jack), and that 1 callback
corresponds to 1 period. It seems this (removed) code fragment assumes
there is one period size of delay:
// delay subcomponent: remaining audio from the next played buffer, as
// provided by the callback
buffer_interval += callback_interval;
so we explicitly do that too.
Until now, this was always conflated with uninit. This was ugly, and
also many AOs emulated this manually (or just ignored it). Make draining
an explicit operation, so AOs which support it can provide it, and for
all others generic code will emulate it.
For ao_wasapi, we keep it simple and basically disable the internal
draining implementation (maybe it should be restored later).
Tested on Linux only.
Same deal as with the previous commit. We don't lose any functionality,
except for waiting "properly" on audio end, instead of waiting using the
delay estimate.
This removes the ringbuffer management from the code, and uses the
generic code added with the previous commit. The result should be
pretty much the same.
The "estimate" sub-option goes away. This estimation is now always
active. The new code for delay estimation is slightly different, and
follows the claim of the jack framework that callbacks are timed
exactly.
This has 2 goals:
- Ensure that AOs have always enough data, even if the device buffers
are very small.
- Reduce complexity in some AOs, which do their own buffering.
One disadvantage is that performance is slightly reduced due to more
copying.
Implementation-wise, we don't change ao.c much, and instead "redirect"
the driver's callback to an API wrapper in push.c.
Additionally, we add code for dealing with AOs that have a pull API.
These AOs usually do their own buffering (jack, coreaudio, portaudio),
and adding a thread is basically a waste. The code in pull.c manages
a ringbuffer, and allows callback-based AOs to read data directly.
Since the AO will run in a thread, and there's lots of shared state with
encoding, we have to add locking.
One case this doesn't handle correctly are the encode_lavc_available()
calls in ao_lavc.c and vo_lavc.c. They don't do much (and usually only
to protect against doing --ao=lavc with normal playback), and changing
it would be a bit messy. So just leave them.
We want to move the AO to its own thread. There's no technical reason
for making the ao struct opaque to do this. But it helps us sleep at
night, because we can control access to shared state better.
This field will be moved out of the ao struct. The encoding code was
basically using an invalid way of accessing this field.
Since the AO will be moved into its own thread too and will do its own
buffering, the AO and the playback core might not even agree which
sample a PTS timestamp belongs to. Add some extrapolation code to handle
this case.
Use QueryPerformanceCounter to improve the accuracy of
IAudioClock::GetPosition.
While this is mainly for "realtime correctness" (usually the delay is a
single sample or less), there are cases where IAudioClock::GetPosition
takes a long time to return from its call (though the documentation doesn't
define what a "long time" is), so correcting its value might be important in
case the documented possible delay happens.
The lack of device latency made get_delay report latencies shorter than
they should; on systems with fast enough drivers, the delay is not
perceptible, but high enough invisible delays would cause desyncs.
I'm not yet completely sure whether this is 100% accurate, there are
some issues involved when repeatedly pausing+unpausing (the delay might
jump around by several dozen milliseconds), but seeking seems to be
working correctly now.
The player didn't quit when the end of a file was reached. The reason
for this is that jack reported a constant audio delay even when all
audio was done playing. Whether that was recognized as EOF by the player
depended whether the exact value was higher or lower than the player's
threshold for what it considers no more audio.
get_delay() should return the amount of time it takes until the last sample
written to the audio buffer reaches the speaker. Therefore, we have to
track the estimated time when the last sample is done, and subtract it
from the calculated latency. Basically, the latency is the only amount
of time left in the delay, and it should go towards 0 as audio reaches
the speakers.
I'm not sure if this is correct, but at least it solves the problem. One
suspicious thing is that we use system time to estimate the end of the
audio time. Maybe using jack_frame_time() would be more correct. But
apart from this, there doesn't seem to be a better way to handle this.
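In rough pseudocode, the idea is something like this (names are hypothetical,
not the actual ao_jack.c code; end_time would be updated on every write with
the write time plus buffered samples plus port latency):

double delay = p->end_time - mp_time_sec();  // time until the last sample is audible
if (delay < 0)
    delay = 0;                               // everything played: lets EOF be detected
return delay;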
The step argument for "add volume <step>" was ignored until now. Fix it.
There is one problem: by default, "add volume" should use the value set
with --volstep. This value is 3 by default. Since the default value for
the step argument is always 1 (and we don't really want to make the
generic code more complicated by introducing custom step sizes), we
simply multiply the step argument with --volstep to keep it compatible.
The --volstep option should probably be just removed in the future.
Windows applications that use LoadLibrary are vulnerable to DLL
preloading attacks if a malicious DLL with the same name as a system DLL
is placed in the current directory. mpv had some code to avoid this in
ao_wasapi.c. This commit just moves it to main.c, since there's no
reason it can't be used process-wide.
This change can affect how plugins are loaded in AviSynth, but it
shouldn't be a problem since MPC-HC also does this and it's a very
popular AviSynth client.
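For reference, the usual mitigation is a single call early in the process,
roughly like this (standard Win32 approach; the exact code in main.c may
differ):

#include <windows.h>

static void protect_against_dll_preloading(void)
{
    // Removing the current directory from the default DLL search order
    // defeats the preloading attack described above.
    SetDllDirectoryW(L"");
}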
Balance controls as used by mixer.c was broken, because af_pan.c stopped
accepting its arguments. We have to allow 0 channels explicitly. Also,
fix null pointer access if the matrix parameter is not used.
Regression from commit 82983970.
Signed-off-by: wm4 <wm4@nowhere>
This merges pull request #496. The problem was that at least the
initialization of the distance[] array accessed af_fmtstr_table[]
entries that were out of bounds. Small cosmetic changes applied to
the original pull request.
1000ms is a bit insane. It makes behavior on playback speed changes
worse (because the player has to catch up the dropped audio due to
audio-chain reset), and perhaps makes seeking slower.
Note that the problem of playback speed changes misbehaving will be
fixed in the future, but even then we don't want to have a buffer that
large.
Always pass around mp_log contexts in the option parser code. This of
course affects all users of this API as well.
In stream.c, pass a mp_null_log, because we can't do it properly yet.
This will be fixed later.
Remove the nonsensical print_lock too.
Things that are called from the option validator are not converted yet,
because the option parser doesn't provide a log context yet.
This could output additional, potentially useful error messages. But the
callback is global and not library-safe, and would require us to add
additional state. Remove it, because it's obviously too much of a pain.
Also, it seems ALSA prints stuff to stderr anyway.
request_channels has been deprecated for years (request_channel_layout
is the replacement), but it appears it's still needed despite the
deprecation at least on older libavcodec versions.
So still set request_channels, but do it with the AVOption API, which
hides the deprecation warning. This should also prevent mpv getting
trashed when libavcodec happens to bump its major version.
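Sketch of what setting it through the AVOption API amounts to (assuming
request_channels is exposed as an AVOption on the codec context):

// Setting the option by name avoids touching the deprecated struct field
// directly, so there is no deprecation warning and no dependency on the
// field's position in AVCodecContext.
av_opt_set_int(avctx, "request_channels", 2, 0);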
In my opinion, config.h inclusions should be kept to a minimum. MPlayer
code really liked including config.h everywhere, though, even in often
used header files. Try to reduce this.
Since m_option.h and options.h are extremely often included, a lot of
files have to be changed.
Moving path.c/h to options/ is a bit questionable, but since this is
mainly about access to config files (which are also handled in
options/), it's probably ok.
The tmsg stuff was for the internal gettext() based translation system,
which nobody ever attempted to use and thus was removed. mp_gtext() and
set_osd_tmsg() were also for this.
mp_dbg was once enabled in debug mode only, but since we have log level
for enabling debug messages, it seems utterly useless.
The previous RING_BUFFER_COUNT value, 64, would have ao_wasapi buffer 64
frames of audio in the ring buffer; a delay of 1280ms, which is clearly
overkill for everything. A value of 8 buffers 8 frames for a total of
160ms.
When get_space was converted to returning samples instead of bytes, a
unit type mismatch in get_delay's calculation returned bogus values. Fix
by converting get_space's value back to bytes.
Fixes playback with ao_wasapi when reaching EOF, or seeking past it.
This can be reproduced with:
mpv short.wav -af 'lavfi="aecho=0.8:0.9:5000|6800:0.3|0.25"'
An audio file that is just 1-2 seconds long should play for 8-9 seconds,
with audible echo towards the end.
The code assumes that when playing with AF_FILTER_FLAG_EOF, the filter
will either produce output, or has all remaining data flushed. I'm not
really sure whether this really works if there are multiple filters with
EOF handling in the chain. To handle it correctly, af_lavfi should retry
filtering if 1. EOF flag is set, 2. there were input samples, and 3. no
output samples were produced. But currently it seems to work well enough
anyway.
The new signature is actually closer to how it actually works, and
someone who is not familiar with the API and how it works might make fewer
fatal mistakes with the new signature than the old one. Pretty weird.
Do this to sneak in a flags parameter, which will later be used to flush
remaining data of at least vf_lavfi.
This will make af_channels output a channel layout that is compatible
with any destination layout. Not sure if that's a good idea though,
since the way the AO chooses a layout is perhaps less predictable. On the
other hand, using the old MPlayer standard layouts doesn't make much
sense either. We'll see whether this improves or breaks someone's use
case.
Apparently this stopped working after some planar changes (broken format
negotiation). Radically change option parsing in an incompatible way.
Suggest alternatives to this filter, since it barely has any importance
anymore.
Normally, audio decoders don't have a decoder delay, so the code was
fine. But FFmpeg supports multithreaded decoding for some audio codecs,
which introduces such a delay.
The delay means that we won't get decoded audio for the first few
packets, and that we need to do something to get the trailing audio
still buffered in the decoder when reaching EOF.
Two changes are needed to deal with the delay:
- If EOF is reached, pass a "flush" packet to the decoder to return the
buffered audio. Such a flush packet is automatically set up when
calling mp_set_av_packet() with a NULL packet (see the sketch below).
- Use the PTS returned by the decoder, instead of the packet's. This is
important to get correct timestamps for decoded audio. Ignoring this
would result in offsetting the audio playback time by the decoder
delay. Note that we can still use the timestamp of the first packet
to get the timestamp for the start of the audio.
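A hedged sketch of the flush step from the first point above, using the
libavcodec API of that time (mp_set_av_packet() with a NULL packet builds an
equivalent flush packet):

AVPacket flush_pkt = { .data = NULL, .size = 0 };   // "flush" packet
int got_frame = 0;
// At EOF, keep feeding the flush packet until the decoder stops returning
// frames; use frame->pts rather than the packet's timestamp.
avcodec_decode_audio4(avctx, frame, &got_frame, &flush_pkt);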
If the timebase is set, it's used for converting the packet timestamps.
Otherwise, the previous method of reinterpret-casting the mpv style
double timestamps to libavcodec style int64_t timestamps is used.
Also replace the kind of awkward mp_get_av_frame_pkt_ts() function by
mp_pts_from_av(), which simply converts timestamps in a way the old
function did. (Plus it takes a timebase parameter, similar to the
addition to mp_set_av_packet().)
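With a timebase, the conversion is essentially a rescale to seconds; without
one, the old reinterpret-cast behavior is kept. Sketch (av_pts is the int64_t
timestamp, tb the AVRational timebase):

double pts = av_pts == AV_NOPTS_VALUE ? MP_NOPTS_VALUE
                                      : av_pts * av_q2d(*tb);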
Note that this should not change anything yet. The code in ad_lavc.c and
vd_lavc.c passes NULL for the timebase parameters. We could set
AVCodecContext.pkt_timebase and use that if we want to give libavcodec
"proper" timestamps.
This could be important for ad_lavc.c: some codecs (opus, probably mp3
and aac too) have weird requirements about doing decoding preroll on the
container level, and thus require adjusting the audio start timestamps
in some cases. libavcodec doesn't tell us how much was skipped, so we
either get shifted timestamps (by the length of the skipped data), or we
give it proper timestamps. (Note: libavcodec interprets or changes
timestamps only if pkt_timebase is set, which by default it is not.)
This would require selecting a timebase though, so I feel uncomfortable
with the idea. At least this change paves the way, and will allow some
testing.
There are some use cases for this. For example, you can use it to set
defaults of automatically inserted filters (like af_lavrresample). It's
also useful if you have a non-trivial VO configuration, and want to use
--vo to quickly change between the drivers without repeating the whole
configuration in the --vo argument.
This is needed so that new processes (created with fork+exec) don't
inherit open files, which can be important for a number of reasons.
Since O_CLOEXEC is relatively new (POSIX.1-2008, before that Linux
specific), we #define it to 0 in io.h to prevent compilation errors on
older/crappy systems. At least this is the plan.
input.c creates a pipe. For that, add a mp_set_cloexec() function (which
is based on Weston's code in vo_wayland.c, but more correct). We could
use pipe2() instead, but that is Linux specific. Technically, we have a
race condition, but it won't matter.
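The fcntl()-based route such a helper takes looks roughly like this (sketch;
the real mp_set_cloexec() also deals with error cases):

#include <fcntl.h>

static void set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags >= 0)
        fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
}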
This partially reverts commit 7d152965. It turns out that at least some
ALSA drivers (at least snd-hda-intel) report incorrect audio delay with
non-native sample rates, even if the sample rate is only very slightly
different from the native one.
For example, 48000Hz is fine on my hda-intel system, while both 8000Hz
and 47999Hz lead to a delay off by 40ms (according to mpv's A/V
difference display), which suggests that something in ALSA is
calculating the delay using the wrong sample rate.
As an additional problem, with ALSA resampling enabled, using
48001Hz/float/2ch fails, while 49000Hz/float/2ch or 48001Hz/s16/2ch
work. With resampling disabled, all these cases obviously work, because
our own resampler simply doesn't refuse any of these formats.
Since some people want to use the ALSA resampler (because it's highly
configurable, supports multiple backends, etc.), we still allow enabling
ALSA resampling with an ao_alsa suboption.
Previous code was using the values of the AudioChannelLabel enum directly to
create the channel bitmap. While this was quite smart it was pretty unreadable
and fragile (what if Apple changes the values of those enums?).
Change it to use a 'dumb' conversion table.
These used the suffix _resync_stream, which is a bit misleading. Nothing
gets "resynchronized", they really just reset state.
(Some audio decoders actually used to "resync" by reading packets for
resuming playback, but that's not the case anymore.)
Also move the function in dec_video.c to the top of the file.
The code stopped at kAudioChannelLabel_TopBackRight and missed mapping for
5 more channel labels. These are in a completely different order than the mpv
ones so they must be mapped manually.
This includes the case when lavc decodes audio with more than 8
channels, which our audio chain currently does not support.
The changes in ad_lavc.c are just simplifications. The code tried to
avoid overriding global parameters if it found something invalid, but
that is not needed anymore.
Resampling with non-ancient ALSA setups works fine, so there is no
need to keep this around. Furthermore, as of writing, the default
builtin resampler used by many ALSA setups (taken from libspeex)
actually has higher quality than the default resampling modes of
avresample and swresample.
Apparently just 5 packets is not enough for the initial audio decode
(which is needed to find the format). The old code (before the recent
refactor) appeared to use 5 packets, but there were apparently other
code paths which in the end amounted to more than 5 packets being read.
The sample that failed (see github issue #368) needed 9 packets.
Fixes #368.
This used to be needed to access the generic stream header from the
specific headers, which in turn was needed because the decoders had
access only to the specific headers. This is not the case anymore, so
this can finally be removed again.
Also move the "format" field from the specific headers to sh_stream.
sh_audio is supposed to contain file headers, not whatever was decoded.
Fix this, and write the decoded format to separate fields in the decoder
context, the dec_audio.decoded field. (Note that this field is really
only needed to communicate the audio format from decoder driver to the
generic code, so no other code accesses it.)
Move all state that basically changes during decoding or is needed in
order to manage decoding itself into a new struct (dec_audio).
sh_audio (defined in stheader.h) is supposed to be the audio stream
header. This should reflect the file headers for the stream. Putting the
decoder context there is strange design, to say the least.
The AF control commands used an elaborate and unnecessary organization
for the command constants. Get rid of all that and convert the
definitions to a simple enum. Also remove the control commands that
were not really needed, because they were not used outside of the
filters that implemented them.
And by "cleanup", I mean "remove". Actually, only remove the parts that
are redundant and doxygen noise. Move useful parts to the comment above
the function's implementation in the C source file.
When the decoder detects a format change, it overwrites the values
stored in sh_audio (this affects the members sample_format, samplerate,
channels). In the case when the old audio data still needs to be
played/filtered, the audio format as identified by sh_audio and the
format used for the decoder buffer can mismatch. In particular, they
will mismatch in the very unlikely but possible case the audio chain is
reinitialized while old data is draining during a format change.
Or in other words, sh_audio might contain the new format, while the
audio chain is still configured to use the old format.
Currently, the audio code (player/audio.c and init_audio_filters) access
sh_audio to get the current format. This is in theory incorrect for the
reasons mentioned above. Use the decoder buffer's format instead, which
should be correct at any point.
Commit 22b3f522 not only redid major aspects of audio decoding, but also
attempted to fix audio format change handling. Before that commit, data
that was already decoded but not yet filtered was thrown away on a
format change. After that commit, data was supposed to finish playing
before rebuilding filters and so on.
It was still buggy, though: the decoder buffer was initialized to the
new format too early, triggering an assertion failure. Move the reinit
call below filtering to fix this.
ad_mpg123.c needs to be adjusted so that it doesn't decode new data
before the format change is actually executed.
Add some more assertions to af_play() (audio filtering) to make sure
input data and configured format don't mismatch. This will also catch
filters which don't set the format on their output data correctly.
Regression due to planar_audio branch.
Simulate proper handling of AOPLAY_FINAL_CHUNK. Print when underruns
occur (i.e. running out of data). Add some options that control
simulated buffer and outburst sizes.
All this is useful for debugging and self-documentation. (Note that
ao_null always was supposed to simulate an ideal AO, which is the reason
why it fools people who try to use it for benchmarking video.)
This should allow it to select better fallback formats, instead of
picking the first encoder sample format if ao->format is not equal to
any of the encoder sample formats.
Not sure what is supposed to happen if the encoder provides no
compatible sample format (or no sample format list at all), but in this
case ao_lavc.c still fails gracefully.
The added function af_format_conversion_score() can be used to select
the best sample format to convert to in order to reduce loss and extra
conversion work.
It calculates a "loss" score when going from one format to another, and
for each conversion that needs to be done a certain score is subtracted.
Thus, if you have to convert from one format to a set of other formats,
you can calculate the score for each conversion, and pick the one with
the highest score.
Conversion between int and float is considered the worst case. One odd
consequence is that when converting from s32 to u8 or float, u8 will be
picked.
Test program used to develop this follows:
#define MAX_FMT 200

struct entry {
    const char *name;
    int score;
};

static int compentry(const void *px1, const void *px2)
{
    const struct entry *x1 = px1;
    const struct entry *x2 = px2;
    if (x1->score > x2->score)
        return 1;
    if (x1->score < x2->score)
        return -1;
    return 0;
}

int main(int argc, char *argv[])
{
    for (int n = 0; af_fmtstr_table[n].name; n++) {
        struct entry entry[MAX_FMT];
        int entries = 0;
        for (int i = 0; af_fmtstr_table[i].name; i++) {
            assert(i < MAX_FMT);
            entry[entries].name = af_fmtstr_table[i].name;
            entry[entries].score =
                af_format_conversion_score(af_fmtstr_table[i].format,
                                           af_fmtstr_table[n].format);
            entries++;
        }
        qsort(&entry[0], entries, sizeof(entry[0]), compentry);
        for (int i = 0; i < entries; i++) {
            printf("%s -> %s: %d \n", af_fmtstr_table[n].name,
                   entry[i].name, entry[i].score);
        }
    }
}
These must be written even if there was no "final frame", e.g. due to
the player being exited with "q".
Although the issue is mostly of theoretical nature, as most audio codecs
don't need the final encoding calls with NULL data. Maybe will be more
relevant in the future.
This used to be in bytes, now it's in samples. Divide the value by 8
(assuming a typical audio format, float samples with 2 channels).
Fix some editing mistake or nonsense about the extra buffering added
(1<<x instead of x<<5).
Also sneak in a s/MPlayer/mpv/.
This changes option parsing as well as filter defaults slightly. The
default is now to encode to spdif (this is way more useful than writing
raw AC3 - what was this even useful for, other than writing broken
ac3-in-wav files?). The bitrate parameter is now always in kbps.
Apparently this was completely broken after commit 22b3f522. Basically,
this locked up immediately completely while decoding the first packet.
The reason was that the buffer calculations confused bytes and number of
samples. Also, EOF reporting was broken (wrong return code).
The special-casing of ad_mpg123 and ad_spdif (with DECODE_MAX_UNIT) is a
bit annoying, but will eventually be solved in a better way.
ao_null should simulate a "perfect" AO, but framestepping behaved quite
badly with it. Framestepping usually exposes problems with AOs dropping
their buffers on pause, and that's what happened here.
Most libavcodec decoders output non-interleaved audio. Add direct
support for this, and remove the hack that repacked non-interleaved
audio back to packed audio.
Remove the minlen argument from the decoder callback. Instead of
forcing every decoder to have its own decode loop to fill the buffer
until minlen is reached, leave this to the caller. So if a decoder
doesn't return enough data, it's simply called again. (In future, I
even want to change it so that decoders don't read packets directly,
but instead the caller has to pass packets to the decoders. This fits
well with this change, because now the decoder callback typically
decodes at most one packet.)
ad_mpg123.c receives some heavy refactoring. The main problem is that
it wanted to handle format changes when there was no data in the decode
output buffer yet. This sounds reasonable, but actually it would write
data into a buffer prepared for old data, since the caller doesn't know
about the format change yet. (I.e. the best place for a format change
would be _after_ writing the last sample to the output buffer.) It's
possible that this code was not perfectly sane before this commit,
and perhaps lost one frame of data after a format change, but I didn't
confirm this. Trying to fix this, I ended up rewriting the decoding
and also the probing.
libav* is generally freaking horrible, and might do bad things if the
data pointer passed to it are not aligned. One way to be sure that the
alignment is correct is allocating all pointers using av_malloc().
It's possible that this is not needed at all, though. For now it might
be better to keep this, since the mp_audio code is intended to replace
another buffer in dec_audio.c, which is currently av_malloc() allocated.
The original reason why this uses av_malloc() is apparently because
libavcodec used to directly encode into mplayer buffers, which is not
the case anymore, and thus (probably) doesn't make sense anymore.
(The commit subject uses the word "cargo cult", after all.)
Remove the awkward planarization. It had to be done because the AC3
encoder requires planar formats, but now we support them natively.
Try to simplify buffer management with mp_audio_buffer.
Improve checking for buffer overflows and out of bound writes. In
theory, these shouldn't happen due to AC3 fixed frame sizes, but being
paranoid is better.
The format negotiation is the same, except don't confusingly copy the
input format into af->data, just to overwrite it later. af->data should
always contain the output format, and the existing code was just a very
misguided use of the af_test_output() helper function.
Just set af->data to the output format immediately, and modify the input
format properly.
Also, if format negotiation fails (and needs another iteration), don't
initialize the libavcodec encoder.
Before this commit, the af_instance->mul/delay values were in bytes.
Using bytes is confusing for non-interleaved audio, so switch mul to
samples, and delay to seconds. For delay, seconds are more intuitive
than bytes or samples, because it's used for the latency calculation.
We also might want to replace the delay mechanism with real PTS
tracking inside the filter chain some time in the future, and PTS
will also require time-adjustments to be done in seconds.
For most filters, we just remove the redundant mul=1 initialization.
(Setting this used to be required, but not anymore.)
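For example, a filter that outputs twice as many samples as it gets and adds
50ms of latency would now set (sketch; fields as described above):

af->mul   = 2;      // output samples per input sample (previously in bytes)
af->delay = 0.050;  // added latency in seconds (previously in bytes)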
Since ao_openal simulates multi-channel audio by placing a bunch of
mono-sources in 3D space, non-interleaved audio is a perfect match for
it. We just have to remove the interleaving code.