Relative seeks backwards with external audio tracks do not always work
well: video tends to seek back further than audio, so audio remains
silent until the audio's after-seek position is reached. This happens
because we strictly seek both the video and audio demuxers to the
approximate desired target PTS, and then start decoding from that.
Commit 81358380 removes an older method that was supposed to deal with
this. It was sort of bad, because it could lead to the playback core
freezing by waiting on network.
Ideally, the demuxer layer would probably somehow deal with such seeks,
and do them in a way the audio is seeked after video. Currently this is
infeasible, because the demuxer layer assumes a single demuxer, and
external tracks simply use separate demuxer layers. (MPlayer actually
had a pseudo-demuxer that joined external tracks into a single demuxer,
but this is not flexible enough - and also, the demuxer layer as it
currently exists can't deal with dynamically removing external tracks
either. Maybe some time in the future.)
Instead, add a gross hack, that essentially reseeks the audio if it
detects that it's too far off. The result is actually not too bad,
because we can reuse the mechanism that is used for instant track
switching. This way we can make sure of the right position, without
having to care about certain other issues.
It should be noted that if the audio demuxer is used for other tracks
too, and the demuxer does not support refresh seeking, audio will
probably be off by an even higher amount. But this should be rare.
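A minimal sketch of the "too far off" check, written in Python rather
than the player's C. The function name and the threshold are
hypothetical (the message does not state the real tolerance), and the
actual reseek via the track-switching mechanism is not modeled.

```python
# Hypothetical tolerance in seconds; the real value is not given above.
MAX_AUDIO_LEAD = 0.5


def needs_audio_reseek(video_pts, audio_pts, max_lead=MAX_AUDIO_LEAD):
    """Return True if audio landed so far after video that it should be reseeked."""
    if video_pts is None or audio_pts is None:
        # One of the positions is not known yet, so there is nothing to decide.
        return False
    return audio_pts - video_pts > max_lead
```

With these assumed values, a seek that leaves video at 10.0 and audio at
12.0 would trigger a reseek, while audio at 10.2 would not.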
This code is for resyncing audio-only streams (e.g. switching between
audio tracks if no video track is active). This must not be run if the
video PTS just isn't known yet. (Although the case in which this changes
anything is probably very obscure, if it can even happen. Still, it's a
bit more correct.)
This is a correction to commit 91a3bda6.
If an audio track is enabled during playback, then make it resume at the
exact "current position", instead of playing audio before that position.
This was already done for video.
This commit adds an --audio-channels=auto-safe mode, and makes it the
default. This mode behaves like "auto" with most AOs, except with
ao_alsa. The intention is to allow multichannel output by default on
sane APIs. ALSA is not sane as in it's so low level that it will e.g.
configure any layout over HDMI, even if the connected A/V receiver does
not support it. The HDMI fuckup is of course not ALSA's fault, but other
audio APIs normally isolate applications from dealing with this and
require the user to globally configure the correct output layout.
This will help with other AOs too. ao_lavc (encoding) is changed to the
new semantics as well, because it used to force stereo (perhaps because
encoding mode is supposed to produce safe files for crap devices?).
Exclusive mode output on Windows might need to be adjusted accordingly,
as it grants the same kind of low level access as ALSA (requires more
research).
In addition to the things mentioned above, the --audio-channels option
is extended to accept a set of channel layouts. This is supposed to be
the correct way to configure mpv ALSA multichannel output. You need to
put a list of channel layouts that your A/V receiver supports.
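To illustrate the idea of a layout set, here is a small Python sketch
that picks an output layout from a user-supplied list. The
comma-separated form and the "fall back to the first entry" rule are
assumptions made for this example; the real AO channel map negotiation
is more involved.

```python
def pick_layout(allowed, native):
    """Pick an output layout from the allowed list for audio in layout `native`.

    allowed: comma-separated layout names (assumed syntax for this sketch).
    """
    layouts = [s.strip() for s in allowed.split(",") if s.strip()]
    if not layouts:
        raise ValueError("empty layout list")
    if native in layouts:
        return native  # the receiver supports the file's layout directly
    # Assumed fallback for illustration only: use the first listed layout.
    return layouts[0]
```

For example, with an allowed list of "7.1,5.1,stereo", a 5.1 file would
be output as 5.1 in this model.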
Pointless anyway. With superficial checking I couldn't find any decoder
which actually outputs this, and AO chmap negotiation would properly
ignore them anyway in most cases.
mixer.c didn't really deserve to be separate anymore, as half of its
contents were unnecessary glue code after recent changes. It also
created a weird split between audio.c and af.c due to the fact that
mixer.c could insert audio filters. With the code being in audio.c
directly, together with other code that inserts filters during runtime,
it will be possible to clean up this code a bit and make it work like the
video filter code.
As part of this change, make the balance code work like the volume code,
and add an option to back the current balance value. Also, since the
balance semantics are unexpected for most users (panning between the
audio channels, instead of just changing the relative volume), and there
are some other volumes, formally deprecate both the old property and the
new option.
Drop the code for switching the volume options and properties between
af_volume and AO volume controls. interface-changes.rst mentions the
changes in detail.
Do this because this was exceedingly complex and had other problems as
well. It was also very hard to test. It's just not worth the trouble.
Some leftovers like AOCONTROL_HAS_PER_APP_VOLUME will be removed at a
later point.
Fixes #3322.
The check whether video is ready yet was done only in STATUS_FILLING.
But it also switched to STATUS_READY, which means the next time
fill_audio_out_buffers() was called, audio would actually be started
before video.
In most situations, this bug didn't show up, because it was only
triggered if the demuxer didn't provide video packets quickly enough,
but did for audio packets.
Also log when audio is started.
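A simplified Python model of the state transitions involved. The status
names come from the message; the transition logic is only a guess at the
shape of the fix (checking video readiness in STATUS_READY as well), not
the actual fill_audio_out_buffers() code.

```python
STATUS_SYNCING, STATUS_FILLING, STATUS_READY, STATUS_PLAYING = range(4)


def step(status, video_ready):
    """Run one simplified iteration and return the new audio status."""
    if status == STATUS_FILLING:
        # Enough audio is buffered; audio is ready but not started yet.
        return STATUS_READY
    if status == STATUS_READY:
        # Assumed fix: keep waiting here until video is ready too, so
        # audio is never started before video.
        return STATUS_PLAYING if video_ready else STATUS_READY
    return status
```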
(I hate fill_audio_out_buffers(), why did I write it?)
Unfortunately I see no better solution.
The refresh seek is skipped if the amount of buffered audio is not
overly huge.
Unfortunately softvol af_volume insertion still can cause this issue,
because it's outside of the normal dynamic filter chain changing code.
Move the video refresh call to reinit_video_filters() to make it more
uniform along with the audio code.
See --lavfi-complex option.
This is still quite rough. There's no support for dynamic configuration
of any kind. There are probably corner cases where playback might freeze
or burn 100% CPU (due to dataflow problems when interacting with
libavfilter).
Future possible plans might include:
- freely switch tracks by providing some sort of default track graph
label
- automatically enabling audio visualization
- automatically mix audio or stack video when multiple tracks are
selected at once (similar to how multiple sub tracks can be selected)
Will be helpful for the coming filter support. I planned on merging
audio/video decoding, but this will have to wait a bit longer, so only
remove the duplicate status codes.
Let's fix broken samples with a questionable heuristic without real
reasoning. Until this gets fixed properly, this is a good compromise,
though. A proper fix would properly resync audio and video without
brutally resetting the decoders, but on the other hand not doing the
brutal reset would cause issues in other obscure corner cases that such
resyncing might cause.
This code is tricky because it has to wake up the mainloop to make
progress while syncing audio, but also has to avoid waking it up when
it's not needed. Failure to do so either burns CPU by never going to
sleep, or causes apparent "freezes" by going to sleep (playback then
only continues once the mainloop is woken up, e.g. due to user input).
In this case, simply starting A/V playback with --start=5 and removing
an unrelated wakeup in osd.c can trigger such a "freeze". The unrelated
wakeup did hide this bug, nonetheless it's a bug.
(Can't wait to rewrite this shitty audio resync code. And it's all my
fault.)
These changes don't make too much sense without context, but are
preparation for later. Then the audio_src/video_src fields will
actually be NULL under some circumstances.
Before this commit, reinit_audio_chain() did 2 things: create all the
management data structures and initialize the decoder, and handle lazy
filter/output init (as well as dealing with format changes). For the
second purpose, it could be called multiple times (even though it wasn't
really idempotent). This was pretty weird, so make them separate
functions. The new function is actually idempotent too.
It also turns out the reinit functions don't have to call themselves
recursively for the spdif PCM fallback.
Regression caused by commit 3b95dd47. Also see commit 4c25b000. We can
either use video_next_pts and add "delay", or we just use video_pts. Any
other combination breaks. The assumption that delay==0 at this point
was wrong precisely because after displaying the first video
frame (usually done before audio resync) a new frame might be "added"
immediately, resulting in a new video_next_pts and "delay", which will
still amount to video_pts.
Fixes #2770. (The reason why display-sync was blamed in this issue is
because enabling display-sync in the options forces a prefetch by 2
instead of 1 frames for seeks/playback restart, which triggers the
issue, even if display-sync is not actually enabled. In this case,
display-sync is never enabled because the frames have an unusually high
frame duration. This is also what exposed the initial desync issue.)
With the format left untouched, this would just try to reinit with a
spdif format again.
We're not clearing the format in reset_audio_state() so the audio chain
can be recreated any time without having to wait for a frame to be
decoded.
It doesn't need to be part of the big context, but is strictly part of
shuffling data from the audio filters to audio output, and thus belongs
into ao_chain.
It also turns out that clearing it in clear_audio_output_buffers() is
completely redundant.
(Of course ao_buffer is an abomination in the first place and shouldn't
exist at all.)
Similar to the video path. dec_audio.c now handles decoding only. It
also looks very similar to dec_video.c, and actually contains some of
the rewritten code from it. (A further goal might be unifying the
decoders, I guess.)
High potential for regressions.
Seems useless.
This only helped in one case: one audio stream in the sample
av_find_best_stream_fails.ts had AC3 packets which couldn't be
decoded, and for which avcodec_decode_audio4() returned 0 forever. In
this specific case, playback will now not start, and you have to
deselect audio manually.
(If someone complains, the old behavior might be restored, but
differently.)
Also remove the stale "bitrate" field.
Eventually we want the VO to be driven by an A->V filter, so a decoder
doesn't even have to exist. Some features definitely require a decoder
though (like reporting the decoder in use, hardware decoding, etc.), so
for each thing which accessed d_video, it has to be redecided if and how
it can access decoder state.
At least the "framedrop" property slightly changes semantics: you can
now always set this property, even if no video is active.
Some untested changes in this commit, but our bio-based distributed
test suite has to take care of this.
This is mainly a refactor. I'm hoping it will make some things easier
in the future due to cleanly separating codec metadata and stream
metadata.
Also, declare that the "codec" field can not be NULL anymore. demux.c
will set it to "" if it's NULL when added. This gets rid of a corner
case everything had to handle, but which rarely happened.
This is another attempt at making files with sparse video frames work
better.
The problem is that you generally can't know whether a jump in video
timestamps is just a (very) long video frame, or a timestamp reset. Due
to the existence of files with sparse video frames (new frame only every
few seconds or longer), every heuristic will be arbitrary (in general,
at least).
But we can use the fact that if video is continuous, audio should also
be continuous. Audio discontinuities can be easily detected, and if that
happens, reset some of the playback state.
The way the playback state is reset is rather radical (resets decoders
as well), but it's just better not to cause too much obscure stuff to
happen here. If the A/V sync code were to be rewritten, it should
probably strictly use PTS values (not this strange time_frame/delay
stuff), which would make it much easier to detect such situations and
to react to them.
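A minimal Python sketch of detecting an audio discontinuity by comparing
the expected next timestamp with the actual one. The function name and
the tolerance are hypothetical; the message does not say which threshold
the player uses.

```python
def is_discontinuity(prev_pts, prev_duration, next_pts, tolerance=0.1):
    """Return True if next_pts does not follow the previous audio frame.

    tolerance (seconds) is an assumed value for illustration.
    """
    expected = prev_pts + prev_duration
    # Both forward jumps and backward resets count as discontinuities.
    return abs(next_pts - expected) > tolerance
```

Continuous audio (a frame at 1.0 lasting 0.02, followed by 1.02) passes,
while a jump to 50.0 or a reset to 0.0 is flagged.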
Use the demux_set_ts_offset() added in the previous commit to base each
timeline segment to use timestamps according to its relative position
within the overall timeline. As a consequence we don't need to care
about these timestamps anymore, and everything becomes simpler.
(Another minor but delicious nugget of sanity.)
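The rebasing can be pictured with a short Python sketch. The helper
names are hypothetical and this does not reproduce the signature of
demux_set_ts_offset(); it only shows the arithmetic of mapping a
segment's source timestamps onto the overall timeline.

```python
def segment_offset(timeline_start, source_start):
    """Offset that moves a segment's source timestamps onto the timeline."""
    return timeline_start - source_start


def to_timeline(source_pts, offset):
    """Timestamp as seen by the player after the offset is applied."""
    return source_pts + offset
```

For a segment that starts at 10.0 in its source file but is placed at
60.0 on the timeline, the offset is 50.0, and a source PTS of 12.5 maps
to 62.5.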
When the audio format is not known yet and the audio chain is still
initializing, filter reinit will fail. Normally, attempts to
reinitialize filters at this stage should be rare (e.g. user commands
editing the filter chain). But it sometimes happened with track
switching in combination with the video code calling
update_playback_speed() at arbitrary times.
Get rid of the message by not trying to change the filters for the sake
of playback speed update while decoding is still being initialized.
Actually, it didn't really require that before (most work was avoided),
but some bits had to be run anyway. Separate the speed change into a
light-weight function, which merely updates already created filters, and
a heavy-weight one which messes with filter insertion.
This also happens to fix the case where the filters would "forget" the
current speed (force resampling, change speed, hit a volume control to
force af_volume insertion - it will reset speed and desync).
Since we now always run the light-weight function, remove the
af_scaletempo verbose message that is printed on speed setting. Other
than that, all setters are cheap.
We still have a sample-based buffer between filters and audio outputs.
In order to avoid cutting frames in half (which can upset receivers),
we strictly need to align the boundaries on which we cut the audio.
Discontinuities (like toggling fullscreen) can cause multiple frames to
be dropped in succession, which sounds very weird. It's better to drop
some video frames instead to compensate for larger desyncs.
We roughly base it on the maximum allowed speed changes (audio change is
"additional" to the video change to account for deviations when playing
at max. video speed change).
It's not needed, because the additional data is not appended, but is the
total size of the audio buffer. The maximum size is the static audio
drop size (or twice, if the audio is duplicated).
The previous commit handled not falling back to normal decoding if the
AO was reloaded (I think...), and this tries to re-engage spdif
passthrough if it was previously falling back to normal decoding (e.g.
because it temporarily switched to an audio device incapable of
passthrough).
The manpage entry explains this.
(Maybe this option could be always enabled and removed. I don't quite
remember what valid use-cases there are for just disabling audio
entirely, other than that this is also needed for audio decoder init
failure.)
This should avoid unnecessary sleeping when audio playback start resync
has finished and goes into the normal playback state.
This is tricky; see e.g. commit 402fe381.
For video sync, we want separate playback speed controls for user-
requested speed and the "correction" speed for video timing. Further, we
use this separation to make sure only a resampler is inserted if
playback speed is only changed for video sync correction.
As of this commit, this is basically inactive code. It's just
preparation for the video sync code (the following commit).
Commit c5818046 fixed one case of audio EOF handling, and caused a new
one. This time, the ao_buffer doesn't actually contain everything that
should be played - because if --end is used, only a part of it is
played. Of course this is stupid, and it will be changed later. For now,
this smaller change fixes the bug.
Fixes #2189.
time_frame is when the next video frame should be shown. It's normally
overwritten by the video timing code. This also says something about
"nosound mode" (--no-audio today), but at least these days we don't use
it at all if video is disabled.
Remove it; it likely has no function at all.
In paused mode, we never entered the audio EOF state. This shows e.g. in
--keep-open mode, which will not set the eof-reached property correctly.
Regression since commit c06cd1b9. This commit was the wrong fix. We need
to respect the buffer state, and pausing has nothing to do with this.
Fixes #2167.
Replace all the check macros with function calls. Give them all the
same case and naming schema.
Drop af_fmt2bits(). Only af_fmt2bps() survives as af_fmt_to_bytes().
Introduce af_fmt_is_pcm(), and use it in situations that used
!AF_FORMAT_IS_SPECIAL. Nobody really knew what a "special" format
was. It simply meant "not PCM".
This provides a new method for enabling spdif passthrough. The old
method via --ad (--ad=spdif:ac3 etc.) is deprecated. The deprecated
method will probably stop working at some point.
This also supports PCM fallback. One caveat is that it will lose at
least 1 audio packet in doing so. (I don't care enough to prevent this.)
(This is named after the old S/PDIF connector, because it uses the same
underlying technology as far as the higher level protocol is concerned.
Also, the user should be renamed that passthrough is backwards.)
This makes no sense, because the format can't be converted anyway. It
just sets up the filter chain init code, which will vomit a bunch of
useless and confusing messages. So uninit and fail explicitly when this
happens.
When starting in paused mode, no audio is written to the device at all,
because writing audio implicitly unpauses the AO. If the file is very
small, and all audio fits within the AO buffer, this accidentally
triggered the EOF condition. (In unpaused mode, it would write all
audio, end playback, and then wait until the AO has everything played.)
Commit 10915000 attempted to fix wasting CPU when resyncing and no new
data was actually coming from the demuxer. The fix assumed that at this
point it would have reached the sync point, but since the code attempts
weird incremental decoding, this wasn't actually true. So it broke
seeking in addition to removing the CPU waste.
Try something else. This time, we essentially only wakeup again if
data was read (i.e. audio_decode() returned successfully).
This code path happens during seeking. If video is still being decoded
to get to the first video frame, audio has nothing to do, as it is
synchronized against the first video frame. We only want to wake up if
there's an actual state change.
Fixes #1958.
The af_add() function has a problem: if the inserted filter returns
AF_DETACH during init, the function will have a dangling pointer. Until
now this was avoided by making sure none of the used filters actually
return AF_DETACH, but it's getting infeasible.
Solve this by requiring passing a unique label to af_add(), which is
then used instead of the pointer.
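A toy Python model of why a label is safer than a returned reference.
The class and its methods are hypothetical and do not mirror af_add();
the point is only that a lookup by label can report "gone" instead of
handing back a stale reference when a filter detaches during init.

```python
class FilterChain:
    """Toy filter chain that refers to filters by unique label."""

    def __init__(self):
        self.filters = {}  # label -> filter name

    def add(self, label, name, detaches=False):
        """Add a filter under a unique label; return the label, not a reference."""
        if label in self.filters:
            raise ValueError("label already in use: " + label)
        if not detaches:
            # A detaching filter removes itself during init, so it is
            # never stored in the chain.
            self.filters[label] = name
        return label

    def find(self, label):
        """Return the filter name, or None if the filter detached."""
        return self.filters.get(label)
```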
Only reinit filters if it's actually needed. This is also slightly
easier to understand: if you look at the code, it should now be more
obvious why a reinit is needed (hopefully).
Precise seeking requires skipping audio, since the demuxer usually
doesn't seek precisely enough. There is a sanity check that prevents
skipping more than 300 seconds of audio. This still fails with very
large mp3s. For example, with a 1GB sized mp3 with Xing headers, entries
will be 4 MB apart on average, and occasionally much more.
Just bump the limit. I'm not even sure why it was added in the first
place; I suppose it's most important for files with real PTS resets.
When playback is started after seeking or opening a file, we need to
make sure audio and video line up exactly. This is done by cutting or
padding the audio stream to start on the video PTS.
This does not quite work with spdif: audio is compressed data, within a
spdif frame. There is no way to cut the audio "in between" the frames.
Cutting between the frames would just produce broken spdif packets, and
who knows how receivers will react to this (play noise?). But we still
can cut it on frame boundaries.
Unfortunately, we also insert 0 data for "silence" - we probably
shouldn't do this. Chances are the receiver will switch to PCM or so.
But for now this will have to do.
Note that this could be simplified somewhat, as soon as we work with
frames. See previous commit.
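The cutting and padding arithmetic can be sketched in Python as follows.
The function name and the frame_size parameter are hypothetical; rounding
down to whole frames is one plausible reading of "cut on frame
boundaries", not necessarily what the player does.

```python
def sync_adjustment(audio_pts, video_pts, samplerate, frame_size=1):
    """Return (skip, pad) in samples so that audio starts on video_pts.

    frame_size > 1 models spdif, where cuts must fall on frame boundaries;
    frame_size == 1 models plain PCM.
    """
    diff = round((video_pts - audio_pts) * samplerate)
    if diff >= 0:
        # Audio starts too early: drop samples, rounded down to whole frames.
        return (diff - diff % frame_size, 0)
    # Audio starts too late: pad with silence, again in whole frames.
    need = -diff
    return (0, need - need % frame_size)
```

With PCM at 48000 Hz, audio starting 0.5 seconds before video skips
24000 samples. With an assumed spdif frame of 1536 samples, only 23040
samples (15 whole frames) are skipped.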
Handle the failure gracefully, instead of exploding and disabling audio.
Just set the speed back to 1.0.
Also remove the AF_DETACH from af_scaletempo. This actually created a
dangling pointer in af_add(), a tricky consequence of af_add()
reconfiguring the filter chain and the newly added filter using
AF_DETACH. Fortunately the AF_DETACH is not needed (and probably never
worked - it comes from MPlayer times, and MPlayer also disables audio
when trying to change speed with spdif).
Always use af_scaletempo if it's inserted, even if the option
--audio-pitch-correction=no is set.
Make sure all filters are reset on speed change. It's conceivable that
dynamic changes to the filter chain at runtime leave filters around
without resetting their speed parameters.
Also move the code to a separate function.
If the audio decoder was created, but no audio filter chain created yet
(still trying to decode a first audio frame), setting the "speed"
property could explode. It tried to recreate the filter chain, even
though no format was set yet.
This is inconvenient and should not happen.
Although the libraries we use for resampling (libavresample and
libswresample) do not support changing samplerate on the fly, this makes
it easier to make sure no audio buffers are implicitly dropped. In fact,
this commit adds additional code to drain the resampler explicitly.
Changing speed twice without feeding audio in-between made it crash
with libavresample in certain cases (libswresample is fine). This is
probably a libavresample bug. Hopefully this will be fixed, and also I
attempted to work around the situation that crashes it. (It seems to
point in direction of random memory corruption, though.)
In my opinion the artifacts created by af_scaletempo on extreme slowdown
(50% or so) are too bothersome - but users disagree. So use
af_scaletempo on any speed changes, not just on speedup.
This avoids potentially dropping some small amount of audio data
buffered in filters.
Reinit can be skipped only if the filter is af_scaletempo (which maps to
AF_CONTROL_SET_PLAYBACK_SPEED). The other case using af_lavrresample is
much more complicated due to filter chain politics.
Also, changing speed between 1.0 and something higher typically inserts
or removes the filter, so this obviously requires reinitialization. It
can be prevented by forcing the filter with --af=scaletempo.
I guess this was supposed to be some sort of optimization, but even
though it probably works, it's pretty meaningless and I couldn't measure
a difference. One special case killed.
mpctx->audio_delay always has the same value as opts->audio_delay. (This
was not the case a long time ago, when the audio-delay property didn't
actually write to opts->audio_delay. I think.)
Some files can have audio after video has ended, and playback of the
audio-only remainder is supposed to work just fine.
Seeking is broken-ish though. Not much can be done about this, since
it's the way demuxers work. Also, such files are obscure corner cases.
But enabling hr-seek for audio after video end can improve the situation
a lot.
This helps with issue #1533. The reporter also provided a command line
to produce such a file:
ffmpeg -i image.jpg -i audio.flac -threads $(nproc) \
-c:v libvpx -crf 10 -qmin 5 -qmax 55 \
-vf scale=360:-1 -sws_flags lanczos -c:a libvorbis -ac 2 \
-b:a 128K out.webm
This was forgotten when the option was implemented, and makes this
option work as advertised.
Fixes #1473 (though the default behavior is probably still stupid).
This is a somewhat obscure situation, and happens only if audio starts
again after it has ended (in particular, it can happen with files where
audio starts later). It doesn't matter much whether audio starts
immediately or some milliseconds later, so simplify it.
When playing paused, the amount of decoded audio is limited to a small
amount (1 sample), because we don't write any audio to the AO when
paused. The small amount could trigger the case of the wanted audio
being too far in the future in the PTS sync code, which set the audio
status to STATUS_DRAINING, which in turn triggered the EOF code in the
next iteration. This was ok, but unfortunately, this triggered another
retry in order to check resuming from EOF by setting the status to
STATUS_SYNCING, which in turn led to the busy loop by alternating
between the 2 states. So don't try resyncing while paused.
Since the PTS syncing code also calls ao_reset(), this could cause the
pulseaudio daemon to consume some CPU time as well.
This was caused by commit 33b57f55. Before that, the playloop was merely
run more often, but didn't cause any problems.
Fixes #1288.
We absolutely need to clear the AO reference in the mixer.
The audio_status must be changed to a state where no code assumes that
the AO is available. (It's allowed to do this blindly.)
This rewrites the audio decode loop to some degree. Audio filters don't
do refcounted frames yet, so af.c contains a hacky "emulation".
Remove some of the weird heuristic-heavy code in dec_audio.c. Instead of
estimating how much audio we need to filter, we always filter full
frames. Maybe this should be adjusted later: in case filtering increases
the volume of the audio data, we should try not to buffer too much
filter output by reducing the input that is fed at once.
For ad_spdif.c and ad_mpg123.c, we don't avoid extra copying yet - it
doesn't seem worth the trouble.
Use a pseudo-filter when changing speed with resampling, instead of
somehow changing a samplerate somewhere. This uses the same underlying
mechanism, but is a bit more structured and cleaner. It also makes some
of the following changes easier.
Since we now always use filters to change audio speed, move most of the
work set_playback_speed() does to recreate_audio_filters().
This is what you would expect. Before this commit, each
ao_request_reload() call would just queue a reload command, and then
recreate the AO for the number of times the function was called.
Instead of sending a command, introduce some sort of event retrieval
mechanism. At least for the reload case, use atomics, because we're too
lazy to set up an extra mutex.
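A small Python sketch of the event retrieval idea: requests set a flag,
and the consumer fetches and clears all pending flags at once, so
repeated requests collapse into a single reload. The class and flag
names are hypothetical. Python has no plain atomic integer, so a lock
stands in here for the atomic exchange mentioned above.

```python
import threading

AO_EVENT_RELOAD = 1  # hypothetical flag value for this sketch


class EventFlags:
    """Pending-event bitmask with fetch-and-clear semantics."""

    def __init__(self):
        self._lock = threading.Lock()
        self._flags = 0

    def request(self, flag):
        """Mark an event as pending; repeated requests are idempotent."""
        with self._lock:
            self._flags |= flag

    def take(self):
        """Return all pending events and clear them in one step."""
        with self._lock:
            flags, self._flags = self._flags, 0
        return flags
```

Calling request(AO_EVENT_RELOAD) three times followed by take() yields
the reload flag once; a second take() returns 0.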
This commit fixes a "cosmetic" user interface issue. Instead of
displaying the interpolated seek time on OSD, show the actual audio
time.
This is rather silly: when seeking in audio-only mode, it takes some
iterations until audio is "ready", but on the other hand, the audio
state machine is rather fickle, and fixing this cosmetic issue would be
intrusive. So just add a hack that paints over the ugly behavior as
perceived by the user. Probably the lesser evil.
It doesn't happen if video is enabled, because that mode sets the
current time immediately to video PTS. (Audio has to be synced to video,
so the code is a bit more complex.)
Fixes #1233.
The player was supposed to exit playback if both video and audio failed
to initialize (or if one of the streams was not selected when the other
stream failed). This didn't work; for one this check was missing from
one of the failure paths. And more importantly, both checked the
current_track array incorrectly.
Fix these issues, and move the failure handling code into a common
function.
CC: @mpv-player/stable
It possibly goes to sleep without actually starting to decode audio.
Possibly fixes a problem with --no-osc --no-video reported on IRC.
CC: @mpv-player/stable
Seems logical. For some reason, the player allows deselecting both audio
and video stream without quitting (a deliberate feature of which I have
no idea why it was added years ago), so this is needed.
Each subsystem (or similar thing) had an INITIALIZED_ flag assigned. The
main use of this was that you could pass a bitmask of these flags to
uninit_player(). Except in some situations where you wanted to
uninitialize nearly everything, this wasn't really useful. Moreover, it
was quite annoying that subsystems had most of the code in a specific
file, but the uninit code in loadfile.c (because that's where
uninit_player() was implemented).
Simplify all this. Remove the flags; e.g. instead of testing for the
INITIALIZED_AO flag, test whether mpctx->ao is set. Move uninit code
to separate functions, e.g. uninit_audio_out().
The messages "Audio: no audio" and "Video: no video" could be printed
twice each if initializing them failed. Prevent this silliness.
CC: @mpv-player/stable
Apparently this is what users want. When playing with normal speed,
nothing is done. When playing slower than normal, resampling is used
instead, because scaletempo (which does the pitch correction) adds
too many artifacts.
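The policy described in this message can be summarized by a tiny Python
sketch. The function name and the returned strings are placeholders for
illustration, not identifiers from the player.

```python
def speed_method(speed):
    """Choose how to apply a playback speed, per the policy in this message."""
    if speed == 1.0:
        return None  # normal speed: nothing is done
    if speed < 1.0:
        return "resample"  # slower: plain resampling, pitch changes
    return "scaletempo"  # faster: pitch-corrected tempo change
```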
There's no real reason why audio_init_filter() should exist. Just use
af_init or af_reinit directly. (We lose a useless message; the same
information is printed in a quite close place with more details.)
Requires less code, and the way the filter chain is marked as having
failed to initialize allows just switching off audio instead of
crashing if trying to insert a volume filter in mixer.c fails, and
recreating the old filter chain fails too.
This would play some silence in case video was slower than audio. If
framedropping is already enabled, there's no other way to keep A/V
sync, short of changing audio playback speed (which would give worse
results). The --audiodrop option inserted silence if there was more
than 500ms desync.
This worked somewhat, but I think it was a silly idea after all. Whether
the playback experience is really bad or slightly worse doesn't really
matter. There also was a subtle bug with PTS handling, that apparently
caused A/V desync anyway at ridiculous playback speeds.
Just remove this feature; nobody is going to use it anyway.
Before this commit, there was AF_FORMAT_AC3 (the original spdif format,
used for AC3 and DTS core), and AF_FORMAT_IEC61937 (used for AC3, DTS
and DTS-HD), which was handled as some sort of superset for
AF_FORMAT_AC3. There also was AF_FORMAT_MPEG2, which used
IEC61937-framing, but still was handled as something "separate".
Technically, all of them are pretty similar, but may use different
bitrates. Since digital passthrough pretends to be PCM (just with
special headers that wrap digital packets), this is easily detectable by
the higher samplerate or higher number of channels, so I don't know why
you'd need a separate "class" of sample formats (AF_FORMAT_AC3 vs.
AF_FORMAT_IEC61937) to distinguish them. Actually, this whole thing is
just a mess.
Simplify this by handling all these formats the same way.
AF_FORMAT_IS_IEC61937() now returns 1 for all spdif formats (even MP3).
All AOs just accept all spdif formats now - whether that works or not is
not really clear (seems inconsistent due to earlier attempts to make
DTS-HD work). But on the other hand, enabling spdif requires manual user
interaction, so it doesn't matter much if initialization fails in
slightly less graceful ways if it can't work at all.
At a later point, we will support passthrough with ao_pulse. It seems
the PulseAudio API wants to know the codec type (or maybe not - feeding
it DTS while telling it it's AC3 works), add separate formats for each
codec. While this is reminiscent of the earlier chaos, it's stricter, and most
code just uses AF_FORMAT_IS_IEC61937().
Also, modify AF_FORMAT_TYPE_MASK (renamed from AF_FORMAT_POINT_MASK) to
include special formats, so that it always describes the fundamental
sample format type. This also ensures valid AF formats are never 0 (this
was probably broken in one of the earlier commits from today).
With e.g. --start=-3 --audio-buffer=10 the decoder entered EOF state
before the initial sync was finished, entered STATUS_EOF, and just
started playing audio from a random position.
This doesn't handle seeking outside of the file, which is a different
case. E.g. --start=30:00 with audio and video enabled in a file shorter
than 30:00 will play a random last part of audio. This could perhaps be
fixed by using the hr-seek target for cutting audio, instead of the
video PTS, but that would be kind of intrusive, so don't do it for now.
The simpler solution, assuming audio EOF on video EOF, wouldn't work,
because we allow audio to start before video, or to last after video.
Somehow, there was a larger misunderstanding in the code: ao_buffer
does not need to be preserved over audio reinit for proper support of
gapless audio. The actual AO internal buffer takes care of this.
In fact, preserving ao_buffer just breaks audio resync. In the ordered
chapter case, end_pts is used, which means not all audio data in the
buffer is played, thus some data is left over when audio decoding
resumes on the next segment. This triggers some code that aborts resync
if there's "audio decoded" (ao_buffer contains something), but no PTS
is known (nothing was actually decoded yet).
Simplify, and always bind the output buffer to the decoder.
CC: @mpv-player/stable (maybe)
Probably no observable effect, but it's more correct. Setting audio to
EOF could have bad effects otherwise (anywhere the player logic for
example decides whether EOF was reached, and such).
Don't attempt to resync after speed changes. Note that most other cases
of audio reinit (like switching tracks etc.) still resync, but other
code paths take care of setting the audio_status accordingly.
This restores the old behavior of not trying to fix audio desync, which
was probably changed with commit 261506e3.
Note that the code as of now wasn't even entirely correct, since the A/V
sync values are slightly shifted. The dsync depends on the audio buffer
size, so a larger buffer size will show more extreme desync. Also see
mplayer2 commit 213a224e, which should have fixed this - it was not merged
into mpv, because it disabled audio for too long, resulting in a worse
user experience. This is similar to the issue this commit attempts to
fix.
Fixes: #1042 (probably)
CC: @mpv-player/stable
This shouldn't change anything functionally.
Change the A/V desync message. --framedrop is enabled by default now, so
the text must be changed a little. I've never heard of audio outputs
messing up A/V sync recently, so remove that part.
Remove the unused ao_pts field.
Reorder 2 A/V sync related expressions so that they look the same.
In theory, timestamps can be negative, so we shouldn't just return -1
as special value.
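A sketch of why -1 is a poor "unknown" marker, under the assumption that
an out-of-range sentinel is used instead. NO_PTS, struct audio_clock and
current_audio_pts() are hypothetical names for this example.

```c
#include <stdbool.h>

// Hypothetical sentinel far outside any realistic timestamp range.
#define NO_PTS (-0x1p+63)

// Hypothetical state, just enough to compute a position.
struct audio_clock {
    bool has_pts;          // false until the first PTS is known
    double start_pts;      // PTS of the first sample, in seconds
    long samples_played;   // samples played since start_pts
    int rate;              // sample rate in Hz
};

// Returns the current audio position, or NO_PTS if it is unknown.
// A legitimate result can be negative, including exactly -1.
static double current_audio_pts(const struct audio_clock *c)
{
    if (!c->has_pts || c->rate <= 0)
        return NO_PTS;
    return c->start_pts + (double)c->samples_played / c->rate;
}
```

A stream starting at -2 seconds reaches -1 after one second of playback,
which a -1 sentinel would misreport as "no timestamp".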
Remove the separate code for clearing decode buffers; use the same code
that is used for normal seek reset.
Commit 5afc025c broke this. The reason is that mpctx->delay is updated
when a new video frame is added. This value is also needed to resync
audio, but it will be for the wrong PTS. They must be consistent with
each other, and if they aren't, initial sync will be off by N video
frames, which results at least in a worse user experience.
This can be reproduced by for example heavily switching between normal
and 2x speed, or similar.
Fix by readding the video_next_pts field (keeping its use minimal,
instead of reverting the commit that removed it).
Apparently users prefer this behavior.
It was used for subtitles too, so move the code to calculate the video
offset into a separate function. Seeking also needs to be fixed.
Fixes #1018.
In encoding mode, the AO pretends to be infinitely fast (it will take
whatever we write, without ever rejecting input). Commit 261506e3 broke
this somehow. It turns out an old hack dealing with this was accidentally
dropped.
This is the hunk of code whose semantics were (partially) dropped:
    if (mpctx->d_audio && (mpctx->restart_playback ? !video_left :
                           ao_untimed(mpctx->ao) && (mpctx->delay <= 0 ||
                                                     !video_left)))
    {
        int status = fill_audio_out_buffers(mpctx, endpts);
        // Not at audio stream EOF yet
        audio_left = status > -2;
    }
This if condition is pretty wild, and it looked like it was pretty much
for audio-only mode, rather than subtle handling for encoding mode.
Basically move the code from playloop.c to video.c. The new function
write_video() now contains the code that was part of run_playloop().
There are no functional changes, except handling "new_frame_shown"
slightly differently. This is done so that we don't need a new
MPContext field or a return value for write_video() to signal this
condition. Instead, it's handled indirectly.
This also reduces some code duplication with other parts of the code.
The change is mostly cosmetic, although there are also some subtle
changes in behavior. At least one change is that the big desync message
is now printed after every seek.
In situations when the demuxer reports EOF, but immediately "recovers"
after that and returns new data, it could happen that audio sync was
skipped. Deal with this by actually entering the EOF state, instead of
assuming this will happen later.
Some files have the first audio much later into the video (for whatever
reasons). Instead of appending large amounts of silence to the audio
buffer (and refusing to sync if the audio to append is "too large"),
just wait until enough video has played.
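Roughly, the decision at initial sync could look like the sketch below.
The enum and initial_audio_sync() are invented for this example and
leave out everything the real state machine has to track.

```c
// Hypothetical outcomes of the initial sync decision.
enum sync_action {
    SYNC_START_NOW,       // audio and video start at the same PTS
    SYNC_DROP_AUDIO,      // audio starts before video: cut the excess
    SYNC_WAIT_FOR_VIDEO,  // audio starts after video: hold audio back
};

// Decide how to start audio, given the PTS of the first decoded audio
// and the current video PTS (both in seconds). No silence is generated
// in any case; late audio simply waits for video to catch up.
static enum sync_action initial_audio_sync(double audio_pts, double video_pts)
{
    if (audio_pts < video_pts)
        return SYNC_DROP_AUDIO;
    if (audio_pts > video_pts)
        return SYNC_WAIT_FOR_VIDEO;
    return SYNC_START_NOW;
}
```

The caller would re-run the check on each playloop iteration while the
result is SYNC_WAIT_FOR_VIDEO, so the size of the gap no longer matters.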
It probably happens relatively often that the first packet (or even the
first N packets) of a stream will fail to decode, but decoding will
eventually succeed at a later point. Before commit 261506e3, this was
handled by an explicit retry loop (although this was also for other
purposes), but it was then changed to abort on the first error. This
makes it impossible to decode some audio streams.
Change this so that errors are ignored for the first 50 packets, which
should make it equivalent to the old code.
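As a sketch of the counting logic only (struct decode_state,
handle_packet() and the result names are hypothetical, the limit of 50
is the one named above):

```c
#include <stdbool.h>

// Errors within the first 50 packets are not treated as fatal.
#define INITIAL_ERROR_TOLERANCE 50

enum decode_result { DECODE_OK, DECODE_RETRY, DECODE_FAIL };

struct decode_state {
    int packets_fed;  // packets handed to the decoder so far
};

// Account for one packet. decoded_ok is what the (hypothetical) decoder
// reported for it. Early errors ask the caller to try the next packet,
// later ones abort decoding.
static enum decode_result handle_packet(struct decode_state *st,
                                        bool decoded_ok)
{
    st->packets_fed++;
    if (decoded_ok)
        return DECODE_OK;
    if (st->packets_fed <= INITIAL_ERROR_TOLERANCE)
        return DECODE_RETRY;
    return DECODE_FAIL;
}
```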
If you for example use --audio-file, disable the external track, seek,
and enable the external track again, the playback position of the
external file was off, and you would get major A/V desync. This was
actually supposed to work, but broke some time ago (probably commit
2b87415f). It didn't work, because it attempted to seek the stream if it
was already selected, which was always true due to
reselect_demux_streams() being called before that.
Fix by putting the initial selection and the seek together.
This commit makes audio decoding non-blocking. If e.g. the network is
too slow the playloop will just go to sleep, instead of blocking until
enough data is available.
For video, this was already done with commit 7083f88c. For audio, it's
unfortunately much more complicated, because the audio decoder was used
in a blocking manner. Large changes are required to get around this.
The whole playback restart mechanism must be turned into a state machine,
especially since it has close interactions with video restart. Lots of
video code is thus also changed.
(For the record, I don't think switching this code to threads would
make this conceptually easier: the code would still have to deal with
external input while blocked, so these in-between states do get visible
[and thus need to be handled] anyway. On the other hand, it certainly
should be possible to modularize this code a bit better.)
This will probably cause a bunch of regressions.
Don't return an EOF code if there's still buffered data.
Also, don't call demux_stream_eof() in the playloop. There's probably
nothing wrong with it, but it's cleaner not to use it.
Also give AD_EOF its own value, so that a decoding error doesn't drain
audio by causing an EOF condition.
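To illustrate the distinction, a sketch with assumed numeric values and
hypothetical helpers; only the AD_EOF name comes from the code, and the
AD_OK/AD_ERR values here are assumptions for the example.

```c
#include <stdbool.h>

// Assumed status values; the point is only that EOF and error differ.
enum { AD_OK = 0, AD_ERR = -1, AD_EOF = -2 };

// Status reported to the caller: EOF is held back while decoded data is
// still buffered, so the remaining samples get played first.
static int output_status(int decoder_status, int buffered_samples)
{
    if (decoder_status == AD_EOF && buffered_samples > 0)
        return AD_OK;
    return decoder_status;
}

// Only a real EOF drains the audio output; an error does not.
static bool should_drain_ao(int status)
{
    return status == AD_EOF;
}
```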
There was confusion about what should go into audio pts calculation and
what not (mainly due to the audio push thread). This has been fixed by
using the playing - not written - audio pts (which properly takes into
account the ao's buffer), and incrementing the samples count only by the
amount of samples actually taken from the buffer (unfortunately this
now forces us to keep the lock too long for my taste).
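The difference between the two timestamps, as a simplified sketch that
ignores playback speed and locking. struct ao_clock and both functions
are hypothetical.

```c
// Hypothetical view of the AO state needed for the calculation.
struct ao_clock {
    double written_pts;     // PTS at the end of the data written so far
    long buffered_samples;  // samples written but not yet played
    int rate;               // sample rate in Hz
};

// The playing PTS lags the written PTS by whatever still sits in the
// AO's buffer.
static double playing_pts(const struct ao_clock *c)
{
    return c->written_pts - (double)c->buffered_samples / c->rate;
}

// When the feed thread takes samples out of the buffer, only that
// amount is accounted for as played.
static void consume_samples(struct ao_clock *c, long taken)
{
    if (taken > c->buffered_samples)
        taken = c->buffered_samples;
    c->buffered_samples -= taken;
}
```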
It's unlikely that files with multiple audio tracks and with replaygain
actually happen, but this change might help avoid minor corner cases
with later changes.
Basically, this allows gapless playback with similar files (including
the ordered chapter case), while still being robust in general.
The implementation is quite simplistic on purpose, in order to avoid
all the weird corner cases that can occur when creating the filter
chain. The consequence is that it might do non-gapless playback in
more cases than needed, but if that bothers you, you still can use
the normal gapless mode.
Just using "--gapless-audio" or "--gapless-audio=yes" selects the old
mode.
This code handles buggy AOs (even if all AOs are bug-free, it's good for
robustness). Move handling of it to the AO feed thread. Now this check
doesn't require magic numbers and does exactly what it's supposed to do.
This played the file at a wrong sample rate if the rate was out of
certain bounds.
A comment says this was for the sake of libaf/af_resample.c. This
resampler has been long removed. Our current resampler
(libav/swresample) checks supported sample rates on reconfiguration, and
will error out if a sample rate is not supported. And I think that is
the correct behavior.
This obviously doesn't work. It wasn't much of a problem in the past
because most passthrough formats use 2 channels, which is also the
default for downmix.
This is probably "safer". Without it, we will play 1 sample, because the
logic was written in a way to decode 1 sample if audio is paused. 1
sample usually will initialize the audio PTS, but not play any real
audio. Also see previous commit.
In ancient times, this actually used 1 byte (instead of 1 sample), so
clearly no sample was written, unless the audio was 8-bit mono.
Remove the ao_buffer_playable_samples field. This contained the number
of samples that fill_audio_out_buffers() wanted to write to the AO (i.e.
this data was supposed to be played at some point), but ao_play()
rejected it due to partial fill.
This could happen with many AOs, notably those which align all written
data to an internal period size (often called "outburst" in the AO
code), and the accepted number of samples is rounded down to period
boundaries. The left-over samples at the end were still kept in
mpctx->ao_buffer, and had to be played later.
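The rounding in question, as a small sketch (the function names are
hypothetical, "outburst" is the period size in samples):

```c
// Number of samples an AO with period alignment would accept from a
// write of `samples`: rounded down to a multiple of `outburst`.
static int accepted_samples(int samples, int outburst)
{
    if (outburst <= 1)
        return samples;  // no alignment requirement
    return samples / outburst * outburst;
}

// Samples that stay behind in the caller's buffer after such a write.
static int leftover_samples(int samples, int outburst)
{
    return samples - accepted_samples(samples, outburst);
}
```

For example, writing 1000 samples with a period of 256 gets 768 accepted
and leaves 232 for a later write.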
The reason ao_buffer_playable_samples had to exist was to make sure that
at EOF, the correct number of left-over samples was played (and not
possibly other data in the buffer that had to be sliced off due to
endpts in fill_audio_out_buffers()). (You'd think you could just slice
the entire buffer, but I suspect this wasn't done because the end time
could actually change due to A/V sync changes. Maybe that was the reason
it's so complicated.)
Some commits ago, ao.c gained internal buffering, and ao_play() will
never return partial writes - as long as you don't try to write more
samples than ao_get_space() reports. This is always the case. The only
exception is filling the audio buffers while paused. In this case, we
decode and play only 1 sample in order to initialize decoding (e.g. on
seeking). Actually playing this 1 sample is in fact a bug, but even if
the AO doesn't have period size alignment, you won't notice it. In
summary, this means we can safely remove the code.
We want to move the AO to its own thread. There's no technical reason
for making the ao struct opaque to do this. But it helps us sleep at
night, because we can control access to shared state better.
This field will be moved out of the ao struct. The encoding code was
basically using an invalid way of accessing this field.
Since the AO will be moved into its own thread too and will do its own
buffering, the AO and the playback core might not even agree which
sample a PTS timestamp belongs to. Add some extrapolation code to handle
this case.
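One way such extrapolation can work, as a sketch under the assumption
that a single known (sample, PTS) pair and a constant sample rate are
available. struct pts_anchor and extrapolate_pts() are hypothetical
names, not the actual code.

```c
// A known correspondence between a sample index and a timestamp.
struct pts_anchor {
    double pts;        // PTS of the anchor sample, in seconds
    long long sample;  // absolute index of the anchor sample
    int rate;          // sample rate in Hz
};

// Extrapolate the PTS of any other sample from the anchor. Works in
// both directions, since the sample distance can be negative.
static double extrapolate_pts(const struct pts_anchor *a, long long sample)
{
    return a->pts + (double)(sample - a->sample) / a->rate;
}
```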