This has been a long standing annoyance - ffmpeg is removing
sizeof(AVPacket) from the API which means you cannot stack-allocate
AVPacket anymore. However, that is something we take advantage of
because we use short-lived AVPackets to bridge from native mpv packets
in our main decoding paths.
We don't think that switching these to `av_packet_alloc` is desirable,
given the cost of heap allocation, so this change takes a different
approach - allocating a single packet in the relevant context and
reusing it over and over.
That's fairly straight-forward, with the main caveat being that
re-initialising the packet is unintuitive. There is no function that
does exactly what we need (what `av_init_packet` did). The closest is
`av_packet_unref`, which additionally frees buffers and side-data.
However, we don't copy those things - we just assign them in from our
own packet, so we have to explicitly clear the pointers before calling
`av_packet_unref`. But at least we can make a wrapper function for
that.
The weirdest part of the change is the handling of the vtt subtitle
conversion. This requires two packets, so I had to pre-allocate two in
the context struct. That sounds excessive, but if allocating the
primary packet is too expensive, then allocating the secondary one for
vtt subtitles must also be too expensive.
This change is not conditional as heap allocated AVPackets were
available for years and years before the deprecation.
Change all OPT_* macros such that they don't define the entire m_option
initializer, and instead expand only to a part of it, which sets certain
fields. This requires changing almost every option declaration, because
they all use these macros. A declaration now always starts with
{"name", ...
followed by designated initializers only (possibly wrapped in macros).
The OPT_* macros now initialize the .offset and .type fields only,
sometimes also .priv and others.
I think this change makes the option macros less tricky. The old code
had to stuff everything into macro arguments (and attempted to allow
setting arbitrary fields by letting the user pass designated
initializers in the vararg parts). Some of this was made messy due to
C99 and C11 not allowing 0-sized varargs with ',' removal. It's also
possible that this change is pointless, other than cosmetic preferences.
Not too happy about some things. For example, the OPT_CHOICE()
indentation I applied looks a bit ugly.
Much of this change was done with regex search&replace, but some places
required manual editing. In particular, code in "obscure" areas (which I
didn't include in compilation) might be broken now.
In wayland_common.c the author of some option declarations confused the
flags parameter with the default value (though the default value was
also properly set below). I fixed this with this change.
Let's see how much everyone hates this. Leaving it enabled seems
problematic, because libavcodec returns an unspecific error if it
doesn't like it.
Fixes: #6300
Libav seems rather dead: no release for 2 years, no new git commits in
master for almost a year (with one exception ~6 months ago). From what I
can tell, some developers resigned themselves to the horrifying idea to
post patches to ffmpeg-devel instead, while the rest of the developers
went on to greener pastures.
Libav was a better project than FFmpeg. Unfortunately, FFmpeg won,
because it managed to keep the name and website. Libav was pushed more
and more into obscurity: while there was initially a big push for Libav,
FFmpeg just remained "in place" and visible for most people. FFmpeg was
slowly draining all manpower and energy from Libav. A big part of this
was that FFmpeg stole code from Libav (regular merges of the entire
Libav git tree), making it some sort of Frankenstein mirror of Libav,
think decaying zombie with additional legs ("features") nailed to it.
"Stealing" surely is the wrong word; I'm just aping the language that
some of the FFmpeg members used to use. All that is in the past now, I'm
probably the only person left who is annoyed by this, and with this
commit I'm putting this decade long problem finally to an end. I just
thought I'd express my annoyance about this fucking shitshow one last
time.
The most intrusive change in this commit is the resample filter, which
originally used libavresample. Since the FFmpeg developer refused to
enable libavresample by default for drama reasons, and the API was
slightly different, so the filter used some big preprocessor mess to
make it compatible to libswresample. All that falls away now. The
simplification to the build system is also significant.
Just an implementation detail that can be cleaned up now. Internally,
m_config maintains a tree of m_sub_options structs, except for the root
it was not defined explicitly. GLOBAL_CONFIG was a hack to get access to
it anyway. Define it explicitly instead.
ad_lavc and vd_lavc use the lavc_process() helper to translate the
FFmpeg push/pull API to the internal filter API (which completely
mismatch, even though I'm responsible for both, just fucking kill me).
This interface was "slightly" too tight. It returned only a bool
indicating "progress", which was not enough to handle some cases (see
following commit).
While we're at it, move all state into a struct. This is only a single
bool, but we get the chance to add more if needed.
This fixes mpv falling asleep if decoding returns an error during
draining. If decoding fails when we already sent EOF, the state machine
stopped making progress. This left mpv just sitting around and doing
nothing.
A test case can be created with: echo $RANDOM >> image.png
This makes libavformat read a proper packet plus a packet of garbage.
libavcodec will decode a frame, and then return an error code. The
lavc_process() wrapper could not deal with this, because there was no
way to differentiate between "retry" and "send new packet". Normally, it
would send a new packet, so decoding would make progress anyway. If
there was "progress", we couldn't just retry, because it'd retry
forever.
This is made worse by the fact that it tries to decode at least two
frames before starting display, meaning it will "sit around and do
nothing" before the picture is displayed.
Change it so that on error return, "receiving" a frame is retried. This
will make it return the EOF, so everything works properly.
This is a high-risk change, because all these funny bullshit exceptions
for hardware decoding are in the way, and I didn't retest them. For
example, if hardware decoding is enabled, it keeps a list of packets,
that are fed into the decoder again if hardware decoding fails, and a
software fallback is performed. Another case of horrifying accidental
complexity.
Fixes: #6618
This can be due to unsupported sample formats (see previous commits),
minor allocation failures, and similar things. For identifying the exact
cause it's buried too deep in abstractions. But most time it doesn't
happen anyway, since it's extremely rare that new audio formats are
added.
Fixes stupid messages with a opus/mkv test file that had an absurdly
huge codec delay.
This file fully skips several frames at the start. ad_lavc.c trimmed
these frames to 0 samples and returned them. The next layer
(f_decoder_wrapper.c) saw discontinuous PTS values, because the PTS
values increased by a frame, but amounted to 0 audio samples. This was
harmless, but logged PTS discontinuity errors.
This was always a legacy thing. Remove it by applying an orgy of
mp_get_config_group() calls, and sometimes m_config_cache_alloc() or
mp_read_option_raw().
win32 changes untested.
Use the decoder wrapper that was introduced for video. This removes all
code duplication the old audio decoder wrapper had with the video code.
(The audio wrapper was copy pasted from the video one over a decade ago,
and has been kept in sync ever since by the power of copy&paste. Since
the original copy&paste was possibly done by someone who did not answer
to the LGPL relicensing, this should also remove all doubts about
whether any of this code is left, since we now completely remove any
code that could possibly have been based on it.)
There is some complication with spdif handling, and a minor behavior
change (it will restrict the list of codecs to spdif if spdif is to be
used), but there should not be any difference in practice.
This is pretty pointless, but I believe it allows us to claim that the
new code is not affected by the copyright of the old code. This is
needed, because the original mp_audio struct was written by someone who
has disagreed with LGPL relicensing (it was called af_data at the time,
and was defined in af.h).
The "GPL'ed" struct contents that surive are pretty trivial: just the
data pointer, and some metadata like the format, samplerate, etc. - but
at least in this case, any new code would be extremely similar anyway,
and I'm not really sure whether it's OK to claim different copyright. So
what we do is we just use AVFrame (which of course is LGPL with 100%
certainty), and add some accessors around it to adapt it to mpv
conventions.
Also, this gets rid of some annoying conventions of mp_audio, like the
struct fields that require using an accessor to write to them anyway.
For the most part, this change is only dumb replacements of mp_audio
related functions and fields. One minor actual change is that you can't
allocate the new type on the stack anymore.
Some code still uses mp_audio. All audio filter code will be deleted, so
it makes no sense to convert this code. (Audio filters which are LGPL
and which we keep will have to be ported to a new filter infrastructure
anyway.) player/audio.c uses it because it interacts with the old filter
code. push.c has some complex use of mp_audio and mp_audio_buffer, but
this and pull.c will most likely be rewritten to do something else.
All relevant authors of the current code have agreed.
As always, there are the usual historical artifacts that could be
mentioned. For example, there used to be a large number of decoders
by various authors who were not asked, but whose code was all 100%
removed. (Mostly due to FFmpeg providing all codecs.)
One point of contention is that Nick Kurshev might have refactored the
old audio decoder code in 2001. Basically, there are hints that it might
have been done by him, such as Arpi's commit message stating that the
code was imported from MPlayerXP (Nick's fork), or all the files having
his name in the "maintainer" field. On the other hand, the murky history
of ad.h weakens this - it could be that Arpi started this work, and Nick
took it (and possibly finished it).
In any case, Nick could not be reached, so there is no agreement for
LGPL relicensing from him. We're changing the license anyway, and assume
that his change in itself is not copyrightable. He only moved code, and
in addition used the equivalent video decoder framework (done by Arpi,
who agreed) as template. For example, ad_functions_s was basically
vd_functions_s, which the signature of the decode callback changed to
the same as audio_decode(). ad_functions_s also had a comment that said
it interfaces with "video decoder drivers" (I'm fixing this comment in
this commit).
I verified that no additional code was added that is copyright-relevant,
still in today's code, and not copied from the existing code at the time
(either from the previous audio decoder code or the video framework
code). What apparently matters here is that none of the old code was not
written by Nick, and the authors of the old code have given his
agreement, and (probably) that Nick didn't add actual new code (none
that would have survived), that was not trivially based on the old one
(i.e. no new copyrightable "work").
A copyright expert told me that this kind of change can be considered
not relevant for copyright, so here we go.
Rewriting this would end with the same code anyway, and the naming
conventions can't be copyrighted.
This can be useful in other contexts.
Note that we end up setting AVCodecContext.width/height instead of
coded_width/coded_height now. AVCodecParameters can't set coded_width,
but this is probably more correct anyway.
The FFmpeg versions we support all have the APIs we were checking for.
Only Libav missed them. Simplify this by explicitly checking for FFmpeg
in the code, instead of trying to detect the presence of the API.
Since we set "skip_manual", we can actually get frames with this set.
Currently, only AV_PKT_FLAG_DISCARD will trigger this flag, and only
mov.c sets the latter flags, so this is related to FFmpeg's half-broken
mp4 edit list support.
Same deal as with video. Including the EOF handling.
(It would be nice if this code were not duplicated, but right now we're
not even close to unifying the audio and video code paths.)
Both AVFrame.pts and AVFrame.pkt_pts have existed for a long time. Until
now, decoders always returned the pts via the pkt_pts field, while the
pts field was used for encoding and libavfilter only. Recently, pkt_pts
was deprecated, and pts was switched to always carry the pts.
This means we have to be careful not to accidentally use the wrong
field, depending on the libavcodec version. We have to explicitly check
the version numbers. Of course the version numbers are completely
idiotic, because idiotically the pkg-config and library names are the
same for FFmpeg and Libav, so we have to deal with this explicitly as
well.
These are different AVCodecContext fields. pkt_timebase is the correct
one for identifying the unit of packet/frame timestamps when decoding,
while time_base is for encoding. Some decoders also overwrite the
time_base field with some unrelated codec metadata.
pkt_timebase does not exist in Libav, so an #if is required.
Instead of passing through double float timestamps opaquely, pass real
timestamps. Do so by always setting a valid timebase on the
AVCodecContext for audio and video decoding.
Specifically try not to round timestamps to a too coarse timebase, which
could round off small adjustments to timestamps (such as for start time
rebasing or demux_timeline). If the timebase is considered too coarse,
make it finer.
This gets rid of the need to do this specifically for some hardware
decoding wrapper. The old method of passing through double timestamps
was also a bit questionable. While libavcodec is not supposed to
interpret timestamps at all if no timebase is provided, it was
needlessly tricky. Also, it actually does compare them with
AV_NOPTS_VALUE. This change will probably also reduce confusion in the
future.
This commit adds an --audio-channel=auto-safe mode, and makes it the
default. This mode behaves like "auto" with most AOs, except with
ao_alsa. The intention is to allow multichannel output by default on
sane APIs. ALSA is not sane as in it's so low level that it will e.g.
configure any layout over HDMI, even if the connected A/V receiver does
not support it. The HDMI fuckup is of course not ALSA's fault, but other
audio APIs normally isolate applications from dealing with this and
require the user to globally configure the correct output layout.
This will help with other AOs too. ao_lavc (encoding) is changed to the
new semantics as well, because it used to force stereo (perhaps because
encoding mode is supposed to produce safe files for crap devices?).
Exclusive mode output on Windows might need to be adjusted accordingly,
as it grants the same kind of low level access as ALSA (requires more
research).
In addition to the things mentioned above, the --audio-channels option
is extended to accept a set of channel layouts. This is supposed to be
the correct way to configure mpv ALSA multichannel output. You need to
put a list of channel layouts that your A/V receiver supports.
The libavcodec wmapro decoder will skip some bytes at the start of the
first packet and return each time. It will not return any audio data in
this state.
Our own code as well as libavcodec's new API handling
(avcodec_send_packet() etc.) discard the PTS on the first return, which
means the PTS is never known for the first packet. This results in a
"Failed audio resync." message.
Fixy it by remember the PTS in next_pts. This field is used only if the
decoder outputs no PTS, and is updated after each frame - and thus
should be safe to set.
(Possibly this should be fixed in libavcodec new API handling by not
setting the PTS to NOPTS as long as no real data has been output. It
could even interpolate the PTS if the timebase is known.)
Fixes the failure message seen in #3297.
Workaround for an awful corner-case. The new decode API "locks" the
decoder into the EOF state once a drain packet has been sent. The
problem starts with a file containing a 0-sized packet, which is
interpreted as drain packet.
This should probably be changed in libavcodec (not treating 0-sized
packets as drain packets with the new API) or in libavformat (discard
0-sized packets as invalid), but efforts to do so have been fruitless.
Note that vd_lavc.c already does something similar, but originally for
other reasons.
Fixes#3106.
AVFormatContext.codec is deprecated now, and you're supposed to use
AVFormatContext.codecpar instead.
Handle this for all of the normal playback code.
Encoding mode isn't touched.
Fixes correctness_trimming_nobeeps.opus. One nasty thing is that this
mechanism interferes with the container-signalled mechanism with
AV_FRAME_DATA_SKIP_SAMPLES. So apply it only if that is apparently not
present. It's a mess, and it's still broken in FFmpeg CLI, so I'm sure
this will get fucked up later again.
I'm not quite sure what the FFmpeg AV_FRAME_DATA_SKIP_SAMPLES API
demands here. The code so far assumed that skipping can be more than a
frame, but not trimming. Extend it to trimming too.
This is actually already done by dec_audio.c. But if
AV_FRAME_DATA_SKIP_SAMPLES is applied, this happens too late here. The
problem is that this will slice off samples, and make it impossible for
later code to reconstruct the timestamp properly.
Missing timestamps can still happen with some demuxers, e.g. demux_mkv.c
with Opus tracks. (Although libavformat interpolates these itself.)
The code is shared with the --vd-lavc-threads option, so using 0 for
auto-detection just works.
But no, this is not useful. Just change it for orthogonality.
Similar to the video path. dec_audio.c now handles decoding only. It
also looks very similar to dec_video.c, and actually contains some of
the rewritten code from it. (A further goal might be unifying the
decoders, I guess.)
High potential for regressions.
This is mainly a refactor. I'm hoping it will make some things easier
in the future due to cleanly separating codec metadata and stream
metadata.
Also, declare that the "codec" field can not be NULL anymore. demux.c
will set it to "" if it's NULL when added. This gets rid of a corner
case everything had to handle, but which rarely happened.
Instead of requiring the decoder to set the PTS directly on the
dec_audio context (including handling absence of PTS etc.), transfer the
packet PTS to the decoded audio frame. Marginally simpler, and gives
more control to the generic code.
MPlayer traditionally had completely separate sh_ structs for
audio/video/subs, without a good way to share fields. This meant that
fields shared across all these headers had to be duplicated. This commit
deduplicates essentially the last remaining duplicated fields.
Remove the old implementation for these properties. It was never very
good, often returned very innaccurate values or just 0, and was static
even if the source was variable bitrate. Replace it with the
implementation of "packet-video-bitrate". Mark the "packet-..."
properties as deprecated. (The effective difference is different
formatting, and returning the raw value in bits instead of kilobits.)
Also extend the documentation a little.
It appears at least some decoders (sipr?) need the
AVCodecContext.bit_rate field set, so this one is still passed through.