libcve (libarchive is the official name I think?) has this annoying
behavior that if after a "fatal" error, you do any operation on the
archive context other than querying the error and closing the context,
you get a free CVE. So we close the archive context in these situations.
This can set p->mpa to NULL, so code accessing this field needs to be
careful.
This was not considered in a certain code path, and a simple truncated
.rar file made it crash. Part of the problem was that the file inside
the rar was a mkv file, which triggered seeking when the demux_mkv
resync code encountered bogus data.
This is probably a regression from a relatively recent change to this
code (in any case mpv 0.29.1 doesn't crash).
Fix this by adding the check.
There's also a mechanism to reopen an archive context used to emulate
seeking, since most libarchive format handlers don't support this
natively. Add a reopen call to the codepath, because obviously it should
always be possible to seek back into a "working" area of the file.
There is a second bug with this: if reopening fails, we don't adjust the
current position back to 0, which in some cases means we accidentally
return bogus data to the reader when we shouldn't. Fix this by always
resetting the position on reopening.
Seems to happen often with ytdl pseudo-DASH streams, so whatever. I
couldn't reproduce it and check what triggers it, I just remember seeing
the error message and found it annoying.
struct stream used to include the stream buffer, including peek buffer,
inline in the struct. It could not be resized, which means the maximum
peek size was set in stone. This meant demux_lavf.c could peek only so
much data.
Change it to use a dynamic buffer. Because it's possible, keep the
inline buffer for default buffer sizes (which are basically always used
outside of file opening). It's unknown whether it really helps with
anything. Probably not.
This is also the fallback plan in case we need something like the old
stream cache in order to deal with mp4 + unseekable http: the code can
now be easily changed to use any buffer size.
This happened with a .flac file inside an archive. It tried to seek
beyond the end of the archive entry in a format where seeking isn't
supported. stream_libarchive handles these situations by skipping data.
But when the end of the archive is reached, archive_read_data() returns
0. While libarchive didn't bother to fucking document this, they do say
it's supposed to work like read(), so I guess a return value of 0 really
means EOF. So change the "< 0" to "<= 0". Also add some error logging.
The same file actually worked without out of bounds reads when
extracted, so there still might be something very wrong.
Apparently this was so that when playing a video file from a .rar file,
it would load external subtitles with the same name (instead of looking
for mpv's rar:// mangled URL). This was requested on github almost 5
years ago. Seems like a shit feature, and why should I give a fuck? Drop
it, because it complicates some in progress change.
I don't ever use them, so kill them.
Linux TV is excessively complex, and whenever I attempted to use it, it
didn't work well or would have required some major work to update it.
(For example, when I tried to use a webcam-type device with tv://, it
worked badly; even the libavdevice garbage worked better.)
The "program" property was rather complex and rather obscure. I didn't
ever use it. Should there ever be a proper use for it (maybe HLS stream
selection?), it should be rewritten anyway.
The demuxer cache is the only cache now. Might need another change to
combat seeking failures in mp4 etc. The only bad thing is the loss of
cache-speed, which was sort of nice to have.
Alway give each demuxer its own mp_cancel instance. This makes
management of the mp_cancel things much easier. Also, instead of having
add/remove functions for mp_cancel slaves, replace them with a simpler
to use set_parent function. Remove cancel_and_free_demuxer(), which had
mpctx as parameter only to check an assumption. With this commit,
demuxers have their own mp_cancel, so add demux_cancel_and_free() which
makes use of it.
The properties/commands touched in this commit are all for obscure
special inputs (BD/DVD/DVB/TV), and they all block on the demuxer/stream
layer. For network streams, this blocking is very unwelcome. They will
affect playback and probably introduce pauses and frame drops. The
player can even freeze fully, and the logic that tries to make playback
abortable even if frozen complicates the player.
Since the mentioned accesses are not needed for network streams, but
they will block on network streams even though they're going to fail,
add a flag that coarsely enables/disables these accesses. Essentially it
establishes a whitelist of demuxers/streams which support them.
In theory you could to access BD/DVD images over network (or add such
support, I don't think it's a thing in mpv). In these cases these
controls still can block and could even "freeze" the player completely.
Writing to the "program" and "cache-size" properties still can block
even for network streams. Just don't use them if you don't want freezes.
The intention is to avoid that the parent mp_cancel retains the
internally allocated wakeup pipe. File FDs are a relatively scarce
resource, so try to avoid having too many. This might matter for
subtitle files, for which it is relatively likely that they are loaded
in large quantities.
demux_lavf.c will close the underlying stream for most subtitle files,
and now it will free the wakeup pipe too. Actually, there are currently
only 1 or 2 mp_cancel objects per mpv core, but this could change if
every external subtitle track gets its own mp_cancel in later commits.
It seems a bit inappropriate to have dumped this into stream.c, even if
it's roughly speaking its main user. At least it made its way somewhat
unfortunately to other components not related to the stream or demuxer
layer at all.
I'm too greedy to give this weird helper its own file, so dump it into
thread_tools.c.
Probably a somewhat pointless change.
There is some code that checks a FD for whether it is a regular file or
not. If it's not a regular file, it e.g. enables use of poll() to avoid
blocking forever.
But this was done only for FDs that were open()ed by us, not from stdin
special handling or fd://. Consequently, " | mpv -" could block the
player. Fix this by moving the code and running for it on all FDs.
Also, set p->regular_file even on mingw.
When this happens, network calls are forcibly aborted (more or less),
but demuxers might keep going, as most of them do not check for forced
exits properly. This can possibly lead to broken packets being added.
Also do not attempt to read more packets in this situation.
Also do not print a stream open failed message if opening was aborted
anyway.
Naturally, there's more than one fourcc that indicates an mjpeg
stream.
I have a particular ancient webcam here (Logitech QuickCam Messanger)
that only supports the single 'JPEG' format, but there are other
devices out there which support both 'JPEG' and 'MJPG' with no visible
differences, and others where the streams are slightly different.
Regardless of those details, it remains correct to treat 'JPEG'
the same as 'MJPG' from a stream consumption perspective.
Do this because retrying reading on higher levels (like the demuxer)
usually causes tons of problems. A hack like this is simpler and could
allow to remove some of the higher level retry behavior.
This works by trying to detect whether the file is appended. If we reach
EOF, check if the file size changed compared to the initial value. If it
did, it means the file was appended at least once, and we set the
p->appending flag. If that flag is set, we simply retry reading more
data every time we encounter EOF. The only way to do this is polling,
and we poll for at most 10 times, after waiting for 200ms every time.
The use of the FFmpeg hls protocol (as opposed to demuxer) is
"discouraged", and probably only causes additional potential security
problems at best, so drop it.
This commit eliminates the following clang warning:
warning: macro expansion producing 'defined' has undefined behavior [-Wexpansion-to-defined]
Going by the clang commit message, this seems to be explicitly specified
as UB by the standard, and they added this warning because MSVC
apparently results in different behavior. Whatever, we can just avoid
the warning with some small changes.
For quite some time, msg.c hasn't output partial log messages anymore,
and instead buffered them in memory. This means the MP_INFO() statement
here just kept appending the message to memory, instead of outputting
it.
Easy enough to fix by abusing the status line (which means the frontend
and this code will "fight" for the status line, but this code seems to
win usually, as the frontend doesn't update it so often).
Users should probably really switch to --cache-pause-initial.
Fixes#5360.
Remove our own hacky reconnection code, and use libavformat's feature for
that. It's disabled by default, and until recently it did not work too
well. This has been fixed in recent ffmpeg git master[1], so there's no reason
to keep our own code.
[1] FFmpeg/FFmpeg@8a108bdea0
We set "reconnect_delay_max" to 7, which limits the maximum time it
waits. Since libavformat doubles the wait time on each reconnect attempt
(starting with 1), and stops trying to reconnect once the wait time is
over the reconnect_delay_max value, this allows for 4 reconnection
attempts which should add to 11 seconds maximum wait time. The default
is 120, which seems too high for normal playback use.
(The user can still override these parameters with --stream-lavf-o.)
Don't drop the stream buffers, because the read call (that must have
been failing) might try to extend an existing read buffer in the first
place. Just move the messy seek logic to stream_lavf.c. (In theory,
stream_lavf should probably make libavformat connect at the correct
offset instead of using a seek to reconnect it again. This patch doesn't
fix it, but at least it's a good argument to have the messing with the
position not in the generic code.)
Also update the comment about avio not supporting reconnecting. It has
that feature now. Maybe we should use it, but only after it gets fixed.
In commit 1199c1e3, we added checks to every libarchive API call to make
sure the archive was closed on ARCHIVE_FATAL - otherwise, libarchive
could endow us with free CVEs (such as it apparently happens when you
continue reading a rar archive that uses features not yet supported by
libarchive).
This broke the fallback for seeking in unseekable archive formats. Of
course libarchive won't tell us directly whether a format implementation
has seek support or not - and OF COURSE it returns ARCHIVE_FATAL if it
has no seek support. (The error string, which you can retrieve via API,
is actually more detailed, and also claims it's an "internal error". I
don't think so, libarchive.) Returning ARCHIVE_FATAL means we have to
assume free CVEs are ahead, and we have to close the archive. Which
breaks the fallback in a dumb way (we have no way of telling which of
those cases happened anyway).
Fix this by assuming that all seek errors are potentially due to lack of
seek support. If the seek call fails, reopen the archive, and set a flag
so the seek API is never tried again. (This means we can still skip
ahead for forward seeks, which is more efficient than skipping from the
start of the archive entry.)
Also fix an old typo in an error message.
Reduce it from 75MB in both directions (forward/backwards) to 10MB each.
The stream cache is kind of becoming useless in favor of the demuxer
cache. Using both doesn't make much sense, because they will contain
duplicated data for no reason.
Still leave it at 10MB, which may help with mp4 a bit. libavformat's mp4
demuxer tends to seek too much, so we try to avoid triggering network
level seeks by having some caching in the stream layer.
I've decided that MP_TRACE means “noisy spam per frame”, whereas
MP_DBG just means “more verbose debugging messages than MSGL_V”.
Basically, MSGL_DBG shouldn't create spam per frame like it currently
does, and MSGL_V should make sense to the end-user and provide mostly
additional informational output.
MP_DBG is basically what I want to make the new default for --log-file,
so the cut-off point for MP_DBG is if we probably want to know if for
debugging purposes but the user most likely doesn't care about on the
terminal.
Also, the debug callbacks for libass and ffmpeg got bumped in their
verbosity levels slightly, because being external components they're a
bit less relevant to mpv debugging, and a bit too over-eager in what
they consider to be relevant information.
I exclusively used the "try it on my machine and remove messages from
MSGL_* until it does what I want it to" approach of refactoring, so
YMMV.
Fix that libarchive fails to return filenames for UTF-8/UTF-16 entries.
The reason is that it uses locales and all that garbage, and mpv does
not set a locale.
Both C locales and wchar_t are shitfucked retarded legacy braindeath. If
the C/POSIX standard committee had actually competent members, these
would have been deprecated or removed long ago. (I mean, they managed to
remove gets().) To justify this emotional outbreak potentially insulting
to unknown persons, I will write a lot of text. Those not comfortable
with toxic language should pretend this is a religious text.
C locales are supposed to be a way to support certain languages and
cultures easier. One example are character codepages. Back when UTF-8
was not invented yet, there were only 255 possible characters, which is
not enough for anything but English and some european languages. So they
decided to make the meaning of a character dependent on the current
codepage. The locale (LC_CTYPE specifically) determines what character
encoding is currently used.
Of course nowadays, this is legacy nonsense. Everything uses UTF-8 for
"char", and what doesn't is broken and terrible anyway. But the old ways
stayed with us, and the stupidity of it as well.
C locales were utterly moronic even when they were invented. The locale
(via setlocale()) is global state, and global state is not a reasonable
way to do anything. It will break libraries, or well modularized code.
(The latter would be forced to strictly guard all entrypoints set
set/restore locales, assuming a single threaded world.)
On top of that, setting a locale randomly changes the semantics of a
bunch of standard functions. If a function respects locale, you suddenly
can't rely on it to behave the same on all systems. Some behavior can
come as a surprise, and of course it will be dependent on the region of
the user (it doesn't help that most software is US-centric, and the US
locale is almost like the C locale, i.e. almost what you expect).
Idiotically, locales were not just used to define the current character
encoding, but the concept was used for a whole lot of things, like e. g.
whether numbers should use "," or "." as decimal separaror. The latter
issue is actually much worse, because it breaks basic string conversion
or parsing of numbers for the purpose of interacting with file formats
and such.
Much can be said about how retarded locales are, even beyond what I just
wrote, or will wrote below. They are so hilariously misdesigned and
insufficient, I can't even fathom how this shit was _standardized_. (In
any case, that meant everyone was forced to implement it.) Many C
functions can't even do it correctly. For example, the character set
encoding can be a multibyte encoding (not just UTF-8, but awful garbage
like Shift JIS (sometimes called SHIT JIZZ), yet functions like
toupper() can return only 1 byte. Or just take the fact that the locale
API tries to define standard paper sizes (LC_PAPER) or telephone number
formatting (LC_TELEPHONE). Who the fuck uses this, or would ever use
this?
But the badness doesn't stop here. At some point, they invented threads.
And they put absolutely no thought into how threads should interact with
locales. So they kept locales as global state. Because obviously, you
want to be able to change the semantics of basic string processing
functions _while_ they're running, right? (Any thread can call
setlocale() at any time, and it's supposed to change the locale of all
other threads.)
At this point, how the fuck are you supposed to do anything correctly?
You can't even temporarily switch the locale with setlocale(), because
it would asynchronously fuckup the other threads. All you can do is to
enforce a convention not to set anything but the C local (this is what
mpv does), or to duplicate standard functions using code that doesn't
query locale (this is what e.g. libass does, a close dependency of mpv).
Imagine they had done this for certain other things. Like errno, with
all the brokenness of the locale API. This simply wouldn't have worked,
shit would just have been too broken. So they didn't. But locales give a
delicious sweet spot of brokenness, where things are broken enough to
cause neverending pain, but not broken enough that enough effort would
have spent to fix it completely.
On that note, standard C11 actually can't stringify an error value. It
does define strerror(), but it's not thread safe, even though C11
supports threads. The idiots could just have defined it to be thread
safe. Even if your libc is horrible enough that it can't return string
literals, it could just just some thread local buffer. Because C11 does
define thread local variables. But hey, why care about details, if you
can just create a shitty standard?
(POSIX defines strerror_r(), which "solves" this problem, while still
not making strerror() thread safe.)
Anyway, back to threads. The interaction of locales and threads makes no
sense. Why would you make locales process global? Who even wanted it to
work this way? Who decided that it should keep working this way, despite
being so broken (and certainly causing implementation difficulties in
libc)? Was it just a fucked up psychopath?
Several decades later, the moronic standard committees noticed that this
was (still is) kind of a bad situation. Instead of fixing the situation,
they added more garbage on top of it. (Probably for the sake of
"compatibility"). Now there is a set of new functions, which allow you
to override the locale for the current thread. This means you can
temporarily override and restore the local on all entrypoints of your
code (like you could with setlocale(), before threads were invented).
And of course not all operating systems or libcs implement this. For
example, I'm pretty sure Microsoft doesn't. (Microsoft got to fuck it up
as usual, and only provides _configthreadlocale(). This is shitfucked on
its own, because it's GLOBAL STATE to configure that GLOBAL STATE should
not be GLOBAL STATE, i.e. completely broken garbage, because it requires
agreement over all modules/libraries what behavior should be used. I
mean, sure, makign setlocale() affect only the current thread would have
been the reasonable behavior. Making this behavior configurable isn't,
because you can't rely on what behavior is active.)
POSIX showed some minor decency by at least introducing some variations
of standard functions, which have a locale argument (e.g. toupper_l()).
You just pass the locale which you want to be used, and don't have to do
the set locale/call function/restore locale nonense. But OF COURSE they
fucked this up too. In no less than 2 ways:
- There is no statically available handle for the C locale, so you have
to initialize and store it somewhere, which makes it harder to make
utility functions safe, that call locale-affected standard functions
and expect C semantics. The easy solution, using pthread_once() and a
global variable with the created locale, will not be easily accepted
by pedantic assholes, because they'll worry about allocation failure,
or leaking the locale when using this in library code (and then
unloading the library). Or you could have complicated library
init/uninit functions, which bring a big load of their own mess.
Same for automagic DLL constructors/destructors.
- Not all functions have a variant that takes a locale argument, and
they missed even some important ones, like snprintf() or strtod() WHAT
THE FUCK WHAT THE FUCK WHAT THE FUCK WHAT THE FUCK WHAT THE FUCK WHAT
THE FUCK WHAT THE FUCK WHAT THE FUCK WHAT THE FUCK
I would like to know why it took so long to standardize a half-assed
solution, that, apart from being conceptually half-assed, is even
incomplete and insufficient. The obvious way to fix this would have
been:
- deprecate the entire locale API and their use, and make it a NOP
- make UTF-8 the standard character type
- make the C locale behavior the default
- add new APIs that explicitly take locale objects
- provide an emulation layer, that can be used to transparently build
legacy code without breaking them
But this wouldn't have been "compatible", and the apparently incompetent
standard committees would have never accepted this. As if anyone
actually used this legacy garbage, except other legacy garbage. Oh yeah,
and let's care a lot about legacy compatibility, and let's not care at
all about modern code that either has to suffer from this, or subtly
breaks when the wrong locales are active.
Last but not least, the UTF-8 locale name is apparently not even
standardized. At the moment I'm trying to use "C.UTF-8", which is
apparently glibc _and_ Debian specific. Got to use every opportunity to
make correct usage of UTF-8 harder. What luck that this commit is only
for some optional relatively obscure mpv feature.
Why is the C locale not UTF-8? Why did POSIX not standardize an UTF-8
locale? Well, according to something I heard a few years ago, they're
considering disallowing UTF-8 as locale, because UTF-8 would violate
certain ivnariants expected by C or POSIX. (But I'm not sure if I
remember this correctly - probably better not to rage about it.)
Now, on to libarchive.
libarchive intentionally uses the locale API and all the broken crap
around it to "convert" UTF-8 or UTF-16 (as contained in reasonably sane
archive formats) to "char*". This is a good start!
Since glibc does not think that the C locale uses UTF-8, this fails for
mpv. So trying to use archive_entry_pathname() to get the archive entry
name fails if the name contains non-ASCII characters.
Maybe use archive_entry_pathname_utf8()? Surely that should return
UTF-8, since its name seems to indicate that it returns UTF-8. But of
fucking course it doesn't! libarchive's horribly convoluted code (that
is full of locale API usage and other legacy shit, as well as ifdefs and
OS specific code, including Windows and fucking Cygwin) somehow fucks up
and fails if the locale is not set to UTF-8. I made a PR fixing this in
libarchive almost 2 years ago, but it was ignored.
So, would archive_entry_pathname_w() as fallback work? No, why would it?
Of course this _also_ involves shitfucked code that calls shitfucked
standard functions (or OS specific ifdeffed shitfuck). The truth is that
at least glibc changes the meaning of wchar_t depending on the locale.
Unlike most people think, wchar_t is not standardized to be an UTF
variant (or even unicode) - it's an encoding that uses basic units that
can be larger than 8 bit. It's an implementation defined thing. Windows
defines it to 2 bytes and UTF-16, and glibc defines it to 4 bytes and
UTF-32, but only if an UTF-8 locale is set (apparently).
Yes. Every libarchive function dealing with strings has 3 variants:
plain, _utf8, and _w. And none of these work if the locale is not set.
I cannot fathom why they even have a wchar_t variant, because it's
redundant and fucking useless for any modern code.
Writing a UTF-16 to UTF-8 conversion routine is maybe 3 pages of code,
or a few lines if you use iconv. But libarchive uses all this glorious
bullshit, and ends up with 3 not working API functions, and with over
4000 lines of its own string abstraction code with gratuitous amounts of
ifdefs and OS dependent code that breaks in a fairly common use case.
So what we do is:
- Use the idiotic POSIX 2008 API (uselocale() etc.) (Too bad for users
who try to build this on a system that doesn't have these - hopefully
none are left in 2017. But if there are, torturing them with obscure
build errors is probably justified. Might be bad for Windows though,
which is a very popular platform except on phones.)
- Use the "C.UTF-8" locale, which is probably not 100% standards
compliant, but works on my system, so it's fine.
- Guard every libarchive call with uselocale() + restoring the locale.
- Be lazy and skip some libarchive calls. Look forward to the unlikely
and astonishingly stupid bugs this could produce.
We could also just set a C UTF-8 local in main (since that would have no
known negative effects on the rest of the code), but this won't work for
libmpv.
We assume that uselocale() never fails. In an unexplainable stroke of
luck, POSIX made the semantics of uselocale() nice enough that user code
can fail failures without introducing crash or security bugs, even if
there should be an implementation fucked up enough where it's actually
possible that uselocale() fails even with valid input.
With all this shitty ugliness added, it finally works, without fucking
up other parts of the player. This is still less bad than that time when
libquivi fucked up OpenGL rendering, because calling a libquvi function
would load some proxy abstraction library, which in turn loaded a KDE
plugin (even if KDE was not used), which in turn called setlocale()
because Qt does this, and consequently made the mpv GLSL shader
generation code emit "," instead of "." for numbers, and of course only
for users who had that KDE plugin installed, and lived in a part of the
world where "." is not used as decimal separator.
All in all, I believe this proves that software developers as a whole
and as a culture produce worse results than drug addicted butt fucked
monkeys randomly hacking on typewriters while inhaling the fumes of a
radioactive dumpster fire fueled by chinese platsic toys for children
and Elton John/Justin Bieber crossover CDs for all eternity.
According to
https://github.com/libarchive/libarchive/pull/773#issuecomment-334892291
we're not allowed to "continue reading" (post above) or performing "more
operations" (comments in archive.h header), whatever that means. Assume
closing and freeing the archive is still ok.
Since the codec already includes logic for closing and reopening the
archive for seeking in unseekable archives, this probably isn't too bad.
Untested due to lack of crashing sample (I lost my original test case,
and as recently user-provided one didn't crash).
In the extreme case, reading 1 byte would wake up the cache to make the
cache thread read 1 byte. This would be extremely inefficient. This will
not normally happen in our cache implementation, but it's still present
to some lesser degree. Normally you'd set a predefined "cache too low"
boundary, after which you would restart reading. For some reason
something like this is already present using a hardcoded value
(FILL_LIMIT - I don't even know the deeper reason why this exists). So
use that to reduce wakeups.
This doesn't fix redundant wakeups on EOFs, which is especially visible
should something keep retrying reading on EOF (like in an endless loop).
This should actually cover all of them, if you take into account that
some unchanged GPL source files include header files with such checks.
Also this was done already for the libaf derived code.
This is only for "safety" and to avoid misunderstandings.
ATSC is a mix of terrestrial and cable,
and depending on modulation is actually using
DVBC_ANNEX_B. Thus, we need to override the delivery
system depending on the modulation, channel by channel.
Signed-off-by: Oliver Freyermuth <o.freyermuth@googlemail.com>
These values are kept with a different unit in VDR style config
for all delivery systems, not only for DVB-S / DVB-S2.
Signed-off-by: Oliver Freyermuth <o.freyermuth@googlemail.com>
Dump the complete raw tuning commands to allow for debugging
on low level.
Also, remove code duplication and some variable shadowing.
Signed-off-by: Oliver Freyermuth <o.freyermuth@googlemail.com>