1
0
mirror of https://github.com/mpv-player/mpv synced 2024-12-26 00:42:57 +00:00
Commit Graph

63 Commits

Author SHA1 Message Date
Akemi
8bbdecea83 osx: consistent normalisation when searching for external files
several unicode characters can be encoded in two different ways, either
in a precomposed (NFC) or decomposed (NFD) representation. everywhere
besides on macOS, specifically HFS+, precomposed strings are being used.
furthermore on macOS we can get either precomposed or decomposed
strings, for example when not HFS+ formatted volumes are used. that can
be the case for network mounted devices (SMB, NFS) or optical/removable
devices (UDF). this can lead to an inequality of actual equal strings,
which can happen when comparing strings from different sources, like the
command line or filesystem. this makes it mainly a problem on macOS
systems.

one case that can potential break is the sub-auto option. to prevent
that we convert the search string as well as the string we search in to
the same normalised representation, specifically we use the decomposed
form which is used anywhere else.

this could potentially be a problem on other platforms too, though the
potential of occurring is very minor. for those platforms we don't
convert anything and just fallback to the input.

Fixes #4016
2017-02-02 16:21:04 +01:00
wm4
06c55ac6c3 charset_conv: fallback to interpreting subs as latin1 if iconv fails
For display purposes, it's better to show scrambled text - at least
that's a more actionable failure mode than spamming the terminal with
FFmpeg nonsense error messages.

This avoids the obnoxious and pointless

  "Invalid UTF-8 in decoded subtitles text; maybe missing -sub_charenc option"

FFmpeg error, which will be spammed on every single subtitle event. We
don't even have a -sub-charenc option, fuck FFmpeg.

Did I mention fuck FFmpeg yet? Because fuck FFmpeg.
2017-01-22 13:10:16 +01:00
wm4
310671e91a charset_conv: support minimum compatibility to utf8:... syntax
Because it's the most commonly used one, and trivial to support.
2017-01-22 13:06:36 +01:00
wm4
4adfde5dd1 options: drop deprecated --sub-codepage syntax 2017-01-19 15:46:59 +01:00
wm4
35716b53db charset_conv: fix "auto" fallback with uchardet not compiled
Tried to open iconv with "auto" as source codepage, instead of using
the latin1 fallback. This also neutralizes the libavcodec dumbass
UTF-8 check, which discards subtitles not in UTF-8 and shows an
error message for ffmpeg CLI instead.

Fixes #3954.
2016-12-28 15:31:25 +01:00
wm4
c21c68edd4 bstr: change to LGPL
Was started by Uoti Urpala in commit 5f631d1c. Although it was made part
of demux_mkv.c, it's quite obvious that it's not based on any
pre-existing demux_mkv.c code (or ebml.c/.h for that matter). Anyone
else who has touched this code every since has already agreed to LGPL
relicensing.
2016-12-11 18:09:44 +01:00
wm4
c324bfab59 charset_conv: simplify and change --sub-codepage option
As documented in interface-changes.rst. This makes it much easier to
follow what the heck is going on.

Whether this is adequate for real-world use is unknown.
2016-12-09 19:51:29 +01:00
wm4
0eb87e1baf charset_conv: drop enca and libguess support
Enca is dead. libguess is relatively useless due to not having an
universal detection mode. On the other hand, libuchardet is actively
developed.

Manpages changes in the following commit.
2016-12-09 19:48:59 +01:00
wm4
4395a4f837 player: don't enter playloop for client API requests
This _actually_ does what commit 8716c2e8 promised, and gives a slight
performance improvement for client API users which make a lot of
requests (like reading properties).

The main issue was that mp_dispatch_lock() (which client.c uses to get
exclusive access to the core) still called the wakeup callback, which
made mp_dispatch_queue_process() exit. So the playloop got executed
again, and since it does a lot of stuff, performance could be reduced.
2016-09-16 20:24:52 +02:00
wm4
17e3e800e1 dispatch: fix a race condition triggering an assert()
If we were waiting, and then exiting due to timeout, we still have to
recheck the condition protected by the condition variable/mutex in order
to get back to a consistent state. In this case, the queue was locked
with mp_dispatch_lock(), and mp_dispatch_queue_process() got to return
without waiting for unlock.

Also caused commit 8716c2e8. Probably an argument for replacing the
dispatch queue by a simple mutex.
2016-09-16 16:11:33 +02:00
wm4
8716c2e88f player: use better way to wait for input and dispatching commands
Instead of using input_ctx for waiting, use the dispatch queue directly.
One big change is that the dispatch queue will just process commands
that come in (e.g. from client API) without returning. This should
reduce unnecessary playloop excutions (which is good since the playloop
got a bit fat from rechecking a lot of conditions every iteration).

Since this doesn't force a new playloop iteration on every access, this
has to be enforced manually in some cases.

Normal input (via terminal or VO window) still wakes up the playloop
every time, though that's not too important. It makes testing this
harder, though. If there are missing wakeup calls, it will be noticed
only when using the client API in some form.

At this point we could probably use a normal lock instead of the
dispatch queue stuff.
2016-09-16 14:49:23 +02:00
wm4
591e21a2eb osdep: rename atomics.h to atomic.h
The standard header is stdatomic.h, so the extra "s" freaks me out every
time I look at it.
2016-09-07 11:26:25 +02:00
wm4
6cd80e972e dispatch: improve recent locking changes slightly
Instead of adding a lock_frame to the list when mp_dispatch_lock() is
called, just set a simple flag. This uses the fact that the lock is not
recursive, and can happen once per mp_dispatch_queue_process(). It
avoids the dynamic allocation, and makes error checking slightly
stricter.

Again, this is actually redundant and exists only for error-checking.
It'd actually need only a counter, because the actual locking is done by
"parking" the target thread in mp_dispatch_queue_process() and then
setting queue->idling=false. Only when mp_dispatch_unlock() sets it to
true again other work can proceed again. Document this too.
2016-09-05 20:58:45 +02:00
wm4
3878a59e2c dispatch: redo locking, and allow reentrant processing
A deadlock bug was reported with the following test program:

	mpv_handle *mpv = mpv_create();
	mpv_set_option_string(mpv, "ytdl", "yes");
	mpv_initialize(mpv);
	mpv_terminate_destroy(mpv);

The cause of this is loading the ytdl.lua script, which triggers a
certain code path that calls mp_dispatch_queue_process() recursively. It
does so to wait until the script is loaded, and we want to keep that.

Reentrancy was not supported by mp_dispatch, which leads to the
deadlock. Rewrite the locking so that it does. We mainly get rid of the
"exclusive_lock" mutex. Instead we use the existing lock/condition
variable to wait until we can grab a logical lock.

Note that the lock_frame business can be replaced with a simple counter.
Instead of checking the lock_frame address, it'd simply increment and
store the counter when entering mp_dispatch_queue_process(), and then
compare the counter to decide whether or not to wait. But I think the
additional error checking done by the lock_frame list is valuable.

Fixes #3489.
2016-09-04 18:05:36 +02:00
wm4
2619d8eff4 client API: implement mpv_suspend/resume slightly differently
Why do these API calls even still exist? I don't know, and maybe they
don't make any sense anymore. But whether they should be removed or not
is not a decision I want to make now. I want to get rid of
mp_dispatch_suspend/resume(), though. So implement the client APIs
slightly differently.
2016-09-04 18:05:36 +02:00
Jeong Woon Choi
875aeb0f5c charset_conv: Use CP949 instead of EUC-KR
iconv distinguishes between euc-kr and cp949, while libguess
and libuchardet doesn't (only returns euc-kr). EILSEQ occurs
when the input encoding of iconv is set to euc-kr and if the subs
contain letters not included in euc-kr. Since cp949 is a extension
of euc-kr, choose cp949 instead.

Signed-off-by: wm4 <wm4@nowhere>
2016-09-02 14:46:11 +02:00
wm4
a9a55ea7f2 misc: add some annoying mpv_node helpers
Sigh.

Some parts of mpv essentially duplicate this code (with varrying levels
of triviality) - this can be fixed "later".
2016-08-28 19:39:05 +02:00
stepshal
c5094206ce Fix misspellings 2016-06-26 13:47:21 +02:00
Niklas Haas
034faaa9d8 vo_opengl: use RPN expressions for user hook sizes
This replaces the previous TRANSFORM by WIDTH, HEIGHT and OFFSET where
WIDTH and HEIGHT are RPN expressions. This allows for more fine-grained
control over the output size, and also makes sure that overwriting
existing textures works more cleanly.

(Also add some more useful bstr functions)
2016-05-15 20:42:02 +02:00
wm4
f9084c7bce bstr: avoid redundant vsnprintf calls
Until now, bstr_xappend_vasprintf() called vsnprintf() always twice:
once to determine how much output the call would produce, and a second
time to actually output the data to the (possibly resized) target
memory.

Change this so that it tries to output to the already allocated memory
first, and repeat the call only if allocation is required.

This is especially helpful, as bstr_xappend_vasprintf() is designed to
avoid reallocation when building strings. Usually, the second
vsnprintf() will happen only at the beginning, when the buffer hasn't
been extended to his largest needed size yet.

Not sure if there is a need to optimize this; but see the next commit.
2016-03-23 22:00:35 +01:00
wm4
a6a358ce61 dispatch: clarify lifetime issues 2016-02-26 23:28:02 +01:00
wm4
8a9b64329c Relicense some non-MPlayer source files to LGPL 2.1 or later
This covers source files which were added in mplayer2 and mpv times
only, and where all code is covered by LGPL relicensing agreements.

There are probably more files to which this applies, but I'm being
conservative here.

A file named ao_sdl.c exists in MPlayer too, but the mpv one is a
complete rewrite, and was added some time after the original ao_sdl.c
was removed. The same applies to vo_sdl.c, for which the SDL2 API is
radically different in addition (MPlayer supports SDL 1.2 only).

common.c contains only code written by me. But common.h is a strange
case: although it originally was named mp_common.h and exists in MPlayer
too, by now it contains only definitions written by uau and me. The
exceptions are the CONTROL_ defines - thus not changing the license of
common.h yet.

codec_tags.c contained once large tables generated from MPlayer's
codecs.conf, but all of these tables were removed.

From demux_playlist.c I'm removing a code fragment from someone who was
not asked; this probably could be done later (see commit 15dccc37).

misc.c is a bit complicated to reason about (it was split off mplayer.c
and thus contains random functions out of this file), but actually all
functions have been added post-MPlayer. Except get_relative_time(),
which was written by uau, but looks similar to 3 different versions of
something similar in each of the Unix/win32/OSX timer source files. I'm
not sure what that means in regards to copyright, so I've just moved it
into another still-GPL source file for now.

screenshot.c once had some minor parts of MPlayer's vf_screenshot.c, but
they're all gone.
2016-01-19 18:36:06 +01:00
Dmitrij D. Czarkoff
ea442fa047 mpv_talloc.h: rename from talloc.h
This change helps avoiding conflict with talloc.h from libtalloc.
2016-01-11 21:05:55 +01:00
wm4
fc7212b214 charset_conv: check for UTF-8 if uchardet returns unknown
When libuchardet returns an empty string, it can be either ASCII, UTF-8,
or an unknown encoding. Try to distinguish it from the unknown case by
checking for UTF-8. This avoids an annoying message, and avoids
unnecessary processing (we convert invalid UTF-8 sequences to latin1 to
workaround libavcodec's braindead UTF-8 check).
2015-12-20 20:55:24 +01:00
wm4
74c11f0c84 sub: detect charset in demuxer
Slightly simpler, and removes the need to pre-read all subtitle packets.

This still does the subtitle charset conversion on the packet level
(instead converting when parsing the file), so in theory this still
could provide a way to change the charset at runtime. But maybe even
this should be removed, as FFmpeg is somewhat likely to get its own
charset detection and conversion mechanism in the future. (Would have
to keep the subtitle file in memory to allow changing the charset on
the fly, I guess.)
2015-12-17 01:17:23 +01:00
wm4
384b13c5fd demux_libass: remove this demuxer
This loaded external .ass files via libass. libavformat's .ass reader is
now good enough, so use that instead.

Apparently libavformat still doesn't support fonts embedded into text
.ass files, but support for this has been accidentally broken in mpv for
a while anyway. (And only 1 person complained.)
2015-11-11 21:28:20 +01:00
wm4
d60270ed3d sub: fix --sub-codepage UTF-8 with fallback
Fixes e.g --sub-codepage=utf8:gb18030 if the subtitle us UTF-8.

This was broken in commit e5d31808.

Also log the detected charset in verbose mode.
2015-09-01 13:37:45 +02:00
wm4
e5d3180889 charset_conv: use our own UTF-8 check with ENCA only
Some charsets can look like valid UTF-8, but aren't UTF-8. One example
is ISO-2022-JP. While ENCA apparently likes to get misdetect real UTF-8,
this is not the case with uchardet. uchardet can detect ISO-2022-JP
correctly, but didn't even get to try, because our own UTF-8 check
succeeded. So run the UTF-8 check when using ENCA only.

Fixes #2195.
2015-08-04 18:58:58 +02:00
Jehan
e7897dfb9b charset_conv: "auto" encoding detection now uses uchardet.
If mpv is not built with uchardet, "enca" is still the fallback default
encoding detection.
2015-08-04 17:51:00 +02:00
wm4
bbbb2d9723 charset_conv: fix switched parameters
Fixes #2186.
2015-08-02 18:21:22 +02:00
wm4
a74914a057 charset_conv: add uchardet support
For now, it needs to be explicitly selected. ENCA is still the default.

This assumes uchardet returns iconv names. This doesn't seem to be
always the case, and the result are lots of iconv errors. So
explicitly check for this situation, and print a warning if it
occurs. It's entirely possible that uchardet support is actually
useless, because names are not necessarily iconv-compatible (but
uchardet doesn't seem to document whether it attempts to return
iconv-compatible names if possible).

Fixes #908.
2015-08-02 00:03:29 +02:00
wm4
11f2be2bcc charset_conv: make it possible to return an allocated string as guess
uchardet is written in C++, and thus doesn't appreciate the value of
using static strings, and internally stores the guessed charset as
allocated std::string. Add a minimal hack to deal with this. (I don't
appreciate that the code is potentially harder to understand by
returning either a static or allocated string, but I do appreciate for
not having to litter the existing code with strdups.)
2015-08-01 23:49:37 +02:00
wm4
92b9d75d72 threads: use utility+POSIX functions instead of weird wrappers
There is not much of a reason to have these wrappers around. Use POSIX
standard functions directly, and use a separate utility function to take
care of the timespec calculations. (Course POSIX for using this weird
format for time values.)
2015-05-11 23:44:36 +02:00
wm4
f77e3cbf0c json: fix UTF-8 handling
We escape only characters below 32, plus " and \. UTF-8 should be apssed
through verbatim. Since char can be signed (and usually is), the check
broke and happened to escape UTF-8 encoded bytes too. This broke UTF-8
completely.

Note that we don't check for broken or invalid UTF-8, such as described
both in the client API and IPC docs.

Fixes #1874.
2015-04-28 09:58:28 +02:00
Marcin Kurczewski
f43017bfe9 Update license headers
Signed-off-by: wm4 <wm4@nowhere>
2015-04-13 12:10:01 +02:00
wm4
4a7c6aaedf bstr: fix possible undefined behavior with length 0 strings
BSTR_P() passes the string length and start pointer to printf-like
functions. If the lenfth is 0, the pointer can be NULL, but we're
actually still not allowed to pass a NULL pointer in any case.

This is mostly a technically, because nobody in their right mind would
attempt to specifically break such cases. But it's still undefined
behavior, and some libcs might be strict about this.
2015-01-12 14:43:52 +01:00
wm4
86b521f7df Silence some Coverity warnings
None of this really matters.
2014-11-21 09:59:58 +01:00
wm4
9c456ab58f bstr: don't call memcpy(..., NULL, 0)
This is clearly not allowed, although it's not a problem on most libcs.

Found by Coverity.
2014-11-21 05:18:11 +01:00
wm4
38420eb49e json: handle >\\"< fragments correctly
It assumed that any >\"< sequence was an escape for >"<, but that is not
the case with JSON such as >{"ducks":"\\"}<. In this case, the second
>\< is obviously not starting an escape.
2014-10-21 02:59:30 +02:00
wm4
5548c75e55 lua: expose JSON parser
The JSON parser was introduced for the IPC protocol, but I guess it's
useful here too.

The motivation for this commit is the same as with 8e4fa5fc (again).
2014-10-19 05:51:37 +02:00
wm4
c01151e0bf misc: add JSON parser 2014-10-17 20:46:31 +02:00
James Ross-Gowan
476fc65b0f bstr: check strings before memcmp/strncasecmp
bstr.start can be NULL when bstr.len is 0, so don't call memcmp or
strncasecmp if that's the case. Passing NULL to string functions is
invalid C, even when the length is 0, and it causes Windows to raise an
invalid parameter error.

Should fix #1155
2014-10-07 08:45:27 +02:00
wm4
68ff8a0484 Move compat/ and bstr/ directory contents somewhere else
bstr.c doesn't really deserve its own directory, and compat had just
a few files, most of which may as well be in osdep. There isn't really
any justification for these extra directories, so get rid of them.

The compat/libav.h was empty - just delete it. We changed our approach
to API compatibility, and will likely not need it anymore.
2014-08-29 12:31:52 +02:00
wm4
559fe1daac Add Plan 9-style barriers
Plan 9 has a very interesting synchronization mechanism, the
rendezvous() call. A good property of this is that you don't need to
explicitly initialize and destroy a barrier object, unlike as with e.g.
POSIX barriers (which are mandatory to begin with). Upon "meeting", they
can exchange a value.

This mechanism will be nice to synchronize certain stages of
initialization between threads in the following commit.

Unlike Plan 9 rendezvous(), this is not implemented with a hashtable,
because that would require additional effort (especially if you want to
make it actually scele). Unlike the Plan 9 variant, we use intptr_t
instead of void* as type for the value, because I expect that we will be
mostly passing a status code as value and not a pointer. Converting an
integer to void* requires two cast (because the integer needs to be
intptr_t), the other way around it's only one cast.

We don't particularly care about performance in this case either. It's
simply not important for our use-case. So a simple linked list is used
for waiters, and on wakeup, all waiters are temporarily woken up.
2014-07-26 20:29:48 +02:00
wm4
aa1a383342 sub: add detection via BOM
Useful for Windows stuff. Actually, ENCA support should catch this, but,
well, whatever, everyone seems to hate ENCA.

Detection with BOM is trivial, although it needs some hackery to
integrate it with the existing autodetection support. For one, change
the default value of --sub-codepage to make this easier.

Probably fixes issue #937 (the second part).
2014-07-22 23:40:48 +02:00
wm4
f8c2dd1b78 build: include <strings.h> for strcasecmp()
It happens to work without strings.h on glibc or with _GNU_SOURCE, but
the POSIX standard requires including <strings.h>.

Hopefully fixes OSX build.
2014-07-10 08:29:32 +02:00
wm4
9a210ca2d5 Audit and replace all ctype.h uses
Something like "char *s = ...; isdigit(s[0]);" triggers undefined
behavior, because char can be signed, and thus s[0] can be a negative
value. The is*() functions require unsigned char _or_ EOF. EOF is a
special value outside of unsigned char range, thus the argument to the
is*() functions can't be a char.

This undefined behavior can actually trigger crashes if the
implementation of these functions e.g. uses lookup tables, which are
then indexed with out-of-range values.

Replace all <ctype.h> uses with our own custom mp_is*() functions added
with misc/ctype.h. As a bonus, these functions are locale-independent.
(Although currently, we _require_ C locale for other reasons.)
2014-07-01 23:11:08 +02:00
wm4
d87a84ca89 ring: use a different type for read/write pointers
uint_least32_t could be larger than uint32_t, so the return values of
mp_ring_get_wpos/rpos must be adjusted. Actually just use unsigned long
as type instead, because that is less awkward than uint_least32_t.
2014-05-30 02:15:25 +02:00
wm4
690ea07b7a ring: implement drain in terms of read
I think this makes it easier to reason about it and avoids duplicate
logic.
2014-05-29 02:24:05 +02:00
Paweł Forysiuk
f289060259 Fix gcc 4.7 warning about shadowing talloc_parent in mp_dispact_queue 2014-05-28 22:44:43 +02:00