mirror of https://github.com/mpv-player/mpv
DOCS/tech-overview.txt: add lots of irrelevant blabla
Thought it might be useful to document some of these things, instead of explaining them over and over again. But I can guarantee that nobody will ever read all this. (Independent of its quality and completeness.)
parent 1caa653f2d
commit 2a7e17fe4e
@@ -242,3 +242,376 @@ sub/:

etc/:
    The file input.conf is actually integrated into the mpv binary by the
    build system. It contains the default keybindings.

Best practices and Concepts within mpv
======================================

General contribution etc.
-------------------------

See DOCS/contribute.md.

Error checking
--------------

If an error is relevant, it should be handled. If it's interesting, log the
error. However, mpv often keeps errors silent and reports failures somewhat
coarsely by propagating them up the caller chain. This is OK, as long as the
errors are not very interesting, or would require a developer to debug them
anyway (in which case using a debugger would be more convenient, and the
developer would need to add temporary debug printfs to get extremely detailed
information which would not be appropriate during normal operation).

Basically, keep a balance on error reporting. But always check for errors,
unless you have a good argument not to.

Memory allocation errors (OOM) are a special class of errors. Normally such
allocation failures are not handled "properly". Instead, abort() is called.
(New code should use MP_HANDLE_OOM() for this.) This is done out of laziness and
for convenience, and due to the fact that MPlayer/mplayer2 never handled it
correctly. (MPlayer varied between handling it correctly, trying to do so but
failing, and just not caring, while mplayer2 started using abort() for it.)

This is justifiable in a number of ways. Error handling paths are notoriously
untested and buggy, so merely having them won't make your program more reliable.
Having these error handling paths also complicates non-error code, due to the
need to roll back state at any point after a memory allocation.

Take any larger body of code that is supposed to handle OOM, and test whether
the error paths actually work, for example by overriding malloc with a version
that randomly fails. You will find bugs quickly, and often they will be very
annoying to fix (if you can even reproduce them).

In addition, a clear indication that something went wrong may be missing. On
error your program may exhibit "degraded" behavior by design. Consider a video
encoder dropping frames somewhere in the middle of a video due to temporary
allocation failures, instead of just exiting with an error. In other cases, it
may open conceptual security holes. Failing fast may be better.

mpv uses GPU APIs, which may break on allocation errors (because driver
authors will have the same issues as described here), or don't even have a real
concept for dealing with OOM (OpenGL).

libmpv is often used by GUIs, which I predict will always break if OOM happens.

Last but not least, OSes like Linux use "overcommit", which basically means that
your program may crash any time OOM happens, even if it doesn't use malloc() at
all!

But still, don't just assume malloc() always succeeds. Use MP_HANDLE_OOM(). The
ta* APIs do this for you. The reason for this is that dereferencing a NULL
pointer can have security-relevant consequences if large offsets are involved.
Also, a clear error message is better than a random segfault.
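As a rough illustration of the "check, then fail fast" approach, here is a
minimal sketch. HANDLE_OOM() and dup_or_abort() are hypothetical stand-ins made
up for this example. They are not mpv's actual MP_HANDLE_OOM() definition, which
may look different.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical stand-in for an OOM check macro (not mpv's real definition).
// Prints a clear message and aborts if the pointer is NULL.
#define HANDLE_OOM(p) do {                                      \
        if (!(p)) {                                             \
            fprintf(stderr, "%s:%d: out of memory\n",           \
                    __FILE__, __LINE__);                        \
            abort();                                            \
        }                                                       \
    } while (0)

// Duplicate a string. Never returns NULL: on OOM the process aborts with a
// message, instead of crashing later on a NULL dereference.
static char *dup_or_abort(const char *s)
{
    assert(s);
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    HANDLE_OOM(copy);
    memcpy(copy, s, len);
    return copy;
}
```

Callers of such a helper do not need an error path for the allocation, which is
the convenience argument made above.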

Some big memory allocations are checked anyway. For example, all code must
assume that allocating video frames or packets can fail. (The above example
of dropping video frames during encoding is entirely possible in mpv.)

Undefined behavior
------------------

Undefined behavior (UB) is a concept in the C language. C is famous for being a
language that makes it almost impossible to write working code, because
undefined behavior is so easily triggered, compilers will happily abuse it to
generate "faster" code, debugging tools will shout at you, and sometimes it
even means your code doesn't work.

There is a lot of literature on this topic. Read it.

(In C's defense, UB exists in other languages too, but since they're not used
for low level infrastructure, and/or these languages are at times not rigorously
defined, simply nobody cares. However, the C standard committee is still guilty
of not addressing this. I'll admit that I can't even tell from the standard's
gibberish whether some specific behavior is UB or not. It's written like tax
law.)

In mpv, we generally try to avoid undefined behavior. For one, we want portable
and reliable operation. But more importantly, we want clean output from
debugging tools, in order to find real bugs more quickly and effectively.

Avoid the "works in practice" argument. Once debugging tools come into play, or
simply when "in practice" stops being true, this will all come back to you in a
bad way.

Global state, library safety
----------------------------

Mutable global state is when code uses global variables that are not read-only.
This must be avoided in mpv. Always use context structs that the caller of
your code needs to allocate, and whose pointers are passed to your functions.
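A minimal sketch of the context struct pattern follows. The frame_counter
component and all of its functions are hypothetical names invented for this
example. They do not exist in mpv.

```c
#include <assert.h>
#include <stdlib.h>

// Hypothetical component: all state lives in a context struct owned by the
// caller, instead of in global variables.
struct frame_counter {
    int frames;     // number of frames counted so far
    int dropped;    // number of frames reported as dropped
};

// Allocate a new, independent instance. Returns NULL on allocation failure.
struct frame_counter *frame_counter_create(void)
{
    return calloc(1, sizeof(struct frame_counter));
}

// Count one frame. Only touches the instance passed in.
void frame_counter_add(struct frame_counter *c, int dropped)
{
    assert(c);
    c->frames += 1;
    if (dropped)
        c->dropped += 1;
}

void frame_counter_destroy(struct frame_counter *c)
{
    free(c);
}
```

Two instances created this way never affect each other, so two users in the
same process do not need to know about each other.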

Library safety means that your code (or library) can be used by a library
without causing conflicts with other library users in the same process. To any
piece of code, a "safe" library's API can simply be used, without having to
worry about other API users that may be around somewhere.

Libraries are often not library safe, because they use global mutable state
or other "global" resources. Typical examples include use of signals, simple
global variables (like hsearch() in libc), or internal caches not protected by
locks.

A surprisingly high number of libraries are not library safe because they need
global initialization. Typically they provide an API function, which
"initializes" the library, and which must be called before calling any other
API functions. Often, you are to provide global configuration parameters, which
can change the behavior of the library. If two libraries A and B use library C,
but A and B initialize C with different parameters, something "bad" may happen.
In addition, these global initialization functions are often not thread-safe. So
if A and B try to initialize C at the same time (from different threads and
without knowing about each other), it may cause undefined behavior. (libcurl is
a good example of both of these issues. FFmpeg and some TLS libraries used to be
affected, but improved.)

This is so bad because library A and B from the previous example most likely
have no way to cooperate, because they're from different authors and have no
business knowing about each other. They'd need a library D, which wraps library
C in a safe way. Unfortunately, typically something worse happens: libraries get
"infected" by the unsafeness of their sub-libraries, and export a global init
API just to initialize the sub-libraries. In the previous example, libraries A
and B would export global init APIs just to init library C, even though the rest
of A/B are clean and library safe. (Again, libcurl is an example of this, if you
subtract other historic anti-features.)
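The "library D" idea can be sketched with pthread_once(), assuming everyone in
the process goes through the wrapper. libfoo_global_init() is a hypothetical
stand-in for the unsafe init function of "library C", and libd_ensure_init() is
an invented name. This only addresses the thread-safety part. It does not help
if A and B want different init parameters.

```c
#include <assert.h>
#include <pthread.h>

// Hypothetical stand-in for the global init function of an unsafe library.
// It must run once per process and is not thread-safe by itself.
static int foo_init_calls;
static void libfoo_global_init(void)
{
    foo_init_calls++;
}

// "Library D": the only place in the process that initializes the library.
static pthread_once_t d_once = PTHREAD_ONCE_INIT;

// Safe to call from any thread, any number of times. pthread_once() runs the
// init function exactly once, and every caller returns only after the init
// function has completed.
void libd_ensure_init(void)
{
    int r = pthread_once(&d_once, libfoo_global_init);
    assert(r == 0);
    (void)r;
}
```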

The main problem with library safety is that its lack propagates to all
libraries using the library.

We require libmpv to be library safe. This is not really possible, because some
libraries are not library safe (FFmpeg, Xlib, partially ALSA). However, for
ideological reasons, there is no global init API, and best effort is made to try
to avoid problems.

libmpv has some features that are not library safe, but which are disabled by
default (such as terminal usage aka stdout, or JSON IPC blocking SIGPIPE for
internal convenience).

A notable, very disgustingly library unsafe behavior of libmpv is calling
abort() on some memory allocation failures. See the error checking section.

Logging
-------

All logging and terminal output in mpv goes through the functions and macros
provided in common/msg.h. This is in part for library safety, and in part to
make sure users can silence all output, or redirect the output elsewhere,
like a log file or the internal console.lua script.

Locking
-------

See generally available literature. In mpv, we use pthread for this.

Always keep locking clean. Don't skip locking just because it will work "in
practice". (See undefined behavior section.) If your use case is simple, you may
use C11 atomics (osdep/atomic.h for partial C99 support), but most likely you
will only hurt yourself and others.

Always make clear which fields in a struct are protected by which lock. If a
field is immutable, or simply not thread-safe (e.g. state for a single worker
thread), document it as well.
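One possible way to document this is with section comments inside the struct.
The sketch below is only an illustration. struct worker and its functions are
hypothetical and not taken from mpv.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

// Hypothetical worker state, annotated with its locking rules.
struct worker {
    // --- immutable after worker_init()
    int id;

    // --- protected by lock
    pthread_mutex_t lock;
    int queued_jobs;
    bool stop_requested;

    // --- accessed by the worker thread only (no locking needed)
    int jobs_done;
};

void worker_init(struct worker *w, int id)
{
    assert(w);
    *w = (struct worker){ .id = id };
    pthread_mutex_init(&w->lock, NULL);
}

// Thread-safe: may be called from any thread.
void worker_queue_job(struct worker *w)
{
    pthread_mutex_lock(&w->lock);
    w->queued_jobs += 1;
    pthread_mutex_unlock(&w->lock);
}

// Thread-safe: returns a snapshot of the number of queued jobs.
int worker_get_queued(struct worker *w)
{
    pthread_mutex_lock(&w->lock);
    int n = w->queued_jobs;
    pthread_mutex_unlock(&w->lock);
    return n;
}

void worker_destroy(struct worker *w)
{
    pthread_mutex_destroy(&w->lock);
}
```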

Internal mpv APIs are assumed to be not thread-safe by default. If they have
special guarantees (such as being usable by more than one thread at a time),
these should be explicitly documented.

All internal mpv APIs must be free of global state. Even if a component is not
thread-safe, multiple threads can use _different_ instances of it without any
locking.

On a side note, recursive locks may seem convenient at first, but they introduce
additional problems with condition variables and locking hierarchies. They
should be avoided.

Locking hierarchy
-----------------

A simple way to avoid deadlocks with classic locking is to define a locking
hierarchy or lock order. If all threads acquire locks in the same order, no
deadlocks will happen.

For example, a "leaf" lock is a lock that is below all other locks in the
hierarchy. You can acquire it any time, as long as you don't acquire other
locks while holding it.

Unfortunately, C has no way to declare or check the lock order, so you should at
least document it.

In addition, try to avoid exposing locks to the outside. Making the declaration
of a lock private to a specific .c file (and _not_ exporting accessors or
lock/unlock that manipulate the lock) is a good idea. Your component's API may
acquire internal locks, but should release them when returning. Keeping all
locking in a single file makes it easy to check.
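A small sketch of a documented lock order with a leaf lock. struct pool,
struct item and pool_add_to_current() are hypothetical names used only for this
illustration. In real code the structs would typically be private to one .c
file.

```c
#include <assert.h>
#include <pthread.h>

// Hypothetical two-level component. Lock order (documented, since C cannot
// check it): pool.lock must be acquired before item.lock, never the reverse.
struct item {
    pthread_mutex_t lock;   // leaf lock: no other lock is taken while held
    int value;              // protected by lock
};

struct pool {
    pthread_mutex_t lock;   // acquired before any item.lock
    struct item *current;   // protected by lock
};

// Adds n to the current item. Follows the lock order pool -> item, and
// releases both locks before returning to the caller.
void pool_add_to_current(struct pool *p, int n)
{
    assert(p);
    pthread_mutex_lock(&p->lock);
    struct item *it = p->current;
    if (it) {
        pthread_mutex_lock(&it->lock);
        it->value += n;
        pthread_mutex_unlock(&it->lock);
    }
    pthread_mutex_unlock(&p->lock);
}
```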

Avoiding callback hell
----------------------

mpv code is separated into components, like the "frontend" (i.e. MPContext
mpctx), VOs, AOs, demuxers, and more. The frontend usually calls "down" the
usage hierarchy: mpctx almost on top, then things like vo/ao, and utility code
on the very bottom.

"Callback hell" is when components call both up and down the hierarchy,
which for example leads to accidental recursion, reentrancy problems, or
locking nightmares. This is avoided by (mostly) calling only down the hierarchy.
Basically the call graph forms a DAG. The other direction is handled by event
queues, wakeup callbacks, and similar mechanisms.

Typically, a component provides an API, and does not know anything about its
user. The API user (component higher in the hierarchy) polls the state of the
lower component when needed.

This also enforces some level of modularization, and with some luck the locking
hierarchy. (Basically, locks of lower components automatically become leaf
locks.) Another positive effect is simpler memory management.

(Also see e.g.: http://250bpm.com/blog:24)

Wakeup callbacks
----------------

This is a common concept in mpv. Even the public API uses it. It's used when an
API has internal threads (or otherwise triggers asynchronous events), but the
component call hierarchy needs to be kept. The wakeup callback is the only
exception to the call hierarchy, and always calls up.

For example, vo spawns a thread internally. The mpv frontend (the API user) is
oblivious to this. vo simply provides a thread-safe API. vo needs to notify the
API user of new events. But the vo event producer is on the vo thread - it can't
simply invoke a callback back into the API user, because then the API user has
to deal with locking, despite not using threads. In addition, this will probably
cause problems like those mentioned in the "callback hell" section, especially
lock order issues.

The solution is the wakeup callback. It merely unblocks the API user from
waiting, and the API user then uses the normal vo API to examine whether or
which state changed. As a concept, it documents what a wakeup callback is
allowed to do and what not, to avoid the aforementioned problems.

Generally, you are not allowed to call any API from the wakeup callback. You
just do whatever is needed to unblock your thread. For example, if it's waiting
on a mutex/condition variable, acquire the mutex, set a change flag, signal
the condition variable, unlock, return. (This mutex must not be held when
calling the API. It must be a leaf lock.)
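The steps described above could look roughly like this on the API user side.
struct waiter and its functions are hypothetical names for this sketch. How the
callback gets registered with a component is left out.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

// Hypothetical API-user state for receiving wakeups.
struct waiter {
    pthread_mutex_t lock;   // leaf lock; never held while calling the API
    pthread_cond_t wakeup;
    bool need_wakeup;       // protected by lock
};

void waiter_init(struct waiter *w)
{
    pthread_mutex_init(&w->lock, NULL);
    pthread_cond_init(&w->wakeup, NULL);
    w->need_wakeup = false;
}

// The wakeup callback. May be called from any thread, possibly while the
// component holds its internal locks. It calls no API functions.
void waiter_wakeup_cb(void *ctx)
{
    struct waiter *w = ctx;
    assert(w);
    pthread_mutex_lock(&w->lock);
    w->need_wakeup = true;
    pthread_cond_signal(&w->wakeup);
    pthread_mutex_unlock(&w->lock);
}

// Called on the API user's own thread. Blocks until a wakeup was requested,
// then clears the flag. Afterwards the caller polls the component's normal
// API (without holding w->lock) to find out what changed.
void waiter_wait(struct waiter *w)
{
    pthread_mutex_lock(&w->lock);
    while (!w->need_wakeup)
        pthread_cond_wait(&w->wakeup, &w->lock);
    w->need_wakeup = false;
    pthread_mutex_unlock(&w->lock);
}
```

Note that several wakeups before the next waiter_wait() collapse into one. This
is fine, because the flag only says "go and check", not what changed.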

Restricting the wakeup callback like this sidesteps any reentrancy issues and
other complexities. The API implementation can simply hold internal (and
non-recursive) locks while invoking the wakeup callback.

The API user still needs to deal with locking (probably), but there's only the
need to implement a single "receiver", that can handle the entire API of the
used component. (Or multiple APIs - MPContext for example has only 1 wakeup
callback that handles all AOs, VOs, input, demuxers, and more. It simply re-runs
the playloop.)

You could get something more advanced by turning this into a message queue. The
API would append a message to the queue, and the API user can read it. But then
you still need a way to wake up the API user (unless you force the API user
to block on your API, which will make things inconvenient for the API user). You
also need to worry about what happens if the message queue overruns (you either
lose messages or have unbounded memory usage). In the mpv public API, the
distinction between message queue and wakeup callback is sort of blurry, because
it does provide a message queue, but also an additional wakeup callback, so API
users are not required to call mpv_wait_event() with a high timeout.

mpv itself prefers using wakeup callbacks over a generic event queue, because
most times an event queue is not needed (or complicates things), and it is
better to do it manually.

(You could still abstract the API user side of wakeup callback handling, and
avoid reimplementing it all the time. Although mp_dispatch_queue already
provides mechanisms for this.)

Condition variables
-------------------

They're used whenever a thread needs to wait for something, without nonsense
like sleep calls or busy waiting. mpv uses the standard pthread API for this.
There's a lot of literature on it. Read it.

For initial understanding, it may be helpful to know that condition variables
are not variables that signal a condition. pthread_cond_t does not have any
state per se. Maybe pthread_cond_t would better be named pthread_interrupt_t,
because its sole purpose is to interrupt a thread waiting via pthread_cond_wait()
(or similar). The "something" in "waiting for something" can be called the
predicate (to avoid confusing it with "condition"). Consult literature for the
proper terms.

The very short version is:

    // --- Shared declarations

    pthread_mutex_t lock;
    pthread_cond_t cond_var;
    struct something state_var; // protected by lock, changes signaled by cond_var

    // --- Waiter thread

    pthread_mutex_lock(&lock);

    // Wait for a change in state_var. We want to wait until predicate_fulfilled()
    // returns true.
    // Must be a loop for 2 reasons:
    //  1. cond_var may be associated with other conditions too
    //  2. pthread_cond_wait() can have spurious wakeups
    while (!predicate_fulfilled(&state_var)) {
        // This unlocks, waits for cond_var to be signaled, and then locks again.
        // The _whole_ point of cond_var is that unlocking and waiting for the
        // signal happens atomically.
        pthread_cond_wait(&cond_var, &lock);
    }

    // Here you may react to the state change. The state cannot change
    // asynchronously as long as you still hold the lock (and didn't release
    // and reacquire it).
    // ...

    pthread_mutex_unlock(&lock);

    // --- Signaler thread

    pthread_mutex_lock(&lock);

    // Something changed. Update the shared variable with the new state.
    update_state(&state_var);

    // Notify that something changed. This will wake up the waiter thread if
    // it's blocked in pthread_cond_wait(). If not, nothing happens.
    pthread_cond_broadcast(&cond_var);

    // Fun fact: good implementations wake up the waiter only when the lock is
    // released, to reduce kernel scheduling overhead.
    pthread_mutex_unlock(&lock);


Some basic rules:
    1. Always access your state under proper locking
    2. Always check your predicate before every call to pthread_cond_wait()
       (And don't call pthread_cond_wait() if the predicate is fulfilled.)
    3. Always call pthread_cond_wait() in a loop
       (And only if your predicate failed without releasing the lock.)
    4. Always call pthread_cond_broadcast()/_signal() inside of its associated
       lock

mpv sometimes violates rule 3, and leaves "retrying" (i.e. looping) to the
caller.

Common pitfalls:
    - Thinking that pthread_cond_t is some kind of semaphore, or holds any
      application state or the user predicate (it _only_ wakes up threads
      that are at the same time blocking on pthread_cond_wait() and friends,
      nothing else)
    - Changing the predicate, but not updating all pthread_cond_broadcast()/
      _signal() calls correctly
    - Forgetting that pthread_cond_wait() unlocks the lock (other threads can
      and must acquire the lock)
    - Holding multiple nested locks while trying to wait (=> deadlock, violates
      the lock order anyway)
    - Waiting for a predicate correctly, but unlocking/relocking before acting
      on it (unlocking allows arbitrary state changes)
    - Confusing which lock/condition var. is used to manage a bit of state
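To make the first and fifth pitfall more concrete, here is a minimal sketch of a
counter guarded by a condition variable. struct token_pool and its functions are
hypothetical names for this example. The state is the integer, not the
pthread_cond_t, and the waiter acts on the predicate while still holding the
lock.

```c
#include <assert.h>
#include <pthread.h>

// Hypothetical token pool.
struct token_pool {
    pthread_mutex_t lock;
    pthread_cond_t changed;
    int tokens;             // protected by lock; this is the actual state
};

void token_pool_init(struct token_pool *p, int initial)
{
    assert(p && initial >= 0);
    pthread_mutex_init(&p->lock, NULL);
    pthread_cond_init(&p->changed, NULL);
    p->tokens = initial;
}

// Adds a token. The broadcast happens inside the lock (rule 4). If nobody is
// waiting right now, the broadcast does nothing. The token itself is what
// persists.
void token_pool_put(struct token_pool *p)
{
    pthread_mutex_lock(&p->lock);
    p->tokens += 1;
    pthread_cond_broadcast(&p->changed);
    pthread_mutex_unlock(&p->lock);
}

// Takes a token, waiting until one is available. The predicate is checked
// before every wait (rules 2 and 3), and the token is removed without
// unlocking in between, so no other thread can take it first.
void token_pool_take(struct token_pool *p)
{
    pthread_mutex_lock(&p->lock);
    while (p->tokens == 0)
        pthread_cond_wait(&p->changed, &p->lock);
    p->tokens -= 1;
    pthread_mutex_unlock(&p->lock);
}
```

If token_pool_put() runs before token_pool_take(), the taker never blocks,
because it looks at the counter first instead of waiting for a signal that
already happened.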

Generally available literature probably has better examples and explanations.

Using condition variables the proper way is generally preferred over using more
messy variants of them. (Just saying because on win32, "Event" exists, and it's
inferior to condition variables. Try to avoid the win32 primitives, even if
you're dealing with Windows-only code.)