So, I'll describe how this stuff works.

The main modules:

1. stream.c: this is the input layer; it reads the input media (file, stdin,
VCD, DVD, network, etc.). What it has to know: appropriate buffering by
sector, seek and skip functions, reading by bytes or by blocks of any size.
The stream_t (stream.h) structure describes the input stream, file/device.

There is a stream cache layer (cache2.c), which is a wrapper for the stream
API. It does fork(), then emulates the stream driver in the parent process
and the stream user in the child process, while proxying between them using
a big preallocated memory chunk as a FIFO buffer.

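The FIFO idea behind the cache can be sketched as a small ring buffer.
This is only an illustration: cache_fifo_t, fifo_write() and fifo_read()
are hypothetical names, not the code in cache2.c, and the fork() and
shared memory setup is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

#define FIFO_SIZE 8  /* tiny on purpose; a real cache would be megabytes */

/* Hypothetical FIFO: the filler advances 'head', the reader advances 'tail'. */
typedef struct {
    unsigned char buf[FIFO_SIZE];
    size_t head;   /* total bytes ever written */
    size_t tail;   /* total bytes ever read */
} cache_fifo_t;

/* Stream-driver side: copy in as much as fits, return bytes accepted. */
size_t fifo_write(cache_fifo_t *f, const unsigned char *src, size_t len)
{
    size_t n = 0;
    assert(f->head - f->tail <= FIFO_SIZE);
    while (n < len && f->head - f->tail < FIFO_SIZE) {
        f->buf[f->head % FIFO_SIZE] = src[n++];
        f->head++;
    }
    return n;
}

/* Stream-user side: copy out what is available, return bytes delivered. */
size_t fifo_read(cache_fifo_t *f, unsigned char *dst, size_t len)
{
    size_t n = 0;
    while (n < len && f->tail < f->head) {
        dst[n++] = f->buf[f->tail % FIFO_SIZE];
        f->tail++;
    }
    return n;
}
```

Neither side blocks here; a short count simply tells the caller to try
again later, which is the job the two processes do for each other.
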
2. demuxer.c: this does the demultiplexing (separating) of the input into
audio, video and dvdsub channels, and reads them in buffered packets.
demuxer.c is basically a framework which is the same for all the
input formats, and there are parsers for each of them (mpeg-es,
mpeg-ps, avi, avi-ni, asf); these are in the demux_*.c files.
The structure is demuxer_t. There is only one demuxer.

2.a. demux_packet_t, that is DP.
Contains one chunk (avi) or packet (asf, mpg). They are stored in memory
in a linked list, because they differ in size.

2.b. demuxer stream, that is DS.
Struct: demux_stream_t
Every channel (a/v/s) has one. This contains the packets for the stream
(see 2.a). For now, there can be 3 for each demuxer:
  - audio (d_audio)
  - video (d_video)
  - DVD subtitle (d_dvdsub)

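The DP/DS relationship can be sketched as a plain singly linked queue.
The types below are simplified stand-ins with hypothetical names
(demo_packet_t, demo_stream_t); the real demux_packet_t and
demux_stream_t carry more fields than shown here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for demux_packet_t: one chunk/packet of any size. */
typedef struct demo_packet {
    size_t len;
    unsigned char *data;
    struct demo_packet *next;
} demo_packet_t;

/* Simplified stand-in for demux_stream_t: a queue of packets. */
typedef struct {
    demo_packet_t *first;  /* oldest packet, read next */
    demo_packet_t *last;   /* newest packet */
    int packs;             /* number of queued packets */
    size_t bytes;          /* total queued payload */
} demo_stream_t;

/* Append a copy of 'data' as a new packet at the tail. Returns 0 on success. */
int ds_add(demo_stream_t *ds, const void *data, size_t len)
{
    demo_packet_t *dp;
    assert(ds != NULL && data != NULL);
    dp = malloc(sizeof(*dp));
    if (!dp) return -1;
    dp->data = malloc(len ? len : 1);
    if (!dp->data) { free(dp); return -1; }
    memcpy(dp->data, data, len);
    dp->len = len;
    dp->next = NULL;
    if (ds->last) ds->last->next = dp; else ds->first = dp;
    ds->last = dp;
    ds->packs++;
    ds->bytes += len;
    return 0;
}

/* Detach and return the oldest packet (caller frees it), or NULL if empty. */
demo_packet_t *ds_pop(demo_stream_t *ds)
{
    demo_packet_t *dp = ds->first;
    if (!dp) return NULL;
    ds->first = dp->next;
    if (!ds->first) ds->last = NULL;
    ds->packs--;
    ds->bytes -= dp->len;
    return dp;
}

void dp_free(demo_packet_t *dp) { if (dp) { free(dp->data); free(dp); } }
```
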
2.c. stream header. There are 2 types (for now): sh_audio_t and sh_video_t.
These contain every parameter essential for decoding, such as input/output
buffers, chosen codec, fps, etc. There is one for every stream in
the file: at least one for video, another if sound is present,
and if there are more streams, one structure for each.
These are filled in from the header (avi/asf), or demux_mpg.c
does it (mpg) when it finds a new stream. If a new stream is found,
the ====> Found audio/video stream: <id> message is displayed.

The chosen stream header and its demuxer stream are connected together
(ds->sh and sh->ds) to simplify usage. So it's enough to pass the
ds or the sh, depending on the function.

For example: we have an asf file with 6 streams inside it, 1 audio and 5
video. While reading the header, 6 sh structs are created, 1
audio and 5 video. When it starts reading packets, it chooses the
streams of the first audio and video packets found, and sets the sh
pointers of d_audio and d_video accordingly. So later it reads
only these streams. Of course the user can force a specific
stream with the -vid and -aid switches.
A good example of this is DVD, where the English stream is not
always the first, so every VOB has a different language :)
That's when we have to use, for example, the -aid 128 switch.

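The "first found packet wins unless the user forced an id" rule can be
sketched like this. ID_AUTO, demo_ds_t and ds_accepts() are hypothetical
names made up for the illustration, and the real demuxers do more
bookkeeping than this.

```c
#include <assert.h>

#define ID_AUTO (-1)   /* hypothetical marker: no stream chosen yet */

/* Simplified stand-in: which stream id a DS is locked to. */
typedef struct {
    int id;            /* ID_AUTO until a stream is selected */
} demo_ds_t;

/* Decide whether a packet with 'packet_id' belongs to this DS.
 * With ID_AUTO the first packet seen locks the DS to its stream;
 * a forced id (as with -aid/-vid) only ever matches itself. */
int ds_accepts(demo_ds_t *ds, int packet_id)
{
    assert(packet_id >= 0);
    if (ds->id == ID_AUTO)
        ds->id = packet_id;
    return ds->id == packet_id;
}
```

Starting the audio DS with id 128 instead of ID_AUTO corresponds to
running with -aid 128: packets of other audio streams are then ignored
from the very first one.
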
Now, how does this reading work?
- demuxer.c/demux_read_data() is called; it is told how many bytes we
  would like to read, where to put them (memory address), and from which
  DS. The codecs call this.
- this checks if the given DS's buffer contains something; if so, it
  reads from there as much as needed. If there isn't enough, it calls
  ds_fill_buffer(), which:
- checks if the given DS has buffered packets (DPs); if so, it moves
  the oldest to the buffer and reads on. If the list is empty, it
  calls demux_fill_buffer():
- this calls the parser for the input format, which reads the file
  onward and moves the packets it finds to their buffers.
  Well, if we'd like an audio packet, but only a bunch of video
  packets are available, then sooner or later the
    DEMUXER: Too many (%d in %d bytes) audio packets in the buffer
  error shows up.

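The three-level call chain above can be sketched with a fake "file"
made of tagged strings. Everything prefixed demo_ is hypothetical and
heavily simplified; only the order of the calls follows the description.

```c
#include <assert.h>
#include <string.h>

#define MAX_QUEUED 4

typedef struct {
    const char *queue[MAX_QUEUED]; /* buffered packets (DPs), oldest first */
    int queued;
    const char *buf;               /* packet currently being consumed */
    size_t buf_pos, buf_len;
} demo_ds_t;

typedef struct {
    const char **input;  /* fake file: "A..." = audio packet, "V..." = video */
    int input_pos, input_len;
    demo_ds_t audio, video;
} demo_demuxer_t;

/* Parser step: read one packet from the input and queue it on its DS.
 * Returns 0 at EOF or when the target queue is full ("Too many packets"). */
static int demo_demux_fill_buffer(demo_demuxer_t *d)
{
    if (d->input_pos >= d->input_len) return 0;
    const char *pkt = d->input[d->input_pos];
    demo_ds_t *ds = (pkt[0] == 'A') ? &d->audio : &d->video;
    if (ds->queued >= MAX_QUEUED) return 0;
    d->input_pos++;
    ds->queue[ds->queued++] = pkt + 1;
    return 1;
}

/* Make the oldest queued packet the current buffer, parsing more if needed. */
static int demo_ds_fill_buffer(demo_demuxer_t *d, demo_ds_t *ds)
{
    while (ds->queued == 0)
        if (!demo_demux_fill_buffer(d)) return 0;
    ds->buf = ds->queue[0];
    ds->buf_len = strlen(ds->buf);
    ds->buf_pos = 0;
    ds->queued--;
    memmove(ds->queue, ds->queue + 1, ds->queued * sizeof(ds->queue[0]));
    return 1;
}

/* Copy up to 'len' bytes of stream data to 'mem'; returns bytes copied. */
size_t demo_read_data(demo_demuxer_t *d, demo_ds_t *ds, char *mem, size_t len)
{
    size_t done = 0;
    while (done < len) {
        if (ds->buf_pos >= ds->buf_len && !demo_ds_fill_buffer(d, ds)) break;
        size_t n = ds->buf_len - ds->buf_pos;
        if (n > len - done) n = len - done;
        memcpy(mem + done, ds->buf + ds->buf_pos, n);
        ds->buf_pos += n;
        done += n;
    }
    return done;
}
```

Reading audio from this sketch queues up any video packets met on the
way, which is exactly how one stream's queue can overflow while the
other is being read.
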
2.d. video.c: this file handles the reading and assembling of
video frames. Each call to video_read_frame() should read and return a
single video frame and its duration in seconds (float).
The implementation is split into 2 big parts: reading from mpeg-like
streams, and reading from one-frame-per-chunk files (avi, asf, mov).
Then it calculates the duration, either from the fixed FPS value or from the
PTS difference between before and after reading the frame.

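The two ways of getting a frame's duration can be sketched as below.
frame_duration() is a hypothetical helper, and using fps == 0 to mean
"no fixed frame rate" is an assumption made for the example, not
something video.c is known to do.

```c
#include <assert.h>

/* Hypothetical helper: duration of the frame just read, in seconds.
 * With a fixed frame rate use 1/fps; otherwise use the PTS difference
 * measured before and after reading the frame. */
float frame_duration(float fps, float pts_before, float pts_after)
{
    assert(fps >= 0.0f);
    if (fps > 0.0f)
        return 1.0f / fps;
    return pts_after - pts_before;
}
```
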
2.e. other utility functions: there is some useful code there, like the
AVI muxer or the mp3 header parser, but leave them for now.

So everything is OK so far. It can all be found in the libmpdemux/ library.
It should compile outside of the mplayer tree; you just have to implement a few
simple functions, like mp_msg() to print messages, etc.
See libmpdemux/test.c for an example.

See also formats.txt for a description of common media file formats and their
implementation details in libmpdemux.

Now, go on:

3. mplayer.c - ooh, he's the boss :)
Its main purpose is connecting the other modules and maintaining A/V
sync.

The given stream's actual position is in the 'timer' field of the
corresponding stream header (sh_audio / sh_video).

The structure of the playing loop:
  while(not EOF) {
      fill audio buffer (read & decode audio) + increase a_frame
      read & decode a single video frame + increase v_frame
      sleep  (wait until a_frame>=v_frame)
      display the frame
      apply A-V PTS correction to a_frame
      handle events (keys,lirc etc) -> pause,seek,...
  }

When playing (a/v), it increases the variables by the duration of the
played a/v.
- with audio this is played bytes / sh_audio->o_bps
  Note: i_bps = number of compressed bytes for one second of audio
        o_bps = number of uncompressed bytes for one second of audio
        (this equals bps*samplerate*channels, where bps is bytes per sample)
- with video this is usually == 1.0/fps, but I have to note that
  fps doesn't really matter for video; for example asf doesn't have it,
  instead there is "duration" and it can change per frame.
  MPEG2 has "repeat_count" which delays the frame by 1-2.5 ...
  Maybe only AVI and MPEG1 have fixed fps.

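As a worked example of the audio side: 16-bit stereo at 44100 Hz gives
o_bps = 2*44100*2 = 176400, so playing 88200 bytes advances the audio
clock by 0.5 seconds. The helper names below are hypothetical.

```c
#include <assert.h>

/* Uncompressed bytes per second: bytes per sample * samplerate * channels. */
int audio_o_bps(int bytes_per_sample, int samplerate, int channels)
{
    assert(bytes_per_sample > 0 && samplerate > 0 && channels > 0);
    return bytes_per_sample * samplerate * channels;
}

/* Seconds of audio represented by 'played_bytes' of uncompressed data. */
double audio_advance(int played_bytes, int o_bps)
{
    assert(o_bps > 0);
    return (double)played_bytes / o_bps;
}
```
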
So everything works right as long as the audio and video are in perfect
sync: the audio keeps playing and gives the timing, and when the
time of a frame has passed, the next frame is displayed.
But what if these two aren't synchronized in the input file?
PTS correction kicks in. The input demuxers read the PTS (presentation
timestamp) of the packets, and with it we can see whether the streams
are synchronized. Then MPlayer can correct a_frame, within
a given maximum bound (see the -mc option). The sum of the
corrections can be found in c_total.

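A bounded correction can be sketched as a simple clamp. This is an
assumption about the general shape only: pts_correction() is a
hypothetical helper, and the real loop may scale or smooth the
difference before applying it.

```c
#include <assert.h>

/* Hypothetical correction step: 'av_diff' is the measured A-V PTS
 * difference in seconds, 'max_corr' the largest correction allowed in
 * one step (the role the -mc option plays). Returns the amount to apply. */
float pts_correction(float av_diff, float max_corr)
{
    assert(max_corr >= 0.0f);
    if (av_diff > max_corr)  return max_corr;
    if (av_diff < -max_corr) return -max_corr;
    return av_diff;
}
```

Adding each returned value to a running total gives the kind of
figure that c_total holds.
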
Of course this is not everything, several things suck.
For example the soundcard's delay, which has to be corrected by
MPlayer! The audio delay is the sum of all these:
- bytes read since the last timestamp:
    t1 = d_audio->pts_bytes/sh_audio->i_bps
- if Win32/ACM, then the bytes stored in the audio input buffer:
    t2 = a_in_buffer_len/sh_audio->i_bps
- uncompressed bytes in the audio out buffer:
    t3 = a_buffer_len/sh_audio->o_bps
- not yet played bytes stored in the soundcard's (or DMA's) buffer:
    t4 = get_audio_delay()/sh_audio->o_bps

From this we can calculate what PTS we need for the just played
audio; then, after we compare this with the video's PTS, we have
the difference!

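Taking the list above literally, the sum can be written out as follows.
demo_audio_state_t is a hypothetical struct that just gathers the
quantities named in the text; card_bytes stands for the value that
get_audio_delay() returns.

```c
#include <assert.h>

/* Hypothetical inputs mirroring the four terms t1..t4. */
typedef struct {
    int pts_bytes;        /* compressed bytes read since the last timestamp */
    int a_in_buffer_len;  /* compressed bytes waiting in the input buffer */
    int a_buffer_len;     /* uncompressed bytes in the output buffer */
    int card_bytes;       /* unplayed bytes in the soundcard buffer */
    int i_bps;            /* compressed bytes per second */
    int o_bps;            /* uncompressed bytes per second */
} demo_audio_state_t;

/* Total in seconds: t1 + t2 + t3 + t4. */
double audio_delay(const demo_audio_state_t *s)
{
    assert(s->i_bps > 0 && s->o_bps > 0);
    double t1 = (double)s->pts_bytes / s->i_bps;
    double t2 = (double)s->a_in_buffer_len / s->i_bps;
    double t3 = (double)s->a_buffer_len / s->o_bps;
    double t4 = (double)s->card_bytes / s->o_bps;
    return t1 + t2 + t3 + t4;
}
```

Note that the compressed terms are divided by i_bps and the
uncompressed ones by o_bps; mixing them up is an easy mistake.
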
Life didn't get simpler with AVI. There's the "official" timing
method, the BPS-based one, where the header says how many compressed
audio bytes or chunks belong to one second of frames.
In the AVI stream header there are 2 important fields, the
dwSampleSize and the dwRate/dwScale pair:
- If dwSampleSize is 0, then it's a VBR stream, so its bitrate
  isn't constant. It means that 1 chunk stores 1 sample, and
  dwRate/dwScale gives the chunks/sec value.
- If dwSampleSize is >0, then it's constant bitrate, and the
  time can be measured this way: time = (bytepos/dwSampleSize) /
  (dwRate/dwScale) (so the sample number is divided by the
  samplerate). Now the audio can be handled as a stream, which can
  be cut into chunks, but can also be a single chunk.

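Both cases fit into one small function. The field names come from the
AVI stream header; avi_audio_time() itself and its chunk_no parameter
are hypothetical, added only to show the arithmetic.

```c
#include <assert.h>

/* Timestamp in seconds of an AVI audio position, following the two
 * cases above: chunk count for VBR, byte position for CBR. */
double avi_audio_time(unsigned dwSampleSize, unsigned dwRate, unsigned dwScale,
                      unsigned long bytepos, unsigned long chunk_no)
{
    assert(dwRate > 0 && dwScale > 0);
    double per_sec = (double)dwRate / dwScale;  /* samples or chunks per second */
    if (dwSampleSize == 0)
        return chunk_no / per_sec;              /* VBR: one sample per chunk */
    return ((double)bytepos / dwSampleSize) / per_sec;  /* CBR */
}
```

For example, with dwSampleSize=4, dwRate=44100 and dwScale=1, byte
position 176400 is sample 44100, that is, exactly 1 second.
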
The other method can be used only for interleaved files: from
the order of the chunks, a timestamp (PTS) value can be calculated.
The PTS of a video chunk is simple: chunk number / fps.
The PTS of an audio chunk is the same as that of the previous video chunk.
We have to pay attention to the so-called "audio preload", that is,
there is a delay between the audio and video streams. This is
usually 0.5-1.0 sec, but can be totally different.
Previously the exact value had to be measured, but now demux_avi.c
handles it: at the audio chunk after the first video chunk, it calculates
the A/V difference and takes this as a measure of the audio preload.

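One way to read the interleaved method, with the video PTS taken in
seconds as chunk number divided by fps. Both helpers are hypothetical,
and treating the preload as "BPS-based audio time minus the PTS of the
last video chunk" is an assumption about what demux_avi.c measures.

```c
#include <assert.h>

/* PTS in seconds of video chunk number 'n' at a fixed frame rate. */
double video_chunk_pts(unsigned n, double fps)
{
    assert(fps > 0.0);
    return n / fps;
}

/* Estimated audio preload at an audio chunk: its BPS-based time minus
 * the PTS of the video chunk that preceded it in the file. */
double audio_preload(double audio_time_bps, unsigned last_video_chunk, double fps)
{
    return audio_time_bps - video_chunk_pts(last_video_chunk, fps);
}
```
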
3.a. audio playback:
Some words on audio playback:
It's not the playing itself that is hard, but:
  1. knowing when to write into the buffer, without blocking
  2. knowing how much of what we wrote has been played
The first is needed for audio decoding, and to keep the buffer
full (so the audio will never skip). The second is needed for
correct timing, because some soundcards delay by as much as 3-7 seconds,
which can't be ignored.
To solve this, OSS offers several possibilities:
- ioctl(SNDCTL_DSP_GETODELAY): tells how many unplayed bytes are in
  the soundcard's buffer -> perfect for timing, but not all drivers
  support it :(
- ioctl(SNDCTL_DSP_GETOSPACE): tells how much we can write into the
  soundcard's buffer without blocking. If the driver doesn't
  support GETODELAY, we can use this to work out how big the delay is.
- select(): should tell if we can write into the buffer without
  blocking. Unfortunately it doesn't say how much we could write :((
  Also, it doesn't work, or works badly, with some drivers.
  Only used if none of the above works.

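The GETOSPACE fallback boils down to "buffer size minus free space".
The sketch below takes the numbers that OSS reports in audio_buf_info
(fragstotal, fragsize, bytes) as plain integers, so it needs no sound
headers or device; demo_ospace_t and the function names are hypothetical.

```c
#include <assert.h>

/* Values as reported in audio_buf_info by SNDCTL_DSP_GETOSPACE. */
typedef struct {
    int fragstotal;  /* total number of fragments in the buffer */
    int fragsize;    /* size of one fragment in bytes */
    int bytes;       /* bytes that can be written without blocking */
} demo_ospace_t;

/* Unplayed bytes still queued in the driver: whole buffer minus free
 * space. Odd driver values are clamped instead of trusted. */
int delay_bytes_from_ospace(const demo_ospace_t *s)
{
    int total = s->fragstotal * s->fragsize;
    if (s->bytes < 0) return total;
    if (s->bytes > total) return 0;
    return total - s->bytes;
}

/* The same delay in seconds, given uncompressed bytes per second. */
double delay_seconds(const demo_ospace_t *s, int o_bps)
{
    assert(o_bps > 0);
    return (double)delay_bytes_from_ospace(s) / o_bps;
}
```
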
4. Codecs. Consists of libmpcodecs/* and separate files or libs,
for example liba52, libmpeg2, xa/*, alaw.c, opendivx/*, loader, mp3lib.

mplayer.c doesn't call them directly, but through the dec_audio.c and
dec_video.c files, so mplayer.c doesn't have to know anything about
the codecs.

libmpcodecs contains a wrapper for every codec; some of them include the
codec implementation, some call functions from other files
included with mplayer, and some call optional external libraries.
File naming convention in libmpcodecs:
  ad_*.c - audio decoder (called through dec_audio.c)
  vd_*.c - video decoder (called through dec_video.c)
  ve_*.c - video encoder (used by mencoder)
  vf_*.c - video filter (see option -vop)

5. libvo: this displays the frame.

For details on this, read libvo.txt

6. libao2: this controls audio playing

As in libvo (see 5.), there are several drivers here too, based on the
same API:

static int control(int cmd, int arg);
  This is for reading/setting driver-specific and other special parameters.
  Not really used for now.

static int init(int rate,int channels,int format,int flags);
  Initializes the driver: opens the device, sets the sample rate, channels
  and sample format parameters.
  Sample format: usually AFMT_S16_LE or AFMT_U8; for more definitions see
  the dec_audio.c and linux/soundcard.h files!

static void uninit();
  Guess what.
  OK, I'll help: it closes the device; not (yet) called on exit.

static void reset();
  Resets the device. To be exact, it's for deleting the buffers' contents,
  so after reset() the previously received stuff won't be output.
  (called on pause or seek)

static int get_space();
  Returns how many bytes can be written into the audio buffer without
  blocking (making the caller process wait). If the buffer is (nearly) full,
  it has to return 0!
  If it never returns 0, MPlayer won't work!

static int play(void* data,int len,int flags);
  Plays a bit of audio, which is received through the "data" memory area, with
  a size of "len". The "flags" argument isn't used yet. It has to copy the
  data, because it can be overwritten after the call returns. It doesn't have
  to use all the bytes; it has to return how many have been used (copied to
  the buffer).

static float get_delay();
  Returns how long it will take to play the data currently in the
  output buffer. Be exact, if possible, since the whole timing depends
  on this! In the worst case, return the maximum delay.

!!! Because the video is synchronized to the audio (card), it's very important
!!! that the get_space and get_delay functions are correctly implemented!

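A toy driver that only fills a memory buffer shows how these functions
fit together. It is a sketch under assumptions: the demo_ names are
hypothetical (a real driver declares the functions static with the
names above), 16-bit samples are assumed, init() returning 1 on success
is an assumption, control() is left out, and nothing ever drains the
buffer except reset().

```c
#include <assert.h>
#include <string.h>

#define DEMO_BUF_SIZE 4096

static unsigned char demo_buf[DEMO_BUF_SIZE];
static int demo_used;   /* bytes queued and not yet "played" */
static int demo_bps;    /* output bytes per second, set by init */

/* Open the "device": remember the byte rate (16-bit samples assumed). */
int demo_init(int rate, int channels, int format, int flags)
{
    (void)format; (void)flags;
    if (rate <= 0 || channels <= 0) return 0;
    demo_bps = rate * channels * 2;
    demo_used = 0;
    return 1;
}

/* Close the "device". */
void demo_uninit(void) { demo_bps = 0; demo_used = 0; }

/* Drop everything queued, as after a pause or seek. */
void demo_reset(void) { demo_used = 0; }

/* Free room in the buffer; 0 when full, so the caller knows to wait. */
int demo_get_space(void) { return DEMO_BUF_SIZE - demo_used; }

/* Copy as much of 'data' as fits and report how many bytes were taken. */
int demo_play(void *data, int len, int flags)
{
    (void)flags;
    int n = demo_get_space();
    if (len < n) n = len;
    if (n <= 0) return 0;
    memcpy(demo_buf + demo_used, data, (size_t)n);
    demo_used += n;
    return n;
}

/* Seconds needed to play what is still queued. */
float demo_get_delay(void)
{
    assert(demo_bps > 0);
    return (float)demo_used / (float)demo_bps;
}
```

The caller is expected to ask demo_get_space() first and to cope with
demo_play() taking fewer bytes than offered.
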
6.a audio plugins
Audio plugins are used for processing the audio data before it
reaches the soundcard driver. A plugin can change the following
aspects of the audio data stream:
  1. Sample format
  2. Sample rate
  3. Number of channels
  4. The data itself (i.e. filtering and other sound effects)
  5. The delay (almost all plugins do this)
The plugin interface is implemented as a pseudo device driver with
the catchy name "plugin". The plugins are executed sequentially, in the
order given by the "-aop list=plugin1,plugin2,..." command line switch.
To add a plugin, add an entry in audio_plugin.h and the makefile, and
create a source file named "pl_whatever.c". Input parameters are
added to audio_plugin.h and to cfg-mplayer.h. A good starting point
for writing plugins is pl_delay.c. Below is a description of what
the functions do:

static int control(int cmd, int arg);
  This is for reading/setting plugin-specific and other special
  parameters, and can be used for keyboard input, for example. All
  plugins must respond to cmd=AOCONTROL_PLUGIN_SET_LEN, which is part
  of the initialization of the plugin. When this command is received,
  the parameter pl_delay.len will contain the maximum size of data the
  plugin can produce. This can be used for calculating and allocating
  buffer space for the plugin. Before the function exits, the parameter
  pl_delay.len must be set to the maximum data size the plugin can
  receive. Return CONTROL_OK for success and CONTROL_ERROR for failure;
  other control codes are found in audio_out.h.

static int init();
  This function initializes the plugin; it is called once
  before playing starts. In this function the plugin can read
  AND write to the ao_plugin_data struct to determine and set input
  and output parameters. It is important to write to the
  ao_plugin_data.sz_mult and ao_plugin_data.delay_fix parameters if
  the plugin changes the data size or adds delay. Return 0 for failure
  and 1 for success.

static void uninit()
  Called before mplayer exits. Used for deallocating dynamic buffers.

static void reset()
  Called during reset; can be used to empty buffers. MPlayer calls this
  function when pause is pressed.

static int play()
  Called for every block of audio data sent through the plugin. This
  function should be optimized for speed. The incoming data is found
  in ao_plugin_data.data, with length ao_plugin_data.len. These two
  parameters should be updated by the plugin. Return 1 for success and
  0 for failure.

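A play() function for a trivial plugin might look like the sketch
below. demo_plugin_data_t is a cut-down stand-in for ao_plugin_data
that keeps only the data and len fields named above; the plugin itself
(halving the volume of signed 16-bit samples) and all demo_ names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for ao_plugin_data; the real struct has more fields. */
typedef struct {
    void *data;   /* current audio block */
    int len;      /* its length in bytes */
} demo_plugin_data_t;

static demo_plugin_data_t demo_plugin_data;

/* Hypothetical plugin: halve the volume of signed 16-bit samples in place.
 * The data size is unchanged, so 'len' stays as it is.
 * Returns 1 for success and 0 for failure. */
int demo_pl_play(void)
{
    int16_t *s = demo_plugin_data.data;
    int n = demo_plugin_data.len / (int)sizeof(int16_t);
    if (s == NULL || demo_plugin_data.len < 0) return 0;
    for (int i = 0; i < n; i++)
        s[i] = (int16_t)(s[i] / 2);
    return 1;
}
```

A plugin that changes the amount of data (a resampler, say) would also
have to update len here and announce the size change in init().
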