Commit 1a03af9a authored by Christophe Massiot

* Added IDEALX developer documentation into main CVS - PLEASE UPDATE

* Cleaned up doc/ directory.
parent a707befb
% common.tex: common definitions for LaTeX files.
% (c)1999 VideoLAN
% Included packages
% C-related commands
% C-related environments
- No tabs in source files
- No includes in headers, except all.h
- Always use NULL for null pointers, not 0
- Avoid if( (a=toto()) != b ); prefer writing it on 2 lines, in
particular for mallocs:
a = toto();
if( a != b )
- About mallocs, more a remark than a convention: it is not
specified anywhere that errno is updated when malloc returns NULL.
So prefer strerror(ENOMEM) to strerror(errno)!
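A minimal sketch of the last two conventions, assuming a helper named alloc_buffer (hypothetical, not part of the sources):

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocates i_size bytes following the conventions
 * above. Returns NULL on failure after reporting the error. */
static void *alloc_buffer( size_t i_size )
{
    void *p_buffer;

    assert( i_size > 0 );

    /* Assignment and test on two separate lines. */
    p_buffer = malloc( i_size );
    if( p_buffer == NULL )
    {
        /* Do not rely on errno here: name ENOMEM explicitly. */
        fprintf( stderr, "out of memory: %s\n", strerror( ENOMEM ) );
        return NULL;
    }

    return p_buffer;
}
```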
# Extract from the Debian SGML/XML HOWTO by Stéphane Bortzmeyer
# For Debian :
# For RedHat :
all: manual
manual: manual.txt manual.html
%.tex: %.xml
$(JADE) -t tex -V %section-autolabel% -d $(PRINT_SS) $(XML_DECL) $<
perl -i.bak -pe 's/\000//g' $@ && rm $*.tex.bak
# No it's not a joke
%.html: %.xml
$(JADE) -t sgml -V %section-autolabel% -V nochunks \
-d $(HTML_SS) $(XML_DECL) $< > $@
%.dvi: %.tex
jadetex $<
jadetex $<
jadetex $< %.dvi
dvips -f $< > $@
%.txt: %.xml
$(JADE) -t sgml -V nochunks -d $(HTML_SS) $(XML_DECL) $< > dump.html
lynx -force_html -dump dump.html > $@
-rm -f dump.html
rm -f manual.txt
rm -f *.html *.aux *.log *.dvi *.ps *.tex
rm -f *.bck *~ .\#* \#*
<chapter> <title> The audio output layer </title>
<sect1> <title> Data exchanges between a decoder and the audio output
The audio output basically takes audio samples from one or several
FIFOs, mixes and resamples them, and plays them through the audio
chip. Data exchanges are simple and described in <filename>
src/audio_output/audio_output.c.</filename> A decoder needs to open
a channel FIFO with <function> aout_CreateFifo </function>, and
then write the data to the buffer. The buffer is in <parameter>
p_aout_fifo-&gt;buffer + p_aout_fifo-&gt;l_end_frame </parameter>
* <constant> ADEC_FRAME_SIZE</constant>.
<sect1> <title> How to write an audio output plugin </title>
[This API is subject to change in the very near future.] Have a look at
<filename> plugins/dsp/aout_dsp.c</filename>. You need to write six
functions :
<listitem> <para> <type> int </type> <function> aout_Probe </function>
<parameter> ( probedata_t *p_data ) </parameter> :
Returns a score between 0 and 999 to tell whether the plugin
can be used. <parameter> p_data </parameter> is currently
</para> </listitem>
<listitem> <para> <type> int </type> <function> aout_Open </function>
<parameter> ( aout_thread_t *p_aout ) </parameter> :
Opens the audio device.
</para> </listitem>
<listitem> <para> <type> int </type> <function> aout_SetFormat
</function> <parameter> ( aout_thread_t *p_aout ) </parameter> :
Sets the output format, the number of channels, and the output
</para> </listitem>
<listitem> <para> <type> long </type> <function> aout_GetBufInfo
</function> <parameter> ( aout_thread_t *p_aout,
long l_buffer_limit ) </parameter> :
Gets the status of the audio buffer.
</para> </listitem>
<listitem> <para> <function> aout_Play </function> <parameter>
( aout_thread_t *p_aout, byte_t *buffer, int i_size )
</parameter> :
Writes the audio output buffer to the audio device.
</para> </listitem>
<listitem> <para> <function> aout_Close </function> <parameter>
( aout_thread_t *p_aout ) </parameter> :
Closes the audio device.
</para> </listitem>
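The six functions listed above could be laid out as in the following sketch of a "null" plug-in that simply discards the samples. The structure fields, the void return types of aout_Play and aout_Close, the meaning of the value returned by aout_GetBufInfo and the "0 means success" convention are all assumptions made for the example; the real types come from the VLC headers.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in types: the real definitions live in the VLC headers. */
typedef unsigned char byte_t;
typedef struct probedata_s { int i_unused; } probedata_t;
typedef struct aout_thread_s
{
    int  b_open;          /* hypothetical: device state */
    long l_bytes_played;  /* hypothetical: bytes handed to the device */
} aout_thread_t;

int aout_Probe( probedata_t *p_data )
{
    (void)p_data;
    return 1;  /* low score in the 0..999 range: a last-resort plug-in */
}

int aout_Open( aout_thread_t *p_aout )
{
    assert( p_aout != NULL );
    p_aout->b_open = 1;
    p_aout->l_bytes_played = 0;
    return 0;  /* assumption: 0 means success */
}

int aout_SetFormat( aout_thread_t *p_aout )
{
    assert( p_aout != NULL && p_aout->b_open );
    return 0;  /* nothing to configure on a null device */
}

long aout_GetBufInfo( aout_thread_t *p_aout, long l_buffer_limit )
{
    (void)p_aout;
    (void)l_buffer_limit;
    return 0;  /* assumption: the sketch always reports 0 */
}

void aout_Play( aout_thread_t *p_aout, byte_t *buffer, int i_size )
{
    (void)buffer;  /* samples are discarded */
    assert( i_size >= 0 );
    p_aout->l_bytes_played += i_size;
}

void aout_Close( aout_thread_t *p_aout )
{
    p_aout->b_open = 0;
}
```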
<appendix> <title> Advanced debugging </title>
We never debug our code, because we don't put bugs in. Okay, you want
some real stuff. Sam still uses <function> printf() </function> to
find out where it crashes. For real programmers, here is a summary
of what you can do if you have problems.
<sect1> <title> Where does it crash ? </title>
The best way to find out is to use gdb. Your chances are much better if
you configure with <parameter> --enable-debug </parameter>.
This adds <parameter> -g </parameter> to the compiler <parameter>
CFLAGS</parameter> and activates some additional safety checks. Just
run <command> gdb vlc</command>, type <command> run myfile.vob</command>,
and wait until it crashes. You can view where it stopped with
<command>bt</command>, and print variables with <command>print
If you run into trouble, you may want to turn the optimizations off.
Optimizations (especially inline functions) may confuse the debugger.
Use <parameter> --disable-optimizations </parameter> in that case.
<sect1> <title> Other problems </title>
Things may be more complicated than that: unpredictable
behaviour, a random bug or a performance issue, for instance. You have
several options to deal with this. If you experience unpredictable behaviour,
let us hope it is not heap or stack corruption (e.g. writing to unallocated
space), because such bugs are hard to find. If you are really desperate, have
a look at something like ElectricFence or dmalloc. Under GNU/Linux, an
easy check is to type <command> export MALLOC_CHECK_=2 </command> before
launching vlc (see <command> malloc(3) </command> for more information).
VLC offers a "trace mode". It writes a log file of precisely timestamped
messages describing what VLC is doing, which is useful for detecting
performance issues or lock-ups. Compile with <parameter> --enable-trace </parameter>
and tune the <parameter> TRACE_* </parameter> flags in <filename>
include/config.h </filename> to enable certain types of messages (log
file writing can take up a lot of time, and will have side effects).
<chapter> <title> How to write a decoder </title>
<sect1> <title> What exactly is a decoder in the VLC scheme ? </title>
The decoder does the mathematical part of the process of playing a
stream. It is separated from the demultiplexers (in the input module),
which manage packets to rebuild a continuous elementary stream, and from
the output thread, which takes samples reconstituted by the decoder
and plays them. Basically, a decoder has no interaction with devices,
it is purely algorithmic.
In the next section we will describe how the decoder retrieves the
stream from the input. The output API (how to say "this sample is
decoded and can be played at xx") will be talked about in the next
<sect1> <title> Decoder configuration </title>
The input thread spawns the appropriate decoders [the relation is
currently hard-wired in <filename>src/input/input_programs.c</filename>].
It then launches <function> *_CreateThread()</function>, with either
an <type>adec_config_t</type> (audio) or a <type> vdec_config_t</type>
(video) structure, described in <filename> include/input_ext-dec.h</filename>.
This structure contains some parameters related to the output thread, which will
be described in the following chapters, and a generic <type>
decoder_config_t</type>, which gives the decoder the ES ID and type, and
pointers to a <type> stream_control_t </type> structure (gives
information on the play status), a <type> decoder_fifo_t </type>
and <parameter> pf_init_bit_stream</parameter>, which will be
described in the next two sections.
<sect1> <title> Packet structures </title>
The input module provides an advanced API for delivering stream data
to the decoders. First let's have a look at the packet structures.
They are defined in <filename> include/input_ext-dec.h</filename>.
<type>data_packet_t</type> contains a pointer to the physical location
of the data. Decoders should only read from <parameter>
p_payload_start </parameter> up to <parameter> p_payload_end</parameter>.
After that, the decoder switches to the next packet, <parameter> p_next
</parameter>, if it is not <constant>NULL</constant>. If the
<parameter> b_discard_payload
</parameter> flag is set, the content of the packet is corrupted and it
should be discarded.
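As an illustration, walking a chain of data packets could look like the sketch below. The structure is a reduced stand-in holding only the fields named above, the helper count_payload is hypothetical, and the sketch assumes that p_payload_end points one byte past the last payload byte.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char byte_t;

/* Reduced stand-in for data_packet_t: only the fields named above. */
typedef struct data_packet_s
{
    struct data_packet_s *p_next;
    byte_t               *p_payload_start;
    byte_t               *p_payload_end;   /* assumed: one past the end */
    int                   b_discard_payload;
} data_packet_t;

/* Hypothetical helper: counts the usable payload bytes in a chain. */
static size_t count_payload( const data_packet_t *p_data )
{
    size_t i_total = 0;

    for( ; p_data != NULL; p_data = p_data->p_next )
    {
        if( p_data->b_discard_payload )
        {
            continue;  /* corrupted packet: skip its payload */
        }
        assert( p_data->p_payload_end >= p_data->p_payload_start );
        i_total += (size_t)( p_data->p_payload_end
                              - p_data->p_payload_start );
    }

    return i_total;
}
```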
<type>data_packet_t</type> structures are contained in a <type>pes_packet_t</type>.
<type>pes_packet_t</type> features a linked list
(<parameter>p_first</parameter>) of <type>data_packet_t
</type> representing (in the MPEG paradigm) a complete PES packet. For
PS streams, a <type> pes_packet_t </type> usually only contains one
<type>data_packet_t</type>. In TS streams though, one PES can be split
among dozens of TS packets. A PES packet has PTS dates (see your
MPEG specification for more information) and the current pace of reading
that should be applied for interpolating dates (<parameter>i_rate</parameter>).
<parameter> b_data_alignment </parameter> (if available in the system
layer) indicates if the packet is a random access point, and <parameter>
b_discontinuity </parameter> tells whether previous packets have been
<imagedata fileref="ps.eps" format="EPS" scalefit="1" scale="95" />
<imagedata fileref="ps.gif" format="GIF" />
<phrase> A PES packet in a Program Stream </phrase>
<para> In a Program Stream, a PES packet features only one
data packet, whose buffer contains the PS header, the PES
header, and the data payload.
<imagedata fileref="ts.eps" format="EPS" scalefit="1" scale="95" />
<imagedata fileref="ts.gif" format="GIF" />
<phrase> A PES packet in a Transport Stream </phrase>
<para> In a Transport Stream, a PES packet can feature an
unlimited number of data packets (three in the figure)
whose buffers contain the TS header, the PES
header, and the data payload.
The structure shared by both the input and the decoder is <type>
decoder_fifo_t</type>. It features a circular FIFO of PES packets to
be decoded. The input provides macros to manipulate it : <function>
Please remember to take <parameter>p_decoder_fifo-&gt;data_lock
</parameter> before any operation on the FIFO.
The next packet to be decoded is DECODER_FIFO_START( *p_decoder_fifo ).
When you have finished with it, you need to call <function>
p_decoder_fifo-&gt;pf_delete_pes( p_decoder_fifo-&gt;p_packets_mgt,
DECODER_FIFO_START( *p_decoder_fifo ) ) </function> and then
<function> DECODER_FIFO_INCSTART( *p_decoder_fifo )</function> to
return the PES to the <link linkend="input_buff">buffer manager</link>.
If the FIFO is empty (<function>DECODER_FIFO_ISEMPTY</function>), you
can block until a new packet is received by waiting on a condition variable :
<function> vlc_cond_wait( &amp;p_fifo-&gt;data_wait,
&amp;p_fifo-&gt;data_lock )</function>. You have to hold the lock before
entering this function. If the end of the file is reached or the user quits,
<parameter>p_fifo-&gt;b_die</parameter> will be set to 1. It indicates
that you must free all your data structures and call <function>
vlc_thread_exit() </function> as soon as possible.
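The locking discipline described above can be sketched with plain POSIX threads. Everything here is a stand-in: pthread calls replace the vlc_* wrappers, the structure only mimics the fields mentioned in the text, FIFO_SIZE is a hypothetical constant, and the helpers fifo_init, fifo_push and fifo_pop are invented for the example (the real code uses the DECODER_FIFO_* macros and pf_delete_pes instead).

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define FIFO_SIZE 8  /* hypothetical size; must be a power of two */

typedef struct pes_packet_s { int i_id; } pes_packet_t;

/* Stand-in for decoder_fifo_t with only the fields discussed above. */
typedef struct decoder_fifo_s
{
    pthread_mutex_t data_lock;
    pthread_cond_t  data_wait;
    pes_packet_t   *buffer[FIFO_SIZE];
    int             i_start;
    int             i_end;
    int             b_die;
} decoder_fifo_t;

static void fifo_init( decoder_fifo_t *p_fifo )
{
    pthread_mutex_init( &p_fifo->data_lock, NULL );
    pthread_cond_init( &p_fifo->data_wait, NULL );
    p_fifo->i_start = 0;
    p_fifo->i_end = 0;
    p_fifo->b_die = 0;
}

/* Producer side (the input): queue a packet and wake up the decoder. */
static void fifo_push( decoder_fifo_t *p_fifo, pes_packet_t *p_pes )
{
    pthread_mutex_lock( &p_fifo->data_lock );
    assert( ( ( p_fifo->i_end + 1 ) & ( FIFO_SIZE - 1 ) ) != p_fifo->i_start );
    p_fifo->buffer[p_fifo->i_end] = p_pes;
    p_fifo->i_end = ( p_fifo->i_end + 1 ) & ( FIFO_SIZE - 1 );
    pthread_cond_signal( &p_fifo->data_wait );
    pthread_mutex_unlock( &p_fifo->data_lock );
}

/* Consumer side (the decoder): returns the next packet, or NULL when
 * b_die is set and the thread should clean up and exit. */
static pes_packet_t *fifo_pop( decoder_fifo_t *p_fifo )
{
    pes_packet_t *p_pes = NULL;

    pthread_mutex_lock( &p_fifo->data_lock );
    /* The lock is held while waiting; the loop guards against
     * spurious wake-ups. */
    while( p_fifo->i_start == p_fifo->i_end && !p_fifo->b_die )
    {
        pthread_cond_wait( &p_fifo->data_wait, &p_fifo->data_lock );
    }
    if( !p_fifo->b_die )
    {
        p_pes = p_fifo->buffer[p_fifo->i_start];
        p_fifo->i_start = ( p_fifo->i_start + 1 ) & ( FIFO_SIZE - 1 );
    }
    pthread_mutex_unlock( &p_fifo->data_lock );

    return p_pes;
}
```

A decoder thread would call fifo_pop in a loop and leave the loop as soon as it returns NULL.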
<sect1> <title> The bit stream (input module) </title>
This classical way of reading packets is not convenient, though, since
the elementary stream can be split up arbitrarily. The input module
provides primitives which make reading a bit stream much easier.
Using them is optional, but if you do, you should no longer access
the packet buffer directly.
The bit stream allows you to just call <function> GetBits()</function>,
and this function will transparently read the packet buffers and change
data packets and PES packets when necessary, without any intervention
from you. This makes it much more convenient to read a continuous
Elementary Stream: you don't have to deal with packet boundaries
and the FIFO, because the bit stream does it for you.
The central idea is to introduce a buffer of 32 bits [normally
<type> WORD_TYPE</type>, but the 64-bit version doesn't work yet], <type>
bit_fifo_t</type>. It contains the word buffer and the number of
significant bits (in the higher part). The input module provides five
inline functions to manage it :
<listitem> <para> <type> u32 </type> <function> GetBits </function>
<parameter>( bit_stream_t * p_bit_stream, unsigned int i_bits )
</parameter> :
Returns the next <parameter> i_bits </parameter> bits from the
bit buffer. If there are not enough bits, it fetches the following
word from the <type>decoder_fifo_t</type>. This function is only
guaranteed to work with up to 24 bits. At the moment it works with up to
31 bits, but only as a side effect. We had to write a separate
function, <function>GetBits32</function>, for 32-bit reading,
because of the &lt;&lt; operator.
</para> </listitem>
<listitem> <para> <function> RemoveBits </function> <parameter>
( bit_stream_t * p_bit_stream, unsigned int i_bits ) </parameter> :
The same as <function> GetBits()</function>, except that the bits
aren't returned (which saves a few CPU cycles). It has the same
limitations, so we also wrote <function> RemoveBits32</function>.
</para> </listitem>
<listitem> <para> <type> u32 </type> <function> ShowBits </function>
<parameter>( bit_stream_t * p_bit_stream, unsigned int i_bits )
</parameter> :
The same as <function> GetBits()</function>, except that the bits
don't get flushed after reading, so you need to call
<function> RemoveBits() </function> by hand afterwards. Beware:
this function won't work above 24 bits unless you are aligned
on a byte boundary (see next function).
</para> </listitem>
<listitem> <para> <function> RealignBits </function> <parameter>
( bit_stream_t * p_bit_stream ) </parameter> :
Drops the n higher bits (n &lt; 8), so that the first bit of
the buffer is aligned on a byte boundary. It is useful when
looking for an aligned startcode (MPEG for instance).
</para> </listitem>
<listitem> <para> <function> GetChunk </function> <parameter>
( bit_stream_t * p_bit_stream, byte_t * p_buffer, size_t i_buf_len )
</parameter> :
It is the analogue of <function> memcpy()</function>, but takes
a bit stream as its first argument. <parameter> p_buffer </parameter>
must be allocated and at least <parameter> i_buf_len </parameter>
bytes long. It is useful for copying data you want to keep track of.
</para> </listitem>
All these functions recreate a continuous elementary stream paradigm.
When the bit buffer is empty, they take the following word in the
current packet. When the packet is empty, they switch to the next
<type>data_packet_t</type>, or, if there is none, to the next <type>
pes_packet_t</type> (see <function>
p_bit_stream-&gt;pf_next_data_packet</function>). All this is
completely transparent.
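To make the mechanism concrete, here is a simplified re-implementation of the idea, not the VLC code itself. It keeps the significant bits in the high part of a 32-bit buffer and refills byte by byte across a chain of buffers. The names chunk_t, SketchInitBitStream and SketchGetBits are hypothetical, and padding with zero bits at the end of the data is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a chain of data packets. */
typedef struct chunk_s
{
    const uint8_t  *p_start;
    const uint8_t  *p_end;    /* one past the last byte */
    struct chunk_s *p_next;
} chunk_t;

typedef struct bit_stream_s
{
    uint32_t       buffer;       /* significant bits in the high part */
    int            i_available;  /* number of significant bits */
    const chunk_t *p_chunk;
    const uint8_t *p_byte;
} bit_stream_t;

static void SketchInitBitStream( bit_stream_t *p_bs, const chunk_t *p_chunk )
{
    p_bs->buffer = 0;
    p_bs->i_available = 0;
    p_bs->p_chunk = p_chunk;
    p_bs->p_byte = ( p_chunk != NULL ) ? p_chunk->p_start : NULL;
}

/* Fetches one byte, changing chunk when needed; returns 0 at the end. */
static int NextByte( bit_stream_t *p_bs, uint8_t *p_out )
{
    while( p_bs->p_chunk != NULL && p_bs->p_byte == p_bs->p_chunk->p_end )
    {
        p_bs->p_chunk = p_bs->p_chunk->p_next;
        p_bs->p_byte = ( p_bs->p_chunk != NULL ) ? p_bs->p_chunk->p_start
                                                 : NULL;
    }
    if( p_bs->p_chunk == NULL )
    {
        return 0;
    }
    *p_out = *p_bs->p_byte++;
    return 1;
}

/* Returns the next i_bits bits (1 to 24) and removes them. */
static uint32_t SketchGetBits( bit_stream_t *p_bs, unsigned int i_bits )
{
    uint32_t i_result;
    uint8_t  i_byte;

    assert( i_bits >= 1 && i_bits <= 24 );

    while( p_bs->i_available < (int)i_bits )
    {
        if( !NextByte( p_bs, &i_byte ) )
        {
            i_byte = 0;  /* assumption: pad with zeros at end of data */
        }
        p_bs->buffer |= (uint32_t)i_byte << ( 24 - p_bs->i_available );
        p_bs->i_available += 8;
    }

    i_result = p_bs->buffer >> ( 32 - i_bits );
    p_bs->buffer <<= i_bits;  /* a shift by 32 would be undefined in C */
    p_bs->i_available -= (int)i_bits;

    return i_result;
}
```

With a byte-wise refill, at most 23 bits are pending before a refill, which is one way to see where a 24-bit guarantee can come from; a full 32-bit read would need a shift by 32 and therefore a separate code path.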
<note> <title> Packet changes and alignment issues </title>
We have to study the conjunction of two problems. First, a
<type> data_packet_t </type> can have an odd number of bytes,
for instance 177, so the last word will be truncated. Second,
many CPUs (sparc, alpha...) can only read words aligned on a
word boundary (that is, 32 bits for a 32-bit word). So packet
changes are a lot more complicated than you might imagine, because
we have to read truncated words and get realigned.
For instance <function> GetBits() </function> will call
<function> UnalignedGetBits() </function> from <filename>
src/input/input_ext-dec.c</filename>. Basically it will
read byte after byte until the stream gets realigned. <function>
UnalignedShowBits() </function> is a bit more complicated
and may require a temporary packet
</para> </note>
To use the bit stream, you have to call <parameter>
p_decoder_config-&gt;pf_init_bit_stream( bit_stream_t * p_bit_stream,
decoder_fifo_t * p_fifo )</parameter> to set up all variables. You will
probably need to regularly fetch specific information from the packet,
for instance the PTS. If <parameter> p_bit_stream-&gt;pf_bitstream_callback
</parameter> is not <constant> NULL</constant>, it will be called
on a packet change. See <filename> src/video_parser/video_parser.c
</filename> for an example. The second argument
indicates whether it is just a new <type>data_packet_t</type> or
also a new <type>pes_packet_t</type>. You can store your own structure in
<parameter> p_bit_stream-&gt;p_callback_arg</parameter>.
<warning> <para>
When you call <function>pf_init_bit_stream</function>, the
<function>pf_bitstream_callback</function> is not defined yet,
yet the bit stream already jumps to the first packet. You will probably
want to call your bitstream callback by hand just after
<function> pf_init_bit_stream</function>.
</para> </warning>
<sect1> <title> Built-in decoders </title>
VLC already features an MPEG layer 1 and 2 audio decoder, an MPEG MP@ML
video decoder, an AC3 decoder (borrowed from LiViD), a DVD SPU decoder,
and an LPCM decoder [not functional yet]. To write your own
decoder, just mimic the video parser.
<note> <title> Limitations in the current design </title>
Currently, decoders are not "plug-ins", that is they are not dynamically
loadable. The way the input chooses a decoder is also not final - it
is hard-wired in <filename> src/input/input_programs.c</filename>.
</para> </note>
The MPEG audio decoder is native but doesn't support layer 3 decoding
[too much trouble]; the AC3 decoder is a port of Aaron
Holtzman's libac3 (the original libac3 isn't reentrant); and the
SPU decoder is native. You may want to have a look at <function>
BitstreamCallback </function> in the AC3 decoder. In that case we have
to skip the first 3 bytes of a PES packet, which are not part of the
elementary stream. The video decoder is a bit special and will
be described in the following section.
<sect1> <title> The MPEG video decoder </title>
VideoLAN Client provides an MPEG-1 and MPEG-2 Main Profile @
Main Level decoder. It was written natively for VLC and is quite
mature. Its status is a bit special, since it is split between two
modules : video parser and video decoder [this is subject to change].
The initial goal is to separate bit stream parsing functions from
highly parallelizable mathematical algorithms. In theory, there can be
one video parser thread (and only one, otherwise we would have race
conditions reading the bit stream), along with several video decoder
threads, which do IDCT and motion compensation on several blocks
at once [in practice,
multi-threaded mode hasn't been tested for a while, still needs some
work, and was actually slower than single-threaded mode ; the
multi-threaded mode won't be documented for the moment].
It doesn't (and won't) support MPEG-4 or DivX decoding. It is not an
encoder. It should support the whole MPEG-2 MP@ML specification, though
some features are still left untested, like Differential Motion Vectors.
Please bear in mind before complaining that the input elementary stream
must be valid (for instance this is not the case when you directly read
a DVD multi-angle .vob file).
The most interesting file is <filename> vpar_synchro.c</filename>; it is
really worth a look. It explains the whole frame dropping algorithm.
In a nutshell, if the machine is powerful enough, we decode all I, P and
B pictures; otherwise we decode all I and P pictures, and B pictures if
we have enough time (this is
based on on-the-fly decoding time statistics). Another interesting file
is <filename>vpar_blocks.c</filename>, which describes all block
(including coefficients and motion vectors) parsing algorithms. Look
at the bottom of the file: we generate one optimized function
for every common picture type, and one slow generic function. There
are also several levels of optimization (which make compilation slower
but the decoding of certain types of files faster) called <constant>
VPAR_OPTIM_LEVEL</constant> : level 0 means no optimization, level 1
means optimizations for MPEG-1 and MPEG-2 frame pictures, level 2
means optimizations for MPEG-1 and MPEG-2 field and frame pictures.
<sect2> <title> Motion compensation plug-ins </title>
Motion compensation (i.e. copying regions from a reference picture) is
very platform-dependent (for instance with MMX or AltiVec versions), so
we moved it to the <filename> plugins/motion </filename> directory. This
is more convenient for the video decoder, and the resulting plug-ins may
be used by other video decoders (MPEG-4 ?). A motion plugin must
define 6 functions, coming straight from the specification :
<function> vdec_MotionFieldField420, vdec_MotionField16x8420,
vdec_MotionFieldDMV420, vdec_MotionFrameFrame420, vdec_MotionFrameField420,
vdec_MotionFrameDMV420</function>. The equivalent 4:2:2 and 4:4:4
functions are unused, since these formats are forbidden in MP@ML (they
would only lengthen compilation time).
Look at the C version of the algorithms if you want more information.
Note also that the DMV algorithm is untested and is probably buggy.
<sect2> <title> IDCT plug-ins </title>
Just like motion compensation, IDCT is platform-specific. So we moved it
to <filename> plugins/idct</filename>. You need to define four methods :
<listitem> <para> <function> vdec_IDCT </function> <parameter>
( vdec_thread_t * p_vdec, dctelem_t * p_block, int ) </parameter> :
Does the complete 2-D IDCT. 64 coefficients are in <parameter>
</para> </listitem>
<listitem> <para> <function> vdec_SparseIDCT </function>
<parameter> ( vdec_thread_t * p_vdec, dctelem_t * p_block,
int i_sparse_pos ) </parameter> :
Does an IDCT on a block with only one non-zero coefficient
(designated by <parameter> i_sparse_pos</parameter>). You can
use the function defined in <filename> plugins/idct/idct_common.c
</filename> which precalculates these 64 matrices at
initialization time.
</para> </listitem>
<listitem> <para> <function> vdec_InitIDCT </function>
<parameter> ( vdec_thread_t * p_vdec ) </parameter> :
Does the initialization stuff needed by <function>
</para> </listitem>
<listitem> <para> <function> vdec_NormScan </function>
<parameter> ( u8 ppi_scan[2][64] ) </parameter> :
Normally, this function does nothing. For minor optimizations,
some IDCT implementations (MMX) need to invert certain coefficients in the
MPEG scan matrices (see ISO/IEC 13818-2).
</para> </listitem>
Currently we have implemented optimized versions for : MMX, MMXEXT, and
AltiVec [doesn't work]. We have two plain C versions, the normal
(supposedly optimized) Berkeley version (<filename>idct.c</filename>),
and the simple 1-D separation IDCT from the ISO reference decoder
[In the future, the IDCT plug-in will include <function> vdec_AddBlock
</function> and <function> vdec_CopyBlock </function>, which are
often architecture-specific.]