ceph/web/source.body


<div class="mainsegment">
	<h3>Getting Started</h3>
	<div>
		The Ceph source code is managed with Git.  For a Git crash course, there is a <a href="http://www.kernel.org/pub/software/scm/git/docs/tutorial.html">tutorial</a> and more from the <a href="http://git.or.cz/#documentation">official Git site</a>.  Here is a quick <a href="http://git.or.cz/course/svn.html">crash course for Subversion users</a>.

		<p>The Ceph project is always looking for more participants. If you are interested in using Ceph, or contributing to its development, please <a href="http://lists.sourceforge.net/mailman/listinfo/ceph-devel">join our mailing list</a> and <a href="mailto:ceph-devel@lists.sourceforge.net">drop us a line</a>.

		<h4>Checking out</h4>
		<div>
			You can check out a working copy (actually, clone the repository) with
<pre>
git clone git://ceph.newdream.net/ceph.git
</pre>
			To pull the latest,
<pre>
git pull
</pre>
		</div>

		<h4>Build Targets</h4>
		<div>
			There are a range of binary targets, mostly for ease of development and testing:
			<ul>
			<li><b>fakesyn</b> -- places all logical elements (MDS, client, etc.) in a single binary, with synchronous message delivery (for easy debugging!).  Includes synthetic workload generation.</li>
			<li><b>fakefuse</b> -- same as fakesyn, but mounts a single client via FUSE.</li>
			<li><b>newsyn</b> -- starts up all logical elements using MPI.  As with fakesyn, it includes synthetic workload generation.</li>
			<li><b>cosd</b> -- standalone OSD</li>
			<li><b>cmon</b> -- standalone monitor</li>
			<li><b>cmds</b> -- standalone MDS</li>
			<li><b>cfuse</b> -- standalone client, mountable via FUSE</li>
			</ul>

			For most development, fakesyn, fakefuse, and newsyn are sufficient.
		</div>

		<h4>Runtime Environment</h4>
		<div>
			Few quick steps to get things started.  Note that these instructions assume either that you are running on one node, or have a shared directory (e.g. over NFS) mounted on each node.

			<ol>
			<li>Checkout, change into the <tt>ceph/src</tt> directory, and build.  E.g.,
<pre>
git clone git://ceph.newdream.net/ceph.git
cd ceph/src
make mpi=no fuse=no
</pre>
(You can omit the mpi=no or fuse=no if you happen to have those installed.)


			<li>Create a <tt>log/</tt> dir for various runtime stats.
<pre>
mkdir log
</pre>
			<li>Identify the EBOFS block devices. This is accomplished with symlinks (or actual files) in the <tt>dev/</tt> directory.  Devices can be identified by symlinks named after the hostname (e.g. <tt>osd.googoo-1</tt>), logical OSD number (e.g. <tt>osd4</tt>), or simply <tt>osd.all</tt> (in that order of preference).  For example,
<pre>
mkdir dev
ln -s /dev/sda3 dev/osd.all   # all nodes use /dev/sda3
ln -s /dev/sda4 dev/osd0      # except osd0, which should use /dev/sd4
</pre>
				That is, when an osd starts up, it first looks for <tt>dev/osd$n</tt>, then <tt>dev/osd.all</tt>, in that order.

				These need not be "real" devices--they can be regular files too.  To get going with fakesyn, for example, or to test a whole "cluster" running on the same node,
<pre>
# create small "disks" for osd0-osd3
for f in 0 1 2 3; do                                 # default is 4 OSDs
dd if=/dev/zero of=dev/osd$f bs=1048576 count=1024   # 1 GB each
done
</pre>
				Note that if your home/working directory is mounted via NFS or similar, you'll want to symlink <tt>dev/</tt> to a directory on a local disk.
		</div>


		<h4>Running fakesyn -- everything one process</h4>
		<div>
			A quick example, assuming you've set up "fake" EBOFS devices as above:
<pre>
make fakesyn && ./fakesyn --mkfs --debug_ms 1 --debug_client 3 --syn rw 1 100000
# where those options mean:
#	--mkfs               # start with a fresh file system
#	--debug_ms 1         # show message delivery
#	--debug_client 3     # show limited client stuff
#	--syn rw 1 100000    # write 1MB to a file in 100,000 byte chunks, then read it back
</pre>
			One the synthetic workload finishes, the synthetic client unmounts, and the whole system shuts down.

			The full set of command line arguments can be found in <tt>config.cc</tt>.
		</div>

		<h4>Starting up a full "cluster" on a single host</h4>
		<div>
			You can start up a the full cluster of daemons on a single host.  Assuming you've created a set of individual files for each OSD's block device (the second option of #3 above), you can create a <tt>stop.sh</tt> script like
<pre>
#!/bin/sh
killall cosd cmds cmon
</pre>
and a <tt>start.sh</tt> script like
<pre>
#!/bin/sh
./stop.sh
./mkmonmap 1.2.3.4:12345  # your IP here; any unused port will do
./cmon --mkfs --mon 0 &
./cosd --mkfs --osd 0 &
./cosd --mkfs --osd 1 &
./cosd --mkfs --osd 2 &
./cosd --mkfs --osd 3 &
./cmds &
</pre>
			Note that the IP you specify is for the monitor.  This is the only fixed and static ip:port in the system.  The rest of the cluster daemons bind to a random port and register themselves with the monitor.
		</div>

		<h4>Mounting with FUSE</h4>
		<div>
			The easiest route is <tt>fakefuse</tt>:
<pre>
modprobe fuse  # make sure fuse module is loaded
mkdir mnt      # or whereever you want your mount point
make fakefuse && ./fakefuse --mkfs --debug_ms 1 mnt
</pre>
			You should be able to ls, copy files, or whatever else (in another terminal; fakefuse will stay in the foreground).  Control-C will kill fuse and cause an orderly shutdown.  Alternatively, <tt>fusermount -u mnt</tt> will unmount.  If fakefuse crashes or hangs, you may need to <tt>kill -9 fakefuse</tt> and/or <tt>fusermount -u mnt</tt> to clean up.  Overall, FUSE is pretty well-behaved.

If you have the cluster daemon's already running (as above), you can mount via the standalone fuse client:
<pre>
modprobe fuse
mkdir mnt
make cfuse && ./cfuse mnt
</pre>
		</div>

		<h4>Running the kernel client in a UML instance</h4>
		<div>
			Any recent mainline kernel will do here.
<pre>
$ cd linux
$ patch -p1 < ~/ceph/src/kernel/kconfig.patch
patching file fs/Kconfig
patching file fs/Makefile
$ cp ~/ceph/src/kernel/sample.uml.config .config
$ ln -s ~/ceph/src/kernel fs/ceph
$ ln -s ~/ceph/src/include/ceph_fs.h include/linux
$ make ARCH=um
</pre>
			I am using <a href="http://uml.nagafix.co.uk/Debian-3.1/Debian-3.1-AMD64-root_fs.bz2">this x86_64 Debian UML root fs image</a>, but any image will do (see <a href="http://user-mode-linux.sf.net">http://user-mode-linux.sf.net</a>) as long as the architecture (e.g. x86_64 vs i386) matches your host.  Start up the UML instance with something like
<pre>
./linux ubda=Debian-3.1-AMD64-root_fs mem=256M eth0=tuntap,,,1.2.3.4  # 1.2.3.4 is the _host_ ip
</pre>
			Note that if UML crashes/oopses/whatever, you can restart quick-and-dirty (up arrow + enter) with
<pre>
reset ; killall -9 linux ; ./linux ubda=Debian-3.1-AMD64-root_fs mem=256M eth0=tuntap,,,1.2.3.4
</pre>
			You'll need to configure the network in UML with an unused IP.  For my debian-based root fs image, this <tt>/etc/network/interfaces</tt> file does the trick:
<pre>
iface eth0 inet static
        address 1.2.3.5     # unused ip in your host's netowrk
        netmask 255.0.0.0
        gateway 1.2.3.4     # host ip
auto eth0
</pre>
			Note that you need install uml-utilities (<tt>apt-get install uml-utilities</tt> on debian distros) and add yourself to the <tt>uml-net</tt> group on the host (or run the UML instance as root) for the network to start up properly.
			<p>
			Inside UML, you'll want an <tt>/etc/fstab</tt> line like
<pre>
none            /host           hostfs  defaults        0       0
</pre>
			You can then load the kernel client module and mount from the UML instance with
<pre>
insmod /host/path/to/ceph/src/kernel/ceph.ko
mount -t ceph 1.2.3.4:/ -o monport=12345,ip=1.2.3.5 mnt # 1.2.3.4 is host, 1.2.3.5 is uml ip
</pre>

		</div>

		<h4>Running on multiple nodes</h4>
		<div>
			If you're ready to start things up on multiple nodes (or even just multiple processes on the same node), <tt>newsyn</tt> is the easiest way to get things launched.  It uses MPI to start up all the processes.  Assuming you have MPICH2 (or similar) installed,
<pre>
mpd &           # for a single host
mpiboot -n 10   # for multiple hosts (see MPICH docs)
make newsyn && mpiexec -l -n 10 ./newsyn --mkfs --nummds 1 --numosd 6 --numclient 20 --syn writefile 100 16384
</pre>
			You will probably want to make <tt>dev/osd.all</tt> a symlink to some block device that exists on every node you're starting an OSD on.  Otherwise, you'll need a symlink (for "block device" file) for each osd.

			If you want to mount a distributed FS (instead of generating a synthetic workload), try
<pre>
make newsyn && mpiexec -l -n 10 ./newsyn --mkfs --nummds 2 --numosd 6 --numclient 0     # 0 clients, just mds and osds
# in another terminal,
mkdir mnt
make cfuse && ./cfuse mnt
# and in yet another terminal,
ls mnt
touch mnt/asdf   # etc
</pre>
			Currently, when the last client (<tt>cfuse</tt> instance, in this case) shuts down, the whole thing will shut down.  Assuming things shut down cleanly, you should be able to start things up again without the <tt>--mkfs</tt> flag and recover the prior file system state.
		</div>

		<h4>Structure</h4>
		<div>
			Here's a crude table diagram that shows how the major (user space) pieces fit together.  Ingore the MDS bits; that's mostly wrong.

			FIXME: this links to the <b>old</b> Subversion repository.
<table border=0>
<tr> <td></td> <td>Application</td> </tr>
<tr> <td></td> <td class=kernel>kernel</td> </tr>
<tr> <td>Application</td> <td class=lib><a href="http://svn.sourceforge.net/viewvc/ceph/trunk/ceph/client/fuse.cc?view=markup">FUSE glue</a></td> </tr>
<tr> <td class=entity colspan=2><a href="http://svn.sourceforge.net/viewvc/ceph/trunk/ceph/client/Client.h?view=markup">Client</a></td> <td class=net width=50></td><td class=entity colspan=2><a href="http://svn.sourceforge.net/viewvc/ceph/trunk/ceph/mds/MDS.h?view=markup">MDS</a></td> </tr>
<tr>  <td class=lib><a href="http://svn.sourceforge.net/viewvc/ceph/trunk/ceph/osdc/Filer.h?view=markup">Filer</a></td> <td class=lib><a href="http://svn.sourceforge.net/viewvc/ceph/trunk/ceph/osdc/ObjectCacher.h?view=markup">ObjectCacher</a></td> <td></td><td class=lib><a href="http://svn.sourceforge.net/viewvc/ceph/trunk/ceph/mds/MDLog.h?view=markup">MDLog</td><td class=lib><a href="http://svn.sourceforge.net/viewvc/ceph/trunk/ceph/mds/MDStore.h?view=markup">MDStore</a></td> </tr>
<tr> <td></td> <td class=lib colspan=4><a href="http://svn.sourceforge.net/viewvc/ceph/trunk/ceph/osdc/Objecter.h?view=markup">Objecter</a></td> </tr>
<tr> <td colspan=2></td> <td class=net colspan=2>(message layer)</td> </tr>
<tr> <td colspan=2></td> <td class=entity colspan=2><a href="http://svn.sourceforge.net/viewvc/ceph/trunk/ceph/osd/OSD.h?view=markup">OSD</a></td> </tr>
<tr> <td colspan=2></td>  <td class=abstract colspan=2><a href="http://svn.sourceforge.net/viewvc/ceph/trunk/ceph/osd/ObjectStore.h?view=markup">ObjectStore</a></td> </tr>
<tr> <td colspan=2></td>  <td class=lib><a href="http://svn.sourceforge.net/viewvc/ceph/trunk/ceph/ebofs/Ebofs.h?view=markup">EBOFS</a></td> <td rowspan=2 class=lib><a href="http://svn.sourceforge.net/viewvc/ceph/trunk/ceph/osd/FakeStore.h?view=markup">FakeStore</a></td> </tr>
<tr> <td colspan=2></td>  <td class=lib><a href="http://svn.sourceforge.net/viewvc/ceph/trunk/ceph/ebofs/BlockDevice.h?view=markup">BlockDevice</a></td> </tr>
<tr> <td colspan=2></td>  <td class=kernel colspan=2>Kernel POSIX interface</td> </tr>
<tr> <td height=30></td> </tr>
<tr> <td>Key:</td> <td class=net>Network</td> <td class=entity>Entity</td> <td class=lib>Lib/module</td> <td class=abstract>Abstract interface</td> <td class=kernel>Kernel</td>  </tr>
</table>

		</div>
	</div>
</div>