<div class="mainsegment">
<h3>Getting Started</h3>
<div>
The Ceph source code is managed with Git. For a Git crash course, there is a <a href="http://www.kernel.org/pub/software/scm/git/docs/tutorial.html">tutorial</a> and more from the <a href="http://git.or.cz/#documentation">official Git site</a>. Here is a quick <a href="http://git.or.cz/course/svn.html">crash course for Subversion users</a>.
<p>The Ceph project is always looking for more participants. If you are interested in using Ceph, or contributing to its development, please <a href="http://lists.sourceforge.net/mailman/listinfo/ceph-devel">join our mailing list</a> and <a href="mailto:ceph-devel@lists.sourceforge.net">drop us a line</a>.
<h4>Checking out</h4>
<div>
You can check out a working copy (actually, clone the repository) with
<pre>
git clone git://ceph.newdream.net/ceph.git
</pre>
To pull the latest,
<pre>
git pull
</pre>
</div>
<h4>Build Targets</h4>
<div>
There is a range of binary targets, mostly for ease of development and testing:
<ul>
<li><b>fakesyn</b> -- places all logical elements (MDS, client, etc.) in a single binary, with synchronous message delivery (for easy debugging!). Includes synthetic workload generation.</li>
<li><b>fakefuse</b> -- same as fakesyn, but mounts a single client via FUSE.</li>
<li><b>newsyn</b> -- starts up all logical elements using MPI. As with fakesyn, it includes synthetic workload generation.</li>
<li><b>cosd</b> -- standalone OSD</li>
<li><b>cmon</b> -- standalone monitor</li>
<li><b>cmds</b> -- standalone MDS</li>
<li><b>cfuse</b> -- standalone client, mountable via FUSE</li>
</ul>
For most development, fakesyn, fakefuse, and newsyn are sufficient.
</div>
<h4>Runtime Environment</h4>
<div>
Here are a few quick steps to get things started. Note that these instructions assume either that you are running on one node, or that you have a shared directory (e.g. over NFS) mounted on each node.
<ol>
<li>Check out the source, change into the <tt>ceph/src</tt> directory, and build. E.g.,
<pre>
git clone git://ceph.newdream.net/ceph.git
cd ceph/src
make mpi=no fuse=no
</pre>
(You can omit the mpi=no or fuse=no if you happen to have those installed.)
<li>Create a <tt>log/</tt> dir for various runtime stats.
<pre>
mkdir log
</pre>
<li>Identify the EBOFS block devices. This is accomplished with symlinks (or actual files) in the <tt>dev/</tt> directory. Devices can be identified by symlinks named after the hostname (e.g. <tt>osd.googoo-1</tt>), logical OSD number (e.g. <tt>osd4</tt>), or simply <tt>osd.all</tt> (in that order of preference). For example,
<pre>
mkdir dev
ln -s /dev/sda3 dev/osd.all # all nodes use /dev/sda3
ln -s /dev/sda4 dev/osd0 # except osd0, which should use /dev/sda4
</pre>
That is, when an OSD starts up, it first looks for a symlink named after its hostname, then for <tt>dev/osd$n</tt>, then for <tt>dev/osd.all</tt>, in that order.
These need not be "real" devices--they can be regular files too. To get going with fakesyn, for example, or to test a whole "cluster" running on the same node,
<pre>
# create small "disks" for osd0-osd3
for f in 0 1 2 3; do # default is 4 OSDs
dd if=/dev/zero of=dev/osd$f bs=1048576 count=1024 # 1 GB each
done
</pre>
Note that if your home/working directory is mounted via NFS or similar, you'll want to symlink <tt>dev/</tt> to a directory on a local disk.
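The device lookup order described in this step can be sketched as a small shell function. This only illustrates the documented preference (hostname, then OSD number, then <tt>osd.all</tt>); it is not the code the OSD actually runs, and the name <tt>resolve_osd_dev</tt> and its arguments are made up for this example.

```shell
#!/bin/sh
# Hypothetical helper (not part of Ceph): print the first existing
# device entry for an OSD, following the documented order of preference.
# Usage: resolve_osd_dev DEVDIR HOSTNAME OSDNUM
resolve_osd_dev() {
    devdir=$1
    host=$2
    num=$3
    for cand in "$devdir/osd.$host" "$devdir/osd$num" "$devdir/osd.all"; do
        # -e follows symlinks, so a dangling link is treated as missing
        if [ -e "$cand" ]; then
            echo "$cand"
            return 0
        fi
    done
    return 1  # no usable entry found
}
```

With the symlinks from the example above, <tt>resolve_osd_dev dev "$(hostname)" 0</tt> would select <tt>dev/osd0</tt>, while any other OSD number would fall back to <tt>dev/osd.all</tt>, provided the link targets exist.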
</ol>
</div>
<h4>Running fakesyn -- everything in one process</h4>
<div>
A quick example, assuming you've set up "fake" EBOFS devices as above:
<pre>
make fakesyn && ./fakesyn --mkfs --debug_ms 1 --debug_client 3 --syn rw 1 100000
# where those options mean:
# --mkfs # start with a fresh file system
# --debug_ms 1 # show message delivery
# --debug_client 3 # show limited client stuff
# --syn rw 1 100000 # write 1MB to a file in 100,000 byte chunks, then read it back
</pre>
Once the synthetic workload finishes, the synthetic client unmounts, and the whole system shuts down.
The full set of command line arguments can be found in <tt>config.cc</tt>.
</div>
<h4>Starting up a full "cluster" on a single host</h4>
<div>
You can start up the full cluster of daemons on a single host. Assuming you've created a set of individual files for each OSD's block device (the second option of #3 above), you can create a <tt>stop.sh</tt> script like
<pre>
#!/bin/sh
killall cosd cmds cmon
</pre>
and a <tt>start.sh</tt> script like
<pre>
#!/bin/sh
./stop.sh
./mkmonmap 1.2.3.4:12345 # your IP here; any unused port will do
./cmon --mkfs --mon 0 &
./cosd --mkfs --osd 0 &
./cosd --mkfs --osd 1 &
./cosd --mkfs --osd 2 &
./cosd --mkfs --osd 3 &
./cmds &
</pre>
Note that the IP you specify is for the monitor. This is the only fixed and static ip:port in the system. The rest of the cluster daemons bind to a random port and register themselves with the monitor.
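Since the OSD lines in <tt>start.sh</tt> differ only in their number, the script body can be generated for any OSD count. The following is a minimal sketch that uses only the binaries and flags shown above; <tt>gen_start_cmds</tt> is a hypothetical helper that prints the commands without running anything, and counts other than the default of 4 may need matching device files and configuration.

```shell
#!/bin/sh
# Hypothetical helper (not part of Ceph): print a start.sh equivalent
# for one monitor, NUM_OSD OSDs, and one MDS.
# Usage: gen_start_cmds MON_IP:PORT NUM_OSD
gen_start_cmds() {
    mon_addr=$1
    num_osd=$2
    echo "#!/bin/sh"
    echo "./stop.sh"
    echo "./mkmonmap $mon_addr"
    echo "./cmon --mkfs --mon 0 &"
    i=0
    while [ "$i" -lt "$num_osd" ]; do
        echo "./cosd --mkfs --osd $i &"
        i=$((i + 1))
    done
    echo "./cmds &"
}
```

For example, <tt>gen_start_cmds 1.2.3.4:12345 4 > start.sh</tt> would reproduce the script above, minus its comment.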
</div>
<h4>Mounting with FUSE</h4>
<div>
The easiest route is <tt>fakefuse</tt>:
<pre>
modprobe fuse # make sure fuse module is loaded
mkdir mnt # or wherever you want your mount point
make fakefuse && ./fakefuse --mkfs --debug_ms 1 mnt
</pre>
You should be able to ls, copy files, or whatever else (in another terminal; fakefuse will stay in the foreground). Control-C will kill fuse and cause an orderly shutdown. Alternatively, <tt>fusermount -u mnt</tt> will unmount. If fakefuse crashes or hangs, you may need to <tt>killall -9 fakefuse</tt> and/or <tt>fusermount -u mnt</tt> to clean up. Overall, FUSE is pretty well-behaved.
If you have the cluster daemons already running (as above), you can mount via the standalone fuse client:
<pre>
modprobe fuse
mkdir mnt
make cfuse && ./cfuse mnt
</pre>
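When cleaning up after a crashed or hung FUSE client, it helps to know whether the mount point is still mounted before calling <tt>fusermount -u</tt>. The sketch below checks <tt>/proc/mounts</tt>, so it is Linux-specific; <tt>is_mounted</tt> is a hypothetical helper, not part of Ceph or FUSE, and it does not handle mount points containing spaces.

```shell
#!/bin/sh
# Hypothetical helper: succeed if DIR is listed as a mount point in
# /proc/mounts (Linux only).
# Usage: is_mounted DIR
is_mounted() {
    # Resolve the parent directory rather than DIR itself, so that a
    # hung mount is never entered while building the absolute path.
    parent=$(cd "$(dirname "$1")" 2>/dev/null && pwd -P) || return 1
    target="$parent/$(basename "$1")"
    # Field 2 of each /proc/mounts line is the mount point.
    awk -v t="$target" '$2 == t { found = 1 } END { exit !found }' /proc/mounts
}
```

A cleanup step could then read <tt>is_mounted mnt && fusermount -u mnt</tt>.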
</div>
<h4>Running the kernel client in a UML instance</h4>
<div>
Any recent mainline kernel will do here.
<pre>
$ cd linux
$ patch -p1 < ~/ceph/src/kernel/kconfig.patch
patching file fs/Kconfig
patching file fs/Makefile
$ cp ~/ceph/src/kernel/sample.uml.config .config
$ ln -s ~/ceph/src/kernel fs/ceph
$ ln -s ~/ceph/src/include/ceph_fs.h include/linux
$ make ARCH=um
</pre>
I am using <a href="http://uml.nagafix.co.uk/Debian-3.1/Debian-3.1-AMD64-root_fs.bz2">this x86_64 Debian UML root fs image</a>, but any image will do (see <a href="http://user-mode-linux.sf.net">http://user-mode-linux.sf.net</a>) as long as the architecture (e.g. x86_64 vs i386) matches your host. Start up the UML instance with something like
<pre>
./linux ubda=Debian-3.1-AMD64-root_fs mem=256M eth0=tuntap,,,1.2.3.4 # 1.2.3.4 is the _host_ ip
</pre>
Note that if UML crashes/oopses/whatever, you can restart quick-and-dirty (up arrow + enter) with
<pre>
reset ; killall -9 linux ; ./linux ubda=Debian-3.1-AMD64-root_fs mem=256M eth0=tuntap,,,1.2.3.4
</pre>
You'll need to configure the network in UML with an unused IP. For my debian-based root fs image, this <tt>/etc/network/interfaces</tt> file does the trick:
<pre>
iface eth0 inet static
address 1.2.3.5 # unused ip in your host's network
netmask 255.0.0.0
gateway 1.2.3.4 # host ip
auto eth0
</pre>
Note that you need to install uml-utilities (<tt>apt-get install uml-utilities</tt> on Debian-based distros) and add yourself to the <tt>uml-net</tt> group on the host (or run the UML instance as root) for the network to start up properly.
<p>
Inside UML, you'll want an <tt>/etc/fstab</tt> line like
<pre>
none /host hostfs defaults 0 0
</pre>
You can then load the kernel client module and mount from the UML instance with
<pre>
insmod /host/path/to/ceph/src/kernel/ceph.ko
mount -t ceph 1.2.3.4:/ -o monport=12345,ip=1.2.3.5 mnt # 1.2.3.4 is host, 1.2.3.5 is uml ip
</pre>
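The mount command mixes three addresses that are easy to confuse: the host (monitor) IP, the monitor port given to <tt>mkmonmap</tt>, and the UML instance's own IP. As a sketch, the hypothetical helper <tt>ceph_mount_cmd</tt> below assembles the command from named arguments and prints it; the option names are taken from the example above and nothing else is assumed about the kernel client.

```shell
#!/bin/sh
# Hypothetical helper: print the kernel-client mount command.
# Usage: ceph_mount_cmd MON_IP MON_PORT CLIENT_IP MOUNTPOINT
ceph_mount_cmd() {
    mon_ip=$1      # host running cmon (the address given to mkmonmap)
    mon_port=$2    # monitor port (the port given to mkmonmap)
    client_ip=$3   # IP configured inside the UML instance
    mnt=$4         # mount point inside the UML instance
    echo "mount -t ceph $mon_ip:/ -o monport=$mon_port,ip=$client_ip $mnt"
}
```

Calling <tt>ceph_mount_cmd 1.2.3.4 12345 1.2.3.5 mnt</tt> prints the mount line shown above, without its trailing comment.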
</div>
<h4>Running on multiple nodes</h4>
<div>
If you're ready to start things up on multiple nodes (or even just multiple processes on the same node), <tt>newsyn</tt> is the easiest way to get things launched. It uses MPI to start up all the processes. Assuming you have MPICH2 (or similar) installed,
<pre>
mpd & # for a single host
mpdboot -n 10 # for multiple hosts (see MPICH docs)
make newsyn && mpiexec -l -n 10 ./newsyn --mkfs --nummds 1 --numosd 6 --numclient 20 --syn writefile 100 16384
</pre>
You will probably want to make <tt>dev/osd.all</tt> a symlink to some block device that exists on every node you're starting an OSD on. Otherwise, you'll need a symlink (or "block device" file) for each OSD.
If you want to mount a distributed FS (instead of generating a synthetic workload), try
<pre>
make newsyn && mpiexec -l -n 10 ./newsyn --mkfs --nummds 2 --numosd 6 --numclient 0 # 0 clients, just mds and osds
# in another terminal,
mkdir mnt
make cfuse && ./cfuse mnt
# and in yet another terminal,
ls mnt
touch mnt/asdf # etc
</pre>
Currently, when the last client (<tt>cfuse</tt> instance, in this case) shuts down, the whole thing will shut down. Assuming things shut down cleanly, you should be able to start things up again without the <tt>--mkfs</tt> flag and recover the prior file system state.
</div>
<h4>Structure</h4>
<div>
Here's a crude table diagram that shows how the major (user space) pieces fit together. Ignore the MDS bits; they are mostly wrong.
FIXME: this links to the <b>old</b> Subversion repository.
<table border=0>
<tr> <td></td> <td>Application</td> </tr>
<tr> <td></td> <td class=kernel>kernel</td> </tr>
<tr> <td>Application</td> <td class=lib><a href="http://svn.sourceforge.net/viewvc/ceph/trunk/ceph/client/fuse.cc?view=markup">FUSE glue</a></td> </tr>
<tr> <td class=entity colspan=2><a href="http://svn.sourceforge.net/viewvc/ceph/trunk/ceph/client/Client.h?view=markup">Client</a></td> <td class=net width=50></td><td class=entity colspan=2><a href="http://svn.sourceforge.net/viewvc/ceph/trunk/ceph/mds/MDS.h?view=markup">MDS</a></td> </tr>
<tr> <td class=lib><a href="http://svn.sourceforge.net/viewvc/ceph/trunk/ceph/osdc/Filer.h?view=markup">Filer</a></td> <td class=lib><a href="http://svn.sourceforge.net/viewvc/ceph/trunk/ceph/osdc/ObjectCacher.h?view=markup">ObjectCacher</a></td> <td></td><td class=lib><a href="http://svn.sourceforge.net/viewvc/ceph/trunk/ceph/mds/MDLog.h?view=markup">MDLog</a></td><td class=lib><a href="http://svn.sourceforge.net/viewvc/ceph/trunk/ceph/mds/MDStore.h?view=markup">MDStore</a></td> </tr>
<tr> <td></td> <td class=lib colspan=4><a href="http://svn.sourceforge.net/viewvc/ceph/trunk/ceph/osdc/Objecter.h?view=markup">Objecter</a></td> </tr>
<tr> <td colspan=2></td> <td class=net colspan=2>(message layer)</td> </tr>
<tr> <td colspan=2></td> <td class=entity colspan=2><a href="http://svn.sourceforge.net/viewvc/ceph/trunk/ceph/osd/OSD.h?view=markup">OSD</a></td> </tr>
<tr> <td colspan=2></td> <td class=abstract colspan=2><a href="http://svn.sourceforge.net/viewvc/ceph/trunk/ceph/osd/ObjectStore.h?view=markup">ObjectStore</a></td> </tr>
<tr> <td colspan=2></td> <td class=lib><a href="http://svn.sourceforge.net/viewvc/ceph/trunk/ceph/ebofs/Ebofs.h?view=markup">EBOFS</a></td> <td rowspan=2 class=lib><a href="http://svn.sourceforge.net/viewvc/ceph/trunk/ceph/osd/FakeStore.h?view=markup">FakeStore</a></td> </tr>
<tr> <td colspan=2></td> <td class=lib><a href="http://svn.sourceforge.net/viewvc/ceph/trunk/ceph/ebofs/BlockDevice.h?view=markup">BlockDevice</a></td> </tr>
<tr> <td colspan=2></td> <td class=kernel colspan=2>Kernel POSIX interface</td> </tr>
<tr> <td height=30></td> </tr>
<tr> <td>Key:</td> <td class=net>Network</td> <td class=entity>Entity</td> <td class=lib>Lib/module</td> <td class=abstract>Abstract interface</td> <td class=kernel>Kernel</td> </tr>
</table>
</div>
</div>
</div>