ceph/web/source.body

169 lines
7.8 KiB
Plaintext

<div class="mainsegment">
<h3>Getting Started</h3>
<div>
The Ceph source code is managed with Git. For a Git crash course, there is a <a href="http://www.kernel.org/pub/software/scm/git/docs/tutorial.html">tutorial</a> and more from the <a href="http://git.or.cz/#documentation">official Git site</a>. Here is a quick <a href="http://git.or.cz/course/svn.html">crash course for Subversion users</a>.
<p>The Ceph project is always looking for more participants. If you are interested in using Ceph, or contributing to its development, please <a href="http://lists.sourceforge.net/mailman/listinfo/ceph-devel">join our mailing list</a> and <a href="mailto:ceph-devel@lists.sourceforge.net">drop us a line</a>.
<h4>Checking out the source</h4>
<div>
You can check out a working copy (actually, clone the repository) with
<pre>
git clone git://ceph.newdream.net/ceph.git
</pre>
or
<pre>
git clone http://ceph.newdream.net/git/ceph.git
</pre>
To pull the latest,
<pre>
git pull
</pre>
You can browse the git repo at <a href="http://ceph.newdream.net/git">http://ceph.newdream.net/git</a>.
</div>
<h4>Build Targets</h4>
<div>
There are a range of binary targets, mostly for ease of development and testing:
<ul>
<li><b>cmon</b> -- monitor</li>
<li><b>cosd</b> -- OSD storage daemon</li>
<li><b>cmds</b> -- MDS metadata server</li>
<li><b>cfuse</b> -- client, mountable via FUSE</li>
<li><b>csyn</b> -- client sythetic workload generator</li>
<li><b>cmonctl</b> -- control tool</li>
<p>
<li><b>fakesyn</b> -- places all logical elements (MDS, client, etc.) in a single binary, with synchronous message delivery (for easy debugging!). Includes synthetic workload generation.</li>
<li><b>fakefuse</b> -- same as fakesyn, but mounts a single client via FUSE.</li>
</ul>
</div>
<h4>Runtime Environment</h4>
<div>
Few quick steps to get things started. Note that these instructions assume either that you are running on one node, or have a shared directory (e.g. over NFS) mounted on each node.
<ol>
<li>Checkout, change into the <tt>ceph/src</tt> directory, and build. E.g.,
<pre>
git clone git://ceph.newdream.net/ceph.git
cd ceph
./autogen.sh
./configure # of CXXFLAGS="-g" ./configure to disable optimizations (for debugging)
cd src
make
</pre>
<li>Create a <tt>log/</tt> dir for various runtime stats.
<pre>
mkdir log
</pre>
<li>Identify the EBOFS block devices. This is accomplished with symlinks (or actual files) in the <tt>dev/</tt> directory. Devices can be identified by symlinks named after the hostname (e.g. <tt>osd.googoo-1</tt>), logical OSD number (e.g. <tt>osd4</tt>), or simply <tt>osd.all</tt> (in that order of preference). For example,
<pre>
mkdir dev
ln -s /dev/sda3 dev/osd.all # all nodes use /dev/sda3
ln -s /dev/sda4 dev/osd0 # except osd0, which should use /dev/sd4
</pre>
That is, when an osd starts up, it first looks for <tt>dev/osd$n</tt>, then <tt>dev/osd.all</tt>, in that order.
These need not be "real" devices--they can be regular files too. To get going with fakesyn, for example, or to test a whole "cluster" running on the same node,
<pre>
# create small "disks" for osd0-osd3
for f in 0 1 2 3; do # default is 4 OSDs
dd if=/dev/zero of=dev/osd$f bs=1048576 count=1024 # 1 GB each
done
</pre>
Note that if your home/working directory is mounted via NFS or similar, you'll want to symlink <tt>dev/</tt> to a directory on a local disk.
</div>
<h4>Starting up a full "cluster" on a single host</h4>
<div>
You can start up a the full cluster of daemons on a single host. Assuming you've created a set of individual files for each OSD's block device (the second option of #3 above), there is a <tt>start.sh</tt> and <tt>stop.sh</tt> script that will start up on port 12345.
<p>
One caveat here is that the ceph daemons need to know what IP they are reachable at; they determine that by doing a lookup on the machine's hostname. Since many/most systems map the hostname to 127.0.0.1 in <tt>/etc/hosts</tt>, you either need to change that (the easiest approach, usually) or add a <tt>--bind 1.2.3.4</tt> argument to cmon/cosd/cmds to help them out.
<p>
Note that the monitor has the only fixed and static ip:port in the system. The rest of the cluster daemons bind to a random port and register themselves with the monitor.
</div>
<h4>Mounting with FUSE</h4>
<div>
The easiest route is <tt>fakefuse</tt>:
<pre>
modprobe fuse # make sure fuse module is loaded
mkdir mnt # or whereever you want your mount point
make fakefuse && ./fakefuse --mkfs --debug_ms 1 mnt
</pre>
You should be able to ls, copy files, or whatever else (in another terminal; fakefuse will stay in the foreground). Control-C will kill fuse and cause an orderly shutdown. Alternatively, <tt>fusermount -u mnt</tt> will unmount. If fakefuse crashes or hangs, you may need to <tt>kill -9 fakefuse</tt> and/or <tt>fusermount -u mnt</tt> to clean up. Overall, FUSE is pretty well-behaved.
If you have the cluster daemon's already running (as above), you can mount via the standalone fuse client:
<pre>
modprobe fuse
mkdir mnt
make cfuse && ./cfuse mnt
</pre>
</div>
<h4>Running the kernel client in a UML instance</h4>
<div>
Any recent mainline kernel will do here.
<pre>
$ cd linux
$ patch -p1 < ~/ceph/src/kernel/kconfig.patch
patching file fs/Kconfig
patching file fs/Makefile
$ cp ~/ceph/src/kernel/sample.uml.config .config
$ ln -s ~/ceph/src/kernel fs/ceph
$ ln -s ~/ceph/src/include/ceph_fs.h include/linux
$ make ARCH=um
</pre>
I am using <a href="http://uml.nagafix.co.uk/Debian-3.1/Debian-3.1-AMD64-root_fs.bz2">this x86_64 Debian UML root fs image</a>, but any image will do (see <a href="http://user-mode-linux.sf.net">http://user-mode-linux.sf.net</a>) as long as the architecture (e.g. x86_64 vs i386) matches your host. Start up the UML guest instance with something like
<pre>
./linux ubda=Debian-3.1-AMD64-root_fs mem=256M eth0=tuntap,,,1.2.3.4 # 1.2.3.4 is the _host_ ip
</pre>
Note that if UML crashes/oopses/whatever, you can restart quick-and-dirty (up arrow + enter) with
<pre>
reset ; killall -9 linux ; ./linux ubda=Debian-3.1-AMD64-root_fs mem=256M eth0=tuntap,,,1.2.3.4
</pre>
You'll need to configure the network in UML with an unused IP. For my debian-based root fs image, this <tt>/etc/network/interfaces</tt> file does the trick:
<pre>
iface eth0 inet static
address 1.2.3.5 # unused ip in your host's netowrk for the uml guest
netmask 255.0.0.0
gateway 1.2.3.4 # host ip
auto eth0
</pre>
Note that you need install uml-utilities (<tt>apt-get install uml-utilities</tt> on debian distros) and add yourself to the <tt>uml-net</tt> group on the host (or run the UML instance as root) for the network to start up properly.
<p>
Inside UML, you'll want an <tt>/etc/fstab</tt> line like
<pre>
none /host hostfs defaults 0 0
</pre>
You can then load the kernel client module and mount from the UML instance with
<pre>
insmod /host/path/to/ceph/src/kernel/ceph.ko
mount -t ceph 1.2.3.4:/ mnt # 1.2.3.4 is host
</pre>
</div>
<h4>Running fakesyn -- everything one process</h4>
<div>
A quick example, assuming you've set up "fake" EBOFS devices as above:
<pre>
make fakesyn && ./fakesyn --mkfs --debug_ms 1 --debug_client 3 --syn rw 1 100000
# where those options mean:
# --mkfs # start with a fresh file system
# --debug_ms 1 # show message delivery
# --debug_client 3 # show limited client stuff
# --syn rw 1 100000 # write 1MB to a file in 100,000 byte chunks, then read it back
</pre>
One the synthetic workload finishes, the synthetic client unmounts, and the whole system shuts down.
The full set of command line arguments can be found in <tt>config.cc</tt>.
</div>
</div>
</div>