Building out information architecture. Modified getting involved, why use ceph, etc.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
This commit is contained in:
John Wilkins 2012-04-11 11:21:43 -07:00 committed by Tommi Virtanen
parent bc857d8696
commit 859da18e5e
17 changed files with 429 additions and 49 deletions

View File

@ -0,0 +1,3 @@
======================
Building ``ceph.conf``
======================

View File

@ -0,0 +1,3 @@
===================
Deploying with Chef
===================

View File

@ -0,0 +1,13 @@
==========================
Creating a Storage Cluster
==========================
.. toctree::
:hidden:
building_ceph_conf
running_mkcephfs
deploying_with_chef

View File

@ -0,0 +1,3 @@
====================
Running ``mkcephfs``
====================

View File

@ -22,8 +22,8 @@ using NFS or Samba re-exports.
start/index
install/index
create_cluster/index
configure/index
architecture
ops/index
rec/index
config
@ -31,5 +31,6 @@ using NFS or Samba re-exports.
api/index
Internals <dev/index>
man/index
architecture
papers
appendix/index

View File

@ -0,0 +1,105 @@
===================
Build Prerequisites
===================
Before you can build Ceph documentation or Ceph source code, you need to install several libraries and tools.
.. tip:: Check this section to see if there are specific prerequisites for your Linux/Unix distribution.
Prerequisites for Building Ceph Documentation
=============================================
Ceph uses Python's Sphinx documentation tool. For details on Sphinx, refer to
`Sphinx <http://sphinx.pocoo.org>`_, and follow the directions at
`Sphinx 1.1.3 <http://pypi.python.org/pypi/Sphinx>`_ to install it.
To run Sphinx via ``admin/build-doc``, at least the following packages are required:
- ``python-dev``
- ``python-pip``
- ``python-virtualenv``
- ``libxml2-dev``
- ``libxslt-dev``
- ``doxygen``
- ``ditaa``
- ``graphviz``
Execute ``sudo apt-get install`` for each dependency that isn't installed on your host. ::
$ sudo apt-get install python-dev python-pip python-virtualenv libxml2-dev libxslt-dev doxygen ditaa graphviz
Prerequisites for Building Ceph Source Code
===========================================
Ceph provides ``autoconf`` and ``automake`` scripts to get you started quickly. Ceph build scripts
depend on the following:
- ``autotools-dev``
- ``autoconf``
- ``automake``
- ``cdbs``
- ``gcc``
- ``g++``
- ``git``
- ``libboost-dev``
- ``libedit-dev``
- ``libssl-dev``
- ``libtool``
- ``libfcgi``
- ``libfcgi-dev``
- ``libfuse-dev``
- ``linux-kernel-headers``
- ``libcrypto++-dev``
- ``libcrypto++``
- ``libexpat1-dev``
- ``libgtkmm-2.4-dev``
- ``pkg-config``
On Ubuntu, execute ``sudo apt-get install`` for each dependency that isn't installed on your host. ::
$ sudo apt-get install autotools-dev autoconf automake cdbs
gcc g++ git libboost-dev libedit-dev libssl-dev libtool
libfcgi libfcgi-dev libfuse-dev linux-kernel-headers
libcrypto++-dev libcrypto++ libexpat1-dev libgtkmm-2.4-dev
On Debian/Squeeze, execute ``aptitude install`` for each dependency that isn't installed on your host. ::
$ aptitude install autotools-dev autoconf automake cdbs
gcc g++ git libboost-dev libedit-dev libssl-dev libtool
libfcgi libfcgi-dev libfuse-dev linux-kernel-headers
libcrypto++-dev libcrypto++ libexpat1-dev libgtkmm-2.4-dev
Ubuntu Requirements
-------------------
- ``uuid-dev``
- ``libkeyutils-dev``
- ``libgoogle-perftools-dev``
- ``libatomic-ops-dev``
- ``libaio-dev``
- ``libgdata-common``
- ``libgdata13``
Execute ``sudo apt-get install`` for each dependency that isn't installed on your host. ::
$ sudo apt-get install uuid-dev libkeyutils-dev libgoogle-perftools-dev
libatomic-ops-dev libaio-dev libgdata-common libgdata13
Debian
------
Alternatively, you may install::
$ aptitude install fakeroot dpkg-dev
$ aptitude install debhelper cdbs libexpat1-dev libatomic-ops-dev
openSUSE 11.2 (and later)
-------------------------
- ``boost-devel``
- ``gcc-c++``
- ``libedit-devel``
- ``libopenssl-devel``
- ``fuse-devel`` (optional)
Execute ``zypper install`` for each dependency that isn't installed on your host. ::
$ zypper install boost-devel gcc-c++ libedit-devel libopenssl-devel fuse-devel

View File

@ -0,0 +1,31 @@
=============
Building Ceph
=============
Ceph provides build scripts for source code and for documentation.
Building Ceph
=============
Ceph provides ``automake`` and ``configure`` scripts to streamline the build process. To build Ceph, navigate to your cloned Ceph repository and execute the following::
$ cd ceph
$ ./autogen.sh
$ ./configure
$ make
You can use ``make -j`` to execute multiple jobs in parallel, depending upon your system. For example, to run four jobs::
$ make -j4
Building Ceph Documentation
===========================
Ceph utilizes Python's Sphinx documentation tool. For details on the Sphinx documentation tool, refer to `Sphinx <http://sphinx.pocoo.org>`_. To build the Ceph documentation, navigate to the Ceph repository and execute the build script::
$ cd ceph
$ admin/build-doc
Once you build the documentation set, you may navigate to the output directory to view it::
$ cd build-doc/output
There should be an ``html`` directory and a ``man`` directory containing the documentation in HTML and manpage formats respectively.

View File

@ -0,0 +1,54 @@
=======================================
Cloning the Ceph Source Code Repository
=======================================
To check out the Ceph source code, you must have ``git`` installed
on your local host. To install ``git``, execute::
$ sudo apt-get install git
You must also have a ``github`` account. If you do not have a
``github`` account, go to `github.com <http://github.com>`_ and register.
Follow the directions for setting up git at `Set Up Git <http://help.github.com/linux-set-up-git/>`_.
Generate SSH Keys
-----------------
You must generate SSH keys for github to clone the Ceph
repository. If you do not have SSH keys for ``github``, execute::
$ ssh-keygen -d
Get the key to add to your ``github`` account::
$ cat .ssh/id_dsa.pub
Copy the public key.
Add the Key
-----------
Go to your ``github`` account,
click on "Account Settings" (i.e., the 'tools' icon); then,
click "SSH Keys" on the left side navbar.
Click "Add SSH key" in the "SSH Keys" list, enter a name for
the key, paste the key you generated, and press the "Add key"
button.
Clone the Source
----------------
To clone the Ceph source code repository, execute::
$ git clone git@github.com:ceph/ceph.git
Once ``git clone`` executes, you should have a full copy of the Ceph repository.
Clone the Submodules
--------------------
Before you can build Ceph, you must initialize and update its ``git`` submodules::
$ git submodule init
$ git submodule update
.. tip:: Make sure you maintain the latest copies of these submodules. Running ``git status`` will tell you if the submodules are out of date::
$ git status

View File

@ -0,0 +1,6 @@
==========================
Downloading a Ceph Release
==========================
As Ceph development progresses, the Ceph team releases new versions. You may download Ceph releases here:
`Ceph Releases <http://ceph.newdream.net/download/>`_
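For example, once you have downloaded a release tarball from the page above (the file name below is only a placeholder for whichever version you choose), you can unpack it and change into the source tree::
$ tar -xzf ceph-<version>.tar.gz
$ cd ceph-<version>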

View File

@ -0,0 +1,41 @@
======================================
Installing RADOS Processes and Daemons
======================================
When you start the Ceph service, the initialization process activates a series of daemons that run in the background.
The hosts in a typical RADOS cluster run at least one of three processes:
- RADOS (``ceph-osd``)
- Monitor (``ceph-mon``)
- Metadata Server (``ceph-mds``)
Each instance of a RADOS ``ceph-osd`` process performs a few essential tasks:
1. Each ``ceph-osd`` instance provides clients with an object interface to the OSD for read/write operations.
2. Each ``ceph-osd`` instance communicates and coordinates with other OSDs to store, replicate, redistribute and restore data.
3. Each ``ceph-osd`` instance communicates with monitors to retrieve and/or update the master copy of the cluster map.
Each instance of a monitor process performs a few essential tasks:
1. Each ``ceph-mon`` instance communicates with other ``ceph-mon`` instances using PAXOS to establish consensus for distributed decision making.
2. Each ``ceph-mon`` instance serves as the first point of contact for clients, and provides clients with the topology and status of the cluster.
3. Each ``ceph-mon`` instance provides RADOS instances with a master copy of the cluster map and receives updates for the master copy of the cluster map.
A metadata server (MDS) process performs a few essential tasks:
1. Each ``ceph-mds`` instance provides clients with metadata regarding the file system.
2. Each ``ceph-mds`` instance manages the file system namespace.
3. Each ``ceph-mds`` instance coordinates access to the shared OSD cluster.
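Before installing and starting these daemons, you typically enumerate them in the cluster's ``ceph.conf``. The following is only an illustrative sketch; the host names, daemon IDs, and monitor address are hypothetical, so adapt them to your cluster::
; Hypothetical hosts, IDs, and addresses; adjust for your cluster.
[mon.a]
        host = mon-host-1
        mon addr = 192.168.0.10:6789
[osd.0]
        host = osd-host-1
[mds.a]
        host = mds-host-1
Each section corresponds to one ``ceph-mon``, ``ceph-osd``, or ``ceph-mds`` daemon started on the named host.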
Installing ``ceph-osd``
=======================
<placeholder>
Installing ``ceph-mon``
=======================
<placeholder>
Installing ``ceph-mds``
=======================
<placeholder>

View File

@ -0,0 +1,23 @@
=========================
Building Ceph from Source
=========================
<placeholder>
1. :doc:`Build Prerequisites <build_from_source/build_prerequisites>`
2. Get Source Code
a. :doc:`Downloading a Ceph Release <build_from_source/downloading_a_ceph_release>`
b. :doc:`Cloning the Ceph Source Code Repository <build_from_source/cloning_the_ceph_source_code_repository>`
3. :doc:`Building Ceph<build_from_source/building_ceph>`
4. :doc:`Installing RADOS Processes and Daemons <build_from_source/installing_rados_processes_and_daemons>`
.. toctree::
:hidden:
Prerequisites <build_from_source/build_prerequisites>
Get a Release <build_from_source/downloading_a_ceph_release>
Clone the Source <build_from_source/cloning_the_ceph_source_code_repository>
Build the Source <build_from_source/building_ceph>
Installation <build_from_source/installing_rados_processes_and_daemons>

View File

@ -0,0 +1,31 @@
===========================
File System Recommendations
===========================
Ceph OSDs depend on the Extended Attributes (XATTRs) of the underlying file system for:
- Internal object state
- Snapshot metadata
- RADOS Gateway Access Control Lists (ACLs)
Ceph OSDs rely heavily upon the stability and performance of the underlying file system. The
underlying file system must provide sufficient capacity for XATTRs. File system candidates for
Ceph include B-tree and B+ tree file systems such as:
- ``btrfs``
- ``XFS``
.. warning:: XATTR limits.
The RADOS Gateway's ACL and Ceph snapshots easily surpass the 4-kilobyte limit for XATTRs in ``ext4``,
causing the ``ceph-osd`` process to crash. So ``ext4`` is a poor file system choice if
you intend to deploy the RADOS Gateway or use snapshots.
.. tip:: Use ``btrfs``
The Ceph team believes that the best performance and stability will come from ``btrfs``.
The ``btrfs`` file system has internal transactions that keep the local data set in a consistent state.
This makes OSDs based on ``btrfs`` simple to deploy, while providing scalability not
currently available from block-based file systems. The 64 KB XATTR limit in ``xfs``
is enough to accommodate RBD snapshot metadata and RADOS Gateway ACLs, so ``xfs`` is the second-choice
file system of the Ceph team. If you only plan to use RADOS and ``rbd`` without snapshots and without
``radosgw``, the ``ext4`` file system should work just fine.
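As a minimal sketch of preparing an OSD data volume (the device ``/dev/sdb`` and mount point ``/srv/osd.0`` are hypothetical placeholders; adapt them to your hardware and configuration), you might format and mount a ``btrfs`` or ``xfs`` volume as follows::
$ sudo mkfs.btrfs /dev/sdb        # hypothetical device; or: sudo mkfs.xfs /dev/sdb
$ sudo mkdir -p /srv/osd.0
$ sudo mount /dev/sdb /srv/osd.0
If you do use ``ext4`` for a RADOS-only deployment, mount it with the ``user_xattr`` option so that extended attributes are available::
$ sudo mount -o user_xattr /dev/sdb /srv/osd.0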

View File

@ -0,0 +1,42 @@
========================
Hardware Recommendations
========================
Ceph runs on commodity hardware and a Linux operating system over a TCP/IP network. The hardware
recommendations for the different processes/daemons differ considerably. OSD hosts should have ample
data storage in the form of a hard drive or a RAID. Ceph OSDs run the RADOS service, calculate data
placement with CRUSH, and maintain their own copy of the cluster map, so OSDs should have a reasonable
amount of processing power. Ceph monitors require enough disk space for the cluster map, but usually
do not encounter heavy loads, so monitors do not need to be very powerful. Ceph metadata servers
distribute their load; however, they must be capable of serving their data quickly, so metadata
servers should have strong processing capability and plenty of RAM.
.. note:: If you are not using the Ceph File System, you do not need a metadata server.
+--------------+----------------+------------------------------------+
| Process      | Criteria       | Minimum Recommended                |
+==============+================+====================================+
| ``ceph-osd`` | Processor      | 64-bit AMD-64/i386 dual-core       |
|              +----------------+------------------------------------+
|              | RAM            | 500 MB per daemon                  |
|              +----------------+------------------------------------+
|              | Volume Storage | 1 disk or RAID per daemon          |
|              +----------------+------------------------------------+
|              | Network        | 2x 1GB Ethernet NICs               |
+--------------+----------------+------------------------------------+
| ``ceph-mon`` | Processor      | 64-bit AMD-64/i386                 |
|              +----------------+------------------------------------+
|              | RAM            | 1 GB per daemon                    |
|              +----------------+------------------------------------+
|              | Disk Space     | 10 GB per daemon                   |
|              +----------------+------------------------------------+
|              | Network        | 2x 1GB Ethernet NICs               |
+--------------+----------------+------------------------------------+
| ``ceph-mds`` | Processor      | 64-bit AMD-64/i386 quad-core       |
|              +----------------+------------------------------------+
|              | RAM            | 1 GB minimum per daemon            |
|              +----------------+------------------------------------+
|              | Disk Space     | 1 MB per daemon                    |
|              +----------------+------------------------------------+
|              | Network        | 2x 1GB Ethernet NICs               |
+--------------+----------------+------------------------------------+

View File

@ -1,27 +1,20 @@
=============================
Designing a Storage Cluster
=============================
==========================
Installing Ceph Components
==========================
Storage clusters are the foundation of the Ceph file system, and they can also provide
object storage to clients via ``librados``, ``rbd`` and ``radosgw``. The following sections
object storage to ``librados``, ``rbd`` and ``radosgw``. The following sections
provide guidance for configuring a storage cluster:
1. :doc:`Hardware Requirements <hardware_requirements>`
2. :doc:`File System Requirements <file_system_requirements>`
3. :doc:`Build Prerequisites <build_prerequisites>`
4. :doc:`Download Packages <download_packages>`
5. :doc:`Downloading a Ceph Release <downloading_a_ceph_release>`
6. :doc:`Cloning the Ceph Source Code Repository <cloning_the_ceph_source_code_repository>`
7. :doc:`Building Ceph<building_ceph>`
8. :doc:`Installing RADOS Processes and Daemons <installing_rados_processes_and_daemons>`
1. :doc:`Hardware Requirements <hardware_recommendations>`
2. :doc:`File System Requirements <file_system_recommendations>`
3. :doc:`Download Packages <download_packages>`
4. :doc:`Building Ceph from Source <building_ceph_from_source>`
.. toctree::
:hidden:
Hardware <hardware_requirements>
File System Reqs <file_system_requirements>
build_prerequisites
Hardware Recs <hardware_recommendations>
File System Recs <file_system_recommendations>
Download Packages <download_packages>
Download a Release <downloading_a_ceph_release>
Clone the Source Code <cloning_the_ceph_source_code_repository>
building_ceph
Installation <installing_rados_processes_and_daemons>
Build From Source <building_ceph_from_source>

View File

@ -1,21 +1,48 @@
===================================
Get Involved in the Ceph Community!
===================================
These are exciting times in the Ceph community!
Follow the `Ceph Blog <http://ceph.newdream.net/news/>`__ to keep track of Ceph progress.
These are exciting times in the Ceph community! Get involved!
As you delve into Ceph, you may have questions or feedback for the Ceph development team.
Ceph developers are often available on the ``#ceph`` IRC channel at ``irc.oftc.net``,
particularly during daytime hours in the US Pacific Standard Time zone.
Keep in touch with developer activity by subscribing_ to the email list at ceph-devel@vger.kernel.org.
You can opt out of the email list at any time by unsubscribing_. A simple email is
all it takes! If you would like to view the archives, go to Gmane_.
You can help prepare Ceph for production by filing
and tracking bugs, and providing feature requests using
the `bug/feature tracker <http://tracker.newdream.net/projects/ceph>`__.
+-----------------+-------------------------------------------------+-----------------------------------------------+
| Channel         | Description                                     | Contact Info                                  |
+=================+=================================================+===============================================+
| **Blog**        | Check the Ceph Blog_ periodically to keep track | http://ceph.newdream.net/news                 |
|                 | of Ceph progress and important announcements.   |                                               |
+-----------------+-------------------------------------------------+-----------------------------------------------+
| **IRC**         | As you delve into Ceph, you may have questions  |                                               |
|                 | or feedback for the Ceph development team. Ceph | - **Domain:** ``irc.oftc.net``                |
|                 | developers are often available on the ``#ceph`` | - **Channel:** ``#ceph``                      |
|                 | IRC channel particularly during daytime hours   |                                               |
|                 | in the US Pacific Standard Time zone.           |                                               |
+-----------------+-------------------------------------------------+-----------------------------------------------+
| **Email List**  | Keep in touch with developer activity by        |                                               |
|                 | subscribing_ to the email list at               | - Subscribe_                                  |
|                 | ceph-devel@vger.kernel.org. You can opt out of  | - Unsubscribe_                                |
|                 | the email list at any time by unsubscribing_.   | - Gmane_                                      |
|                 | A simple email is all it takes! If you would    |                                               |
|                 | like to view the archives, go to Gmane_.        |                                               |
+-----------------+-------------------------------------------------+-----------------------------------------------+
| **Bug Tracker** | You can help keep Ceph production worthy by     | http://tracker.newdream.net/projects/ceph     |
|                 | filing and tracking bugs, and providing feature |                                               |
|                 | requests using the Bug Tracker_.                |                                               |
+-----------------+-------------------------------------------------+-----------------------------------------------+
| **Source Code** | If you would like to participate in             |                                               |
|                 | development, bug fixing, or if you just want    | - https://github.com/ceph/ceph.git            |
|                 | the very latest code for Ceph, you can get it   | - ``git clone git@github.com:ceph/ceph.git``  |
|                 | at http://github.com.                           |                                               |
+-----------------+-------------------------------------------------+-----------------------------------------------+
| **Support**     | If you have a very specific problem, an         | http://ceph.newdream.net/support              |
|                 | immediate need, or if your deployment requires  |                                               |
|                 | significant help, consider commercial support_. |                                               |
+-----------------+-------------------------------------------------+-----------------------------------------------+
.. _Subscribe: mailto:majordomo@vger.kernel.org?body=subscribe+ceph-devel
.. _Unsubscribe: mailto:majordomo@vger.kernel.org?body=unsubscribe+ceph-devel
.. _subscribing: mailto:majordomo@vger.kernel.org?body=subscribe+ceph-devel
.. _unsubscribing: mailto:majordomo@vger.kernel.org?body=unsubscribe+ceph-devel
.. _Gmane: http://news.gmane.org/gmane.comp.file-systems.ceph.devel
If you need hands-on help, `commercial support <http://ceph.newdream.net/support/>`__ is available too!
.. _Tracker: http://tracker.newdream.net/projects/ceph
.. _Blog: http://ceph.newdream.net/news
.. _support: http://ceph.newdream.net/support

View File

@ -7,16 +7,11 @@ to support storing many petabytes of data with the ability to store exabytes of
A number of factors make it challenging to build large storage systems. Three of them include:
- **Capital Expenditure**: Proprietary systems are expensive. So building scalable systems requires
using less expensive commodity hardware and a "scale out" approach to reduce build-out expenses.
- **Capital Expenditure**: Proprietary systems are expensive. So building scalable systems requires using less expensive commodity hardware and a "scale out" approach to reduce build-out expenses.
- **Ongoing Operating Expenses**: Supporting thousands of storage hosts can impose significant personnel
expenses, particularly as hardware and networking infrastructure must be installed, maintained and replaced
ongoingly.
- **Ongoing Operating Expenses**: Supporting thousands of storage hosts can impose significant personnel expenses, particularly as hardware and networking infrastructure must be installed, maintained and replaced on an ongoing basis.
- **Loss of Data or Access to Data**: Mission-critical enterprise applications cannot suffer significant
amounts of downtime, including loss of data *or access to data*. Yet, in systems with thousands of storage hosts,
hardware failure is an expectation, not an exception.
- **Loss of Data or Access to Data**: Mission-critical enterprise applications cannot suffer significant amounts of downtime, including loss of data *or access to data*. Yet, in systems with thousands of storage hosts, hardware failure is an expectation, not an exception.
Because of these and other factors, building massive storage systems requires new thinking.
@ -48,6 +43,4 @@ Ceph Metadata Servers (MDSs) are only required for Ceph FS. You can use RADOS bl
RADOS Gateway without MDSs. The MDSs dynamically adapt their behavior to the current workload.
As the size and popularity of parts of the file system hierarchy change over time, the MDSs
dynamically redistribute the file system hierarchy among the available
MDSs to balance the load to use server resources effectively.
<image>
MDSs to balance the load and use server resources effectively.

View File

@ -1,13 +1,24 @@
=============
Why use Ceph?
=============
Ceph provides an economic and technical foundation for massive scalability.
Financial constraints limit scalability. Ceph is free and open source, which means it does not require expensive
license fees or expensive updates. Ceph can run on economical commodity hardware, which reduces one economic barrier to scalability. Ceph is easy to install and administer, so it reduces expenses related to administration. Ceph supports popular and widely accepted interfaces (e.g., POSIX-compliance, Swift, Amazon S3, FUSE, etc.). So Ceph provides a compelling solution for building petabyte-to-exabyte scale storage systems.
Ceph provides an economic and technical foundation for massive scalability. Ceph is free and open source,
which means it does not require expensive license fees or expensive updates. Ceph can run on economical
commodity hardware, which reduces another economic barrier to scalability. Ceph is easy to install and administer,
so it reduces expenses related to administration. Ceph supports popular and widely accepted interfaces in a
unified storage system (e.g., Amazon S3, Swift, FUSE, block devices, POSIX-compliant shells, etc.), so you don't
need to build out a different storage system for each storage interface you support.
Technical and personnel constraints also limit scalability. The performance profile of highly scaled systems
can very substantially. With intelligent load balancing and adaptive metadata servers that re-balance the file system dynamically, Ceph alleviates the administrative burden of optimizing performance. Additionally, because Ceph provides for data replication, Ceph is fault tolerant. Ceph administrators can simply replace a failed host by subtituting new hardware without having to rely on complex fail-over scenarios. With POSIX semantics for Unix/Linux-based operating systems, popular interfaces like Swift or Amazon S3, and advanced features like directory-level snapshots, system administrators can deploy enterprise applications on Ceph, and provide those applications with a long-term economical solution for scalable persistence.
can vary substantially. Ceph relieves system administrators of the complex burden of manual performance optimization
by utilizing the storage system's computing resources to balance loads intelligently and rebalance the file system dynamically.
Ceph replicates data automatically so that hardware failures do not result in data loss or cascading load spikes.
Ceph is fault tolerant, so complex fail-over scenarios are unnecessary. Ceph administrators can simply replace a failed host
with new hardware.
With POSIX semantics for Unix/Linux-based operating systems, popular interfaces like Amazon S3 or Swift, block devices
and advanced features like directory-level snapshots, you can deploy enterprise applications on Ceph while
providing them with a long-term economical solution for scalable storage. While Ceph is open source, commercial
support is available too! So Ceph provides a compelling solution for building petabyte-to-exabyte scale storage systems.
Reasons to use Ceph include: