diff --git a/trunk/web/index.body b/trunk/web/index.body
index d2ec4e5a72e..2027ff5e655 100644
--- a/trunk/web/index.body
+++ b/trunk/web/index.body
@@ -12,8 +12,8 @@ We are actively seeking experienced C/C++ and Linux kernel developers who are in
 Ceph is a distributed network file system designed to provide excellent performance, reliability, and scalability. Ceph fills two significant gaps in the array of currently available file systems:
+A thorough overview of the system architecture can be found in this paper, which appeared at OSDI '06.
+
 A Ceph installation consists of three main elements: clients, metadata servers (MDSs), and object storage devices (OSDs). Ceph clients can be either individual processes that link directly to a user-space client library, or hosts that mount the Ceph file system natively (à la NFS). OSDs are servers with attached disks and are responsible for storing data.
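To make the first client mode concrete, below is a minimal sketch of a process that links directly against the user-space client library. It assumes the libcephfs C interface (ceph_create, ceph_mount, and friends) and a cluster configuration at /etc/ceph/ceph.conf; the file name and error-free flow are illustrative, not canonical client code.

    /*
     * Sketch: a process using the user-space client library (libcephfs)
     * instead of a kernel mount. Error handling is elided for brevity.
     */
    #include <cephfs/libcephfs.h>
    #include <fcntl.h>
    #include <string.h>

    int main()
    {
        struct ceph_mount_info *cmount;

        ceph_create(&cmount, NULL);                         /* client handle */
        ceph_conf_read_file(cmount, "/etc/ceph/ceph.conf"); /* assumed config path */
        ceph_mount(cmount, "/");                            /* attach to the FS root */

        /* File I/O goes through the library, not the kernel VFS. */
        int fd = ceph_open(cmount, "/hello.txt", O_CREAT | O_WRONLY, 0644);
        const char msg[] = "hello, ceph\n";
        ceph_write(cmount, fd, msg, strlen(msg), 0);
        ceph_close(cmount, fd);

        ceph_unmount(cmount);
        ceph_shutdown(cmount);
        return 0;
    }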
 The Ceph architecture is based on three key design principles that set it apart from traditional file systems.
@@ -29,7 +31,7 @@ Ceph fills this gap by providing a scalable, reliable file system that can seaml
-Both file data and file system metadata are striped over multiple objects, each of which is replicated on multiple OSDs for reliability. A special-purpose mapping function called CRUSH is used to determine which OSDs store which objects. CRUSH resembles a hash function in that this mapping is pseudo-random (it appears random, but is actually deterministic). This provides load balancing across all devices that is relatively invulnerable to "hot spots," while Ceph's policy of redistributing data ensures that workload remains balanced and all devices are equally utilized even when the storage cluster is expanded or OSDs are removed.
+Both file data and file system metadata are striped over multiple objects, each of which is replicated on multiple OSDs for reliability. A special-purpose mapping function called CRUSH is used to determine which OSDs store which objects. CRUSH resembles a hash function in that this mapping is pseudo-random (it appears random, but is actually deterministic). This provides load balancing across all devices that is relatively invulnerable to "hot spots," while Ceph's policy of redistributing data ensures that workload remains balanced and all devices are equally utilized even when the storage cluster is expanded or OSDs are removed.
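The pseudo-random-but-deterministic property is the crux: any party that knows the cluster layout can compute an object's location without consulting a central allocation table. The sketch below illustrates the idea with a plain hash standing in for the real CRUSH algorithm (which descends a weighted hierarchy of buckets such as rows, racks, and hosts); place_object, the object name, and the cluster size are invented for illustration.

    #include <algorithm>
    #include <cstdio>
    #include <functional>
    #include <string>
    #include <vector>

    // Toy stand-in for CRUSH: deterministic, hash-based replica placement.
    // It only demonstrates the pseudo-random mapping property described
    // above, not the real algorithm. Assumes num_replicas <= num_osds.
    std::vector<int> place_object(const std::string &oid,
                                  int num_osds, int num_replicas)
    {
        std::vector<int> osds;
        for (int attempt = 0; (int)osds.size() < num_replicas; ++attempt) {
            // Hash (object id, attempt) to an OSD. The same inputs always
            // yield the same OSD, so the mapping looks random but is fully
            // reproducible by clients, MDSs, and OSDs alike.
            size_t h = std::hash<std::string>{}(oid + "#" + std::to_string(attempt));
            int osd = (int)(h % num_osds);
            // Retry on collision so each replica lands on a distinct OSD.
            if (std::find(osds.begin(), osds.end(), osd) == osds.end())
                osds.push_back(osd);
        }
        return osds;
    }

    int main()
    {
        // One striped object of some inode maps to 3 distinct OSDs out of 64.
        for (int osd : place_object("inode123.00000004", 64, 3))
            std::printf("replica on osd%d\n", osd);
        return 0;
    }

Because the mapping is a pure function of its inputs, growing or shrinking the cluster only requires recomputing the map and migrating the objects whose placement changed, which is how Ceph keeps all devices evenly utilized as OSDs come and go.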