============================
 Frequently Asked Questions
============================

These questions have been frequently asked on the ceph-users and ceph-devel mailing lists, in the IRC channel, and on the `Ceph.com`_ blog.

.. _Ceph.com: http://ceph.com

Is Ceph Production-Quality?
===========================

Ceph's object store (RADOS) is production ready. Large-scale storage systems (i.e., petabytes of data) use Ceph's RESTful Object Gateway (RGW), which provides APIs compatible with Amazon's S3 and OpenStack's Swift. Many deployments also use the Ceph Block Device (RBD), including deployments of OpenStack and CloudStack. `Inktank`_ provides commercial support for the Ceph object store, the Object Gateway, block devices, and CephFS running a single metadata server.

The CephFS POSIX-compliant filesystem is functionally complete and has been evaluated by a large community of users. There are production systems using CephFS with a single metadata server. The Ceph community is actively testing clusters with multiple metadata servers for quality assurance. Once CephFS passes QA muster when running with multiple metadata servers, `Inktank`_ will provide commercial support for CephFS with multiple metadata servers, too.

.. _Inktank: http://inktank.com

What Kind of Hardware Does Ceph Require?
========================================

Ceph runs on commodity hardware. A typical configuration involves a rack-mountable server with a baseboard management controller, multiple processors, multiple drives, and multiple NICs. There are no requirements for proprietary hardware. For details, see `Ceph Hardware Recommendations`_.

What Kind of OS Does Ceph Require?
==================================

Ceph runs on Linux for both the client and the server side. Ceph runs on Debian/Ubuntu distributions, which you can install from `APT packages`_. Ceph also runs on Fedora and Enterprise Linux derivatives (RHEL, CentOS) using `RPM packages`_. You can also download Ceph source `tarballs`_ and build Ceph for your distribution. See `Installation`_ for details.

.. _try-ceph:

How Can I Give Ceph a Try?
==========================

Follow our `Quick Start`_ guides. They will get you up and running quickly without requiring deeper knowledge of Ceph. Our `Quick Start`_ guides will also help you avoid a few issues related to limited deployments. If you choose to stray from the Quick Starts, there are a few things you need to know.

We recommend using at least two hosts and a recent Linux kernel. In older kernels, Ceph can deadlock if you try to mount CephFS or RBD client services on the same host that runs your test Ceph cluster. This is not a Ceph-related issue. It's related to memory pressure and the kernel's need to free memory. Recent kernels with up-to-date ``glibc`` and ``syncfs(2)`` reduce this issue considerably. However, only a memory pool large enough to handle incoming requests guarantees that the deadlock will not occur. When you run Ceph clients on a Ceph cluster machine, loopback NFS can experience a similar problem related to buffer cache management in the kernel. You can avoid these scenarios entirely by using a separate client host, which is more realistic for deployment scenarios anyway.

We recommend using at least two OSDs with at least two replicas of the data. OSDs report other OSDs to the monitor, and also interact with other OSDs when replicating data. If you have only one OSD, a second OSD cannot check its heartbeat. Also, if an OSD expects another OSD to tell it which placement groups it should have, the lack of another OSD prevents this from occurring, so a placement group can remain stuck "stale" forever. These are not likely production issues.
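For example, after following a Quick Start you might check from a separate client host that the cluster is healthy and that the pool you are testing keeps two replicas. This is a minimal sketch; the default ``data`` pool is used here only as an illustration::

    ceph health
    ceph osd pool set data size 2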
Finally, the `Quick Start`_ guides are a way to get you up and running quickly. To build performant systems, you'll need a drive for each OSD, and you will likely benefit from writing the OSD journal to a separate drive from the OSD data.

How Many OSDs Can I Run per Host?
=================================

Theoretically, a host can run as many OSDs as the hardware can support. Many vendors market storage hosts that have large numbers of drives (e.g., 36 drives) capable of supporting many OSDs. We don't recommend a huge number of OSDs per host though. Ceph was designed to distribute the load across what we call "failure domains." See `CRUSH Maps`_ for details.

At the petabyte scale, hardware failure is an expectation, not a freak occurrence. Failure domains include datacenters, rooms, rows, racks, and network switches. In a single host, power supplies, motherboards, NICs, and drives are all potential points of failure. If you place a large percentage of your OSDs on a single host and that host fails, a large percentage of your OSDs will fail too. Having too large a percentage of a cluster's OSDs on a single host can cause disruptive data migration and long recovery times during host failures. We encourage diversifying the risk across failure domains, which includes making reasonable tradeoffs regarding the number of OSDs per host.

Can I Use the Same Drive for Multiple OSDs?
===========================================

Yes. **Please don't do this!** Except for initial evaluations of Ceph, we do not recommend running multiple OSDs on the same drive. In fact, we recommend **exactly** the opposite: only run one OSD per drive. For better performance, run journals on a separate drive from the OSD drive, and consider using SSDs for journals. Run operating systems on a separate drive from any drive storing data for Ceph.

Storage drives are a performance bottleneck. Total throughput is an important consideration, and sequential reads and writes are important considerations too. When you run multiple OSDs per drive, you split up the total throughput between competing OSDs, which can slow performance considerably.

Why Do You Recommend One Drive Per OSD?
=======================================

Ceph OSD performance is one of the most common requests for assistance, and running an OS, a journal, and an OSD on the same disk is frequently the impediment to high performance. Total throughput and simultaneous reads and writes are a major bottleneck. If you journal data, run an OS, or run multiple OSDs on the same drive, you will very likely see performance degrade significantly--especially under high loads.

Running multiple OSDs on a single drive is fine for evaluation purposes. We even encourage that in our `5-minute quick start`_. However, just because it works does NOT mean that it will provide acceptable performance in an operational cluster.
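As a rough sketch of the recommended layout, a ``ceph.conf`` entry for two OSDs on one host might look like the following. The host name, data paths, and journal devices are placeholders, not a prescribed configuration::

    [osd.0]
        host = osd-host-1
        ; data directory mounted on its own dedicated drive
        osd data = /var/lib/ceph/osd/ceph-0
        ; journal on a partition of a separate (ideally SSD) device
        osd journal = /dev/sdg1

    [osd.1]
        host = osd-host-1
        osd data = /var/lib/ceph/osd/ceph-1
        osd journal = /dev/sdg2

Each ``osd data`` directory sits on its own drive, while the journal partitions share a separate device so journal writes do not compete with data writes.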
What Underlying Filesystem Do You Recommend?
============================================

Currently, we recommend using XFS as the underlying filesystem for OSD drives. We think ``btrfs`` will become the optimal filesystem. However, we still encounter enough issues that we do not recommend it for production systems yet. See `Filesystem Recommendations`_ for details.

How Does Ceph Ensure Data Integrity Across Replicas?
====================================================

Ceph periodically scrubs placement groups to ensure that they contain the same information. Low-level or deep scrubbing reads the object data in each replica of the placement group to ensure that the data is identical across replicas.

How Many NICs Per Host?
=======================

You can use one :abbr:`NIC (Network Interface Card)` per machine. We recommend a minimum of two NICs: one for a public (front-side) network and one for a cluster (back-side) network. When you write an object from the client to the primary OSD, that single write only accounts for the bandwidth consumed during one leg of the transaction. If you store multiple copies (usually 2-3 copies in a typical cluster), the primary OSD makes a write request to your secondary and tertiary OSDs. So your back-end network traffic can dwarf your front-end network traffic on writes very easily.
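If you do run two networks, you can tell the Ceph daemons which subnet serves which role in ``ceph.conf``. The addresses below are examples only; substitute your own front-side and back-side networks::

    [global]
        public network = 192.168.0.0/24
        cluster network = 10.0.0.0/24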
What Kind of Network Throughput Do I Need?
==========================================

Network throughput requirements depend on your load. We recommend starting with a minimum of 1Gbps Ethernet. 10Gbps Ethernet is more expensive, but often comes with some additional advantages, including virtual LANs (VLANs). VLANs can dramatically reduce the cabling requirements when you run front-side, back-side, and other special-purpose networks. The number of object copies (replicas) you create is an important factor, because replication becomes a larger network load than the initial write itself when making multiple copies (e.g., triplicate). Network traffic between Ceph and a cloud-based system such as OpenStack or CloudStack may also become a factor. Some deployments even run a separate NIC for management APIs. Finally, load spikes are a factor too. At certain times of the day, week, or month you may see load spikes. You must plan your network capacity to meet those load spikes in order for Ceph to perform well. This means that excess capacity may remain idle or unused during low-load times.

Can Ceph Support Multiple Data Centers?
=======================================

Yes, but with safeguards to ensure data safety. When a client writes data to Ceph, the primary OSD will not acknowledge the write to the client until the secondary OSDs have written the replicas synchronously. See `How Ceph Scales`_ for details. The Ceph community is working to ensure that OSD/monitor heartbeats and peering processes operate effectively with the additional latency that may occur when deploying hardware in different geographic locations. See `Monitor/OSD Interaction`_ for details.

If your data centers have dedicated bandwidth and low latency, you can distribute your cluster across data centers easily. If you use a WAN over the Internet, you may need to configure Ceph to ensure effective peering, heartbeat acknowledgement, and writes so that the cluster performs well with the additional WAN latency.

The Ceph community is working on an asynchronous write capability via the Ceph Object Gateway (RGW), which will provide an eventually-consistent copy of data for disaster recovery purposes. This will work only with data read and written via the Object Gateway. Work is also starting on a similar capability for Ceph Block Devices, which are managed via the various cloud stacks.

How Does Ceph Authenticate Users?
=================================

Ceph provides an authentication framework called ``cephx`` that operates in a manner similar to Kerberos. The principal difference is that Ceph's authentication system is distributed too, so that it doesn't constitute a single point of failure. For details, see `Ceph Authentication & Authorization`_.
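For example, you would typically create a key for each client and grant it capabilities on the monitors and OSDs, then distribute that key to the client in a keyring file. This is a sketch only; the user name, pool, and capability string are illustrative, and the exact capability syntax varies by release::

    ceph auth get-or-create client.foo mon 'allow r' osd 'allow rw pool=data'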
Does Ceph Authentication Provide Multi-tenancy?
===============================================

Ceph provides authentication at the `pool`_ level, which may be sufficient for multi-tenancy in limited cases. Ceph plans on developing authentication namespaces within pools in future releases, so that Ceph is well-suited for multi-tenancy within pools.

Can Ceph Use Other Multi-tenancy Modules?
=========================================

The Bobtail release of Ceph integrates the Object Gateway with OpenStack's Keystone. See `Keystone Integration`_ for details.

.. _Keystone Integration: ../radosgw/config#integrating-with-openstack-keystone

Does Ceph Enforce Quotas?
=========================

Currently, Ceph doesn't provide enforced storage quotas. The Ceph community has discussed enforcing user quotas within CephFS.

Does Ceph Track Per-User Usage?
===============================

The CephFS filesystem provides user-based usage tracking on a subtree basis. The RADOS Gateway also provides detailed per-user usage tracking. RBD and the underlying object store do not track per-user statistics. The underlying object store provides storage capacity utilization statistics.

Does Ceph Provide Billing?
==========================

Usage information is available via a RESTful API for the Ceph Object Gateway, which can be integrated into billing systems. Usage data at the RADOS pool level is not currently available, but it is on the roadmap.

Can Ceph Export a Filesystem via NFS or Samba/CIFS?
===================================================

Ceph doesn't export CephFS via NFS or Samba. However, you can use a gateway to serve a CephFS filesystem to NFS or Samba clients.

Can I Access Ceph via a Hypervisor?
===================================

Currently, the `QEMU`_ hypervisor can interact with the Ceph `block device`_. The :abbr:`KVM (Kernel Virtual Machine)` `module`_ and the ``librbd`` library allow you to use QEMU with Ceph. Most Ceph deployments use the ``librbd`` library. Cloud solutions like `OpenStack`_ and `CloudStack`_ interact with `libvirt`_ and QEMU as a means of integrating with Ceph. The Ceph community is also looking to support the Xen hypervisor in a future release. There is interest in support for VMware, but there is no deep-level integration between VMware and Ceph as yet.

Can Block, CephFS, and Gateway Clients Share Data?
==================================================

For the most part, no. You cannot write data to Ceph using RBD and access the same data via CephFS, for example. You cannot write data with the RADOS Gateway and read it with RBD. However, you can write data with the RADOS Gateway S3-compatible API and read the same data using the RADOS Gateway Swift-compatible API. RBD, CephFS, and the RADOS Gateway each have their own namespace. The way they store data differs significantly enough that it isn't possible to use the clients interchangeably. However, you can use all three types of clients, as well as clients you develop yourself via ``librados``, simultaneously on the same cluster.

Which Ceph Clients Support Striping?
====================================

Ceph clients--RBD, CephFS, and the RADOS Gateway--provide striping capability. For details on striping, see `Striping`_.

What Programming Languages can Interact with the Object Store?
==============================================================

Ceph's ``librados`` is written in the C programming language. There are interfaces for other languages, including:

- C++
- Java
- PHP
- Python
- Ruby

Can I Develop a Client With Another Language?
=============================================

Ceph does not have many native bindings for ``librados`` at this time. If you'd like to fork Ceph and build a wrapper for the C or C++ versions of ``librados``, please check out the `Ceph repository`_. You can also work from other languages that can call the native ``librados`` bindings (e.g., you can access the C/C++ bindings from within Perl).

Do Ceph Clients Run on Windows?
===============================

No. There are no immediate plans to support Windows clients at this time. However, you may be able to emulate a Linux environment on a Windows host. For example, Cygwin may make it feasible to use ``librados`` in an emulated environment.

How can I add a question to this list?
======================================

If you'd like to add a question to this list (hopefully with an accompanying answer!), you can find the source for this page in the ``doc/`` directory of our main git repository: `https://github.com/ceph/ceph/blob/master/doc/faq.rst`_

We use Sphinx to manage our documentation, and this page is generated from reStructuredText source. See the section on Building Ceph Documentation for the build procedure.

.. _Ceph Hardware Recommendations: ../install/hardware-recommendations
.. _APT packages: ../install/debian
.. _RPM packages: ../install/rpm
.. _tarballs: ../install/get-tarballs
.. _Installation: ../install
.. _CRUSH Maps: ../rados/operations/crush-map
.. _5-minute quick start: ../start/quick-start
.. _How Ceph Scales: ../architecture#how-ceph-scales
.. _Monitor/OSD Interaction: ../rados/configuration/mon-osd-interaction
.. _Ceph Authentication & Authorization: ../rados/operations/auth-intro
.. _Ceph repository: https://github.com/ceph/ceph
.. _QEMU: ../rbd/qemu-rbd
.. _block device: ../rbd
.. _module: ../rbd/rbd-ko
.. _libvirt: ../rbd/libvirt
.. _OpenStack: ../rbd/rbd-openstack
.. _CloudStack: ../rbd/rbd-cloudstack
.. _pool: ../rados/operations/pools
.. _Striping: ../architecture#how-ceph-clients-stripe-data
.. _https://github.com/ceph/ceph/blob/master/doc/faq.rst: https://github.com/ceph/ceph/blob/master/doc/faq.rst
.. _Filesystem Recommendations: ../rados/configuration/filesystem-recommendations
.. _Quick Start: ../start