================== Configuring Ceph ================== When you start the Ceph service, the initialization process activates a series of daemons that run in the background. The hosts in a typical Ceph cluster run at least one of four daemons: - Object Storage Device (``ceph-osd``) - Monitor (``ceph-mon``) - Metadata Server (``ceph-mds``) - Ceph Gateway (``radosgw``) For your convenience, each daemon has a series of default values (*i.e.*, many are set by ``ceph/src/common/config_opts.h``). You may override these settings with a Ceph configuration file. .. _ceph-conf-file: The Configuration File ====================== When you start a Ceph cluster, each daemon looks for a Ceph configuration file (i.e., ``ceph.conf`` by default) that provides the cluster's configuration settings. For manual deployments, you need to create a Ceph configuration file. For third party tools that create configuration files for you (*e.g.*, Chef), you may use the information contained herein as a reference. The Ceph Configuration file defines: - Authentication settings - Cluster membership - Host names - Host addresses - Paths to keyrings - Paths to journals - Paths to data - Other runtime options The default ``ceph.conf`` locations in sequential order include: #. ``$CEPH_CONF`` (*i.e.,* the path following the ``$CEPH_CONF`` environment variable) #. ``-c path/path`` (*i.e.,* the ``-c`` command line argument) #. ``/etc/ceph/ceph.conf`` #. ``~/.ceph/config`` #. ``./ceph.conf`` (*i.e.,* in the current working directory) The Ceph configuration file uses an *ini* style syntax. You can add comments by preceding comments with a semi-colon (;) or a pound sign (#). For example: .. code-block:: ini # <--A number (#) sign precedes a comment. ; A comment may be anything. # Comments always follow a semi-colon (;) or a pound (#) on each line. # The end of the line terminates a comment. # We recommend that you provide comments in your configuration file(s). .. _ceph-conf-settings: Config Sections =============== The configuration file can configure all daemons in a cluster, or all daemons of a particular type. To configure a series of daemons, the settings must be included under the processes that will receive the configuration as follows: ``[global]`` :Description: Settings under ``[global]`` affect all daemons in a Ceph cluster. :Example: ``auth supported = cephx`` ``[osd]`` :Description: Settings under ``[osd]`` affect all ``ceph-osd`` daemons in the cluster. :Example: ``osd journal size = 1000`` ``[mon]`` :Description: Settings under ``[mon]`` affect all ``ceph-mon`` daemons in the cluster. :Example: ``mon addr = 10.0.0.101:6789`` ``[mds]`` :Description: Settings under ``[mds]`` affect all ``ceph-mds`` daemons in the cluster. :Example: ``host = myserver01`` ``[client]`` :Description: Settings under ``[client]`` affect all clients (e.g., mounted CephFS filesystems, mounted block devices, etc.) :Example: ``log file = /var/log/ceph/radosgw.log`` Global settings affect all instances of all daemon in the cluster. Use the ``[global]`` setting for values that are common for all daemons in the cluster. You can override each ``[global]`` setting by: #. Changing the setting in a particular process type (*e.g.,* ``[osd]``, ``[mon]``, ``[mds]`` ). #. Changing the setting in a particular process (*e.g.,* ``[osd.1]`` ) Overriding a global setting affects all child processes, except those that you specifically override. A typical global setting involves activating authentication. For example: .. code-block:: ini [global] #Enable authentication between hosts within the cluster. #v 0.54 and earlier auth supported = cephx #v 0.55 and after auth cluster required = cephx auth service required = cephx auth client required = cephx You can specify settings that apply to a particular type of daemon. When you specify settings under ``[osd]``, ``[mon]`` or ``[mds]`` without specifying a particular instance, the setting will apply to all OSDs, monitors or metadata daemons respectively. You may specify settings for particular instances of a daemon. You may specify an instance by entering its type, delimited by a period (.) and by the instance ID. The instance ID for an OSD is always numeric, but it may be alphanumeric for monitors and metadata servers. .. code-block:: ini [osd.1] # settings affect osd.1 only. [mon.a] # settings affect mon.a only. [mds.b] # settings affect mds.b only. .. _ceph-metavariables: Metavariables ============= Metavariables simplify cluster configuration dramatically. When a metavariable is set in a configuration value, Ceph expands the metavariable into a concrete value. Metavariables are very powerful when used within the ``[global]``, ``[osd]``, ``[mon]`` or ``[mds]`` sections of your configuration file. Ceph metavariables are similar to Bash shell expansion. Ceph supports the following metavariables: ``$cluster`` :Description: Expands to the cluster name. Useful when running multiple clusters on the same hardware. :Example: ``/etc/ceph/$cluster.keyring`` :Default: ``ceph`` ``$type`` :Description: Expands to one of ``mds``, ``osd``, or ``mon``, depending on the type of the current daemon. :Example: ``/var/lib/ceph/$type`` ``$id`` :Description: Expands to the daemon identifier. For ``osd.0``, this would be ``0``; for ``mds.a``, it would be ``a``. :Example: ``/var/lib/ceph/$type/$cluster-$id`` ``$host`` :Description: Expands to the host name of the current daemon. ``$name`` :Description: Expands to ``$type.$id``. :Example: ``/var/run/ceph/$cluster-$name.asok`` .. _ceph-conf-common-settings: Common Settings =============== The `Hardware Recommendations`_ section provides some hardware guidelines for configuring the cluster. It is possible for a single host to run multiple daemons. For example, a single host with multiple disks or RAIDs may run one ``ceph-osd`` for each disk or RAID. Additionally, a host may run both a ``ceph-mon`` and an ``ceph-osd`` daemon on the same host. Ideally, you will have a host for a particular type of process. For example, one host may run ``ceph-osd`` daemons, another host may run a ``ceph-mds`` daemon, and other hosts may run ``ceph-mon`` daemons. Each host has a name identified by the ``host`` setting. Monitors also specify a network address and port (i.e., domain name or IP address) identified by the ``addr`` setting. A basic configuration file will typically specify only minimal settings for each instance of a daemon. For example: .. code-block:: ini [mon.a] host = hostName mon addr = 150.140.130.120:6789 [osd.0] host = hostName .. important:: The ``host`` setting is the short name of the host (i.e., not an fqdn). It is **NOT** an IP address either. Enter ``hostname -s`` on the command line to retrieve the name of the host. Also, this setting is **ONLY** for ``mkcephfs`` and manual deployment. It **MUST NOT** be used with ``chef`` or ``ceph-deploy``. .. _Hardware Recommendations: ../../../install/hardware-recommendations .. _ceph-network-config: Networks ======== Monitors listen on port 6789 by default, while metadata servers and OSDs listen on the first available port beginning at 6800. Ensure that you open port 6789 on hosts that run a monitor daemon, and open one port beginning at port 6800 for each OSD or metadata server that runs on the host. Ports are host-specific, so you don't need to open any more ports than the number of daemons running on that host, other than potentially a few spares. You may consider opening a few additional ports in case a daemon fails and restarts without letting go of the port such that the restarted daemon binds to a new port. If you set up separate public and cluster networks, you may need to make entries for each network. For example:: iptables -A INPUT -m multiport -p tcp -s {ip-address}/{netmask} --dports 6789,6800:6810 -j ACCEPT In our `hardware recommendations`_ section, we recommend having at least two NIC cards, because Ceph can support two networks: a public (front-side) network, and a cluster (back-side) network. Ceph functions just fine with a public network only. You only need to specify the public and cluster network settings if you use both public and cluster networks. There are several reasons to consider operating two separate networks. First, OSDs handle data replication for the clients. When OSDs replicate data more than once, the network load between OSDs easily dwarfs the network load between clients and the Ceph cluster. This can introduce latency and create a performance problem. Second, while most people are generally civil, a very tiny segment of the population likes to engage in what's known as a Denial of Service (DoS) attack. When traffic between OSDs gets disrupted, placement groups may no longer reflect an ``active + clean`` state, which may prevent users from reading and writing data. A great way to defeat this type of attack is to maintain a completely separate cluster network that doesn't connect directly to the internet. To configure the networks, add the following options to the ``[global]`` section of your Ceph configuration file. .. code-block:: ini [global] public network = {public-network-ip-address/netmask} cluster network = {enter cluster-network-ip-address/netmask} To configure Ceph hosts to use the networks, you should set the following options in the daemon instance sections of your ``ceph.conf`` file. .. code-block:: ini [osd.0] public addr = {host-public-ip-address} cluster addr = {host-cluster-ip-address} .. _hardware recommendations: ../../../install/hardware-recommendations Authentication ============== .. versionadded:: 0.55 For Bobtail (v 0.56) and beyond, you should expressly enable or disable authentication in the ``[global]`` section of your Ceph configuration file. :: auth cluster required = cephx auth service required = cephx auth client required = cephx See `Cephx Authentication`_ for additional details. .. important:: When upgrading, we recommend expressly disabling authentication first, then perform the upgrade. Once the upgrade is complete, re-enable authentication. .. _Cephx Authentication: ../../operations/authentication .. _ceph-monitor-config: Monitors ======== Ceph production clusters typically deploy with a minimum 3 monitors to ensure high availability should a monitor instance crash. An odd number of monitors (3) ensures that the Paxos algorithm can determine which version of the cluster map is the most recent from a quorum of monitors. .. note:: You may deploy Ceph with a single monitor, but if the instance fails, the lack of a monitor may interrupt data service availability. Ceph monitors typically listen on port ``6789``. For example: .. code-block:: ini [mon.a] host = hostName mon addr = 150.140.130.120:6789 By default, Ceph expects that you will store a monitor's data under the following path:: /var/lib/ceph/mon/$cluster-$id You must create the corresponding directory yourself. With metavariables fully expressed and a cluster named "ceph", the foregoing directory would evaluate to:: /var/lib/ceph/mon/ceph-a You may override this path using the ``mon data`` setting. We don't recommend changing the default location. Create the default directory on your new monitor host. :: ssh {new-mon-host} sudo mkdir /var/lib/ceph/mon/ceph-{mon-letter} .. _ceph-osd-config: OSDs ==== Ceph production clusters typically deploy OSDs where one host has one OSD daemon running a filestore on one data disk. A typical deployment specifies a journal size and whether the file store's extended attributes (XATTRs) use an object map (i.e., when running on the ``ext4`` filesystem). For example: .. code-block:: ini [osd] osd journal size = 10000 filestore xattr use omap = true #enables the object map. Only if running ext4. [osd.0] host = {hostname} By default, Ceph expects that you will store an OSD's data with the following path:: /var/lib/ceph/osd/$cluster-$id You must create the corresponding directory yourself. With metavariables fully expressed and a cluster named "ceph", the foregoing directory would evaluate to:: /var/lib/ceph/osd/ceph-0 You may override this path using the ``osd data`` setting. We don't recommend changing the default location. Create the default directory on your new OSD host. :: ssh {new-osd-host} sudo mkdir /var/lib/ceph/osd/ceph-{osd-number} The ``osd data`` path ideally leads to a mount point with a hard disk that is separate from the hard disk storing and running the operating system and daemons. If the OSD is for a disk other than the OS disk, prepare it for use with Ceph, and mount it to the directory you just created:: ssh {new-osd-host} sudo mkfs -t {fstype} /dev/{disk} sudo mount -o user_xattr /dev/{hdd} /var/lib/ceph/osd/ceph-{osd-number} We recommend using the ``xfs`` file system or the ``btrfs`` file system when running :command:mkfs. By default, Ceph expects that you will store an OSDs journal with the following path:: /var/lib/ceph/osd/$cluster-$id/journal Without performance optimization, Ceph stores the journal on the same disk as the OSDs data. An OSD optimized for performance may use a separate disk to store journal data (e.g., a solid state drive delivers high performance journaling). Ceph's default ``osd journal size`` is 0, so you will need to set this in your ``ceph.conf`` file. A journal size should find the product of the ``filestore max sync interval`` and the expected throughput, and multiply the product by two (2):: osd journal size = {2 * (expected throughput * filestore max sync interval)} The expected throughput number should include the expected disk throughput (i.e., sustained data transfer rate), and network throughput. For example, a 7200 RPM disk will likely have approximately 100 MB/s. Taking the ``min()`` of the disk and network throughput should provide a reasonable expected throughput. Some users just start off with a 10GB journal size. For example:: osd journal size = 10000 .. _ceph-logging-and-debugging: Logs / Debugging ================ Ceph is still on the leading edge, so you may encounter situations that require modifying logging output and using Ceph's debugging. To activate Ceph's debugging output (*i.e.*, ``dout()``), you may add ``debug`` settings to your configuration. Ceph's logging levels operate on a scale of 1 to 20, where 1 is terse and 20 is verbose. .. note:: See `Debugging and Logging`_ for details on log rotation. .. _Debugging and Logging: ../../operations/debug Subsystems common to each daemon may be set under ``[global]`` in your configuration file. Subsystems for particular daemons are set under the daemon section in your configuration file (*e.g.*, ``[mon]``, ``[osd]``, ``[mds]``). For example:: [global] debug ms = 1 [mon] debug mon = 20 debug paxos = 20 debug auth = 20 [osd] debug osd = 20 debug filestore = 20 debug journal = 20 debug monc = 20 [mds] debug mds = 20 debug mds balancer = 20 debug mds log = 20 debug mds migrator = 20 When your system is running well, choose appropriate logging levels and remove unnecessary debugging settings to ensure your cluster runs optimally. Logging debug output messages is relatively slow, and a waste of resources when operating your cluster. .. tip: When debug output slows down your system, the latency can hide race conditions. Each subsystem has a logging level for its output logs, and for its logs in-memory. You may set different values for each of these subsystems by setting a log file level and a memory level for debug logging. For example:: debug {subsystem} {log-level}/{memory-level} #for example debug mds log 1/20 +--------------------+-----------+--------------+ | Subsystem | Log Level | Memory Level | +====================+===========+==============+ | ``default`` | 0 | 5 | +--------------------+-----------+--------------+ | ``lockdep`` | 0 | 5 | +--------------------+-----------+--------------+ | ``context`` | 0 | 5 | +--------------------+-----------+--------------+ | ``crush`` | 1 | 5 | +--------------------+-----------+--------------+ | ``mds`` | 1 | 5 | +--------------------+-----------+--------------+ | ``mds balancer`` | 1 | 5 | +--------------------+-----------+--------------+ | ``mds locker`` | 1 | 5 | +--------------------+-----------+--------------+ | ``mds log`` | 1 | 5 | +--------------------+-----------+--------------+ | ``mds log expire`` | 1 | 5 | +--------------------+-----------+--------------+ | ``mds migrator`` | 1 | 5 | +--------------------+-----------+--------------+ | ``buffer`` | 0 | 0 | +--------------------+-----------+--------------+ | ``timer`` | 0 | 5 | +--------------------+-----------+--------------+ | ``filer`` | 0 | 5 | +--------------------+-----------+--------------+ | ``objecter`` | 0 | 0 | +--------------------+-----------+--------------+ | ``rados`` | 0 | 5 | +--------------------+-----------+--------------+ | ``rbd`` | 0 | 5 | +--------------------+-----------+--------------+ | ``journaler`` | 0 | 5 | +--------------------+-----------+--------------+ | ``objectcacher`` | 0 | 5 | +--------------------+-----------+--------------+ | ``client`` | 0 | 5 | +--------------------+-----------+--------------+ | ``osd`` | 0 | 5 | +--------------------+-----------+--------------+ | ``optracker`` | 0 | 5 | +--------------------+-----------+--------------+ | ``objclass`` | 0 | 5 | +--------------------+-----------+--------------+ | ``filestore`` | 1 | 5 | +--------------------+-----------+--------------+ | ``journal`` | 1 | 5 | +--------------------+-----------+--------------+ | ``ms`` | 0 | 5 | +--------------------+-----------+--------------+ | ``mon`` | 1 | 5 | +--------------------+-----------+--------------+ | ``monc`` | 0 | 5 | +--------------------+-----------+--------------+ | ``paxos`` | 0 | 5 | +--------------------+-----------+--------------+ | ``tp`` | 0 | 5 | +--------------------+-----------+--------------+ | ``auth`` | 1 | 5 | +--------------------+-----------+--------------+ | ``finisher`` | 1 | 5 | +--------------------+-----------+--------------+ | ``heartbeatmap`` | 1 | 5 | +--------------------+-----------+--------------+ | ``perfcounter`` | 1 | 5 | +--------------------+-----------+--------------+ | ``rgw`` | 1 | 5 | +--------------------+-----------+--------------+ | ``hadoop`` | 1 | 5 | +--------------------+-----------+--------------+ | ``asok`` | 1 | 5 | +--------------------+-----------+--------------+ | ``throttle`` | 1 | 5 | +--------------------+-----------+--------------+ Example ceph.conf ================= .. literalinclude:: demo-ceph.conf :language: ini .. _ceph-runtime-config: Runtime Changes =============== Ceph allows you to make changes to the configuration of an ``ceph-osd``, ``ceph-mon``, or ``ceph-mds`` daemon at runtime. This capability is quite useful for increasing/decreasing logging output, enabling/disabling debug settings, and even for runtime optimization. The following reflects runtime configuration usage:: ceph {daemon-type} tell {id or *} injectargs '--{name} {value} [--{name} {value}]' Replace ``{daemon-type}`` with one of ``osd``, ``mon`` or ``mds``. You may apply the runtime setting to all daemons of a particular type with ``*``, or specify a specific daemon's ID (i.e., its number or letter). For example, to increase debug logging for a ``ceph-osd`` daemon named ``osd.0``, execute the following:: ceph osd tell 0 injectargs '--debug-osd 20 --debug-ms 1' In your ``ceph.conf`` file, you may use spaces when specifying a setting name. When specifying a setting name on the command line, ensure that you use an underscore or hyphen (``_`` or ``-``) between terms (e.g., ``debug osd`` becomes ``debug-osd``). Viewing a Configuration at Runtime ================================== If your Ceph cluster is running, and you would like to see the configuration settings from a running daemon, execute the following:: ceph --admin-daemon {/path/to/admin/socket} config show | less The default path for the admin socket for each daemon is:: /var/run/ceph/$cluster-$name.asok At real time, the metavariables will evaluate to the actual cluster name and daemon name. For example, if the cluster name is ``ceph`` (it is by default) and you want to retrieve the configuration for ``osd.0``, use the following:: ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show | less Running Multiple Clusters ========================= With Ceph, you can run multiple clusters on the same hardware. Running multiple clusters provides a higher level of isolation compared to using different pools on the same cluster with different CRUSH rulesets. A separate cluster will have separate monitor, OSD and metadata server processes. When running Ceph with default settings, the default cluster name is ``ceph``, which means you would save your Ceph configuration file with the file name ``ceph.conf`` in the ``/etc/ceph`` default directory. When you run multiple clusters, you must name your cluster and save the Ceph configuration file with the name of the cluster. For example, a cluster named ``openstack`` will have a Ceph configuration file with the file name ``openstack.conf`` in the ``/etc/ceph`` default directory. .. important:: Cluster names must consist of letters a-z and digits 0-9 only. Separate clusters imply separate data disks and journals, which are not shared between clusters. Referring to `Metavariables`_, the ``$cluster`` metavariable evaluates to the cluster name (i.e., ``openstack`` in the foregoing example). Various settings use the ``$cluster`` metavariable, including: - ``keyring`` - ``admin socket`` - ``log file`` - ``pid file`` - ``mon data`` - ``mon cluster log file`` - ``osd data`` - ``osd journal`` - ``mds data`` - ``rgw data`` See `General Settings`_, `OSD Settings`_, `Monitor Settings`_, `MDS Settings`_, `RGW Settings`_ and `Log Settings`_ for relevant path defaults that use the ``$cluster`` metavariable. .. _General Settings: ../general-config-ref .. _OSD Settings: ../osd-config-ref .. _Monitor Settings: ../mon-config-ref .. _MDS Settings: ../../../cephfs/mds-config-ref .. _RGW Settings: ../../../radosgw/config-ref/ .. _Log Settings: ../log-and-debug-ref When deploying the Ceph configuration file, ensure that you use the cluster name in your command line syntax. For example:: ssh myserver01 sudo tee /etc/ceph/openstack.conf < /etc/ceph/openstack.conf When creating default directories or files, you should also use the cluster name at the appropriate places in the path. For example:: sudo mkdir /var/lib/ceph/osd/openstack-0 sudo mkdir /var/lib/ceph/mon/openstack-a .. important:: When running monitors on the same host, you should use different ports. By default, monitors use port 6789. If you already have monitors using port 6789, use a different port for your other cluster(s). To invoke a cluster other than the default ``ceph`` cluster, use the ``--cluster=clustername`` option with the ``ceph`` command. For example:: ceph --cluster=openstack health