diff --git a/doc/dev/cephadm/cephadm-exporter.rst b/doc/dev/cephadm/cephadm-exporter.rst
new file mode 100644
index 00000000000..bc41fcaeb10
--- /dev/null
+++ b/doc/dev/cephadm/cephadm-exporter.rst
@@ -0,0 +1,306 @@
+================
+cephadm Exporter
+================
+
+There are a number of long running tasks that the cephadm 'binary' performs, which can take several seconds
+to complete. This latency represents a scalability challenge to the Ceph orchestrator management plane.
+
+To address this, cephadm needs to be able to run some of these longer running tasks asynchronously - this
+frees up processing on the mgr by offloading tasks to each host, reduces latency and improves scalability.
+
+This document describes the implementation requirements and design for an 'exporter' feature.
+
+
+Requirements
+============
+The exporter should address these functional and non-functional requirements:
+
+* run as a normal systemd unit
+* utilise the same filesystem schema as other services deployed with cephadm
+* require only python3 standard library modules (no external dependencies)
+* use encryption to protect the data flowing from a host to the Ceph mgr
+* execute data gathering tasks as background threads
+* be easily extended to include more data gathering tasks
+* monitor itself for the health of the data gathering threads
+* cache metadata to respond to queries quickly
+* respond to a metadata query in <30ms to support large Ceph clusters (thousands of nodes)
+* provide CLI interaction to enable the exporter to be deployed either at bootstrap time, or once the
+  cluster has been deployed
+* be deployed as a normal orchestrator service (similar to the node-exporter)
+
+High Level Design
+=================
+
+This section will focus on the exporter logic **only**.
+
+.. code::
+
+    Establish a metadata cache object (tasks will be represented by separate attributes)
+    Create a thread for each data gathering task: host, ceph-volume and list_daemons
+        each thread updates its own attribute within the cache object
+    Start a server instance, passing requests to a specific request handler
+        the request handler only interacts with the cache object
+        the request handler passes metadata back to the caller
+    Main Loop
+        Leave the loop if a 'stop' request is received
+        check thread health
+            if a thread that was active is now inactive
+                update the cache, marking the task as inactive
+                update the cache with an error message for that task
+        wait for n secs
+
+
+In the initial implementation, the exporter exposes this functionality through a RESTful API.
+
+
+Security
+========
+
+The cephadm 'binary' only supports standard python3 features, which has meant the RESTful API has been
+developed using the http.server module, which itself is not intended for production use. However, the
+implementation is not complex (based only on HTTPServer and BaseHTTPRequestHandler) and only supports the
+GET method - so the security risk is perceived as low.
+
+Current mgr to host interactions occur within an ssh connection, so the goal of the exporter is to adopt a similar
+security model.
+
+The initial REST API is implemented with the following features:
+
+* a generic self-signed, or user provided, SSL crt/key to encrypt traffic between the mgr and the host
+* 'token' based authentication of the request
+
+All exporter instances will use the **same** crt/key to secure the link from the mgr to the host(s), in the same way
+that the ssh access uses the same public key and port for each host connection.
+
+.. note:: Since the same SSL configuration is used on every exporter, when you supply your own settings you must
+   ensure that the CN or SAN components of the distinguished name are either **not** used, or are created using
+   wildcard naming.
+
+The crt, key and token files are all created with restrictive permissions (600), to help mitigate the risk of
+exposure to any other user on the Ceph cluster node(s).
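+
+As an illustration only (this is not the code cephadm itself uses), the sketch below shows how a configuration of
+this kind - a self-signed certificate and key, a random 32 character token and the default port of 9443 - could be
+produced with the openssl CLI and the python3 standard library. The file names and certificate subject are arbitrary
+examples; the bootstrap and ``set-exporter-config`` workflows described below accept a JSON file with these fields.
+
+.. code-block:: python
+
+    import json
+    import secrets
+    import subprocess
+    from pathlib import Path
+
+    # create a self-signed certificate and key; no CN/SAN is set, in line with the
+    # note above about sharing a single certificate across every host
+    subprocess.check_call([
+        "openssl", "req", "-new", "-x509", "-nodes",
+        "-newkey", "rsa:4096",
+        "-keyout", "key.pem",
+        "-out", "crt.pem",
+        "-days", "3650",
+        "-subj", "/O=Ceph",
+    ])
+
+    config = {
+        "crt": Path("crt.pem").read_text(),
+        "key": Path("key.pem").read_text(),
+        "token": secrets.token_hex(16),   # 32 hex characters
+        "port": 9443,
+    }
+
+    Path("exporter-config.json").write_text(json.dumps(config, indent=2))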
+
+Administrator Interaction
+=========================
+Several new commands are required to configure the exporter, and additional parameters should be added to the bootstrap
+process to allow the exporter to be deployed automatically for new clusters.
+
+
+Enhancements to the 'bootstrap' process
+---------------------------------------
+The bootstrap process should support additional parameters to automatically configure exporter daemons across hosts:
+
+``--with-exporter``
+
+By using this flag, you're telling the bootstrap process to include the cephadm-exporter service within the
+cluster. If you do not provide a specific configuration (SSL, token, port) to use, defaults will be applied.
+
+``--exporter-config``
+
+With the ``--exporter-config`` option, you may pass your own SSL, token and port information. The file must be in
+JSON format and contain the following fields: crt, key, token and port. The JSON content should be validated, and any
+errors detected reported back to the user during the argument parsing phase (before any changes are made).
+
+
+Additional ceph commands
+------------------------
+::
+
+    # ceph cephadm generate-exporter-config
+
+This command will generate a default configuration consisting of: a self-signed certificate, a randomly generated
+32 character token and the default port of 9443 for the REST API.
+::
+
+    # ceph cephadm set-exporter-config -i <file>
+
+Use a JSON file to define the crt, key, token and port for the REST API. The crt, key and token are validated by
+the mgr/cephadm module prior to storing the values in the KV store. Invalid or missing entries should be reported
+to the user.
+::
+
+    # ceph cephadm clear-exporter-config
+
+Clear the current configuration (removes the associated keys from the KV store).
+::
+
+    # ceph cephadm get-exporter-config
+
+Show the current exporter configuration, in JSON format.
+
+
+.. note:: If the service is already deployed, any attempt to change or clear the configuration will
+   be denied. In order to change settings you must remove the service, apply the required configuration
+   and re-apply it (``ceph orch apply cephadm-exporter``).
+
+
+
+New Ceph Configuration Keys
+===========================
+The exporter configuration is persisted to the monitor's KV store, with the following keys:
+
+| mgr/cephadm/exporter_config
+| mgr/cephadm/exporter_enabled
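+
+Since these are ordinary entries in the monitor's KV store, they can be read back with the ceph CLI when debugging.
+The snippet below is a minimal sketch of such a check; it assumes python3, a working ceph CLI and admin keyring
+access on the host where it is run.
+
+.. code-block:: python
+
+    import subprocess
+
+    # read the persisted exporter settings back out of the mon KV store
+    for key in ("mgr/cephadm/exporter_enabled", "mgr/cephadm/exporter_config"):
+        try:
+            value = subprocess.check_output(["ceph", "config-key", "get", key])
+            print(f"{key} : {value.decode().strip()}")
+        except subprocess.CalledProcessError:
+            print(f"{key} : <not set>")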
+
+
+RESTful API
+===========
+The primary goal of the exporter is the provision of metadata from the host to the mgr. This interaction takes
+place over a simple GET interface. Although only the GET method is supported, the API provides multiple URLs to
+provide different views of the metadata that has been gathered.
+
+.. csv-table:: Supported URL endpoints
+   :header: "URL", "Purpose"
+
+   "/v1/metadata", "show all metadata, including the health of all threads"
+   "/v1/metadata/health", "only report on the health of the data gathering threads"
+   "/v1/metadata/disks", "show the disk output (ceph-volume inventory data)"
+   "/v1/metadata/host", "show host related metadata from the gather-facts command"
+   "/v1/metadata/daemons", "show the status of all ceph cluster related daemons on the host"
+
+Return Codes
+------------
+The following HTTP return codes are generated by the API:
+
+.. csv-table:: Supported HTTP Responses
+   :header: "Status Code", "Meaning"
+
+   "200", "OK"
+   "204", "the thread associated with this request is no longer active, no data is returned"
+   "206", "some threads have stopped, so some content is missing"
+   "401", "request is not authorised - check that your token is correct"
+   "404", "URL is malformed, not found"
+   "500", "all threads have stopped - unable to provide any metadata for the host"
+
+
+Deployment
+==========
+During the initial phases of the exporter implementation, deployment is regarded as optional, but it is available
+to new clusters and to existing clusters that have the feature (Pacific and above).
+
+* new clusters: use the ``--with-exporter`` option
+* existing clusters: you'll need to set the configuration and deploy the service manually
+
+.. code::
+
+    # ceph cephadm generate-exporter-config
+    # ceph orch apply cephadm-exporter
+
+If you choose to remove the cephadm-exporter service, you can simply run:
+
+.. code::
+
+    # ceph orch rm cephadm-exporter
+
+This will remove the daemons and the exporter related settings stored in the KV store.
+
+
+Management
+==========
+Once the exporter is deployed, you can use the following snippet to extract the host's metadata.
+
+.. code-block:: python
+
+    import ssl
+    import json
+    import sys
+    import tempfile
+    import time
+    from urllib.request import Request, urlopen
+
+    # CHANGE THIS to match one of your hosts
+    hostname = "rh8-1.storage.lab"
+
+    print("Reading config.json")
+    try:
+        with open('./config.json', 'r') as f:
+            raw = f.read()
+    except FileNotFoundError:
+        print("You must first create a config.json file using the cephadm get-exporter-config command")
+        sys.exit(1)
+
+    cfg = json.loads(raw)
+    with tempfile.NamedTemporaryFile(buffering=0) as t:
+        print("creating a temporary local crt file from the json")
+        t.write(cfg['crt'].encode('utf-8'))
+
+        # trust the exporter's certificate; hostname checking is disabled because the
+        # same certificate is shared by every exporter instance
+        ctx = ssl.create_default_context()
+        ctx.check_hostname = False
+        ctx.load_verify_locations(t.name)
+
+        hdrs = {"Authorization": f"Bearer {cfg['token']}"}
+        print("Issuing call to gather metadata")
+        req = Request(f"https://{hostname}:9443/v1/metadata", headers=hdrs)
+        s_time = time.time()
+        r = urlopen(req, context=ctx)
+        print(r.status)
+        print("call complete")
+        if r.status in [200, 206]:
+            raw = r.read()    # bytes
+            js = json.loads(raw.decode())
+            print(json.dumps(js, indent=2))
+        elapsed = time.time() - s_time
+        print(f"Elapsed secs : {elapsed}")
+
+
+.. note:: The above example uses python3, and assumes that you've extracted the config using the
+   ``get-exporter-config`` command.
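+
+As a minimal sketch of that extraction step - assuming the ``get-exporter-config`` command emits the JSON document
+on stdout and that the ceph CLI is available locally - the config.json file used above could be produced like this:
+
+.. code-block:: python
+
+    import subprocess
+
+    # dump the active exporter configuration to the config.json file the example expects
+    cfg = subprocess.check_output(["ceph", "cephadm", "get-exporter-config"])
+    with open("config.json", "wb") as f:
+        f.write(cfg)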
+
+
+Implementation Specific Details
+===============================
+
+In the same way as a typical container based deployment, the exporter is deployed to a directory under
+``/var/lib/ceph/``. The cephadm binary is stored in this cluster folder, and the daemon's configuration and
+systemd settings are stored under ``/var/lib/ceph/<fsid>/cephadm-exporter.<daemon-id>/``.
+
+.. code::
+
+    [root@rh8-1 cephadm-exporter.rh8-1]# pwd
+    /var/lib/ceph/cb576f70-2f72-11eb-b141-525400da3eb7/cephadm-exporter.rh8-1
+    [root@rh8-1 cephadm-exporter.rh8-1]# ls -al
+    total 24
+    drwx------. 2 root root  100 Nov 25 18:10 .
+    drwx------. 8 root root  160 Nov 25 23:19 ..
+    -rw-------. 1 root root 1046 Nov 25 18:10 crt
+    -rw-------. 1 root root 1704 Nov 25 18:10 key
+    -rw-------. 1 root root   64 Nov 25 18:10 token
+    -rw-------. 1 root root   38 Nov 25 18:10 unit.configured
+    -rw-------. 1 root root   48 Nov 25 18:10 unit.created
+    -rw-r--r--. 1 root root  157 Nov 25 18:10 unit.run
+
+
+In order to respond to requests quickly, the CephadmDaemon uses a cache object (CephadmCache) to hold the results
+of the cephadm commands.
+
+The exporter doesn't introduce any new data gathering capability - instead it merely calls the existing cephadm commands.
+
+The CephadmDaemon class creates a local HTTP server (using ThreadingMixIn), secured with TLS, and uses the
+CephadmDaemonHandler to handle the requests. The request handler inspects the request header and looks for a valid
+Bearer token - if this is invalid or missing, the caller receives a 401 Unauthorized error.
+
+The 'run' method of the CephadmDaemon class places the scrape_* methods into separate threads, with each thread
+supporting a different refresh interval. Each thread then periodically issues its cephadm command and places the
+output in the cache object.
+
+In addition to the command output, each thread also maintains its own timestamp record in the cache, so the caller
+can very easily determine the age of the data it has received.
+
+If the underlying cephadm command execution hits an exception, the thread passes control to a
+``_handle_thread_exception`` method. Here the exception is logged to the daemon's log file and the exception
+details are added to the cache, providing visibility of the problem to the caller.
+
+Although each thread is effectively given its own URL endpoint (host, disks, daemons), the recommended way to gather
+data from the host is to simply use the ``/v1/metadata`` endpoint. This will provide all of the data, and indicate
+whether any of the threads have failed.
+
+The run method uses the ``signal`` module to establish a reload hook, but in the initial implementation this doesn't
+take any action and simply logs that a reload was received.
+
+
+Future Work
+===========
+
+#. Consider the potential of adding a restart policy for threads.
+#. Once the exporter is fully integrated into mgr/cephadm, the goal would be to make the exporter the
+   default means of data gathering. However, until then the exporter will remain as an opt-in 'feature
+   preview'.
diff --git a/doc/dev/cephadm/index.rst b/doc/dev/cephadm/index.rst
index 70517d5ca3b..c1aff75ee31 100644
--- a/doc/dev/cephadm/index.rst
+++ b/doc/dev/cephadm/index.rst
@@ -10,3 +10,4 @@ CEPHADM Developer Documentation
    host-maintenance
    compliance-check
+   cephadm-exporter