Merge pull request #42970 from sebastian-philipp/monitoring-custom-webhook

doc/cephadm: monitoring: Add "Adding Alertmanager webhooks" 

Reviewed-by: Patrick Seidensal <pseidensal@suse.com>
Sebastian Wagner 2021-08-30 16:34:45 +02:00 committed by GitHub
commit a2827edf48
2 changed files with 123 additions and 78 deletions

.github/labeler.yml

@@ -255,3 +255,14 @@ nfs:
  - doc/cephadm/nfs.rst
  - doc/radosgw/nfs.rst
  - doc/dev/vstart-ganesha.rst
monitoring:
  - doc/cephadm/monitoring.rst
  - src/pybind/mgr/cephadm/services/monitoring.py
  - src/pybind/mgr/cephadm/templates/services/alertmanager/**
  - src/pybind/mgr/cephadm/templates/services/grafana/**
  - src/pybind/mgr/cephadm/templates/services/prometheus/**
  - src/pybind/mgr/dashboard/ci/check_grafana_dashboards.py
  - src/pybind/mgr/prometheus/**
  - monitoring/**


@@ -56,87 +56,32 @@ steps below:
.. prompt:: bash #
ceph orch apply node-exporter '*'
ceph orch apply node-exporter
#. Deploy alertmanager:
.. prompt:: bash #
ceph orch apply alertmanager 1
ceph orch apply alertmanager
#. Deploy Prometheus. A single Prometheus instance is sufficient, but
for high availability (HA) you might want to deploy two:
.. prompt:: bash #
ceph orch apply prometheus 1
ceph orch apply prometheus
or
.. prompt:: bash #
ceph orch apply prometheus 2
ceph orch apply prometheus --placement 'count:2'
#. Deploy grafana:
.. prompt:: bash #
ceph orch apply grafana 1
Manually setting the Grafana URL
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Cephadm automatically configures Prometheus, Grafana, and Alertmanager in
all cases except one.
In some setups, the Dashboard user's browser might not be able to access the
Grafana URL that is configured in Ceph Dashboard. This can happen when the
cluster and the accessing user are in different DNS zones.
If this is the case, you can use a configuration option for Ceph Dashboard
to set the URL that the user's browser will use to access Grafana. This
value will never be altered by cephadm. To set this configuration option,
issue the following command:
.. prompt:: bash $
ceph dashboard set-grafana-frontend-api-url <grafana-server-api>
It might take a minute or two for services to be deployed. After the
services have been deployed, you should see something like this when you issue the command ``ceph orch ls``:
.. code-block:: console
$ ceph orch ls
NAME RUNNING REFRESHED IMAGE NAME IMAGE ID SPEC
alertmanager 1/1 6s ago docker.io/prom/alertmanager:latest 0881eb8f169f present
crash 2/2 6s ago docker.io/ceph/daemon-base:latest-master-devel mix present
grafana 1/1 0s ago docker.io/pcuzner/ceph-grafana-el8:latest f77afcf0bcf6 absent
node-exporter 2/2 6s ago docker.io/prom/node-exporter:latest e5a616e4b9cf present
prometheus 1/1 6s ago docker.io/prom/prometheus:latest e935122ab143 present
Configuring SSL/TLS for Grafana
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
``cephadm`` deploys Grafana using the certificate defined in the ceph
key/value store. If no certificate is specified, ``cephadm`` generates a
self-signed certificate during the deployment of the Grafana service.
A custom certificate can be configured using the following commands:
.. prompt:: bash #
ceph config-key set mgr/cephadm/grafana_key -i $PWD/key.pem
ceph config-key set mgr/cephadm/grafana_crt -i $PWD/certificate.pem
If you have already deployed Grafana, run ``reconfig`` on the service to
update its configuration:
.. prompt:: bash #
ceph orch reconfig grafana
The ``reconfig`` command also sets the proper URL for Ceph Dashboard.
ceph orch apply grafana
Networks and Ports
~~~~~~~~~~~~~~~~~~
@@ -291,22 +236,8 @@ Example
# reconfig the prometheus service
ceph orch reconfig prometheus
Disabling monitoring
--------------------
To disable monitoring and remove the software that supports it, run the following commands:
.. code-block:: console
$ ceph orch rm grafana
$ ceph orch rm prometheus --force # this will delete metrics data collected so far
$ ceph orch rm node-exporter
$ ceph orch rm alertmanager
$ ceph mgr module disable prometheus
Deploying monitoring manually
-----------------------------
Deploying monitoring without cephadm
------------------------------------
If you have an existing prometheus monitoring infrastructure, or would like
to manage it yourself, you need to configure it to integrate with your Ceph
@@ -325,9 +256,112 @@ cluster.
* To enable dashboard integration with Grafana, see :ref:`dashboard-grafana`.
Enabling RBD-Image monitoring
---------------------------------
Disabling monitoring
--------------------
To disable monitoring and remove the software that supports it, run the following commands:
.. code-block:: console
$ ceph orch rm grafana
$ ceph orch rm prometheus --force # this will delete metrics data collected so far
$ ceph orch rm node-exporter
$ ceph orch rm alertmanager
$ ceph mgr module disable prometheus
Setting up RBD-Image monitoring
-------------------------------
For performance reasons, monitoring of RBD images is disabled by default. For more information, see
:ref:`prometheus-rbd-io-statistics`. While it is disabled, the overview and details dashboards stay empty in Grafana
and the metrics are not visible in Prometheus.
Setting up Grafana
------------------
Manually setting the Grafana URL
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Cephadm automatically configures Prometheus, Grafana, and Alertmanager in
all cases except one.
In some setups, the Dashboard user's browser might not be able to access the
Grafana URL that is configured in Ceph Dashboard. This can happen when the
cluster and the accessing user are in different DNS zones.
If this is the case, you can use a configuration option for Ceph Dashboard
to set the URL that the user's browser will use to access Grafana. This
value will never be altered by cephadm. To set this configuration option,
issue the following command:
.. prompt:: bash $
ceph dashboard set-grafana-frontend-api-url <grafana-server-api>
It might take a minute or two for services to be deployed. After the
services have been deployed, you should see something like this when you issue the command ``ceph orch ls``:
.. code-block:: console
$ ceph orch ls
NAME RUNNING REFRESHED IMAGE NAME IMAGE ID SPEC
alertmanager 1/1 6s ago docker.io/prom/alertmanager:latest 0881eb8f169f present
crash 2/2 6s ago docker.io/ceph/daemon-base:latest-master-devel mix present
grafana 1/1 0s ago docker.io/pcuzner/ceph-grafana-el8:latest f77afcf0bcf6 absent
node-exporter 2/2 6s ago docker.io/prom/node-exporter:latest e5a616e4b9cf present
prometheus 1/1 6s ago docker.io/prom/prometheus:latest e935122ab143 present
Configuring SSL/TLS for Grafana
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
``cephadm`` deploys Grafana using the certificate defined in the ceph
key/value store. If no certificate is specified, ``cephadm`` generates a
self-signed certificate during the deployment of the Grafana service.
A custom certificate can be configured using the following commands:
.. prompt:: bash #
ceph config-key set mgr/cephadm/grafana_key -i $PWD/key.pem
ceph config-key set mgr/cephadm/grafana_crt -i $PWD/certificate.pem
If you have already deployed Grafana, run ``reconfig`` on the service to
update its configuration:
.. prompt:: bash #
ceph orch reconfig grafana
The ``reconfig`` command also sets the proper URL for Ceph Dashboard.
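For a test cluster, the key and certificate files used above could be produced with ``openssl``. The following is a minimal sketch, not part of the original instructions: the host name ``grafana.example.com`` and the 365-day validity are assumptions chosen for illustration, and the ``ceph`` commands run only if the ``ceph`` CLI is available on the host.

```shell
#!/bin/sh
set -eu

# Hypothetical host name for illustration; replace it with the name
# that browsers will use to reach Grafana.
GRAFANA_HOST="grafana.example.com"

# Create an unencrypted private key and a self-signed certificate.
openssl req -x509 -newkey rsa:2048 -nodes \
    -keyout key.pem -out certificate.pem \
    -days 365 -subj "/CN=${GRAFANA_HOST}"

# Store the pair in the Ceph key/value store and reconfigure Grafana,
# but only on a host where the ceph CLI is installed.
if command -v ceph >/dev/null 2>&1; then
    ceph config-key set mgr/cephadm/grafana_key -i "$PWD/key.pem"
    ceph config-key set mgr/cephadm/grafana_crt -i "$PWD/certificate.pem"
    ceph orch reconfig grafana
fi
```

A certificate issued by a trusted certificate authority would normally replace the self-signed one in production.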
Setting up Alertmanager
-----------------------
Adding Alertmanager webhooks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To add new webhooks to the Alertmanager configuration, add additional
webhook URLs like so:
.. code-block:: yaml

    service_type: alertmanager
    spec:
      user_data:
        default_webhook_urls:
        - "https://foo"
        - "https://bar"
Where ``default_webhook_urls`` is a list of additional URLs that are
added to the default receivers' ``<webhook_configs>`` configuration.
Run ``reconfig`` on the service to update its configuration:
.. prompt:: bash #
ceph orch reconfig alertmanager
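The steps above can be sketched end to end as a short script. This is an illustration under stated assumptions, not the documented procedure: the file name ``alertmanager.yaml`` is hypothetical, the two URLs are the placeholders from the example, and the spec is handed to the orchestrator with ``ceph orch apply -i`` before ``reconfig`` is run. The ``ceph`` commands run only if the ``ceph`` CLI is available.

```shell
#!/bin/sh
set -eu

# Hypothetical file name for the service specification.
SPEC_FILE="alertmanager.yaml"

# Write the Alertmanager spec with two placeholder webhook URLs.
cat > "$SPEC_FILE" <<'EOF'
service_type: alertmanager
spec:
  user_data:
    default_webhook_urls:
    - "https://foo"
    - "https://bar"
EOF

# Apply the spec and regenerate the Alertmanager configuration,
# but only on a host where the ceph CLI is installed.
if command -v ceph >/dev/null 2>&1; then
    ceph orch apply -i "$SPEC_FILE"
    ceph orch reconfig alertmanager
fi
```

Keeping the spec in a file makes it possible to track webhook changes in version control and re-apply them later.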
Further Reading
---------------
* :ref:`mgr-prometheus`