592 Commits

Author SHA1 Message Date
Adam King
812be8465f
Merge pull request #47763 from phlogistonjohn/jjm-object-format-fixes
pybind/mgr: object_format.py decorator updates & docs

Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Ramana Raja <rraja@redhat.com>
Reviewed-by: Redouane Kachach <rkachach@redhat.com>
2022-09-01 13:54:13 -04:00
Yuri Weinstein
4b0182efda
Merge pull request #47184 from ljflores/wip-telemetry-memory-stats
mgr/telemetry: add `perf_memory_metrics` collection to telemetry

Reviewed-by: Yaarit Hatuka <yaarithatuka@gmail.com>
Reviewed-by: Kamoltat (Junior) Sirivadhna <ksirivad@redhat.com>
2022-09-01 08:29:25 -07:00
Zac Dover
fc70ccde75 doc/mgr: update prompts in dboard.rst includes
This PR adds unselectable prompts to three files that are
transcluded in the doc/mgr/dashboard.rst file. These three
files are:

 1. debug.inc.rst
 2. feature_toggles.inc.rst
 3. motd.inc.rst

The addition of unselectable prompts to these three files
completes the work begun in PR#47810 (d8064b4), which sought
to bring dashboard.rst into line with the unselectable prompt
standard introduced by Kefu Chai in 2020.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
2022-08-29 10:39:51 +10:00
Zac Dover
d8064b4681 doc/mgr: add prompt directives to dashboard.rst
This commit adds prompt directives (.. prompt:: bash $) to
the commands in dashboard.rst.

There are several ".. include::" directives in the dashboard.rst
file, which means that part of this page is sourced from elsewhere
than the dashboard.rst file. Because I have not yet added prompt
directives to those files, there is an inconsistency in the rendering
of this file. Most of the commands on this page have unselectable
prompts (unselectable prompts are the prompts that don't get added to
the buffer when you copy them to one of the clipboards). But the
commands on this page that come from those ".. include::" directives
do not yet have unselectable prompts.

This file is over 1600 lines long. It was perhaps not optimally wise
of me to have edited all of it in one fell swoop. It took many hours,
and carefully checking it will probably take at least one hour. I
suggest that whoever reviews this should not spend much time on it,
but should instead make a quick pass over the page and make sure that
it looks passable.

The English syntax on this page (and throughout the Dashboard
documentation) will be tightened to remove ambiguity and to improve
readability in the near future, so hold all English-language-related
comments for a future pull request.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
2022-08-26 01:56:41 +10:00
Laura Flores
138eb5db67 doc/mgr: add perf_memory_metrics to the telemetry documentation
Signed-off-by: Laura Flores <lflores@redhat.com>
2022-08-24 22:07:14 +00:00
Nizamudeen A
79bbaa5553 docs: fix doc link pointing to master in dashboard.rst
Signed-off-by: Nizamudeen A <nia@redhat.com>
2022-08-24 16:11:00 +05:30
Zac Dover
2172b7ec98 doc/mgr: edit orchestrator.rst
This PR improves the English language in the "Orchestrator CLI"
section of the MGR documentation. It adds a couple of section
headers in order to signpost the information in the document
a bit more than had already been done, but it makes no major
structural changes to the presentation of the information here.

This PR was motivated by feedback from the 2022 Ceph User Survey
in which one of the respondents wrote "better ceph orch
documentation".

The final section on this page, "Current Implementation Status",
must be verified by someone who is familiar with the current state
of "ceph orch" and a date stamp should be applied to the top of
the section so that the word "current" has a meaningful referent.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
2022-08-24 13:21:25 +10:00
John Mulligan
2a2d044247 doc/mgr: add a tutorial-esque section on object_format python module
It doesn't cover everything but should get most use cases started.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
2022-08-23 13:01:45 -04:00
John Mulligan
30d3e5bab5 doc/mgr: fix quoting error in python example
Found by vim syntax highlighting. Thanks vim!

Signed-off-by: John Mulligan <jmulligan@redhat.com>
2022-08-23 13:01:45 -04:00
John Mulligan
481776becf doc/mgr: use subsections for two approaches to exposing commands
This makes the content for each approach clearer and prepares
for a future sub-section.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
2022-08-23 13:01:45 -04:00
Anthony D'Atri
f1235a8ee0 doc/mgr: Fix capitalization in orchestrator.rst
Signed-off-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
2022-07-05 19:23:36 -07:00
Anthony D'Atri
cf1415a2b2
Merge pull request #46919 from jsoref/spelling-docs
doc: Fix many spelling errors
2022-07-03 19:54:55 -07:00
Anthony D'Atri
fe2200527e
Merge pull request #46087 from rhcs-dashboard/update-centralized-logging-docs
doc: update docs for centralized logging
2022-07-03 16:32:59 -07:00
Josh Soref
8abce157f1 doc: Fix many spelling errors
* administrators
* allocated
* allowed
* approximate
* authenticate
* availability
* average
* behavior
* binaries
* bootstrap
* bootstrapping
* capacity
* cephadm
* clients
* combining
* command
* committed
* comparison
* compiled
* consequences
* continues
* convenience
* cookie
* crypto
* dashboard
* deduplication
* defaults
* delivered
* deployment
* describe
* directory
* documentation
* dynamic
* elimination
* entries
* expectancy
* explicit
* explicitly
* exporter
* github
* hard
* healthcheck
* heartbeat
* heavily
* http
* indices
* infrastructure
* inherit
* layout
* lexically
* likelihood
* logarithmic
* manually
* metadata
* minimization
* minimize
* object
* of
* operation
* opportunities
* overwrite
* prioritized
* recipe
* records
* requirements
* restructured
* running
* scalability
* second
* select
* significant
* specify
* subscription
* supported
* synonym
* throttle
* unpinning
* upgraded
* value
* version
* which
* with

Plus some line wrapping and additional edits...

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2022-07-02 23:38:18 -04:00
Aashish Sharma
4ac2a3e5f7 doc: update docs for centralized logging
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
2022-06-28 16:22:24 +05:30
Redouane Kachach
df1aaacb7d
doc/cephadm: enhancing daemon operations documentation
Fixes: https://tracker.ceph.com/issues/54399

Signed-off-by: Redouane Kachach <rkachach@redhat.com>
2022-06-20 11:44:49 +02:00
Konstantin Shalygin
4512270736 doc/mgr: Document wildcard to expose Prometheus metrics for all RBD pools and namespaces
Fixes: https://tracker.ceph.com/issues/47537

Signed-off-by: Konstantin Shalygin <k0ste@k0ste.ru>
2022-06-02 14:23:38 +07:00
Ramana Raja
3adb70a24d doc/mgr/nfs: Add commands to check the statuses
.. of NFS and ingress services after creating/deleting an NFS cluster.
The `nfs cluster info` command is not sufficient to show that the
NFS cluster is created/deleted as expected.

Signed-off-by: Ramana Raja <rraja@redhat.com>
2022-04-27 12:12:36 -04:00
Neha Ojha
ab9546fd17
Merge pull request #44666 from s0nea/correct_metric_name
doc/mgr/prometheus: correct metric name

Reviewed-by: Patrick Seidensal <pseidensal@suse.com>
Reviewed-by: Paul Cuzner <pcuzner@redhat.com>
2022-04-07 13:57:52 -07:00
wangxinyu
1c326651d0 doc/mgr/prometheus.rst: fix spelling error
fix spelling error

Signed-off-by: wangxinyu <wangxinyu@inspur.com>
2022-03-22 18:51:57 +08:00
John Mulligan
b5b3e0bcb5 doc/mgr/nfs: document that nfs exports related mgr call requirements
A recent change in the mgr/nfs module should enable the functioning
of export management commands/API calls as long as the rados namespaces
and objects have been already established. Document this fact, noting
that now only the `ceph nfs cluster ...` calls *require* an
orchestration module.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
2022-02-23 16:33:48 -05:00
Yuri Weinstein
ddeec8d88a
Merge pull request #44781 from ljflores/wip-basic-channel-additions
mgr/telemetry: add `basic_pool_usage` and `basic_usage_by_class` collections to the telemetry module

Reviewed-by: Yaarit Hatuka <yaarit@redhat.com>
2022-02-14 09:06:00 -08:00
Yuri Weinstein
2624f51a72
Merge pull request #44588 from kamoltat/wip-ksirivad-disable-progress-by-default
pybind/mgr/progress: disable pg recovery event by default

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2022-02-11 14:49:17 -08:00
Laura Flores
f69cec5b70 mgr/telemetry: separate device class usage statistics into their own collection
The new collection is called `basic_usage_by_class`. This info should be separate
from `basic_pool_usage` since it doesn't involve pool statistics.

Signed-off-by: Laura Flores <lflores@redhat.com>
2022-02-08 00:45:02 +00:00
Laura Flores
c71a54ec1a mgr/telemetry: update basic_pool_usage collection desc
- Added the word "default" since we are only collecting
default pool applications

- Removed the word "data" since we are actually collecting
usage *statistics*

Signed-off-by: Laura Flores <lflores@redhat.com>
2022-02-08 00:42:37 +00:00
Nizamudeen A
27592b7561 cephadm: change shared_folder directory for prometheus and grafana
After https://github.com/ceph/ceph/pull/44059 the monitoring/prometheus
and monitoring/grafana/dashboards directories are changed to
monitoring/ceph-mixins. That broke the shared_folders in the cephadm
bootstrap script.

Changed all the instances of monitoring/prometheus and
monitoring/grafana/dashboards to monitoring/ceph-mixins

Also, renaming all the instances of prometheus_alerts.yaml to
prometheus_alerts.yml.

Fixes: https://tracker.ceph.com/issues/54176
Signed-off-by: Nizamudeen A <nia@redhat.com>
2022-02-07 16:34:37 +05:30
Kamoltat
f06da20dff pybind/mgr/progress: disable pg recovery event by default
The progress module disables the pg recovery event by default
since the event is expensive and has interrupted other services
when OSDs are being marked in/out of the cluster.

To turn the event on manually:

ceph config set mgr mgr/progress/allow_pg_recovery_event true

Updated qa/tasks/mgr/test_progress.py to enable
the pg recovery event when testing the progress module.

Signed-off-by: Kamoltat <ksirivad@redhat.com>
2022-02-03 17:51:42 +00:00
Laura Flores
4a2b54c1f2 doc/mgr: update telemetry doc to reflect basic_pool_usage collection
Signed-off-by: Laura Flores <lflores@redhat.com>
2022-02-02 23:08:53 +00:00
Tatjana Dehler
eefcb0aeed
doc/mgr/prometheus: correct metric name
Replace the metric name `node_disk_bytes_written` by
`node_disk_written_bytes_total` to reflect changes made in node exporter
version 0.16.0
https://github.com/prometheus/node_exporter/releases/tag/v0.16.0 /
https://github.com/prometheus/node_exporter/blob/v0.16.0/docs/example-16-compatibility-rules.yml .
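
A rename like this one usually has to be applied to existing queries and
alert rules as well. The following is a minimal sketch of such a rewrite,
assuming the expressions are available as plain strings. The helper name
`migrate_expr` and the mapping table are hypothetical; the table contains
only the rename named in this commit message.

```python
import re

# Assumed mapping: only the rename mentioned in this commit message.
RENAMED_METRICS = {
    "node_disk_bytes_written": "node_disk_written_bytes_total",
}


def migrate_expr(expr: str) -> str:
    """Replace old metric names in an expression, matching whole identifiers only."""
    for old, new in RENAMED_METRICS.items():
        # The lookarounds keep longer names that merely contain `old` untouched.
        pattern = rf"(?<![A-Za-z0-9_:]){re.escape(old)}(?![A-Za-z0-9_:])"
        expr = re.sub(pattern, new, expr)
    return expr


migrated = migrate_expr("rate(node_disk_bytes_written[5m])")
print(migrated)
```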

Fixes: https://tracker.ceph.com/issues/53932
Signed-off-by: Tatjana Dehler <tdehler@suse.com>
2022-01-19 15:20:41 +01:00
Yuri Weinstein
f5b4f3f4d9
Merge pull request #44251 from yaarith/telemetry-opt-in
mgr/telemetry: introduce new design for varying report data

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2022-01-14 09:06:11 -08:00
Yaarit Hatuka
2d1550cf05 mgr/telemetry: add enable / disable channel all
Enable or disable all telemetry channels at once with:
    ceph telemetry enable channel all
    ceph telemetry disable channel all

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
2022-01-13 21:54:07 +00:00
Yaarit Hatuka
77a032526d mgr/telemetry: improve output of ceph telemetry collection ls
STATUS column now indicates whether a collection is being reported, and
the reasons why it's not (either the user is not opted-in to this
collection, or its channel is off).
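
The rule described above can be sketched as a small function. This is an
illustration only, not the telemetry module's code: the function name
`collection_status` and the exact status strings are assumptions.

```python
def collection_status(opted_in: bool, channel_on: bool) -> str:
    """Return a STATUS value for one collection (status strings are hypothetical)."""
    if opted_in and channel_on:
        return "REPORTING"
    reasons = []
    if not opted_in:
        reasons.append("NOT OPTED-IN")
    if not channel_on:
        reasons.append("CHANNEL OFF")
    # A collection can be held back for one reason or for both.
    return "NOT REPORTING: " + ", ".join(reasons)


print(collection_status(opted_in=True, channel_on=False))
```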

Also, removed the ENROLLED and DEFAULT columns due to potential
confusion they may cause.

In case a user is not opted-in to certain collections, a message will
appear above the table with the missing collections:

    New collections are available:
    ['basic_base', 'basic_mds_metadata', 'crash_base', 'device_base',
    'ident_base', 'perf_perf']
    Run `ceph telemetry on` to opt-in to these collections.

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
2022-01-13 21:54:07 +00:00
Yaarit Hatuka
4c110ed2a5 doc/mgr/telemetry: document new commands
New commands:

  ceph telemetry enable channel <channel_name>
  ceph telemetry disable channel <channel_name>
  ceph telemetry channel ls
  ceph telemetry collection ls
  ceph telemetry collection diff
  ceph telemetry preview
  ceph telemetry preview-device
  ceph telemetry preview-all

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
2022-01-13 21:53:47 +00:00
Patrick Seidensal
18d3a71618 mgr/prometheus: Fix regression with OSD/host details/overview dashboards
Fix issues with PromQL expressions and vector matching with the
`ceph_disk_occupation` metric.

As it turns out, `ceph_disk_occupation` cannot simply be used as
expected, as there seem to be some edge cases for users that have
several OSDs on a single disk.  This leads to issues which cannot be
approached by PromQL alone (many-to-many PromQL errors).  The data we
have expected is simply different in some rare cases.

I have not found a sole PromQL solution to this issue. What we basically
need is the following.

1. Match on labels `host` and `instance` to get one or more OSD names
   from a metadata metric (`ceph_disk_occupation`) to let a user know
   about which OSDs belong to which disk.

2. Match on labels `ceph_daemon` of the `ceph_disk_occupation` metric,
   in which case the value of `ceph_daemon` must not refer to more than
   a single OSD. The exact opposite to requirement 1.

As both operations are currently performed on a single metric, and there
is no way to satisfy both requirements on a single metric, the intention
of this commit is to extend the metric by providing a similar metric
that satisfies one of the requirements. This enables the queries to
differentiate between a vector matching operation to show a string to
the user (where `ceph_daemon` could possibly be `osd.1` or
`osd.1+osd.2`) and to match a vector by having a single `ceph_daemon` in
the condition for the matching.

Although the `ceph_daemon` label is used on a variety of daemons, only
OSDs seem to be affected by this issue (only if more than one OSD is run
on a single disk).  This means that only the `ceph_disk_occupation`
metadata metric seems to need to be extended and provided as two
metrics.

`ceph_disk_occupation` is supposed to be used for matching the
`ceph_daemon` label value.

    foo * on(ceph_daemon) group_left ceph_disk_occupation

`ceph_disk_occupation_human` is supposed to be used for anything where
the resulting data is displayed to be consumed by humans (graphs, alert
messages, etc).

    foo * on(device,instance)
    group_left(ceph_daemon) ceph_disk_occupation_human
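
The difference between the two metrics can be illustrated with a simplified
model of many-to-one vector matching. This is a sketch in Python, not the
Prometheus implementation: `join_on` is a hypothetical helper, and the label
values (two OSDs sharing disk `sda` on `host1`) are made-up sample data.

```python
from collections import defaultdict


def join_on(left, right, labels):
    """Simplified many-to-one match: a left sample may match at most one right sample."""
    index = defaultdict(list)
    for sample in right:
        index[tuple(sample[name] for name in labels)].append(sample)

    joined = []
    for sample in left:
        key = tuple(sample[name] for name in labels)
        matches = index.get(key, [])
        if len(matches) > 1:
            # Prometheus rejects this case with a many-to-many matching error.
            raise ValueError(f"many-to-many match on {dict(zip(labels, key))}")
        if matches:
            # Labels missing on the left sample are copied from the right one.
            joined.append({**matches[0], **sample})
    return joined


# Hypothetical data: two OSDs share the disk sda on host1.
occupation = [
    {"ceph_daemon": "osd.1", "device": "sda", "instance": "host1"},
    {"ceph_daemon": "osd.2", "device": "sda", "instance": "host1"},
]
occupation_human = [
    {"ceph_daemon": "osd.1+osd.2", "device": "sda", "instance": "host1"},
]
per_osd = [{"ceph_daemon": "osd.1", "value": 3.0}]
per_disk = [{"device": "sda", "instance": "host1", "value": 0.8}]

# Matching on ceph_daemon needs one series per OSD.
by_daemon = join_on(per_osd, occupation, ["ceph_daemon"])
# Matching on the disk needs one series per disk, with a combined daemon label.
for_humans = join_on(per_disk, occupation_human, ["device", "instance"])
print(by_daemon, for_humans)
```

In this model, joining `per_disk` against `occupation` raises the
many-to-many error, and joining `per_osd` against `occupation_human` finds
no match, which is why neither metric can serve both purposes.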

Fixes: https://tracker.ceph.com/issues/52974

Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
2022-01-13 13:27:55 +01:00
Waad AlKhoury
8f99e18380 doc/mgr: Add cli api documentation
Signed-off-by: Waad AlKhoury <walkhour@redhat.com>
2022-01-05 10:11:58 +01:00
Pere Diaz Bou
69aa388d90 doc/mgr: Add cache documentation
Signed-off-by: Pere Diaz Bou <pdiazbou@redhat.com>
2022-01-05 10:11:58 +01:00
wangyunqing
f838e2e207 doc/mgr/zabbix.rst: fix typos
Signed-off-by: wangyunqing <wangyunqing@inspur.com>
2021-12-24 17:10:11 +08:00
Sebastian Wagner
2459728b0c
Merge pull request #43901 from pcuzner/snmp-notifier
mgr/cephadm: Add snmp-gateway service support

Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
2021-12-16 16:53:55 +01:00
Dimitri Papadopoulos
7677651618
doc,man: typos found by codespell
Signed-off-by: Dimitri Papadopoulos <3234522+DimitriPapadopoulos@users.noreply.github.com>
2021-12-15 12:04:36 +01:00
Paul Cuzner
91f35e1f53 mgr/cephadm: Updated docs for snmp-gateway support
Updated docs to show snmp-gateway usage. docs provide
guidance on SNMP versions supported and show CLI and
yaml deployment examples.

Signed-off-by: Paul Cuzner <pcuzner@redhat.com>
2021-12-15 14:25:34 +13:00
Yuri Weinstein
b2e20eb068
Merge pull request #44025 from ljflores/wip-remove-aggregated-perf-data
mgr/telemetry: remove aggregated perf metrics from the perf channel

Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Yaarit Hatuka <yaarit@redhat.com>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
2021-12-10 15:35:09 -08:00
Laura Flores
b57d61c1cb mgr/telemetry: remove aggregated perf metrics from the perf channel
Up until this point, we included aggregated and separated data for
testing purposes. Now that we've done our testing, the aggregated
metrics are no longer relevant.

Aggregated metrics can still be achieved on the server side by
summing separated metrics.
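
A minimal sketch of that server-side summation, assuming the separated
metrics arrive as a mapping from daemon name to counter values. The layout
and the counter names (`op_r`, `op_w`) are hypothetical and do not reflect
the actual telemetry report format.

```python
from collections import Counter


def aggregate(separated):
    """Sum per-daemon counters into one total per counter name."""
    totals = Counter()
    for counters in separated.values():
        # Counter.update adds values for keys that already exist.
        totals.update(counters)
    return dict(totals)


# Hypothetical separated metrics for two daemons.
separated = {
    "osd.0": {"op_r": 10, "op_w": 4},
    "osd.1": {"op_r": 5, "op_w": 1},
}
totals = aggregate(separated)
print(totals)
```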

Signed-off-by: Laura Flores <lflores@redhat.com>
2021-12-02 18:43:33 +00:00
Sebastian Wagner
aecd0fb9b9
Merge pull request #44143 from devlikai/master
doc/mgr/diskprediction: fix a typo.

Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
2021-12-01 10:24:00 +01:00
Yehuda Sadeh
193895ffba
Merge pull request #42710 from yehudasa/wip-rgw-mgr-module
mgr/rgw: new rgw manager module

Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
2021-11-30 14:32:39 -08:00
Kyle
dfba515c86
doc/mgr/diskprediction: fix a typo.
doc: remove extra comma.

This commit removes the extra comma from "To disable prediction,:".

Fixes: https://tracker.ceph.com/issues/53433

Signed-off-by: devlikai <likai_lc@inspur.com>
2021-11-30 15:27:26 +08:00
Yehuda Sadeh
af402c41e3 docs: document mgr/rgw module
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2021-11-24 12:54:30 -08:00
Sebastian Wagner
6e528ed7c5
Merge pull request #43095 from sebastian-philipp/_check_for_moved_osds
mgr/cephadm: Add _check_for_moved_osds

Reviewed-by: Adam King <adking@redhat.com>
2021-11-17 15:09:06 +01:00
Ernesto Puerta
45eb9dd328
Merge pull request #43464 from rsommer/wip-prometheus-standby-behaviour
mgr/prometheus: Make prometheus standby behaviour configurable

Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
2021-11-11 17:36:30 +01:00
Roland Sommer
c1570f870e mgr/prometheus: Make standby discoverable
Enable config settings to modify the standby's behaviour on the index page.
This makes the standby discoverable by reverse proxy or loadbalancer
setups. Testing for the empty response of the '/metrics' endpoint would
trigger metric collection on the active manager instance.

The newly added configuration settings standby_behaviour and
standby_error_status_code are documented and flagged as runtime, as
modifying both settings has an immediate effect (no restart required).
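
A health check in a reverse proxy or load balancer could then tell the
active manager from the standbys by the status code of the index page. The
sketch below assumes that standby_behaviour is configured so that a standby
answers with standby_error_status_code. The default of 500 used here, the
function names and the host names are assumptions for illustration.

```python
def is_active(index_status: int, standby_error_status_code: int = 500) -> bool:
    """Treat a manager as active unless its index page returns the standby error code."""
    return index_status != standby_error_status_code


def pick_active(index_status_by_host: dict, standby_error_status_code: int = 500) -> list:
    """Return the hosts whose index page did not answer with the standby error code."""
    return [
        host
        for host, status in index_status_by_host.items()
        if is_active(status, standby_error_status_code)
    ]


# Hypothetical probe results: HTTP status of the index page per manager host.
probes = {"mgr-a": 200, "mgr-b": 500, "mgr-c": 500}
active_hosts = pick_active(probes)
print(active_hosts)
```

Probing the index page rather than '/metrics' avoids triggering metric
collection on the active manager with every health check.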

Co-authored-by: Ernesto Puerta <37327689+epuertat@users.noreply.github.com>
Signed-off-by: Roland Sommer <rol@ndsommer.de>
Fixes: https://tracker.ceph.com/issues/53229
2021-11-11 08:28:40 +01:00
Laura Flores
92fcfbb464
Merge pull request #43411 from ljflores/wip-mgr-command-cleanup
mon: simplify 'mgr module ls' output
2021-11-10 14:09:51 -06:00