563 Commits

Author SHA1 Message Date
Yuri Weinstein
f5b4f3f4d9
Merge pull request #44251 from yaarith/telemetry-opt-in
mgr/telemetry: introduce new design for varying report data

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2022-01-14 09:06:11 -08:00
Yaarit Hatuka
2d1550cf05 mgr/telemetry: add enable / disable channel all
Enable or disable all telemetry channels at once with:
    ceph telemetry enable channel all
    ceph telemetry disable channel all

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
2022-01-13 21:54:07 +00:00
Yaarit Hatuka
77a032526d mgr/telemetry: improve output of ceph telemetry collection ls
STATUS column now indicates whether a collection is being reported, and
the reasons why it's not (either the user is not opted-in to this
collection, or its channel is off).

Also, removed the ENROLLED and DEFAULT columns due to potential
confusion they may cause.

In case a user is not opted-in to certain collections, a message will
appear above the table with the missing collections:

    New collections are available:
    ['basic_base', 'basic_mds_metadata', 'crash_base', 'device_base',
    'ident_base', 'perf_perf']
    Run `ceph telemetry on` to opt-in to these collections.
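
The STATUS logic described above can be sketched roughly as follows. This is a minimal illustration, not the module's code: the function names, the status wording, and the way reasons are combined are all assumptions made for the example.

```python
from typing import Iterable, List, Set


def collection_status(opted_in: bool, channel_on: bool) -> str:
    """Return a STATUS string for one collection (wording is hypothetical)."""
    reasons = []
    if not opted_in:
        reasons.append("NOT OPTED-IN")
    if not channel_on:
        reasons.append("CHANNEL IS OFF")
    if reasons:
        return "NOT REPORTING: " + ", ".join(reasons)
    return "REPORTING"


def missing_collections(available: Iterable[str], opted_in: Set[str]) -> List[str]:
    """List the collections the user has not opted in to yet, sorted by name."""
    return sorted(name for name in available if name not in opted_in)


if __name__ == "__main__":
    available = ["perf_perf", "basic_base", "crash_base"]
    missing = missing_collections(available, opted_in={"basic_base"})
    if missing:
        print("New collections are available:")
        print(missing)
    print(collection_status(opted_in=True, channel_on=False))
```

A collection counts as reported only when both conditions hold, which is why the sketch collects reasons instead of returning on the first failed check.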

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
2022-01-13 21:54:07 +00:00
Yaarit Hatuka
4c110ed2a5 doc/mgr/telemetry: document new commands
New commands:

  ceph telemetry enable channel <channel_name>
  ceph telemetry disable channel <channel_name>
  ceph telemetry channel ls
  ceph telemetry collection ls
  ceph telemetry collection diff
  ceph telemetry preview
  ceph telemetry preview-device
  ceph telemetry preview-all

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
2022-01-13 21:53:47 +00:00
Patrick Seidensal
18d3a71618 mgr/prometheus: Fix regression with OSD/host details/overview dashboards
Fix issues with PromQL expressions and vector matching with the
`ceph_disk_occupation` metric.

As it turns out, `ceph_disk_occupation` cannot simply be used as
expected, as there seem to be some edge cases for users that have
several OSDs on a single disk.  This leads to issues which cannot be
approached by PromQL alone (many-to-many PromQL errors).  The data we
have expected is simply different in some rare cases.

I have not found a sole PromQL solution to this issue. What we basically
need is the following.

1. Match on labels `host` and `instance` to get one or more OSD names
   from a metadata metric (`ceph_disk_occupation`) to let a user know
   about which OSDs belong to which disk.

2. Match on labels `ceph_daemon` of the `ceph_disk_occupation` metric,
   in which case the value of `ceph_daemon` must not refer to more than
   a single OSD. The exact opposite to requirement 1.

As both operations are currently performed on a single metric, and there
is no way to satisfy both requirements on a single metric, the intention
of this commit is to extend the metric by providing a similar metric
that satisfies one of the requirements. This enables the queries to
differentiate between a vector matching operation to show a string to
the user (where `ceph_daemon` could possibly be `osd.1` or
`osd.1+osd.2`) and to match a vector by having a single `ceph_daemon` in
the condition for the matching.

Although the `ceph_daemon` label is used on a variety of daemons, only
OSDs seem to be affected by this issue (only if more than one OSD is run
on a single disk).  This means that only the `ceph_disk_occupation`
metadata metric seems to need to be extended and provided as two
metrics.

`ceph_disk_occupation` is supposed to be used for matching the
`ceph_daemon` label value.

    foo * on(ceph_daemon) group_left ceph_disk_occupation

`ceph_disk_occupation_human` is supposed to be used for anything where
the resulting data is displayed to be consumed by humans (graphs, alert
messages, etc).

    foo * on(device,instance)
    group_left(ceph_daemon) ceph_disk_occupation_human
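
The difference between the two metrics can be illustrated with a small model of many-to-one matching. This is a simplified sketch of the idea only, not how Prometheus evaluates queries; the function `group_left_join` and all sample label values are hypothetical.

```python
from typing import Dict, List, Tuple

Series = Dict[str, str]


def group_left_join(left: List[Series], right: List[Series],
                    on: Tuple[str, ...],
                    copy: Tuple[str, ...] = ()) -> List[Series]:
    """Many-to-one join: every key may occur only once on the right side."""
    index: Dict[Tuple[str, ...], Series] = {}
    for rhs in right:
        key = tuple(rhs[label] for label in on)
        if key in index:
            # Two right-hand series share the matching labels: ambiguous match.
            raise ValueError(f"duplicate series on the 'one' side for {key}")
        index[key] = rhs
    joined = []
    for lhs in left:
        key = tuple(lhs[label] for label in on)
        if key in index:
            extra = {label: index[key][label] for label in copy}
            joined.append({**lhs, **extra})
    return joined


# Hypothetical data: two OSDs share the disk sda on host1.
per_osd = [
    {"ceph_daemon": "osd.1", "device": "sda", "instance": "host1"},
    {"ceph_daemon": "osd.2", "device": "sda", "instance": "host1"},
]
human = [{"ceph_daemon": "osd.1+osd.2", "device": "sda", "instance": "host1"}]

osd_series = [{"ceph_daemon": "osd.1"}, {"ceph_daemon": "osd.2"}]
disk_series = [{"device": "sda", "instance": "host1"}]

# Matching on ceph_daemon needs one OSD per series.
by_daemon = group_left_join(osd_series, per_osd, on=("ceph_daemon",))

# Matching on device and instance needs one series per disk.
by_disk = group_left_join(disk_series, human, on=("device", "instance"),
                          copy=("ceph_daemon",))
```

In this model, joining `disk_series` against `per_osd` on `device` and `instance` raises an error, because both OSD series share the same key. That is the situation the second metric is meant to avoid.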

Fixes: https://tracker.ceph.com/issues/52974

Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
2022-01-13 13:27:55 +01:00
Waad AlKhoury
8f99e18380 doc/mgr: Add cli api documentation
Signed-off-by: Waad AlKhoury <walkhour@redhat.com>
2022-01-05 10:11:58 +01:00
Pere Diaz Bou
69aa388d90 doc/mgr: Add cache documentation
Signed-off-by: Pere Diaz Bou <pdiazbou@redhat.com>
2022-01-05 10:11:58 +01:00
wangyunqing
f838e2e207 doc/mgr/zabbix.rst: fix typos
Signed-off-by: wangyunqing <wangyunqing@inspur.com>
2021-12-24 17:10:11 +08:00
Sebastian Wagner
2459728b0c
Merge pull request #43901 from pcuzner/snmp-notifier
mgr/cephadm: Add snmp-gateway service support

Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
2021-12-16 16:53:55 +01:00
Dimitri Papadopoulos
7677651618
doc,man: typos found by codespell
Signed-off-by: Dimitri Papadopoulos <3234522+DimitriPapadopoulos@users.noreply.github.com>
2021-12-15 12:04:36 +01:00
Paul Cuzner
91f35e1f53 mgr/cephadm: Updated docs for snmp-gateway support
Updated docs to show snmp-gateway usage. docs provide
guidance on SNMP versions supported and show CLI and
yaml deployment examples.

Signed-off-by: Paul Cuzner <pcuzner@redhat.com>
2021-12-15 14:25:34 +13:00
Yuri Weinstein
b2e20eb068
Merge pull request #44025 from ljflores/wip-remove-aggregated-perf-data
mgr/telemetry: remove aggregated perf metrics from the perf channel

Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Yaarit Hatuka <yaarit@redhat.com>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
2021-12-10 15:35:09 -08:00
Laura Flores
b57d61c1cb mgr/telemetry: remove aggregated perf metrics from the perf channel
Up until this point, we included aggregated and separated data for
testing purposes. Now that we've done our testing, the aggregated
metrics are no longer relevant.

Aggregated metrics can still be achieved on the server side by
summing separated metrics.
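
A minimal sketch of such server-side aggregation, assuming the separated data is a mapping from daemon name to integer counters. The data layout, the counter names, and the `aggregate` function are hypothetical and do not reflect the actual report format.

```python
from collections import Counter
from typing import Dict


def aggregate(separated: Dict[str, Dict[str, int]]) -> Dict[str, int]:
    """Sum each counter across all daemons."""
    total: Counter = Counter()
    for counters in separated.values():
        total.update(counters)  # Counter.update adds values per key
    return dict(total)


if __name__ == "__main__":
    separated = {
        "osd.0": {"op_r": 3, "op_w": 1},
        "osd.1": {"op_r": 2},
    }
    print(aggregate(separated))
```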

Signed-off-by: Laura Flores <lflores@redhat.com>
2021-12-02 18:43:33 +00:00
Sebastian Wagner
aecd0fb9b9
Merge pull request #44143 from devlikai/master
doc/mgr/diskprediction: fix a typo.

Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
2021-12-01 10:24:00 +01:00
Yehuda Sadeh
193895ffba
Merge pull request #42710 from yehudasa/wip-rgw-mgr-module
mgr/rgw: new rgw manager module

Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
2021-11-30 14:32:39 -08:00
Kyle
dfba515c86
doc/mgr/diskprediction: fix a typo.
doc: remove extra comma.

This commit removes the extra comma in "To disable prediction,:".

Fixes: https://tracker.ceph.com/issues/53433

Signed-off-by: devlikai <likai_lc@inspur.com>
2021-11-30 15:27:26 +08:00
Yehuda Sadeh
af402c41e3 docs: document mgr/rgw module
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2021-11-24 12:54:30 -08:00
Sebastian Wagner
6e528ed7c5
Merge pull request #43095 from sebastian-philipp/_check_for_moved_osds
mgr/cephadm: Add _check_for_moved_osds

Reviewed-by: Adam King <adking@redhat.com>
2021-11-17 15:09:06 +01:00
Ernesto Puerta
45eb9dd328
Merge pull request #43464 from rsommer/wip-prometheus-standby-behaviour
mgr/prometheus: Make prometheus standby behaviour configurable

Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
2021-11-11 17:36:30 +01:00
Roland Sommer
c1570f870e mgr/prometheus: Make standby discoverable
Enable config settings to modify the standby's behaviour on the index page.
This makes the standby discoverable by reverse proxy or loadbalancer
setups. Testing for the empty response of the '/metrics' endpoint would
trigger metric collection on the active manager instance.

The newly added configuration options standby_behaviour and
standby_error_status_code are documented and flagged as runtime, as
modifying both settings has an immediate effect (no restart required).
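
The effect of the two options can be sketched as a small response selector. This is an illustration under assumptions, not the module's handler: the function name, the placeholder body, and the behaviour values `default` and `error` with a 500 default code are assumed here.

```python
from typing import Tuple


def standby_index_response(behaviour: str = "default",
                           error_status_code: int = 500) -> Tuple[int, str]:
    """Pick the HTTP status and body a standby returns for its index page."""
    if behaviour == "error":
        # A proxy or load balancer can treat this status as "not active".
        return error_status_code, ""
    # Assumed default: a normal response with a placeholder page.
    return 200, "standby placeholder page"


if __name__ == "__main__":
    print(standby_index_response())
    print(standby_index_response("error", 503))
```

A health check that probes the index page can then tell the standby apart from the active manager without requesting `/metrics`.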

Co-authored-by: Ernesto Puerta <37327689+epuertat@users.noreply.github.com>
Signed-off-by: Roland Sommer <rol@ndsommer.de>
Fixes: https://tracker.ceph.com/issues/53229
2021-11-11 08:28:40 +01:00
Laura Flores
92fcfbb464
Merge pull request #43411 from ljflores/wip-mgr-command-cleanup
mon: simplify 'mgr module ls' output
2021-11-10 14:09:51 -06:00
Sebastian Wagner
501ecf035e
mgr/orch: Add DaemonDescriptionStatus starting and unknown
Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
2021-11-10 13:49:24 +01:00
Sage Weil
b6d85e3975 doc/mgr/nfs: document rgw user and bucket exports
Signed-off-by: Sage Weil <sage@newdream.net>
2021-11-04 10:42:50 -04:00
Sage Weil
aef952bc46 mgr/nfs: use keyword args for 'nfs export create rgw'
Signed-off-by: Sage Weil <sage@newdream.net>
2021-11-02 17:06:58 -04:00
Sage Weil
9467f1e89e mgr/nfs: document and use keyword args for 'nfs export create cephfs'
Signed-off-by: Sage Weil <sage@newdream.net>
2021-11-02 17:06:58 -04:00
Sebastian Wagner
aae2ea3897
Merge pull request #43293 from pcuzner/granular-alerts
mgr/prometheus: expose ceph healthchecks as metrics

Reviewed-by: Boris Ranto <branto@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
2021-10-29 00:23:24 +02:00
Paul Cuzner
e0dfc02063 mgr/prometheus: track individual healthchecks as metrics
This patch creates a health history object maintained in
the module's kvstore.  The history and current health
checks are used to create a metric per healthcheck whilst
also providing a history feature. Two new commands are added:
ceph healthcheck history ls
ceph healthcheck history clear

In addition to the new commands, the additional metrics
have been used to update the prometheus alerts.
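
The idea of a health history that yields one metric per healthcheck can be sketched as below. The class, its fields, and the 1/0 metric values are hypothetical and are kept in memory here, whereas the patch keeps its history in the module's kvstore.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class HealthHistory:
    # How often each check has become active since the last clear.
    seen: Dict[str, int] = field(default_factory=dict)
    # Checks that are active right now.
    active: Set[str] = field(default_factory=set)

    def update(self, current: Set[str]) -> None:
        """Record newly raised checks and remember the current set."""
        for name in current - self.active:
            self.seen[name] = self.seen.get(name, 0) + 1
        self.active = set(current)

    def metrics(self) -> Dict[str, int]:
        """One metric per known check: 1 if active, 0 if only in history."""
        return {name: int(name in self.active) for name in sorted(self.seen)}

    def clear(self) -> None:
        """Forget the history, similar in spirit to 'history clear'."""
        self.seen.clear()
        self.active.clear()


if __name__ == "__main__":
    history = HealthHistory()
    history.update({"OSD_DOWN"})
    history.update(set())
    history.update({"MON_DOWN"})
    print(history.metrics())
```

Keeping resolved checks in the history lets their metric drop to 0 instead of disappearing, which is convenient for alert rules that watch a named check.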

Fixes: https://tracker.ceph.com/issues/52638

Signed-off-by: Paul Cuzner <pcuzner@redhat.com>
2021-10-22 13:32:39 +13:00
Ernesto Puerta
f5fddd6121
Merge pull request #42526 from liewegas/dashboard-nfs
mgr/dashboard: consume mgr/nfs

Reviewed-by: Ramana Raja <rraja@redhat.com>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Varsha Rao <rvarsha016@gmail.com>
2021-10-19 11:17:17 +02:00
Alfonso Martínez
58a6ab2147 mgr/dashboard: NFS exports: API + UI: integration with mgr/nfs; cleanups
mgr/dashboard: move NFS_GANESHA_SUPPORTED_FSALS to mgr_module.py

Importing from nfs module throws AttributeError because as a side effect the dashboard module is impersonating the nfs module.
https://gist.github.com/varshar16/61ac26426bbe5f5f562ebb14bcd0f548

mgr/dashboard: 'Create NFS export' form: list clusters from nfs module

mgr/dashboard: frontend+backend cleanups for NFS export

Removed all code and references related to daemons. UI cleanup and adopted unit-testing for
nfs-export create form for CEPHFS backend. Cleanup for export list/get/create/set/delete endpoints.

mgr/dashboard: rm set-ganesha ref + update docs

Remove existing set-ganesha-clusters-rados-pool-namespace references as
they are no longer required. Moreover, the nfs doc in the dashboard doc is
updated to reflect the current nfs status.

mgr/dashboard: add nfs-export e2e test coverage

mgr/dashboard: 'Create NFS export' form: remove RGW user id field.

- Improve bucket typeahead behavior.
- Increase version for bucket list endpoint.
- Some refactoring.

mgr/dashboard: 'Create NFS export' form: allow RGW backend only when default realm is selected.

When RGW multisite is configured, the NFS module can only handle buckets in the default realm.

mgr/dashboard: 'Create service' form: fix NFS service creation.

After https://github.com/ceph/ceph/pull/42073, NFS pool and namespace are not customizable.

mgr/dashboard: 'Create NFS export' form: add bucket validation.

- Allow only existing buckets.
- Refactoring:
  - Moved bucket validator from bucket form to cd-validators.ts
  - Split bucket validator into 2: bucket name validator and bucket existence (that checks either existence or non-existence).

mgr/dashboard: 'Create NFS export' form: path validation refactor: allow only existing paths.

Fixes: https://tracker.ceph.com/issues/46493
Fixes: https://tracker.ceph.com/issues/51479
Signed-off-by: Alfonso Martínez <almartin@redhat.com>
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
Signed-off-by: Pere Diaz Bou <pdiazbou@redhat.com>
2021-10-18 12:58:54 +02:00
Laura Flores
46236139f9 mon: remove detail option for mgr module ls command
Signed-off-by: Laura Flores <lflores@redhat.com>
2021-10-15 16:20:16 +00:00
Sage Weil
522c184f42 mgr/nfs: add 'nfs cluster config get'
Fixes: https://tracker.ceph.com/issues/52942
Signed-off-by: Sage Weil <sage@newdream.net>
2021-10-14 20:57:38 -04:00
Laura Flores
2a10be5347 mon: simplify 'mgr module ls' output
Fixes: https://tracker.ceph.com/issues/45322
Signed-off-by: Laura Flores <lflores@redhat.com>
2021-10-04 23:36:57 +00:00
Sebastian Wagner
8ef77a0bbc
doc/cephadm: use sphinx autoclass to document RGWSpec
Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
2021-09-30 11:29:07 +02:00
Patrick Seidensal
df7d30ca5b
mgr/prometheus: offer ability to disable cache
Fixes: https://tracker.ceph.com/issues/52414

Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
2021-09-06 16:23:09 +02:00
Zac Dover
beeb140178 doc/mgr: add progress module documentation
This PR ingests the work added to the documentation
in PR#29335.

The technical information in this PR concerns the
installation and use of the "Progress Module".

Signed-off-by: Zac Dover <zac.dover@gmail.com>
Signed-off-by: kamoltat <ksirivad@redhat.com>
2021-08-28 13:22:00 +10:00
cypherean
e90ebc195e mgr/dashboard: report ceph tracker bug/feature through CLI/API
Fixes: https://tracker.ceph.com/issues/44851

Signed-off-by: Shreya Sharma <shreyasharma.ss305@gmail.com>
2021-08-24 22:37:52 +05:30
Sage Weil
034de1f5a0 Merge PR #42759 into master
* refs/pull/42759/head:
	doc/mgr/nfs: add section on updating an nfs cluster

Reviewed-by: Varsha Rao <varao@redhat.com>
2021-08-11 15:28:40 -04:00
Sage Weil
7cc4c91dce doc/mgr/nfs: add section on updating an nfs cluster
Signed-off-by: Sage Weil <sage@newdream.net>
2021-08-11 11:32:35 -05:00
Sage Weil
6f8bdfbb90 Merge PR #42252 into master
* refs/pull/42252/head:
	mgr/dashboard: set rgw credentials: fix api tests
	mgr/dashboard: run-frontend-e2e-tests.sh: remove unneeded rgw setting
	mgr/dashboard: rgw service creation form: add realm and zone to service spec.
	mgr/dashboard: connect-rgw: rename to set-rgw-credentials; refactoring
	mgr/dashboard: connect-rgw: adaptation and test coverage
	mgr/cephadm: re-check dashboard <-> rgw creds when rgw daemons created/destroyed
	mgr/dashboard: add 'dashboard connect-rgw' command
	doc/mgr/dashboard: simplify dashboard+rgw config docs

Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Juan Miguel Olmo <jolmomar@redhat.com>
2021-08-11 11:28:28 -04:00
Sage Weil
3331a0a7ea Merge PR #42691 into master
* refs/pull/42691/head:
	mgr/nfs: add --port to 'nfs cluster create' and port to 'nfs cluster info'
	qa/suites/orch/cephadm/smoke-roleless: test taking ganeshas offline
	qa/tasks/vip: exec with bash -ex
	qa/suites/orch/cephadm: separate test_nfs from test_orch_cli

Reviewed-by: Varsha Rao <varao@redhat.com>
2021-08-10 16:37:38 -04:00
Alfonso Martínez
6e20ef1dd3 mgr/dashboard: connect-rgw: rename to set-rgw-credentials; refactoring
- Rename the dashboard command to better reflect its behavior.
- Rename '_radosgw_admin' method to 'send_rgwadmin_command' for consistency with
  'send_mon_command' and move it to the mgr_module.py .
- Cleanup: remove unneeded rgw settings.
- Better error handling and test coverage.

Fixes: https://tracker.ceph.com/issues/44605
Signed-off-by: Alfonso Martínez <almartin@redhat.com>
2021-08-10 14:06:03 +02:00
Sage Weil
599116a068 doc/mgr/dashboard: simplify dashboard+rgw config docs
Signed-off-by: Sage Weil <sage@newdream.net>
2021-08-10 14:06:03 +02:00
Sage Weil
8ebe341198 mgr/nfs: add --port to 'nfs cluster create' and port to 'nfs cluster info'
Fixes: https://tracker.ceph.com/issues/51787
Signed-off-by: Sage Weil <sage@newdream.net>
2021-08-09 11:41:08 -04:00
Ernesto Puerta
3582bbc034
doc,mgr/dashboard: clarify SSO documentation
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
2021-08-06 19:31:42 +02:00
Laura Flores
1734647008
doc/mgr/telemetry: fix formatting problem
There was strange bolding and bullet point placement due to a missing new line in the perf description.

Signed-off-by: Laura Flores <lflores@redhat.com>
2021-07-28 10:11:17 -05:00
Josh Durgin
624068d244
Merge pull request #42322 from ljflores/wip-lflores-telemetry-docs
doc/mgr/telemetry: update Telemetry Module docs to include perf channel

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Zac Dover <zac.dover@gmail.com>
2021-07-27 17:31:33 -07:00
Sage Weil
9aeefbc666 doc/mgr/rook: update title
Signed-off-by: Sage Weil <sage@newdream.net>
2021-07-27 10:47:55 -04:00
Sage Weil
7d3443412c doc/mgr/nfs: reference customizing ingress
Link to the cephadm docs on modifying the service directly.

Signed-off-by: Sage Weil <sage@newdream.net>
2021-07-27 10:47:55 -04:00
Sage Weil
b5a32d632d doc/mgr/nfs: add section for manual ganesha config; reframe
This documentation is incomplete because this mode of operation is not
tested/validated.

Signed-off-by: Sage Weil <sage@newdream.net>
2021-07-27 10:47:55 -04:00
Sage Weil
8d9db910f7 doc/mgr/nfs: document ingress in more detail
Signed-off-by: Sage Weil <sage@newdream.net>
2021-07-26 16:23:17 -04:00