Enable or disable all telemetry channels at once with:
ceph telemetry enable channel all
ceph telemetry disable channel all
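Individual channels can still be toggled by name, for example (assuming
a channel named `perf` exists):
ceph telemetry enable channel perf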
Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
The STATUS column now indicates whether a collection is being reported
and, if not, the reason why (either the user is not opted in to this
collection, or its channel is off).
Also, the ENROLLED and DEFAULT columns were removed due to the
potential confusion they may cause.
If a user is not opted in to certain collections, a message listing the
missing collections will appear above the table:
New collections are available:
['basic_base', 'basic_mds_metadata', 'crash_base', 'device_base',
'ident_base', 'perf_perf']
Run `ceph telemetry on` to opt-in to these collections.
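For reference, this table is the one printed by the collection listing
command (command name assumed from this series):
ceph telemetry collection ls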
Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
Fix issues with PromQL expressions and vector matching with the
`ceph_disk_occupation` metric.
As it turns out, `ceph_disk_occupation` cannot simply be used as
expected; there are edge cases for users that run several OSDs on a
single disk. This leads to issues that cannot be solved by PromQL alone
(many-to-many PromQL errors). In some rare cases, the data is simply
different from what we expected.
I have not found a PromQL-only solution to this issue. What we
basically need is the following.
1. Match on the labels `host` and `instance` to get one or more OSD
names from a metadata metric (`ceph_disk_occupation`), letting a user
know which OSDs belong to which disk.
2. Match on the `ceph_daemon` label of the `ceph_disk_occupation`
metric, in which case the value of `ceph_daemon` must not refer to more
than a single OSD (the exact opposite of requirement 1).
As both operations are currently performed on a single metric, and
there is no way to satisfy both requirements with one metric, this
commit extends the metric by providing a second, similar metric that
satisfies one of the requirements. This enables queries to
differentiate between a vector matching operation used to show a string
to the user (where `ceph_daemon` may be `osd.1` or `osd.1+osd.2`) and a
vector match on a single `ceph_daemon` in the matching condition.
Although the `ceph_daemon` label is used on a variety of daemons, only
OSDs seem to be affected by this issue (and only when more than one OSD
runs on a single disk). This means that only the `ceph_disk_occupation`
metadata metric needs to be extended and provided as two metrics.
`ceph_disk_occupation` is supposed to be used for matching the
`ceph_daemon` label value.
foo * on(ceph_daemon) group_left ceph_disk_occupation
`ceph_disk_occupation_human` is supposed to be used for anything where
the resulting data is displayed for human consumption (graphs, alert
messages, etc.).
foo * on(device,instance)
group_left(ceph_daemon) ceph_disk_occupation_human
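For illustration, a hypothetical dashboard query joining a node_exporter
disk metric with the human-readable variant could look like this (all
metric names except `ceph_disk_occupation_human` are assumptions):
irate(node_disk_io_time_seconds_total[5m]) * on(device,instance)
group_left(ceph_daemon) ceph_disk_occupation_human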
Fixes: https://tracker.ceph.com/issues/52974
Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
Updated docs to show snmp-gateway usage. The docs provide guidance on
the supported SNMP versions and show CLI and YAML deployment examples.
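A minimal sketch of the CLI form (flags and values assumed from the
documented examples; the destination address is a placeholder and a
credentials file is typically passed as well):
ceph orch apply snmp-gateway --snmp-version=V2c \
    --destination=192.168.1.10:162 --port=9464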
Signed-off-by: Paul Cuzner <pcuzner@redhat.com>
Up until this point, we included aggregated and separated data for
testing purposes. Now that we've done our testing, the aggregated
metrics are no longer relevant.
Aggregated metrics can still be derived on the server side by summing
the separated metrics.
Signed-off-by: Laura Flores <lflores@redhat.com>
doc: remove extra comma.
This commit removes the extra comma in "To disable prediction,:".
Fixes: https://tracker.ceph.com/issues/53433
Signed-off-by: devlikai <likai_lc@inspur.com>
Enable config settings to modify the standby's behaviour on the index
page. This makes the standby discoverable by reverse proxy or load
balancer setups. Testing for an empty response from the '/metrics'
endpoint would trigger metric collection on the active manager
instance.
The newly added configuration options standby_behaviour and
standby_error_status_code are documented and flagged as runtime, as
modifying either setting takes effect immediately (no restart
required).
Co-authored-by: Ernesto Puerta <37327689+epuertat@users.noreply.github.com>
Signed-off-by: Roland Sommer <rol@ndsommer.de>
Fixes: https://tracker.ceph.com/issues/53229
This patch creates a health history object maintained in the module's
kvstore. The history and the current health checks are used to create a
metric per healthcheck, whilst also providing a history feature. Two
new commands are added:
ceph healthcheck history ls
ceph healthcheck history clear
In addition to the new commands, the additional metrics have been used
to update the Prometheus alerts.
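For example, an alert rule can now target a single healthcheck, roughly
(metric and label names assumed):
ceph_health_detail{name="OSD_DOWN"} == 1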
Fixes: https://tracker.ceph.com/issues/52638
Signed-off-by: Paul Cuzner <pcuzner@redhat.com>
mgr/dashboard: move NFS_GANESHA_SUPPORTED_FSALS to mgr_module.py
Importing from the nfs module throws an AttributeError because, as a
side effect, the dashboard module impersonates the nfs module.
https://gist.github.com/varshar16/61ac26426bbe5f5f562ebb14bcd0f548
mgr/dashboard: 'Create NFS export' form: list clusters from nfs module
mgr/dashboard: frontend+backend cleanups for NFS export
Removed all code and references related to daemons. UI cleanup and
adapted unit tests for the nfs-export create form for the CEPHFS
backend. Cleanup of the export list/get/create/set/delete endpoints.
mgr/dashboard: rm set-ganesha ref + update docs
Remove existing set-ganesha-clusters-rados-pool-namespace references as
they are no longer required. Moreover, the NFS documentation within the
dashboard docs is updated to reflect the current nfs module status.
mgr/dashboard: add nfs-export e2e test coverage
mgr/dashboard: 'Create NFS export' form: remove RGW user id field.
- Improve bucket typeahead behavior.
- Increase version for bucket list endpoint.
- Some refactoring.
mgr/dashboard: 'Create NFS export' form: allow RGW backend only when default realm is selected.
When RGW multisite is configured, the NFS module can only handle buckets in the default realm.
mgr/dashboard: 'Create service' form: fix NFS service creation.
After https://github.com/ceph/ceph/pull/42073, NFS pool and namespace are not customizable.
mgr/dashboard: 'Create NFS export' form: add bucket validation.
- Allow only existing buckets.
- Refactoring:
- Moved bucket validator from bucket form to cd-validators.ts
- Split the bucket validator into two: a bucket name validator and a bucket existence validator (which checks either existence or non-existence).
mgr/dashboard: 'Create NFS export' form: path validation refactor: allow only existing paths.
Fixes: https://tracker.ceph.com/issues/46493
Fixes: https://tracker.ceph.com/issues/51479
Signed-off-by: Alfonso Martínez <almartin@redhat.com>
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
Signed-off-by: Pere Diaz Bou <pdiazbou@redhat.com>
This PR ingests the work added to the documentation
in PR#29335.
The technical information in this PR concerns the
installation and use of the "Progress Module".
Signed-off-by: Zac Dover <zac.dover@gmail.com>
Signed-off-by: kamoltat <ksirivad@redhat.com>
* refs/pull/42691/head:
mgr/nfs: add --port to 'nfs cluster create' and port to 'nfs cluster info'
qa/suites/orch/cephadm/smoke-roleless: test taking ganeshas offline
qa/tasks/vip: exec with bash -ex
qa/suites/orch/cephadm: separate test_nfs from test_orch_cli
Reviewed-by: Varsha Rao <varao@redhat.com>
- Rename the dashboard command to better reflect its behavior.
- Rename the '_radosgw_admin' method to 'send_rgwadmin_command' for
consistency with 'send_mon_command' and move it to mgr_module.py.
- Cleanup: remove unneeded rgw settings.
- Better error handling and test coverage.
Fixes: https://tracker.ceph.com/issues/44605
Signed-off-by: Alfonso Martínez <almartin@redhat.com>
There was strange bolding and bullet-point placement due to a missing
newline in the perf description.
Signed-off-by: Laura Flores <lflores@redhat.com>