mgr/dashboard: introduce memory and cpu usage for daemons
Reviewed-by: Sarthak0702 <NOT@FOUND>
Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: ceph-jenkins <NOT@FOUND>
Reviewed-by: Michael Fritch <mfritch@suse.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
Reviewed-by: sunilangadi2 <NOT@FOUND>
The EC and replicated backends both derive from PGBackend, so
shard_services should be a member of the base class.
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
In the dashboard, we've been showing SMART data only for HDD devices
that use the ATA protocol. For other devices we show a No Smart Data
found error, which is misleading since the API call does return SMART
data. This PR shows the SMART data for HDD devices that use the SCSI
protocol too.
Fixes: https://tracker.ceph.com/issues/55574
Signed-off-by: Nizamudeen A <nia@redhat.com>
Result of os.path.join() before "./bin/./ceph-mds" and after
"./bin/ceph-mds".
Before -
2022-05-05 19:36:11,100.100 DEBUG:__main__:> ./bin/./ceph-mds -i a
After -
2022-05-05 19:38:48,179.179 DEBUG:__main__:> ./bin/ceph-mds -i a
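The doubled "./" is what a path join produces when the second component itself starts with "./". A minimal sketch of that behaviour follows, using posixpath so the result does not depend on the host platform. The component strings are illustrative assumptions, not the exact values used in the script:

```python
import posixpath

# Joining a directory with a component that already starts with "./"
# keeps the redundant "./" in the middle of the path.
before = posixpath.join("./bin", "./ceph-mds")

# Joining with the bare binary name yields the clean path.
after = posixpath.join("./bin", "ceph-mds")

print(before)  # ./bin/./ceph-mds
print(after)   # ./bin/ceph-mds
```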
Signed-off-by: Rishabh Dave <ridave@redhat.com>
The message regarding deletion of helper tools is printed for every
command. This message should be printed only when applicable.
Besides -
* Move XXX comments to _do_run() since it increases visibility of
these messages.
* Move the omission of arguments to a new method to clear up the clutter.
* Remove shell as a parameter from _perform_checks_and_adjustments
since it's redundant.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
PglogBasedRecovery and BackfillRecovery reuse the same Operation
until their respective operations are complete. Each recovery
operation adds an entry to AggregateBlockingEvent::events. This
way, we only retain entries that are currently blocking.
Signed-off-by: Samuel Just <sjust@redhat.com>
crimson/os: Don't limit the amount of returned keys per omap get call
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
Rationale: Whenever a Python exception occurred in cephfs-shell,
onecmd() would often print only the exception message and say
nothing about the type of exception. For example, if
`ZeroDivisionError: division by zero` occurred, onecmd()
would print `division by zero` but omit the type of
exception. In this case it's easy to understand, but if
a `KeyError` exception occurred for a key `9999` that
does not exist in the dictionary, onecmd() would print
just `9999`, and it would be very difficult
to interpret what type of error it is.
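The difference can be sketched in a few lines of Python. The helper name `describe_exception` is hypothetical and only illustrates the idea of prefixing the exception type; it is not the code used in cephfs-shell:

```python
def describe_exception(exc: BaseException) -> str:
    """Return 'TypeName: message' for an exception (hypothetical helper)."""
    return f"{type(exc).__name__}: {exc}"


try:
    {}[9999]  # look up a key that does not exist
except KeyError as e:
    message_only = str(e)               # what was printed before
    with_type = describe_exception(e)   # message prefixed with the type

print(message_only)
print(with_type)
```

With only the message, the user sees a bare key value; with the type prefixed, the output identifies the error as a KeyError.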
Fixes: https://tracker.ceph.com/issues/55536
Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
NOTE: Although most of the issues are fixed, a few function
and variable names are unchanged in order to prevent
ambiguity and preserve their meaning.
They are:
- functions: setUp(), test_ls_H_prints_human_readable_file_size()
- variables: ls_H_output, ls_H_file_size
Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
The error in the log was this:
```
"/usr/share/ceph/mgr/dashboard/services/ceph_service.py", line 253, in _get_smart_data_by_device
May 06 07:38:39 occldlr750-1.occl208.lab conmon[2142938]: svc_type, svc_id = daemon.split('.')
May 06 07:38:39 occldlr750-1.occl208.lab conmon[2142938]: ValueError: too many values to unpack (expected 2)
```
On the cluster, the output of `ceph device ls-by-host` looks like this:
```
ceph: root@occldlr750-1 /]# ceph device ls-by-host occldlr750-1.occl208.lab
DEVICE                        DEV   DAEMONS                       EXPECTED FAILURE
DELLBOSS_VD_cbd004c975390010  sda   mon.occldlr750-1.occl208.lab
WDC_WUH721818AL5204_3FGZR3JT  sdda  osd.20
WDC_WUH721818AL5204_3FH4315T  sdbf  osd.94
WDC_WUH721818AL5204_3FHP58TT  sdec  osd.30
WDC_WUH721818AL5204_3FHSK8HT  sdu   osd.78
WDC_WUH721818AL5204_3FHVTS9T  sdfi  osd.47
WDC_WUH721818AL5204_3FHWJE8T  sdv   osd.23
WDC_WUH721818AL5204_3FHXHETT  sdcl  osd.11
WDC_WUH721818AL5204_3FHXKP1T  sdcj  osd.10
```
The first device belongs to the mon, whose daemon name is
mon.occldlr750-1.occl208.lab. In our dashboard code, when fetching the
smart data, we have a line like this:
`svc_type, svc_id = daemon.split('.')`
So for the mon, the output of `daemon.split('.')` will be
`['mon', 'occldlr750-1', 'occl208', 'lab']`. The svc_id gets split into
three parts, so unpacking the result into two variables fails. This
change splits only on the first occurrence of the dot and treats
everything that comes after it as the svc_id.
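A minimal sketch of the split-on-first-dot approach, using the `maxsplit` argument of `str.split`. The function name `split_daemon_name` is a hypothetical wrapper for illustration and is not taken from the dashboard code:

```python
from typing import Tuple


def split_daemon_name(daemon: str) -> Tuple[str, str]:
    """Split 'type.id' on the first dot only (hypothetical helper)."""
    # maxsplit=1 keeps any further dots inside svc_id.
    svc_type, svc_id = daemon.split('.', 1)
    return svc_type, svc_id


# An unbounded split yields four parts for the mon daemon,
# which cannot be unpacked into two variables.
parts = "mon.occldlr750-1.occl208.lab".split('.')

mon = split_daemon_name("mon.occldlr750-1.occl208.lab")
osd = split_daemon_name("osd.20")
```

Daemon names with a single dot, such as osd.20, behave the same as before.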
Fixes: https://tracker.ceph.com/issues/55571
Signed-off-by: Nizamudeen A <nia@redhat.com>
In the Physical Disks page, the UIDs for multiple devices come out the
same, which causes the selection to misbehave and select multiple
rows with the same UID. The UID is generated in the frontend service call
itself. I added some more parameters to it in order to make it more
unique.
The second issue is the number of selected items getting multiplied
exponentially. That's because each time the table is updated or refreshed,
we push the rows that were selected before, and that
causes the number of selections to multiply.
Fixes: https://tracker.ceph.com/issues/55523
Signed-off-by: Nizamudeen A <nia@redhat.com>
PGRecovery::on_global_recover destroys the map entry without waiting for
the future returned from
seastar::future<> wait_for_recovered(BlockingEvent::TriggerI& trigger)
to resolve.
This commit changes WaitForObjectRecovery to be refcounted and retains a
reference until the future resolves.
Fixes: https://tracker.ceph.com/issues/55565
Signed-off-by: Samuel Just <sjust@redhat.com>
The rgw_admin_curl.sh script lets end users and developers
access the RGW admin APIs through the curl command.
Signed-off-by: Prashant D <pdhange@redhat.com>