The `instance` label is only useful if
- the exporter returns only data about its node or instance, or
- the exporter provides an instance label and then may return data about
  other nodes.
In this case, it's about the Prometheus mgr module, which is a single
exporter providing data about a whole cluster, so not only data related
to the node (or instance) the mgr module is running on. It is
completely irrelevant which node the exporter runs on; the data
provided doesn't change. The exporter also doesn't provide `instance`
labels (which Prometheus wouldn't change due to our configuration, see
"honor_labels" setting).
(Actually there's one exception where `instance` labels are provided by
the Ceph mgr module, but that doesn't affect the Ceph Cluster
dashboard.)
Note that keeping that instance label on this particular dashboard would
enable the user to switch between a previously failed mgr instance and
the data collected from there and the currently running mgr instance (on
which the Prometheus mgr module runs). That would split the data, which
I don't think is a useful feature; rather, it looks broken.
Fixes: https://tracker.ceph.com/issues/51212
Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
The health status widget doesn't show any status because it requires its
query to return a single result. If a mgr instance has failed, however,
the query returns more than one result, provided the failure happened
within the requested time frame.
This is simply an issue of the `instant` switch being disabled for that
widget. As only one mgr instance can ever be providing data at a time,
enabling `instant` completely solves that issue.
Fixes: https://tracker.ceph.com/issues/51212
Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
Remove hard-coded timezone from Grafana dashboards to enable the Grafana
administrator to decide which timezone should be used for dashboards.
If we hard-coded those values, changing the global settings in Grafana
wouldn't have an effect. And the administrators can't change the
automatically imported Grafana dashboards provided by us.
Fixes: https://tracker.ceph.com/issues/51212
Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
Convert line endings in `rbd-details.json` from CRLF to LF, so that
the file is consistent with all the other dashboard JSON files.
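A minimal sketch of such a conversion, assuming a small Python helper is
acceptable for the job. The names `crlf_to_lf` and `convert_file` are
hypothetical and not part of the Ceph tree:

```python
from pathlib import Path


def crlf_to_lf(data: bytes) -> bytes:
    """Replace every CRLF pair with a single LF; lone LFs stay untouched."""
    return data.replace(b"\r\n", b"\n")


def convert_file(path: str) -> None:
    """Rewrite the file in place with LF line endings (hypothetical helper)."""
    p = Path(path)
    # Work on bytes so that no implicit newline translation takes place.
    p.write_bytes(crlf_to_lf(p.read_bytes()))
```

Calling `convert_file()` on a file that already uses LF leaves its
content unchanged, so the helper can safely be run more than once.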
Fixes: https://tracker.ceph.com/issues/51212
Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
rgw: add the description of blocking io during index resharding
Reviewed-by: Matt Benjamin <mbenjamin@redhat.com>
Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>
* refs/pull/41483/head:
cephadm: stop passing --no-hosts to podman
mgr/nfs: use host.addr for backend IP where possible
mgr/cephadm: convert host addr if non-IP to IP
mgr/dashboard,prometheus: new method of getting mgr IP
doc/cephadm: remove any reference to the use of DNS or /etc/hosts
mgr/cephadm: use known host addr
mgr/cephadm: resolve IP at 'orch host add' time
Reviewed-by: Sebastian Wagner <swagner@suse.com>
This reverts cfc1f914ce, which is no longer
necessary because (1) we don't use socket.getfqdn(), and (2) we generally
do not rely on DNS or /etc/hosts at all anymore (with the exception of
the upgrade transition).
Signed-off-by: Sage Weil <sage@newdream.net>
Previously we allowed the host.addr to be a DNS name (short or fqdn).
This is problematic because of the inconsistent way that docker and podman
handle /etc/hosts, and undesirable because relying on external DNS is
an external source of failure for the cluster without any benefit in
return (simply updating DNS is not sufficient to make ceph behave).
So: update any non-IP to an IP as soon as we start up (presumably on
upgrade). If we get a loopback address (127.0.0.1 or 127.0.1.1), then
wait and hope that the next instance of the manager has better luck.
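The conversion described above could look roughly like the following
sketch. It is an illustration under stated assumptions, not the actual
mgr/cephadm code: the function names and the injectable `resolver`
parameter are hypothetical, and returning `None` stands in for "leave
the addr alone and retry on the next manager instance".

```python
import ipaddress
import socket
from typing import Callable, Optional


def is_ip(addr: str) -> bool:
    """Return True if addr parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(addr)
        return True
    except ValueError:
        return False


def resolve_host_addr(
    addr: str,
    resolver: Callable[[str], str] = socket.gethostbyname,
) -> Optional[str]:
    """Return an IP for addr, or None if no usable IP could be found.

    Hypothetical helper: the resolver is injectable so the logic can be
    exercised without real DNS.
    """
    if is_ip(addr):
        return addr  # already an IP, nothing to convert
    try:
        ip = resolver(addr)
    except OSError:
        return None  # lookup failed (socket.gaierror is an OSError)
    if ipaddress.ip_address(ip).is_loopback:
        # 127.0.0.1, 127.0.1.1 and the rest of 127.0.0.0/8 are useless
        # as a cluster-wide host address.
        return None
    return ip
```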
Signed-off-by: Sage Weil <sage@newdream.net>
- Use a centralized method get_mgr_ip()
- Look up the hostname via DNS. This is a bit more reliable than
getfqdn() since it will work even when podman adds the container
name to /etc/hosts.
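As a rough illustration of a DNS-based lookup, here is a sketch built on
`socket.getaddrinfo()`. It is not the actual `get_mgr_ip()`
implementation: the names `pick_ip` and `get_mgr_ip_sketch`, the
injectable `getaddrinfo` parameter, and the preference for IPv4 results
are all assumptions made for this example.

```python
import socket
from typing import Any, Callable, List, Optional, Tuple

AddrInfo = Tuple[Any, Any, Any, str, Tuple[Any, ...]]


def pick_ip(infos: List[AddrInfo]) -> str:
    """Pick the first IPv4 address if there is one, else the first result."""
    for family, _type, _proto, _canonname, sockaddr in infos:
        if family == socket.AF_INET:
            return sockaddr[0]
    return infos[0][4][0]


def get_mgr_ip_sketch(
    hostname: Optional[str] = None,
    getaddrinfo: Callable[..., List[AddrInfo]] = socket.getaddrinfo,
) -> str:
    """Resolve the local hostname to an IP address (hypothetical helper)."""
    hostname = hostname or socket.gethostname()
    # Restrict to stream sockets so each address is reported only once.
    infos = getaddrinfo(hostname, None, type=socket.SOCK_STREAM)
    return pick_ip(infos)
```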
Signed-off-by: Sage Weil <sage@newdream.net>
If the host IP/addr is known, use that. The addr might even be a FQDN
instead of an IP address, in which case we want to look that up instead
of the bare hostname.
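The selection rule above can be sketched as follows. This is only an
illustration, not the mgr/nfs code: `backend_ip` and its injectable
`resolver` parameter are hypothetical names chosen for this example.

```python
import ipaddress
import socket
from typing import Callable, Optional


def backend_ip(
    hostname: str,
    addr: Optional[str] = None,
    resolver: Callable[[str], str] = socket.gethostbyname,
) -> str:
    """Return the backend IP, preferring the known host addr.

    If addr is already an IP it is used as-is. If it is a name (for
    example an FQDN), that name is looked up instead of the bare
    hostname. Only without any addr is the hostname itself resolved.
    """
    target = addr or hostname
    try:
        ipaddress.ip_address(target)
        return target  # known addr is already an IP
    except ValueError:
        return resolver(target)  # a name: look it up
```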
Signed-off-by: Sage Weil <sage@newdream.net>
just for the sake of correctness: they don't need a full-blown
std::string, all they need is a string-like object, and they always
create a std::string instance as a member variable if they want to have
a copy of it.
Signed-off-by: Kefu Chai <kchai@redhat.com>
before this change, cot never destructs the created ObjectStore
instances.
after this change, they are destructed upon returning from main().
Signed-off-by: Kefu Chai <kchai@redhat.com>
RAII can simplify the cleanup logic in OSD::mkfs().
and since `ch` is a smart pointer, it is able to take care of itself,
as long as we ensure that it is destructed before objectstore.
Signed-off-by: Kefu Chai <kchai@redhat.com>
instead of returning a raw pointer to ObjectStore, let
`ObjectStore::create()` return a `std::unique_ptr<ObjectStore>`.
less error-prone this way.
Signed-off-by: Kefu Chai <kchai@redhat.com>
I found that the difference between "rbd cp" and "rbd deep cp",
i.e. what "deep" means in this context, is documented only in
the mailing list archive and in the Mimic release notes.
Let's make the difference explicit in the manpage and in rbd --help.
Signed-off-by: Jan "Yenya" Kasprzak <kas@fi.muni.cz>
mon/OSDMonitor: drop stale failure_info even if can_mark_down()
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>