in 7736bddc53, we assumed that the object
to be recovered did not exist in `recovering` before
`recover_object(oid)` was called. but this turns out not to be true. so, in
this change, `add_object(oid)` is called before `recover_object(oid)`
gets called.
Fixes: https://tracker.ceph.com/issues/47593
Signed-off-by: Kefu Chai <kchai@redhat.com>
this helps to avoid the confusion of "where the recovery is added" and
"are we adding a new instance of recovery here".
we should call add_recovery() explicitly when we need to add a new recovery
instance.
Signed-off-by: Kefu Chai <kchai@redhat.com>
before this change, get_recovery() can also be used for adding a
recovery instance to `recovering`. this behavior is error-prone and
confusing.
after this change, add_recovery() is used in the places where we
want to add a new recovery instance.
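
A minimal sketch of the add/get split described above. The `Recovery` struct, the `RecoveryRegistry` class and `is_recovering()` are hypothetical stand-ins, not the actual crimson types; only the names `add_recovery()`, `get_recovery()` and `recovering` come from this message.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the per-object recovery state.
struct Recovery {
  int waiters = 0;
};

// Hypothetical container illustrating the split between adding and looking up.
class RecoveryRegistry {
  std::map<std::string, Recovery> recovering;

public:
  // Adds a new instance; the object must not be tracked yet.
  Recovery& add_recovery(const std::string& oid) {
    auto result = recovering.emplace(oid, Recovery{});
    assert(result.second);  // adding the same object twice is a bug
    return result.first->second;
  }

  // Looks up an existing instance; never inserts one.
  Recovery& get_recovery(const std::string& oid) {
    auto it = recovering.find(oid);
    assert(it != recovering.end());
    return it->second;
  }

  bool is_recovering(const std::string& oid) const {
    return recovering.count(oid) != 0;
  }
};
```

With this shape, a lookup can no longer create an entry as a side effect, so every insertion is visible at its call site.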
Signed-off-by: Kefu Chai <kchai@redhat.com>
mgr/dashboard: log in non-admin users successfully if the telemetry notification is shown
Reviewed-by: Kiefer Chang <kiefer.chang@suse.com>
Reviewed-by: Stephan Müller <smueller@suse.com>
When creating an erasure-coded profile, make sure
that the user specifies a valid crush-failure-domain.
Fixes: https://tracker.ceph.com/issues/47452
Signed-off-by: Prashant Dhange <pdhange@redhat.com>
crimson::osd::PG::send_cluster_message() accepts a `Message*`
pointer, and then hands it over to `shard_services.send_to_osd()`,
which expects a `Ref<Message>`. so the raw pointer is used to
construct an `intrusive_ptr<Message>`, which increments the
refcount of that Message instance by one. but that Message
was owned by nobody before that, so we end up with a
Message whose refcount is 2 but which has only a single
owner. hence the memory leak.
in this change, the constructor is instructed not to increment the refcount.
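
The leak can be reproduced with a simplified sketch. `Message` and `MessageRef` below are hypothetical stand-ins written for illustration, assuming a freshly created message starts with a refcount of 1; the `add_ref` flag mirrors the second parameter of `boost::intrusive_ptr`'s raw-pointer constructor.

```cpp
#include <cassert>

int live_messages = 0;  // number of Message objects currently alive

// Hypothetical refcounted message; a new instance starts with one reference.
struct Message {
  int nref = 1;
  Message() { ++live_messages; }
  ~Message() { --live_messages; }
  void get() { ++nref; }
  void put() {
    assert(nref > 0);
    if (--nref == 0) {
      delete this;
    }
  }
};

// Hypothetical intrusive smart pointer with an add_ref flag.
class MessageRef {
  Message* p;

public:
  explicit MessageRef(Message* m, bool add_ref = true) : p(m) {
    if (p && add_ref) {
      p->get();
    }
  }
  MessageRef(const MessageRef&) = delete;
  MessageRef& operator=(const MessageRef&) = delete;
  ~MessageRef() {
    if (p) {
      p->put();
    }
  }
};

// Returns how many messages are still alive after the only owner goes away.
int leaked_after_send(bool add_ref) {
  const int before = live_messages;
  {
    MessageRef ref(new Message, add_ref);  // sole owner of the message
  }
  return live_messages - before;
}
```

Here `leaked_after_send(true)` leaves one message alive, because the refcount only drops from 2 to 1, while `leaked_after_send(false)` frees it.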
Signed-off-by: Kefu Chai <kchai@redhat.com>
once the continuation has consumed the stored value of the associated
future, we cannot call set_value() again. otherwise, ASan complains that we
are accessing memory on the heap after it is freed.
in this change, std::optional<> is used for holding
promise<auth_result_t>. once the promise is fulfilled, `auth_done` is
reset to prevent another call of `set_value()` or `set_exception()`.
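
A minimal sketch of the reset-after-fulfil pattern, using a hypothetical `OneShotPromise` in place of the real promise type so that the example stays self-contained. `AuthSession`, `complete()` and the `int` result are assumptions made for illustration; only `auth_done` and `set_value()` are names from this message.

```cpp
#include <cassert>
#include <functional>
#include <optional>

// Hypothetical stand-in for a promise that must be fulfilled at most once.
struct OneShotPromise {
  std::function<void(int)> continuation;
  void set_value(int v) {
    assert(continuation);
    continuation(v);
  }
};

// Hypothetical holder illustrating the std::optional<> guard.
struct AuthSession {
  std::optional<OneShotPromise> auth_done;

  // Fulfils the promise once; later calls are ignored and return false.
  bool complete(int result) {
    if (!auth_done) {
      return false;  // already fulfilled, do not touch the promise again
    }
    auth_done->set_value(result);
    auth_done.reset();  // prevents a second set_value()
    return true;
  }
};
```

An empty optional doubles as the "already fulfilled" flag, so no separate boolean has to be kept in sync with the promise.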
Signed-off-by: Kefu Chai <kchai@redhat.com>
This updates a date from 2016 to 2020,
so that readers can be confident that the
procedure that they're reading has been recently
tested.
Signed-off-by: Zac Dover <zac.dover@gmail.com>
The newly created MonConnection instances are referred to through
MonClient::pending_cons instead of through the return
value.
Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
Wrap all devices' health information within a tabset
instead of displaying them from top to bottom.
Add more guards in the HTML template to prevent referencing undefined
variables.
Fixes: https://tracker.ceph.com/issues/47494
Fixes: https://tracker.ceph.com/issues/43177
Signed-off-by: Kiefer Chang <kiefer.chang@suse.com>
This gets past things like tzconfig stopping for user input.
Remove redundant install of python-virtualenv.
Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
* refs/pull/36776/head:
systemd: Support Graceful Reboot for AIO Node
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: David Disseldorp <ddiss@suse.de>
crimson/seastore: fix potential non-repeatable-read from RootBlock
Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Ronen Friedman <rfriedma@redhat.com>
retry forever if cct->_conf->bdev_flock_retry is 0.
systemd-udevd is most likely the reason why ceph-osd fails to
acquire the flock during "mkfs", because systemd-udevd probes
all block devices using libblkid when a device changes in the
system, and when systemd-udevd starts looking at the device
it takes a `LOCK_SH|LOCK_NB` lock, and it releases the lock
right after it is done with it. so normally, it only holds the lock for a jiffy,
see
ee0b9e721a/src/shared/lockfile-util.c (L18)
so, we just need to retry a couple of times to acquire the
lock.
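
The retry loop could look roughly like the sketch below. The function names and the `max_retry` and `interval_us` parameters are hypothetical; they only loosely model `bdev_flock_retry` and `bdev_flock_retry_interval`, with 0 meaning "retry forever" as stated above.

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

// Tries to take an exclusive, non-blocking flock(2) on fd.
// max_retry == 0 means retry forever; otherwise give up after max_retry attempts.
// Returns 0 on success or a negative errno value.
int flock_with_retry(int fd, unsigned max_retry, unsigned interval_us) {
  assert(fd >= 0);
  for (unsigned attempt = 1;; ++attempt) {
    if (::flock(fd, LOCK_EX | LOCK_NB) == 0) {
      return 0;
    }
    const int err = errno;  // save errno before calling anything else
    if (err != EWOULDBLOCK) {
      return -err;  // a real error, retrying will not help
    }
    if (max_retry != 0 && attempt >= max_retry) {
      return -err;  // still held by someone else
    }
    ::usleep(interval_us);  // the other holder usually releases quickly
  }
}

// Opens path and locks it; returns the fd or a negative errno value.
int open_and_lock(const char* path, unsigned max_retry, unsigned interval_us) {
  const int fd = ::open(path, O_RDWR | O_CREAT, 0600);
  if (fd < 0) {
    return -errno;
  }
  const int r = flock_with_retry(fd, max_retry, interval_us);
  if (r < 0) {
    ::close(fd);
    return r;
  }
  return fd;
}
```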
Fixes: https://tracker.ceph.com/issues/46124
Signed-off-by: Kefu Chai <kchai@redhat.com>
* use OFD lock if available. OFD locks are Linux specific, and only
available on 3.15 and later kernels. OFD locks are able to synchronize
both threads and processes, and have simpler semantics. this is just a
cleanup, as we don't create threads for acquiring the flock.
* use BSD flock(2) as a fallback
* return the errno right away, without printing logging messages.
for two reasons:
- writing logging messages would reset the errno.
- the caller of _lock() also prints the logging messages along
with strerror(errno)
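
The points above can be sketched as follows. This is an illustration under assumptions, not the actual `KernelDevice::_lock()`: the function name is hypothetical, and the choice between OFD and BSD locks is made at compile time by checking whether `F_OFD_SETLK` is defined.

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

// Tries to take an exclusive lock on the whole file without blocking.
// Returns 0 on success or a negative errno value, and logs nothing, so the
// caller can report the error itself.
int try_exclusive_lock(int fd) {
  assert(fd >= 0);
#ifdef F_OFD_SETLK
  // Open file description lock: l_pid must be zero, l_len == 0 covers the file.
  struct flock fl = {};
  fl.l_type = F_WRLCK;
  fl.l_whence = SEEK_SET;
  fl.l_start = 0;
  fl.l_len = 0;
  const int r = ::fcntl(fd, F_OFD_SETLK, &fl);
#else
  // Fallback: BSD flock(2).
  const int r = ::flock(fd, LOCK_EX | LOCK_NB);
#endif
  return r < 0 ? -errno : 0;
}
```

Both lock types belong to the open file description, so a second `open()` of the same file in the same process is refused until the first descriptor is closed.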
Fixes: https://tracker.ceph.com/issues/46124
Signed-off-by: Kefu Chai <kchai@redhat.com>
also drop bdev_flock_retry and bdev_flock_retry_interval from
legacy_config_opts.h, as `KernelDevice::_lock()` is not in the critical
path, there is no need to access these settings via member variables --
get_val<> would just suffice.
Signed-off-by: Kefu Chai <kchai@redhat.com>