otherwise we might have:
```
ceph/src/pybind/mgr/k8sevents/__init__.py", line 1, in <module>
from .module import Module
File "/home/kchai/ceph/src/pybind/mgr/k8sevents/module.py", line 28, in <module>
import yaml
ImportError: No module named yaml
```
Signed-off-by: Kefu Chai <kchai@redhat.com>
* timeout() is never passed any parameter when being called, so let's
remove the parameters list of "seconds" and "error_message"
* use `getattr()` instead of `hasattr()` for retrieving the
member variable of `self`
* pass `self` to wrapper function explicitly.
* return `func()` right away.
* hardwire the error message of `TimeoutError` to "Timer expired",
because
- as neither errno.ETIME nor errno.ETIMEOUT is portable
- the only caller of `TimeoutError` is `timeout()`, so there is
no need to have the flexibility to pass a different error message
* use `wraps()` as a decorator, simpler this way.
Signed-off-by: Kefu Chai <kchai@redhat.com>
* use initializer_list<> for {si,iec}_options, no need to uset set<>
* remove the comments, the variable names are self-documented.
Signed-off-by: Kefu Chai <kchai@redhat.com>
qa: enable dashboard tests to be run with "--suite rados/dashboard"
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
stop-all.sh will work if the right deps are there (currently we lack 'nc')
also killall -9 java to be sure.
Fixes: https://tracker.ceph.com/issues/42496
Signed-off-by: Sage Weil <sage@redhat.com>
rgw: Select the std::bitset to resolv ambiguity
Reviewed-by: Adam C. Emerson <aemerson@redhat.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
systemd 219 doesn't have the issue that is worked around in the
previous commit, but has a different one: udev_enumerate_scan_devices()
always succeeds, but sometimes returns an empty list when the device is
actually there. This happens rarely and at random so I haven't been
able to get to the bottom of it yet, but it looks like another similar
race condition in libudev.
Since an empty list is expected if the device isn't there, retry just
twice with a small sleep in-between. This appears to be enough: I got
7 occurrences per 600000 "rbd unmap" invocations, all of which needed
a single retry:
rbd: udev enumerate missed a device, tries = 1
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
We only fix the per-pool-omap issues if we do a non-shallow fsck.
Fixes: https://tracker.ceph.com/issues/42490
Signed-off-by: Sage Weil <sage@redhat.com>
The node dict that is passed to the _gather_leaf_ids function from the
_gather_osds function does not have 'items' in it. We also can't use
buckets at this point since those only exist for leaf nodes, not all
nodes.
We need to query the nodes_by_id dict to get 'items' for a node inside
the _gather_leaf_ids function instead.
Signed-off-by: Boris Ranto <branto@redhat.com>
udev_enumerate_scan_devices() doesn't handle disappearing devices well.
If called while some devices are being removed, it sometimes propagates
ENOENT and ENODEV errors encountered operating on directory entries in
/sys that no longer exist. Some of these errors are suppressed, but
this isn't reliable and varies across versions. In particular, systemd
239 suppresses ENODEV from sd_device_new_from_syspath() but doesn't
suppress ENODEV from sd_device_get_devnum(). In systemd 243 the call
to sd_device_get_devnum() has been moved, but it still leaks ENOENT
from sd_device_get_is_initialized() (referring to the body of
FOREACH_DIRENT_ALL loop in enumerator_scan_dir_and_add_devices()).
Assume that all ENOENT and ENODEV errors are transient and retry the
call to udev_enumerate_scan_devices(). Don't limit the number, but log
each retry.
Fixes: https://tracker.ceph.com/issues/41036
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
to silence GCC warnings like:
rc/auth/cephx/CephxProtocol.h:309:5: warning: 'type' may be used uninitialized in this function [-Wmaybe-uninitialized]
if (i != tickets_map.end())
^~
Signed-off-by: Kefu Chai <kchai@redhat.com>
RHEL/CentOS 8 now provide librabbitmq-devel so we can enable it as a
build requirement.
Fixes: https://tracker.ceph.com/issues/38466
Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
The relative path works on one of my machines but not my laptop; the
full CWD works everywhere. This is presumably better anyway!
Signed-off-by: Sage Weil <sage@redhat.com>
This is better than having a single /var/run/ceph on the host with a
weird naming scheme. Among other things, it means that we can access
the asok for any daemon for a given fsid from any container on the same
host with the same fsid (notably, a shell).
Signed-off-by: Sage Weil <sage@redhat.com>
We can't decide whether to use the new tell command style based on the
monmap.min_mon_release alone because some mons may be octopus even though
that hasn't updated yet. The same goes for if we look at the combined
features for the cluster--the underlying problem is the monmap doesn't
tell us which mons are octopus and which ones aren't, so we don't know
how to behave.
Instead, allow octopus+ mons to advertise the converted tell commands
going forward, for compatibility with pre-octopus clients (who do the old
style of tell) and for octopus+ clients talking to a min_mon_release <
octopus cluster.
Signed-off-by: Sage Weil <sage@redhat.com>
Sometimes we run containers on a host that doesn't have a crash dir set
up (becuase no daemon has been deployed). Examples include shell and
ceph-volume.
Signed-off-by: Sage Weil <sage@redhat.com>