Each method that uses a formatter is doing the same thing.
Simplify by constructing the formatter and handling errors in one place.
Also use a scoped_ptr for easy cleanup.
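A minimal sketch of the intended pattern (stand-in types and a hypothetical
helper; the real Formatter factory in rbd.cc may differ):

    #include <boost/scoped_ptr.hpp>
    #include <iostream>
    #include <string>

    // Stand-ins for ceph::Formatter (common/Formatter.h); names are illustrative.
    struct Formatter { virtual ~Formatter() {} };
    struct JSONFormatter : public Formatter {};

    // Build the formatter and report errors in one place instead of repeating
    // the same boilerplate in every command handler.
    static int get_formatter(const std::string &format,
                             boost::scoped_ptr<Formatter> &out)
    {
      if (format == "json") {
        out.reset(new JSONFormatter);
        return 0;
      }
      std::cerr << "unsupported output format: " << format << std::endl;
      return -1;
    }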
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Several lines longer than 80 characters have crept in recently.
The older long lines generally don't carry very useful history,
so I'm not worried about obscuring it any further.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
This patch renames the --format option to --image-format, for
specifying the RBD image format, and uses --format to specify the
output formatting (to be consistent with the other ceph tools). To
avoid breaking backwards compatibility with existing scripts, rbd will
still accept --format [1|2] for the image format, but will print a
warning message, noting its use is deprecated.
The rbd subcommands that support the new --format option are: ls, info,
snap list, children, showmapped, lock list.
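A rough sketch of the compatibility shim, with hypothetical names; the actual
option parsing in rbd.cc may differ:

    #include <cstdlib>
    #include <iostream>
    #include <string>

    // A numeric --format value is treated as the (deprecated) image format,
    // anything else selects the output formatter.
    static void handle_format(const std::string &val, int &image_format,
                              std::string &output_format)
    {
      char *end = NULL;
      long n = std::strtol(val.c_str(), &end, 10);
      if (end && *end == '\0' && (n == 1 || n == 2)) {
        std::cerr << "--format for the image format is deprecated; "
                  << "use --image-format instead" << std::endl;
        image_format = (int)n;
      } else {
        output_format = val;  // e.g. "plain", "json", "xml"
      }
    }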
Signed-off-by: Stratos Psomadakis <psomas@grnet.gr>
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
We were rescrubbing if INCONSISTENT is set, but that is now persistent.
Add a new scrub_after_recovery flag that is reset on each peering interval
and set when repair encounters errors.
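A hypothetical sketch of the flag's lifecycle; the real logic lives in the
PG peering/recovery code and may be structured differently:

    struct PG {
      bool scrub_after_recovery;

      PG() : scrub_after_recovery(false) {}

      void start_peering_interval() {
        scrub_after_recovery = false;   // reset on every new interval
      }
      void repair_found_errors() {
        scrub_after_recovery = true;    // remember to rescrub after recovery
      }
      void on_recovery_complete() {
        if (scrub_after_recovery) {
          scrub_after_recovery = false;
          queue_scrub();                // verify the repaired objects
        }
      }
      void queue_scrub() {}
    };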
Signed-off-by: Sage Weil <sage@inktank.com>
The AS_IF block used to guard the Java-related checks behind
--enable-cephfs-java didn't work correctly. Use a plain 'if/fi' instead to
make sure this section is only executed if --enable-cephfs-java is used.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
This makes the state persistent across PG peering and OSD restarts.
This has the side-effect that, on recovery, we rescrub any PGs marked
inconsistent. This is new behavior!
Signed-off-by: Sage Weil <sage@inktank.com>
The initial value for pair<utime_t,pg_t> can match pg 0.0, preventing it
from being manually scrubbed. Fix!
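A simplified illustration of the collision, using stand-in types for utime_t
and pg_t:

    #include <cassert>
    #include <stdint.h>
    #include <utility>

    // A default-constructed bound compares equal to a real entry for pg 0.0
    // that has never been scrubbed, so range queries starting "after" the
    // bound can skip that PG entirely.
    typedef std::pair<uint64_t, uint64_t> entry_t;  // (last scrub stamp, pg id)

    int main()
    {
      entry_t bound;        // (0, 0) -- the problematic initial value
      entry_t pg0(0, 0);    // pg 0.0, never scrubbed
      assert(bound == pg0); // indistinguishable, so pg 0.0 can be missed
      return 0;
    }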
Signed-off-by: Sage Weil <sage@inktank.com>
The previous 'osd scrub min interval' was mostly meaningless. Meanwhile,
'osd scrub max interval' would only trigger a scrub if the load was
sufficiently low; if the load stayed high, the PG might *never* scrub.
Instead, make the 'min' interval what the max used to be: if it has been more
than this many seconds and the load is low, scrub. Add a second condition: if
it has been more than the max threshold, scrub the PG regardless of the load.
Note that this does not change the default scrub interval for less-loaded
clusters, but it *does* change the meaning of existing config options.
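A hedged sketch of the new decision logic; the exact load test and option
handling in the OSD may differ:

    // Thresholds come from 'osd scrub min interval' and
    // 'osd scrub max interval'.
    bool should_scrub(double seconds_since_scrub, bool load_is_low,
                      double min_interval, double max_interval)
    {
      if (seconds_since_scrub > max_interval)
        return true;                    // overdue: scrub regardless of load
      if (seconds_since_scrub > min_interval && load_is_low)
        return true;                    // due, and the node is idle enough
      return false;
    }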
Fixes: #3786
Signed-off-by: Sage Weil <sage@inktank.com>
This was already a no-op: we don't call PG::scrub_sched() unless it has
been osd_scrub_max_interval seconds since we last scrubbed. The exception is
an explicitly requested scrub, in which case we don't want this check anyway.
Signed-off-by: Sage Weil <sage@inktank.com>
When a scrub is requested, flag it and move it to the front of the
scrub schedule instead of immediately queuing it. This avoids
bypassing the scrub reservation framework, which can lead to a heavier
impact on performance.
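A rough sketch of the idea, with a simplified schedule type; the real scrub
schedule and flag live in the OSD/PG code:

    #include <set>
    #include <stdint.h>
    #include <utility>

    typedef std::pair<uint64_t, int> sched_entry_t;  // (scheduled time, pg id)

    // Flag the request and move the PG to the front of the schedule (time 0)
    // so the regular scheduler, including the scrub reservation logic, still
    // runs it instead of scrubbing immediately.
    void request_scrub(std::set<sched_entry_t> &schedule, uint64_t old_time,
                       int pgid, bool &must_scrub)
    {
      must_scrub = true;
      schedule.erase(sched_entry_t(old_time, pgid));
      schedule.insert(sched_entry_t(0, pgid));  // sorts to the front
    }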
Signed-off-by: Sage Weil <sage@inktank.com>
Add an r/w lock to protect against some races.
1. Mutual exclusion for mount/unmount prevents races between the two in
libcephfs, which isn't safe (access to ceph_mount_info state).
2. An extremely narrow race between unmount and ceph_* calls in
libcephfs. ThreadA calls ceph_xxx, is_mounted test passes, then ThreadB
calls unmount and destroys the client. ThreadA resumes with a bad client
pointer.
3. Race between unmount and ceph_* calls in JNI. In JNI we hold the
CephContext reference across ceph_* calls. If the ceph mount were to be
released while a thread was returning from a ceph_* call then an attempt
to write to the log (e.g. the return value) would reference bad context.
Since ceph_release is only called by finalize(), no thread can be in JNI,
so this case is actually safe.
Using an r/w lock here provides a trade-off between allowing concurrency
into libcephfs and not having to constantly update the Java bindings. The
only assumption is that unmount/mount may race with the rest of the
interface.
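A minimal sketch of the locking scheme using a pthread rwlock; the real
change is in the libcephfs Java bindings, and the call sites here are
placeholders:

    #include <pthread.h>

    static pthread_rwlock_t mount_lock = PTHREAD_RWLOCK_INITIALIZER;

    // mount/unmount take the write lock: mutually exclusive with each other
    // and with any in-flight ceph_* call.
    void guarded_unmount()
    {
      pthread_rwlock_wrlock(&mount_lock);
      /* ceph_unmount(cmount); */
      pthread_rwlock_unlock(&mount_lock);
    }

    // Ordinary ceph_* wrappers take the read lock, so they may run
    // concurrently with each other but never overlap a mount or unmount.
    void guarded_call()
    {
      pthread_rwlock_rdlock(&mount_lock);
      /* if (is_mounted) ceph_statfs(...); */
      pthread_rwlock_unlock(&mount_lock);
    }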
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
The constructor calls create, and finalize() calls release. Since each
of these can only happen once (enforced by Java), there is no fear of a
race condition.
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
This is more often the case than not, and we don't have a good way to
magically know what size of cluster the user will be creating. Better to
err on the side of doing the right thing for more people.
Fixes: #3785
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
Try to share the map with a randomly picked OSD; if the picked OSD is not
'up', then try to find the nearest 'up' OSD in the map by doing a backward
and a forward linear search on the map -- this is O(n) in the worst case, as
we do only a single pass starting at the picked position, incrementing and
decrementing two different iterators until we find an appropriate OSD or we
exhaust the map.
Fixes: #3629
Backport: bobtail
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Check for the existence of /sys/bus/rbd first to avoid unnecessary calls.
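A minimal sketch of the check (hypothetical helper name):

    #include <sys/stat.h>

    // Bail out early if the rbd sysfs tree does not exist (i.e. the kernel
    // module has never been loaded) instead of probing further.
    bool rbd_sysfs_present()
    {
      struct stat st;
      return stat("/sys/bus/rbd", &st) == 0;
    }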
Fixes: #3784
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
When we map/unmap devices, udev gets called to manage device nodes;
this will allow the command to wait for those manipulations to complete,
particularly for test runs, so that the device tree is stable by the
time the command exits.
--no-settle is also provided to avoid this behavior if desired (say, for a
series of 'map' commands where the user wants to wait for settling only on
the last of the series).
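A hedged sketch of the behavior (hypothetical helper; the real command drives
udev settling as part of its map/unmap handling):

    #include <cstdlib>

    // After issuing the map/unmap, optionally wait for udev to finish
    // creating or removing the device nodes.
    void maybe_settle(bool settle)
    {
      if (settle)
        (void)system("udevadm settle");  // block until the udev queue drains
    }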
Fixes: #3635
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
map_cache.cached_lb() provides us with a lower bound across
all pgs for in-use osdmaps. We cannot trim past this since
those maps are still in use.
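A hedged sketch of the constraint (hypothetical names; the real trimming
logic is in the OSD):

    #include <algorithm>
    #include <stdint.h>

    typedef uint32_t epoch_t;

    // Clamp the trim target to the lower bound of osdmap epochs still
    // referenced by any PG, as reported by the shared map cache.
    epoch_t clamp_trim_target(epoch_t wanted, epoch_t in_use_lower_bound)
    {
      return std::min(wanted, in_use_lower_bound);
    }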
Backport: bobtail
Fixes: #3770
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>