Previously, we just picked the first one to have the object in
question. Now, we will attempt to choose one that satisfies as
many of the following as possible:
1) has the object (there must be one)
2) has an object_info attr
3) has a valid object_info attr
4) has an object_info whose size matches the scrubbed size
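The preference order above can be read as a ranking over candidate copies. A minimal sketch, assuming a hypothetical `Candidate` record whose fields mirror criteria 2 to 4; the names are illustrative and are not the actual scrub data structures:

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Candidate:
    """Hypothetical summary of one copy of the object (illustrative only)."""
    osd: int
    has_object: bool
    has_info_attr: bool = False
    info_valid: bool = False
    info_size: Optional[int] = None      # size recorded in object_info
    scrubbed_size: Optional[int] = None  # size observed during scrub


def score(c: Candidate) -> int:
    """Count how many of criteria 2-4 hold, stopping at the first miss."""
    if not c.has_info_attr:
        return 0
    if not c.info_valid:
        return 1
    if c.info_size != c.scrubbed_size:
        return 2
    return 3


def pick_authoritative(cands: Sequence[Candidate]) -> Optional[Candidate]:
    """Return the best-scoring candidate that has the object, or None."""
    holders = [c for c in cands if c.has_object]  # criterion 1 is mandatory
    # max() keeps the first of several equal maxima, so ties fall back to
    # the old "first one found" behaviour.
    return max(holders, key=score, default=None)
```

With this ranking, a copy that merely has the object is chosen only when no other copy has a usable object_info.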
Signed-off-by: Samuel Just <sam.just@inktank.com>
The common case already has a snapshot context, so avoid duplicating
it (copying a potentially large vector) in IoCtxImpl::aio_operate().
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
When the object name is short, check that the corresponding file is
::unlink()ed. When the object name is long, there may be multiple files
with the same name, modulo the anti-collision number showing just before
the FILENAME_COOKIE. The following scenarios are tested:
* there is only one file
* there are multiple files and the last one is removed
* there are multiple files and the last one is moved in place of the
file that is to be removed
lfn_unlink and remove_object are tested together because
lfn_unlink is a private function and remove_object is a protected function
that does very little besides calling lfn_unlink.
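The three long-name scenarios can be illustrated with a toy model. This is a sketch under assumptions, not the LFNIndex code: files that collide on the same shortened name are modeled as a dict mapping the anti-collision number to the full object name, and `toy_unlink` is an invented helper.

```python
def toy_unlink(slots: dict, name: str) -> None:
    """Remove `name` from `slots` (anti-collision number -> object name).

    If the removed file is not the last one, the last file is moved into
    the freed number so the numbering stays dense (third scenario).
    Raises StopIteration if `name` is not present.
    """
    index = next(i for i, n in slots.items() if n == name)
    last = max(slots)
    if index != last:
        slots[index] = slots[last]  # move the last file in place of the removed one
    del slots[last]
```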
http://tracker.ceph.com/issues/4560 refs #4560
Signed-off-by: Loic Dachary <loic@dachary.org>
This is based on Sandon's initial patch, but much-modified.
Mounts ceph data volumes temporarily to see what is inside. Attempts to
associate journals with OSDs.
Resolves: #3120
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
Building on Single-Paxos and the existing k/v store that backs
the monitor, we now introduce a simple service so that the monitors
act as a generic k/v store available to the cluster, in which a user
can stash (and later obtain) configuration keys at their own discretion.
Users can put, get, delete, list and check for values using the
following commands:
- ceph config-key put <key> [<value>]
or
- ceph config-key put <key> [-i <in-file>]
with 'value' and 'in-file' being optional; if these are not specified,
'put' will act as 'touch' if 'key' does not exist, or will overwrite
the value of 'key' with a zero byte value (i.e., truncates the
contents of the value to zero).
- ceph config-key get <key>
or
- ceph config-key get <key> -o <out-file>
- ceph config-key delete <key>
- ceph config-key list [-o <out-file>]
- ceph config-key exists <key>
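The semantics of the five commands, including 'put' without a value, can be sketched with an in-memory stand-in. `ToyConfigKeyStore` is a hypothetical class for illustration only; it does not talk to a monitor and is not the service's implementation.

```python
class ToyConfigKeyStore:
    """In-memory stand-in for the monitor-backed store (illustrative only)."""

    def __init__(self):
        self._data = {}

    def put(self, key: str, value: bytes = b"") -> None:
        # Without a value, a new key is created empty ("touch") and an
        # existing key is overwritten with a zero-byte value.
        self._data[key] = value

    def get(self, key: str) -> bytes:
        return self._data[key]  # raises KeyError if the key is absent

    def delete(self, key: str) -> None:
        del self._data[key]

    def list(self):
        return sorted(self._data)

    def exists(self, key: str) -> bool:
        return key in self._data
```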
Fixes: #4313
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
The ceph-create-keys, ceph-disk, ceph-disk-activate, and
ceph-disk-prepare scripts are built in sbin, but debian installs
them into usr/bin, and several utilities look for them there.
This commit changes the RPM to install them in /usr/bin. (Bug #3921)
Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
We were returning '1' regardless of what do_command() returned in case
of error. This made tools that rely on command error codes nearly
useless, forcing them to rely on error messages instead.
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
(cherry picked from commit e91405d540)
At this point it's a simple wrapper around the ObjectCacher or
librados.
This is needed for QEMU so that its main thread can continue while a
flush is occurring. Since this will be backported, don't update the
librbd version yet, just add a #define that QEMU and others can use to
detect the presence of aio_flush().
Refs: #3737
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Before we were duplicating the IoCtx for each new request since they
could have a different snapshot context or read from a different
snapshot id. Since librados now supports setting these explicitly
for a given request, do that instead.
Since librados tracks outstanding requests on a per-IoCtx basis, this
also fixes a bug that caused flush() without caching to ignore
all the outstanding requests, since they were issued to separate,
duplicate IoCtxs.
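Why duplicating the IoCtx hid requests from flush() can be shown with a toy model. `ToyIoCtx` is a hypothetical class that only imitates per-context tracking of outstanding requests; it is not the librados API.

```python
class ToyIoCtx:
    """Tracks its own outstanding requests (toy model, not librados)."""

    def __init__(self):
        self.outstanding = []

    def dup(self):
        # Assumption of this model: a duplicate tracks requests independently.
        return ToyIoCtx()

    def aio_write(self, data, snap_context=None):
        self.outstanding.append((data, snap_context))

    def flush(self):
        """Complete everything this context knows about; return the count."""
        n = len(self.outstanding)
        self.outstanding.clear()
        return n


main = ToyIoCtx()

# Old pattern: one duplicate per request, so main.flush() sees nothing.
dups = [main.dup() for _ in range(3)]
for d in dups:
    d.aio_write(b"x")
old_flushed = main.flush()

# New pattern: the snapshot context travels with each request instead.
for seq in range(3):
    main.aio_write(b"x", snap_context={"seq": seq})
new_flushed = main.flush()
```

In this model the old pattern leaves three writes untracked by the context being flushed, while the new pattern lets flush() account for all of them.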
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Mainly this is useful for testing, like flushing and checking that
all pending writes are complete after the flush finishes.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Usually the snapid to read from or the snapcontext to send with a write
are determined implicitly by the IoCtx the operations are done on.
This makes it difficult to have multiple ops in flight to the same
IoCtx using different snapcontexts or reading from different snapshots,
particularly when more than one operation may be needed past the initial
scheduling.
Add versions of aio_read, aio_sparse_read, and aio_operate
that don't depend on the snap id or snapcontext stored in the IoCtx,
but get them from the caller. Specifying this information for each
operation can be a more useful interface in general, but for now just
add it for the methods used by librbd.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Sometimes you don't want flush to block, and can't modify
already scheduled aio_writes. This will be useful for a
librbd async flush interface.
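A non-blocking flush of this kind can be sketched as follows. This is a toy model with hypothetical names (`ToyAioQueue`, `complete`), assuming the flush only needs to wait for writes scheduled before it; it is not the librados implementation.

```python
class ToyAioQueue:
    """Toy model of writes in flight plus non-blocking flushes."""

    def __init__(self):
        self._next_id = 0
        self._pending = set()   # ids of writes not yet completed
        self._flushes = []      # (ids still awaited, callback) pairs

    def aio_write(self) -> int:
        wid = self._next_id
        self._next_id += 1
        self._pending.add(wid)
        return wid

    def aio_flush(self, callback) -> None:
        """Return immediately; run `callback` once earlier writes finish."""
        waiting = set(self._pending)  # snapshot of writes scheduled so far
        if not waiting:
            callback()
            return
        self._flushes.append((waiting, callback))

    def complete(self, wid: int) -> None:
        """Mark one write as done and fire any flush that was waiting on it."""
        self._pending.discard(wid)
        still_waiting = []
        for waiting, callback in self._flushes:
            waiting.discard(wid)
            if waiting:
                still_waiting.append((waiting, callback))
            else:
                callback()
        self._flushes = still_waiting
```

Writes issued after the flush do not delay its callback, and the already scheduled writes are left untouched.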
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
The gather will only have subs if there is something to flush. Remove
the safe variable, which indicates the same thing, and convert the
conditionals that used it to an else branch. Moving gather.activate()
inside the has_subs() check has no effect since activate() does
nothing when there are no subs.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
This removes the last remnants of b5e9995f59. If there's nothing to flush,
immediately call the callback instead of deleting it. Callers were
assuming they were responsible for completing the callback whenever
flush_set() returned true, and always called complete(0) in this
case. Simplify the interface and just do this in flush_set(), so that
it always calls the callback.
Since C_GatherBuilder deletes its finisher if there are no subs,
only set its finisher when subs are present. This way we can still
call ->complete() for the callback.
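The resulting contract, where flush_set() always completes the callback itself, can be sketched with a toy gather. `ToyGather` and this `flush_set` are hypothetical stand-ins, not C_GatherBuilder or the ObjectCacher code; `start_write` is an assumed hook that starts one write and is handed that write's completion function.

```python
class ToyGather:
    """Runs the finisher once every sub has completed (toy model)."""

    def __init__(self):
        self._created = 0
        self._remaining = 0
        self._finisher = None
        self._active = False

    def new_sub(self):
        self._created += 1
        self._remaining += 1

        def sub_done():
            self._remaining -= 1
            self._maybe_finish()

        return sub_done

    def has_subs(self) -> bool:
        return self._created > 0

    def set_finisher(self, fn) -> None:
        self._finisher = fn

    def activate(self) -> None:
        self._active = True
        self._maybe_finish()

    def _maybe_finish(self) -> None:
        if self._active and self._remaining == 0 and self._finisher:
            fn, self._finisher = self._finisher, None
            fn(0)


def flush_set(dirty, start_write, on_finish) -> bool:
    """Return True if there was nothing to flush; on_finish is always called."""
    gather = ToyGather()
    for item in dirty:
        start_write(item, gather.new_sub())
    if gather.has_subs():
        # Only hand the callback to the gather when it has subs.
        gather.set_finisher(on_finish)
        gather.activate()
        return False
    on_finish(0)  # nothing to flush: complete the callback right away
    return True
```

Callers no longer need to call complete(0) themselves when the return value is true.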
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
When the ObjectCacher's writex blocks, it affects the thread requesting
the aio, which can cause starvation for other I/O when used by QEMU.
Preserve the old behavior via a config option in case this has any
bad side-effects, like too much memory usage under heavy write loads.
Fixes: #4091
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Add a callback argument to writex, and a finisher to run the
callbacks. Move the check for dirty+tx > max_dirty into a helper that
can be called from a wrapper around the callbacks from writex, or from
the current place in _wait_for_write().
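The shape of this change can be sketched as a toy cache in which writex returns at once and the callback is released to a finisher queue only while the dirty limit is respected. All names here (`ToyCache`, `over_limit`, `writeback_done`) are hypothetical, and the accounting is deliberately simplified; this is not the ObjectCacher code.

```python
from collections import deque


class ToyCache:
    """Toy model of a callback-based writex with dirty-data throttling."""

    def __init__(self, max_dirty: int):
        self.max_dirty = max_dirty
        self.dirty = 0             # bytes written but not yet flushed
        self.tx = 0                # bytes being written back (always 0 here)
        self.finisher = deque()    # callbacks ready to run
        self._waiting = deque()    # callbacks held back by throttling

    def over_limit(self) -> bool:
        """Shared helper for the dirty+tx > max_dirty check."""
        return self.dirty + self.tx > self.max_dirty

    def writex(self, length: int, on_written) -> None:
        """Record a write and return immediately; on_written runs later."""
        self.dirty += length
        self._waiting.append(on_written)
        self._release()

    def writeback_done(self, length: int) -> None:
        self.dirty -= length  # simplified: flushed bytes become clean
        self._release()

    def _release(self) -> None:
        # Hand callbacks to the finisher only while under the limit.
        while self._waiting and not self.over_limit():
            self.finisher.append(self._waiting.popleft())

    def run_finisher(self) -> None:
        while self.finisher:
            self.finisher.popleft()(0)
```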
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>