The one exception to the "immediately readable" rule is collection_list, which
is not readable until the kv transaction is applied. Our choices are
1. Wait for the kv transaction to apply before triggering onreadable (for any
create/remove ops). This wipes away much of the benefit of fully sync onreadable.
2. Add tracking for created/removed objects in BlueStore so that we can
incorporate those into collection_list. This is complex.
3. flush() from collection_list. Unfortunately we don't have osr linked
to Collection, so this doesn't quite work with the current ObjectStore
interface.
4. Require the caller to flush() before list and put a big * next to the
"immediately onreadable" claim. It turns out that because of FileStore,
the OSD already does flush() before collection_list anyway, so this does
not require any actual change... except to store_test tests. (This didn't
affect filestore because store_test is using apply_transaction, which
waits for readable, and on filestore that also implies visible by
collection_list.)
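
The caller-side rule in option 4 can be illustrated with a toy model. This is
only a sketch, not BlueStore code: ToyStore and its members are hypothetical
stand-ins, with "cache" playing the role of immediately readable state and
"kv" the role of the applied kv transactions that listing consults.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy model; none of these names are the real ObjectStore API.
struct ToyStore {
  std::set<std::string> cache;  // updated synchronously: "immediately readable"
  std::set<std::string> kv;     // only updated when pending ops are applied
  std::vector<std::pair<bool, std::string>> pending;  // true = create, false = remove

  void create(const std::string& oid) {
    cache.insert(oid);
    pending.emplace_back(true, oid);
  }
  void remove(const std::string& oid) {
    cache.erase(oid);
    pending.emplace_back(false, oid);
  }
  // Reads are served from the cache, so they see queued ops right away.
  bool exists(const std::string& oid) const { return cache.count(oid) > 0; }

  // Apply everything that is still pending to the kv state.
  void flush() {
    for (const auto& p : pending) {
      if (p.first) kv.insert(p.second);
      else kv.erase(p.second);
    }
    pending.clear();
  }
  // Listing only consults the kv state, so it lags until flush() runs.
  std::vector<std::string> collection_list() const {
    return std::vector<std::string>(kv.begin(), kv.end());
  }
};
```

In this model a caller that creates an object can read it immediately, but
has to call flush() before collection_list() will return it.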
Signed-off-by: Sage Weil <sage@redhat.com>
This doesn't work as implemented. We are doing _txc_finalize_kv() from
queue_transactions, which calls into the freelist and does this verify
code. However, we have no assurance that a previous txc in the sequencer
has applied its changes to the kv store, which means that a simple sequence
like
- write object
- delete object
can trip the verify if the write is still waiting for aio. This currently happens
with ObjectStore/StoreTest.SimpleRemount/2.
Comment out the verify, but leave _verify_range() helper in place in case
we can use it in the future in some other context.
Signed-off-by: Sage Weil <sage@redhat.com>
The parent may go away, so we need to keep our own copy of shard_hint in
OpSequencer to avoid a use-after-free (e.g., when the user drops their
osr and calls OpSequencer::discard()).
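
A minimal sketch of the idea, assuming simplified stand-in types (ToyParent
and ToyOpSequencer are hypothetical, and the real shard_hint is not a plain
integer): the hint is copied at construction time, so reading it later never
dereferences the parent.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Hypothetical stand-in for the user-owned sequencer.
struct ToyParent {
  uint32_t shard_hint = 0;
};

// Hypothetical stand-in for the store-side sequencer.
struct ToyOpSequencer {
  ToyParent* parent;    // may become invalid once the user drops it
  uint32_t shard_hint;  // private copy, taken while the parent is alive

  explicit ToyOpSequencer(ToyParent* p)
    : parent(p), shard_hint(p->shard_hint) {}

  // Called when the user drops the parent; after this, parent must not be used.
  void discard() { parent = nullptr; }

  // Safe after discard(): reads the copy, never the parent.
  uint32_t get_shard_hint() const { return shard_hint; }
};
```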
Signed-off-by: Sage Weil <sage@redhat.com>
If bluestore chooses to, it may try to call sync_finish() from the queueing
call chain (instead of finish() from a Finisher). Allow it for several
Contexts in PG and PrimaryLogPG, including those used for the main IO
path.
We assume here that all Contexts that we bless can complete synchronously
by calling their normal finish() method.
Signed-off-by: Sage Weil <sage@redhat.com>
Bluestore updates are immediately present in the cache and readable. The
only exception is omap updates, but the read methods there block until
they commit, so we can still tell the OSD that they are readable.
Signed-off-by: Sage Weil <sage@redhat.com>
We would get this implicitly with FileStore if we waited for the onreadable
callbacks, but in some cases the OSD has already done that. With BlueStore,
we need to explicitly flush().
Signed-off-by: Sage Weil <sage@redhat.com>
Sometimes we are able to complete a context synchronously, within the same
callchain of the caller who queued it. In this case the locking rules are
usually a bit different for the caller. Add a generic Context hook that
allows this.
If a sync-capable Context implements sync_complete, it can do whatever
it needs to do for this particular event. If it is not implemented, the
generic implementation will return false, and the caller can use the
normal complete() as it normally would have (presumably by calling it
asynchronously, e.g., in a Finisher).
If sync_complete() is implemented, it *must* return true.
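
The contract above can be sketched roughly as follows. This is a simplified
illustration, not the actual Context class: ToyFinisher, complete_or_defer,
AsyncOnly and SyncCapable are hypothetical names, and having the synchronous
path delete the context itself is an assumption of this sketch.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Simplified stand-in for a completion callback.
struct Context {
  virtual ~Context() = default;
  // Normal completion path, typically run later from a Finisher.
  virtual void complete(int r) { finish(r); delete this; }
  // Optional synchronous path. The generic version declines by returning
  // false; an override must handle the event and return true.
  virtual bool sync_complete(int r) { (void)r; return false; }
protected:
  virtual void finish(int r) = 0;
};

// Hypothetical stand-in for a Finisher: it just collects deferred work.
struct ToyFinisher {
  std::vector<std::function<void()>> pending;
  void queue(Context* c, int r) {
    pending.push_back([c, r] { c->complete(r); });
  }
  void drain() {
    for (auto& f : pending) f();
    pending.clear();
  }
};

// Caller-side pattern: try the synchronous hook first, otherwise defer.
// Returns true if the context was completed inline.
inline bool complete_or_defer(Context* c, int r, ToyFinisher& fin) {
  if (c->sync_complete(r)) return true;
  fin.queue(c, r);
  return false;
}

// Does not implement sync_complete, so it is always deferred.
struct AsyncOnly : Context {
  int* out;
  explicit AsyncOnly(int* o) : out(o) {}
  void finish(int r) override { *out = r; }
};

// Implements sync_complete, so it must return true (assumed to free itself).
struct SyncCapable : Context {
  int* out;
  explicit SyncCapable(int* o) : out(o) {}
  void finish(int r) override { *out = r; }
  bool sync_complete(int r) override { finish(r); delete this; return true; }
};
```

With these definitions, a SyncCapable context runs inside
complete_or_defer(), while an AsyncOnly context only runs once the
ToyFinisher is drained.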
Signed-off-by: Sage Weil <sage@redhat.com>
The second _rval list was a dumb idea. A vector of pairs is simpler
and more efficient.
Also, extend support to any container type.
Signed-off-by: Sage Weil <sage@redhat.com>
* software-properties-common should be installed beforehand, otherwise
the `add-apt-repository` command will not be available.
* the update-alternatives command lines were copied from ceph-build, so
the escape characters should be removed.
Signed-off-by: Kefu Chai <kchai@redhat.com>
please note, run-make-check.sh sources install-deps.sh here to import
the $PATH and other environmental variables, which could be changed by
the DTS "enable" script.
Signed-off-by: Kefu Chai <kchai@redhat.com>
this is a follow-up of #19328. we need to get this change into 12.2.3,
so it is better to do the switch somewhere after 12.2.2, which has been
tagged, and before 12.2.3, which is not tagged yet.
please note, this is not targeting master, because i want to make
sure the change number (the <num> in << 12.2.2-<num>) is correct. it
does not hurt if it's not, as long as it is ">> 12.2.2", so the replace
machinery in 12.2.3 works, and it covers the releases where the
ceph-{osdomap,kvstore,monstore}-tool are not moved yet. but why not
make it more correct?
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit cdf49ba664)
The backport didn't make 12.2.2, but it will be in 12.2.3.
Fixes: http://tracker.ceph.com/issues/22319
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit e0c814266f)
Though the "ls" command is explained and its usage shown in the man page,
it is not mentioned in the subcommands list of "ceph osd" in the
beginning.
Signed-off-by: Rishabh Dave <ridave@redhat.com>