Ensure that a crush file is always compiled deterministically, even though
the default values for *new* maps have changed.
Signed-off-by: Sage Weil <sage@inktank.com>
These ops have already taken their budget in the original op_submit().
It will be returned via put_op_budget() when they complete.
If there were many localized reads of missing objects from replicas,
or cache pool redirects, this would cause the objecter to use up all
of its op throttle budget and hang.
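
A minimal sketch of the budget accounting described above, not the actual
Objecter code. BudgetThrottle, Op, submit, resend and complete are
hypothetical names; only op_submit() and put_op_budget() come from this
message. The point it illustrates is that a resend must not take budget a
second time, because the budget is returned only once, on completion.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the objecter's op throttle.
class BudgetThrottle {
public:
  explicit BudgetThrottle(uint64_t max) : max_(max), used_(0) {}

  // Non-blocking take; returns false if the budget would be exceeded.
  bool take(uint64_t n) {
    if (used_ + n > max_) return false;
    used_ += n;
    return true;
  }

  // Return previously taken budget.
  void put(uint64_t n) {
    assert(n <= used_);
    used_ -= n;
  }

  uint64_t used() const { return used_; }

private:
  uint64_t max_;
  uint64_t used_;
};

// Hypothetical op record.
struct Op {
  uint64_t budget = 0;    // cost of this op
  bool budgeted = false;  // true once budget has been taken
};

// Initial submission: take budget exactly once.
bool submit(BudgetThrottle& t, Op& op) {
  if (!op.budgeted) {
    if (!t.take(op.budget)) return false;
    op.budgeted = true;
  }
  return true;
}

// Resend (for example after a redirect): the op already holds its budget,
// so nothing more is taken here.
void resend(BudgetThrottle&, Op& op) {
  assert(op.budgeted);
}

// Completion: return the budget taken at submit time.
void complete(BudgetThrottle& t, Op& op) {
  if (op.budgeted) {
    t.put(op.budget);
    op.budgeted = false;
  }
}
```

With this shape, any number of resends leaves the amount of budget in use
unchanged, and completion brings it back to zero.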
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Since detach_bucket is a private helper solely used by move_bucket, which
contains another (correct) safeguard, the code cannot be reached and
the problem can never happen. If another function uses detach_bucket,
it may happen.
Signed-off-by: Loic Dachary <loic@dachary.org>
The following was introduced in 2012 by a2d0cff1b0

    // un-set the device name so we can use add_item later
    build_rmap(name_map, name_rmap);
    name_map.erase(id);
    name_rmap.erase(id_name);

when insert_item refused to move a bucket for which a name already
exists. insert_item was changed in 2013 by 4e2557a038 and now supports
it. The TestCrushWrapper unit test for move_bucket passes.
Signed-off-by: Loic Dachary <loic@dachary.org>
We don't want to seal HitSets just because we're writing a
snapshot to disk; it potentially shrinks the in-memory one
we want to keep adding stuff to!
Signed-off-by: Greg Farnum <greg@inktank.com>
Any time we persist a hit_set object, take the opportunity to remove any
old ones that we don't want any more.
Note that this means if the admin decreases the number of objects to track,
we won't remove them until the next time we persist something. We also
don't clean up if the HitSet tracking is disabled entirely.
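
A rough sketch of the trim-on-persist behaviour described above, under
stated assumptions: HitSetArchive, max_count, trim and persist are
hypothetical names, and the archive is modelled as a plain list of object
names, oldest first. It is not the OSD implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical archive of persisted hit_set object names, oldest first.
struct HitSetArchive {
  std::deque<std::string> persisted;
  std::size_t max_count = 0;  // assumed pool setting; 0 means tracking is disabled

  // Remove the oldest entries beyond max_count; returns how many were removed.
  std::size_t trim() {
    std::size_t removed = 0;
    while (persisted.size() > max_count) {
      persisted.pop_front();
      ++removed;
    }
    return removed;
  }

  // Persist a new hit_set and trim in the same step.
  // Trimming happens only here, so lowering max_count has no effect
  // until the next persist, and nothing is cleaned up while disabled.
  std::size_t persist(const std::string& name) {
    if (max_count == 0) return 0;
    persisted.push_back(name);
    return trim();
  }
};
```

Lowering max_count leaves the existing entries in place until persist() is
called again, which matches the caveat noted above.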
Signed-off-by: Sage Weil <sage@inktank.com>
This lets us put PGLS in a compound operation. Nothing does that yet, but
this would allow it.
Despite appearances, this is not a protocol change and does not require
a feature bit for clients: using the osd_ops vector mechanism stores all
the data in the same places as before; it just fills in some of the
already-decoded-but-empty data structures in the MOSDOpReply header.
<Greg note:> We may need a feature bit to let clients know they can send
compound PG ops to OSDs, though? Or maybe we can let it be covered
by supporting hitset ops.
Signed-off-by: Sage Weil <sage@inktank.com>
Add pool properties to control what type of HitSet we want to use, along with
some (mostly generic) parameters.
Signed-off-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Farnum <greg@inktank.com>
Track metadata about the currently accumulating HitSet as well as
previously archived ones in the pg_info_t. This will not scale well for
extremely long histories, but does let us avoid explicitly sharing this
metadata during recovery or other normal update activity.
Signed-off-by: Sage Weil <sage@inktank.com>
Track a set of hash values, either explicitly or using a bloom_filter. Hide
the implementation and allow us to transparently encode and decode.
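
To illustrate the idea of hiding the implementation, here is a small
sketch, not the actual HitSet class. HitSetImpl, ExplicitImpl, BloomImpl
and the toy two-probe filter are all assumptions made for the example, and
encoding/decoding is left out.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <unordered_set>

// Hypothetical backend interface; callers never see which one is in use.
struct HitSetImpl {
  virtual ~HitSetImpl() = default;
  virtual void insert(uint32_t hash) = 0;
  virtual bool contains(uint32_t hash) const = 0;
};

// Explicit backend: stores every hash, so answers are exact.
struct ExplicitImpl : HitSetImpl {
  std::unordered_set<uint32_t> hashes;
  void insert(uint32_t h) override { hashes.insert(h); }
  bool contains(uint32_t h) const override { return hashes.count(h) != 0; }
};

constexpr std::size_t kBloomBits = 1024;  // arbitrary size for the toy filter

// Toy Bloom filter backend: may report false positives, never false negatives.
struct BloomImpl : HitSetImpl {
  std::bitset<kBloomBits> bits;
  static std::size_t probe1(uint32_t h) { return h % kBloomBits; }
  static std::size_t probe2(uint32_t h) {
    return ((h * 2654435761u) >> 16) % kBloomBits;
  }
  void insert(uint32_t h) override {
    bits.set(probe1(h));
    bits.set(probe2(h));
  }
  bool contains(uint32_t h) const override {
    return bits.test(probe1(h)) && bits.test(probe2(h));
  }
};

// Wrapper that picks a backend at construction and hides it afterwards.
class HitSet {
public:
  enum class Type { Explicit, Bloom };

  explicit HitSet(Type t) {
    if (t == Type::Explicit)
      impl_.reset(new ExplicitImpl);
    else
      impl_.reset(new BloomImpl);
  }

  void insert(uint32_t h) { impl_->insert(h); }
  bool contains(uint32_t h) const { return impl_->contains(h); }

private:
  std::unique_ptr<HitSetImpl> impl_;
};
```

Callers use insert() and contains() the same way for either type; only the
accuracy and memory trade-off differs.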
Signed-off-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Farnum <greg@inktank.com>
This makes it easier to create repops correctly, and should help
prevent bugs like the one we remove here in process_copy_op (we were
serializing on the wrong object!)
Signed-off-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Farnum <greg@inktank.com>