When we use an unordered_map the encoding order is non-deterministic,
which is problematic for OSDMap. Construct an ordered map<> on encode
and use that. This lets us keep the hash table for lookups in the general
case.
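A minimal sketch of the idea, not the actual OSDMap code: encode_sorted()
and its text-based wire format are hypothetical stand-ins for the real
encoder. Copying the hash table into a std::map at encode time should yield
the same bytes regardless of insertion or hash order.

```cpp
#include <cstdint>
#include <map>
#include <sstream>
#include <string>
#include <unordered_map>

// Hypothetical helper: encode an unordered_map deterministically by
// copying it into an ordered std::map first. The text format used here
// is only a stand-in for a real binary encoder.
std::string encode_sorted(const std::unordered_map<int32_t, std::string>& m) {
  // std::map iterates in key order, independent of hash or insertion order.
  std::map<int32_t, std::string> ordered(m.begin(), m.end());
  std::ostringstream out;
  out << ordered.size() << ';';
  for (const auto& kv : ordered)
    out << kv.first << '=' << kv.second << ';';
  return out.str();
}
```

The unordered_map itself is left untouched, so lookups keep their hash-table
cost; the ordered copy is only paid for on encode.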
Fixes: #9211
Backport: firefly
Signed-off-by: Sage Weil <sage@redhat.com>
I had changed the implementation in Objecter
to avoid a spurious get/put cycle in "osdc/Objecter: fix resource
management", but this guy was still doing a get() before
calling handle_osd_map.
Signed-off-by: John Spray <john.spray@redhat.com>
If a notify operation times out (not all watchers ACK in time), include
an ETIMEDOUT in the final error message back to the client, so that they
know about it.
Signed-off-by: Sage Weil <sage@redhat.com>
Instead of taking a pointer to an existing OSDMap in our constructor,
allocate our own, so that we completely own it.
Signed-off-by: Sage Weil <sage@redhat.com>
Separate objecter initialization into non-cluster-related work (e.g.,
internal data structures, other registrations) and operations that
can initiate cluster interaction. This avoids a rare race where we
get called indirectly from one of the dispatcher callbacks, e.g., into
handle_osd_map(), before we are fully initialized.
This requires that objecter->init() be called before
messenger->add_dispatcher_head(), and objecter->start() after it.
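The required ordering can be sketched as follows. SketchObjecter and
SketchMessenger are hypothetical stand-ins, not the real Objecter or
Messenger classes; only the init()/add_dispatcher_head()/start() call order
is taken from the description above.

```cpp
#include <vector>

// Hypothetical stand-in for the objecter, reduced to the two phases.
struct SketchObjecter {
  bool initialized = false;
  bool started = false;
  int maps_handled = 0;

  // Phase 1: local setup only; nothing here talks to the cluster.
  void init() { initialized = true; }

  // Phase 2: from here on we may initiate cluster interaction.
  void start() { started = true; }

  // Dispatcher callback; returns false if it arrives before init().
  bool handle_osd_map() {
    if (!initialized) return false;
    ++maps_handled;
    return true;
  }
};

// Hypothetical stand-in for the messenger's dispatcher list.
struct SketchMessenger {
  std::vector<SketchObjecter*> dispatchers;

  void add_dispatcher_head(SketchObjecter* o) {
    dispatchers.insert(dispatchers.begin(), o);
  }

  // Simulate an incoming map message delivered to every dispatcher.
  bool deliver_map() {
    bool ok = true;
    for (auto* d : dispatchers) ok = d->handle_osd_map() && ok;
    return ok;
  }
};
```

Calling init(), then add_dispatcher_head(), then start() means a map that
arrives as soon as the dispatcher is registered already finds the internal
state set up.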
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Callers don't need to hold the lock before calling _dump_op_descriptor(),
so it is renamed to _dump_op_descriptor_unlocked() to reflect this.
Signed-off-by: Pavan Rallabhandi <pavan.rallabhandi@sandisk.com>
Signed-off-by: Somnath Roy <somnath.roy@sandisk.com>
In practice, the map will remain pinned for a while, but this
will make coverity happy.
*** CID 1231685: Use after free (USE_AFTER_FREE)
/osd/OSD.cc: 6223 in OSD::handle_osd_map(MOSDMap *)()
6217
6218 if (o->test_flag(CEPH_OSDMAP_FULL))
6219 last_marked_full = e;
6220 pinned_maps.push_back(add_map(o));
6221
6222 bufferlist fbl;
>>> CID 1231685: Use after free (USE_AFTER_FREE)
>>> Calling "encode" dereferences freed pointer "o".
6223 o->encode(fbl);
6224
6225 hobject_t fulloid = get_osdmap_pobject_name(e);
6226 t.write(coll_t::META_COLL, fulloid, 0, fbl.length(), fbl);
6227 pin_map_bl(e, fbl);
6228 continue;
Signed-off-by: Sage Weil <sage@redhat.com>
DEGRADED means there are objects without complete redundancy; also check
for needs_recovery().
UNDERSIZED means acting set is too small.
Signed-off-by: Sage Weil <sage@redhat.com>
A degraded object does not have enough replicas or shards, while a
misplaced object is not stored in the correct place. Account for them
separately.
Signed-off-by: Sage Weil <sage@redhat.com>
Otherwise, an objecter callback might still be hanging
onto this reference until after the flush.
Fixes: #8894
Introduced: 589b639af7
Signed-off-by: Samuel Just <sam.just@inktank.com>
Often there will be a CRUSH rule present for erasure coding that uses the
new CRUSH steps or indep mode. If these rules are not referenced by any
pool, we do not need clients to support the mapping behavior. This is true
because the encoding has not changed; only the expected CRUSH output.
Fixes: #8963
Backport: firefly
Signed-off-by: Sage Weil <sage@redhat.com>
If the remap vector is not empty, use it to figure out the sequence of
data chunks.
http://tracker.ceph.com/issues/9025
Fixes: #9025
Signed-off-by: Loic Dachary <loic@dachary.org>
The LRU now handles you attempting to insert multiple values for the
same key, by telling you that you've done so and returning the
existing value before it manages to muck up existing data.
The 'existed' parameter is optional; its default value is NULL.
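A minimal sketch of the behaviour described above. SketchLRU and its add()
signature are hypothetical and much simpler than the real class; the
optional 'existed' out-parameter and the "return the existing value"
semantics follow the description.

```cpp
#include <cstddef>
#include <list>
#include <unordered_map>
#include <utility>

// Hypothetical simplified LRU; not the real Ceph class.
template <typename K, typename V>
class SketchLRU {
  std::size_t max_size;
  std::list<std::pair<K, V>> items;  // front = most recently used
  std::unordered_map<K, typename std::list<std::pair<K, V>>::iterator> index;

 public:
  explicit SketchLRU(std::size_t n) : max_size(n) {}

  // Insert value under key. If key is already present, leave the stored
  // value untouched, set *existed (when provided) to true, and return the
  // existing value instead of overwriting it.
  V add(const K& key, const V& value, bool* existed = nullptr) {
    auto it = index.find(key);
    if (it != index.end()) {
      if (existed) *existed = true;
      items.splice(items.begin(), items, it->second);  // mark as recent
      return it->second->second;
    }
    if (existed) *existed = false;
    items.emplace_front(key, value);
    index[key] = items.begin();
    if (items.size() > max_size) {  // evict the least recently used entry
      index.erase(items.back().first);
      items.pop_back();
    }
    return value;
  }

  std::size_t size() const { return items.size(); }
};
```

Callers that do not care about duplicates can omit the third argument;
callers that do can pass a bool and check it after the call.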
Signed-off-by: Greg Farnum <greg@inktank.com>
Signed-off-by: Somnath Roy <somnath.roy@sandisk.com>