Generate it on cluster creation with the initial monmap. Include it in
the report. Provide no way for this uuid to be fed into the cluster
(intentionally or not) so that it can be assumed to be a truly unique
identifier for the cluster.
Signed-off-by: Sage Weil <sage@redhat.com>
The CollectionIndex constructor is changed to accept the coll_t
so that the collection name can be used to form the access_lock (RWLock)
name. This is needed because otherwise lockdep reports a recursive lock
error and asserts; lockdep needs a unique lock name for each Index object.
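A minimal sketch of the naming idea, not the actual patch. Coll,
NamedRWLock, Index and the name prefix are hypothetical stand-ins for
coll_t, RWLock and the Index classes:

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for coll_t; only the name matters here.
struct Coll {
  std::string name;
  const std::string& to_str() const { return name; }
};

// Hypothetical stand-in for an RWLock that registers its name with lockdep.
struct NamedRWLock {
  std::string name;
  explicit NamedRWLock(const std::string& n) : name(n) {}
};

// Each index derives its lock name from its collection, so two Index
// objects for different collections never share a lockdep name.
class Index {
public:
  explicit Index(const Coll& c)
    : access_lock("CollectionIndex::access_lock::" + c.to_str()) {}
  const std::string& lock_name() const { return access_lock.name; }
private:
  NamedRWLock access_lock;
};
```

With this shape, taking the locks of two different indexes in a row no
longer looks like a recursive acquisition of one named lock.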
Fixes: #9145
Signed-off-by: Somnath Roy <somnath.roy@sandisk.com>
Load the jerasure plugin when ceph-osd starts to avoid the following
scenario:
* ceph-osd-v1 is running but did not load jerasure
* ceph-osd-v2 is being installed but takes time: the files
are installed before ceph-osd is restarted
* ceph-osd-v1 is required to handle an erasure coded placement group and
loads jerasure (the v2 version which is not API compatible)
* ceph-osd-v1 calls into the v2 jerasure plugin, does not find the
code it expects, and crashes
Although this problem shows up in the context of teuthology, it is unlikely
to happen on a real cluster because it involves upgrading immediately
after installing and running an OSD. Once this fix is backported to firefly,
it will not even happen in teuthology tests because the upgrade from
firefly to master will use the firefly version including this fix.
While it would be possible to walk the plugin directory and preload
whatever it contains, that would not work for plugins such as jerasure
that load other plugins depending on the CPU features, or even plugins
such as isa which only work on specific CPUs.
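A rough sketch of preloading from an explicit list of names, as opposed
to walking the plugin directory. Plugin, PluginRegistry and the loader
callback are hypothetical; the callback stands in for whatever reads the
shared library from disk:

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical plugin handle: records which on-disk version was loaded.
struct Plugin {
  std::string version;
};

// Hypothetical registry. `loader` stands in for the code that reads the
// shared library from disk.
class PluginRegistry {
public:
  using Loader = std::function<Plugin(const std::string&)>;
  explicit PluginRegistry(Loader l) : loader(std::move(l)) {}

  // Called once at startup with an explicit list of plugin names.
  void preload(const std::vector<std::string>& names) {
    for (const auto& n : names)
      get(n);
  }

  // Returns the cached plugin, loading from disk only on first use.
  const Plugin& get(const std::string& name) {
    auto it = loaded.find(name);
    if (it == loaded.end())
      it = loaded.emplace(name, loader(name)).first;
    return it->second;
  }

private:
  Loader loader;
  std::map<std::string, Plugin> loaded;
};
```

Because the plugin is cached at startup, replacing the file on disk
afterwards does not change what the running daemon uses.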
http://tracker.ceph.com/issues/9153 Fixes: #9153
Backport: firefly
Signed-off-by: Loic Dachary <loic-201408@dachary.org>
Do not assume that all OSDs are weighted equally for reweight-by-pg.
Note that reweight-by-utilization already reweights based on the size of
the OSD volume; we presume that this is already reflected by the CRUSH
weights.
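A sketch of what weighting by CRUSH weight could look like, assuming the
expected PG count of an OSD is its share of the total weight. OsdLoad
and expected_pgs are hypothetical names, not the monitor code:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-OSD input: CRUSH weight and number of PGs mapped to it.
struct OsdLoad {
  double crush_weight;
  std::size_t pgs;
};

// Expected PG count for each OSD if PGs were spread in proportion to
// CRUSH weight rather than evenly.
std::vector<double> expected_pgs(const std::vector<OsdLoad>& osds) {
  double total_weight = 0.0;
  std::size_t total_pgs = 0;
  for (const auto& o : osds) {
    total_weight += o.crush_weight;
    total_pgs += o.pgs;
  }
  std::vector<double> out;
  out.reserve(osds.size());
  for (const auto& o : osds)
    out.push_back(total_weight > 0.0
                  ? total_pgs * o.crush_weight / total_weight
                  : 0.0);
  return out;
}
```

An OSD with twice the CRUSH weight is then expected to hold twice the
PGs, and is only treated as overloaded relative to that larger share.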
Signed-off-by: Sage Weil <sage@redhat.com>
Allow the reweight-by-pg to look at a specific set of pools. If the list
is omitted, use PGs from all pools. This allows you to focus on a
specific pool (the one that will dominate data usage). Otherwise things
may not be quite right because other pools may have PGs that contain
much less data.
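The pool filter can be sketched as follows, with an empty set meaning
"all pools". PgMapping and pgs_per_osd are hypothetical names used only
for illustration:

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Hypothetical PG record: owning pool id and the OSDs it maps to.
struct PgMapping {
  int pool;
  std::vector<int> osds;
};

// Count PGs per OSD, restricted to `pools`. An empty set means all pools.
std::map<int, int> pgs_per_osd(const std::vector<PgMapping>& pgs,
                               const std::set<int>& pools) {
  std::map<int, int> count;
  for (const auto& pg : pgs) {
    if (!pools.empty() && pools.count(pg.pool) == 0)
      continue;  // PG belongs to a pool that was not requested
    for (int osd : pg.osds)
      ++count[osd];
  }
  return count;
}
```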
Signed-off-by: Sage Weil <sage@redhat.com>
Note when OSDs are underloaded as well. If that is the case, adjust the
OSD reweight value upward if possible. (It won't always be possible since
weights are capped at 1.)
Note that we set the underload threshold to the average, as we want to
aggressively adjust weights up (back to 1.0) whenever possible. This gets
us a more efficient mapping calculation and reduces the amount of "noise"
in the weights.
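A sketch of the adjustment rule under stated assumptions: the new weight
is the old weight scaled by average/load, the overload threshold is
passed in by the caller, and the underload threshold is the average
itself. adjust_reweight is a hypothetical helper, not the monitor code:

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper. `load` and `average` must use the same unit
// (for example PG count); `overload_threshold` is some value >= average.
double adjust_reweight(double weight, double load, double average,
                       double overload_threshold) {
  assert(load > 0.0 && average > 0.0);
  if (load > overload_threshold)
    return weight * average / load;                 // overloaded: scale down
  if (load < average)
    return std::min(1.0, weight * average / load);  // underloaded: scale up, capped at 1
  return weight;                                    // within bounds: leave alone
}
```

For example, an OSD at weight 0.5 carrying half the average load is
raised back to 1.0, while one already at 0.9 with the same load is
capped at 1.0 rather than raised to 1.8.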
Signed-off-by: Sage Weil <sage@redhat.com>
This is just like reweight-by-utilization, but looks purely at the PG to
OSD mapping, not at the number of bytes used on the target disks. This
allows the reweighting to be done before any data is written into the
cluster, when no data will need to migrate as a result of the reweight.
Signed-off-by: Sage Weil <sage@redhat.com>
In practice, the map will remain pinned for a while, but this
will make coverity happy.
*** CID 1231685: Use after free (USE_AFTER_FREE)
/osd/OSD.cc: 6223 in OSD::handle_osd_map(MOSDMap *)()
6217
6218 if (o->test_flag(CEPH_OSDMAP_FULL))
6219 last_marked_full = e;
6220 pinned_maps.push_back(add_map(o));
6221
6222 bufferlist fbl;
>>> CID 1231685: Use after free (USE_AFTER_FREE)
>>> Calling "encode" dereferences freed pointer "o".
6223 o->encode(fbl);
6224
6225 hobject_t fulloid = get_osdmap_pobject_name(e);
6226 t.write(coll_t::META_COLL, fulloid, 0, fbl.length(), fbl);
6227 pin_map_bl(e, fbl);
6228 continue;
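One common way to avoid this class of warning is to finish using the
object before handing it to a cache that may free it. The sketch below
illustrates that ordering only; it is not the actual change, and Map,
MapCache and encode_then_cache are hypothetical stand-ins:

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for OSDMap.
struct Map {
  int epoch;
  std::string encode() const { return "map:" + std::to_string(epoch); }
};

// Hypothetical cache. add() takes ownership and frees the argument when
// an entry for that epoch already exists, which is the kind of hazard
// the report describes.
class MapCache {
public:
  const Map* add(std::unique_ptr<Map> m) {
    int e = m->epoch;
    auto it = maps.find(e);
    if (it == maps.end())
      it = maps.emplace(e, std::move(m)).first;
    return it->second.get();  // a duplicate `m` is destroyed on return
  }
private:
  std::map<int, std::unique_ptr<Map>> maps;
};

// Encode before giving the map away, so nothing touches it afterwards.
std::string encode_then_cache(MapCache& cache, std::unique_ptr<Map> m) {
  std::string bytes = m->encode();
  cache.add(std::move(m));
  return bytes;
}
```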
Signed-off-by: Sage Weil <sage@redhat.com>
DEGRADED means there are objects without complete redundancy; also check
for needs_recovery().
UNDERSIZED means acting set is too small.
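A sketch of how the two states might be derived. The flag names and
pg_flags are hypothetical, and treating an undersized PG as also
degraded is an assumption of this sketch:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical flag bits; the real state constants are not shown here.
enum PgFlag : unsigned {
  FLAG_DEGRADED = 1u << 0,
  FLAG_UNDERSIZED = 1u << 1,
};

unsigned pg_flags(std::size_t acting_size, std::size_t pool_size,
                  bool needs_recovery) {
  unsigned flags = 0;
  if (acting_size < pool_size) {
    // Too few OSDs in the acting set. Assumed here to imply that
    // objects are also short on redundancy.
    flags |= FLAG_UNDERSIZED | FLAG_DEGRADED;
  }
  if (needs_recovery) {
    // Some objects still lack copies, whatever the acting set size.
    flags |= FLAG_DEGRADED;
  }
  return flags;
}
```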
Signed-off-by: Sage Weil <sage@redhat.com>
A degraded object does not have enough replicas or shards, while a
misplaced object is not stored in the correct place. Account for them
separately.
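A sketch of the separate accounting, assuming both totals count copies
rather than objects. ObjectCopies, Totals and account are hypothetical
names with illustrative fields:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-object tally; field names are illustrative only.
struct ObjectCopies {
  std::size_t wanted;    // replicas or shards the pool requires
  std::size_t present;   // copies that exist anywhere
  std::size_t in_place;  // copies stored on the OSDs they belong on
};

struct Totals {
  std::size_t degraded = 0;
  std::size_t misplaced = 0;
};

Totals account(const std::vector<ObjectCopies>& objects) {
  Totals t;
  for (const auto& o : objects) {
    if (o.present < o.wanted)
      t.degraded += o.wanted - o.present;     // copies that are missing
    if (o.in_place < o.present)
      t.misplaced += o.present - o.in_place;  // copies in the wrong place
  }
  return t;
}
```

An object with all of its copies present but some on the wrong OSDs
adds to the misplaced total only, so it no longer inflates the degraded
count.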
Signed-off-by: Sage Weil <sage@redhat.com>