Do not assume that all OSDs are weighted equally for reweight-by-pg.
Note that reweight-by-utilization already reweights based on the size of
the OSD volume; we presume that this is already reflected by the CRUSH
weights.
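
As a rough illustration of the point above, the target PG count for each
OSD can be derived from its CRUSH weight instead of a flat average. This is
a minimal sketch, not the monitor code: the function name and the plain
std::map inputs are assumptions made for the example.

```cpp
#include <cassert>
#include <map>

// Hypothetical helper (not a Ceph API): given each OSD's CRUSH weight and
// the total number of PG-to-OSD mappings, return the PG count each OSD
// would hold if PGs were spread in proportion to CRUSH weight.
std::map<int, double> expected_pgs_by_crush_weight(
    const std::map<int, double>& crush_weight, unsigned total_pg_mappings)
{
  double weight_sum = 0.0;
  for (const auto& kv : crush_weight)
    weight_sum += kv.second;
  assert(weight_sum > 0.0);  // at least one OSD must carry weight

  std::map<int, double> expected;
  for (const auto& kv : crush_weight)
    expected[kv.first] = total_pg_mappings * (kv.second / weight_sum);
  return expected;
}
```

With weights 1, 2 and 1 and 400 mappings, the middle OSD is expected to
hold twice as many PGs as the other two, so it should not be flagged as
overloaded merely for exceeding the unweighted mean.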
Signed-off-by: Sage Weil <sage@redhat.com>
Allow the reweight-by-pg to look at a specific set of pools. If the list
is omitted, use PGs from all pools. This allows you to focus on a
specific pool (the one that will dominate data usage). Otherwise things
may not be quite right because other pools may have PGs that contain
much less data.
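
The pool filter described above could be sketched roughly as follows. The
PgMapping struct and the count_pgs_per_osd function are hypothetical names
chosen for the example; an empty pool set stands in for "list omitted".

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <vector>

// Hypothetical, simplified view of one PG: its pool and the OSDs it maps to.
struct PgMapping {
  int64_t pool;
  std::vector<int> osds;
};

// Count PGs per OSD. An empty `pools` set means "consider every pool";
// otherwise only PGs belonging to the listed pools are counted.
std::map<int, unsigned> count_pgs_per_osd(const std::vector<PgMapping>& pgs,
                                          const std::set<int64_t>& pools)
{
  std::map<int, unsigned> counts;
  for (const auto& pg : pgs) {
    if (!pools.empty() && pools.count(pg.pool) == 0)
      continue;  // PG belongs to a pool the caller did not ask about
    for (int osd : pg.osds) {
      assert(osd >= 0);  // OSD ids are non-negative in this sketch
      ++counts[osd];
    }
  }
  return counts;
}
```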
Signed-off-by: Sage Weil <sage@redhat.com>
Note when OSDs are underloaded, as well. If that is the case, adjust the
OSD reweight value, if possible. (It won't always be possible since
weights are capped at 1.)
Note that we set the underload threshold to the average, as we want to
aggressively adjust weights up (back to 1.0) whenever possible. This gets
us a more efficient mapping calculation and reduces the amount of "noise"
in the weights.
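
A minimal sketch of the upward adjustment, assuming the new weight is the
old one scaled by average/load and then capped at 1.0. The function name
and the use of doubles are assumptions for the example, not the monitor's
actual representation.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: return the new reweight value for an OSD.
// An OSD counts as underloaded as soon as its load is below the average,
// and its weight is pushed back up toward 1.0, never beyond it.
double adjust_underloaded_weight(double current_weight, double load,
                                 double average_load)
{
  assert(load > 0.0 && average_load > 0.0);
  if (load >= average_load)
    return current_weight;  // not underloaded; leave the weight alone
  double scaled = current_weight * (average_load / load);
  return std::min(1.0, scaled);  // weights are capped at 1
}
```

An OSD already at weight 1.0 stays at 1.0 however underloaded it is, which
is the case where no adjustment is possible.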
Signed-off-by: Sage Weil <sage@redhat.com>
This is just like reweight-by-utilization, but looks purely at the PG to
OSD mapping, not at the number of bytes used on the target disks. This
allows the reweighting to be done before any data is written into the
cluster, when no data will need to migrate as a result of the reweight.
Signed-off-by: Sage Weil <sage@redhat.com>
In practice, the map will remain pinned for a while, but this
will make Coverity happy.
*** CID 1231685: Use after free (USE_AFTER_FREE)
/osd/OSD.cc: 6223 in OSD::handle_osd_map(MOSDMap *)()
6217
6218 if (o->test_flag(CEPH_OSDMAP_FULL))
6219 last_marked_full = e;
6220 pinned_maps.push_back(add_map(o));
6221
6222 bufferlist fbl;
>>> CID 1231685: Use after free (USE_AFTER_FREE)
>>> Calling "encode" dereferences freed pointer "o".
6223 o->encode(fbl);
6224
6225 hobject_t fulloid = get_osdmap_pobject_name(e);
6226 t.write(coll_t::META_COLL, fulloid, 0, fbl.length(), fbl);
6227 pin_map_bl(e, fbl);
6228 continue;
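
One way to avoid the flagged pattern is to finish every read through the
raw pointer before handing it to a call that may free it. The sketch below
illustrates that ordering with hypothetical stand-ins (Map, MapCache,
store_map); it is not the OSD code and does not claim to be the exact
change made here.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for an OSDMap.
struct Map {
  unsigned epoch;
  std::string encode() const { return "epoch:" + std::to_string(epoch); }
};

// Hypothetical cache that takes ownership of the pointer passed to add().
// If the epoch is already cached, the new object is deleted and the cached
// one is returned, so the caller's raw pointer may dangle afterwards.
class MapCache {
  std::map<unsigned, std::shared_ptr<const Map>> maps_;
public:
  std::shared_ptr<const Map> add(Map* o) {
    assert(o != nullptr);
    auto it = maps_.find(o->epoch);
    if (it != maps_.end()) {
      delete o;
      return it->second;
    }
    std::shared_ptr<const Map> ref(o);
    maps_[ref->epoch] = ref;
    return ref;
  }
};

// Encode first, then transfer ownership; `o` is not touched after add().
std::string store_map(MapCache& cache, Map* o,
                      std::shared_ptr<const Map>& pinned)
{
  std::string encoded = o->encode();
  pinned = cache.add(o);
  return encoded;
}
```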
Signed-off-by: Sage Weil <sage@redhat.com>
DEGRADED means there are objects without complete redundancy; also check
for needs_recovery().
UNDERSIZED means acting set is too small.
Signed-off-by: Sage Weil <sage@redhat.com>
A degraded object does not have enough replicas or shards, while a
misplaced object is not stored in the correct place. Account for them
separately.
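
The distinction can be sketched per object as two independent counters.
This is a simplified illustration only: the struct, the function, and the
idea of comparing a set of target OSDs against the OSDs currently holding
a copy are assumptions for the example, not the PG stats code.

```cpp
#include <cassert>
#include <set>

// Hypothetical per-object tally.
struct ObjectAccounting {
  unsigned degraded_copies;   // copies that do not exist anywhere
  unsigned misplaced_copies;  // copies that exist, but on the wrong OSD
};

ObjectAccounting account_object(unsigned wanted_copies,
                                const std::set<int>& target_osds,
                                const std::set<int>& holding_osds)
{
  assert(wanted_copies > 0);
  ObjectAccounting a = {0, 0};
  unsigned have = static_cast<unsigned>(holding_osds.size());
  if (have < wanted_copies)
    a.degraded_copies = wanted_copies - have;  // too few replicas or shards
  for (int osd : holding_osds) {
    if (target_osds.count(osd) == 0)
      ++a.misplaced_copies;  // data is safe, just not where it belongs
  }
  return a;
}
```

An object with all three copies present, one of them on a non-target OSD,
is misplaced but not degraded; folding it into the degraded count would
overstate the risk to the data.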
Signed-off-by: Sage Weil <sage@redhat.com>
This causes build failures in the latest Fedora builds. ceph_test_librbd_fsx
adds the -Wno-format cflag, but the default AM_CFLAGS already contain
-Werror=format-security. Previous releases tolerated this combination, but
the latest Fedora Rawhide no longer does. ceph_test_librbd_fsx builds fine
without -Wno-format on x86_64, so the flag is likely no longer needed.
Signed-off-by: Boris Ranto <branto@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>