address the regression introduced by e62cfceb
In e62cfceb, we wanted to test the newly introduced TOO_FEW_OSDS
warning, so we increased the number of OSDs to the pool size; if
the number of OSDs is less than the pool size, the monitor sends a
warning message.
But we need to bring all OSDs back if we are expecting a healthy
cluster. In this change, all OSDs are resurrected before
`wait_for_health_ok`.
Signed-off-by: Kefu Chai <kchai@redhat.com>
The changes to the way EC/ReplicatedBackend communicate read
errors had a side effect of making the first eio on the object in
TEST_rados_get_subread_eio_shard_[01] repair itself depending
on the timing of the killed osd recovering. The test should
be improved to actually test that behavior at some point.
Signed-off-by: Samuel Just <sjust@redhat.com>
Use OSD_POOL_PRIORITY_MAX and OSD_POOL_PRIORITY_MIN constants
Scale legacy priorities if they exceed the maximum
Signed-off-by: David Zafman <dzafman@redhat.com>
Case 1: A more recent update exists
Case 2: The first entry in the divergent sequence is a create
Case 3 NOT TESTED - Object currently missing
Case 4: We can rollback all of the entries
Case 5: We cannot rollback at least 1 of the entries
Support starting OSDs even when "noup" is set (don't wait for up).
Move create_ec_pool() to ceph-helpers.sh
Fixes: https://tracker.ceph.com/issues/39162
Signed-off-by: David Zafman <dzafman@redhat.com>
The stop command can be used to force a specified osd daemon to stop;
you don't have to figure out beforehand where it is located.
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
If the user specifies dump-import it will still work, but it isn't
listed that way in the usage.
Fixes: http://tracker.ceph.com/issues/39284
Signed-off-by: David Zafman <dzafman@redhat.com>
The ceph cli tool checks for the presence of the variable, not its value.
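The difference between a presence check and a value check can be sketched
as follows. This is only an illustration: EXAMPLE_FLAG is a hypothetical
variable name, and neither function is taken from the ceph cli tool.

```python
import os


def flag_is_set(name, environ=None):
    """Presence check: true whenever the variable exists, whatever its value."""
    env = os.environ if environ is None else environ
    return name in env


def flag_is_truthy(name, environ=None):
    """Value check: true only for a non-empty value other than "0"."""
    env = os.environ if environ is None else environ
    return env.get(name, "") not in ("", "0")


# EXAMPLE_FLAG is a hypothetical name used only for this sketch.
env = {"EXAMPLE_FLAG": "0"}
print(flag_is_set("EXAMPLE_FLAG", env))     # presence check still fires
print(flag_is_truthy("EXAMPLE_FLAG", env))  # value check does not
```

With a presence check, setting the variable to 0 does not disable the
behavior; the variable has to be unset.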
Fixes: http://tracker.ceph.com/issues/38359
Signed-off-by: Sage Weil <sage@redhat.com>
Change run_osd() to default objectstore bluestore
Use run_osd_filestore() to use the non-default objectstore
Fix inject_eio to handle any objectstore if config prefixed with type
Remaining tests using filestore:
osd-pool-create.sh TEST_pool_create_rep_expected_num_objects
Test filestore directory creation
qa/standalone/osd/osd-dup.sh TEST_filestore_to_bluestore
Obvious
qa/standalone/osd/osd-rep-recov-eio.sh TEST_rep_read_unfound
Requires data digest in object info
qa/standalone/scrub/osd-scrub-repair.sh multiple tests
Erasure code pools append mode for filestore is tested
qa/standalone/special/ceph_objectstore_tool.py
Test code verifies COT by directly examining filestore contents
Fixes: https://tracker.ceph.com/issues/39162
Signed-off-by: David Zafman <dzafman@redhat.com>
Helpers to decide when it is safe to stop a mon, add a mon that is
not started, or remove a mon. (Adding and starting a mon would always
be safe, but it takes time to sync, so it's not really possible to do
quickly.)
Signed-off-by: Sage Weil <sage@redhat.com>
* refs/pull/27169/head:
common/config: parse --default-$option as a default value
Reviewed-by: Sébastien Han <seb@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
Sometimes it is useful to specify an alternative default value for an
option via the command line such that it has a lower priority than the
mon config database, config file, the rest of the command line, or the
environment.
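The intended precedence can be pictured with a small resolver. This is a
sketch, not the common/config implementation: the source names, the
compiled-in default level, and the ordering among the higher-priority
sources are assumptions; only the fact that the command-line default
ranks below the mon config database, config file, rest of the command
line, and environment comes from this message.

```python
# Sources from lowest to highest priority. Only the position of
# "cmdline_default" below the four sources after it is taken from the
# commit message; everything else here is assumed for illustration.
PRIORITY = [
    "compiled_default",
    "cmdline_default",   # value given via --default-$option
    "config_file",
    "mon_config_db",
    "environment",
    "cmdline",
]


def resolve(option, sources):
    """Return the value from the highest-priority source that sets option."""
    value = None
    for level in PRIORITY:
        settings = sources.get(level, {})
        if option in settings:
            value = settings[option]
    return value


# "some_option" is a hypothetical option name.
sources = {
    "compiled_default": {"some_option": 0},
    "cmdline_default": {"some_option": 5},
}
print(resolve("some_option", sources))  # the alternative default wins
sources["config_file"] = {"some_option": 1}
print(resolve("some_option", sources))  # the config file overrides it
```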
Signed-off-by: Sage Weil <sage@redhat.com>
Leave repair pg state on until recovery finishes or a new scrub starts
Fixes: http://tracker.ceph.com/issues/38616
Signed-off-by: David Zafman <dzafman@redhat.com>
Allow auto_repair for replicated bluestore pools
Regular scrub within auto repair parameters will trigger deep scrub
New state failed_repair if PG repair attempt could not fix everything
Set failed_repair if not possible to repair anything
Fixes: http://tracker.ceph.com/issues/38616
Signed-off-by: David Zafman <dzafman@redhat.com>
Fix for argument handling of create_ec_pool()
Always pass a value for allow_overwrites for consistency
Caused by: 3ca750d41d
Signed-off-by: David Zafman <dzafman@redhat.com>
Verify we have the expected behavior for creates and moves that
maintain bucket summation, both with and without the
osd_crush_update_weight_set option enabled.
Signed-off-by: Sage Weil <sage@redhat.com>
- Make the initial weight-set actually consistent (summing)
- Fix the intermediate state so that it reflects a correctly
maintained summation.
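The summation property can be stated as a small check. This is a sketch
under assumptions: the nested-dict tree layout and the integer weights
are made up for illustration and are not the CRUSH data structures.

```python
def is_summing(node):
    """True if every bucket's weight equals the sum of its children's weights.

    A node without children (a leaf or an empty bucket) is trivially fine.
    """
    children = node.get("children")
    if not children:
        return True
    if node["weight"] != sum(child["weight"] for child in children):
        return False
    return all(is_summing(child) for child in children)


# Hypothetical hierarchy: one root, one host, two devices.
host = {"weight": 3, "children": [{"weight": 1}, {"weight": 2}]}
root = {"weight": 3, "children": [host]}
print(is_summing(root))

# Adding a device without updating the ancestors breaks the summation.
host["children"].append({"weight": 4})
print(is_summing(root))
```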
Signed-off-by: Sage Weil <sage@redhat.com>
* refs/pull/26898/head:
osd/PG: invalidate PG if merging with unexpected version
osd,mon: include more pg merge metadata in pg_pool_t
qa/standalone/osd/pg-split-merge.sh: reproduce pg merge problem with empty pgs
osd: add osd_debug_no_{acting_change,purge_strays}
Reviewed-by: Neha Ojha <nojha@redhat.com>
* refs/pull/26894/head:
qa/standalone/erasure-code/test-erasure-code: adjust test to avoid m=0
erasure-code: ensure m >= 1
mon/OSDMonitor: set ec min_size to k + min(1, m - 1)
Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
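The min_size formula from the merged commits can be worked through with a
short sketch. The function name is hypothetical; the formula and the
m >= 1 requirement are the ones named in the commit subjects above.

```python
def ec_min_size(k, m):
    """min_size for an erasure-coded pool: k + min(1, m - 1)."""
    if m < 1:
        raise ValueError("m must be >= 1")
    return k + min(1, m - 1)


for k, m in [(2, 1), (2, 2), (4, 2)]:
    print(k, m, ec_min_size(k, m))
```

With m = 1 the result is k, since there is no spare chunk to require;
for any m >= 2 it is k + 1.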
_DD is k=2 m=0, which we don't allow. Switch it to cDD.
I confess I don't fully understand why this was _DD to begin with, but
I'm pretty sure mapping is there to control the order of results so that
it can be mapped to the CRUSH rule output sanely, and the coding portion
is not relevant to the test.
Signed-off-by: Sage Weil <sage@redhat.com>
If the source or target PG version is 0'0, we may silently take the max
of the source and target and still leave the PG complete. This
specifically can happen with an empty PG, as seen with bug 38655. In
theory we could encounter one of the PGs with some other last_update
that doesn't match what we expect. If that ever happens, make sure the
result is incomplete so that backfill can clean up.
Additionally check that the pool metadata for the last merge matches the
PGs at all. This could mismatch if we have an osdmap gap and are forced
to do some merge without merge info at all... in which case we should
definitely invalidate: there should be newer copies of the PG(s), and we
have no idea whether the PGs we are merging are what we want. If this is
some disaster recovery situation, an operator is always free to use
ceph-objectstore-tool to re-mark a PG complete (at their own peril!).
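The decision described above can be sketched roughly as follows. This is
a simplified model, not the osd/PG code: versions are modeled as
(epoch, version) tuples, the function and parameter names are
hypothetical, and missing merge metadata is represented by None.

```python
def merged_pg_is_complete(source_version, target_version,
                          expected_source, expected_target):
    """Decide whether a merged PG may stay complete.

    expected_* are the versions recorded in the pool's merge metadata,
    or None when no merge info is available (e.g. after an osdmap gap).
    """
    # No merge info at all: we cannot tell whether these are the PGs we
    # want, so invalidate and let backfill clean up.
    if expected_source is None or expected_target is None:
        return False
    # Any unexpected version, including an unexpected 0'0, invalidates.
    return (source_version == expected_source
            and target_version == expected_target)


print(merged_pg_is_complete((12, 34), (12, 40), (12, 34), (12, 40)))
print(merged_pg_is_complete((0, 0), (12, 40), (12, 34), (12, 40)))
print(merged_pg_is_complete((12, 34), (12, 40), None, None))
```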
Fixes: http://tracker.ceph.com/issues/38655
Signed-off-by: Sage Weil <sage@redhat.com>
Bluestore caused grep to crash with "grep: memory exhausted" due to
the size of the "block" storage.
Fixes: http://tracker.ceph.com/issues/38678
Signed-off-by: David Zafman <dzafman@redhat.com>