Commit Graph

5 Commits

Author SHA1 Message Date
Sage Weil
fb915c4805 osd/PG: invalidate PG if merging with unexpected version
If the source or target PG version is 0'0, we may silently take the max
of the source and target and still leave the PG complete.  This
specifically can happen with an empty PG, as seen with bug 38655.  In
theory we could encounter one of the PGs with some other last_update
that doesn't match what we expect.  If that ever happens, make sure the
result is incomplete so that backfill can clean up.

Additionally check that the pool metadata for the last merge matches the
PGs at all.  This could mismatch if we have an osdmap gap and are forced
to do some merge without merge info at all... in which case we should
definitely invalidate: there should be newer copies of the PG(s), and we
have no idea whether the PGs we are merging are what we want.  If this is
some disaster recovery situation, an operator is always free to use
ceph-objectstore-tool to re-mark a PG complete (at their own peril!).

Fixes: http://tracker.ceph.com/issues/38655
Signed-off-by: Sage Weil <sage@redhat.com>
2019-03-12 10:08:46 -05:00
Sage Weil
f978b27d2b qa/standalone/osd/pg-split-merge.sh: reproduce pg merge problem with empty pgs
This reproduces http://tracker.ceph.com/issues/38655

Signed-off-by: Sage Weil <sage@redhat.com>
2019-03-11 17:10:28 -05:00
Sage Weil
01316aa7bd qa/standalone/osd/pg-split-merge: fix import_after_merge_and_gap
This test introduces a map gap.  What *should* happen is that when there is
such a gap, we cannot import.  Previously, the test didn't reliably produce
a map gap at all, and didn't check that import failed--it verified that it
passed.

Fix the test so that it reliably produces a gap *and* reports
min_last_epoch_clean to the mon so we can trim.  Then verify we fail to
import, but can with --force.  But remove the pg again, because if we
force an import with a map gap the osd will refuse to start.

Fixes: http://tracker.ceph.com/issues/38525
Signed-off-by: Sage Weil <sage@redhat.com>
2019-03-03 10:23:27 -06:00
David Zafman
da3c556aa2 test: Make sure kill_daemons failure will be easy to find
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-10-17 16:54:45 -07:00
Sage Weil
26cb966cab ceph-objectstore-tool: import pg at original epoch
- In the jewel era, we fast-forwarded the PG to the OSD's latest epoch
and cleared past_intervals.

- In mimic, as of 2347ecb961, we brought the
PG up to date while updating past_intervals.  (At the same time we removed
the OSD's parallel past_intervals regeneration.)

The problem is that the tool then has to reimplement the past_intervals
update logic, and *also* has to cope with splits and merges.  Splits are
somewhat easier (until now we enable partial import of a PG into a split
child), but merges are not so easy.

This patch changes it so we import the PG and leave the pg_epoch matching
the import file.  The OSD is then responsible for bringing it up to date
with the latest map, and dealing with any intervening splits or merges.

We also adjust the safety check to ensure that we don't collide with
any existing PG, either a child we eventually split into, or a parent
we eventually merge into.

Fixes: http://tracker.ceph.com/issues/35955
Signed-off-by: Sage Weil <sage@redhat.com>
2018-09-20 12:58:00 -05:00