If we have to fabricate a merge target, we need to prime any future splits
it might have. Otherwise a sequence like
- e100 1.f merge to 1.7
- e110 1.7 split to 1.f, 1.17, 1.1f
where we process all of the above in one go at, say, e120, will lead to
a crash in register_and_wake_split_child because 1.17 and/or 1.1f aren't
primed.
Fix by making identify_splits_and_merges do a recursive scan on any
merge/split participants detected too.
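
The scenario above can be sketched in a few lines of Python. This is only
an illustration, not the OSD code: the function names, the
`pg_num_history` list, and the starting epoch e90 are hypothetical, and
pg_num is assumed to be a power of two at every step so that a plain bit
mask can stand in for the real seed mapping.

```python
def split_children(seed, old_pg_num, new_pg_num):
    """Seeds that split off from `seed` when pg_num grows (powers of two assumed)."""
    mask = old_pg_num - 1
    return [s for s in range(old_pg_num, new_pg_num) if (s & mask) == seed]


def merge_target(seed, new_pg_num):
    """Seed that `seed` folds into when pg_num shrinks (powers of two assumed)."""
    return seed & (new_pg_num - 1)


def pgs_to_prime(seed, pg_num_history):
    """Collect every PG seed reachable from `seed` through later merges and splits.

    `pg_num_history` is a list of (epoch, pg_num) pairs in epoch order; the
    first entry describes the state in which `seed` exists.
    """
    to_prime = set()
    pending = [(seed, 0)]  # (seed, index of the history entry where it exists)
    while pending:
        s, start = pending.pop()
        for j in range(start + 1, len(pg_num_history)):
            old = pg_num_history[j - 1][1]
            new = pg_num_history[j][1]
            if new > old and s < old:
                # Split: each new child must be scanned for later changes too.
                for child in split_children(s, old, new):
                    if child not in to_prime:
                        to_prime.add(child)
                        pending.append((child, j))
            elif new < old and s >= new:
                # Merge: this seed disappears, so continue from the target.
                target = merge_target(s, new)
                if target not in to_prime:
                    to_prime.add(target)
                    pending.append((target, j))
                break
    return to_prime


# 1.f exists at e90 (pg_num 16), merges at e100 (pg_num 8), splits at e110 (pg_num 32).
history = [(90, 16), (100, 8), (110, 32)]
print([hex(s) for s in sorted(pgs_to_prime(0xF, history))])
# ['0x7', '0xf', '0x17', '0x1f']
```

Scanning only 1.f itself finds the merge target 1.7 and nothing else; it is
the follow-up scan of 1.7 that finds 1.17 and 1.1f.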
Fixes: http://tracker.ceph.com/issues/38483
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
otherwise we will fail to install the build dependencies of
`lazy-object-proxy` from the wheelhouse, as `lazy-object-proxy` does not
list `setuptools_scm` in its `setup.py`; instead it lists
`setuptools_scm` in `setup.cfg` and `pyproject.toml` as a `build-system`
requirement. Unfortunately, `pip download` only downloads the
install/run-time dependencies at this moment, and `lazy-object-proxy`
does not offer a binary package for at least python2.7.
Ideally, `pip download` should collect its dependencies like
Collecting setuptools_scm>=3.3.1 (from lazy-object-proxy->astroid<3,>=2.2.0->pylint->-r requirements-lint.txt (line 1))
so we need to use `pip wheel` to download build-time dependencies.
See also https://github.com/pypa/pip/issues/6222
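
A minimal sketch of the two steps, assuming a wheel directory named
`wheelhouse` (the helper names and the directory name are illustrative
and not taken from the build scripts). The helpers only assemble the pip
command lines; running them is left to the caller.

```python
import sys


def pip_wheel_command(requirements, wheel_dir):
    """Command that builds wheels for a requirements file into wheel_dir."""
    return [sys.executable, "-m", "pip", "wheel",
            "--wheel-dir", wheel_dir, "-r", requirements]


def pip_offline_install_command(requirements, wheel_dir):
    """Command that installs from wheel_dir only, without contacting an index."""
    return [sys.executable, "-m", "pip", "install",
            "--no-index", "--find-links", wheel_dir, "-r", requirements]


# Print the commands instead of executing them.
print(" ".join(pip_wheel_command("requirements-lint.txt", "wheelhouse")))
print(" ".join(pip_offline_install_command("requirements-lint.txt", "wheelhouse")))
```

Either list can be handed to `subprocess.check_call()` to run it.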
Signed-off-by: Kefu Chai <kchai@redhat.com>
common/condition_variable_debug: do not assert() if sloppy
Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Willem Jan Withagen <wjw@digiware.nl>
mgr/BaseMgrModule: use PyInt_Check() to be compatible with py2
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Sebastian Wagner <swagner@suse.com>
Reviewed-by: Volker Theile <vtheile@suse.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Abhishek Lekshmanan <abhishek@suse.com>
For unfound objects, we might append LOST_REVERT log entries,
which shall allow these objects to be reverted to the newest
available version later.
However, we currently lack support for rewinding the clean_regions
portion as well when marking unfound objects as lost with inc-recovery
mode enabled. Hence we must mark these unfound-revert objects as fully
dirty to make sure they can be correctly recovered.
E.g.:
- primary is pulling object A from replica 1
- object A is corrupted on replica 1
- object A is now unfound
- mark object A as lost, replica 1 will persist a wrong
missing item for object A.
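
To make the effect of "fully dirty" concrete, here is a small sketch. The
`CleanRegions` class below is a hypothetical stand-in, not the OSD data
structure: it tracks byte ranges known to be unchanged and derives what
recovery has to copy.

```python
from dataclasses import dataclass, field


@dataclass
class CleanRegions:
    """Hypothetical stand-in: (offset, length) ranges known to be unchanged."""
    clean: list = field(default_factory=list)

    def mark_fully_dirty(self):
        # Forget everything we believed to be clean.
        self.clean = []

    def ranges_to_recover(self, object_size):
        """Return the (offset, length) ranges that recovery must copy."""
        dirty = []
        pos = 0
        for off, length in sorted(self.clean):
            if off > pos:
                dirty.append((pos, off - pos))
            pos = max(pos, off + length)
        if pos < object_size:
            dirty.append((pos, object_size - pos))
        return dirty


regions = CleanRegions([(0, 4096), (8192, 4096)])
print(regions.ranges_to_recover(16384))  # [(4096, 4096), (12288, 4096)]

# A LOST_REVERT entry cannot rewind the clean ranges, so drop them all.
regions.mark_fully_dirty()
print(regions.ranges_to_recover(16384))  # [(0, 16384)]
```

Once the clean ranges are dropped, recovery copies the whole object and no
longer depends on region bookkeeping that may be wrong.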
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
There is no consumer.
Actually, I think this field is only meaningful as an indicator of
whether we should initiate an inc-recovery or not.
If not, we fall back to triggering a full recovery instead.
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
In general we build the missing set (and hence clean_regions)
based on the pg log. However, there are currently still 5 cases where we
might call missing.add to add a new pg_missing_item to the missing set
explicitly (or replace an existing pg_missing_item entirely):
1. we explicitly build the missing set on startup, in which case
we know we are trying to be compatible with pre-kraken versions,
so it should be ok for us to disable inc-recovery.
2. we are currently processing the authoritative log, and some
divergent objects are detected. For simplicity (and correctness),
we should disable inc-recovery entirely for these objects.
3. we are rebuilding the missing set, e.g., due to the global
CEPH_OSDMAP_RECOVERY_DELETES policy changing.
In this case we know we are at the end of upgrading from a
previous version that lacks CEPH_OSDMAP_RECOVERY_DELETES support.
Hence disabling inc-recovery at the same time is the recommended
option, since these objects should lack inc-recovery support too.
4. we are adding or re-adding a missing object to the primary's
missing_loc. It doesn't matter whether we have a correct clean_regions
there since we never refer to that field from missing_loc
when we start to perform object recovery later.
5. we are auto-repairing a corrupted object and hence need to add
it to the corresponding missing set first in order to leverage
the existing recovery procedure. In this case, we always disable
inc-recovery to make sure this object can be fully (and correctly)
recovered later.
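
The rule shared by all five cases can be sketched as follows. `MissingSet`
and its method names are simplified, hypothetical stand-ins rather than the
real pg_missing_set interface: entries derived from the log keep their
dirty ranges, while an explicit add always produces a fully dirty entry.

```python
class MissingSet:
    """Hypothetical, simplified missing set keyed by object id."""

    def __init__(self):
        self.items = {}

    def add_from_log(self, oid, need, dirty_ranges):
        # Log-driven path: accumulate the ranges touched by each log entry.
        item = self.items.setdefault(
            oid, {"need": need, "dirty": [], "fully_dirty": False})
        item["need"] = need
        if not item["fully_dirty"]:
            item["dirty"].extend(dirty_ranges)

    def add(self, oid, need):
        # Explicit path: no trustworthy region info, so disable inc-recovery.
        self.items[oid] = {"need": need, "dirty": [], "fully_dirty": True}

    def can_recover_incrementally(self, oid):
        return not self.items[oid]["fully_dirty"]


missing = MissingSet()
missing.add_from_log("obj_a", need=12, dirty_ranges=[(0, 4096)])
missing.add("obj_b", need=7)
print(missing.can_recover_incrementally("obj_a"))  # True
print(missing.can_recover_incrementally("obj_b"))  # False
```

Replacing an existing entry through `add` also resets it to fully dirty,
which matches the "replace an existing pg_missing_item entirely" case above.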
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
* refs/pull/29760/head:
mgr/volumes: cleanup FS subvolume or subvolume group path
mgr/volumes: give useful error message
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Do this outside the standard tick interval as it needs to be driven more
frequently to keep up with client workloads that generate a lot of
capabilities.
Fixes: https://tracker.ceph.com/issues/41141
Fixes: https://tracker.ceph.com/issues/41140
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>