For a PG with a huge number of objects, listing all of them at once
is not ideal. Splitting them into small batches that can each be
handled efficiently should instead be the preferred option.
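
The batching idea can be sketched roughly as follows. This is an
illustration only, not the code touched by this commit: the helper
name iter_batches, the batch size and the object names are all
assumptions.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_batches(objects: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield successive lists holding at most batch_size items each."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    it = iter(objects)
    while True:
        # Pull the next chunk without materialising the whole listing.
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


# Hypothetical object names standing in for the objects of one PG.
names = (f"obj-{i}" for i in range(10))
batches = list(iter_batches(names, 4))
```

Each batch can then be processed and released before the next one is
fetched, so memory use is bounded by the batch size rather than by the
PG size.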
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
Allow removing all unprotected snapshots when protected snapshots
exist in the same image.
Fixes: http://tracker.ceph.com/issues/23126
Signed-off-by: songweibin <song.weibin@zte.com.cn>
The mgr balancer module basically computes its optimizations from
snapshots of the OSDMap taken at certain moments. This turns out to be
the culprit of data loss, since in upmap mode it can sometimes produce
bad PG mapping results.
E.g.:
1) original cluster topology:
    -5      2.00000 host host-a
     0  ssd 1.00000     osd.0 up 1.00000 1.00000
     1  ssd 1.00000     osd.1 up 1.00000 1.00000
    -7      2.00000 host host-b
     2  ssd 1.00000     osd.2 up 1.00000 1.00000
     3  ssd 1.00000     osd.3 up 1.00000 1.00000
    -9      2.00000 host host-c
     4  ssd 1.00000     osd.4 up 1.00000 1.00000
     5  ssd 1.00000     osd.5 up 1.00000 1.00000
2) mgr balancer applies optimization for PG 3.f:
pg-upmap-items[3.f : 1->4]
3.f [1 3] + -------------------------> [4 3]
3) osd.3 is marked out, reweighted, etc., so the original CRUSH mapping
of 3.f changes (while pg-upmap-items does not):
pg-upmap-items[3.f : 1->4]
3.f [1 5] + -------------------------> [4 5]
4) we are now mapping PG 3.f to two OSDs (osd.4 & osd.5) on the same
host (host-c).
Fix the above problem by adding a guard procedure that runs before we
finally encode these *unsafe* upmap remappings into the OSDMap.
If any of them turns out to be inappropriate, we can simply cancel it,
since the balancer can still re-calculate and re-generate it later if
necessary.
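
A toy model of such a guard, using the topology from the example above.
This is a simplified sketch and not the actual OSDMap code: the helper
names, the OSD-to-host table and the "no two OSDs on one host" check
are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# OSD id -> host, taken from the example topology in this message.
OSD_HOST: Dict[int, str] = {
    0: "host-a", 1: "host-a",
    2: "host-b", 3: "host-b",
    4: "host-c", 5: "host-c",
}


def apply_upmap_items(raw: List[int], items: List[Tuple[int, int]]) -> List[int]:
    """Replace each 'from' OSD in the raw mapping with its 'to' OSD."""
    remap = dict(items)
    return [remap.get(osd, osd) for osd in raw]


def upmap_is_safe(raw: List[int], items: List[Tuple[int, int]],
                  osd_host: Dict[int, str]) -> bool:
    """True if the remapped PG still uses a distinct host per replica."""
    hosts = [osd_host[osd] for osd in apply_upmap_items(raw, items)]
    return len(set(hosts)) == len(hosts)


def clean_upmaps(raw_by_pg: Dict[str, List[int]],
                 upmaps: Dict[str, List[Tuple[int, int]]],
                 osd_host: Dict[int, str]) -> Dict[str, List[Tuple[int, int]]]:
    """Keep only the upmap entries that are safe for the current raw mapping."""
    return {pg: items for pg, items in upmaps.items()
            if upmap_is_safe(raw_by_pg[pg], items, osd_host)}


upmaps = {"3.f": [(1, 4)]}
before = clean_upmaps({"3.f": [1, 3]}, upmaps, OSD_HOST)  # step 2: [4 3]
after = clean_upmaps({"3.f": [1, 5]}, upmaps, OSD_HOST)   # step 3: [4 5]
```

In step 2 the entry is kept, because [4 3] spans host-c and host-b. In
step 3 it is cancelled, because [4 5] would place both replicas on
host-c.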
Fixes: http://tracker.ceph.com/issues/23118
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
Having the specific specialisations in both the .h and the .cc does
not really fly with Clang:
/home/jenkins/workspace/ceph-master/src/mon/MDSMonitor.cc:79:22: error: template parameter redefines default argument
template <int dblV = 7>
^
/home/jenkins/workspace/ceph-master/src/mon/MDSMonitor.h:76:23: note: previous default template argument defined here
template<int dblV = 7>
^
1 error generated.
Signed-off-by: Willem Jan Withagen <wjw@digiware.nl>
Remove mon_quorum_count and replace it with per-MON quorum status
(mon_quorum_status). Also add mon_metadata metrics.
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
The central change of this commit is that per-daemon metrics are now
managed by first appending the metric (using Metrics.append) to a
staging area. Then the metrics for specific paths (metric names) are
overwritten by the staged metrics (by calling Metrics.reset). This gets
rid of metrics from daemons that are no longer in the cluster. I.e. when
Ceph no longer reports metrics for one OSD daemon (because it was
removed from the cluster) the prometheus module will no longer export
metrics for that daemon.
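
The append-then-reset flow can be pictured with the toy model below. It
only borrows the method names append and reset from the description
above; the class name StagedMetrics, its fields and the metric name
used in the example are illustrative assumptions, not the module's
real implementation.

```python
from collections import defaultdict
from typing import Dict


class StagedMetrics:
    """Toy store: values are staged first, then swapped in on reset()."""

    def __init__(self) -> None:
        # path (metric name) -> {daemon: value}, as currently exported.
        self.exported: Dict[str, Dict[str, float]] = {}
        self._staged: Dict[str, Dict[str, float]] = defaultdict(dict)

    def append(self, path: str, daemon: str, value: float) -> None:
        """Stage one per-daemon value; nothing is exported yet."""
        self._staged[path][daemon] = value

    def reset(self) -> None:
        """Overwrite the exported values with the staged ones."""
        self.exported = {p: dict(v) for p, v in self._staged.items()}
        self._staged = defaultdict(dict)


metrics = StagedMetrics()

# First collection round: both OSDs report.
for daemon in ("osd.0", "osd.1"):
    metrics.append("ceph_osd_up", daemon, 1.0)
metrics.reset()
first_round = dict(metrics.exported["ceph_osd_up"])

# Second round: osd.1 was removed, so only osd.0 is appended.
metrics.append("ceph_osd_up", "osd.0", 1.0)
metrics.reset()
second_round = metrics.exported["ceph_osd_up"]
```

Because reset replaces rather than merges, the entry for osd.1 is gone
after the second round without any explicit delete.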
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
We might do 'ceph osd out <osd.x>' or 'ceph osd crush reweight <osd.x> 0'
for various reasons, and hence can produce 0-weighted OSDs.
Skip those OSDs when trying to calculate PG upmaps so that we won't
hit the *assert* below:
/build/ceph-13.0.1-2232-g64665c7/src/osd/OSDMap.cc: 4179: FAILED assert(target > 0)
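
Why skipping helps can be shown with a small sketch. The real
calculation in OSDMap.cc is not reproduced here: the weight-proportional
target formula and all names below are assumptions chosen only to show
that filtering out 0-weighted OSDs keeps every target positive.

```python
from typing import Dict, List


def positive_weight_osds(weights: Dict[int, float]) -> List[int]:
    """Return the OSD ids whose weight is greater than zero."""
    return [osd for osd in sorted(weights) if weights[osd] > 0]


def target_pg_counts(weights: Dict[int, float], total_pgs: int) -> Dict[int, float]:
    """Weight-proportional PG target per OSD, ignoring 0-weighted OSDs."""
    if total_pgs <= 0:
        raise ValueError("total_pgs must be positive")
    osds = positive_weight_osds(weights)
    total_weight = sum(weights[osd] for osd in osds)
    if total_weight <= 0:
        return {}
    targets = {osd: total_pgs * weights[osd] / total_weight for osd in osds}
    # Holds by construction: every remaining OSD has a positive weight.
    assert all(t > 0 for t in targets.values())
    return targets


# osd.1 was marked out or reweighted to 0 (hypothetical weights).
targets = target_pg_counts({0: 1.0, 1: 0.0, 2: 1.0}, total_pgs=8)
```

Without the filter, osd.1 would get a target of 0 in this model, which
is the kind of value the assert rejects.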
See also:
http://pulpito.ceph.com/xxg-2018-02-28_09:02:53-rados-wip-fix-upmap-distro-basic-smithi/2235497/
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
msg/async: fix the incoming parameter type of EventCenter::process_events()
Reviewed-by: Haomai Wang <haomai@xsky.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
os/bluestore: trim cache every 50ms (instead of 200ms)
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Igor Fedotov <ifedotov@suse.com>