mirror of https://github.com/ceph/ceph (synced 2025-02-23 19:17:37 +00:00)
mon/OSDMonitor: share new maps with even non-active osds
OSDs may not be aware that they have been marked down and can stay trapped at an obsolete map in which they are still marked as up:

```
host     osd  down_at  stuck_at
ceph-03    9  e712     e711
ceph-03   13  e700     e699
ceph-03   28  e697     e696
ceph-03   48  e697     e696
ceph-03   52  e707     e704
ceph-03   61  e710     e708
ceph-03   73  e712     e710
ceph-03   77  e708     e707
ceph-05   12  e711     e710
ceph-05   21  e703     e702
ceph-05   24  e700     e699
ceph-05   29  e703     e699
ceph-05   41  e711     e710
ceph-05   53  e711     e710
ceph-05   72  e712     e711
```

Since https://github.com/ceph/ceph/pull/23958, an OSD pings the monitor periodically if it is stuck at __wait_for_healthy__. In the case above, however, the OSDs still consider themselves __active__ and hence miss that fix.

Since these OSDs may still be able to contact the monitors (otherwise there would be no way for them to be marked up again) and keep sending beacons continuously, we can simply get them out of the trap by sharing some new maps with them.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
Signed-off-by: runsisi <runsisi@zte.com.cn>
parent 5a3344f0e5
commit 79f480442f
@@ -3572,6 +3572,11 @@ bool OSDMonitor::prepare_beacon(MonOpRequestRef op)
   if (!src.is_osd() ||
       !osdmap.is_up(from) ||
       beacon->get_orig_source_addrs() != osdmap.get_addrs(from)) {
+    if (src.is_osd() && !osdmap.is_up(from)) {
+      // share some new maps with this guy in case it may not be
+      // aware of its own deadness...
+      send_latest(op, beacon->version+1);
+    }
     dout(1) << " ignoring beacon from non-active osd." << from << dendl;
     return false;
   }
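The decision made in the hunk above can be modelled in isolation. The following is a minimal sketch, not Ceph code: `ToyOsdMap`, `Beacon`, `BeaconResult`, and `handle_beacon` are hypothetical stand-ins invented for illustration. It assumes that `send_latest(op, beacon->version+1)` shares maps starting at the epoch after the one the OSD reported. It needs C++17 for `std::optional`.

```cpp
#include <cstdint>
#include <optional>
#include <set>

// Hypothetical, simplified stand-ins; these are NOT Ceph's real classes.
using epoch_t = std::uint32_t;

struct ToyOsdMap {
  epoch_t epoch = 0;       // latest epoch known to the monitor
  std::set<int> up_osds;   // ids of OSDs currently marked up
  bool is_up(int osd) const { return up_osds.count(osd) > 0; }
};

struct Beacon {
  bool from_osd;     // sender is an OSD entity
  int osd_id;        // id of the sending OSD
  epoch_t version;   // newest map epoch the sender claims to have
  bool addrs_match;  // sender address equals the address recorded in the map
};

struct BeaconResult {
  bool accepted;                      // beacon is processed further
  std::optional<epoch_t> share_from;  // first epoch to send back, if any
};

// Mirrors the control flow of the patched check: a beacon from an OSD that
// the map marks down is still ignored, but the sender is first offered
// newer maps (assumed to start at version + 1) so it can notice that it
// was marked down.
BeaconResult handle_beacon(const ToyOsdMap& map, const Beacon& b) {
  if (!b.from_osd || !map.is_up(b.osd_id) || !b.addrs_match) {
    std::optional<epoch_t> share;
    if (b.from_osd && !map.is_up(b.osd_id)) {
      share = b.version + 1;
    }
    return {false, share};
  }
  return {true, std::nullopt};
}
```

With a monitor at e712 and osd 9 marked down while still reporting e711 (the first row of the table in the commit message), `handle_beacon` rejects the beacon and sets `share_from` to 712. An up OSD whose address does not match is rejected without any maps being shared, which is the same as the behaviour before the patch.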