rgw: temporarily disable calls to defer_gc() in RGWGetObj

cls_rgw_gc_queue_update_entry() is known to cause data loss when called
on objects that have not actually been scheduled for garbage collection.

RGWGetObj is the only caller, and uses defer_gc() when reads are taking
a long time compared to rgw_gc_obj_min_wait. If an object has since been
deleted and submitted for garbage collection, this allows RGWGetObj to
defer that gc until the entire read completes.

With these calls to defer_gc() disabled, very long reads (longer than
1hr with the default configuration) may fail if the object is deleted
mid-read. A retry then returns a 404 Not Found error, as expected.

Fixes: https://tracker.ceph.com/issues/47866

Signed-off-by: Casey Bodley <cbodley@redhat.com>
Casey Bodley 2020-11-23 18:06:26 -05:00
parent c569a302be
commit 94df9cd37a

@@ -2017,16 +2017,8 @@ int RGWGetObj::handle_slo_manifest(bufferlist& bl, optional_yield y)
 int RGWGetObj::get_data_cb(bufferlist& bl, off_t bl_ofs, off_t bl_len)
 {
-  /* garbage collection related handling */
-  utime_t start_time = ceph_clock_now();
-  if (start_time > gc_invalidate_time) {
-    int r = store->defer_gc(s->obj_ctx, s->bucket.get(), s->object.get(), s->yield);
-    if (r < 0) {
-      ldpp_dout(this, 0) << "WARNING: could not defer gc entry for obj" << dendl;
-    }
-    gc_invalidate_time = start_time;
-    gc_invalidate_time += (s->cct->_conf->rgw_gc_obj_min_wait / 2);
-  }
+  /* garbage collection related handling:
+   * defer_gc disabled for https://tracker.ceph.com/issues/47866 */
   return send_response_data(bl, bl_ofs, bl_len);
 }
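
For readers unfamiliar with the removed block, its timing logic can be modeled in isolation. The following is only a sketch, not Ceph code: `DeferGcTimer`, `on_chunk`, and `deferrals` are hypothetical names, plain `std::chrono::seconds` values stand in for `utime_t`, and the initial deadline of "start plus half of `rgw_gc_obj_min_wait`" is an assumption, since the initialization of `gc_invalidate_time` is not part of this diff.

```cpp
#include <chrono>

// Hypothetical model of the removed logic; not the real RGWGetObj.
struct DeferGcTimer {
  std::chrono::seconds min_wait;       // stands in for rgw_gc_obj_min_wait
  std::chrono::seconds next_deferral;  // stands in for gc_invalidate_time
  int deferrals = 0;                   // times defer_gc() would have run

  // Assumption: the first deadline is start + min_wait / 2.
  DeferGcTimer(std::chrono::seconds start, std::chrono::seconds wait)
    : min_wait(wait), next_deferral(start + wait / 2) {}

  // Called once per data chunk, as get_data_cb() is. Returns true where
  // the removed code would have called defer_gc().
  bool on_chunk(std::chrono::seconds now) {
    if (now > next_deferral) {
      ++deferrals;
      // Mirrors the removed lines: push the deadline out by half of
      // min_wait, measured from the current chunk.
      next_deferral = now + min_wait / 2;
      return true;
    }
    return false;
  }
};
```

Under this model, and assuming a `min_wait` of 7200 seconds (consistent with the 1hr figure in the commit message, but not stated in it), a read that is still delivering chunks after more than an hour would have triggered one deferral, and another for each further hour. With the call disabled, no deferral happens at any of those points.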