If the leader has an older lc than we do, and we are sharing states to
bring them up to date, we still want to also share our uncommitted value.
This particular case was broken by b26b7f6e, which was only contemplating
the case where the leader was ahead of us or at the same point as us, but
not the case where the leader was behind. Note that the call to
share_state() a few lines up will bring them fully up to date, so
after they receive and store_state() for this message they will be at the
same lc as we are.
Fixes: #5750
Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
To support this, we add an optional out argument to
RGWMetadataManager::put() and fill in the read_version. When the
function returns, that contains whatever the current on-disk version
of the object is (either what already existed or what we just wrote).
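A minimal sketch of the optional out-argument pattern described above. The
names here (ObjVersion, MetadataStore, newer_only, existing_version) are
hypothetical stand-ins for illustration, not the actual RGW types or
signatures.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for an object version; not the real obj_version.
struct ObjVersion {
  uint64_t ver = 0;
  std::string tag;
};

// Hypothetical in-memory store illustrating an optional out argument on put().
class MetadataStore {
  std::map<std::string, ObjVersion> objects;

public:
  // If 'newer_only' is set and the stored version is at least as new as 'v',
  // the write is skipped. In either case, when 'existing_version' is
  // non-null it is filled with whatever version is stored when put()
  // returns: what already existed, or what was just written.
  bool put(const std::string& key, const ObjVersion& v, bool newer_only,
           ObjVersion* existing_version = nullptr) {
    bool applied = true;
    auto it = objects.find(key);
    if (newer_only && it != objects.end() && it->second.ver >= v.ver) {
      applied = false;
    } else {
      objects[key] = v;
    }
    if (existing_version)
      *existing_version = objects[key];
    return applied;
  }
};
```

Callers that do not care about the stored version can simply omit the last
argument.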
Signed-off-by: Greg Farnum <greg@inktank.com>
Add new STATUS_APPLIED, then specify the RGWX_UPDATE_STATUS header
based on that return code when doing metadata puts.
Add a send_response() function to RGWOp_Metadata_Put in order to
support sending back our new headers. Move the translation from
STATUS_NO_APPLY from set_req_state_err() to this function, so we
can turn different sync results into failures if necessary elsewhere.
Signed-off-by: Greg Farnum <greg@inktank.com>
Specify the param "sync-type" as one of "always", "update-by-version", or
"update-by-timestamp". It defaults to "always".
Signed-off-by: Greg Farnum <greg@inktank.com>
We want to be able to conditionally apply new updates:
1) If we already have a newer version than the sync is applying for some
reason (replay of logs?), we don't want to go back in time.
2) If both zones were active at the same time, then we'd like to be
able to do a merge based on timestamps.
In order to support this, we add a sync_type flag to the implementations of
RGWMetadataHandler::put, and then check the version or the mtime of the
incoming put against what we have on disk, and refuse the update if needed.
We return the 204 NoContent success code when refusing sync; for the
moment the conversion is automatic but we're going to pull it out in
the next couple of commits.
This commit does not complete the feature: we don't provide an interface
for specifying a different sync protocol.
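The conditional-apply check described above could be sketched roughly as
follows. SyncType, ObjState and should_apply() are hypothetical names, and
refusing ties (the incoming update must be strictly newer) is an assumption
of this sketch rather than something the commit states.

```cpp
#include <cstdint>
#include <ctime>

// Hypothetical enum mirroring the three sync modes.
enum class SyncType { Always, UpdateByVersion, UpdateByTimestamp };

// Hypothetical summary of an object's state: version counter and mtime.
struct ObjState {
  uint64_t version;
  std::time_t mtime;
};

// Decide whether an incoming put should overwrite what is on disk.
// Assumption: ties are refused, so the incoming side must be strictly newer.
bool should_apply(SyncType type, const ObjState& on_disk,
                  const ObjState& incoming) {
  switch (type) {
  case SyncType::UpdateByVersion:
    return incoming.version > on_disk.version;  // never go back in time
  case SyncType::UpdateByTimestamp:
    return incoming.mtime > on_disk.mtime;      // merge by modification time
  case SyncType::Always:
    break;
  }
  return true;  // Always: apply unconditionally
}
```

A caller that gets false back would skip the write and report the refusal
instead of an error.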
Signed-off-by: Greg Farnum <greg@inktank.com>
Also increase fd limit defaults to accommodate the larger number
of fds.
Fixes: #5692
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Reviewed-by: Mark Nelson <mark.nelson@inktank.com>
If the operation is being replayed, we might have already
performed the rename; skip it in that case. Also, we must set the
collection replay guard only after we have done the
rename.
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
A replay of an in-progress merge or split might make
our counts unreliable.
Fixes: #5723
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
For performance reasons: init 'prefix' with META_LOG_OBJ_PREFIX
in the initialization list of RGWMetadataLog instead of assigning
the value in the constructor body.
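For illustration, the two forms side by side. The class names and the value
given to META_LOG_OBJ_PREFIX are placeholders for this sketch; the real
constant is defined elsewhere in the tree.

```cpp
#include <string>

// Placeholder value, for illustration only.
static const char* const META_LOG_OBJ_PREFIX = "placeholder.prefix.";

// Assignment in the body: 'prefix' is default-constructed first and then
// assigned, which is two steps for the same end result.
class LogAssignInBody {
  std::string prefix;

public:
  LogAssignInBody() { prefix = META_LOG_OBJ_PREFIX; }
  const std::string& get_prefix() const { return prefix; }
};

// Initialization list: 'prefix' is constructed directly from the constant.
class LogInitList {
  std::string prefix;

public:
  LogInitList() : prefix(META_LOG_OBJ_PREFIX) {}
  const std::string& get_prefix() const { return prefix; }
};
```

Both classes end up with the same prefix; only the construction path differs.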
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
We also need to poll the control fd/pipe so that we restart the poll loop
when new signal handlers are added. This was broken by commit 8e4a78f1.
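A rough single-threaded sketch of the idea: the read end of a control pipe is
polled together with the handler fds, and registering a new handler writes a
byte to the pipe so that a blocked poll() returns and the fd set can be
rebuilt. The Poller class and its method names are hypothetical and are not
taken from the actual signal handler code.

```cpp
#include <poll.h>
#include <unistd.h>
#include <vector>

// Hypothetical poller: watches handler fds plus a control pipe.
class Poller {
  int control[2];
  std::vector<int> handler_fds;

public:
  Poller() {
    if (pipe(control) != 0)
      control[0] = control[1] = -1;  // poll() ignores negative fds
  }
  Poller(const Poller&) = delete;
  Poller& operator=(const Poller&) = delete;
  ~Poller() {
    if (control[0] >= 0) close(control[0]);
    if (control[1] >= 0) close(control[1]);
  }

  // Register a new fd and poke the control pipe so a blocked poll() wakes up.
  void add_handler_fd(int fd) {
    handler_fds.push_back(fd);
    char c = 0;
    ssize_t r = write(control[1], &c, 1);
    (void)r;
  }

  // One pass of the loop. Returns true if we were woken through the control
  // pipe, in which case the caller should loop again with the updated set.
  bool poll_once(int timeout_ms) {
    std::vector<pollfd> pfds;
    pollfd c;
    c.fd = control[0];
    c.events = POLLIN;
    c.revents = 0;
    pfds.push_back(c);
    for (int fd : handler_fds) {
      pollfd p;
      p.fd = fd;
      p.events = POLLIN;
      p.revents = 0;
      pfds.push_back(p);
    }
    if (poll(pfds.data(), pfds.size(), timeout_ms) <= 0)
      return false;
    if (pfds[0].revents & POLLIN) {
      char buf[16];
      ssize_t n = read(control[0], buf, sizeof(buf));  // drain wake-up bytes
      (void)n;
      return true;
    }
    return false;
  }
};
```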
Fixes: #5742
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
In 97462a3213 we tried to search for a
recent full osdmap but were looking at the wrong key. If full_0 was
present we could record that the latest full map was last_committed even
though it wasn't present. This is fixed in 76cd7ac1c, but we need to
compensate for when get_version_latest_full() gives us a back version
number by repeating the search.
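One way to picture "repeating the search", as a sketch only: treat the
reported version as a starting hint and walk backwards until a full map that
is actually stored is found. The helper name, its parameters, and the use of
a std::set to model the store are all assumptions made for this example.

```cpp
#include <cstdint>
#include <set>

// Hypothetical helper. 'full_maps' models the versions for which a full map
// is really present in the store. Starting from the version reported as the
// latest full map, walk backwards (not past 'first_committed') until a
// stored full map is found. Returns 0 if there is none in that range.
uint64_t find_latest_full(const std::set<uint64_t>& full_maps,
                          uint64_t reported, uint64_t first_committed) {
  for (uint64_t v = reported; v >= first_committed && v > 0; --v) {
    if (full_maps.count(v))
      return v;
  }
  return 0;
}
```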
Fixes: #5737
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
CID 1054868 (#1 of 1): Uninitialized scalar field (UNINIT_CTOR)
uninit_member: Non-static class member "cur_shard" is not
initialized in this constructor nor in any functions that it calls.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
CID 1054827 (#1 of 1): Dereference after null check (FORWARD_NULL)
var_deref_model: Passing null pointer "objv_tracker->read_version"
to function "obj_version::operator =(obj_version const &)", which
dereferences it.
Move the 2 affected cases into the block that checks objv_tracker.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
CID 1054829 (#1 of 1): Missing break in switch (MISSING_BREAK)
unterminated_case: This case (value 37) is not terminated by a
'break' statement.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
CID 1019623 (#1 of 1): Uninitialized scalar field (UNINIT_CTOR)
uninit_member: Non-static class member "scrub_version" is not
initialized in this constructor nor in any functions that it calls.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
Remove the rc suffix since RPM complains about it. For rc release
builds the "rc" in the git describe string is sufficient for
everything but RPM. For rc release builds (i.e. not gitbuilder)
add a flag to the spec file.
Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
Starting with commit 61a298c39c we delay the
signal handler setup until after lots of other initialization has happened,
which can result in us having open fds with very large values (>1024), which
will break the FD_SET macros for select(2). Use poll(2) instead.
Fixes: #5722
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
If we try to open an mds session and the MDS responds with close (aka,
"no"), we call _closed_mds_session() which signals the Cond*'s but then
deallocates the list. wait_on_list() then does a use-after-free trying
to remove itself.
Instead, use Context*'s, so that the waiter does not reference the list.
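A simplified sketch of why single-shot callbacks avoid the problem. This
Context class and finish_contexts() helper are reduced, hypothetical versions
written for this example and do not match the real interfaces. The waiters
are moved off the list before any of them runs, so the owner can free the
list right away and no waiter has to touch it.

```cpp
#include <functional>
#include <list>
#include <utility>

// Hypothetical single-shot callback, loosely modeled on the Context idea.
class Context {
  std::function<void(int)> fn;

public:
  explicit Context(std::function<void(int)> f) : fn(std::move(f)) {}
  // Run the callback once, then destroy this object.
  void complete(int r) {
    fn(r);
    delete this;
  }
};

// Fire every waiter with result 'r'. The list is emptied before any callback
// runs, so nothing that runs later holds a reference into it.
void finish_contexts(std::list<Context*>& waiters, int r) {
  std::list<Context*> ls;
  ls.swap(waiters);
  for (Context* c : ls)
    c->complete(r);
}
```

With a Cond-based waiter, the woken thread must come back and remove itself
from the list, which is exactly the access that breaks once the list has been
deallocated.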
Fixes: #5689
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
These are better when the list may need to be deallocated. Context's are
single-shot and the list is not referenced by the caller.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>