In 7fb3804fb8 we moved the full version
stashing logic to the encode_trim_extra() function. However, we forgot
to update the osdmap's 'latest_full' key that should always point to
the latest osdmap full version. This eventually degenerated into a missing
full version after a trim. This patch works around this bug by looking
for the latest available full osdmap version in the store and updating
'latest_full' to its proper value.
Related-to: #5704
Backport: cuttlefish
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
We have delegated this to encode_trim_extra() since
7fb3804fb8 -- no need to keep this code
around.
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
We used to do this on encode_full(), but since [1] we no longer rely on
PaxosService to manage the full maps for us. And we forgot to write down
the latest_full version to the store, leaving it in a truly outdated state.
[1] - 7fb3804fb8
Fixes: #5704
Backport: cuttlefish
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
From the man page for posix_fallocate:
posix_fallocate() returns zero on success, or an error
number on failure. Note that errno is not set.
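A minimal sketch of the calling convention the man page describes, assuming
a Linux/glibc build. The helper name preallocate() is hypothetical and is
not taken from the patch; it only shows the error being read from the
return value rather than from errno.

```cpp
#include <fcntl.h>
#include <sys/types.h>

// Hypothetical helper: preallocate `len` bytes for `fd`.
// Returns 0 on success or a negative errno-style code on failure.
int preallocate(int fd, off_t len) {
  // posix_fallocate() reports failure through its return value.
  // errno is not set, so it must not be consulted here.
  int r = ::posix_fallocate(fd, 0, len);
  if (r != 0)
    return -r;
  return 0;
}
```

Negating the return value keeps the helper in line with the usual
"negative errno" convention for internal error codes.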
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
Reviewed-by: Sage Weil <sage@inktank.com>
The osd lock is not held at this point, so we must use
the createmap passed in.
Fixes: #5656
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Fixes: #5691
We also need to read the attributes, as the bucket might be a legacy
bucket and might have all bucket instance info in that object.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
Tested-by: Faidon Liambotis <faidon@wikimedia.org>
Some callbacks take the osd lock, so we need to avoid blocking an
osd lock holding thread while waiting on a filestore callback.
Instead, just queue the transaction, and allow _try_resurrect_pg
to cancel us while we are waiting for the transaction to go through
(CLEARING_WAITING).
Fixes: #5672
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Pass const std::list<> parameter by reference to
cls_replica_log_progress_marker().
From cppcheck:
[src/cls/replica_log/cls_replica_log_types.h:64]: (performance)
Function parameter 'b' should be passed by reference.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
Object 'librados::ObjectWriteOperation *op' is freed twice in the TEST
test_version_inc_read. Free 'librados::ObjectReadOperation *rop' instead.
Related cppcheck warning:
[src/test/cls_version/test_cls_version.cc:79]: (error) Memory
pointed to by 'op' is freed twice.
This should also fix:
CID 1049247 (#1 of 1): Use after free (USE_AFTER_FREE)
deref_arg: Calling "librados::ObjectWriteOperation::~ObjectWriteOperation()"
dereferences freed pointer "op". (The dereference happens because this is
a virtual function call.)
CID 1049218 (#4 of 4): Resource leak (RESOURCE_LEAK)
leaked_storage: Variable "rop" going out of scope leaks the storage it
points to.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
Return 'const string' instead of 'const char *' from RGWOp::name() to
avoid using std::string::c_str() to return a 'const char *' in
some cases in rgw_rest_replica_log.h.
Returning the result of c_str() from a function is dangerous, since the
pointer becomes invalid once the related string object is destroyed or
goes out of scope (which is the case on return). The caller may end up
reading garbage.
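The pattern can be sketched as follows. ExampleOp and its prefix member
are hypothetical stand-ins, not the real RGWOp classes; the unsafe variant
is left in a comment so the block stays free of undefined behaviour.

```cpp
#include <string>

// Hypothetical stand-in for an op class; not the real RGWOp.
struct ExampleOp {
  std::string prefix = "replica_log";

  // Unsafe variant (commented out on purpose): the temporary
  // std::string is destroyed at the end of the return statement, so
  // the caller would receive a dangling pointer.
  //
  //   const char *name() { return (prefix + "_get").c_str(); }

  // Safe variant: the caller receives its own string object.
  const std::string name() const { return prefix + "_get"; }
};
```

A caller that still needs a 'const char *' can call c_str() on the
returned string, as long as it keeps that string alive while the pointer
is in use.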
Related warning from cppcheck:
[src/rgw/rgw_rest_replica_log.h:39]: (error) Dangerous usage of
c_str(). The value returned by c_str() is invalid after this call.
[src/rgw/rgw_rest_replica_log.h:59]: (error) Dangerous usage of
c_str(). The value returned by c_str() is invalid after this call.
[src/rgw/rgw_rest_replica_log.h:79]: (error) Dangerous usage of
c_str(). The value returned by c_str() is invalid after this call
This should also fix:
CID 1049250 (#1 of 1): Wrapper object use after free (WRAPPER_ESCAPE)
escape: The internal representation of "s" escapes, but is destroyed
when it exits scope.
CID 1049251 (#1 of 1): Wrapper object use after free (WRAPPER_ESCAPE)
escape: The internal representation of "s" escapes, but is destroyed
when it exits scope.
CID 1049252 (#1 of 1): Wrapper object use after free (WRAPPER_ESCAPE)
escape: The internal representation of "s" escapes, but is destroyed
when it exits scope.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
We must require something or else the caps check is going to pass in
a degenerate sense. Use X for commands.
Signed-off-by: Sage Weil <sage@inktank.com>
Set the env var TEST_EXIT_ON_ERROR=0 to obtain all errors instead of exiting
with return code 1 on the first error found.
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
'thrash_map' is only set if we are the leader, so we would thrash and
propose the pending value if we are the leader. However, we should keep
the 'is_leader()' check not only for clarity's sake (an unfamiliar reader
may cry OMGBUG, prompting a patch much like this), but also because
we may lose a subsequent election and become a peon instead, while still
holding a 'thrash_map' value > 0 -- and we really don't want to propose
while being a peon.
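A minimal sketch of the guard argued for above. MonState and
should_thrash_and_propose() are hypothetical names used only for
illustration; they do not mirror the monitor's actual classes.

```cpp
// Hypothetical model of the relevant monitor state.
struct MonState {
  bool leader = false;
  int thrash_map = 0;
  bool is_leader() const { return leader; }
};

// Decide whether to thrash and propose the pending value.
bool should_thrash_and_propose(const MonState &m) {
  // thrash_map > 0 alone is not enough: the value may survive a lost
  // election, and a peon must never propose.
  return m.is_leader() && m.thrash_map > 0;
}
```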
[This is a rebased version of 5eac38797d,
complete with the typo fix in d656aed599ee754646e16386ce5a4ab0117f2d6e.]
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
send_latest() checks for readable and, if untrue, will wait before sending
out the latest OSDMap. This is completely unnecessary; I think it is a
hold-over from when we had independent paxos states. An audit of all
callers confirms that everyone would be happy with whatever is committed,
even if we are in the process of committing an even newer version.
Effectively, everyone waits *above* this layer in the usual PaxosService
traps for whether we are readable or not. This means that waiting_for_map
and send_to_waiting() go away entirely, which is nice.
This addresses, among other things: send_to_waiting() is called from
update_from_paxos(), which can be called when we are not readable due to
the paxos commit/finish timing changes in f1ce8d7c95 and
c711203c0d. If no subsequent update happens, those waiters never get
their maps.
Instead, we send them immediately--we know they are committed and old
history is as good as future history.
Fixes: #5643
Signed-off-by: Sage Weil <sage@inktank.com>
This reverts commit f06a124a7f.
On peons, on_active() is only called when we *first* become active after an
election. Only on the leader is it called after each commit/update. This
makes this change cause other problems (broken subscriptions on peons, in
particular). We possibly should fix that, but there is also a simpler fix
for the original problem we were trying to solve.
Signed-off-by: Sage Weil <sage@inktank.com>
The logic was a bit broken. Basically, we want to make sure
that region names are the same. However, if region name is not
set then we need to check whether it's the master region. This
can happen in upgrade cases where originally we didn't have
a region name set.
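One possible reading of this check, as a sketch only. The function and
parameter names are hypothetical, and treating an empty region name as
"belongs to the master region" is an assumption drawn from the
description above, not from the patch itself.

```cpp
#include <string>

// Hypothetical helper: does a bucket recorded with `bucket_region`
// belong to the local region?
bool region_matches(const std::string &bucket_region,
                    const std::string &local_region,
                    bool local_is_master) {
  if (bucket_region.empty()) {
    // Upgrade case: the bucket predates region names. Assumed here to
    // match only when the local region is the master.
    return local_is_master;
  }
  return bucket_region == local_region;
}
```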
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Multiple fixes:
- sync master, secondary entry point ver on creation
- use correct entry point version when removing entry point
- check correct version on bucket removal
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
was never initialized correctly anyway. It was only supposed to
be used for buckets, but it was never initialized in that case.
Using s->bucket_info.objv_tracker instead.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
We can only forward the bucket removal to the master if it was
successfully removed locally.
The master region has no knowledge about whether the
bucket can be removed or not, e.g., when there are still objects in the
bucket. If we send the request to the master first, it will happily remove
the bucket even though the local removal might fail in the end.
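The ordering can be sketched as below. remove_bucket() and both callbacks
are hypothetical; they stand in for the local removal and for the request
forwarded to the master, each returning 0 on success or a negative error
code.

```cpp
#include <functional>

// Hypothetical sketch: remove locally first, forward only on success.
int remove_bucket(const std::function<int()> &remove_locally,
                  const std::function<int()> &forward_to_master) {
  int r = remove_locally();
  if (r < 0)
    return r;  // e.g. bucket not empty: the master is never contacted
  return forward_to_master();
}
```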
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
We had a problem with bucket recreation, where we identified
that the bucket already existed, but missed the fact that it was
the same bucket, so removal of the bucket index was wrong.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>