The previous debug message outputted the function's name, as often our
functions do. This was however a source of bewilderment, as users would
see those in logs and think their stores would need conversion. Changing
this message is trivial enough and it will make ceph users happier log
readers.
Backport: cuttlefish
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
We are supposed to have umount'ed the store and set the pointer to NULL.
We should not tolerate any other case on init().
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
We already open the store on ceph_mon.cc, before we start the conversion.
Given we are unable to reproduce this every time a conversion is triggered,
we are led to believe that this causes a race in leveldb that will lead
to 'store.db/LOCK' being locked upon the open this patch removes.
Regardless, reopening the db here is pointless as we already did it when
we reach Monitor::StoreConverter::convert().
Fixes: #5640
Backport: cuttlefish
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Return the result of rgw_(un)link_bucket()/ from
RGWBucketMetadataHandler::put() to signal errors correctly to
the function caller.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
Remove twice listed checks, get checks in alphabetical order,
added comment above the checks to point out to keep it in
alphabetical order to avoid double checks.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
When we set bucket.instance meta, we need to set
the correct bucket placement to the bucket (according to
the specific placement rule). However, it might be that
bucket placement was never configured and we just go by
the defaults, using the old legacy pools selection.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Large peering_wq batch sizes may excessively delay
peering messages resulting in unreasonably long
peering. This may speed up peering.
Backport: cuttlefish
Related: #5084
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Put the crc in the status string and drop the header and footer. If users
want to capture it,
ceph report 2>&1 > foo.txt
Signed-off-by: Sage Weil <sage@inktank.com>
We can live with the incompatibility here; the hack is currently
not working anyway (see #5623).
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
The MClientReconnect puts everything in the data payload portion of
the message and nothing in the front portion. That means that if the
message is resent (socket failure or something), the messenger thinks it
hasn't been encoded yet (front empty) and reencodes, which means
everything gets added (again) to the data portion.
Decoding keep decoding until it runs out of data, so the second copy
means we decode garbage snap realms, leading to the crash in bug
Clearing data each time around resolves the problem, although it does
mean we do the encoding work multiple times. We could alternatively
(or also) stick some data in the front portion of the payload
(ignored), but that changes the wire protocol and I would rather not
do that.
Fixes: #4565
Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
Set it to the stamp of the MForward that carried us. One could argue
we really want the original receive stamp on the origin, but that is
not available to us, and this is better than nothing.
In particular, this gives 'ceph log ...' commands a timestamp when they
are forwarded via a peon. The stamp is still between when the request
is sent and when it is committed/acked, so all is well from the
client's perspective.
Signed-off-by: Sage Weil <sage@inktank.com>
This is only there for the benefit of win_standalone_election(), but it
doesn't need it, it clutters the code, and weakens our assertions.
Now the only win_election() callers are win_standalone_election() (which
is a single path that just did _reset()) and from the elector.
Signed-off-by: Sage Weil <sage@inktank.com>
Previously we would call mon->reset() and set various flags (like
exited_quorum timestamp), but the state would remain PEON. Make an
explicit join_election() callback and set the state there, and add
asserts in reset() (renamed to be private) so that we ensure all
callers are well-behaved.
Signed-off-by: Sage Weil <sage@inktank.com>
It is possible for a sequence like:
- probe
- first probe reply has paxos trim that indicates a full sync is
needed
- start sync
- clear store
- something happens that makes us abort and bootstrap (e.g., the
provider mon restarts
- probe
- first probe reply has older paxos trim bound and we call an election
- on election completion, we crash because we have no data.
Non-determinism of the probe decision aside, we need to ensure that
the info we share during probe (fc, lc) is accurate, and that once we
clear the store we know we *must* do a full sync.
Fixes: #5621
Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
We had duplicated code in election_finished() and restart(), and it was
incomplete. Put it all in restart() only (the mon should have called
restart() long before the election finishes). Note that we cannot
assert as much in election_finished() because another service may have
just cross-proposed.
Signed-off-by: Sage Weil <sage@inktank.com>
Each commit should match with exactly one proposal; finish it when we
actually commit it and make sensible asserts.
The old finish_proposal() turns into finish_round(), and performs
generic checks and cleanup associated with the transition from
updating -> active.
Signed-off-by: Sage Weil <sage@inktank.com>
Consider:
- paxos starts a commit N+1
- a majority of the peers ack it
- paxos::commit() writes N+1 it to disk
- tells peers to commit
- peers commit N+1, *and* refresh_from_paxos(), and generate N+1 full map
- leader does _scrub on N+1, without latest full osdmap
- peers do _scrub on N+1, with latest full osdmap
- leader finishes paxos gather, does refresh_from_paxos()
-> scrub fails.
Fix this by doing the refresh_from_paxos() at commit time and not when
the paxos round finishes. We move the refresh out of finish_proposal
and into its own helper, and update all callers accordingly. This
keeps on-disk state more tightly in sync with in-memory state and
avoids the need for a e.g., kludgey workaround in the scrub code.
We also simplify the bootstrap checks a bit by doing so immediately
and relying on the normal bootstrap paxos reset paths to clean up
any waiters.
Signed-off-by: Sage Weil <sage@inktank.com>
It is possible for a sequence like:
- probe
- first probe reply has paxos trim that indicates a full sync is
needed
- start sync
- clear store
- something happens that makes us abort and bootstrap (e.g., the
provider mon restarts
- probe
- first probe reply has older paxos trim bound and we call an election
- on election completion, we crash because we have no data.
Non-determinism of the probe decision aside, we need to ensure that
the info we share during probe (fc, lc) is accurate, and that once we
clear the store we know we *must* do a full sync.
Fixes: #5621
Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
test/test_rgw_admin_opstate.cc: In member function 'int admin_log::test_helper::extract_input(int, char**)':
warning: test/test_rgw_admin_opstate.cc:129:24: comparison between signed and unsigned integer expressions [-Wsign-compare]
warning: test/test_rgw_admin_opstate.cc:131:24: comparison between signed and unsigned integer expressions [-Wsign-compare]
warning: test/test_rgw_admin_opstate.cc:133:24: comparison between signed and unsigned integer expressions [-Wsign-compare]
warning: test/test_rgw_admin_opstate.cc:135:24: comparison between signed and unsigned integer expressions [-Wsign-compare]
test/test_rgw_admin_log.cc: In member function 'int admin_log::test_helper::extract_input(int, char**)':
warning: test/test_rgw_admin_log.cc:132:24: comparison between signed and unsigned integer expressions [-Wsign-compare]
warning: test/test_rgw_admin_log.cc:134:24: comparison between signed and unsigned integer expressions [-Wsign-compare]
warning: test/test_rgw_admin_log.cc:136:24: comparison between signed and unsigned integer expressions [-Wsign-compare]
warning: test/test_rgw_admin_log.cc:138:24: comparison between signed and unsigned integer expressions [-Wsign-compare]
Signed-off-by: Sage Weil <sage@inktank.com>
test/test_rgw_admin_meta.cc: In member function 'int admin_meta::test_helper::extract_input(int, char**)':
warning: test/test_rgw_admin_meta.cc:126:24: comparison between signed and unsigned integer expressions [-Wsign-compare]
warning: test/test_rgw_admin_meta.cc:128:24: comparison between signed and unsigned integer expressions [-Wsign-compare]
warning: test/test_rgw_admin_meta.cc:130:24: comparison between signed and unsigned integer expressions [-Wsign-compare]
warning: test/test_rgw_admin_meta.cc:132:24: comparison between signed and unsigned integer expressions [-Wsign-compare]
on arm7l
Signed-off-by: Sage Weil <sage@inktank.com>
cls/rgw/cls_rgw.cc: In function 'int get_obj_vals(cls_method_context_t, const string&, const string&, int, std::map, ceph::buffer::list>*)':
warning: cls/rgw/cls_rgw.cc:175:28: narrowing conversion of '129' from 'int' to 'char' inside { } is ill-formed in C++11 [-Wnarrowing]
on arm7l
Signed-off-by: Sage Weil <sage@inktank.com>