Consider:
- paxos starts a commit N+1
- a majority of the peers ack it
- paxos::commit() writes N+1 it to disk
- tells peers to commit
- peers commit N+1, *and* refresh_from_paxos(), and generate N+1 full map
- leader does _scrub on N+1, without latest full osdmap
- peers do _scrub on N+1, with latest full osdmap
- leader finishes paxos gather, does refresh_from_paxos()
-> scrub fails.
Fix this by doing the refresh_from_paxos() at commit time and not when
the paxos round finishes. We move the refresh out of finish_proposal
and into its own helper, and update all callers accordingly. This
keeps on-disk state more tightly in sync with in-memory state and
avoids the need for a e.g., kludgey workaround in the scrub code.
We also simplify the bootstrap checks a bit by doing so immediately
and relying on the normal bootstrap paxos reset paths to clean up
any waiters.
Signed-off-by: Sage Weil <sage@inktank.com>
It is possible for a sequence like:
- probe
- first probe reply has paxos trim that indicates a full sync is
needed
- start sync
- clear store
- something happens that makes us abort and bootstrap (e.g., the
provider mon restarts
- probe
- first probe reply has older paxos trim bound and we call an election
- on election completion, we crash because we have no data.
Non-determinism of the probe decision aside, we need to ensure that
the info we share during probe (fc, lc) is accurate, and that once we
clear the store we know we *must* do a full sync.
Fixes: #5621
Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
test/test_rgw_admin_opstate.cc: In member function 'int admin_log::test_helper::extract_input(int, char**)':
warning: test/test_rgw_admin_opstate.cc:129:24: comparison between signed and unsigned integer expressions [-Wsign-compare]
warning: test/test_rgw_admin_opstate.cc:131:24: comparison between signed and unsigned integer expressions [-Wsign-compare]
warning: test/test_rgw_admin_opstate.cc:133:24: comparison between signed and unsigned integer expressions [-Wsign-compare]
warning: test/test_rgw_admin_opstate.cc:135:24: comparison between signed and unsigned integer expressions [-Wsign-compare]
test/test_rgw_admin_log.cc: In member function 'int admin_log::test_helper::extract_input(int, char**)':
warning: test/test_rgw_admin_log.cc:132:24: comparison between signed and unsigned integer expressions [-Wsign-compare]
warning: test/test_rgw_admin_log.cc:134:24: comparison between signed and unsigned integer expressions [-Wsign-compare]
warning: test/test_rgw_admin_log.cc:136:24: comparison between signed and unsigned integer expressions [-Wsign-compare]
warning: test/test_rgw_admin_log.cc:138:24: comparison between signed and unsigned integer expressions [-Wsign-compare]
Signed-off-by: Sage Weil <sage@inktank.com>
test/test_rgw_admin_meta.cc: In member function 'int admin_meta::test_helper::extract_input(int, char**)':
warning: test/test_rgw_admin_meta.cc:126:24: comparison between signed and unsigned integer expressions [-Wsign-compare]
warning: test/test_rgw_admin_meta.cc:128:24: comparison between signed and unsigned integer expressions [-Wsign-compare]
warning: test/test_rgw_admin_meta.cc:130:24: comparison between signed and unsigned integer expressions [-Wsign-compare]
warning: test/test_rgw_admin_meta.cc:132:24: comparison between signed and unsigned integer expressions [-Wsign-compare]
on arm7l
Signed-off-by: Sage Weil <sage@inktank.com>
cls/rgw/cls_rgw.cc: In function 'int get_obj_vals(cls_method_context_t, const string&, const string&, int, std::map, ceph::buffer::list>*)':
warning: cls/rgw/cls_rgw.cc:175:28: narrowing conversion of '129' from 'int' to 'char' inside { } is ill-formed in C++11 [-Wnarrowing]
on arm7l
Signed-off-by: Sage Weil <sage@inktank.com>
If we get a monmap update, the leader bootstraps. Peons should do the
same.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
The _active callback can get called while are already proposing. If
that happens, we should not prepare a fresh new pending but should
wait for the previous proposal to finish.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
Do not call C_Committed after trim or else we will prematurely clear
the bool proposing, propose something again using the same version, and
crash. We do not in fact need anything to happen here aside from the
refresh_from_paxos() that happens on its own.
Broken by 39b71c5826.
Signed-off-by: Sage Weil <sage@inktank.com>
Change badchars to goodchars (no one was using badchars); allow
goodchars to be a RE character class of valid characters for the
param. First use: crush item names.
Signed-off-by: Dan Mick <dan.mick@inktank.com>
I don't know what I was thinking; this was always the right validation
algorithm, and I broke it trying to simplify.
Signed-off-by: Dan Mick <dan.mick@inktank.com>
* wip-wsgi:
ceph-rest-api: separate into module and front-end for WSGI deploy
ceph-rest-api: make main program be "shell" around WSGI guts
Reviewed-by: Sage Weil <sage@inktank.com>
To deploy ceph-rest-api within a WSGI server (apache/mod_wsgi,
nginx/uwsgi, etc.), there needs to be an importable (.py) module
that performs all init/config when imported. ceph-rest-api was
close, but it needs to be named properly, and there's no argument
passing, so it needs to get args from a fixed file or the env.
Separate most of ceph-rest-api into pybind/ceph_rest_api.py, and make
its arguments come from the environment, and init errors be
ImportError exceptions. Recase ceph-rest-api as a thin layer that
does the usual setup and arg parsing, and then sets args into the
environment and imports ceph_rest_api.py, catching exceptions and
reporting errors. This allows standalone execution as usual.
ceph-rest-api grabs a few module globals (addr/port and the flask.app)
to use after it imports.
Accept cluster name, and do the ceph.conf search using cluster name
in the appropriate places in the searched-for files.
Also ceph_rest_api.py gets a little cleanup (fewer global variables,
cleaner conf file search algorithm, better error reporting on conf
load)
Also: doc updates, packaging updates to include ceph_rest_api.py
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Calling handle_ack() here has no effect because we have already
spliced sent messages back into our out queue. Instead, pull them out
of there and discard. Add a few assertions along the way.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
Hangs result if 'ceph auth import' is attempted without -i.
Check for this case and return error status. Also,
update auth import help to more-clearly indicate that "input"
means "-i <file>".
Fixes: #4599
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Remove obsolete use of collection_move()
Allow operations to be skipped if random selections don't make sense
Track total number of possible objects in m_num_objects
BUG: do_remove() was calling _do_touch() instead of _do_remove()
For ops that require an object, select from among existing objects in collection
Initialize m_num_objects unique objects across collections
touch: don't create an object that already exists in another collection
remove: Use remove_obj() to clear object from m_objects to have accurate tracking
clone/clone_range(): Select 2 existing objects in the collection
add: Skip operation if selected target object name exists in target collection
move: Removed this buggy operation that is only present for upgrades
Fixes: #5371Fixes: #5240
Signed-off-by: David Zafman <david.zafman@inktank.com>