Commit Graph

21465 Commits

Author SHA1 Message Date
John Wilkins
280aeaf035 doc: Reverted so that we don't force yes or non-interactive.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2012-09-19 16:24:12 -07:00
John Wilkins
9c8061fe5d doc: Removed legacy usage.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2012-09-19 16:23:26 -07:00
John Wilkins
b3651dac76 doc: Cleanup, spell check, grammar check mostly.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2012-09-19 16:22:38 -07:00
John Wilkins
b311a408a3 doc: Updating the index to remove legacy and unneeded entries.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2012-09-19 16:21:08 -07:00
Sam Lang
5e9158ebdb Fix description for --nodaemon
Signed-off-by: Sam Lang <sam.lang@inktank.com>
2012-09-19 15:34:24 -07:00
Sage Weil
0a83cb9956 mon: clean up recovered_leader() checks
Assert they are called only once per machine per election epoch.  Fix the
recovered_peon() caller to do that.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-09-19 14:34:56 -07:00
Sam Lang
50e7251dd0 Abort on failure
Signed-off-by: Sam Lang <sam.lang@inktank.com>
2012-09-19 14:34:04 -07:00
Sam Lang
31f430a9b2 Fixup usage to reflect options available
Signed-off-by: Sam Lang <sam.lang@inktank.com>
2012-09-19 14:33:16 -07:00
Sam Lang
faddb80c42 Swap current dir (.) with CEPH_BIN for OOT builds
With out-of-tree builds, vstart.sh needs CEPH_BIN to be set, and
needs to look for init-ceph in CEPH_BIN rather than just ./init-ceph.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2012-09-19 13:57:53 -07:00
Sam Lang
0f7c516f3e Move keyring option to global section
Using vstart.sh -n uses ceph-authtool to generate the keyring file
in ./keyring.  The vstart.sh script then writes out the ceph.conf
with a keyring option in the [client] section, so when the monitors
start, they can't find a keyring file.  This commit puts the keyring in
the [global] section.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2012-09-19 13:57:37 -07:00
Sage Weil
259fffbed4 mon: require MON_GV protocol feature
Require the MON_GV feature when

 - we see the ondisk feature is set on bootup
 - we enable the ondisk feature

This means that once we form a quorum with the feature and enable it on
disk, there is no going back; we won't be able to talk to old monitors
without the feature, and a downgrade won't be possible.

Hopefully, in practice, any monitors with old code will be up at the time
we are upgrading, such that the quorum will not include the feature and we
won't make the transition.  Otherwise, if they are down, and the remaining
nodes have the feature and enable it, and the old code starts up, it won't
be able to join until it is upgraded to the new code as well.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-09-19 11:53:45 -07:00
Sage Weil
879ce01a11 mon: move setting of ondisk GV feature into helper
Signed-off-by: Sage Weil <sage@inktank.com>
2012-09-19 11:53:44 -07:00
Sage Weil
414cd1b500 mon: do not issue global versions if quorum does not support the feature
Signed-off-by: Sage Weil <sage@inktank.com>
2012-09-19 11:53:44 -07:00
Sage Weil
3baae15d5f mon: set new incompat GV feature when paxos stabilizes for the first time
This is a marker that future versions will use to know whether they can
safely convert the monitor data to the new format.  If the GV feature is
not present, they will refuse to convert.

Also set the ondisk GV feature at the same time.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-09-19 11:53:22 -07:00
Sage Weil
db04ce46f5 mon: make MRoute encoding backwards-compatible
If the target has the NULLROUTE feature, use a new encoding that explicitly
indicates whether a message follows.  If the feature is absent, use the
old encoding.  The mon is responsible for not trying to send a null reply
if the target does not have the feature.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-09-19 11:49:34 -07:00
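The feature-gated encoding described in db04ce46f5 can be sketched roughly as follows. This is an illustrative Python model, not Ceph's actual wire format: the feature bit value, the 8-byte id header, the one-byte flag, and the names encode_route and decode_route are all assumptions made for the example.

```python
import struct

# Hypothetical feature bit; the real NULLROUTE bit value is not given in the log.
FEATURE_NULLROUTE = 1 << 0


def encode_route(tid, payload, peer_features):
    """Encode a routed reply; payload=None means 'no message follows'."""
    header = struct.pack("<Q", tid)
    if peer_features & FEATURE_NULLROUTE:
        # New encoding: an explicit flag byte says whether a message follows.
        if payload is None:
            return header + b"\x00"
        return header + b"\x01" + payload
    # Old encoding: a message always follows, so a null reply cannot be
    # expressed. The sender must not attempt one for peers without the feature.
    if payload is None:
        raise ValueError("peer lacks NULLROUTE; cannot send a null reply")
    return header + payload


def decode_route(data, peer_features):
    """Decode with the same feature set that was used for encoding."""
    (tid,) = struct.unpack_from("<Q", data)
    body = data[8:]
    if peer_features & FEATURE_NULLROUTE:
        if not body:
            raise ValueError("missing flag byte")
        return tid, (body[1:] if body[0] == 1 else None)
    return tid, body
```

The sender-side check mirrors the commit's note that the mon is responsible for never sending a null reply to a target without the feature.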
Alex Elder
e068bd7c80 rbd/copy.sh: fix typo
Or maybe it was a spello, or a thinko, or something.  In any case
I'm pretty sure Josh intended to call the function he added in
commit 78d6a60ca, and not the non-existent "test_import_args".

Signed-off-by: Alex Elder <elder@inktank.com>
(cherry picked from commit ed43d4de12)
2012-09-18 23:36:57 -07:00
Alex Elder
ed43d4de12 rbd/copy.sh: fix typo
Or maybe it was a spello, or a thinko, or something.  In any case
I'm pretty sure Josh intended to call the function he added in
commit 78d6a60ca, and not the non-existent "test_import_args".

Signed-off-by: Alex Elder <elder@inktank.com>
2012-09-18 22:51:10 -05:00
Sage Weil
471acda426 Merge remote-tracking branch 'gh/next'
2012-09-18 16:49:58 -07:00
Josh Durgin
f530659786 Merge remote branch 'origin/wip-librbd-locking'
Conflicts:
	qa/workunits/rbd/copy.sh

Reviewed-by: Sage Weil <sage.weil@inktank.com>
2012-09-18 16:06:25 -07:00
Josh Durgin
7a3d1e66ef librbd: bump version
This marks the availability of the cloning and locking functions.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2012-09-18 15:51:38 -07:00
Josh Durgin
855dff62ae cls_rbd: remove locking methods
These are unnecessary now that librbd is using the generic cls_lock.

Fixes: #2951
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2012-09-18 15:45:54 -07:00
Josh Durgin
8f2a0d91ab rbd: add locking commands
The locker (entity_name_t) will be different each time the rbd
command line tool is run, so 'lock remove' is always breaking a lock.

Fixes: #2556
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2012-09-18 15:45:43 -07:00
Josh Durgin
b66ef430a0 qa: update rbd tests and runner
* no longer need to wait for watch timeout since #2948 was fixed
* use --format 2 instead of --new-format
* add test_cls_rbd to run-rbd-tests script

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2012-09-18 15:43:45 -07:00
Josh Durgin
3a9e6650af librbd: use generic cls_lock instead of cls_rbd's locking
Update the librbd locking api to make more sense:
 * Add an optional tag to shared locking
 * only make shared vs exclusive different functions in the user-visible api
 * return a list of structs instead of a set of pairs
 * fix incorrect range checking in the C api
 * rename locks to lockers to be consistent with the generic locking class
 * rename other_locker parameter to client, to match the list_lockers usage

Fixes: #2952
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2012-09-18 15:43:13 -07:00
Josh Durgin
69ee9afa27 cls_lock_client: add ObjectOperation-based get_lock_info
This will be used by librbd to grab lock info along with
the rest of its header information in a single request.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2012-09-18 15:42:37 -07:00
Josh Durgin
6dcbbbb6fc cls_lock_types: add missing include
msg_types defines entity-related types used here.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2012-09-18 15:40:19 -07:00
Josh Durgin
67bbcf2c27 cls_lock_client: return error when decoding fails
Library code shouldn't be using cerr either.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2012-09-18 15:40:05 -07:00
Josh Durgin
d1252ea21e cls_lock_client: fix indentation
Add indentation settings to header, and reindent.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2012-09-18 15:39:56 -07:00
Josh Durgin
bf2e489248 cls_lock_client: change modified reference parameters to pointers
This makes it clear which parameters are modified,
as our style guide states.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2012-09-18 15:39:43 -07:00
Josh Durgin
2dca3a8616 cls_lock_client: clean up reference parameters
These should all be const. The remaining reference parameters
will be converted to pointers in another commit.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2012-09-18 15:36:11 -07:00
Josh Durgin
e71fdc75be cls_lock: fix some spacing
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2012-09-18 15:36:08 -07:00
Yehuda Sadeh
b69a9599d8 cls_lock: specify librados namespace explicitly
librados namespace was not specified, hence required including
source files to add using namespace. This fixes it.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-09-18 15:35:30 -07:00
Sage Weil
55673babb9 radosgw-admin: fix cli test
Signed-off-by: Sage Weil <sage@inktank.com>
2012-09-18 15:28:32 -07:00
Josh Durgin
3372f1471c rbd: only open the destination pool for import
Otherwise importing into another pool when the default pool, rbd,
doesn't exist results in an error trying to open the rbd pool.

Reported-by: Sébastien Han <han.sebastien@gmail.com>
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2012-09-18 15:21:49 -07:00
Josh Durgin
ad2ba8e606 qa: test args for rbd import
Make sure that --pool/--dest-pool and --image/--dest all work
interchangeably.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2012-09-18 15:21:43 -07:00
Josh Durgin
d14a31d387 rbd: make --pool/--image args easier to understand for import
There's no need to set the default pool in set_pool_image_name - this
is done later, in a way that doesn't ignore --pool if --dest-pool
is not specified.

This means --pool and --image can be used with import, just like
the rest of the commands. Without this change, --dest and --dest-pool
had to be used, and --pool would be silently ignored for rbd import.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2012-09-18 15:21:39 -07:00
Josh Durgin
a583a605aa librbd, cls_rbd: close snapshot creation race with old format
If two clients created a snapshot at the same time, the one with the
higher snapshot id might be created first, so the lower snapshot id
would be added to the snapshot context and the snapshot seq would be
set to the lower one.

Instead of allowing this to happen, return -ESTALE if the snapshot id
is lower than the currently stored snapshot sequence number. On the
client side, get a new id and retry if this error is encountered.

Backport: argonaut
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2012-09-18 15:21:21 -07:00
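A minimal sketch of the retry scheme from a583a605aa, using an in-memory stand-in for the image header. The class ImageHeader, the function create_snapshot, the first_id parameter, and the retry limit are hypothetical names chosen for illustration; only the -ESTALE check and the "get a new id and retry" behavior come from the commit message.

```python
import errno
import itertools


class ImageHeader:
    """In-memory stand-in for the header that stores the snapshot seq."""

    def __init__(self):
        self.snap_seq = 0
        self.snaps = {}

    def snapshot_add(self, snap_id, name):
        # Reject ids lower than the currently stored snapshot sequence number.
        if snap_id < self.snap_seq:
            return -errno.ESTALE
        self.snaps[snap_id] = name
        self.snap_seq = snap_id
        return 0


def create_snapshot(header, allocate_id, name, first_id=None, max_retries=5):
    """Client side: on -ESTALE, allocate a fresh id and try again."""
    snap_id = allocate_id() if first_id is None else first_id
    for _ in range(max_retries):
        r = header.snapshot_add(snap_id, name)
        if r != -errno.ESTALE:
            return r, snap_id
        snap_id = allocate_id()
    return -errno.ESTALE, None


# Simulate the race: A and B allocate ids in order, but B writes first.
ids = itertools.count(3)
allocate_id = lambda: next(ids)
header = ImageHeader()
id_a = allocate_id()
id_b = allocate_id()
result_b = create_snapshot(header, allocate_id, "snap-b", first_id=id_b)
result_a = create_snapshot(header, allocate_id, "snap-a", first_id=id_a)
```

In this simulation client A's first attempt is rejected as stale, so it ends up with a newer id than B and the stored sequence number never moves backwards.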
Sage Weil
a4833bb293 librbd: fix delete[]
CID 716902: Non-array delete for scalars (DELETE_ARRAY)
At (15): Deleting array variable "buf" with non-array delete in "delete buf".

Signed-off-by: Sage Weil <sage@inktank.com>
2012-09-18 15:19:47 -07:00
Josh Durgin
3401f004cd doc: clarify rbd man page (esp. layering)
* a clone's size can't be overridden
* note which commands require format 2
* clarify details of copy
* add examples for cloning
* add pool to map example for consistency
* fix a couple warnings and re-sync man page with rst

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2012-09-18 15:19:17 -07:00
Josh Durgin
582001eb49 rbd: add --format option
This chooses whether to use the original (supported by krbd)
or the new (supports layering) format.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2012-09-18 15:19:07 -07:00
Josh Durgin
a1124193c2 librbd: prevent racing clone and snap unprotect
If the following sequence of events occurred,
a clone could be created of an unprotected snapshot:

1. A: begin clone - check that snap foo is protected
2. B: rbd unprotect snap foo
3. B: check that all pools have no clones of foo
4. B: unprotect snap foo
5. A: finish creating clone of foo, add it as a child

To stop this from happening, check at the beginning and end of
cloning that the parent snapshot is protected. If it is not,
or checking protection status fails (possibly because the parent
snapshot was removed), remove the clone and return an error.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2012-09-18 15:18:59 -07:00
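The check-before-and-after approach described in a1124193c2 might look like the sketch below. The three callbacks and the CloneError type are hypothetical; the real librbd code is C++ and is not shown in this log. The demo replays the interleaving from the numbered steps, with client B unprotecting the snapshot while client A is in the middle of cloning.

```python
class CloneError(Exception):
    """Raised when the parent snapshot is not (or no longer) protected."""


def clone_with_protection_check(is_protected, create_clone, remove_clone):
    # First check: refuse to start if the parent snapshot is unprotected.
    if not is_protected():
        raise CloneError("parent snapshot is not protected")
    create_clone()
    # Second check: a failed lookup (for example, the snapshot was removed)
    # is treated the same as "not protected".
    try:
        still_protected = is_protected()
    except Exception:
        still_protected = False
    if not still_protected:
        remove_clone()
        raise CloneError("parent snapshot lost protection during clone")


# Demo state: one parent snapshot and its list of children.
state = {"protected": True, "children": []}


def racing_create():
    # Client B sees no clones yet and unprotects the snapshot...
    state["protected"] = False
    # ...then client A finishes creating the clone and registers the child.
    state["children"].append("child")


def remove():
    state["children"].remove("child")


try:
    clone_with_protection_check(lambda: state["protected"], racing_create, remove)
    raised = False
except CloneError:
    raised = True
```

With the second check in place, the racing clone is rolled back instead of being left attached to an unprotected snapshot.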
Dan Mick
e85a238303 rbd: add "children" command, update cli test files
Fixes: #2720
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2012-09-18 15:18:50 -07:00
Dan Mick
bd9405844b librbd: add {rbd_}list_children() methods
These iterate over all pools and check for children of a
particular snapshot.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
2012-09-18 15:18:27 -07:00
Sage Weil
f6b2f79c39 mon: make heartbeat grace and down out interval scaling optional
Signed-off-by: Sage Weil <sage@inktank.com>
2012-09-18 14:39:01 -07:00
Sage Weil
be5039155e mon: add tunable to control laggy probability weighting. simplify decoding.
Default to .3. Setting to 0 effectively turns this off.

Also make OSDMap::osd_xinfo_t decode into a float to simplify the
arithmetic conversions.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-09-18 14:39:01 -07:00
Sage Weil
5499778f8d mon: apply grace period scaling to mon_osd_down_out_interval
Scale the down/out interval the same way we do the heartbeat grace, so that
we give laggy osds a bit longer to recover.

See #3047.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-09-18 14:39:01 -07:00
Sage Weil
2ad62d5256 mon: decay laggy calculations over time
Add a configurable halflife for the laggy probability and duration and
apply it at the time those values are used to adjust the heartbeat grace
period.  Both are multiplied together, so it doesn't matter which you
think is being decayed (the probability or the interval).

Default to an hour.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-09-18 14:39:01 -07:00
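One way to picture the half-life decay from 2ad62d5256 is the sketch below. It assumes a plain exponential decay applied to the product of probability and interval at the moment the grace is computed; the function names, the seconds unit, and the way elapsed time is measured are assumptions, while the one-hour default comes from the commit message.

```python
def decay_factor(elapsed, halflife):
    """Fraction of the original weight left after `elapsed` seconds."""
    if halflife <= 0:
        raise ValueError("halflife must be positive")
    return 0.5 ** (elapsed / halflife)


def decayed_grace(grace, laggy_probability, laggy_interval, elapsed,
                  halflife=3600.0):
    # The probability and interval are multiplied together, so applying the
    # decay to either one gives the same adjustment.
    adjustment = laggy_probability * laggy_interval
    return grace + decay_factor(elapsed, halflife) * adjustment
```

After exactly one half-life the adjustment term is halved, and it approaches zero as the laggy history ages.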
Sage Weil
abd2ae7423 mon: factor reporter lagginess into grace adjustment
Use reporters as a proxy for laggy subclusters within the overall cluster.
See #3046.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-09-18 14:39:00 -07:00
Sage Weil
adf0fe6a10 mon: scale heartbeat grace based on laggy probability, interval
If, based on historical behavior, an observed osd failure is likely to be
due to unresponsiveness and not the daemon stopping, scale the heartbeat
grace period accordingly:

 grace' = grace + laggy_probability * laggy_interval

This will avoid fruitlessly marking OSDs down and generating additional
map update overhead when the cluster is overloaded and potentially
struggling to keep up with map updates.   See #3045.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-09-18 14:39:00 -07:00
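The grace formula in adf0fe6a10 translates directly into code. The function name scaled_grace and the range check on the probability are illustrative additions; the arithmetic is the formula from the commit message.

```python
def scaled_grace(grace, laggy_probability, laggy_interval):
    """Return grace' = grace + laggy_probability * laggy_interval."""
    if not 0.0 <= laggy_probability <= 1.0:
        raise ValueError("laggy_probability must be within [0, 1]")
    return grace + laggy_probability * laggy_interval
```

An OSD with no laggy history (probability 0) keeps the base grace, while an OSD that has often been unresponsive for long stretches is given proportionally more time before it is marked down.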
Sage Weil
3f51d31639 mon: check failures in tick
Currently we only trigger a failure on receipt of a failure report.  Move
the checks into a helper and check during tick() too, so that we will
trigger failures even when the thresholds are not met at failure report
time.  This is rarely true now, but will be true once we locally scale the
grace period.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-09-18 14:39:00 -07:00
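The restructuring in 3f51d31639 (a shared helper called both on receipt of a failure report and from tick()) can be modeled as below. This is a simplified sketch: the FailureTracker class, the failed_since timestamp, and the single fixed grace value are assumptions, and the real monitor applies additional thresholds that are not described in this log.

```python
class FailureTracker:
    """Toy model: mark an OSD down once it has been failed for `grace` seconds."""

    def __init__(self, grace):
        self.grace = grace
        self.failed_since = {}  # osd id -> time the failure was first observed
        self.down = set()

    def _check_failure(self, osd, now):
        # Shared helper used by both report_failure() and tick().
        if osd in self.down:
            return False
        if now - self.failed_since[osd] >= self.grace:
            self.down.add(osd)
            return True
        return False

    def report_failure(self, osd, failed_since, now):
        self.failed_since.setdefault(osd, failed_since)
        return self._check_failure(osd, now)

    def tick(self, now):
        # Re-check pending reports even though no new report has arrived.
        return [osd for osd in sorted(self.failed_since)
                if self._check_failure(osd, now)]
```

Without the tick() path, a report that arrives before the grace period has elapsed would never be re-evaluated unless another report came in.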