32750 Commits

Author SHA1 Message Date
Sage Weil
008ce85d19 Merge pull request #1614 from ceph/wip-7964
Wip 7964

Reviewed-by: Sage Weil <sage@inktank.com>
2014-04-07 14:01:58 -07:00
Sage Weil
71fc7ae306 Merge pull request #1616 from ceph/wip-7916
ReplicatedPG: improve get_object_context debugging

Reviewed-by: Sage Weil <sage@inktank.com>
2014-04-07 13:59:22 -07:00
Sage Weil
1a9952c605 Merge pull request #1613 from ceph/wip-7994
OSD: _share_map_outgoing whenever sending a message to a peer

Reviewed-by: Sage Weil <sage@inktank.com>
2014-04-07 10:57:33 -07:00
Sage Weil
51da3bb07a mds: fix uninit val in MMDSSlaveRequest
Signed-off-by: Sage Weil <sage@inktank.com>
2014-04-07 08:22:32 -07:00
Sage Weil
db281bf51e Merge pull request #1607 from ceph/wip-7997
mon: wait for quorum for MMonGetVersion

Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
2014-04-07 08:11:00 -07:00
Sage Weil
68e27116d0 Merge pull request #1609 from ceph/wip-7739
mds: fix some uninitialized message fields

Reviewed-by: Zheng Yan <zheng.z.yan@intel.com>
2014-04-06 17:56:05 -07:00
Sage Weil
76cbd5dd82 mds: fix uninit MMDSSlaveRequest lock_type
Signed-off-by: Sage Weil <sage@inktank.com>
2014-04-06 17:36:52 -07:00
Samuel Just
c0fd3df41e Merge pull request #1608 from ceph/wip-8002
osd: fix osd map subscribe on YOU_DIED osd_ping

Reviewed-by: Samuel Just <sam.just@inktank.com>
2014-04-06 16:32:38 -07:00
Sage Weil
4ea9e4818f osd: fix map subscription in YOU_DIED osd_ping handler
If we have epoch X and find out we died as of epoch Y, we still want to
request X+1.  Among other things, this fixes a 'stall' if Y happens to be
the most recent map published and no new maps are generated because we will
never get anything back from our subscription.

This makes this osdmap_subscribe() caller match every other caller by
passing in current epoch + 1.

Fixes: #8002
Signed-off-by: Sage Weil <sage@inktank.com>
2014-04-06 16:03:50 -07:00
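The epoch rule in commit 4ea9e4818f can be illustrated with a minimal sketch. The names `subscribe_start` and `maps_delivered` are hypothetical and are not Ceph functions; the monitor is modeled as simply returning every published epoch from the requested start up to the latest one.

```cpp
#include <cstdint>
#include <vector>

using epoch_t = std::uint32_t;

// Hypothetical monitor model: returns the published epochs in [start, latest].
std::vector<epoch_t> maps_delivered(epoch_t start, epoch_t latest) {
  std::vector<epoch_t> out;
  for (epoch_t e = start; e <= latest; ++e) {
    out.push_back(e);
  }
  return out;
}

// The rule from the commit message: always subscribe from current epoch + 1,
// regardless of the epoch at which we were told we died.
epoch_t subscribe_start(epoch_t current) {
  return current + 1;
}
```

With a current epoch X = 10 and a latest published epoch Y = 12, starting at X+1 yields epochs 11 and 12, whereas any start beyond Y yields nothing until a new map is published, which is the stall the message describes.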
Sage Weil
2f7522c83a msgr: add ms_dump_on_send option
This is useful only for debugging.  The encoded contents of a message are
dumped to the log on message send.  This is useful when valgrind is
triggering warnings about uninitialized memory in messages because the
call chain will indicate which message type is to blame, whereas the
usual writer thread context does not tell us any useful information.

Signed-off-by: Sage Weil <sage@inktank.com>
2014-04-06 13:19:11 -07:00
Sage Weil
87e6a62e4f mds: fix uninitialized fields in MDiscover
Fixes: #7739
Signed-off-by: Sage Weil <sage@inktank.com>
2014-04-06 13:18:40 -07:00
Sage Weil
67fd4218d3 mon: wait for quorum for MMonGetVersion
We should not respond to checks for map versions when we are in the
probing or electing states or else clients will get incorrect results when
they ask what the latest map version is.

Fixes: #7997
Signed-off-by: Sage Weil <sage@inktank.com>
2014-04-05 16:58:55 -07:00
Sage Weil
6a4c50d7f2 Merge pull request #1605 from ceph/wip-7993
ceph-post-file: use getopt for multiple options, add longopts to help

Reviewed-by: Sage Weil <sage@inktank.com>
2014-04-04 18:07:52 -07:00
Greg Farnum
232ac1a52a OSD: _share_map_outgoing whenever sending a message to a peer
This ensures that they get new maps before an op which requires them (that
they would then request from the monitor).

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-04-04 16:06:05 -07:00
Dan Mick
6f40b64463 ceph-post-file: use getopt for multiple options, add longopts to help
Fixes: #7993
Signed-off-by: Dan Mick <dan.mick@inktank.com>
2014-04-04 15:26:42 -07:00
Samuel Just
ebb865b12c Merge pull request #1603 from ceph/wip-7983
osd/ReplicatedPG: do not hit_set_persist while potentially backfilling hit_set_*

Reviewed-by: Samuel Just <sam.just@inktank.com>
2014-04-04 15:17:00 -07:00
Dan Mick
f2edd959fc Merge pull request #1604 from ceph/wip-7992
ceph-post-file: fix installation of ssh key files
2014-04-04 14:41:02 -07:00
Sage Weil
2f6a62b457 ceph-post-file: fix installation of ssh key files
Fixes: #7992
Signed-off-by: Sage Weil <sage@inktank.com>
2014-04-04 14:39:56 -07:00
Sage Weil
e02b7f93ab osd/ReplicatedPG: do not hit_set_persist while potentially backfilling hit_set_*
The hit_set transactions may include both a modify of the new hit_set and
deletion of an old one, spanning the backfill boundary, and we may end up
sending a backfill target a blank transaction that does not correctly
remove the old object.  Later it will notice the stray object and
throw an assertion.

Fix this by skipping hit_set_persist() if any of the backfill targets are
still working on the very first hash value in the PG (which is where all
of the hit_set objects live).  This is coarse but simple.

Another solution would be to send separate ops for the trim/deletion and
the new hit_set update, but that is a bit more complex and adds a bit more
runtime overhead (twice the messages).

Fixes: #7983
Signed-off-by: Sage Weil <sage@inktank.com>
2014-04-04 13:56:33 -07:00
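The coarse check described in commit e02b7f93ab might look roughly like the sketch below. `BackfillTarget`, `current_hash`, and `can_persist_hit_set` are hypothetical names chosen for illustration, and backfill progress is reduced to a single hash position per target; the real ReplicatedPG code tracks this differently.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a backfill target's progress through the PG.
struct BackfillTarget {
  std::uint32_t current_hash;  // hash position the target is working on
};

// Returns true when it is safe to persist the hit_set: no target is still
// on the first hash value, where the hit_set objects are assumed to live.
bool can_persist_hit_set(const std::vector<BackfillTarget>& targets,
                         std::uint32_t first_hash) {
  for (const BackfillTarget& t : targets) {
    if (t.current_hash <= first_hash) {
      return false;  // a target may still be backfilling hit_set_* objects
    }
  }
  return true;
}
```

The check is deliberately conservative: a single target at the first hash value is enough to skip persisting for this round.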
Sage Weil
4aef403dbc doc/release-notes: note about emperor backport of mon auth fix
Signed-off-by: Sage Weil <sage@inktank.com>
2014-04-04 12:59:41 -07:00
Joao Eduardo Luis
db266a3fb2 mon: MonCommands.h: have 'auth' read-only operations require 'x' cap
This reintroduces the same semantics that were in place in dumpling prior
to the refactoring of the cap/command matching code.

We haven't added this requirement to auth read-write operations as that
would have the potential to break a lot of well-configured keyrings once
the users upgraded, without any significant gain -- we assume that if
they have set 'rw' caps on a given entity, they do indeed expect said
entity to be somewhat privileged with regard to monitor access.

Fixes: #7919

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2014-04-04 12:51:27 -07:00
Samuel Just
82d2551c8c Merge pull request #1602 from ceph/wip-cache-create-fix
ReplicatedPG: fix CEPH_OSD_OP_CREATE on cache pools

Reviewed-by: Samuel Just <sam.just@inktank.com>
2014-04-04 10:34:40 -07:00
Ilya Dryomov
b219c8f917 ReplicatedPG: fix CEPH_OSD_OP_CREATE on cache pools
The following

./ceph osd pool create data-cache 8 8
./ceph osd tier add data data-cache
./ceph osd tier cache-mode data-cache writeback
./ceph osd tier set-overlay data data-cache

./rados -p data create foo
./rados -p data stat foo

results in

  error stat-ing data/foo: No such file or directory

even though foo exists in the data-cache pool, as it should.  STAT
checks for (exists && !is_whiteout()), but the whiteout flag isn't
cleared on CREATE as it is on WRITE and WRITEFULL.  The problem is
that, for newly created 0-sized cache pool objects, the CREATE handler in
do_osd_ops() doesn't get a chance to queue OP_TOUCH, and so the logic
in prepare_transaction() considers CREATE to be a read and therefore
doesn't clear whiteout.  Fix it by allowing the CREATE handler to queue
OP_TOUCH at all times, mimicking WRITE and WRITEFULL behaviour.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
2014-04-04 20:23:14 +04:00
Sage Weil
2bd548e915 Merge pull request #1600 from ceph/wip-7922
Wip 7922

Passes my manual testing and the new teuthology test case.

Reviewed-by: Sage Weil <sage@inktank.com>
2014-04-04 09:22:42 -07:00
David Zafman
be8b228140 osd: Send REJECT to all previously acquired reservations
When getting a REJECT from a backfill target, tell already GRANTed targets to
go back to RepNotRecovering state by sending a REJECT to them.

Fixes: #7922

Signed-off-by: David Zafman <david.zafman@inktank.com>
2014-04-03 22:13:17 -07:00
Sage Weil
18201efd65 doc/release-notes: v0.79 release notes
Signed-off-by: Sage Weil <sage@inktank.com>
2014-04-03 18:28:15 -07:00
Dan Mick
4dc62669ec Fix byte-order dependency in calculation of initial challenge
Fixes: #7977
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2014-04-03 18:28:15 -07:00
Samuel Just
6cb50d74a3 ReplicatedPG::_delete_oid: adjust num_object_clones
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-04-03 17:53:42 -07:00
Samuel Just
0f2ab4dd76 ReplicatedPG::agent_choose_mode: improve debugging
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-04-03 17:53:40 -07:00
Sage Weil
80a1ed8a74 Merge pull request #1599 from ceph/wip-7978
rgw: only look at next placement rule if we're not at the last rule

Reviewed-by: Sage Weil <sage@inktank.com>
2014-04-03 17:44:13 -07:00
Yehuda Sadeh
0552ecbabb rgw: only look at next placement rule if we're not at the last rule
Fixes: #7978
We tried to move to the next placement rule, but we were already at the
last one, so we ended up looping forever.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2014-04-03 15:15:41 -07:00
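The guard described in commit 0552ecbabb follows a common pattern, sketched here under simplifying assumptions. `rule_index_for` and `rule_starts` are hypothetical; the actual rgw manifest code is not reproduced, and placement rules are reduced to a sorted list of starting offsets.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: rule_starts holds the starting offset of each
// placement rule in ascending order and must not be empty.
std::size_t rule_index_for(const std::vector<std::uint64_t>& rule_starts,
                           std::uint64_t ofs) {
  std::size_t i = 0;
  // Only peek at rule i + 1 when rule i is not the last one; at the end of
  // the list there is no next rule to advance to, so the loop must stop.
  while (i + 1 < rule_starts.size() && ofs >= rule_starts[i + 1]) {
    ++i;
  }
  return i;
}
```

An offset past the start of the last rule simply resolves to the last rule instead of trying to advance further.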
Samuel Just
eb23ac46e9 ReplicatedPG::agent_choose_mode: use num_user_objects for target_max_bytes calc
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-04-03 13:04:41 -07:00
Samuel Just
cc9ca67af3 ReplicatedPG::agent_choose_mode: exclude omap objects for ec base pool
Fixes: #7831
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-04-03 13:04:03 -07:00
Samuel Just
a130a4452e osd/: track num_objects_omap in pg stats
Fixes: #7831
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-04-03 13:04:02 -07:00
Samuel Just
9894a55d3b ReplicatedPG: handle FLAG_OMAP on promote and copyfrom
Fixes: #7967
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-04-03 13:03:56 -07:00
Sage Weil
31df91e091 osd: add 'osd debug reject backfill probability' option
This will make the OSD randomly reject backfill reservation requests.  This
exercises the failure code paths but does not break overall behavior
because the primary will back off and retry later.

This should help us reproduce #7922.

Signed-off-by: Sage Weil <sage@inktank.com>
2014-04-03 12:06:08 -07:00
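A probabilistic rejection of the kind commit 31df91e091 describes could be sketched as follows. `should_reject_backfill` is a hypothetical helper, and `probability` stands in for the value of the new debug option; how the OSD actually draws its random number is not shown in the commit message.

```cpp
#include <random>

// Hypothetical helper; `probability` plays the role of the new option.
bool should_reject_backfill(double probability, std::mt19937& rng) {
  if (probability <= 0.0) return false;  // option disabled
  if (probability >= 1.0) return true;   // always reject
  std::uniform_real_distribution<double> roll(0.0, 1.0);
  return roll(rng) < probability;
}
```

Because the primary backs off and retries, a rejection only delays the reservation, which is why the option can exercise the failure path without breaking overall behavior.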
Sage Weil
90c4540b5b Merge pull request #1598 from ceph/wip-test-alloc-hint-ec-fix
qa: test_alloc_hint: set ec ruleset-failure-domain to osd

Reviewed-by: Sage Weil <sage@inktank.com>
2014-04-03 11:45:21 -07:00
Sage Weil
9f41975c40 Merge pull request #1581 from ceph/wip-init
a few deb changes
2014-04-03 11:44:29 -07:00
Ilya Dryomov
d323634024 qa: test_alloc_hint: set ec ruleset-failure-domain to osd
Create a custom profile with ruleset-failure-domain=osd.  (The default
ruleset-failure-domain=host won't do because this script assumes and
works only if all osds are on the same host.)  While at it, set k and m
explicitly to avoid trouble in the future.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
2014-04-03 21:16:14 +04:00
Sage Weil
60d1975682 Merge pull request #1593 from dachary/wip-vstart-erasure-code-default
vstart: set a sensible default for ruleset-failure-domain

Reviewed-by: Ilya Dryomov <ilya.dryomov@inktank.com>
2014-04-03 09:57:49 -07:00
Sage Weil
cdcd8368a7 Merge pull request #1596 from ceph/wip-vstop-unmap
Unmap rbd images when stopping the whole cluster

Reviewed-by: Sage Weil <sage@inktank.com>
2014-04-03 09:57:04 -07:00
Ilya Dryomov
8e46fe00fa stop.sh: unmap rbd images when stopping the whole cluster
Unmap rbd images when stopping the whole cluster.  Not doing so results
in images that cannot be unmapped until the same cluster is brought
back up.  Issue a warning if we failed to unmap all images.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
2014-04-03 18:14:57 +04:00
Ilya Dryomov
afc5dc530c stop.sh: do not trace commands
Command tracing here doesn't bring any value and simply pollutes the
terminal, as the script always runs to completion.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
2014-04-03 18:14:57 +04:00
Ilya Dryomov
0110a19b50 stop.sh: indent 4 spaces universally
Currently there is a mix of tabs and 4-space indentation.  Switch to
4-space indentation.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
2014-04-03 18:03:23 +04:00
Loic Dachary
e4a8535ad1 vstart: set a sensible default for ruleset-failure-domain
Set ruleset-failure-domain=osd so that

  ./ceph osd pool create ecpool 12 12 erasure
  ./rados --pool ecpool put SOMETHING /etc/group

works by default. When using a vstart cluster the default failure
domain (host) won't work because all OSDs are in "localhost".

Signed-off-by: Loic Dachary <loic@dachary.org>
2014-04-03 14:07:19 +02:00
Josh Durgin
89f38c09f8 Merge pull request #1592 from ceph/wip-7965
lockdep: fix when instantiated multiple times (bug 7965)

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2014-04-02 17:03:09 -07:00
Sage Weil
c43822cdaf lockdep: reset state on shutdown
If we shut down, clear out all of the lockdep state.  This ensures that if
we start up again on another cct, we will not be confused by old type ids
and dependency state.

Possibly contributed to #7965.

Signed-off-by: Sage Weil <sage@inktank.com>
2014-04-02 16:46:33 -07:00
Sage Weil
7a49f3da55 lockdep: do not initialize if already started
If we have already registered a cct for lockdep, do not accept another one.
We already check that the cct matches when we shut down.  Thus we will run
for the life span of a single cct and no longer.

Fixes: #7965
Signed-off-by: Sage Weil <sage@inktank.com>
2014-04-02 16:46:30 -07:00
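The two lockdep commits above (c43822cdaf and 7a49f3da55) together describe a single-owner lifecycle. The sketch below illustrates that idea only; `Context`, `LockdepState`, `register_context`, and `unregister_context` are hypothetical names and do not mirror the real lockdep interface.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the context (cct) that owns lockdep.
struct Context {};

struct LockdepState {
  const Context* owner = nullptr;       // the single registered context
  std::map<std::string, int> lock_ids;  // lock name -> type id

  // Refuse a second registration while one context is already active.
  bool register_context(const Context* c) {
    if (owner != nullptr) return false;
    owner = c;
    return true;
  }

  // Only the registered context may shut lockdep down; all state is
  // cleared so that a later context starts from scratch.
  bool unregister_context(const Context* c) {
    if (owner == nullptr || owner != c) return false;
    owner = nullptr;
    lock_ids.clear();
    return true;
  }
};
```

Clearing the id table on shutdown is what keeps a later context from being confused by stale type ids and dependency state.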
Samuel Just
eae5a37779 Merge pull request #1591 from ceph/wip-7915
mon: bump snap_epoch when adding a tier (fixes 7915)

Reviewed-by: Samuel Just <sam.just@inktank.com>
2014-04-02 16:13:59 -07:00
Sage Weil
6bf46e23e0 OSDMap: bump snap_epoch when adding a tier
When we make an existing pool a tier, we start copying the snap metadata
from the base tier.  That includes removed_snaps.  In order for the OSD
to recognize that this value is changing for the first time, we need to
set snap_epoch, or else the OSD doesn't update its in-memory PGPool
with removed snaps and we eventually hit an assertion failure because
PGPool::cached_remove_snaps is incorrect (e.g., empty).

Fix this by bumping snap_epoch when we add the new tier.

Fixes: #7915
Signed-off-by: Sage Weil <sage@inktank.com>
2014-04-02 16:03:37 -07:00
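Why the snap_epoch bump in commit 6bf46e23e0 matters can be shown with a small model. Everything here is a simplified assumption: `PoolInfo`, `CachedPool`, and `add_tier` are hypothetical, and the refresh rule (update the cache only when snap_epoch equals the epoch of the map being processed) is an assumed reading of the commit message rather than a copy of the OSD code.

```cpp
#include <cstdint>
#include <set>

using epoch_t = std::uint32_t;

// Hypothetical slice of the pool metadata carried in the map.
struct PoolInfo {
  epoch_t snap_epoch = 0;
  std::set<std::uint64_t> removed_snaps;
};

// Hypothetical stand-in for the OSD's in-memory copy of the pool.
struct CachedPool {
  std::set<std::uint64_t> cached_removed_snaps;

  // Assumed refresh rule: the cache is only updated when the pool's
  // snap_epoch matches the epoch of the map being processed.
  void update(const PoolInfo& pi, epoch_t map_epoch) {
    if (pi.snap_epoch == map_epoch) {
      cached_removed_snaps = pi.removed_snaps;
    }
  }
};

// Making a pool a tier copies the snap metadata from the base pool and
// bumps snap_epoch so that OSDs notice the change.
void add_tier(const PoolInfo& base, PoolInfo& tier, epoch_t new_epoch) {
  tier.removed_snaps = base.removed_snaps;
  tier.snap_epoch = new_epoch;
}
```

In this model, copying removed_snaps without touching snap_epoch leaves the cached set empty, which corresponds to the incorrect cache the message says eventually triggers the assertion.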