Commit Graph

32840 Commits

Author SHA1 Message Date
Yan, Zheng
2d5bd84b93 client: assign implemented caps to caps field of MClientCaps
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-04-05 00:47:51 +08:00
Yan, Zheng
1538a98a4f client: hold Fcr caps during readahead
Fcr caps prevent the file from being truncated.

Fixes: #7958
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-04-05 00:47:51 +08:00
Yan, Zheng
701c22a81b client: implement RDCACHE reference tracking
Make the code able to track Fc caps used by async buffer reads.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-04-05 00:47:45 +08:00
Ilya Dryomov
b219c8f917 ReplicatedPG: fix CEPH_OSD_OP_CREATE on cache pools
The following

./ceph osd pool create data-cache 8 8
./ceph osd tier add data data-cache
./ceph osd tier cache-mode data-cache writeback
./ceph osd tier set-overlay data data-cache

./rados -p data create foo
./rados -p data stat foo

results in

  error stat-ing data/foo: No such file or directory

even though foo exists in the data-cache pool, as it should.  STAT
checks for (exists && !is_whiteout()), but the whiteout flag isn't
cleared on CREATE as it is on WRITE and WRITEFULL.  The problem is
that, for newly created 0-sized cache pool objects, CREATE handler in
do_osd_ops() doesn't get a chance to queue OP_TOUCH, and so the logic
in prepare_transaction() considers CREATE to be a read and therefore
doesn't clear whiteout.  Fix it by allowing CREATE handler to queue
OP_TOUCH at all times, mimicking WRITE and WRITEFULL behaviour.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
2014-04-04 20:23:14 +04:00
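
The whiteout behaviour described in commit b219c8f917 can be illustrated with a toy model. This is a minimal sketch, not the ReplicatedPG code: `CacheObject`, `apply_create`, and `stat_visible` are hypothetical names, and the starting state (an object that exists but is still flagged as a whiteout) is an assumption. The only rules taken from the commit message are that STAT requires `exists && !is_whiteout()` and that an op with nothing queued is treated as a read, so it does not clear the flag.

```python
from dataclasses import dataclass


@dataclass
class CacheObject:
    # Hypothetical stand-in for the state of a cache pool object.
    exists: bool = True
    whiteout: bool = True  # assumed starting state for this illustration


def apply_create(obj, always_queue_touch):
    """Toy CREATE handler: optionally queue OP_TOUCH, then apply the write rule."""
    queued_ops = []
    if always_queue_touch:
        queued_ops.append("OP_TOUCH")
    # An op that queued nothing is treated as a read; only writes clear whiteout.
    if queued_ops:
        obj.whiteout = False
    return obj


def stat_visible(obj):
    """Mirror of the STAT check quoted in the commit message."""
    return obj.exists and not obj.whiteout


before_fix = stat_visible(apply_create(CacheObject(), always_queue_touch=False))
after_fix = stat_visible(apply_create(CacheObject(), always_queue_touch=True))
```

In this model `before_fix` corresponds to the "No such file or directory" case, and `after_fix` to the behaviour once CREATE always queues OP_TOUCH.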
Sage Weil
2bd548e915 Merge pull request #1600 from ceph/wip-7922
Wip 7922

Passes my manual testing and the new teuthology test case.

Reviewed-by: Sage Weil <sage@inktank.com>
2014-04-04 09:22:42 -07:00
David Zafman
be8b228140 osd: Send REJECT to all previously acquired reservations
When getting a REJECT from a backfill target, tell already GRANTed targets to
go back to RepNotRecovering state by sending a REJECT to them.

Fixes: #7922

Signed-off-by: David Zafman <david.zafman@inktank.com>
2014-04-03 22:13:17 -07:00
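
The rule in commit be8b228140 can be sketched as a small state update. This is an illustrative model only: `reject_granted_targets` and the state names `GRANTED` and `REQUESTED` are hypothetical; `RepNotRecovering` is the only state named in the commit message.

```python
def reject_granted_targets(states, rejecting_target):
    """Send every already GRANTed target back to RepNotRecovering.

    states maps a backfill target name to its reservation state.
    Returns the targets that would be sent a REJECT.
    """
    released = []
    for target in sorted(states):
        if target != rejecting_target and states[target] == "GRANTED":
            states[target] = "RepNotRecovering"
            released.append(target)
    return released


states = {"osd.1": "GRANTED", "osd.2": "GRANTED", "osd.3": "REQUESTED"}
released = reject_granted_targets(states, rejecting_target="osd.3")
```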
Sage Weil
18201efd65 doc/release-notes: v0.79 release notes
Signed-off-by: Sage Weil <sage@inktank.com>
2014-04-03 18:28:15 -07:00
Dan Mick
4dc62669ec Fix byte-order dependency in calculation of initial challenge
Fixes: #7977
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2014-04-03 18:28:15 -07:00
Samuel Just
6cb50d74a3 ReplicatedPG::_delete_oid: adjust num_object_clones
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-04-03 17:53:42 -07:00
Samuel Just
0f2ab4dd76 ReplicatedPG::agent_choose_mode: improve debugging
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-04-03 17:53:40 -07:00
Sage Weil
80a1ed8a74 Merge pull request #1599 from ceph/wip-7978
rgw: only look at next placement rule if we're not at the last rule

Reviewed-by: Sage Weil <sage@inktank.com>
2014-04-03 17:44:13 -07:00
Yehuda Sadeh
0552ecbabb rgw: only look at next placement rule if we're not at the last rule
Fixes: #7978
We tried to move to the next placement rule, but we were already at the
last one, so we ended up looping forever.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2014-04-03 15:15:41 -07:00
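
The guard described in commit 0552ecbabb can be sketched generically. This is not the rgw code: `rule_for_offset` is a hypothetical helper, and the idea that rules are keyed by a sorted starting offset is an assumption made for the illustration. The point it shows is that the loop inspects the next rule only when one exists, so it stops at the last rule instead of looping forever.

```python
def rule_for_offset(rule_starts, offset):
    """Return the index of the rule covering offset.

    rule_starts is a non-empty, sorted list of starting offsets (assumed layout).
    """
    i = 0
    # Look at the next rule only if we are not already at the last one.
    while i + 1 < len(rule_starts) and rule_starts[i + 1] <= offset:
        i += 1
    return i
```

For example, with rules starting at 0, 100, and 500, any offset at or beyond 500 resolves to the last rule and the loop ends there.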
Samuel Just
eb23ac46e9 ReplicatedPG::agent_choose_mode: use num_user_objects for target_max_bytes calc
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-04-03 13:04:41 -07:00
Samuel Just
cc9ca67af3 ReplicatedPG::agent_choose_mode: exclude omap objects for ec base pool
Fixes: #7831
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-04-03 13:04:03 -07:00
Samuel Just
a130a4452e osd/: track num_objects_omap in pg stats
Fixes: #7831
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-04-03 13:04:02 -07:00
Samuel Just
9894a55d3b ReplicatedPG: handle FLAG_OMAP on promote and copyfrom
Fixes: #7967
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-04-03 13:03:56 -07:00
Samuel Just
a11b3e8d63 ReplicatedPG::do_op: use get_object_context for list-snaps
find_object_context provides some niceties which we don't need since we know
the oid of the clones.  Problematically, it also returns ENOENT if the snap
requested happens to have been removed.  Even in such a case, the clone may
well still exist for other snaps.  Rather than modify find_object_context to
avoid this situation for this caller, we'll simply do it inline in do_op.

Fixes: #7858
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-04-03 12:53:51 -07:00
Samuel Just
78e9813c41 ReplicatedPG: do not create snapdir on head eviction
Head eviction implies that no clones are present.  Also, add
an exists flag to SnapSetContext in order to prevent an ssc from
a recent eviction from preventing a snap read from activating
the promotion machinery.

Fixes: #7858
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-04-03 12:53:51 -07:00
Sage Weil
31df91e091 osd: add 'osd debug reject backfill probability' option
This will make the OSD randomly reject backfill reservation requests.  This
exercises the failure code paths but does not break overall behavior
because the primary will back off and retry later.

This should help us reproduce #7922.

Signed-off-by: Sage Weil <sage@inktank.com>
2014-04-03 12:06:08 -07:00
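
The effect of the debug option in commit 31df91e091 can be sketched as random fault injection with retry. This is a hedged illustration, not OSD code: `maybe_reject`, `reserve_with_retry`, and `max_attempts` are hypothetical, and the real back-off timing is not modelled.

```python
import random


def maybe_reject(probability, rng):
    """Return True if this reservation request should be rejected."""
    return rng.random() < probability


def reserve_with_retry(probability, rng, max_attempts=1000):
    """Retry until a request is accepted; return the attempt number, or None."""
    for attempt in range(1, max_attempts + 1):
        if not maybe_reject(probability, rng):
            return attempt
    return None
```

Because the requester simply tries again later, a rejection probability below 1 exercises the failure path without preventing backfill from eventually proceeding.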
Sage Weil
90c4540b5b Merge pull request #1598 from ceph/wip-test-alloc-hint-ec-fix
qa: test_alloc_hint: set ec ruleset-failure-domain to osd

Reviewed-by: Sage Weil <sage@inktank.com>
2014-04-03 11:45:21 -07:00
Sage Weil
9f41975c40 Merge pull request #1581 from ceph/wip-init
a few deb changes
2014-04-03 11:44:29 -07:00
Ilya Dryomov
d323634024 qa: test_alloc_hint: set ec ruleset-failure-domain to osd
Create a custom profile with ruleset-failure-domain=osd.  (The default
ruleset-failure-domain=host won't do because this script assumes and
works only if all osds are on the same host.)  While at it, set k and m
explicitly to avoid troubles in the future.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
2014-04-03 21:16:14 +04:00
Sage Weil
60d1975682 Merge pull request #1593 from dachary/wip-vstart-erasure-code-default
vstart: set a sensible default for ruleset-failure-domain

Reviewed-by: Ilya Dryomov <ilya.dryomov@inktank.com>
2014-04-03 09:57:49 -07:00
Sage Weil
cdcd8368a7 Merge pull request #1596 from ceph/wip-vstop-unmap
Unmap rbd images when stopping the whole cluster

Reviewed-by: Sage Weil <sage@inktank.com>
2014-04-03 09:57:04 -07:00
Ilya Dryomov
8e46fe00fa stop.sh: unmap rbd images when stopping the whole cluster
Unmap rbd images when stopping the whole cluster.  Not doing so results
in images that cannot be unmapped until the same cluster is brought
back up.  Issue a warning if we failed to unmap all images.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
2014-04-03 18:14:57 +04:00
Ilya Dryomov
afc5dc530c stop.sh: do not trace commands
Command tracing here doesn't bring any value and simply pollutes the
terminal, as the script always runs to completion.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
2014-04-03 18:14:57 +04:00
Ilya Dryomov
0110a19b50 stop.sh: indent 4 spaces universally
Currently there is a mix of tabs and 4-space indentation.  Switch to
4-space indentation throughout.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
2014-04-03 18:03:23 +04:00
Loic Dachary
e4a8535ad1 vstart: set a sensible default for ruleset-failure-domain
Set ruleset-failure-domain=osd so that

  ./ceph osd pool create ecpool 12 12 erasure
  ./rados --pool ecpool put SOMETHING /etc/group

works by default. When using a vstart cluster the default failure
domain (host) won't work because all OSDs are in "localhost".

Signed-off-by: Loic Dachary <loic@dachary.org>
2014-04-03 14:07:19 +02:00
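
Why the host failure domain fails on a vstart cluster (commit e4a8535ad1) can be shown with a small check. This is a simplified sketch: `can_place` is a hypothetical helper, the three OSDs and the chunk count of 3 are assumed values for illustration, and real CRUSH placement is considerably more involved. It relies only on the fact that each chunk of an erasure-coded object must land in a distinct failure domain.

```python
def can_place(osds, failure_domain, chunks):
    """Return True if there are enough distinct failure domains for all chunks."""
    domains = {osd[failure_domain] for osd in osds}
    return len(domains) >= chunks


# A vstart-style cluster: several OSDs, all on the same host.
osds = [{"osd": i, "host": "localhost"} for i in range(3)]

host_ok = can_place(osds, "host", chunks=3)  # only one host is available
osd_ok = can_place(osds, "osd", chunks=3)    # one domain per OSD
```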
Josh Durgin
89f38c09f8 Merge pull request #1592 from ceph/wip-7965
lockdep: fix when instantiated multiple times (bug 7965)

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2014-04-02 17:03:09 -07:00
Sage Weil
c43822cdaf lockdep: reset state on shutdown
If we shut down, clear out all of the lockdep state.  This ensures that if
we start up again on another cct, we will not be confused by old type ids
and dependency state.

Possibly contributed to #7965.

Signed-off-by: Sage Weil <sage@inktank.com>
2014-04-02 16:46:33 -07:00
Sage Weil
7a49f3da55 lockdep: do not initialize if already started
If we have already registered a cct for lockdep, do not accept another one.
We already check that the cct matches when we shut down.  This way we will run
for the life span of a single cct and no longer.

Fixes: #7965
Signed-off-by: Sage Weil <sage@inktank.com>
2014-04-02 16:46:30 -07:00
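
The two lockdep commits above (c43822cdaf and 7a49f3da55) can be sketched together as a single-owner registry. This is a minimal model under stated assumptions: `LockdepRegistry`, `register`, `unregister`, and `lock_ids` are hypothetical names, not the lockdep API.

```python
class LockdepRegistry:
    """Toy single-owner registry: one context at a time, state cleared on shutdown."""

    def __init__(self):
        self.cct = None
        self.lock_ids = {}

    def register(self, cct):
        # Do not accept another context if one is already registered.
        if self.cct is not None:
            return False
        self.cct = cct
        return True

    def unregister(self, cct):
        # Only the registered context may shut lockdep down.
        if cct is not self.cct:
            return False
        self.cct = None
        # Clear old ids so a later context does not see stale state.
        self.lock_ids.clear()
        return True
```

A second `register` call is refused while the first context is live, and after `unregister` a new context starts from empty state.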
Samuel Just
eae5a37779 Merge pull request #1591 from ceph/wip-7915
mon: bump snap_epoch when adding a tier (fixes 7915)

Reviewed-by: Samuel Just <sam.just@inktank.com>
2014-04-02 16:13:59 -07:00
Sage Weil
6bf46e23e0 OSDMap: bump snap_epoch when adding a tier
When we make an existing pool a tier, we start copying the snap metadata
from the base tier.  That includes removed_snaps.  In order for the OSD
to recognize that this value is changing for the first time, we need to
set snap_epoch, or else the OSD doesn't update its in-memory PGPool
with removed snaps and we eventually hit an assertion failure because
PGPool::cached_remove_snaps is incorrect (e.g., empty).

Fix this by bumping snap_epoch when we add the new tier.

Fixes: #7915
Signed-off-by: Sage Weil <sage@inktank.com>
2014-04-02 16:03:37 -07:00
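
The reasoning in commit 6bf46e23e0 can be sketched with a cache that refreshes only when it sees snap_epoch change. This is a simplified assumption about the OSD side, and `PGPoolCache`, `add_tier`, and the dictionary layout are hypothetical; the actual check in the OSD may differ.

```python
class PGPoolCache:
    """Toy in-memory pool state that refreshes only on a snap_epoch change."""

    def __init__(self):
        self.seen_snap_epoch = 0
        self.cached_removed_snaps = set()

    def update(self, pool):
        if pool["snap_epoch"] != self.seen_snap_epoch:
            self.seen_snap_epoch = pool["snap_epoch"]
            self.cached_removed_snaps = set(pool["removed_snaps"])


def add_tier(pool, base_removed_snaps, new_epoch, bump_snap_epoch):
    """Copy snap metadata from the base tier, optionally bumping snap_epoch."""
    pool["removed_snaps"] = set(base_removed_snaps)
    if bump_snap_epoch:
        pool["snap_epoch"] = new_epoch
```

Without the bump the cache keeps its empty set even though the pool now carries removed snaps, which is the inconsistency the commit message describes.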
Samuel Just
27e353ccc1 Merge pull request #1580 from ceph/wip-7937
osd: fix scrub logic for snapdir object

Reviewed-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2014-04-02 15:15:56 -07:00
Samuel Just
01445d5c62 ReplicatedPG::_scrub: don't bail early for snapdir
Fixes: #7937
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-04-02 15:12:41 -07:00
Samuel Just
5f680f9011 ReplicatedPG::_verify_no_head_clones: missing implies that the clone exists
Fixes: #7659
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-04-02 14:22:17 -07:00
Mohammad Salehe
7909262f21 debian: fix control to allow upgrades
Signed-off-by: Mohammad Salehe <salehe+dev@gmail.com>
2014-04-02 11:29:38 -07:00
Sage Weil
250a10296b Merge pull request #1590 from ceph/wip-7939
PG: set role for replicated even if role != shard

Reviewed-by: Sage Weil <sage@inktank.com>
2014-04-02 10:52:11 -07:00
Samuel Just
d6258b63e5 Merge pull request #1579 from ceph/wip-7907
osd/ReplicatedPG: mark_unrollbackable when _rollback_to head

Reviewed-by: Samuel Just <sam.just@inktank.com>
2014-04-02 10:35:38 -07:00
Sage Weil
17732dc0c8 debian: move rbdmap config and sysvinit/upstart scripts into ceph-common
Fixes: #7171
Signed-off-by: Sage Weil <sage@inktank.com>
2014-04-02 10:29:08 -07:00
Sage Weil
86a032f2c2 Merge pull request #1586 from ceph/wip-dirfrag
mds: fix check for merging/splitting dirfrag

Reviewed-by: Sage Weil <sage@inktank.com>
2014-04-02 08:44:33 -07:00
Sage Weil
84e62e9f0e Merge pull request #1587 from onlyjob/debian
init.d: correcting rbdmap LSB header / init order:

Reviewed-by: Sage Weil <sage@inktank.com>
2014-04-02 08:43:02 -07:00
Dmitry Smirnov
1d42de5446 init.d: correcting rbdmap init order:
 * Require "$remote_fs" since it guarantees /usr availability
   (rbd executable is in /usr/bin/rbd)
 * Speed up init.d rbd mapping on machines acting as MON/OSD
   by starting rbdmap after /init.d/ceph (when possible) and
   shutting down rbd before ceph.
 * Map rbd devices before starting X (helpful when /home is mounted from rbd).
2014-04-03 01:25:28 +11:00
Yan, Zheng
771e88a401 mds: fix check for merging/splitting dirfrag
check actual number of items instead of number of cached items

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-04-02 15:32:33 +08:00
Sage Weil
edb8a5965e Merge pull request #1583 from ceph/wip-largedir
Wip largedir

Reviewed-by: Sage Weil <sage@inktank.com>
2014-04-01 21:47:51 -07:00
Yan, Zheng
43bc39beab mds: ignore CDir::check_rstats() when debug_scatterstat is off
It uses lots of CPU when dirfrag is large

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-04-02 12:19:34 +08:00
Yan, Zheng
5a9b99aa91 mds: initialize bloom filter according to dirfrag size
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-04-02 12:19:34 +08:00
Yan, Zheng
16af25fba3 mds: add dentries in dirfrag to LRU in reverse order
Files in a dirfrag are usually processed in the order of readdir
results. Files at the beginning are more likely to be used in
the future than files at the end.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-04-02 12:19:26 +08:00
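
The ordering idea in commit 16af25fba3 can be illustrated with a generic LRU built on `collections.OrderedDict`. This is a sketch, not the MDS LRU: `load_dirfrag` and `evict` are hypothetical helpers. Inserting entries in reverse readdir order leaves the first readdir entries as the most recently used, so they are evicted last.

```python
from collections import OrderedDict


def load_dirfrag(lru, names, reverse):
    """Insert dentry names into the LRU; the end of the dict is most recently used."""
    order = reversed(names) if reverse else names
    for name in order:
        lru[name] = True
        lru.move_to_end(name)


def evict(lru, count):
    """Evict the least recently used entries and return their names."""
    return [lru.popitem(last=False)[0] for _ in range(count)]


lru = OrderedDict()
load_dirfrag(lru, ["a", "b", "c", "d"], reverse=True)
evicted = evict(lru, 2)
remaining = list(lru)
```

Here the entries from the end of the readdir listing are evicted first, while those from the beginning stay cached.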
Sage Weil
d351e5fb12 Merge pull request #1584 from ceph/wip-multimds
Wip multimds

Reviewed-by: Sage Weil <sage@inktank.com>
2014-04-01 21:01:07 -07:00
Yan, Zheng
06ecb2c74c mds: handle freeze authpin race
For a cross-authority rename, the MDS first freezes the source inode's
authpin. This happens while the source dentry isn't locked, so by the
time the inode's authpin becomes frozen, the source dentry may have
changed and be linked to a different inode.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-04-02 11:03:11 +08:00