Commit Graph

36849 Commits

Author SHA1 Message Date
Sage Weil
17d517991c Merge pull request #2689 from zhurongze/fix-crush
crush: fix incorrect use of adjust_item_weight method

Reviewed-by: Sage Weil <sage@redhat.com>
2014-11-10 09:02:28 -08:00
Sage Weil
43f004de30 Merge pull request #2881 from ceph/wip-10030
librbd: don't close an already closed parent image upon failure

Reviewed-by: Sage Weil <sage@redhat.com>
2014-11-10 08:41:58 -08:00
Rongze Zhu
9850227d2f crush: fix incorrect use of adjust_item_weight method
The adjust_item_weight method adjusts the item's weight in every bucket
that contains it. If osd.0 is in both host=fake01 and host=fake02 and we
execute "ceph osd crush osd.0 10 host=fake01", it adjusts not only fake01's
weight but also fake02's weight.

This patch adds an adjust_item_weightf_in_loc method and fixes the remove_item,
_remove_item_under, update_item, insert_item, and detach_bucket methods.

Signed-off-by: Rongze Zhu <zrzhit@gmail.com>
2014-11-10 23:37:03 +08:00
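The difference described in this commit can be illustrated with a toy model. This is only a sketch: the dictionary layout and both Python function names are hypothetical and are not Ceph's CrushWrapper API. It assumes a bucket is simply a mapping from item name to weight.

```python
# Toy model only: buckets map a host name to {item: weight}.
# This is not Ceph's CrushWrapper; every name here is hypothetical.


def adjust_item_weight(buckets, item, weight):
    """Set the item's weight in every bucket that contains it."""
    changed = 0
    for items in buckets.values():
        if item in items:
            items[item] = weight
            changed += 1
    return changed


def adjust_item_weight_in_loc(buckets, item, weight, loc):
    """Set the item's weight only in the bucket named by loc."""
    items = buckets.get(loc)
    if items is None or item not in items:
        return 0
    items[item] = weight
    return 1


def make_buckets():
    # osd.0 is listed under two hosts, as in the commit message.
    return {"fake01": {"osd.0": 1.0}, "fake02": {"osd.0": 1.0}}


everywhere = make_buckets()
adjust_item_weight(everywhere, "osd.0", 10.0)

scoped = make_buckets()
adjust_item_weight_in_loc(scoped, "osd.0", 10.0, "fake01")

print(everywhere)
print(scoped)
```

In the scoped variant only the fake01 entry changes, which is the behavior the commit message asks for.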
Loic Dachary
4b07381daf Merge pull request #2888 from dachary/wip-9970-erasure-code-documentation
erasure-code: document pool operations

Reviewed-by: Laurent Guerby <laurent@guerby.net>
2014-11-10 14:51:15 +01:00
Loic Dachary
c44bdb1dc9 erasure-code: document pool operations
A short introduction to the first time user of an erasure coded pool.
It includes a reminder of how it relates to cache tiering and links to
define new profiles with an example.

There were examples in the developer documentation but the operator
expects to find such a guide in the rados operations chapter.

http://tracker.ceph.com/issues/9970 Fixes: #9970

Signed-off-by: Loic Dachary <ldachary@redhat.com>
2014-11-10 14:47:56 +01:00
Loic Dachary
ec92128993 Merge pull request #2750 from dachary/wip-9815-make-check-parallel
parallelize make check

Reviewed-by: Joao Eduardo Luis <joao@redhat.com>
2014-11-10 11:58:27 +01:00
Loic Dachary
a0c1f220c7 tests: use kill -0 to check process existence
When killing a daemon, instead of using kill -9 to check the process was
terminated, use kill -0. Should the pid of the process be reused
immediately after, it would be wrong to kill the new process. Worst case
scenario the kill_daemon function returns before the process is
confirmed to be killed but this is not treated as an error and is
unlikely to cause any problem.

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
2014-11-09 11:59:51 +01:00
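The same existence check is available outside the shell. A minimal sketch in Python, assuming a POSIX system: signal 0 performs the permission and existence checks without delivering anything to the target. The helper name process_exists is made up for this illustration and is not part of the Ceph test scripts.

```python
import os
import subprocess
import sys


def process_exists(pid):
    """Return True if a process with this pid exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0: error checking only, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


# Start a short-lived child and reap it so that its pid is released.
child = subprocess.Popen([sys.executable, "-c", "pass"])
child.wait()

print(process_exists(os.getpid()))
print(process_exists(child.pid))
```

As the commit message notes, a pid can be reused right after a process exits, so a positive answer only says that some process currently holds that pid.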
Loic Dachary
17f5c3659c tests: looping to wait for an osd to be up is expected
Remove the reference to a bug that suggests otherwise.

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
2014-11-09 11:59:51 +01:00
Loic Dachary
79f8b81aec tests: increase timeout to accommodate slow machines
Signed-off-by: Loic Dachary <loic-201408@dachary.org>
2014-11-09 11:59:51 +01:00
Loic Dachary
0b4ccbd68d tests: kill_daemon use $name.pid instead of pidfile
So that it can be used instead of stop.sh to stop vstart.sh daemons. The
problem with stop.sh is that it kills any daemon, not just a selection.

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
2014-11-09 11:59:51 +01:00
Loic Dachary
6741b71d90 tests: group workunits/cephtool/test.sh tests per daemon
So all tests related to a given daemon (mon, osd, mds) can be run at
once.

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
2014-11-09 11:59:51 +01:00
Loic Dachary
0cb12c71b9 tests: run workunits/cephtool/test.sh
Three scripts are added to run qa/workunits/cephtool/test.sh for each
daemon (mon, mds, osd) so they can be run in parallel.

http://tracker.ceph.com/issues/9815 Fixes: #9815

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
2014-11-09 11:59:02 +01:00
Loic Dachary
c3b51ef5fa tests: remove vstart_wrapped_tests.sh
Listing tests to be run in a single script does not take advantage of
parallel runs in make.

The vstart_wrapper.sh script is reworked and made less specialized: it
lets the caller decide which daemons to run via CEPH_START and does not
enforce the number of daemons of each type. It no longer uses stop.sh, to
avoid killing the osd/mon/mds that are unrelated to the tests.

http://tracker.ceph.com/issues/9815 Fixes: #9815

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
2014-11-09 11:59:02 +01:00
Loic Dachary
7a6ca17f18 tests: use different ports for each mon
Run the mon on each test on a different port so they can run in
parallel.

http://tracker.ceph.com/issues/9815 Fixes: #9815

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
2014-11-09 11:59:02 +01:00
Loic Dachary
bdca0ac0b5 tests: tolerate a disk 99% full
The tests do not need much space and will work fine even on a 99% full
disk.

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
2014-11-09 11:59:02 +01:00
Sage Weil
b8ec7d7bed Merge pull request #2869 from fgimenez/fix-tests-on-btrfs
Fix tests on btrfs: leftover subvolumes removed

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2014-11-08 21:05:40 -08:00
Loic Dachary
725d05f52f Merge pull request #2860 from XinzeChi/master
osd: cache pool: delete dead code in ReplicatedPG::agent_choose_mode

Reviewed-by: Loic Dachary <ldachary@redhat.com>
2014-11-08 16:03:45 +01:00
Greg Farnum
560e22e845 test: use unsigned ints to compare against size()
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
2014-11-07 16:20:57 -08:00
Gregory Farnum
daa9f9ffe8 Merge pull request #2814 from ceph/wip-inode-scrub
Wip inode scrub

Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: John Spray <john.spray@redhat.com>
2014-11-07 16:16:02 -08:00
Loic Dachary
bc8409e50c Merge pull request #2884 from dachary/wip-mailmap
mailmap: Loic Dachary affiliation
2014-11-07 23:56:15 +01:00
Loic Dachary
a21bca1d46 mailmap: Loic Dachary affiliation
Signed-off-by: Loic Dachary <ldachary@redhat.com>
2014-11-07 23:54:07 +01:00
tmuthamizhan
28f0bb8416 Merge pull request #2882 from ceph/wip-doc-dumpling-to-firefly
doc: Added Dumpling to Firefly upgrade section.
2014-11-07 14:09:02 -08:00
John Wilkins
3e0295ffa9 doc: Added Dumpling to Firefly upgrade section.
Fixes: #7679

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2014-11-07 13:16:45 -08:00
Greg Farnum
ca2e72aeb9 Merge remote-tracking branch 'origin/master' into wip-inode-scrub
Conflicts:
	src/common/Makefile.am
	src/mds/Server.cc

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
2014-11-07 13:15:58 -08:00
Greg Farnum
15d487f73d MDS: clean up internal MDRequests the standard way
All cleanup is now routed through respond_to_request(),
which invokes the internal_op_finish Context*, then does
mdcache->request_finish(). This is easier to reason about,
and indeed fixes a bug (I was not cleaning up locks
following flush). Use the MDSContinuation to facilitate
this in scrub's case.

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
2014-11-07 12:53:03 -08:00
Greg Farnum
07e0831cd5 MDS: CInode: break out of validation early on symlinks
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
2014-11-07 12:53:03 -08:00
Greg Farnum
f1677e745a common/ceph_strings: add some MDS internal op names to ceph_mds_op_name()
In addition to my validate and flush, this is also missing exportdir.

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-11-07 12:53:03 -08:00
Greg Farnum
26736b2244 MDS: add a flush_dentry() function, and wire it up to the admin socket
Signed-off-by: Greg Farnum <greg@inktank.com>
2014-11-07 12:53:03 -08:00
Greg Farnum
86384fe33a MDS: CInode: create a flush() function
Signed-off-by: Greg Farnum <greg@inktank.com>
2014-11-07 12:53:03 -08:00
Greg Farnum
063cd2fca5 MDCache: handle internal ops in respond_to_request()
This only works for those which have specified a finisher in the MDR.

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-11-07 12:53:03 -08:00
Greg Farnum
f82f6efec9 MDCache: make scrub_dentry schedulable and reentrant
Rather than assuming that any necessary inodes are in the cache, split up
MDCache::scrub_dentry into setup and work phases. Add an internal_op_finisher()
to MDRequest. Dispatch any CEPH_MDS_OP_VALIDATE internal operations to
scrub_dentry_work(). Taken together, these make everything work properly when
path_traverse() (by way of rdlock_path_pin_ref()) needs to go to disk before
satisfying the lookup.

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-11-07 12:53:02 -08:00
Greg Farnum
391740215d MDCache: "handle" request_forward on internal ops
For now, just return -EXDEV ("Cross-device link") on internal ops that
require forwarding, as forwarding internal ops will require a great deal more
infrastructure. But push the issue down to this level instead of worrying
about it in path_traverse, and consider the possibility that the MDRequest
might not have a client_request that it's wrapped around.

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-11-07 12:42:15 -08:00
Greg Farnum
a4da522665 Server: rename reply_request() -> respond_to_request()
This is no longer necessarily a reply; it could turn into a Context
activation or something.

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-11-07 12:42:15 -08:00
Greg Farnum
cfac5c3319 Server: rename reply_request -> reply_client_request; make it private
The generic reply_request(MDRequest, int) is now the only caller. It's still
just building an MClientRequest to pass along, but we can change it a lot more
easily now to support responding to non-client requests.

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-11-07 12:42:14 -08:00
Greg Farnum
e980d1b291 Server: add snapbl to MDRequest and eliminate last explicit MClientReply
Signed-off-by: Greg Farnum <greg@inktank.com>
2014-11-07 12:42:14 -08:00
Greg Farnum
592be4dd0a Server: use mdr->reply_extra_bl instead of explicit MClientReply
Set the MClientReply::extra_bl from reply_extra_bl unconditionally in
reply_request(), instead of only in early_reply(). Further isolate
the reply_request() callers from the use of MClientReply this way.

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-11-07 12:42:14 -08:00
Greg Farnum
c9f8d11d2b Server: do not use explicit MClientReply if we don't need to
Signed-off-by: Greg Farnum <greg@inktank.com>
2014-11-07 12:42:14 -08:00
Greg Farnum
a65d986898 Server: remove tracei and tracedn parameters from reply_request
We have members for these two parameters in the MDRequestImpl already, so
make use of them. This helps us move towards dropping the expectation of an
MClientRequest from functions like rdlock_path_pin_ref().

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-11-07 12:42:14 -08:00
Greg Farnum
515ab2d587 MDCache: add a scrub_dentry() function, and wire it up to the admin socket
scrub_dentry() is passed a string path, and it validates it before replying. We
hook up an admin socket command "scrub_path" to call it and dump the output.

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-11-07 12:42:14 -08:00
Greg Farnum
fa75434faf MDS: CInode: implement validated_data::dump()
Signed-off-by: Greg Farnum <greg@inktank.com>
2014-11-07 12:41:51 -08:00
Greg Farnum
0d6f8b659f MDS: CInode::validate_disk_state()
Add a function that will validate the on-disk state of the CInode. We currently
check that the on-disk backtrace matches (or is older) and compare rstats on
dirfrags against the parent dir's inode (for directories only).

TODO: validate that the on-disk Inode object matches what the parent
directory holds.

It's using a sort-of new programming model, trying to stuff stack data into
a Continuation object and write everything sequentially instead of having
a function and Context per IO.

Signed-off-by: Greg Farnum <greg@inktank.com>
Signed-off-by: John Spray <john.spray@redhat.com>
2014-11-07 11:48:44 -08:00
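The programming model mentioned in this commit can be sketched roughly as follows. This is a toy illustration, not the Ceph Continuation class: the class, its methods, and fake_read are all hypothetical. The idea shown is that data which would normally live on the stack is kept on one object, and each IO completion resumes the next numbered stage.

```python
from collections import deque

pending = deque()  # stand-in for an event loop holding completed IOs


def fake_read(key, on_done):
    """Pretend to issue a read; the callback fires later from the pump loop."""
    pending.append(lambda: on_done(len(key)))


class Continuation:
    """Toy continuation: numbered stages share state stored on the object."""

    def __init__(self):
        self.stages = {}
        self.state = {}  # data that would otherwise sit on the stack
        self.finished = False

    def set_stage(self, number, func):
        self.stages[number] = func

    def callback(self, number):
        """Return a callable that resumes at the given stage with an IO result."""
        return lambda result: self.stages[number](self, result)

    def begin(self):
        self.stages[0](self, 0)


def start(c, _result):
    c.state["paths"] = ["a", "bbb"]
    fake_read(c.state["paths"][0], c.callback(1))


def after_first(c, result):
    c.state["total"] = result
    fake_read(c.state["paths"][1], c.callback(2))


def after_second(c, result):
    c.state["total"] += result
    c.finished = True


c = Continuation()
for number, func in enumerate([start, after_first, after_second]):
    c.set_stage(number, func)
c.begin()
while pending:  # pump the fake event loop until no IO is outstanding
    pending.popleft()()

print(c.state["total"], c.finished)
```

The stages read top to bottom in the order they run, which is the "write everything sequentially" property the commit message describes.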
Greg Farnum
153aa2027b Rebase: MDS: Add an MDSContinuation for ease of use
Unlike the regular Continuation, this one works in terms of an MDRequest
and has wrappers to provide Context callbacks that are either
internal MDS or IO appropriate.

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
2014-11-07 11:48:44 -08:00
Greg Farnum
a7020dd1c2 Continuation: Add a new Continuation class.
Signed-off-by: Greg Farnum <greg@inktank.com>
Signed-off-by: John Spray <john.spray@redhat.com>

SQUASH "Continuation: Add a new Continuation class."
2014-11-07 11:48:43 -08:00
Greg Farnum
c575d16290 MDCache: create_unlinked_system_inode() as the guts of create_system_inode()
This way we can create duplicate CInodes without actually linking them
into the cache. It'll be helpful for comparing different versions of
disk states and in-memory state, etc.

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-11-07 11:48:43 -08:00
Greg Farnum
9441283f03 MDS: MDRequestImpl: provide filepath/filepath2 substitute for MClientRequest
Use this passthru in the Server path locking functions so that we can get
locks or auth pins without an associated MClientRequest.

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-11-07 11:48:43 -08:00
Greg Farnum
adee21bfa3 MDRequest: dump internal op names as well as IDs
Signed-off-by: Greg Farnum <greg@inktank.com>
2014-11-07 11:48:43 -08:00
Greg Farnum
05d2444d73 MDCache: remove #if 0'd code
Signed-off-by: Greg Farnum <greg@inktank.com>
2014-11-07 11:48:43 -08:00
Greg Farnum
f8db040c46 mdstypes: add a same_sums() function to nest_info_t
operator== is checking equality of the version as well, but I want
something I can use to check that the internal sums match. This is useful
for, e.g., comparing the sums of a set of dirfrags to the tally stored in
the inode.

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-11-07 11:48:43 -08:00
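The distinction between full equality and matching sums can be sketched as below. The NestInfo class is a hypothetical stand-in for nest_info_t with illustrative fields, and the tally helper is made up for this example. The dataclass-generated equality includes the version, while same_sums ignores it.

```python
from dataclasses import dataclass


@dataclass
class NestInfo:
    # Hypothetical stand-in for nest_info_t; the field set is illustrative.
    version: int = 0
    rbytes: int = 0
    rfiles: int = 0
    rsubdirs: int = 0

    def same_sums(self, other):
        """Compare the accumulated sums while ignoring the version."""
        return (self.rbytes, self.rfiles, self.rsubdirs) == (
            other.rbytes, other.rfiles, other.rsubdirs)


def tally(frags):
    """Add up the sums of a set of dirfrags (the version is left at 0)."""
    total = NestInfo()
    for frag in frags:
        total.rbytes += frag.rbytes
        total.rfiles += frag.rfiles
        total.rsubdirs += frag.rsubdirs
    return total


dirfrags = [
    NestInfo(version=3, rbytes=100, rfiles=2),
    NestInfo(version=5, rbytes=50, rfiles=1, rsubdirs=1),
]
inode_rstat = NestInfo(version=7, rbytes=150, rfiles=3, rsubdirs=1)
summed = tally(dirfrags)

print(summed == inode_rstat)          # equality also compares the version
print(summed.same_sums(inode_rstat))  # only the sums are compared
```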
Greg Farnum
af4bddd54d test/mds: unit tests for the inode_backtrace_t and inode_t compare() functions
Signed-off-by: Greg Farnum <greg@inktank.com>
2014-11-07 11:48:42 -08:00
Greg Farnum
4743f28b26 mdstypes: write inode_t::compare() function
This compares one inode_t against another, seeing which version is newer
and checking that differences in the data members make sense given that.

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-11-07 11:48:42 -08:00
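A rough sketch of this kind of comparison follows. The Inode class, its fields, and the consistency rule are assumptions made for illustration and are not taken from the Ceph source: here the only rule is that ctime may not move backwards as the version increases.

```python
from dataclasses import dataclass


@dataclass
class Inode:
    # Hypothetical stand-in for inode_t with two illustrative fields.
    version: int
    ctime: int
    size: int


def compare(a, b):
    """Return (order, divergent).

    order is 1 if a is newer, -1 if b is newer, 0 if the versions match.
    divergent is True when the differences do not fit that ordering.
    """
    if a.version == b.version:
        # Same version: any difference in the data is unexpected.
        return 0, (a.ctime, a.size) != (b.ctime, b.size)
    if a.version > b.version:
        order, newer, older = 1, a, b
    else:
        order, newer, older = -1, b, a
    # Assumed rule: ctime only moves forward as the version increases.
    return order, older.ctime > newer.ctime


print(compare(Inode(2, 20, 10), Inode(1, 10, 10)))
print(compare(Inode(1, 30, 10), Inode(2, 20, 10)))
print(compare(Inode(2, 20, 10), Inode(2, 20, 11)))
```

Returning the ordering and the divergence flag separately lets a caller tell "older but plausible" apart from "inconsistent".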