Commit Graph

30464 Commits

Author SHA1 Message Date
Ilya Dryomov
9b7364d245 rbd: expose options available to rbd map
Add a -o / --options option, which would allow users to specify
rbd-specific and generic ceph client and osd options available at
mapping time in a comma separated list (similar to mount(8) mount
options).

Exposed options are:

- fsid=%s
- ip=%s
- share
- noshare
- crc
- nocrc
- osdkeepalive=%d
- osd_idle_ttl=%d
- rw
- ro (equivalent to existing --read-only flag)

The rw/ro < 3.7 kernels compatibility kludge added in commit
fb0f198644 is preserved.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
2013-12-27 09:56:24 -08:00
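The comma-separated option list described in the commit above can be illustrated with a minimal sketch. This is not the rbd CLI's code: the function name `parse_map_options` and the three option tables are hypothetical, and only the option names themselves come from the commit message.

```python
# Hypothetical sketch of splitting a "-o" style option string.
# Only the option names are taken from the commit message; the
# grouping into flag/string/int tables is an assumption.
FLAG_OPTIONS = {"share", "noshare", "crc", "nocrc", "rw", "ro"}
STRING_OPTIONS = {"fsid", "ip"}
INT_OPTIONS = {"osdkeepalive", "osd_idle_ttl"}


def parse_map_options(spec):
    """Turn 'ro,nocrc,osdkeepalive=5' into a dict, rejecting unknown keys."""
    parsed = {}
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty fields such as a trailing comma
        key, sep, value = item.partition("=")
        if key in FLAG_OPTIONS:
            if sep:
                raise ValueError(f"option {key!r} takes no value")
            parsed[key] = True
        elif key in STRING_OPTIONS:
            if not value:
                raise ValueError(f"option {key!r} requires a value")
            parsed[key] = value
        elif key in INT_OPTIONS:
            parsed[key] = int(value)  # raises ValueError on a non-integer
        else:
            raise ValueError(f"unknown option {key!r}")
    return parsed
```

Calling `parse_map_options("ro,nocrc,osdkeepalive=5")` yields a dict with two flags set and one integer value.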
Ilya Dryomov
87b8e54fae ceph_argparse: kill _daemon versions of argparse calls
Commit c76bbc2e6d, which introduced _daemon versions of some of the
argparse calls, also changed the behaviour of non-_daemon versions.
The change resulted in incorrect error messages, e.g.

  $ ./rbd create b0 --size
  rbd: extraneous parameter --size

instead of what should have been

  $ ./rbd create b0 --size
  Option --size requires an argument.

The users of _daemon versions were added in commit be801f6c50 and
removed in commit f26bd55e57, so just kill the _daemon versions and
restore the old behaviour.  (This effectively reverts commit
c76bbc2e6df1.)

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
2013-12-25 21:41:16 +02:00
Loic Dachary
ab75df3c00 Merge pull request #988 from ceph/wip-crush-location
add 'crush location' config option

make check is ok

Reviewed-by: Loic Dachary <loic@dachary.org>
2013-12-25 01:07:04 -08:00
Sage Weil
7c9638f24b Merge pull request #993 from ceph/wip-librados-lock
Wip librados lock

Reviewed-by: Sage Weil <sage@inktank.com>
2013-12-24 10:51:01 -08:00
Yehuda Sadeh
e7bf5b2970 librados: lockless get_instance_id()
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2013-12-24 09:00:11 -08:00
Yehuda Sadeh
771da13b66 objecter, librados: create Objecter::Op in two phases
(currently only in some librados operations)
First create the op, only then lock and submit so that we reduce lock
contention.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2013-12-24 09:00:03 -08:00
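The two-phase idea in the commit above (build the op first, take the lock only to submit) can be sketched as follows. The class `ObjecterSketch` and its methods are hypothetical stand-ins, not the Objecter API; the point is only that the work done in phase one happens outside the lock.

```python
import threading


class ObjecterSketch:
    """Hypothetical stand-in showing where the lock is (and is not) held."""

    def __init__(self):
        self.lock = threading.Lock()
        self.submitted = []
        self.next_tid = 0

    def prepare_op(self, oid, payload):
        # Phase 1: build the op without touching shared state, so no lock.
        return {"oid": oid, "payload": bytes(payload), "tid": None}

    def submit_op(self, op):
        # Phase 2: hold the lock only for the short shared-state update.
        with self.lock:
            self.next_tid += 1
            op["tid"] = self.next_tid
            self.submitted.append(op)
        return op["tid"]
```

With this split, several threads can run `prepare_op` concurrently and contend only on the brief critical section in `submit_op`.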
Sage Weil
5ff30d6cf3 crush/CrushWrapper: note about get_immediate_parent()
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-24 08:01:15 -08:00
Sage Weil
0cdbc97614 librados: mark old get_version() as deprecated
Use the newly-discovered (for me) deprecated attribute to mark the old
get_version() method and point users toward get_version64().  And fix a
couple of users in the kvstore code!

Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-24 07:58:08 -08:00
Sage Weil
006449ddb5 librados: deprecate aio_operate() read variant that takes snapid
The argument was ignored.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-24 07:58:07 -08:00
Sage Weil
909f8a42b6 librbd: localize or distribute parent (snap) reads
The parent is always a snapshot.  We may want to treat it differently
than other snaps by virtue of it (likely) being a more highly-shared
image.

By default, localize parent reads.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-24 07:58:07 -08:00
Sage Weil
22df773251 osdc/Objecter: use crush location and distance for LOCALIZE_READS
Use the hierarchy in the CRUSH map to determine what the closest
replica is.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-24 07:58:07 -08:00
Sage Weil
ac14d4ffae osdc/Objecter: maintain crush_location multimap
Observe and parse the 'crush location' config option.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-24 07:58:07 -08:00
Sage Weil
746069ee62 crush/CrushWrapper: simplify get_full_location_ordered()
Just ascend the hierarchy; it is much less complicated.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-24 07:58:07 -08:00
Sage Weil
dcc5e3559f crush/CrushWrapper: add get_common_ancestor_distance()
Calculate closest common ancestor (type) in the hierarchy.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-24 07:57:56 -08:00
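A rough sketch of the common-ancestor idea used for localized reads in the commits above: ascend from a replica through its ancestors and stop at the first bucket that also appears in the client's location. The data layout, the numeric type ranks, and both function names are assumptions made for illustration, not the CrushWrapper interface.

```python
def common_ancestor_distance(replica_ancestors, client_location):
    """Return the type rank of the closest shared ancestor, or None.

    replica_ancestors: (type_rank, type_name, bucket_name) tuples ordered
    from the nearest ancestor upward (hypothetical layout).
    client_location: set of (type_name, bucket_name) pairs.
    """
    for rank, type_name, bucket in replica_ancestors:
        if (type_name, bucket) in client_location:
            return rank
    return None


def closest_replica(replicas, client_location):
    """Pick the replica whose shared ancestor has the lowest type rank."""
    best = None
    for name, ancestors in replicas.items():
        distance = common_ancestor_distance(ancestors, client_location)
        if distance is None:
            continue
        if best is None or distance < best[0]:
            best = (distance, name)
    return None if best is None else best[1]
```

A replica on the client's own host would win over one that only shares a rack, which in turn would win over one that only shares the root.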
Sage Weil
a6852afd03 Merge pull request #990 from ceph/wip-fix-mon-fwd
mon: fix forwarded request features when requests are resent

Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-12-23 17:02:11 -08:00
Sage Weil
b9c7eb68f3 Merge pull request #989 from ceph/wip-7056
osd/ReplicatedPG: include omap header in copy-get

This now passes rados/thrash tests without failures.
2013-12-23 15:53:42 -08:00
Sage Weil
0903f3fa46 mon/OSDMonitor: use generic CrushWrapper::parse_loc_map helper
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-23 15:18:31 -08:00
Sage Weil
8f48906db7 crush/CrushWrapper: add parse_loc_[multi]map helpers
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-23 15:18:31 -08:00
Sage Weil
7351031e4f Merge pull request #991 from dachary/wip-stop
vstart/stop: do not loop forever on kill

Reviewed-by: Sage Weil <sage@inktank.com>
2013-12-23 13:12:14 -08:00
Sage Weil
8fc66a4ab2 osd/ReplicatedPG: fix copy-get iteration of omap keys
We need to call upper_bound() before checking if the iterator is valid!

Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-23 12:54:00 -08:00
Sage Weil
0c9acf147d ceph_test_rados: s/tmap/omap/
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-23 12:53:09 -08:00
Loic Dachary
3b0d9b2c4d vstart/stop: do not loop forever on kill
It may be the case that stop.sh cannot stop a process for reasons
unrelated to vstart.sh, for instance because apache runs independently.
Instead of trying forever, try twice in a row (which should be enough
in 99% of cases), then try three more times, sleeping one second
between each try, which should be more than enough.

Signed-off-by: Loic Dachary <loic@dachary.org>
2013-12-23 21:44:38 +01:00
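The bounded retry policy described in the commit above (two immediate attempts, then three more separated by one-second sleeps) might look like this. The function name and the `try_stop` callback are hypothetical; stop.sh itself is a shell script and is not reproduced here.

```python
import time


def stop_with_retries(try_stop, quick_tries=2, slow_tries=3, delay=1.0,
                      sleep=time.sleep):
    """Call try_stop() a bounded number of times instead of looping forever.

    try_stop is a hypothetical callback returning True once the process
    is gone. Returns True on success, False when every attempt failed.
    """
    for _ in range(quick_tries):
        if try_stop():
            return True
    for _ in range(slow_tries):
        sleep(delay)  # give the process a moment before the next attempt
        if try_stop():
            return True
    return False
```

The `sleep` parameter exists only so the delay can be replaced in tests; by default it is `time.sleep`.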
Sage Weil
4ce6400a77 config: add 'crush location' option
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-23 12:31:58 -08:00
Wido den Hollander
19213e61b2 doc: Fix caps documentation for Admin API
The correct caps is users instead of user
2013-12-23 21:10:59 +01:00
Sage Weil
ac10aa5d1e mon: fix forwarded request features when requests are resent
Pass the features in explicitly so that we can use messages we've just
decoded in resend_routed_requests().

Keep the features in struct RoutedRequest.

Renamed conn_features -> con_features while we are here.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-23 10:59:14 -08:00
Sage Weil
2e4c61b602 osd/ReplicatedPG: include omap header in copy-get
Missed this the first time around.  Thank you, ceph_test_rados!

Fixes: #7056
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-23 10:21:44 -08:00
Sage Weil
9ffe9ddfc9 Merge pull request #984 from ceph/wip-7051
#7051: forward connection features along with the message

Reviewed-by: Loic Dachary <loic@dachary.org>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-12-23 09:52:02 -08:00
Sage Weil
4fd60ac29b Merge remote-tracking branch 'gh/next'
2013-12-23 09:28:29 -08:00
Sage Weil
fc368d5ba9 Merge remote-tracking branch 'gh/wip-cache'
2013-12-23 09:22:36 -08:00
Sage Weil
adc9b3168c Merge pull request #987 from ceph/wip-crush-shrink-diff
crush: shrink diff with kernel implementation

Reviewed-by: Sage Weil <sage@inktank.com>
2013-12-23 09:19:11 -08:00
Ilya Dryomov
537a7c3f97 crush: misc formatting and whitespace fixes
- whitespace in crush.h

- format is_out() definition and call site to 80 columns

- whitespace around local_fallback_tries in crush_choose_firstn()

All of this is to shrink the diff with the kernel implementation.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
2013-12-23 18:12:56 +02:00
Ilya Dryomov
fa6a99ab34 crush: use kernel-doc consistently
kernel-doc syntax is "@arg: desc", not "@param arg desc".  In addition,
these comments are usually placed around function definitions instead
of function declarations.  Follow these guidelines to shrink the diff.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
2013-12-23 18:12:56 +02:00
Ilya Dryomov
6e36794fc9 crush/mapper: unsigned -> unsigned int
Kernel implementation is located in net/, and use of "unsigned int" is
preferred to bare "unsigned" in net tree (as proven by several net/
cleanups).  Follow this guideline to shrink the diff.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
2013-12-23 18:12:56 +02:00
João Eduardo Luís
d24113fd33 Merge pull request #985 from dachary/wip-erasure-code-defaults
mon: use kill instead of pkill in osd-pool-create

Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-12-23 04:47:41 -08:00
Loic Dachary
d8512f193c mon: use kill instead of pkill in osd-pool-create
The --pidfile option of pkill is not supported by all versions. Use kill
instead for compatibility. Instead of looping on ":", loop on "sleep 1" so
that an infinite loop is slower at filling the disk.

Signed-off-by: Loic Dachary <loic@dachary.org>
2013-12-23 13:10:18 +01:00
Joao Eduardo Luis
c030569847 osd: OSDMap: dump osd_xinfo_t::features as an int
Instead of dumping the list in a string-list format, which in
retrospect wasn't very useful.

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-12-22 17:29:23 -08:00
Joao Eduardo Luis
b4fbe4f813 mon: Monitor: Forward connection features
We are relying on connection features to track OSD supported
features.  However, we were not forwarding connection features
when we forwarded a message from a peon to the leader.  That
was breaking the OSD feature tracking.

Fixes: 7051

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-12-22 17:26:59 -08:00
Sage Weil
1e529972f3 Merge remote-tracking branch 'gh/master' into wip-cache
Conflicts:
	src/osdc/Objecter.h
	src/vstart.sh

Reviewed-by: Samuel Just <sam.just@inktank.com>
2013-12-22 15:33:59 -08:00
Sage Weil
1349ba894a Merge pull request #976 from dachary/wip-erasure-code-defaults
provide sensible defaults when creating an erasure coded pool

Reviewed-by: Sage Weil <sage@inktank.com>
2013-12-22 15:30:43 -08:00
Loic Dachary
93c44cb144 mon: unit test for osd pool create
It is inconvenient to run such tests in the
qa/workunits/cephtool/test.sh because they require that the mon is
restarted to test errors in the format of the default erasure code
properties and check the appropriate error message is output.

osd-pool-create.sh runs a single mon from sources using command
line options and a temporary directory, the same way vstart.sh does but
lightweight.

Signed-off-by: Loic Dachary <loic@dachary.org>
2013-12-22 23:43:54 +01:00
Loic Dachary
59941b10f1 mon: erasure code pool properties defaults
If no properties are set when creating an erasure coded pool, default to
using the jerasure plugin with the cauchy_good technique which is the
fastest.

The defaults are set with osd_pool_default_erasure_code_properties.

The erasure code plugins are loaded from the directory specified in the
erasure-code-directory property. Contrary to the other properties it
will most commonly be the same throughout the cluster. The default is
set to /usr/lib/ceph/erasure-code with
osd_pool_default_erasure_code_directory

Signed-off-by: Loic Dachary <loic@dachary.org>
2013-12-22 23:43:54 +01:00
Loic Dachary
29d1fcdb95 mon: add error message argument to prepare_new_pool
Add a stringstream argument to prepare_new_pool for the purpose of
recording human readable error message.

Signed-off-by: Loic Dachary <loic@dachary.org>
2013-12-22 23:43:54 +01:00
Loic Dachary
2d01da63df mon: do not include = in pool properties values
foo=bar was parsed as {"foo":"=bar"} instead of {"foo":"bar"} because of
the missing equal++

Signed-off-by: Loic Dachary <loic@dachary.org>
2013-12-22 23:43:54 +01:00
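The off-by-one described in the commit above can be shown with a small sketch: after locating the separator, the value must start one character past it. The function name `split_property` is hypothetical, and Python slicing stands in for the pointer increment mentioned in the message.

```python
def split_property(text):
    """Split 'foo=bar' into ('foo', 'bar'); hypothetical illustration."""
    equal = text.find("=")
    if equal == -1:
        return text, ""
    # Skipping the separator is the step the fix adds: using text[equal:]
    # here would produce the buggy value "=bar".
    return text[:equal], text[equal + 1:]
```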
Loic Dachary
a44a57a7c3 common: implement get_str_map to parse key/values
It is capable of parsing JSON or key=value pairs. The prototype is made
to look like get_str_list. The implementation is in common + include and
uses .h. It will probably be moved to common and use .hpp instead, along
with str_list.{cc,h}.

Signed-off-by: Loic Dachary <loic@dachary.org>
2013-12-22 23:43:54 +01:00
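A minimal sketch of the "JSON or key=value" behaviour described in the commit above. It is not the C++ get_str_map implementation: the name `get_str_map_sketch`, the choice of whitespace and commas as delimiters, and the conversion of JSON values to strings are all assumptions.

```python
import json


def get_str_map_sketch(text):
    """Parse a JSON object or 'k1=v1 k2=v2' pairs into a dict of strings."""
    try:
        value = json.loads(text)
    except ValueError:
        value = None  # not JSON; fall through to key=value parsing
    if isinstance(value, dict):
        return {str(k): str(v) for k, v in value.items()}
    result = {}
    # Assumed delimiters: whitespace and commas.
    for token in text.replace(",", " ").split():
        key, _, val = token.partition("=")
        result[key] = val
    return result
```

Both `'{"foo": "bar"}'` and `"foo=bar"` produce the same mapping under this sketch.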
Loic Dachary
df1704eeb0 osd: pool properties are not an array
They must be dumped with open_object_section instead of
open_array_section otherwise only the values are displayed.

Signed-off-by: Loic Dachary <loic@dachary.org>
2013-12-22 23:43:54 +01:00
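The difference between the two section kinds in the commit above can be seen with plain JSON. This sketch uses Python's `json` module rather than Ceph's Formatter, and the helper names and the sample property are hypothetical: an array keeps only the values, while an object keeps the key with each value.

```python
import json


def dump_as_array(properties):
    """Array-style dump: keys are lost, only values remain."""
    return json.dumps(list(properties.values()))


def dump_as_object(properties):
    """Object-style dump: each value stays attached to its key."""
    return json.dumps(properties)
```

For `{"foo": "bar"}` the array form is `["bar"]` and the object form is `{"foo": "bar"}`.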
Loic Dachary
df0d038d7b mon: osd create pool must fail on incompatible type
When osd create pool is called twice on the same pool, it will succeed
because the pool already exists. However, if a different type is
specified, it must fail.

Signed-off-by: Loic Dachary <loic@dachary.org>
2013-12-22 23:43:50 +01:00
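The rule in the commit above (creating an existing pool succeeds, unless the requested type differs) can be sketched as below. The `create_pool` function, the `PoolTypeMismatch` exception, and the dict used as a pool registry are hypothetical and do not reflect the monitor's actual code.

```python
class PoolTypeMismatch(Exception):
    """Raised when a pool exists with a different type (hypothetical)."""


def create_pool(pools, name, pool_type):
    """Create a pool idempotently, but fail on an incompatible type."""
    existing = pools.get(name)
    if existing is None:
        pools[name] = pool_type
        return "created"
    if existing != pool_type:
        raise PoolTypeMismatch(
            f"pool {name!r} exists with type {existing!r}, not {pool_type!r}"
        )
    return "exists"
```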
Loic Dachary
af22b0a09b packaging: erasure-code plugins go in /usr/lib/ceph
Install the plugins in /usr/lib/ceph/erasure-code instead of
/usr/lib/erasure-code to comply with FHS : "Applications may use a
single subdirectory under /usr/lib."

http://refspecs.linuxfoundation.org/FHS_2.3/fhs-2.3.html

The debian package is modified to install the plugins as part of the
ceph package which also ships rados-classes.

Signed-off-by: Loic Dachary <loic@dachary.org>
2013-12-22 23:11:02 +01:00
Sage Weil
880a7dcb26 Merge pull request #983 from dachary/wip-rep-replicated
mon: s/rep/replicated/ in pool create prototype

Reviewed-by: Sage Weil <sage@inktank.com>
2013-12-22 12:39:08 -08:00
Loic Dachary
203c5d673c mon: s/rep/replicated/ in pool create prototype
The test is updated to remove unnecessary asserts. Since all combinations
of properties and pool type are allowed, there is no way to statically
check the validity of the arguments.

Signed-off-by: Loic Dachary <loic@dachary.org>
2013-12-22 19:37:27 +01:00
Sage Weil
d192062bfb ceph_test_rados: update in-memory user_version on RemoveAttrsOp
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-22 09:49:28 -08:00