Commit Graph

30068 Commits

Author SHA1 Message Date
Sage Weil
1c0083029e Merge remote-tracking branch 'gh/wip-objecter-full-2'
Reviewed-by: Sage Weil <sage@inktank.com>
2013-12-13 10:49:10 -08:00
Josh Durgin
8e4b5bf8ca Merge pull request #936 from ceph/wip-rbd-single-major
rbd: support for single-major device number allocation scheme

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-12-13 10:40:11 -08:00
Sage Weil
e7652e6b4d Merge pull request #932 from ceph/wip-6979
replace sgdisk subprocess calls with a helper

Reviewed-by: Sage Weil <sage@inktank.com>
2013-12-13 10:03:43 -08:00
Sage Weil
d5ac73658d Merge remote-tracking branch 'gh/next'
2013-12-13 09:58:10 -08:00
Yan, Zheng
5bb04763de test/libcephfs: release resources before umount
Fixes: #6742
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-12-13 09:57:50 -08:00
Alfredo Deza
897dfc113f use the new get_command helper in check_call
Signed-off-by: Alfredo Deza <alfredo@deza.pe>
2013-12-13 12:06:25 -05:00
Ilya Dryomov
eae8531e42 rbd: modprobe with single_major=Y on newer kernels
On kernels that support it, and if 'rbd map' is given a chance to
modprobe, turn on single-major device number allocation scheme.  For
users who for some reason don't want it, the workaround is to insert
the rbd module manually before executing the first 'rbd map' command.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
2013-12-13 17:40:52 +02:00
Ilya Dryomov
8a473bcc99 rbd: add support for single-major device number allocation scheme
With the preparatory commits ("rbd: match against wholedisk device
numbers on unmap" and "rbd: match against both major and minor on unmap
on kernels >= 3.14") in, this amounts to choosing to work with new rbd
bus interfaces (/sys/bus/rbd/{add,remove}_single_major) if they are
available, instead of the old ones (/sys/bus/rbd/{add,remove}).

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
2013-12-13 17:40:52 +02:00
Ilya Dryomov
784cc894e0 rbd: match against both major and minor on unmap on newer kernels
As described in commit "rbd: match against wholedisk device numbers on
unmap", currently we only match against major numbers.  In preparation
for support for single-major device number allocation scheme, start
matching against minor numbers also, which newer kernels provide in
a /sys/bus/rbd/devices/<id>/minor sysfs attribute.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
2013-12-13 17:40:52 +02:00
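
The matching rule described in commit 784cc894e0 above can be sketched roughly as follows. This is an illustrative Python sketch, not the rbd code itself (which is C++). The names `read_int` and `find_rbd_id` and the `sysfs_root` parameter are hypothetical; only the `/sys/bus/rbd/devices/<id>/{major,minor}` layout comes from the commit message.

```python
import os


def read_int(path):
    """Return the integer stored in a sysfs-style file, or None if absent or malformed."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def find_rbd_id(major, minor, sysfs_root="/sys/bus/rbd/devices"):
    """Return the id of the mapping whose device numbers match, or None.

    sysfs_root is a parameter only so the sketch can be tried on a fake tree.
    """
    for dev_id in sorted(os.listdir(sysfs_root)):
        dev_dir = os.path.join(sysfs_root, dev_id)
        if read_int(os.path.join(dev_dir, "major")) != major:
            continue
        minor_path = os.path.join(dev_dir, "minor")
        if os.path.exists(minor_path):
            # Newer kernels expose a minor attribute: it must match as well.
            if read_int(minor_path) != minor:
                continue
        # Older kernels (no minor attribute): the major alone identifies the mapping.
        return dev_id
    return None
```

When the `minor` attribute is missing, the sketch falls back to major-only matching, which mirrors the older behaviour the message describes.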
Ilya Dryomov
462b3898e5 rbd: match against whole disks on unmap
Currently, 'rbd unmap' translates a user-provided block device into an
rbd id by matching the major number of the specified device against
/sys/bus/rbd/devices/<id>/major for each rbd mapping and declaring
success on the first match.  This works for both entire disks
and partitions, because under the current device number allocation
scheme, each mapping means a new major number.

In preparation for support for single-major device number allocation
scheme, which would require matching both major and minor numbers, make
sure to always match against entire disk device numbers, by converting
the specified device major:minor pair into wholedisk major:minor pair.
To achieve that, use the libblkid library, which accomplishes this goal
by walking stable sysfs structures.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
2013-12-13 17:40:52 +02:00
Ilya Dryomov
a42130592d rbd: switch to strict_strtol for major parsing
Use common/strict_strtol, which actually parses integers in a proper
way, instead of atoi for parsing /sys/bus/rbd/devices/<id>/major.  This
is important because the kernel apparently can write things like
"(none)" into that file; strict parsing is also more bulletproof in general.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
2013-12-13 17:40:52 +02:00
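
To illustrate why commit a42130592d above prefers strict parsing, here is a minimal Python sketch contrasting an atoi-style parser with a strict one. Both functions are hypothetical stand-ins written for this example; they are not Ceph's `strict_strtol`, and tolerating surrounding whitespace in `strict_int` is an assumption of the sketch.

```python
import re


def atoi_like(text):
    """Mimic C atoi: parse a leading integer, silently returning 0 when there is none."""
    m = re.match(r"\s*([+-]?\d+)", text)
    return int(m.group(1)) if m else 0


def strict_int(text):
    """Parse text as a base-10 integer; return (value, error) instead of guessing."""
    stripped = text.strip()
    if not re.fullmatch(r"[+-]?\d+", stripped):
        return None, "expected an integer, got %r" % text
    return int(stripped), None
```

With the atoi-style parser, a file containing "(none)" would be read as major number 0, which is indistinguishable from a real value; the strict variant reports an error that the caller can act on.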
Sage Weil
31b0823deb Merge pull request #934 from cernceph/wip-rgw-ulimit
radosgw: increase nofiles ulimit on sysvinit machines
2013-12-12 09:42:21 -08:00
Sage Weil
500de8b241 Merge pull request #935 from ceph/wip-vstart-memstore
vstart.sh: add --memstore option
2013-12-12 09:41:40 -08:00
Yehuda Sadeh
bcde2003af vstart.sh: add --memstore option
for setting memstore backed osds

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2013-12-12 09:31:53 -08:00
Alfredo Deza
a9334a1c8c use the absolute path for executables if found
Signed-off-by: Alfredo Deza <alfredo@deza.pe>
2013-12-12 11:16:38 -05:00
Alfredo Deza
43561f7916 remove trailing semicolon
Signed-off-by: Alfredo Deza <alfredo@deza.pe>
2013-12-12 10:26:05 -05:00
Dan van der Ster
a33c95f125 radosgw: increase nofiles ulimit on sysvinit machines
Clusters with many OSDs require a higher nofiles ulimit than the RHEL default. Increase it.

Tested-by: Dan van der Ster <daniel.vanderster@cern.ch>
Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
2013-12-12 14:53:13 +01:00
Sage Weil
71cefc2927 doc/release-notes: sort
meh

Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-11 16:13:51 -08:00
Sage Weil
ee3173d900 doc/release-notes: fix indentation; sigh
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-11 16:11:00 -08:00
Sage Weil
3abc189454 doc/release-notes: v0.73
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-11 15:59:45 -08:00
Sage Weil
03429d1e4d PendingReleaseNotes: note CRUSH and hashpspool default changes
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-11 15:39:37 -08:00
Sage Weil
1504b961a9 Merge pull request #930 from ceph/wip-hashpspool
enable hashpspool by default

Reviewed-by: Samuel Just <sam.just@inktank.com>
2013-12-11 15:37:46 -08:00
Greg Farnum
bb50276f2f Revert "Partial revert "mon: osd pool set syntax relaxed, modify unit tests""
This reverts commit e80ab94bf4.

We accept non-CephInt arguments again, now that we've got the monitors
handling differing APIs intelligently.

Signed-off-by: Greg Farnum <greg@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-12-11 15:29:53 -08:00
Sage Weil
0cd36e0587 mon/OSDMonitor: take 'osd pool set ...' value as a string again
We ran into problems before when we made this a string because a mixed
cluster of mons might forward a client request with the wrong schema.
To make this work, we make the new code understand both the new and
old schema, and also backport a change to emperor and dumpling to
handle the new schema.

For the previous attempt to do this, see:
 337195f046
 2fe0d0d97a

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-12-11 15:29:42 -08:00
Gregory Farnum
72a304acb0 Merge pull request #925 from ceph/wip-mon-api
Merge in changes to unify the API presented by the monitors and handle changes gracefully.

(Upgrade tests) Tested-by: Tamil Muthamizhan <tamil.muthamizhan@inktank.com>

Reviewed-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-12-11 13:27:03 -08:00
Alfredo Deza
e19e38012b replace sgdisk subprocess calls with a helper
Signed-off-by: Alfredo Deza <alfredo@deza.pe>
2013-12-11 15:41:45 -05:00
Sage Weil
4b6d721434 osd: enable HASHPSPOOL by default
Much like the CRUSH tunables, this first appears in kernel v3.9.

Unlike the CRUSH tunables, it does not appear in Ceph until v0.64
(post cuttlefish, pre dumpling).

Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-11 11:19:37 -08:00
Greg Farnum
fb47d54044 mon: if we're the leader, don't validate command matching
Classic-format commands never match our leader command set!

Signed-off-by: Greg Farnum <greg@inktank.com>
2013-12-11 10:12:56 -08:00
Greg Farnum
2bfd34ac95 mon: by default, warn if some members of the quorum are "classic"
Signed-off-by: Greg Farnum <greg@inktank.com>
2013-12-11 10:12:56 -08:00
Greg Farnum
b8884e01a0 MemStore: update for the new ObjectStore interface
Commit 68fdcfa1cc changed the ObjectStore interface in the 'next'
branch, which was merged into master by e5a02c33e2. Unfortunately,
MemStore (added via the master branch) was not updated for this
interface change.

Signed-off-by: Greg Farnum <greg@inktank.com>
Reviewed-by: David Zafman <david.zafman@inktank.com>
2013-12-10 17:08:21 -08:00
Gary Lowell
e5a02c33e2 Merge branch 'next'
2013-12-10 21:00:14 +00:00
Gregory Farnum
b66902b64e Merge pull request #927 from dachary/wip-crush-test
crush: remove crushtool test leftover
2013-12-10 12:25:07 -08:00
Loic Dachary
8ac1da8e35 crush: remove crushtool test leftover
Signed-off-by: Loic Dachary <loic@dachary.org>
2013-12-10 20:35:34 +01:00
Sage Weil
6bd63a1ec1 Merge pull request #920 from dachary/wip-man
man: Ceph is also an object store

Reviewed-by: Sage Weil <sage@inktank.com>
2013-12-10 11:10:41 -08:00
Greg Farnum
ec609cacde Elector: use monitor's encoded command sets instead of our own
Signed-off-by: Greg Farnum <greg@inktank.com>
2013-12-10 10:23:03 -08:00
scuttlemonkey
85a024a6bd Merge pull request #865 from ceph/wip-doc-build-cluster
Wip doc build cluster
2013-12-10 10:14:59 -08:00
Greg Farnum
e223e5348d Monitor: encode and expose mon command sets
Signed-off-by: Greg Farnum <greg@inktank.com>
2013-12-10 10:09:24 -08:00
Loic Dachary
420a2f15a5 man: update man/ from doc/man/8
As explained in admin/manpage-howto.txt

Signed-off-by: Loic Dachary <loic@dachary.org>
2013-12-10 18:34:16 +01:00
Loic Dachary
8d60cd1ac2 man: Ceph is also an object store
Replace

   Ceph distributed file system

with

   Ceph distributed storage system

to help reduce the idea that Ceph is just a file system.

Signed-off-by: Loic Dachary <loic@dachary.org>
2013-12-10 18:33:05 +01:00
Sage Weil
d650474059 Merge pull request #923 from dachary/wip-crush-test
CrushTester patches and documentation

Reviewed-by: Sage Weil <sage@inktank.com>
2013-12-10 09:06:31 -08:00
Sage Weil
faaf546303 os/MemStore: do on_apply_sync callback synchronously
We can easily deadlock if we put this in the Finisher thread behind other
work; do it synchronously!

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
2013-12-10 08:56:35 -08:00
Gary Lowell
d8ad51ee8a v0.73
2013-12-10 04:55:36 +00:00
Greg Farnum
a6f4d71c65 Elector: keep a list of classic mons instead of each mon's commands
We aren't actually using the sets, so don't bother keeping them.

Signed-off-by: Greg Farnum <greg@inktank.com>
2013-12-09 15:34:34 -08:00
Loic Dachary
a888a57f79 crush: implement --show-bad-mappings for indep
Support the presence of ITEM_NONE device numbers in the indep mapping as
proof of a bad mapping. Implement the associated unit tests.

Signed-off-by: Loic Dachary <loic@dachary.org>
2013-12-09 21:10:29 +01:00
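
The check described in commit a888a57f79 above can be sketched as follows. This is a hedged Python illustration, not the CrushTester code: `is_bad_mapping` is a hypothetical helper, the value used for `CRUSH_ITEM_NONE` is an assumption, and treating a short result as bad is likewise an assumption about the non-indep case.

```python
# Assumed placeholder value meaning "no device chosen for this position".
CRUSH_ITEM_NONE = 0x7FFFFFFF


def is_bad_mapping(mapping, expected_size):
    """Return True if the mapping is too short or contains an ITEM_NONE hole.

    A short result is assumed to be how a non-indep rule shows a bad mapping;
    an indep rule keeps its length and marks unfilled slots with ITEM_NONE.
    """
    return len(mapping) != expected_size or CRUSH_ITEM_NONE in mapping
```

For example, a three-slot indep result such as `[4, CRUSH_ITEM_NONE, 7]` has the expected length but would still be flagged because of the hole in the second position.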
Loic Dachary
20263dd30e crush: add unit test for crushtool --show-bad-mappings
Signed-off-by: Loic Dachary <loic@dachary.org>
2013-12-09 21:10:29 +01:00
Loic Dachary
fbc4f99080 crush: remove scary message string
The string is no longer used and can be removed.

Signed-off-by: Loic Dachary <loic@dachary.org>
2013-12-09 21:10:29 +01:00
Loic Dachary
472f495e40 crush: document the --test mode of operations
Signed-off-by: Loic Dachary <loic@dachary.org>
2013-12-09 21:10:23 +01:00
Greg Farnum
ea86444fb3 Monitor: Elector: share the classic command set if we have a classic mon
The leader now checks to see if any monitors did not provide their
command set, and if so, shares the list of "classic" commands instead
of its own set. This will prevent users from seeing different commands
(depending on whether they connect to an old or new mon) while
performing upgrades, and will make it really obvious if they forgot
to upgrade one of the monitors!

Signed-off-by: Greg Farnum <greg@inktank.com>
2013-12-09 11:26:04 -08:00
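
The leader's decision in commit ea86444fb3 above amounts to a small rule, sketched here in Python. The function name, argument shapes, and the use of `None` to mean "this monitor did not provide a command set" are all assumptions made for illustration; the real logic lives in the C++ Monitor and Elector classes.

```python
def commands_to_share(leader_commands, peer_command_sets, classic_commands):
    """Pick the command set the leader advertises to the quorum.

    peer_command_sets maps a monitor name to its command set, or to None
    when that monitor did not provide one (assumed to mean it is "classic").
    """
    if any(cmds is None for cmds in peer_command_sets.values()):
        # At least one classic monitor is in the quorum: fall back to the
        # classic list so every monitor presents the same commands.
        return classic_commands
    return leader_commands
```

Once every monitor has been upgraded and reports its own set, the same rule returns the leader's commands again without any further change.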
Greg Farnum
f1ccdb418b Elector: share local command set when deferring
We're about to use this at a basic level, to identify when we have
"classic" monitors in-quorum, but could also do something more
sophisticated like a set intersection on the commands.

Signed-off-by: Greg Farnum <greg@inktank.com>
2013-12-09 11:26:04 -08:00
Greg Farnum
ba673be3e6 Monitor: import MonCommands.h from original Dumpling and expose it
If the Elector doesn't receive a set of commands from the elected leader, it
assumes the monitor is "classic" and uses the Dumpling command set as
the leader set.

Signed-off-by: Greg Farnum <greg@inktank.com>
2013-12-09 11:26:04 -08:00