Commit Graph

41894 Commits

Author SHA1 Message Date
Yehuda Sadeh
51bf619b5e Merge pull request #4745 from jmunhoz/object-copy-bug
rgw: Use attrs from source bucket on copy

Reviewed-by: Yehuda Sadeh <yehuda@redhat.com>
2015-05-25 00:05:11 -07:00
Yan, Zheng
2daaa61bc2 mds: fix use-after-free in SessionMap::remove_session
Fixes: #11752
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-05-25 11:46:06 +08:00
Loic Dachary
d49d8166a0 Merge pull request #4709 from dachary/wip-shec-corpus
erasure-code: update ceph-erasure-code-corpus for shec
2015-05-23 10:54:12 +02:00
Loic Dachary
4507cb235d Merge pull request #4751 from islepnev/wip-11612
Support NVMe device partitions by ceph-disk

Reviewed-by: Loic Dachary <ldachary@redhat.com>
2015-05-23 09:05:03 +02:00
islepnev
9b62cf254d ceph-disk: support NVMe device partitions
Linux nvme kernel module v0.9 enumerates devices as follows:

/dev/nvme0 - character device
/dev/nvme0n1 - whole block device
/dev/nvme0n1p1 - first partition
/dev/nvme0n1p2 - second partition

http://tracker.ceph.com/issues/11612 Fixes: #11612

Signed-off-by: Ilja Slepnev <islepnev@gmail.com>
2015-05-23 00:51:45 +03:00
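
The naming scheme listed in the commit above can be sketched in Python. This is an illustration only, not the ceph-disk code: the helper name `partition_path` is hypothetical, and the rule it applies (insert a `p` separator when the whole-device name ends in a digit) is an assumption inferred from the device names in the commit message.

```python
import re


def partition_path(device: str, number: int) -> str:
    """Return the path of partition `number` on the whole block device `device`.

    Hypothetical helper for illustration. Assumption: a device whose name
    ends in a digit (e.g. /dev/nvme0n1) takes a 'p' separator before the
    partition number, while other devices (e.g. /dev/sda) do not.
    """
    if number < 1:
        raise ValueError("partition numbers start at 1")
    separator = "p" if re.search(r"\d$", device) else ""
    return f"{device}{separator}{number}"


if __name__ == "__main__":
    print(partition_path("/dev/nvme0n1", 1))
    print(partition_path("/dev/nvme0n1", 2))
    print(partition_path("/dev/sda", 1))
```

Keeping the rule in one small function means callers never need to special-case NVMe devices themselves.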
Kefu Chai
f36344f07f Merge pull request #4736 from tchaikov/wip-11693-only-restart-crashed-osds
test/test-erasure-code: spin off EIO tests to avoid lingering OSDs after tests finish

Reviewed-by: Loic Dachary <ldachary@redhat.com>
2015-05-23 03:56:34 +08:00
Josh Durgin
87ec95fb37 Merge pull request #4748 from ceph/wip-11562
dev/rbd-diff: clarify encoding of image size

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2015-05-22 12:43:46 -07:00
Kefu Chai
f417eda9a4 tests/test-erasure-code: spin off eio tests into another testsuite
* since the eio tests crash some of the OSD nodes, before this
  change the tests tried to undo the crash before moving on, so it
  would not interfere with the following tests. a more robust/clean way to
  do this is to isolate individual tests in a sandbox, so each eio
  test will have its own:
    setup + inject + verify crash + teardown
  cycle. this change helps to remove the cleanup/undo steps in
  individual tests.
* update the disabled tests accordingly.
* use a minimum set of OSDs and R-S(2,1) for the testing to speed
  up the test.
* add the new testsuite to check_SCRIPTS

Fixes: #11693
Signed-off-by: Kefu Chai <kchai@redhat.com>
2015-05-23 03:22:41 +08:00
Kefu Chai
2230deffce tests: fix the get_config()
* the "daemon" parameter was not respected.
* update the test_get_config() to check the overridden option instead of
  the default one.
* add set_config()

Signed-off-by: Kefu Chai <kchai@redhat.com>
2015-05-23 03:19:23 +08:00
Loic Dachary
09a6457296 Merge pull request #4749 from ddiss/ceph_disk_test_fix
tests: don't choke on deleted losetup paths

Reviewed-by: Loic Dachary <ldachary@redhat.com>
2015-05-22 20:45:16 +02:00
Casey Bodley
33eae4ec2f xio: fix reuse of outer loop index in inner loop
Reported-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Casey Bodley <casey@cohortfs.com>
2015-05-22 11:09:43 -07:00
Casey Bodley
367a5fccf2 cmake: add missing source file to test_librbd
Signed-off-by: Casey Bodley <casey@cohortfs.com>
2015-05-22 11:09:37 -07:00
Casey Bodley
a8fca3c212 cmake: add missing common/util.cc dependency
Signed-off-by: Casey Bodley <casey@cohortfs.com>
2015-05-22 11:09:33 -07:00
Casey Bodley
15dd70cd5a cmake: skip man/CMakeLists.txt
man pages have to be preprocessed now, and can't be installed directly.
skip installing them until we add the cmake-fu to copy what man/Makefile.am
is doing

Signed-off-by: Casey Bodley <casey@cohortfs.com>
2015-05-22 11:09:29 -07:00
David Disseldorp
7c1bae5624 tests: don't choke on deleted losetup paths
If a file has been deleted with a loopback device attached, then the
`losetup --all` output will carry:
/dev/loopX: [0032]:344213 (/.../src/test-ceph-disk/vdf.disk (deleted))

This causes the losetup parsing in reset_leftover_dev() to throw an
error, e.g.:
rreset_leftover_dev: 430: test
'(/home/ddiss/ceph/src/test-ceph-disk/vdf.disk' '(deleted))' =
'(/home/ddiss/ceph/src/test-ceph-disk/vdf.disk)'
test/ceph-disk.sh: line 430: test: too many arguments

Fix this by quoting the path variable for the string comparison.

Signed-off-by: David Disseldorp <ddiss@suse.de>
2015-05-22 17:22:51 +02:00
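
The failure described in the commit above comes from the backing-file field containing a space once the file is deleted. The actual fix is shell quoting. As a separate illustration, the Python sketch below parses the same `losetup --all` line format without relying on word splitting. The function name `parse_losetup_line`, the regular expression, and the device name `/dev/loop3` are assumptions made for this example; only the line layout is taken from the commit message.

```python
import re
from typing import NamedTuple, Optional


class LoopDevice(NamedTuple):
    device: str
    backing_file: str
    deleted: bool


# Matches lines shaped like the one quoted in the commit message:
#   /dev/loopN: [0032]:344213 (/path/to/file (deleted))
# The trailing " (deleted)" marker is optional.
_LINE = re.compile(
    r"^(?P<dev>/dev/loop\d+): \[[^\]]*\]:\d+ "
    r"\((?P<path>.*?)(?P<deleted> \(deleted\))?\)$"
)


def parse_losetup_line(line: str) -> Optional[LoopDevice]:
    """Parse one line of `losetup --all` output (hypothetical helper)."""
    match = _LINE.match(line.strip())
    if match is None:
        return None
    return LoopDevice(
        device=match.group("dev"),
        backing_file=match.group("path"),
        deleted=match.group("deleted") is not None,
    )


if __name__ == "__main__":
    line = ("/dev/loop3: [0032]:344213 "
            "(/home/ddiss/ceph/src/test-ceph-disk/vdf.disk (deleted))")
    print(parse_losetup_line(line))
```

Treating the whole parenthesized field as one value is the same idea as quoting the variable in the shell test: the path is compared as a single string, whether or not the deleted marker is present.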
Jason Dillaman
f9ba711c30 dev/rbd-diff: clarify encoding of image size
Fixes: #11562
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2015-05-22 11:18:09 -04:00
Loic Dachary
7defc06962 Merge pull request #4512 from hjwsm1989/init-ceph
init-ceph.in: Create osd data dir if it doesn't exist.

Reviewed-by: Loic Dachary <ldachary@redhat.com>
2015-05-22 16:19:28 +02:00
Loic Dachary
9f0d2da72f Merge pull request #4740 from ktdreyer/wip-11688-doc-firewall-ports
#11688: doc: update OSD/MDS firewall port list

Reviewed-by: Loic Dachary <ldachary@redhat.com>
2015-05-22 14:32:04 +02:00
Kefu Chai
6a7aa6c552 Merge pull request #4734 from wonzhq/aio-completion
test/aio: fix the leak of aio completion

Reviewed-by: Kefu Chai <kchai@redhat.com>
2015-05-22 17:57:48 +08:00
Kefu Chai
c2c36bc977 Merge pull request #4738 from dachary/wip-11618-osd-create-dup
tests: ceph create may consume more than one id

Reviewed-by: Joao Eduardo Luis <joao@suse.de>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2015-05-22 16:55:04 +08:00
Loic Dachary
ab8e9e39c8 tests: CEPH_CLI_TEST_DUP_COMMAND=1 for qa/workunits/cephtool/test.sh
Run cephtool-test-{mon,osd,mds}.sh with CEPH_CLI_TEST_DUP_COMMAND=1 to
detect idempotency related problems during make check. This is how
ceph-qa-suite/tasks/workunit.py will run
suites/rados/singleton/all/cephtool.yaml and it's easier to fix when
make check fails rather than later on when a fully populated rados suite
has one failed job.

http://tracker.ceph.com/issues/11618 Refs: #11618

Signed-off-by: Loic Dachary <ldachary@redhat.com>
2015-05-22 10:16:37 +02:00
Loic Dachary
5c69f5e15f tests: ceph create may consume more than one id
When CEPH_CLI_TEST_DUP_COMMAND=1 is set, ceph osd create will consume
two osd ids and return the latter. Fix the test to account for that and
not assume the osd id allocated by osd create is always the
next available osd id.

The other osd create tests do not suffer from the same variation because
they provide a UUID argument that guarantees the same osd id is going to
be returned every time.

http://tracker.ceph.com/issues/11618 Fixes: #11618

Signed-off-by: Loic Dachary <ldachary@redhat.com>
2015-05-22 10:16:24 +02:00
Javier M. Mellid
1dac80df1d rgw: Use attrs from source bucket on copy
On copy objects, when bucket source is the same as the destination, use attrs
from source bucket.

Fixes: #11639

Signed-off-by: Javier M. Mellid <jmunhoz@igalia.com>
2015-05-22 09:41:01 +02:00
Yehuda Sadeh
4be8e49e38 Merge pull request #4617 from aakso/wip-11367-pki-token-expire
rgw: always check if token is expired

Reviewed-by: Yehuda Sadeh <yehuda@redhat.com>
2015-05-21 14:09:38 -07:00
Ken Dreyer
11fef2275d doc: recommend opening entire 6800-7300 port range
Prior to this commit, the Network Configuration Reference guide and
Troubleshooting guide recommended opening a number of ports that were
unique to the number of daemons that we ran.

This doesn't really cover all use cases. Users can easily restart
daemons in ways that cause the daemons to bind to higher ports. This
leads to OSDs or MDSs binding to ports that are firewalled.

Update the Network Configuration Reference guide and Troubleshooting
guides to simply recommend that users open all the ports between 6800
and 7300 on their OSDs and MDSs.

http://tracker.ceph.com/issues/11688 Refs: #11688

Signed-off-by: Ken Dreyer <kdreyer@redhat.com>
2015-05-21 14:31:00 -06:00
Samuel Just
4fe7d2abdf RadosModel: randomly prefix delete with assert_exists
Signed-off-by: Samuel Just <sjust@redhat.com>
2015-05-21 12:14:02 -07:00
Samuel Just
121aa3bc61 RadosModel: assert exists on subsequent writes
Signed-off-by: Samuel Just <sjust@redhat.com>
2015-05-21 12:14:02 -07:00
Ken Dreyer
b50cc9472f doc: update OSD port range to 6800-7300
The upper limit for OSD/MDS ports changed from 7100 to 7300 in commit
f9ec5a7945. Update the Quick Start
Preflight documentation to reflect this change.

Signed-off-by: Ken Dreyer <kdreyer@redhat.com>
2015-05-21 13:00:32 -06:00
Yehuda Sadeh
98cdf03363 Merge pull request #4391 from nilamdyuti/wip-doc-ceph-object-gateway
doc: Removes references to s3gw.fcgi in simple gateway configuration file...

Reviewed-by: Yehuda Sadeh <yehuda@redhat.com>
2015-05-21 13:00:20 -04:00
Casey Bodley
3dda5faf75 xio: malloc if xio_mempool_alloc fails
Signed-off-by: Casey Bodley <casey@cohortfs.com>
2015-05-21 07:04:36 -07:00
Casey Bodley
5c14a69395 xio: fix for xio_msg release after teardown
The xio_msg pointers to be freed in XioPortal::release_xio_rsp() are no
longer valid after a call to xio_connection_destroy(). We were already
avoiding the call to xio_release_msg() in this case, but were still
dereferencing the xio_msg for its user_context pointer. Moved the check
for is_connected() outside of the loop to avoid any access to msg.

Suggested-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Casey Bodley <casey@cohortfs.com>
2015-05-21 07:04:26 -07:00
Casey Bodley
16d1c1e97d xio: use ceph clock for timestamps
accelio is using rdtsc to generate xio_msg.timestamp, which can't be
reliably converted to a timeval. now uses ceph_clock_now() to assign
the Message::recv_stamp and recv_complete_stamp

Signed-off-by: Casey Bodley <casey@cohortfs.com>
2015-05-21 07:00:12 -07:00
Vu Pham
c2bba8ebee xio: save nonce for bind address
A missing nonce in the osd addrs was preventing the monitor from
detecting osd restarts. XioMessenger::bind() now sets the nonce in the
same way that SimpleMessenger and AsyncMessenger do

Signed-off-by: Casey Bodley <casey@cohortfs.com>
Signed-off-by: Vu Pham <vu@mellanox.com>
2015-05-21 06:59:59 -07:00
Casey Bodley
355aa0e44b xio: check if connection is on list before erasing
also removed the extra conditional put() in on_disconnect_event()

Signed-off-by: Casey Bodley <casey@cohortfs.com>
2015-05-21 06:59:46 -07:00
Vu Pham
bb621b074d xio: better way to assign connections to specific lane
Assign connections to a specific lane of a portal in a better way,
avoiding lane competition/hogging.
This change resolves the slow ramp-up and spiky behavior seen while
clients start and run I/O.

Signed-off-by: Vu Pham <vu@mellanox.com>
2015-05-21 06:59:36 -07:00
Zhiqiang Wang
855a70d83e test/aio: aio completion is not released
Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
2015-05-21 14:58:20 +08:00
Dan Mick
783fdc7c3e Merge pull request #4517 from ceph/wip-11388-debian-argparse
#11388 debian: move ceph_argparse into ceph-common

Reviewed-by: Dan Mick <dmick@redhat.com>
2015-05-20 14:54:16 -07:00
Ilya Dryomov
8190f44f07 Merge pull request #4721 from ceph/wip-fix-concurrent.sh
Fix ceph.conf path in concurrent.sh - krbd qa suite

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
2015-05-20 20:54:55 +03:00
Ken Dreyer
110608e5bd debian: move ceph_argparse into ceph-common
Prior to this commit, if a user installed the "ceph-common" Debian
package without installing "ceph", then /usr/bin/ceph would crash
because it was missing the ceph_argparse library.

Ship the ceph_argparse library in "ceph-common" instead of "ceph". (This
was the intention of the original commit that moved argparse to "ceph",
2a23eac549)

http://tracker.ceph.com/issues/11388 Refs: #11388

Reported-by: Jens Rosenboom <j.rosenboom@x-ion.de>
Signed-off-by: Ken Dreyer <kdreyer@redhat.com>
2015-05-20 11:29:04 -06:00
Kefu Chai
8c65e2af29 Merge pull request #4720 from athanatos/wip-clarify-DBObjectMap-sync
DBObjectMap::sync: add comment clarifying locking

Reviewed-by: Kefu Chai <kchai@redhat.com>
2015-05-20 21:28:28 +08:00
Kefu Chai
4e272e5eb1 Merge pull request #3946 from tchaikov/randomize-scrub-time
osd: Randomize scrub time

Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
2015-05-20 21:21:13 +08:00
Piotr Dałek
ca6abca63d tools: add --no-verify option to rados bench
When doing seq and rand read benchmarks using rados bench, a quite large
portion of cpu time is consumed by doing object verification. This patch
adds an option to disable this verification when it's not needed, in turn
giving better cluster utilization. Results of rados -p storage bench 600 rand
without --no-verify:

Total time run:       600.228901
Total reads made:     144982
Read size:            4194304
Bandwidth (MB/sec):   966
Average IOPS:         241
Stddev IOPS:          38
Max IOPS:             909522486
Min IOPS:             0
Average Latency:      0.0662
Max latency:          1.51
Min latency:          0.004

real    10m1.173s
user    5m41.162s
sys     11m42.961s

Same command, but with --no-verify:

Total time run:       600.161379
Total reads made:     174142
Read size:            4194304
Bandwidth (MB/sec):   1.16e+03
Average IOPS:         290
Stddev IOPS:          20
Max IOPS:             909522486
Min IOPS:             0
Average Latency:      0.0551
Max latency:          1.12
Min latency:          0.00343

real    10m1.172s
user    4m13.792s
sys     13m38.556s

Note the decreased latencies, increased bandwidth and more reads performed.

Signed-off-by: Piotr Dałek <piotr.dalek@ts.fujitsu.com>
2015-05-20 12:41:22 +02:00
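
The headline figures in the commit above follow from the raw counters it lists. The sketch below recomputes bandwidth and average IOPS from total time, reads made and read size, then derives the relative change between the two runs. The function name `summarize` is hypothetical, and the formulas (bandwidth as MiB read per second, IOPS as reads per second) are assumptions that happen to agree with the reported numbers; this is not the rados bench code.

```python
MIB = 1024 * 1024


def summarize(total_time_s: float, reads: int, read_size_bytes: int):
    """Return (bandwidth in MiB/s, average IOPS) for one benchmark run.

    Assumed formulas, chosen because they agree with the figures quoted
    in the commit message.
    """
    bandwidth = reads * read_size_bytes / MIB / total_time_s
    iops = reads / total_time_s
    return bandwidth, iops


# Counters copied from the two runs in the commit message.
with_verify = summarize(600.228901, 144982, 4194304)
no_verify = summarize(600.161379, 174142, 4194304)

# Relative changes between the runs.
reads_gain = 174142 / 144982 - 1       # fraction of additional reads
latency_drop = 1 - 0.0551 / 0.0662     # fraction of average latency removed

if __name__ == "__main__":
    print(f"with verification: {with_verify[0]:.0f} MiB/s, {with_verify[1]:.0f} IOPS")
    print(f"--no-verify:       {no_verify[0]:.0f} MiB/s, {no_verify[1]:.0f} IOPS")
    print(f"reads: +{reads_gain:.1%}, average latency: -{latency_drop:.1%}")
```

Under these assumptions the second run performs roughly a fifth more reads in the same wall-clock time, which is consistent with the commit's note about increased bandwidth.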
Kefu Chai
6344fc8393 osd: use another name for randomize scrub option
s/osd_scrub_interval_limit/osd_scrub_interval_randomize_ratio/

Fixes: #10973
Signed-off-by: Kefu Chai <kchai@redhat.com>
2015-05-20 18:23:21 +08:00
Kefu Chai
5e44040e85 osd: randomize scrub times to avoid scrub wave
- to avoid a scrub wave when osd_scrub_max_interval is reached on a
  high-load OSD, the scrub time is randomized.
- extract scrub_load_below_threshold() out of scrub_should_schedule()
- schedule an automatic scrub job at a time which is uniformly distributed
  over [now+osd_scrub_min_interval,
        now+osd_scrub_min_interval*(1+osd_scrub_time_limit)]. before
  this change this sort of scrub was performed once the hard interval
  had ended or the system load was below the threshold; with this change, the
  jobs will be performed as long as the load is low or the interval of
  the scheduled scrubs is longer than conf.osd_scrub_max_interval. all
  automatic jobs should be performed in the configured time period, otherwise
  they are postponed.
- a requested scrub job is scheduled right away; before this change
  it was queued with the timestamp of `now` and postponed until
  osd_scrub_min_interval had passed.

Fixes: #10973
Signed-off-by: Kefu Chai <kchai@redhat.com>
2015-05-20 18:23:21 +08:00
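
The scheduling window described in the commit above can be sketched as follows. This is a minimal Python illustration, not the OSD implementation: the function name `pick_scrub_time`, its parameter names, and the example values are assumptions. `ratio` stands in for the randomization option named in the commit message.

```python
import random


def pick_scrub_time(now: float, min_interval: float, ratio: float,
                    rng: random.Random) -> float:
    """Pick a scrub time uniformly from
    [now + min_interval, now + min_interval * (1 + ratio)].

    Hypothetical helper: spreading the start times over this window keeps
    many placement groups from becoming due at the same instant.
    """
    if min_interval < 0 or ratio < 0:
        raise ValueError("min_interval and ratio must be non-negative")
    earliest = now + min_interval
    latest = now + min_interval * (1 + ratio)
    return rng.uniform(earliest, latest)


if __name__ == "__main__":
    rng = random.Random()
    now = 0.0
    one_day = 24 * 60 * 60  # example value, not a documented default
    for _ in range(3):
        when = pick_scrub_time(now, one_day, 0.5, rng)
        print(f"scrub scheduled {when / 3600:.1f} hours from now")
```

Passing the random generator in explicitly is a choice made for this sketch so that a seeded generator can make the schedule reproducible in tests.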
Kefu Chai
0f7f35670f osd: use __func__ in log messages
Signed-off-by: Kefu Chai <kchai@redhat.com>
2015-05-20 18:23:21 +08:00
Kefu Chai
2ab0e606df osd: simplify OSD::scrub_load_below_threshold() a little bit
avoid unnecessary comparison

Signed-off-by: Kefu Chai <kchai@redhat.com>
2015-05-20 18:23:21 +08:00
Haomai Wang
8ec7303b95 Merge pull request #4691 from varadakari/wip-kvs-objheader
KeyValueStore: optimize the object header writes

Reviewed-by: Haomai Wang <haomaiwang@gmail.com>
2015-05-20 16:29:21 +08:00
Vasu Kulkarni
f9e5b68b23 qa: unbreak concurrent.sh workunit
Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
2015-05-19 15:55:05 -04:00
Samuel Just
c2d17b927f test/librados/snapshots.cc: add test for 11677
Signed-off-by: Samuel Just <sjust@redhat.com>
2015-05-19 11:02:40 -07:00
Yan, Zheng
e585ddf43f Merge pull request #4658 from ceph/wip-11481
#11481: MDS resilience to weird mdsmaps
2015-05-19 16:03:52 +08:00