Commit Graph

41739 Commits

Author SHA1 Message Date
Yan, Zheng
b20ea4302f client: start flushing dirty caps in Client::_fsync()
Client::flush_caps(Inode *, MetaSession *) does not start flushing
dirty caps. It only re-sends cap messages for caps that are already
being flushed.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-05-29 10:03:08 +08:00
Yan, Zheng
6883b82998 client: make fsync wait for unsafe directory operations
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-05-29 10:03:08 +08:00
Yan, Zheng
ce27ae4f16 client: make fsync wait for a single inode's flushing caps
Client::_fsync() calls Client::wait_sync_caps(uint64_t), which
waits for all inodes' flushing caps. This is suboptimal; this patch
makes Client::_fsync() wait only for the flushing caps that belong to
the inode being fsynced.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-05-29 10:03:08 +08:00
Yan, Zheng
6bb9e158fc client: don't update flushing_cap_seq when there are flushing caps
Current code always updates flushing_cap_seq when marking dirty
caps as flushing. If there are old flushing caps, updating
flushing_cap_seq confuses Client::wait_sync_caps(uint64_t), making
it think that the old caps have been flushed.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-05-29 10:03:01 +08:00
Loic Dachary
d49d8166a0 Merge pull request #4709 from dachary/wip-shec-corpus
erasure-code: update ceph-erasure-code-corpus for shec
2015-05-23 10:54:12 +02:00
Loic Dachary
4507cb235d Merge pull request #4751 from islepnev/wip-11612
Support NVMe device partitions by ceph-disk

Reviewed-by: Loic Dachary <ldachary@redhat.com>
2015-05-23 09:05:03 +02:00
islepnev
9b62cf254d ceph-disk: support NVMe device partitions
Linux nvme kernel module v0.9 enumerates devices as follows:

/dev/nvme0 - character device
/dev/nvme0n1 - whole block device
/dev/nvme0n1p1 - first partition
/dev/nvme0n1p2 - second partition

http://tracker.ceph.com/issues/11612 Fixes: #11612

Signed-off-by: Ilja Slepnev <islepnev@gmail.com>
2015-05-23 00:51:45 +03:00
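
The naming scheme listed in commit 9b62cf254d can be illustrated with a short Python sketch. This is not the ceph-disk code; the helper names `classify_nvme` and `partition_path` are hypothetical and exist only to show how the three kinds of device node differ and why NVMe partitions carry a `p` separator.

```python
import re

# Hypothetical helpers for illustration only; not taken from ceph-disk.
_NVME_RE = re.compile(r"^/dev/nvme(\d+)(?:n(\d+)(?:p(\d+))?)?$")


def classify_nvme(path):
    """Return 'controller', 'disk', 'partition', or None for non-NVMe paths."""
    match = _NVME_RE.match(path)
    if match is None:
        return None
    _controller, namespace, partition = match.groups()
    if namespace is None:
        return "controller"  # e.g. /dev/nvme0, the character device
    if partition is None:
        return "disk"        # e.g. /dev/nvme0n1, the whole block device
    return "partition"       # e.g. /dev/nvme0n1p1


def partition_path(disk, number):
    """Build a partition path for a whole-disk device node."""
    # Linux inserts a 'p' before the partition number when the disk
    # name itself ends in a digit (nvme0n1 -> nvme0n1p1, sda -> sda1).
    separator = "p" if disk[-1].isdigit() else ""
    return f"{disk}{separator}{number}"


for node in ("/dev/nvme0", "/dev/nvme0n1", "/dev/nvme0n1p2", "/dev/sda1"):
    print(node, classify_nvme(node))
```

The point of the sketch is that a tool which appends a bare partition number to the disk name handles `sda` but produces a wrong path for `nvme0n1`.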
Kefu Chai
f36344f07f Merge pull request #4736 from tchaikov/wip-11693-only-restart-crashed-osds
test/test-erasure-code: spin off EIO tests to avoid lingering OSDs after tests finish

Reviewed-by: Loic Dachary <ldachary@redhat.com>
2015-05-23 03:56:34 +08:00
Josh Durgin
87ec95fb37 Merge pull request #4748 from ceph/wip-11562
dev/rbd-diff: clarify encoding of image size

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2015-05-22 12:43:46 -07:00
Kefu Chai
f417eda9a4 tests/test-erasure-code: spin off eio tests into another testsuite
* since the eio tests crash some of the OSD nodes, before this
  change the tests tried to undo the crash before moving on, so that it
  would not interfere with the following tests. a more robust/clean way to
  do this is to isolate individual tests in a sandbox, so each eio
  test will have its own:
    setup + inject + verify crash + teardown
  cycle. this change helps to remove the cleanup/undo steps in
  individual tests.
* update the disabled tests accordingly.
* use a minimum set of OSDs and R-S(2,1) for the testing to speed
  up the test.
* add the new testsuite to check_SCRIPTS

Fixes: #11693
Signed-off-by: Kefu Chai <kchai@redhat.com>
2015-05-23 03:22:41 +08:00
Kefu Chai
2230deffce tests: fix the get_config()
* the "daemon" parameter was not respected.
* update the test_get_config() to check the overridden option instead of
  the default one.
* add set_config()

Signed-off-by: Kefu Chai <kchai@redhat.com>
2015-05-23 03:19:23 +08:00
Loic Dachary
09a6457296 Merge pull request #4749 from ddiss/ceph_disk_test_fix
tests: don't choke on deleted losetup paths

Reviewed-by: Loic Dachary <ldachary@redhat.com>
2015-05-22 20:45:16 +02:00
David Disseldorp
7c1bae5624 tests: don't choke on deleted losetup paths
If a file has been deleted with a loopback device attached, then the
`losetup --all` output will carry:
/dev/loopX: [0032]:344213 (/.../src/test-ceph-disk/vdf.disk (deleted))

This causes the losetup parsing in reset_leftover_dev() to throw an
error, e.g.:
rreset_leftover_dev: 430: test
'(/home/ddiss/ceph/src/test-ceph-disk/vdf.disk' '(deleted))' =
'(/home/ddiss/ceph/src/test-ceph-disk/vdf.disk)'
test/ceph-disk.sh: line 430: test: too many arguments

Fix this by quoting the path variable for the string comparison.

Signed-off-by: David Disseldorp <ddiss@suse.de>
2015-05-22 17:22:51 +02:00
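
The failure described in commit 7c1bae5624 comes from shell word splitting. As a rough model, the Python sketch below approximates unquoted expansion with `str.split()` and counts the arguments that `test` would receive. The function names and the `/tmp/vdf.disk` path are hypothetical; only the shape of the losetup field comes from the commit message.

```python
def shell_word_split(value):
    """Approximate unquoted shell expansion: split on runs of whitespace."""
    return value.split()


def build_test_argv(path_field, expected, quoted):
    """Arguments `test` would see for a `$path = $expected` comparison."""
    left = [path_field] if quoted else shell_word_split(path_field)
    return left + ["=", expected]


# Hypothetical path; the "(... (deleted))" shape mirrors the losetup output.
field = "(/tmp/vdf.disk (deleted))"
expected = "(/tmp/vdf.disk)"

unquoted = build_test_argv(field, expected, quoted=False)
quoted = build_test_argv(field, expected, quoted=True)

print(len(unquoted), unquoted)  # 4 arguments: "too many arguments"
print(len(quoted), quoted)      # 3 arguments: a valid string comparison
```

With the variable quoted, the whole field stays one argument, so the comparison is well formed and simply evaluates to false for the deleted file.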
Jason Dillaman
f9ba711c30 dev/rbd-diff: clarify encoding of image size
Fixes: #11562
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2015-05-22 11:18:09 -04:00
Loic Dachary
7defc06962 Merge pull request #4512 from hjwsm1989/init-ceph
init-ceph.in: Create osd data dir if it doesn't exist.

Reviewed-by: Loic Dachary <ldachary@redhat.com>
2015-05-22 16:19:28 +02:00
Loic Dachary
9f0d2da72f Merge pull request #4740 from ktdreyer/wip-11688-doc-firewall-ports
#11688: doc: update OSD/MDS firewall port list

Reviewed-by: Loic Dachary <ldachary@redhat.com>
2015-05-22 14:32:04 +02:00
Kefu Chai
6a7aa6c552 Merge pull request #4734 from wonzhq/aio-completion
test/aio: fix the leak of aio completion

Reviewed-by: Kefu Chai <kchai@redhat.com>
2015-05-22 17:57:48 +08:00
Kefu Chai
c2c36bc977 Merge pull request #4738 from dachary/wip-11618-osd-create-dup
tests: ceph create may consume more than one id

Reviewed-by: Joao Eduardo Luis <joao@suse.de>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2015-05-22 16:55:04 +08:00
Loic Dachary
ab8e9e39c8 tests: CEPH_CLI_TEST_DUP_COMMAND=1 for qa/workunits/cephtool/test.sh
Run cephtool-test-{mon,osd,mds}.sh with CEPH_CLI_TEST_DUP_COMMAND=1 to
detect idempotency related problems during make check. This is how
ceph-qa-suite/tasks/workunit.py will run
suites/rados/singleton/all/cephtool.yaml and it's easier to fix when
make check fails rather than later on when a fully populated rados suite
has one failed job.

http://tracker.ceph.com/issues/11618 Refs: #11618

Signed-off-by: Loic Dachary <ldachary@redhat.com>
2015-05-22 10:16:37 +02:00
Loic Dachary
5c69f5e15f tests: ceph create may consume more than one id
When CEPH_CLI_TEST_DUP_COMMAND=1 is set, ceph osd create will consume
two osd ids and return the latter. Fix the test to account for that and
not assume the osd id being allocated by osd create is always the
next available osd id.

The other osd create tests do not suffer from the same variation because
they provide a UUID argument that guarantees the same osd id is going to
be returned every time.

http://tracker.ceph.com/issues/11618 Fixes: #11618

Signed-off-by: Loic Dachary <ldachary@redhdat.com>
2015-05-22 10:16:24 +02:00
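
The behaviour described in commit 5c69f5e15f can be pictured with a toy model. The sketch below is not Ceph's monitor code: `ToyOsdAllocator` and `run_twice` are hypothetical names, and the allocator only assumes what the commit message states, namely that a plain create takes the next free id while a create with a UUID returns the same id every time.

```python
class ToyOsdAllocator:
    """Toy stand-in for osd id allocation; an assumption, not Ceph code."""

    def __init__(self):
        self.next_id = 0
        self.by_uuid = {}

    def create(self, uuid=None):
        # A known UUID is idempotent: hand back the id it already owns.
        if uuid is not None and uuid in self.by_uuid:
            return self.by_uuid[uuid]
        osd_id = self.next_id
        self.next_id += 1
        if uuid is not None:
            self.by_uuid[uuid] = osd_id
        return osd_id


def run_twice(command):
    """Mimic a duplicated command: run it twice, keep the second result."""
    command()
    return command()


allocator = ToyOsdAllocator()
plain_id = run_twice(allocator.create)                 # consumes ids 0 and 1
uuid_id = run_twice(lambda: allocator.create("abc"))   # consumes only id 2
print(plain_id, uuid_id, allocator.next_id)
```

In this model a test that expects the plain create to return "the next available id" is off by one when the command is duplicated, while the UUID variant is unaffected.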
Yehuda Sadeh
4be8e49e38 Merge pull request #4617 from aakso/wip-11367-pki-token-expire
rgw: always check if token is expired

Reviewed-by: Yehuda Sadeh <yehuda@redhat.com>
2015-05-21 14:09:38 -07:00
Ken Dreyer
11fef2275d doc: recommend opening entire 6800-7300 port range
Prior to this commit, the Network Configuration Reference guide and
Troubleshooting guide recommended opening only as many ports as the
number of daemons being run.

This doesn't really cover all use cases. Users can easily restart
daemons in ways that cause the daemons to bind to higher ports. This
leads to OSDs or MDSs binding to ports that are firewalled.

Update the Network Configuration Reference guide and Troubleshooting
guides to simply recommend that users open all the ports between 6800
and 7300 on their OSDs and MDSs.

http://tracker.ceph.com/issues/11688 Refs: #11688

Signed-off-by: Ken Dreyer <kdreyer@redhat.com>
2015-05-21 14:31:00 -06:00
Ken Dreyer
b50cc9472f doc: update OSD port range to 6800-7300
The upper limit for OSD/MDS ports changed from 7100 to 7300 in commit
f9ec5a7945. Update the Quick Start
Preflight documentation to reflect this change.

Signed-off-by: Ken Dreyer <kdreyer@redhat.com>
2015-05-21 13:00:32 -06:00
Yehuda Sadeh
98cdf03363 Merge pull request #4391 from nilamdyuti/wip-doc-ceph-object-gateway
doc: Removes references to s3gw.fcgi in simple gateway configuration file...

Reviewed-by: Yehuda Sadeh <yehuda@redhat.com>
2015-05-21 13:00:20 -04:00
Zhiqiang Wang
855a70d83e test/aio: aio completion is not released
Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
2015-05-21 14:58:20 +08:00
Dan Mick
783fdc7c3e Merge pull request #4517 from ceph/wip-11388-debian-argparse
#11388 debian: move ceph_argparse into ceph-common

Reviewed-by: Dan Mick <dmick@redhat.com>
2015-05-20 14:54:16 -07:00
Ilya Dryomov
8190f44f07 Merge pull request #4721 from ceph/wip-fix-concurrent.sh
Fix ceph.conf path in concurrent.sh - krbd qa suite

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
2015-05-20 20:54:55 +03:00
Ken Dreyer
110608e5bd debian: move ceph_argparse into ceph-common
Prior to this commit, if a user installed the "ceph-common" Debian
package without installing "ceph", then /usr/bin/ceph would crash
because it was missing the ceph_argparse library.

Ship the ceph_argparse library in "ceph-common" instead of "ceph". (This
was the intention of the original commit that moved argparse to "ceph",
2a23eac549)

http://tracker.ceph.com/issues/11388 Refs: #11388

Reported-by: Jens Rosenboom <j.rosenboom@x-ion.de>
Signed-off-by: Ken Dreyer <kdreyer@redhat.com>
2015-05-20 11:29:04 -06:00
Kefu Chai
8c65e2af29 Merge pull request #4720 from athanatos/wip-clarify-DBObjectMap-sync
DBObjectMap::sync: add comment clarifying locking

Reviewed-by: Kefu Chai <kchai@redhat.com>
2015-05-20 21:28:28 +08:00
Kefu Chai
4e272e5eb1 Merge pull request #3946 from tchaikov/randomize-scrub-time
osd: Randomize scrub time

Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
2015-05-20 21:21:13 +08:00
Kefu Chai
6344fc8393 osd: use another name for randomize scrub option
s/osd_scrub_interval_limit/osd_scrub_interval_randomize_ratio/

Fixes: #10973
Signed-off-by: Kefu Chai <kchai@redhat.com>
2015-05-20 18:23:21 +08:00
Kefu Chai
5e44040e85 osd: randomize scrub times to avoid scrub wave
- to avoid a scrub wave when osd_scrub_max_interval is reached on a
  high-load OSD, the scrub time is randomized.
- extract scrub_load_below_threshold() out of scrub_should_schedule()
- schedule an automatic scrub job at a time which is uniformly distributed
  over [now+osd_scrub_min_interval,
        now+osd_scrub_min_interval*(1+osd_scrub_time_limit)]. before
  this change, this sort of scrub was performed once the hard interval
  had ended or the system load was below the threshold; with this change, the
  jobs will be performed as long as the load is low or the interval of
  the scheduled scrubs is longer than conf.osd_scrub_max_interval. all
  automatic jobs should be performed in the configured time period, otherwise
  they are postponed.
- a requested scrub job is now scheduled right away; before this change
  it was queued with the timestamp of `now` and postponed until after
  osd_scrub_min_interval.

Fixes: #10973
Signed-off-by: Kefu Chai <kchai@redhat.com>
2015-05-20 18:23:21 +08:00
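
The scheduling window described in commit 5e44040e85 can be sketched in a few lines of Python. This is only an illustration of the interval arithmetic, not the OSD implementation; `schedule_scrub` and its parameter names are hypothetical, with `randomize_ratio` standing in for the ratio option named in the commit messages.

```python
import random


def schedule_scrub(now, min_interval, randomize_ratio, rng=random):
    """Pick a scrub time uniformly in
    [now + min_interval, now + min_interval * (1 + randomize_ratio)].
    """
    earliest = now + min_interval
    latest = now + min_interval * (1 + randomize_ratio)
    return rng.uniform(earliest, latest)


# Example values are arbitrary: a one-day minimum interval and a ratio of 0.5
# spread scheduled scrubs over a 12-hour window after the first day.
rng = random.Random(0)
day = 24 * 60 * 60
times = [schedule_scrub(0.0, day, 0.5, rng) for _ in range(5)]
print([round(t / 3600, 1) for t in times])  # hours from now
```

Because each job draws its own offset, jobs that became eligible at the same moment no longer all fire together, which is the "scrub wave" the commit aims to avoid.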
Kefu Chai
0f7f35670f osd: use __func__ in log messages
Signed-off-by: Kefu Chai <kchai@redhat.com>
2015-05-20 18:23:21 +08:00
Kefu Chai
2ab0e606df osd: simplify OSD::scrub_load_below_threshold() a little bit
avoid unnecessary comparison

Signed-off-by: Kefu Chai <kchai@redhat.com>
2015-05-20 18:23:21 +08:00
Haomai Wang
8ec7303b95 Merge pull request #4691 from varadakari/wip-kvs-objheader
KeyValueStore: optimize the object header writes

Reviewed-by: Haomai Wang <haomaiwang@gmail.com>
2015-05-20 16:29:21 +08:00
Vasu Kulkarni
f9e5b68b23 qa: unbreak concurrent.sh workunit
Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
2015-05-19 15:55:05 -04:00
Yan, Zheng
e585ddf43f Merge pull request #4658 from ceph/wip-11481
#11481: MDS resilience to weird mdsmaps
2015-05-19 16:03:52 +08:00
Josh Durgin
1b758c9945 Merge pull request #4722 from ceph/wip-rbd-xfstests-20150518
rbd: expunge xfstests generic/078

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2015-05-18 23:13:35 -07:00
Douglas Fuller
bf40b9b553 rbd: expunged xfstests generic/078
This tests RENAME_WHITEOUT, which was enabled for xfs in kernel commit
7dcf5c3e4527cfa2807567b00387cf2ed5e07f00. At first execution, it throws a BUG.
Subsequent executions appear to work correctly. This issue manifests for disks
and RBD instances.

Signed-off-by: Douglas Fuller <dfuller@redhat.com>
2015-05-18 17:37:00 -07:00
David Zafman
87433dabdd Merge pull request #4705 from stiopaa1/exit
cryptic error message in ceph interactive mode

Reviewed-by: David Zafman <dzafman@redhat.com>
2015-05-18 13:27:14 -07:00
Samuel Just
2eca53682f DBObjectMap::sync: add comment clarifying locking
Signed-off-by: Samuel Just <sjust@redhat.com>
2015-05-18 12:29:05 -07:00
Yan, Zheng
765ddaeaa0 Merge pull request #4715 from ceph/wip-11641
mds: fix handling missing mydir dirfrag
2015-05-19 00:28:01 +08:00
John Spray
9ed491989a mds: fix handling missing mydir dirfrag
This was broken by 96992466 aka "mds: handle missing mydir dirfrag"

The previous code was mistakenly treating a not-yet-loaded
dirfrag as a non-existent dirfrag, resulting in
inconsistent fragstats even when no objects had
actually been lost.

Fixes: #11641
Signed-off-by: John Spray <john.spray@redhat.com>
2015-05-18 16:15:07 +01:00
Haomai Wang
2863163cd5 Merge pull request #4693 from varadakari/wip-kvdb-prefix
KeyValueStore: Fix the prefix comparison to avoid object leaks.

Reviewed-by: Haomai Wang <haomaiwang@gmail.com>
2015-05-18 21:58:10 +08:00
Haomai Wang
0a087c1dae Merge pull request #4692 from varadakari/wip-kvs-iterator
wip-kvs-iterator

Reviewed-by: Haomai Wang <haomaiwang@gmail.com>
2015-05-18 21:56:11 +08:00
Kefu Chai
c1f4b7a257 Merge pull request #4703 from dachary/wip-make-check-verbose
tests: reduce make check verbosity

Reviewed-by: Kefu Chai <kchai@redhat.com>
2015-05-18 10:14:29 +08:00
Loic Dachary
e4ca4685e0 tests: reduce make check verbosity
Move check-local scripts

   src/test/run-cli-tests
   encode-decode-non-regression.sh
   test/encoding/readable.sh

to check_SCRIPTS. Their output is captured in .log files when running
with a recent automake. This reduces the output of make check by an
order of magnitude.

Signed-off-by: Loic Dachary <ldachary@redhat.com>
2015-05-18 00:41:16 +02:00
Loic Dachary
a0eac3e48c Merge pull request #4711 from dachary/wip-ceph-detect-init
ceph-detect-init typo

Reviewed-by: Michal Jarzabek <stiopa@gmail.com>
2015-05-17 22:55:31 +02:00
Loic Dachary
64f584a8e7 ceph-detect-init: fix pep8 extra space
Signed-off-by: Loic Dachary <ldachary@redhat.com>
2015-05-17 21:30:54 +02:00
Loic Dachary
855aeee697 ceph-detect-init: run-tox.sh always succeeds
Because of the | grep, the status of tox is no longer the status of
run-tox.sh and errors are not reported as they should be.

Signed-off-by: Loic Dachary <ldachary@redhat.com>
2015-05-17 21:29:25 +02:00
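
The problem described in commit 855aeee697 is that a pipeline reports the exit status of its last command. The Python sketch below shows the effect and one possible way around it. It assumes a POSIX `sh` is available; `pipeline_status` and `run_filtered` are hypothetical helpers and are not how run-tox.sh was actually fixed.

```python
import subprocess
import sys


def pipeline_status(command):
    """Exit status of a shell command line, as the calling script sees it."""
    return subprocess.run(["sh", "-c", command]).returncode


def run_filtered(cmd, keep):
    """Run cmd, print only output lines containing `keep`, and return
    the exit status of cmd itself rather than that of the filter."""
    proc = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    for line in proc.stdout.splitlines():
        if keep in line:
            print(line)
    return proc.returncode


# `false` fails, but the pipeline reports the status of `cat`.
masked = pipeline_status("false | cat")

# A stand-in for a failing tool: prints one line, then exits with status 3.
failing = [sys.executable, "-c", "import sys; print('ERROR: boom'); sys.exit(3)"]
preserved = run_filtered(failing, "ERROR")

print(masked, preserved)
```

Filtering the output in the caller, rather than in a pipe, keeps the tool's own status available so a failure can still be reported.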