Commit Graph

41832 Commits

Author SHA1 Message Date
John Spray
38a319d515 qa/cephtool: add blacklist json output check
...not very elegantly because this is bash, but
at least check the expected value is somewhere
present in the JSON output.

Signed-off-by: John Spray <john.spray@redhat.com>
2015-05-26 10:58:32 +01:00
John Spray
8ef6f8600e osd: fix blacklist field in OSDMap::dump
This was using an array_section so we were getting
a list of only the times, instead of an array
mapping addr to time.

Signed-off-by: John Spray <john.spray@redhat.com>
2015-05-26 10:58:32 +01:00
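The difference between the two dump shapes, and the "expected value is somewhere in the JSON" check from the qa commit, can be sketched in Python. The address, the expiry time, and the `blacklist` key name are illustrative assumptions, not captured `ceph osd dump` output, and `blacklist_has_addr` is a hypothetical helper.

```python
import json

# Hypothetical address and expiry time, for illustration only.
ADDR = "192.168.0.1:0/1234"
EXPIRY = "2015-05-26 11:58:32.000000"

# Shape obtained when the field is dumped as an array: only the times survive.
as_array = json.dumps({"blacklist": [EXPIRY]})

# Shape obtained when the field is dumped as an object: addr maps to time.
as_object = json.dumps({"blacklist": {ADDR: EXPIRY}})


def blacklist_has_addr(dump_json, addr):
    """Return True if addr appears as a key of the blacklist mapping."""
    blacklist = json.loads(dump_json).get("blacklist")
    return isinstance(blacklist, dict) and addr in blacklist
```

With the array shape the address is lost entirely, so a check keyed on the address can only pass against the object shape.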
Haomai Wang
4cc0f2f42c KeyValueStore: Add collect_metadata support
Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
2015-05-26 12:40:19 +08:00
Haomai Wang
7b5fc50005 KeyValueStore: Avoid extra lookup for map
Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
2015-05-26 12:34:08 +08:00
Haomai Wang
e805b944b9 Merge pull request #4757 from xinxinsh/wip-kv-cleanup
os : remove GenericObjectMap::sync() function

Reviewed-by: Haomai Wang <haomaiwang@gmail.com>
2015-05-26 11:57:10 +08:00
Yan, Zheng
70198708c5 Merge pull request #4755 from ceph/wip-11752
mds: fix use-after-free in SessionMap::remove_session
2015-05-26 09:18:05 +08:00
xinxin shu
81130516f1 os : remove unused GenericObjectMap::sync() function since no caller invokes it
Signed-off-by: xinxin shu <xinxin.shu@intel.com>
2015-05-26 03:38:50 +08:00
Loic Dachary
db7936ae1c erasure-code: implement consistent error stream
The error stream in the erasure code path is broken and the error
message is sometimes not reported back to the user. For instance the
ErasureCodePlugin::factory method has no error stream: when an error
happens the user is left with a cryptic error code that has to be
looked up in the sources to be understood.

The error stream is made more systematic by:

  * always pass it as ostream *ss (instead of sometimes passing it as
    a reference and sometimes as a stringstream)

  * ostream *ss is added to ErasureCodePlugin::factory

  * define the ErasureCodeInterface::init pure virtual. It is
    already implemented by all plugins, only in slightly different
    ways. The ostream *ss is added so the init function has a way to
    report errors in a human readable way to the caller, in addition to
    the error code.

The ErasureCodePluginJerasure::init return value was incorrectly ignored
when called from ErasureCodePluginJerasure::factory; the factory now
returns when init fails.

The ErasureCodeLrc::layers_init method is given ostream *ss for error
messages instead of printing them via derr.

The ErasureCodePluginLrc::factory method no longer prints errors via
derr: this workaround is made unnecessary by the ostream *ss argument.

The ErasureCodeShec::init ostream *ss argument is ignored. The
ErasureCodeShec::parse method entirely relies on derr to report errors
and converting it goes beyond the scope of this cleanup. There is a
slight risk of getting it wrong and it deserves a separate commit and
careful and independent review.

The PGBackend, OSDMonitor.{cc,h} changes are only about prototype
changes.

Signed-off-by: Loic Dachary <ldachary@redhat.com>
2015-05-25 16:59:02 +02:00
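The pattern described in this commit (an error stream passed alongside the error code, and a factory that does not ignore the init return value) can be sketched as a Python analogue. This is not the Ceph C++ code: `ToyCodec`, `factory`, and the mandatory `k` entry are hypothetical names chosen for illustration, with `io.StringIO` standing in for `ostream *ss`.

```python
import errno
import io


class ToyCodec:
    """Hypothetical stand-in for an erasure code plugin instance."""

    def init(self, profile, ss):
        # Report a human readable message on ss in addition to the error code.
        if "k" not in profile:
            ss.write("profile is missing the mandatory 'k' entry\n")
            return -errno.EINVAL
        return 0


def factory(profile, ss):
    """Return (error_code, instance); instance is None on failure."""
    codec = ToyCodec()
    r = codec.init(profile, ss)
    if r != 0:
        # Do not ignore the init return value: propagate it to the caller
        # and hand back no instance.
        return r, None
    return 0, codec


# Usage: the caller gets both the code and an explanation.
ss = io.StringIO()
r, codec = factory({}, ss)
```

After the failing call, `ss.getvalue()` holds the explanation, so the caller no longer has to map a bare error code back to the sources.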
Loic Dachary
0822922566 erasure-code: do not leak shec instance on failure
If the shec plugin fails to initialize the instance, it must be deleted
before returning to the caller, otherwise it will be leaked.

Signed-off-by: Loic Dachary <ldachary@redhat.com>
2015-05-25 16:59:02 +02:00
Loic Dachary
6ca60061bb erasure-code: lrc size test depends on layer semantic
When the lrc layers are defined, the semantics of the D, c and _
characters are defined; the rest is undefined. The test that verifies
the guard against layers of different size uses the A character which
is undefined. Depending on the implementation, the size test could fail
because the A character is undefined, once a guard forbidding undefined
characters is added. Replace A with D to make sure the undefined
character A will not interfere with the test.

This may seem like nitpicking but it actually caused problems after a
code refactor that will appear in a few commits from here.

Signed-off-by: Loic Dachary <ldachary@redhat.com>
2015-05-25 16:59:02 +02:00
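Why the undefined A character can mask the size check is easier to see in a small Python sketch. `check_layers` is a hypothetical validator, not the ErasureCodeLrc implementation; the order of its two guards is an assumption, and it expects a non-empty list of layers.

```python
DEFINED = set("Dc_")  # the only characters with defined semantics


def check_layers(layers):
    """Return an error string, or None if the layers are acceptable."""
    size = len(layers[0])
    for position, layer in enumerate(layers):
        undefined = set(layer) - DEFINED
        if undefined:
            return "layer %d uses undefined characters %s" % (
                position, sorted(undefined))
        if len(layer) != size:
            return "layer %d has size %d instead of %d" % (
                position, len(layer), size)
    return None
```

With layers written using A, the character guard fires before the size guard is ever reached, so a test meant to exercise the size guard ends up exercising something else. Writing the same layers with D reaches the size guard as intended.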
Loic Dachary
21036cf086 erasure-code: define the ErasureCodeProfile type
Instead of map<string,string>. Make it non-const when initializing
an ErasureCodeInterface instance so that it can be modified.

Rename parameters into profile for consistency with the user
documentation. The parameters name was chosen before the user interface
was defined. This cosmetic update is made in the context of larger
functional changes to improve error reporting and user interface
consistency.

The init() methods are made to accept non-const parameters. It is
desirable for them to be able to modify the profile so that it
accurately reflects the values that are used. The caller may use this
information for better error reporting.

Signed-off-by: Loic Dachary <ldachary@redhat.com>
2015-05-25 16:49:44 +02:00
Ilya Dryomov
52440c4b97 rbd: document mount_timeout in the man page
With "rbd: timeout watch teardown on unmap with mount_timeout" going
into kernel 4.2, document its effect in the man page.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2015-05-25 15:41:51 +03:00
Yehuda Sadeh
51bf619b5e Merge pull request #4745 from jmunhoz/object-copy-bug
rgw: Use attrs from source bucket on copy

Reviewed-by: Yehuda Sadeh <yehuda@redhat.com>
2015-05-25 00:05:11 -07:00
Yan, Zheng
2daaa61bc2 mds: fix use-after-free in SessionMap::remove_session
Fixes: #11752
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-05-25 11:46:06 +08:00
Loic Dachary
d49d8166a0 Merge pull request #4709 from dachary/wip-shec-corpus
erasure-code: update ceph-erasure-code-corpus for shec
2015-05-23 10:54:12 +02:00
Loic Dachary
4507cb235d Merge pull request #4751 from islepnev/wip-11612
Support NVMe device partitions by ceph-disk

Reviewed-by: Loic Dachary <ldachary@redhat.com>
2015-05-23 09:05:03 +02:00
islepnev
9b62cf254d ceph-disk: support NVMe device partitions
Linux nvme kernel module v0.9 enumerates devices as follows:

/dev/nvme0 - character device
/dev/nvme0n1 - whole block device
/dev/nvme0n1p1 - first partition
/dev/nvme0n1p2 - second partition

http://tracker.ceph.com/issues/11612 Fixes: #11612

Signed-off-by: Ilja Slepnev <islepnev@gmail.com>
2015-05-23 00:51:45 +03:00
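The naming scheme listed in this commit can be parsed with a short regular expression. The following is a minimal Python sketch under the assumption that only the four path forms above need to be recognised; `split_nvme_path` is a hypothetical helper, not the ceph-disk implementation.

```python
import re

# Whole namespace block device, e.g. /dev/nvme0n1, optionally followed by pN.
NVME_RE = re.compile(r"^(/dev/nvme\d+n\d+)(?:p(\d+))?$")


def split_nvme_path(path):
    """Return (whole_device, partition_number or None), or None if no match."""
    match = NVME_RE.match(path)
    if match is None:
        return None
    base, part = match.groups()
    return (base, int(part) if part is not None else None)
```

The character device `/dev/nvme0` deliberately does not match, since it is neither a whole block device nor a partition.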
Kefu Chai
f36344f07f Merge pull request #4736 from tchaikov/wip-11693-only-restart-crashed-osds
test/test-erasure-code: spin off EIO tests to avoid lingering OSDs after tests finish

Reviewed-by: Loic Dachary <ldachary@redhat.com>
2015-05-23 03:56:34 +08:00
Josh Durgin
87ec95fb37 Merge pull request #4748 from ceph/wip-11562
dev/rbd-diff: clarify encoding of image size

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2015-05-22 12:43:46 -07:00
Kefu Chai
f417eda9a4 tests/test-erasure-code: spin off eio tests into another testsuite
* since the eio tests crash some of the OSD nodes, before the
  change, the tests tried to undo the crash before moving on, so it
  wouldn't interfere with following tests. a more robust/clean way to
  do this is to isolate individual tests in a sandbox, so each eio
  test will have its own:
    setup + inject + verify crash + teardown
  cycle. this change helps to remove the cleanup/undo steps in
  individual tests.
* update the disabled tests accordingly.
* use a minimum set of OSDs and R-S(2,1) for the testing to speed
  up the test.
* add the new testsuite to check_SCRIPTS

Fixes: #11693
Signed-off-by: Kefu Chai <kchai@redhat.com>
2015-05-23 03:22:41 +08:00
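The per-test cycle described in this commit can be illustrated with a Python context manager. This is only an analogue of the shell test suite: `sandbox`, `run_eio_test`, and `run_suite` are hypothetical names, and the log entries stand in for real cluster operations.

```python
from contextlib import contextmanager


@contextmanager
def sandbox(log):
    """Hypothetical per-test cluster: set up before, torn down after."""
    log.append("setup")
    try:
        yield
    finally:
        # Teardown always runs, so a crashed OSD never leaks into the
        # next test and no per-test undo step is needed.
        log.append("teardown")


def run_eio_test(name, log):
    with sandbox(log):
        log.append("inject " + name)
        log.append("verify crash " + name)


def run_suite(names):
    log = []
    for name in names:
        run_eio_test(name, log)
    return log
```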
Kefu Chai
2230deffce tests: fix the get_config()
* the "daemon" parameter was not respected.
* update the test_get_config() to check the overridden option instead of
  the default one.
* add set_config()

Signed-off-by: Kefu Chai <kchai@redhat.com>
2015-05-23 03:19:23 +08:00
Loic Dachary
09a6457296 Merge pull request #4749 from ddiss/ceph_disk_test_fix
tests: don't choke on deleted losetup paths

Reviewed-by: Loic Dachary <ldachary@redhat.com>
2015-05-22 20:45:16 +02:00
Casey Bodley
33eae4ec2f xio: fix reuse of outer loop index in inner loop
Reported-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Casey Bodley <casey@cohortfs.com>
2015-05-22 11:09:43 -07:00
Casey Bodley
367a5fccf2 cmake: add missing source file to test_librbd
Signed-off-by: Casey Bodley <casey@cohortfs.com>
2015-05-22 11:09:37 -07:00
Casey Bodley
a8fca3c212 cmake: add missing common/util.cc dependency
Signed-off-by: Casey Bodley <casey@cohortfs.com>
2015-05-22 11:09:33 -07:00
Casey Bodley
15dd70cd5a cmake: skip man/CMakeLists.txt
man pages have to be preprocessed now, and can't be installed directly.
skip installing them until we add the cmake-fu to copy what man/Makefile.am
is doing

Signed-off-by: Casey Bodley <casey@cohortfs.com>
2015-05-22 11:09:29 -07:00
David Disseldorp
7c1bae5624 tests: don't choke on deleted losetup paths
If a file has been deleted with a loopback device attached, then the
`losetup --all` output will carry:
/dev/loopX: [0032]:344213 (/.../src/test-ceph-disk/vdf.disk (deleted))

This causes the losetup parsing in reset_leftover_dev() to throw an
error, e.g.:
rreset_leftover_dev: 430: test
'(/home/ddiss/ceph/src/test-ceph-disk/vdf.disk' '(deleted))' =
'(/home/ddiss/ceph/src/test-ceph-disk/vdf.disk)'
test/ceph-disk.sh: line 430: test: too many arguments

Fix this by quoting the path variable for the string comparison.

Signed-off-by: David Disseldorp <ddiss@suse.de>
2015-05-22 17:22:51 +02:00
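The failure comes from shell word splitting: an unquoted variable holding `(path (deleted))` expands to two words. The sketch below models that in Python, using `str.split()` as a stand-in for default whitespace splitting (glob expansion is ignored). The path is a made-up example and both helper functions are hypothetical.

```python
# Path field as it appears in `losetup --all` once the backing file is deleted.
field = "(/tmp/vdf.disk (deleted))"
wanted = "(/tmp/vdf.disk)"


def args_unquoted(value, expected):
    # Models: test $value = $expected
    return value.split() + ["="] + expected.split()


def args_quoted(value, expected):
    # Models: test "$value" = "$expected"
    return [value, "=", expected]
```

The unquoted form hands `test` four arguments instead of three, which is what triggers "too many arguments"; the quoted form always passes exactly three.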
Jason Dillaman
f9ba711c30 dev/rbd-diff: clarify encoding of image size
Fixes: #11562
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2015-05-22 11:18:09 -04:00
Loic Dachary
7defc06962 Merge pull request #4512 from hjwsm1989/init-ceph
init-ceph.in: Create osd data dir if it doesn't exist.

Reviewed-by: Loic Dachary <ldachary@redhat.com>
2015-05-22 16:19:28 +02:00
Loic Dachary
9f0d2da72f Merge pull request #4740 from ktdreyer/wip-11688-doc-firewall-ports
#11688: doc: update OSD/MDS firewall port list

Reviewed-by: Loic Dachary <ldachary@redhat.com>
2015-05-22 14:32:04 +02:00
Kefu Chai
6a7aa6c552 Merge pull request #4734 from wonzhq/aio-completion
test/aio: fix the leak of aio completion

Reviewed-by: Kefu Chai <kchai@redhat.com>
2015-05-22 17:57:48 +08:00
Kefu Chai
c2c36bc977 Merge pull request #4738 from dachary/wip-11618-osd-create-dup
tests: ceph create may consume more than one id

Reviewed-by: Joao Eduardo Luis <joao@suse.de>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2015-05-22 16:55:04 +08:00
Loic Dachary
ab8e9e39c8 tests: CEPH_CLI_TEST_DUP_COMMAND=1 for qa/workunits/cephtool/test.sh
Run cephtool-test-{mon,osd,mds}.sh with CEPH_CLI_TEST_DUP_COMMAND=1 to
detect idempotency related problems during make check. This is how
ceph-qa-suite/tasks/workunit.py will run
suites/rados/singleton/all/cephtool.yaml and it's easier to fix when
make check fails rather than later on when a fully populated rados suite
has one failed job.

http://tracker.ceph.com/issues/11618 Refs: #11618

Signed-off-by: Loic Dachary <ldachary@redhat.com>
2015-05-22 10:16:37 +02:00
Loic Dachary
5c69f5e15f tests: ceph create may consume more than one id
When CEPH_CLI_TEST_DUP_COMMAND=1 is set, ceph osd create will consume
two osd ids and return the latter. Fix the test to account for that and
not assume the osd id being allocated by osd create is always the
next available osd id.

The other osd create tests do not suffer from the same variation because
they provide a UUID argument that guarantees the same osd id is going to
be returned every time.

http://tracker.ceph.com/issues/11618 Fixes: #11618

Signed-off-by: Loic Dachary <ldachary@redhdat.com>
2015-05-22 10:16:24 +02:00
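The behaviour this commit works around can be modelled in a few lines of Python. `ToyMon` and `run_duplicated` are hypothetical; they only mimic the id allocation described in the message and are not the monitor's logic.

```python
import itertools


class ToyMon:
    """Hypothetical model of osd id allocation."""

    def __init__(self):
        self._next_id = itertools.count()
        self._by_uuid = {}

    def osd_create(self, uuid=None):
        # With a UUID the command is idempotent: same UUID, same id.
        if uuid is not None and uuid in self._by_uuid:
            return self._by_uuid[uuid]
        osd_id = next(self._next_id)
        if uuid is not None:
            self._by_uuid[uuid] = osd_id
        return osd_id


def run_duplicated(command):
    """Mimic CEPH_CLI_TEST_DUP_COMMAND=1: run twice, keep the last result."""
    command()
    return command()
```

Without a UUID, a duplicated call on a fresh `ToyMon` consumes ids 0 and 1 and returns 1, so a test cannot assume it receives the next available id. With a UUID, the second call returns the id allocated by the first.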
Javier M. Mellid
1dac80df1d rgw: Use attrs from source bucket on copy
When copying objects, if the source bucket is the same as the
destination, use attrs from the source bucket.

Fixes: #11639

Signed-off-by: Javier M. Mellid <jmunhoz@igalia.com>
2015-05-22 09:41:01 +02:00
Yehuda Sadeh
4be8e49e38 Merge pull request #4617 from aakso/wip-11367-pki-token-expire
rgw: always check if token is expired

Reviewed-by: Yehuda Sadeh <yehuda@redhat.com>
2015-05-21 14:09:38 -07:00
Ken Dreyer
11fef2275d doc: recommend opening entire 6800-7300 port range
Prior to this commit, the Network Configuration Reference guide and
Troubleshooting guide recommended opening a number of ports that
depended on the number of daemons being run.

This doesn't really cover all use cases. Users can easily restart
daemons in ways that cause the daemons to bind to higher ports. This
leads to OSDs or MDSs binding to ports that are firewalled.

Update the Network Configuration Reference guide and Troubleshooting
guides to simply recommend that users open all the ports between 6800
and 7300 on their OSDs and MDSs.

http://tracker.ceph.com/issues/11688 Refs: #11688

Signed-off-by: Ken Dreyer <kdreyer@redhat.com>
2015-05-21 14:31:00 -06:00
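The gap between the two recommendations can be shown with a small Python sketch. Only the 6800 and 7300 bounds come from the commit message; the number of ports per daemon and the helper names are assumptions made for illustration.

```python
FIRST_PORT, LAST_PORT = 6800, 7300  # range named in the commit message


def per_daemon_ports(daemon_count, ports_per_daemon):
    # Old style rule: open just enough ports for the daemons expected to run.
    return range(FIRST_PORT, FIRST_PORT + daemon_count * ports_per_daemon)


def full_range_ports():
    # New recommendation: open the whole range (both ends treated as inclusive).
    return range(FIRST_PORT, LAST_PORT + 1)
```

For example, with three daemons and an assumed four ports each, a daemon that rebinds to port 6812 after a restart falls outside the per-daemon rule but is still covered by the full range.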
Ken Dreyer
b50cc9472f doc: update OSD port range to 6800-7300
The upper limit for OSD/MDS ports changed from 7100 to 7300 in commit
f9ec5a7945. Update the Quick Start
Preflight documentation to reflect this change.

Signed-off-by: Ken Dreyer <kdreyer@redhat.com>
2015-05-21 13:00:32 -06:00
Yehuda Sadeh
98cdf03363 Merge pull request #4391 from nilamdyuti/wip-doc-ceph-object-gateway
doc: Removes references to s3gw.fcgi in simple gateway configuration file...

Reviewed-by: Yehuda Sadeh <yehuda@redhat.com>
2015-05-21 13:00:20 -04:00
Casey Bodley
3dda5faf75 xio: malloc if xio_mempool_alloc fails
Signed-off-by: Casey Bodley <casey@cohortfs.com>
2015-05-21 07:04:36 -07:00
Casey Bodley
5c14a69395 xio: fix for xio_msg release after teardown
The xio_msg pointers to be freed in XioPortal::release_xio_rsp() are no
longer valid after a call to xio_connection_destroy(). We were already
avoiding the call to xio_release_msg() in this case, but were still
dereferencing the xio_msg for its user_context pointer. Moved the check
for is_connected() outside of the loop to avoid any access to msg.

Suggested-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Casey Bodley <casey@cohortfs.com>
2015-05-21 07:04:26 -07:00
Casey Bodley
16d1c1e97d xio: use ceph clock for timestamps
accelio is using rdtsc to generate xio_msg.timestamp, which can't be
reliably converted to a timeval. now uses ceph_clock_now() to assign
the Message::recv_stamp and recv_complete_stamp

Signed-off-by: Casey Bodley <casey@cohortfs.com>
2015-05-21 07:00:12 -07:00
Vu Pham
c2bba8ebee xio: save nonce for bind address
A missing nonce in the osd addrs was preventing the monitor from
detecting osd restarts. XioMessenger::bind() now sets the nonce in the
same way that SimpleMessenger and AsyncMessenger do

Signed-off-by: Casey Bodley <casey@cohortfs.com>
Signed-off-by: Vu Pham <vu@mellanox.com>
2015-05-21 06:59:59 -07:00
Casey Bodley
355aa0e44b xio: check if connection is on list before erasing
also removed the extra conditional put() in on_disconnect_event()

Signed-off-by: Casey Bodley <casey@cohortfs.com>
2015-05-21 06:59:46 -07:00
Vu Pham
bb621b074d xio: better way to assign connections to specific lane
Better way to assign connections to a specific lane of a portal,
avoiding lane competition/hogging.
This change resolves the slow ramp-up and spiky behavior seen when
clients start and run I/O.

Signed-off-by: Vu Pham <vu@mellanox.com>
2015-05-21 06:59:36 -07:00
Zhiqiang Wang
855a70d83e test/aio: aio completion is not released
Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
2015-05-21 14:58:20 +08:00
Dan Mick
783fdc7c3e Merge pull request #4517 from ceph/wip-11388-debian-argparse
#11388 debian: move ceph_argparse into ceph-common

Reviewed-by: Dan Mick <dmick@redhat.com>
2015-05-20 14:54:16 -07:00
Ilya Dryomov
8190f44f07 Merge pull request #4721 from ceph/wip-fix-concurrent.sh
Fix ceph.conf path in concurrent.sh - krbd qa suite

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
2015-05-20 20:54:55 +03:00
Ken Dreyer
110608e5bd debian: move ceph_argparse into ceph-common
Prior to this commit, if a user installed the "ceph-common" Debian
package without installing "ceph", then /usr/bin/ceph would crash
because it was missing the ceph_argparse library.

Ship the ceph_argparse library in "ceph-common" instead of "ceph". (This
was the intention of the original commit that moved argparse to "ceph",
2a23eac549)

http://tracker.ceph.com/issues/11388 Refs: #11388

Reported-by: Jens Rosenboom <j.rosenboom@x-ion.de>
Signed-off-by: Ken Dreyer <kdreyer@redhat.com>
2015-05-20 11:29:04 -06:00
Kefu Chai
8c65e2af29 Merge pull request #4720 from athanatos/wip-clarify-DBObjectMap-sync
DBObjectMap::sync: add comment clarifying locking

Reviewed-by: Kefu Chai <kchai@redhat.com>
2015-05-20 21:28:28 +08:00