Commit Graph

46007 Commits

Author SHA1 Message Date
Vicente Cheng
b954c519e8 tests: ceph-disk: add wait_for_osd_down() in ceph-disk-test.py of qa
- add wait_for_osd_down() to avoid the side effect of deactivate

Signed-off-by: Vicente Cheng <freeze.bilsted@gmail.com>
2015-11-17 09:24:43 +08:00
Vicente Cheng
0f892e65a4 tests: ceph-disk: modify the ceph-disk qa test cases
- minor corrections for the latest ceph-disk qa test cases

Signed-off-by: Vicente Cheng <freeze.bilsted@gmail.com>
2015-11-17 09:24:43 +08:00
Vicente Cheng
c110a63382 tests: ceph-disk: improve the unit tests to cover all deactivate/destroy functions.
- rework the unit tests to cover all deactivate/destroy functions
  - make test items simpler and easier to read

Signed-off-by: Vicente Cheng <freeze.bilsted@gmail.com>
2015-11-17 09:24:43 +08:00
Vicente Cheng
3823c31f56 ceph-disk: improve the device query stage on deactivate/destroy feature.
- Make the code path for getting device info much simpler (at the cost of a little overhead)
  - Let some errors raise the correct exception
  - for dmcrypt devices, unmap at the deactivate stage (consistent with activate)

Signed-off-by: Vicente Cheng <freeze.bilsted@gmail.com>
2015-11-17 09:24:43 +08:00
Vicente Cheng
7e88cf005f tests: ceph-disk: add deactivate/reactivate/destroy test cases.
- using the deactivate/destroy feature to destroy osd
  - test reactivate option when the osd goes deactive
  - add check_osd_status to check osd status when osd goes up

Signed-off-by: Vicente Cheng <freeze.bilsted@gmail.com>
2015-11-17 09:24:43 +08:00
Vicente Cheng
f51dd57b5d tests: ceph-disk: add some unittest functions to cover the destroy/deactivate feature.
- Add new unittest functions to cover the dmcrypt/mpath handling.

Signed-off-by: Vicente Cheng <freeze.bilsted@gmail.com>
2015-11-17 09:24:43 +08:00
Vicente Cheng
56b3bbcd4c ceph-disk: modify the destroy/deactivate behavior to handle dmcrypt/mpath feature.
- modify the behavior to handle dmcrypt/mpath osd
  - add some functions to get the information of dmcrypt osd
  - fix the logging format: use `%(filename)s` instead of `%(filename)`

Signed-off-by: Vicente Cheng <freeze.bilsted@gmail.com>
2015-11-17 09:24:43 +08:00
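
The logging fix in 56b3bbcd4c depends on how Python's %-style formatting parses a mapping key: `%(filename)` must be followed by a conversion type such as `s`, otherwise the format string is malformed. A minimal sketch of the difference, using the plain `%` operator rather than ceph-disk's actual logging setup; the `record` values and both format strings are assumptions made up for illustration.

```python
# Hypothetical field values; ceph-disk's real log records are not shown in this log.
record = {"filename": "ceph-disk", "levelname": "ERROR", "message": "osd.3 is still up"}

GOOD = "%(filename)s: %(levelname)s: %(message)s"  # conversion type 's' present
BAD = "%(filename): %(levelname)s: %(message)s"    # conversion type missing


def render(fmt, fields):
    """Apply %-style formatting and report a malformed format instead of raising."""
    try:
        return fmt % fields
    except ValueError as exc:
        return "format error: %s" % exc


print(render(GOOD, record))
print(render(BAD, record))
```

`logging.Formatter` uses the same %-style syntax by default, so a format string missing the trailing `s` can fail in the same way when a record is formatted.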
Vicente Cheng
d490fe99c6 tests: ceph-disk: make unit tests cover the whole ceph-disk destroy/deactivate feature
- Add some unit tests to cover the whole destroy/deactivate feature.
  - Make some minor modifications to ceph-disk

Signed-off-by: Vicente Cheng <freeze.bilsted@gmail.com>
2015-11-17 09:24:43 +08:00
Vicente Cheng
06aeec9e5e tests: ceph-disk: modified the destroy_osd test function.
- use the new implementation (ceph-disk deactivate/destroy) instead of step-by-step removal

Signed-off-by: Vicente Cheng <freeze.bilsted@gmail.com>
2015-11-17 09:24:43 +08:00
Vicente Cheng
7922041080 ceph-disk: Implement unittest for ceph-disk deactivate/destroy feature
- The unit tests cover the whole deactivate/destroy implementation.

Signed-off-by: Vicente Cheng <freeze.bilsted@gmail.com>
2015-11-17 09:24:43 +08:00
Vicente Cheng
7b5151a4d1 ceph-disk: add --reactivate option, modify parameter about deactivate and destroy
- add `--reactivate` option (when the deactive flag is set, activate is a no-op unless `--reactivate` is given)
  - for consistency, make both deactivate and destroy take the device/partition name
  - add `--deactivate-by-id` to deactivate and destroy for ease of use

Signed-off-by: Vicente Cheng <freeze.bilsted@gmail.com>
2015-11-17 09:24:43 +08:00
Vicente Cheng
7ca8d1daed ceph-disk: add --mark-out option on deactivate feature.
- Use the `--mark-out` option to mark the osd out when
    deactivating it, instead of always marking the osd out.

Signed-off-by: Vicente Cheng <freeze.bilsted@gmail.com>
2015-11-17 09:24:43 +08:00
Vicente Cheng
be471a2994 ceph-disk: use ceph osd dump to check osd status
- Before deactivating an osd, we need to know its status,
    including IN/OUT and UP/DOWN.

Signed-off-by: Vicente Cheng <freeze.bilsted@gmail.com>
2015-11-17 09:24:43 +08:00
Vicente Cheng
f064622103 ceph-disk: add destroy feature
Implement destroy option on ceph-disk.

  - remove OSD from CRUSH map
  - remove OSD cephx key
  - deallocate OSD ID
  - destroy data (with --zap option)

Signed-off-by: Vicente Cheng <freeze.bilsted@gmail.com>
2015-11-17 09:24:43 +08:00
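
The destroy steps listed in f064622103 correspond roughly to a short sequence of manual CLI calls. The sketch below only builds the command lines and does not run them. It assumes the usual `ceph osd crush remove`, `ceph auth del`, `ceph osd rm`, and `ceph-disk zap` commands; the log does not show what ceph-disk invokes internally, and `destroy_commands` is a hypothetical helper.

```python
def destroy_commands(osd_id, device=None, zap=False):
    """Return the manual CLI equivalents of the destroy steps (not executed here)."""
    name = "osd.%d" % osd_id
    commands = [
        ["ceph", "osd", "crush", "remove", name],  # remove OSD from CRUSH map
        ["ceph", "auth", "del", name],             # remove OSD cephx key
        ["ceph", "osd", "rm", str(osd_id)],        # deallocate OSD ID
    ]
    if zap:
        if device is None:
            raise ValueError("zap requires a device path")
        commands.append(["ceph-disk", "zap", device])  # destroy data
    return commands


# Hypothetical OSD id and device path, for illustration only.
for cmd in destroy_commands(3, device="/dev/sdb", zap=True):
    print(" ".join(cmd))
```

Returning argument lists instead of running them keeps the sketch safe to execute anywhere; each list could be passed to `subprocess.check_call` on a real cluster node.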
Vicente Cheng
3fcdf411d4 ceph-disk: add deactivate feature
Implement deactivate option on ceph-disk.

  - stop ceph-osd service if needed (if the osd is still in the osd map, mark it out first)
  - remove 'ready', 'active', and INIT-specific files
  - remove gpt partition type and change partition name (so that udev does not trigger on it)
  - create deactive flag
  - umount device and remove mount point

Signed-off-by: Vicente Cheng <freeze.bilsted@gmail.com>
2015-11-17 09:24:42 +08:00
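
The marker-file part of the deactivate steps in 3fcdf411d4 can be sketched as follows. This covers only removing 'ready', 'active', and the init-specific file and creating the flag; stopping the service, changing the partition, and unmounting are left out. The flag file name `deactive`, the init marker name `systemd`, and the helper `mark_deactivated` are assumptions, not taken from the ceph-disk source.

```python
import os
import tempfile


def mark_deactivated(osd_dir, init_marker="systemd"):
    """Remove the activation marker files and create a 'deactive' flag file."""
    for name in ("ready", "active", init_marker):
        try:
            os.remove(os.path.join(osd_dir, name))
        except FileNotFoundError:
            pass  # marker already absent, nothing to do
    flag = os.path.join(osd_dir, "deactive")  # assumed flag file name
    with open(flag, "w"):
        pass  # an empty file is enough to act as a flag
    return flag


# Demonstrate on a throwaway directory standing in for the OSD data dir.
with tempfile.TemporaryDirectory() as osd_dir:
    for name in ("ready", "active", "systemd", "keyring"):
        open(os.path.join(osd_dir, name), "w").close()
    mark_deactivated(osd_dir)
    print(sorted(os.listdir(osd_dir)))
```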
Jason Dillaman
00a9ce74ae tests: fix typo in TestClsRbd.snapshots test case
Fixes: #13727
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 3b39226709)
2015-11-16 14:33:25 -08:00
Josh Durgin
6bf5c8755d Merge pull request #6607 from dillaman/wip-13784
rbd: support negative boolean command-line optionals

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2015-11-16 11:21:38 -08:00
Josh Durgin
b66af10f8b Merge pull request #6606 from dillaman/wip-13806
rbd: add missing command aliases to refactored CLI

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2015-11-16 11:20:58 -08:00
Jason Dillaman
5aa840af0a rbd: support negative boolean command-line optionals
Fixes: #13784
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2015-11-16 13:31:34 -05:00
Sage Weil
6df48f8e38 os/newstore: disable rocksdb compression
This has been shown to be problematic for performance on the
monitor.  Note that this takes us from ~170 bytes per onode to
~540 bytes per onode.  (The encoded onode_t is 390 bytes, not
including the key name.)

Signed-off-by: Sage Weil <sage@redhat.com>
2015-11-16 13:27:25 -05:00
Sage Weil
d885489062 common: mirror leveldb default tuning w/ rocksdb
Signed-off-by: Sage Weil <sage@redhat.com>
2015-11-16 13:27:25 -05:00
Sage Weil
ae516d7d63 mon: disabled rocksdb compression when used as the backend
This significantly reduced CPU utilization on the bigbang scale
testing cluster at CERN.  Note that it is already disabled for
leveldb by default (in ceph_mon.cc).

Signed-off-by: Sage Weil <sage@redhat.com>
2015-11-16 13:27:24 -05:00
Sage Weil
ba60bf05b0 os/fs/FS: fix zero()'s PUNCH_HOLE incantation
We get EOPNOTSUPP unconditionally without KEEP_SIZE.

Signed-off-by: Sage Weil <sage@redhat.com>
2015-11-16 13:23:16 -05:00
Sage Weil
2d921138f6 os/fs/FS: fix zero() return value on fallback
Signed-off-by: Sage Weil <sage@redhat.com>
2015-11-16 13:23:15 -05:00
Sage Weil
eddb00bd4f os/RocksDBStore: set up $path.wal -> $path symlink
If $path.wal doesn't exist, create it and symlink it to $path.
Set wal_dir to that.  This makes it easy to move the wal content
elsewhere later, or to pre-create the .wal dir.

Signed-off-by: Sage Weil <sage@redhat.com>
2015-11-16 13:23:15 -05:00
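
The `$path.wal` behavior described in eddb00bd4f can be sketched in a few lines. This is an illustration in Python, not the RocksDBStore code (which is C++); the helper name `ensure_wal_dir` and the choice to link to the path exactly as given are assumptions.

```python
import os
import tempfile


def ensure_wal_dir(path):
    """Return path + '.wal', creating it as a symlink to path if it is missing."""
    wal = path + ".wal"
    # lexists() is also true for a pre-created directory or an existing
    # symlink, so an operator-provided .wal location is left untouched.
    if not os.path.lexists(wal):
        os.symlink(path, wal)
    return wal


with tempfile.TemporaryDirectory() as tmp:
    db = os.path.join(tmp, "db")
    os.mkdir(db)
    wal_dir = ensure_wal_dir(db)
    print(os.path.islink(wal_dir), os.path.samefile(wal_dir, db))
```

Because an existing `.wal` entry is never replaced, the WAL can later be moved elsewhere by swapping the symlink, or placed on another device by creating the directory before the store is opened.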
Sage Weil
0dac747c79 os/newstore: distinguish between db open and create
Signed-off-by: Sage Weil <sage@redhat.com>
2015-11-16 13:23:15 -05:00
Sage Weil
22c9310bd8 os/newstore: remove newstore_db_path option
It is simpler to have fixed locations and symlinks.

Signed-off-by: Sage Weil <sage@redhat.com>
2015-11-16 13:23:15 -05:00
Sage Weil
9c0ae4b86d os/newstore: newstore_backend_options -> newstore_rocksdb_options
This way we can have default settings per-backend.  Also note that
this is what we currently do with leveldb on the mon and osd.

Signed-off-by: Sage Weil <sage@redhat.com>
2015-11-16 13:23:15 -05:00
Sage Weil
7f07e1ee22 os/newstore: set rocksdb default options
max_write_buffer_number=16
min_write_buffer_number_to_merge=6

This cuts the amount of short-lived WAL data that gets
rewritten by roughly a factor of 6.

Signed-off-by: Sage Weil <sage@redhat.com>
2015-11-16 13:23:15 -05:00
Jason Dillaman
f58ffdc040 tests: new rbd CLI command aliases
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2015-11-16 12:57:29 -05:00
Jason Dillaman
1ff6889054 rbd: add missing command aliases to refactored CLI
Fixes: #13806
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2015-11-16 12:55:11 -05:00
Dan van der Ster
37642a7768 osd: scrub if load below daily avg and decreasing
Store a daily loadavg and use as an upper limit on when to allow
scrubs. Also track the 15 minute loadavg and only scrub when the
loadavg is decreasing (i.e. 1m < 15m).

Backports: hammer, infernalis

Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
2015-11-16 18:26:59 +01:00
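
The scrub gating rule in 37642a7768 reduces to two comparisons. A minimal sketch, assuming the daily average is tracked elsewhere and passed in; the function name `scrub_load_ok` and the sample numbers are hypothetical, and the actual OSD code (C++) may apply further conditions not described in the commit message.

```python
def scrub_load_ok(load_1m, load_15m, daily_avg):
    """Allow a scrub only if load is below the daily average and decreasing."""
    below_daily_avg = load_1m < daily_avg  # daily loadavg acts as the upper limit
    decreasing = load_1m < load_15m        # 1m below 15m means load is falling
    return below_daily_avg and decreasing


# Hypothetical load figures for illustration.
print(scrub_load_ok(load_1m=1.2, load_15m=2.0, daily_avg=1.8))  # quiet and falling
print(scrub_load_ok(load_1m=1.2, load_15m=1.0, daily_avg=1.8))  # quiet but rising
print(scrub_load_ok(load_1m=2.5, load_15m=3.0, daily_avg=1.8))  # falling but busy
```

On a Unix host the 1-minute and 15-minute values could come from `os.getloadavg()`, which returns the 1, 5, and 15 minute averages as a tuple.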
Dan van der Ster
b7df772dd4 osd: randomize deep scrubbing
osd_scrub_interval_randomize_ratio works to randomize the shallow scrubs
but doesn't prevent a thundering herd of deep scrubs every osd_deep_scrub_interval.
Add the option osd_deep_scrub_randomize_ratio which defines the rate at which scrubs
will randomly turn into deep scrubs.

Backports: hammer, infernalis

Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
Signed-off-by: Herve Rousseau <herve.rousseau@cern.ch>
2015-11-16 18:16:34 +01:00
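
The random promotion described in b7df772dd4 can be sketched as a single biased coin flip per scheduled scrub. This covers only the promotion step, not deep scrubs that fall due because `osd_deep_scrub_interval` has elapsed. The helper `choose_scrub_type` and the ratio 0.15 are illustrative assumptions; the commit message does not state a default value.

```python
import random


def choose_scrub_type(deep_randomize_ratio, rng=random):
    """Promote a scheduled scrub to a deep scrub with the given probability."""
    return "deep" if rng.random() < deep_randomize_ratio else "shallow"


# Seeded generator so the demonstration is repeatable.
rng = random.Random(0)
picks = [choose_scrub_type(0.15, rng) for _ in range(10000)]
print(picks.count("deep") / len(picks))
```

Spreading deep scrubs over many ordinary scrub slots in this way is what avoids every placement group reaching its deep-scrub deadline at the same moment.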
Jason Dillaman
8f9841d617 Merge pull request #6590 from ceph/wip-rbd-user-option
rbd: accept --user, refuse -i command-line optionals

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
2015-11-16 09:07:39 -05:00
Jason Dillaman
9fd0df50a5 Merge pull request #6593 from trociny/fixup-rbd-refactor
rbd: stripe unit/count set incorrectly from config

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
2015-11-16 09:05:36 -05:00
Yan, Zheng
511435f568 client: avoid creating orphan object in Client::check_pool_perm()
Fixes: #13782
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-11-16 21:17:46 +08:00
Loic Dachary
28f884e3ee Merge pull request #6594 from chenji-kael/patch-2
Update .mailmap
2015-11-16 07:33:08 +01:00
Li Wang
f02a51ffea scrub: do not assign value if read error for ECBackend
Signed-off-by: Li Wang <li.wang@kylin-cloud.com>
2015-11-16 12:17:49 +08:00
Li Wang
8bb61d3eea scrub: do not assign value if read error for ReplicatedBackend
Signed-off-by: Li Wang <li.wang@kylin-cloud.com>
2015-11-16 12:17:12 +08:00
Xiaowei Chen
a8b746411e osdservice: state changed to atomic_t to decrease thread context switch.
service.is_stopping is a hot call in the key IO path, so it is better to use a spinlock.

Signed-off-by: Xiaowei Chen <chen.xiaowei@h3c.com>
2015-11-15 21:56:26 -05:00
Xiaowei Chen
b3a129041a osd: change mutex to spinlock to optimize thread context switch.
Signed-off-by: Xiaowei Chen <chen.xiaowei@h3c.com>
2015-11-15 21:40:32 -05:00
chenji-kael
7fc87e3469 Update .mailmap
2015-11-16 09:25:41 +08:00
Mykola Golub
08fd09a856 rbd: stripe unit/count set incorrectly from config
(after rbd-refactor fixup)

Signed-off-by: Mykola Golub <mgolub@mirantis.com>
2015-11-15 21:37:44 +02:00
Josh Durgin
6ad1013a59 Merge pull request #6549 from kylinstorage/global_flags
fix: use right init_flags to finish CephContext

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2015-11-15 10:45:04 -08:00
Sage Weil
d5bf38e6d5 Merge pull request #6575 from ceph/update-infernalis-releasenotes
doc: update infernalis release notes
2015-11-14 21:45:03 -05:00
Sage Weil
16f23abb8d Merge pull request #6484 from XinzeChi/wip-journal-optimization
os: write file journal optimization

Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: David Zafman <dzafman@redhat.com>
2015-11-14 21:44:18 -05:00
Sage Weil
6becf839a9 Merge pull request #6371 from liewegas/wip-osd-less-crush
osd: avoid calculating crush mapping for most ops

Reviewed-by: Samuel Just <sjust@redhat.com>
2015-11-14 21:42:17 -05:00
Sage Weil
3a5e0a3db8 Merge pull request #6416 from rohanmars/wip-solaris-port
librados: Solaris port

Reviewed-by: Sage Weil <sage@redhat.com>
2015-11-14 21:41:31 -05:00
Sage Weil
b9251b4b6f Merge pull request #6439 from XinzeChi/wip-merge-tnx
osd: merge local_t and op_t txn to single one

Reviewed-by: Sage Weil <sage@redhat.com>
2015-11-14 21:40:42 -05:00
Sage Weil
8dd801d86d Merge pull request #6547 from xiaoxichen/fix1
osd: check do_shutdown before do_restart

Reviewed-by: Kefu Chai <kchai@redhat.com>
2015-11-14 21:39:46 -05:00