Commit Graph

50322 Commits

Author SHA1 Message Date
Kefu Chai
e2374c43c9 rados: add "list-inconsistent-snapset" cmd
to list inconsistent snapsets of a given PG, this command exposes
get_inconsistent_snapsets() rados API to user.

Fixes: #13505
Signed-off-by: Kefu Chai <kchai@redhat.com>
2016-02-25 12:41:55 +08:00
Kefu Chai
8018eab344 rados: add "list-inconsistent-obj" cmd
to list inconsistent objects of a given PG, this command exposes
get_inconsistent_objects() rados API to user.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2016-02-25 12:41:55 +08:00
Kefu Chai
c9b593d2d7 librados: add get_inconsistent_snapsets() API
Fixes: #13505
Signed-off-by: Kefu Chai <kchai@redhat.com>
2016-02-25 12:41:55 +08:00
Kefu Chai
dfc2f4823b librados: add get_inconsistent_objects() API
Fixes: #13505
Signed-off-by: Kefu Chai <kchai@redhat.com>
2016-02-25 12:41:55 +08:00
Kefu Chai
3dea4f1f5c osd: add CEPH_OSD_OP_SCRUBLS pg op
it is a new pg op which returns the encoded objects stored when
scrubbing.

Fixes: #13505
Signed-off-by: Kefu Chai <kchai@redhat.com>
2016-02-25 12:41:55 +08:00
Kefu Chai
2009ed274d osd: persist inconsistent snapsets using omap
the inconsistent snapsets are identified in ReplicatedPG::_scrub()
after we compare the authoritative objects with their replicas/shards.
this inconsistency information is stored in the omap of objects
with prefix "SCRUB_SS_".

Signed-off-by: Kefu Chai <kchai@redhat.com>
2016-02-25 12:41:55 +08:00
Kefu Chai
c828c398b1 librados: add inconsistent_snapset_t type
for presenting the inconsistent snapsets found in scrub

Signed-off-by: Kefu Chai <kchai@redhat.com>
2016-02-25 12:41:55 +08:00
Samuel Just
8ed62772bd osd/: clear scrub store safely
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-02-25 12:41:55 +08:00
Kefu Chai
fb956b742c osd: persist inconsistent objs using omap
persist inconsistent objects found when comparing the ScrubMaps
collected from replicas/shards. the discrepancies between the auth
copy and the replicas are identified as inconsistencies, and hence
encoded into the omap of an object of the temp coll of the PG in
question.
scrub_types.{h,cpp} are introduced to hide the details of how we
persist the scrub types from the librados client.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2016-02-25 12:41:55 +08:00
Kefu Chai
f236b7d736 osd: more constness to spg_t
Signed-off-by: Kefu Chai <kchai@redhat.com>
2016-02-25 12:40:17 +08:00
Kefu Chai
b43d480938 librados: add inconsistent_obj_t types
which present the inconsistent objects found in scrub

Fixes: #13505
Signed-off-by: Kefu Chai <kchai@redhat.com>
2016-02-25 12:40:17 +08:00
Kefu Chai
4c3270692e rados: add "list-inconsistent-pg" command
to list inconsistent PGs of a given pool. this command exposes
the underlying get_inconsistent_pgs() API to user.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2016-02-25 12:40:17 +08:00
Kefu Chai
d0af316cf2 pybind: add Rados.get_inconsistent_pgs method
Signed-off-by: Kefu Chai <kchai@redhat.com>
2016-02-25 12:40:17 +08:00
Kefu Chai
cb4efbd72f librados: add get_inconsistent_pgs() to librados
to list the inconsistent PGs of given pool, it's a wrapper
around the "ceph pg ls" command.

Fixes: #13505
Signed-off-by: Kefu Chai <kchai@redhat.com>
2016-02-25 12:40:17 +08:00
Kefu Chai
50bbf7f92e tools/rados: support more --format options
Signed-off-by: Kefu Chai <kchai@redhat.com>
2016-02-25 12:40:17 +08:00
Yehuda Sadeh
1c4ccfe3c4 Merge pull request #7597 from rzarzynski/wip-rgw-keystone-proper-fail
rgw: improve error handling in S3/Keystone integration

Reviewed-by: Yehuda Sadeh <yehuda@redhat.com>
2016-02-24 14:40:02 -08:00
Sage Weil
f053e3e7f0 Merge pull request #7051 from wonzhq/scrub-cmp-map
osd: avoid FORCE updating digest being overwritten by MAYBE when comparing scrub map

Reviewed-by: Kefu Chai <kchai@redhat.com>
2016-02-24 11:35:38 -08:00
Sage Weil
48d007f659 Merge pull request #7617 from liewegas/wip-14745
osd: fix forced promotion for CALL ops

Reviewed-by: Sage Weil <sage@redhat.com>
2016-02-24 10:45:59 -08:00
Sage Weil
e67b70e7fa Merge pull request #7286 from ktdreyer/wip-10587-init-ceph-disk
init-ceph.in: skip ceph-disk if it is not present

Reviewed-by: Kefu Chai <kchai@redhat.com>
2016-02-24 10:42:37 -08:00
Sage Weil
ccfbca425a Merge pull request #7599 from ifed01/wip-14707
log: fix stack overflow when flushing large log lines

Reviewed-by: Sage Weil <sage@redhat.com>
2016-02-24 10:42:04 -08:00
Sage Weil
c10f62874a Merge pull request #7643 from jazeltq/add_count_github
osd: filejournal: report journal entry count

Reviewed-by: Sage Weil <sage@redhat.com>
2016-02-24 10:41:17 -08:00
Sage Weil
fa8ba08b96 Merge pull request #7658 from majianpeng/bluestore
osd: bluestore: misc fixes

Reviewed-by: Sage Weil <sage@redhat.com>
2016-02-24 10:40:26 -08:00
Sage Weil
6aa9a4735a Merge pull request #7677 from xiexingguo/xxg-wip-14786
osd: fix fusestore hanging during stop/quit

Reviewed-by: Sage Weil <sage@redhat.com>
2016-02-24 10:39:27 -08:00
Sage Weil
4229cd436d Merge pull request #7680 from ceph/wip-da-SCA-20160203
common: various fixes from SCA runs

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Jason Dillaman <jdillama@redhat.com>
Reviewed-by: Greg Farnum <gfarnum@redhat.com>
2016-02-24 10:38:06 -08:00
Sage Weil
b3141ce1d0 Merge pull request #7681 from jjhuo/kstore-updates
osd: kstore: sync up kstore with recent bluestore updates

Reviewed-by: Sage Weil <sage@redhat.com>
2016-02-24 10:37:13 -08:00
Jason Dillaman
8ff1a8df1c Merge pull request #7759 from trociny/wip-rbd-mirror-image-replayer-improvements
rbd-mirror: ImageReplayer improvements

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
2016-02-24 11:40:43 -05:00
Jason Dillaman
53d87e1775 Merge pull request #7736 from trociny/fix-librbd-asok-empty-name
librbd: retrieve image name when opening by id

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
2016-02-24 11:07:22 -05:00
Jason Dillaman
f0e428ad45 Merge pull request #7620 from jdurgin/wip-14419
cls_rbd: mirroring directory

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
2016-02-24 10:35:28 -05:00
Josh Durgin
28e2d1bc55 cls_rbd: add methods for keeping track of mirrored images
These will track whether local images should be mirrored, and map them
to a unique global id. There's a state field for safely disabling
mirroring while operating on multiple objects.

Fixes: #14419
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2016-02-24 10:34:28 -05:00
Josh Durgin
da9b36a70d librbd: rename rbd_pool_settings object to rbd_mirroring
We'll use this object only for mirroring-related purposes, not generic
settings on a pool.

Refs: #14419
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2016-02-24 10:34:28 -05:00
Hector Martin
6e901bac30 pybind/rados: track completions before calling aio functions
Tracking completions is critical for memory safety - if the
aio function succeeds, the completion must be tracked. However,
if a KeyboardInterrupt or similar arrives between the call and
the tracking, the completion will not be tracked.

Fix this by tracking the completion before the aio call, and
explicitly cleaning up in the failure case.

This leaves the opposite problem, where an unexpected exception
(other than simple error return from the aio function) will cause
the completion to not be freed until the Ioctx is destroyed, but
that is a relatively minor issue.

Signed-off-by: Hector Martin <marcan@marcan.st>
2016-02-25 00:10:14 +09:00
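
The ordering described in commit 6e901bac30 (track first, call second, untrack on a plain error return) can be illustrated with a minimal sketch. This is not the pybind/rados code: `CompletionTracker`, `submit`, `complete`, and `pending` are hypothetical names, and `aio_call` stands in for any aio function that returns a negative value on failure.

```python
import threading


class CompletionTracker:
    """Hypothetical stand-in for the Ioctx bookkeeping; not the real binding."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = set()

    def submit(self, completion, aio_call):
        # Track first: if an exception such as KeyboardInterrupt arrives
        # right after aio_call returns, the completion is already recorded.
        with self._lock:
            self._pending.add(completion)
        ret = aio_call(completion)
        if ret < 0:
            # Plain error return: the operation was never queued, so untrack.
            with self._lock:
                self._pending.discard(completion)
            raise OSError(-ret, "aio call failed")
        return completion

    def complete(self, completion):
        # Called once the operation has finished and the result was consumed.
        with self._lock:
            self._pending.discard(completion)

    def pending(self):
        with self._lock:
            return len(self._pending)
```

As the commit message notes, an unexpected exception raised inside the aio call leaves the completion tracked until the owner is destroyed; the sketch has the same trade-off.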
Hector Martin
3c767ab815 pybind/rados: Fix error handling and leaks in aio
aio_read:
The reference to ret_s begins existing at PyBytes_FromStringAndSize and
is handed over to the callback if rados_aio_read succeeds. This creates
a lot of subtle scenarios where it might not be XDECREFed (e.g. if
a KeyboardInterrupt arrives at the wrong time). Instead, store the pointer
to that buffer in the completion object, and hand over responsibility for
the XDECREF to it. This guarantees that the "special" reference to this
incomplete object will be released when the completion object is
deallocated.

Also make sure we don't try to _PyBytes_Resize with a negative length.

Add a failure case to the aio_read test in test_rados.py

Completion: the wrapper methods weren't being called, which prevents
the completion objects from being freed until the Ioctx is. Fix this
and add a refcount check to the aio_read test.

Signed-off-by: Hector Martin <marcan@marcan.st>
2016-02-25 00:10:10 +09:00
Loic Dachary
894b3b7cf4 Merge pull request #7762 from ErwanAliasr1/evelu-check
Improving 'make check' for ceph-disk

Reviewed-by: Loic Dachary <ldachary@redhat.com>
2016-02-24 21:42:02 +07:00
Mykola Golub
5360d8612d librbd: init asok_hook on open so name is always known
Signed-off-by: Mykola Golub <mgolub@mirantis.com>
2016-02-24 16:28:05 +02:00
Mykola Golub
80656b1131 librbd: get image name on open if it is opened by id
Signed-off-by: Mykola Golub <mgolub@mirantis.com>
2016-02-24 16:28:05 +02:00
Kefu Chai
1ee8c13f9c Merge pull request #7743 from JiYou/open-test-for-bug-2339
test: enable test for bug #2339 which has been resolved.

Reviewed-by: Kefu Chai <kchai@redhat.com>
2016-02-24 21:57:05 +08:00
Haomai Wang
281b1e37bb Merge pull request #7773 from roidayan/xio_remove_redundant_magic_methods
xio: remove redundant magic methods

Reviewed-by: Haomai Wang <haomai@xsky.com>
2016-02-24 21:49:21 +08:00
Jason Dillaman
5c5ca11b74 Merge pull request #7771 from xiexingguo/xxg-wip-fixawatch
librados: do not clear handle for aio_watch()

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
2016-02-24 08:35:06 -05:00
Roi Dayan
af12582b35 xio: remove redundant magic methods
The same methods are inherited from Messenger.

Signed-off-by: Roi Dayan <roid@mellanox.com>
2016-02-24 14:00:59 +02:00
xie xingguo
3cfb83d1d5 librados: remove unused local variables
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2016-02-24 16:59:33 +08:00
xie xingguo
8caa2e455c librados: do not clear handle for aio_watch()
which is needed for aio_unwatch() to work.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2016-02-24 16:56:23 +08:00
Erwan Velu
48af4bec8b tests/ceph-disk: Let teardown clear data
When running the ceph-disk check suite, 'test_mark_init' and 'test_activate_dir' failed
with permission denied errors on freshly created files, for example:

tests/ceph-disk.sh:237: test_mark_init:  /bin/rm -fr /ceph/src/ceph-disk/test-ceph-disk/dir
/bin/rm: cannot remove « /ceph/src/ceph-disk/test-ceph-disk/dir/snap_1 »: Permission denied
/bin/rm: cannot remove « /ceph/src/ceph-disk/test-ceph-disk/dir/current »: Permission denied
/bin/rm: cannot remove « /ceph/src/ceph-disk/test-ceph-disk/dir/snap_2 »: Permission denied

Those two tests start a mon with run_mon.
Deleting the files while the daemons are still alive is impossible.
This patch removes the explicit 'rm' and lets the teardown do the cleanup
by stopping the daemons and cleaning the content.

With this patch, make check is now successful on ceph-disk.

Signed-off-by: Erwan Velu <erwan@redhat.com>
2016-02-24 09:56:01 +01:00
Erwan Velu
57f31e9536 tests/ceph-disk: Using dummy device mappers
When doing a make test, if your local setup was running a "dm-0" or "dm-1",
make check failed.

The detection of the local "dm-x" devices led to a wrong comparison with
the expected simulated values.

The fix uses a dummy name for that "dm" device to prevent any collision with the local setup.

Prior to that patch, a typical output of that error looked like:

>               assert expect == main.list_devices()
E               assert [{'partitions...: '/dev/Xda'}] == [{'partitions'...: '/dev/Xda'}]
E                 At index 0 diff: {'partitions': [{'dmcrypt': {'holders': ['dm-0'], 'type': 'plain'}, 'ptype': '4fbd7e29-9d25-41b8-afd0-5ec00ceff05d', 'path': '/dev/Xda1', 'is_partition': True, 'mount': None, 'uuid': '56244cf5-83ef-4984-888a-2d8b8e0e04b2', 'type': 'data', 'state': 'unprepared', 'fs_type': None}], 'path': '/dev/Xda'} != {'partitions': [{'ptype': '4fbd7e29-9d25-41b8-afd0-5ec00ceff05d', 'whoami': None, 'path': '/dev/Xda1', 'is_partition': True, 'mount': '/var/cache/ccache', 'uuid': '56244cf5-83ef-4984-888a-2d8b8e0e04b2', 'ceph_fsid': None, 'fs_type': 'btrfs', 'dmcrypt': {'holders': ['dm-0'], 'type': 'plain'}, 'type': 'data', 'state': 'active'}], 'path': '/dev/Xda'}
E                 Full diff:
E                 - [{'partitions': [{'dmcrypt': {'holders': ['dm-0'], 'type': 'plain'},
E                 + [{'partitions': [{'ceph_fsid': None,
E                 +                   'dmcrypt': {'holders': ['dm-0'], 'type': 'plain'},
E                 -                   'fs_type': None,
E                 ?                              ^^^^
E                 +                   'fs_type': 'btrfs',
E                 ?                              ^^^^^^^
E                 'is_partition': True,
E                 -                   'mount': None,
E                 +                   'mount': '/var/cache/ccache',
E                 'path': '/dev/Xda1',
E                 'ptype': '4fbd7e29-9d25-41b8-afd0-5ec00ceff05d',
E                 -                   'state': 'unprepared',
E                 ?                             ^^^^ -----
E                 +                   'state': 'active',
E                 ?                             ^^^^^
E                 'type': 'data',
E                 -                   'uuid': '56244cf5-83ef-4984-888a-2d8b8e0e04b2'}],
E                 ?                                                                 --
E                 +                   'uuid': '56244cf5-83ef-4984-888a-2d8b8e0e04b2',
E                 +                   'whoami': None}],
E                 'path': '/dev/Xda'}]

tests/test_main.py:342: AssertionError

Signed-off-by: Erwan Velu <erwan@redhat.com>
2016-02-24 09:55:57 +01:00
Erwan Velu
ba05b7e700 tests/ceph-disk: Creating missing working dir
When running run-tox.sh in a very simple env,
the test will fail if '/var/lib/ceph/tmp' doesn't exist.

This patch adds a check to create this directory if required as mkdtemp doesn't do it for you.

Prior to this patch, the following behavior was seen:

tests/test_main.py:342:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
ceph_disk/main.py:3753: in list_devices
    fstype=fs_type, options='')
ceph_disk/main.py:1217: in mount
    dir=STATEDIR + '/tmp',
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

suffix = '', prefix = 'mnt.', dir = '/var/lib/ceph/tmp'

    def mkdtemp(suffix="", prefix=template, dir=None):
        """User-callable function to create and return a unique temporary
        directory.  The return value is the pathname of the directory.

        Arguments are as for mkstemp, except that the 'text' argument is
        not accepted.

        The directory is readable, writable, and searchable only by the
        creating user.

        Caller is responsible for deleting the directory when done with it.
        """

        if dir is None:
            dir = gettempdir()

        names = _get_candidate_names()

        for seq in xrange(TMP_MAX):
            name = names.next()
            file = _os.path.join(dir, prefix + name + suffix)
            try:
>               _os.mkdir(file, 0700)
E               OSError: [Errno 2] No such file or directory: '/var/lib/ceph/tmp/mnt.KoAV85'

/usr/lib64/python2.7/tempfile.py:333: OSError

Signed-off-by: Erwan Velu <erwan@redhat.com>
2016-02-24 09:55:46 +01:00
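
The failure in commit ba05b7e700 comes from `tempfile.mkdtemp()` not creating the directory passed as `dir`. A minimal sketch of the kind of guard the commit describes follows, assuming Python 3 (the traceback above is from Python 2.7); `make_mount_dir` and its `state_dir` parameter are hypothetical names, not the ceph-disk code.

```python
import os
import tempfile


def make_mount_dir(state_dir):
    """Create state_dir/tmp if needed, then return a fresh temp dir inside it."""
    tmp_root = os.path.join(state_dir, "tmp")
    # mkdtemp() raises if its dir argument is missing, so create it first.
    os.makedirs(tmp_root, exist_ok=True)
    return tempfile.mkdtemp(prefix="mnt.", dir=tmp_root)
```

The caller remains responsible for removing the returned directory, as the `mkdtemp` docstring quoted in the traceback states.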
Jason Dillaman
dc5a9053ce Merge pull request #7649 from yuyuyu101/wip-async-watch
librados: implement async watch/unwatch

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
2016-02-23 23:41:03 -05:00
You Ji
f59c872d4c test: enable test for bug #2339 which has been resolved.
Signed-off-by: You Ji <jiyou09@gmail.com>
2016-02-23 20:36:37 -07:00
Josh Durgin
0b56c5e550 Merge pull request #7761 from dillaman/wip-14847
librbd: fix state machine race conditions during shut down

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2016-02-23 18:07:12 -08:00
Haomai Wang
7109de7b80 RadosClient: call watch_flush before finisher stop
Signed-off-by: Haomai Wang <haomai@xsky.com>
2016-02-24 09:29:05 +08:00
Robert LeBlanc
3a6d6279fe test/common/test_weighted_priority_queue: Fix the unit tests
Since the changes to WeightedPriorityQueue, there is no strict round robin
dequeueing of classes. Removed that test from the unittest.

Signed-off-by: Robert LeBlanc <robert.leblanc@endurance.com>
2016-02-23 20:42:52 +00:00
Robert LeBlanc
f03de8e2d0 common/WeightedPriorityQueue: Rewrote the queue to use intrusive containers
Microbenchmarks show it taking 60-70% of the execution time of the previous implementation.

Signed-off-by: Robert LeBlanc <robert.leblanc@endurance.com>
2016-02-23 20:42:43 +00:00