RepoMirrors/ceph

mirror of https://github.com/ceph/ceph synced 2025-01-10 21:20:46 +00:00

Author	SHA1	Message	Date
Sage Weil	126ffe6165	osd: log 'slow op' debug messages for individual slow ops Otherwise it is very hard to identify which OSD ops are slow when we've seen a SLOW_OPS health warning in a qa run. Notably, without this, bugs like http://tracker.ceph.com/issues/23769 are very challenging to track down. Signed-off-by: Sage Weil <sage@redhat.com>	2018-05-01 13:53:49 -05:00
Sage Weil	1124839204	mon/OSDMonitor: set FLAG_SELFMANAGED_SNAPS on cephfs snap removal CephFS uses a different path to remove selfmanaged snaps than librados, so while the librados path goes through pg_pool_t::remove_unmanaged_snap(), we open code the snap addition to the pool's removed_snaps here. If we don't set FLAG_SELFMANAGED_SNAPS at that time, we will implicitly set it during decode and get a CRC mismatch. Fix by explicitly setting FLAG_SELFMANAGED_SNAPS flag here. Fixes: http://tracker.ceph.com/issues/23949 Signed-off-by: Sage Weil <sage@redhat.com>	2018-05-01 13:46:47 -05:00
Sage Weil	6024c5c52c	mon/OSDMonitor: dump osdmaps if crc doesn't match Dump both the json and hexdump at debug level 20. Hunting http://tracker.ceph.com/issues/23949 Signed-off-by: Sage Weil <sage@redhat.com>	2018-05-01 12:39:03 -05:00
Sage Weil	c335bc16a4	Merge pull request #21742 from liewegas/wip-23940 osdc/Objecter: fix recursive locking in _finish_command Reviewed-by: Josh Durgin <jdurgin@redhat.com>	2018-05-01 12:26:06 -05:00
Sage Weil	edcbb1bf15	Merge pull request #21745 from liewegas/wip-pg-removal-race osd: fix _process handling for pg vs slot race Reviewed-by: Greg Farnum <gfarnum@redhat.com>	2018-05-01 12:25:42 -05:00
Yuri Weinstein	b28ab5616d	Merge pull request #20678 from ceph/wip-s3a-fix fix s3atests that are failing for some time Reviewed-by: Casey Bodley <cbodley@redhat.com>	2018-05-01 09:28:24 -07:00
Yuri Weinstein	61a66b4e14	Merge pull request #20894 from ZVampirEM77/wip-multisite-cleanup rgw: some cleanup for sync status Reviewed-by: Casey Bodley <cbodley@redhat.com>	2018-05-01 09:27:52 -07:00
Yuri Weinstein	f0e5e624b0	Merge pull request #21647 from yehudasa/wip-23859 rgw: fix for issue #21647 Reviewed-by: Casey Bodley <cbodley@redhat.com>	2018-05-01 09:27:32 -07:00
Yuri Weinstein	1d2b8b8025	Merge pull request #21648 from yehudasa/wip-cloud-sync-7 rgw: cloud sync fixes Reviewed-by: Casey Bodley <cbodley@redhat.com>	2018-05-01 09:27:10 -07:00
Kefu Chai	35b1e7ea63	Merge pull request #21678 from idiv-biodiversity/wip-doc-scrub_load_threshold doc: fix error in osd scrub load threshold Reviewed-by: Kefu Chai <kchai@redhat.com>	2018-05-01 20:22:09 +08:00
Jason Dillaman	ff82e168f6	Merge pull request #21727 from trociny/wip-23929 librbd: release lock executing deep copy progress callback Reviewed-by: Jason Dillaman <dillaman@redhat.com>	2018-05-01 07:44:17 -04:00
Yan, Zheng	e4160d7e78	mds: don't report slow request for blocked filelock request Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Fixes: http://tracker.ceph.com/issues/22428	2018-05-01 12:54:29 +08:00
Patrick Donnelly	49365c70dc	Merge PR #21719 into master * refs/pull/21719/head: mds: trim log during shutdown to clean metadata Reviewed-by: Zheng Yan <zyan@redhat.com>	2018-04-30 17:26:07 -07:00
Patrick Donnelly	8153cfa696	Merge PR #21720 into master * refs/pull/21720/head: mds: kick rdlock if waiting for dirfragtreelock Reviewed-by: Zheng Yan <zyan@redhat.com>	2018-04-30 17:25:01 -07:00
Boris Ranto	056bc08d51	prometheus: Handle the TIME perf counter type metrics This patch correctly sets the PERFCOUNTER_MASK to 3 so that the PERFCOUNTER_TIME metrics are not ignored by the mgr_module code. It also converts the TIME metrics from nanoseconds to seconds just like the ceph perf dump does and exposes the metrics via prometheus module. Signed-off-by: Boris Ranto <branto@redhat.com>	2018-05-01 01:20:44 +02:00
Patrick Donnelly	db3b6ca546	common: refactor for array size Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2018-04-30 13:43:54 -07:00
Sage Weil	7dbcc772f6	osd: fix _process handling for pg vs slot race We could see the slot with a different PG than we expected if the old PG was removed and a new one was instantiated in its place. We can't just pick up the new PG pointer, however, since it isn't locked. Fix by retrying with the slot's new pg (possibly null!). Move this check below the other cases so that we know we are otherwise consistent with the slot, since the next pass around we might get pg==null and skip the to_process.empty() and requeue_seq checks entirely. Signed-off-by: Sage Weil <sage@redhat.com>	2018-04-30 15:38:59 -05:00
Mykola Golub	93b9eb7dd8	librbd: release lock executing deep copy progress callback Fixes: http://tracker.ceph.com/issues/23929 Signed-off-by: Mykola Golub <mgolub@suse.com>	2018-04-30 22:24:19 +03:00
Josh Durgin	625c6895fb	Merge pull request #21706 from liewegas/wip-23860 osd/PG: fix DeferRecovery vs AllReplicasRecovered race Reviewed-by: Josh Durgin <jdurgin@redhat.com>	2018-04-30 11:32:31 -07:00
Patrick Donnelly	9e12aa5d3b	mds: kick rdlock if waiting for dirfragtreelock Fixes: https://tracker.ceph.com/issues/23919 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2018-04-30 10:04:08 -07:00
Patrick Donnelly	c60ef1b806	mds: trim log during shutdown to clean metadata Otherwise the trimming won't advance so that the remaining inodes are marked clean. Fixes: http://tracker.ceph.com/issues/23923 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2018-04-30 09:58:10 -07:00
Sage Weil	f459de15aa	Merge pull request #21702 from theanalyst/wip-std-mutex osdc/Objecter: use std::shared_mutex instead of boost::shared_mutex Reviewed-by: Casey Bodley <cbodley@redhat.com>	2018-04-30 11:18:11 -05:00
Patrick Donnelly	cec1fa0998	Merge PR #21731 into master * refs/pull/21731/head: client: drop function _get_inodeno Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>	2018-04-30 09:16:48 -07:00
John Spray	1e768b3f4e	mgr/dashboard: silence E741 This is a pretty questionable check because it complains about the caller of an API instead of the API itself, if one of the API's members/arguments is one of the forbidden variable names such as 'O'. The interface to pyopenssl includes an 'O' member on the certificate object. Signed-off-by: John Spray <john.spray@redhat.com>	2018-04-30 16:39:40 +01:00
Sage Weil	c584061d16	Merge pull request #21743 from yuriw/wip-yuriw-crontab qa/tests: removed rest suite from the mix	2018-04-30 10:33:36 -05:00
Mykola Golub	6b752a3859	Merge pull request #21697 from dillaman/wip-18753-1 rbd-mirror: additional thrasher testing Reviewed-by: Mykola Golub <mgolub@suse.com>	2018-04-30 18:25:35 +03:00
Yuri Weinstein	42fa821724	qa/tests: removed rest suite from the mix Signed-off-by: Yuri Weinstein <yweinste@redhat.com>	2018-04-30 08:20:06 -07:00
Ken Dreyer	a630681c65	Merge pull request #21716 from smithfarm/wip-drop-obs-kludge build/ops: rpm: Revert "ceph.spec: work around build.opensuse.org" Reviewed-by: Ken Dreyer <kdreyer@redhat.com> Reviewed-by: David Disseldorp <ddiss@suse.de>	2018-04-30 09:15:23 -06:00
Sage Weil	854f44b247	Merge pull request #21739 from tchaikov/wip-23922 qa/suites/rados/thrash-old-clients: ms_type=simple Reviewed-by: Sage Weil <sage@redhat.com>	2018-04-30 09:55:10 -05:00
Andrew Schoen	2f15a4fba3	Merge pull request #21685 from alfredodeza/wip-rm23874 ceph-volume failed ceph-osd --mkfs command doesn't halt the OSD creation process Reviewed-by: Andrew Schoen <aschoen@redhat.com>	2018-04-30 14:52:50 +00:00
Sage Weil	891f519242	osdc/Objecter: fix recursive locking in _finish_command The path #9 Objecter::_finish_command (this=this@entry=0x7f76c00aeb30, c=c@entry=0x7f76b0000b10, r=<optimized out>, rs="osd down") at /build/ceph-13.0.2-1932-g458b4fb/src/osdc/Objecter.cc:4950 #10 0x00007f76d26de106 in Objecter::_check_command_map_dne (this=this@entry=0x7f76c00aeb30, c=c@entry=0x7f76b0000b10) at /build/ceph-13.0.2-1932-g458b4fb/src/osdc/Objecter.cc:1726 #11 0x00007f76d26e52e4 in Objecter::_scan_requests (this=this@entry=0x7f76c00aeb30, s=0x7f76c00af8a0, skipped_map=skipped_map@entry=false, cluster_full=cluster_full@entry=false, pool_full_map=0x7f76be7fb330, need_resend=..., need_resend_linger=..., need_resend_command=std::map with 0 elements, sul=..., gap_removed_snaps=0x7f76ac0016f8) at /build/ceph-13.0.2-1932-g458b4fb/src/osdc/Objecter.cc:1120 #12 0x00007f76d26eded5 in Objecter::handle_osd_map (this=this@entry=0x7f76c00aeb30, m=m@entry=0x7f76ac0014a0) at /build/ceph-13.0.2-1932-g458b4fb/src/osdc/Objecter.cc:1228 led to recursive lock of the session mutex (locked in _scan_requests, and again in _finish_command). Fix by making the callers for _finish_command (and _check_command_map_dne) take the session lock. Fixes: http://tracker.ceph.com/issues/23940 Signed-off-by: Sage Weil <sage@redhat.com>	2018-04-30 09:52:38 -05:00
Kefu Chai	e62bc6bcd6	Merge pull request #21708 from dalgaaf/wip-da-SCA-20180425 Various fixes for SCA issues Reviewed-by: Casey Bodley <cbodley@redhat.com> Reviewed-by: Jason Dillaman <dillaman@redhat.com> Reviewed-by: Kefu Chai <kchai@redhat.com>	2018-04-30 21:57:19 +08:00
Kefu Chai	f072045ebf	Merge pull request #21690 from xiexingguo/wip-pr-20304 mon, osd: add create-time for pool Reviewed-by: Sage Weil <sage@redhat.com>	2018-04-30 21:53:34 +08:00
Kefu Chai	ceaf329811	Merge pull request #21659 from yangDL/master pybind/ceph_argparse.py: 'timeout' must be in kwargs when calling run_in_thread Reviewed-by: Kefu Chai <kchai@redhat.com>	2018-04-30 21:48:37 +08:00
Kefu Chai	770dbae2ca	qa/suites/rados/thrash-old-clients: ms_type=simple hammer does not support async messenger, so set ms_type to "simple" for hammer client. Fixes: http://tracker.ceph.com/issues/23922 Signed-off-by: Kefu Chai <kchai@redhat.com>	2018-04-30 21:40:53 +08:00
John Spray	5596f489da	mgr/dashboard: fix linter complaints In addition to line ordering, there were a couple of bogus ones: E: 30, 0: No name 'version' in module 'distutils' (no-name-in-module) E: 30, 0: Unable to import 'distutils.version' (import-error) E: 36, 8: No name 'wsgiserver' in module 'cherrypy' (no-name-in-module) E: 36, 8: Unable to import 'cherrypy.wsgiserver.wsgiserver2' (import-error) I don't know why pylint can't see these modules, but they're definitely there, so I've added them to the ignored list in .pylintrc Signed-off-by: John Spray <john.spray@redhat.com>	2018-04-30 14:27:49 +01:00
Jason Dillaman	5d99f4e719	Merge pull request #21733 from trociny/wip-23938 qa/workunits/rbd: potential race in mirror disconnect test Reviewed-by: Jason Dillaman <dillaman@redhat.com>	2018-04-30 08:55:12 -04:00
Rishabh Dave	b14302d1fe	qa/cephfs: test if evicted client unmounts without hanging Signed-off-by: Rishabh Dave <ridave@redhat.com>	2018-04-30 12:02:56 +00:00
Rishabh Dave	18a9d0c491	qa/tasks: allow custom timeout for umount_wait() Signed-off-by: Rishabh Dave <ridave@redhat.com>	2018-04-30 12:02:56 +00:00
Rishabh Dave	0f56c7e8e5	client: don't hang when MDS sessions are evicted Currently, a filesystem client hangs if a request is made after its eviction. Prevent the client from hanging and allow a manual unmount in such cases. Fixes: http://tracker.ceph.com/issues/10915 Signed-off-by: Rishabh Dave <ridave@redhat.com>	2018-04-30 12:01:52 +00:00
John Spray	b869bfadd9	Merge pull request #21671 from jan--f/mgr-module-config-doc doc/mgr/plugins: add note about distinction between config and kv store Reviewed-by: John Spray <john.spray@redhat.com>	2018-04-30 12:42:18 +01:00
Mykola Golub	5bc1d4a51a	qa/workunits/rbd: potential race in mirror disconnect test (due to a typo in get_image_id command arg) Fixes: http://tracker.ceph.com/issues/23938 Signed-off-by: Mykola Golub <mgolub@suse.com>	2018-04-30 09:44:12 +03:00
Jos Collin	ab46bb3314	client: drop function _get_inodeno Drop _get_inodeno() as per the comment in https://github.com/ceph/ceph/pull/21554. Signed-off-by: Jos Collin <jcollin@redhat.com>	2018-04-30 10:04:04 +05:30
Patrick Donnelly	67c7e46191	client: use common interp of st_nlink for dirs Apparently some applications use this (like mail servers) and since it's trivial to support, let's do it. Idea is that st_nlinks for a directory is either 0 (it is unlinked) or 2 + the number of sub-directories (which have .. parent links). Fixes: https://tracker.ceph.com/issues/23873 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2018-04-29 20:04:43 -07:00
Sage Weil	cfe59cf20c	osd/PG: fix DeferRecovery vs AllReplicasRecovered race - DeferRecovery event queued by AsyncReserver due to preemption event. We are in Recovering state with RECOVERING bit set. - We finish recovery, clear RECOVERING state bit, and queue AllReplicasRecovered from PrimaryLogPG::start_recovery_ops() - DeferRecovery event arrives, moving us from Recovering -> NotRecovering - AllReplicasRecovered event arrives, crashing us. This is all hard to deal with because the events are queued and may arrive later. Solve the problem here by tolerating a delayed DeferRecovery event: if the RECOVERING pg state bit isn't set, ignore it (it's old). The async reserver cancel events are unpredictable. Fixes: http://tracker.ceph.com/issues/23860 Signed-off-by: Sage Weil <sage@redhat.com>	2018-04-29 16:00:41 -05:00
Patrick Donnelly	543d8a0e4c	Merge PR #21554 into master * refs/pull/21554/head: client: avoid second lock on client_lock Reviewed-by: Jos Collin <jcollin@redhat.com> Reviewed-by: Patrick Donnelly <pdonnell@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com>	2018-04-29 11:05:33 -07:00
Patrick Donnelly	5a56301945	Merge PR #21592 into master * refs/pull/21592/head: mds: filter out blacklisted clients when importing caps mds: don't add blacklisted clients to reconnect gather set mds: combine MDCache::{cap_exports,cap_export_targets} Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>	2018-04-29 11:05:27 -07:00
Patrick Donnelly	e7856ffa04	Merge PR #21593 into master * refs/pull/21593/head: mds: properly check auth subtree count in MDCache::shutdown_pass() Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>	2018-04-29 11:05:22 -07:00
Patrick Donnelly	0c11a6fcb4	Merge PR #21601 into master * refs/pull/21601/head: mds: don't discover inode/dirfrag when mds is in 'starting' state Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>	2018-04-29 11:05:16 -07:00
Patrick Donnelly	6c07c85796	Merge PR #21610 into master * refs/pull/21610/head: cephfs-journal-tool: wait prezero ops before destroying journal Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>	2018-04-29 11:05:11 -07:00
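
The st_nlink convention described in commit 67c7e46191 (0 for an unlinked directory, otherwise 2 plus the number of subdirectories) can be sketched as below. This is an illustrative Python sketch under stated assumptions, not the Ceph client code: the function name `dir_nlink` and its parameters are hypothetical and do not appear in the commit.

```python
def dir_nlink(num_subdirs: int, unlinked: bool = False) -> int:
    """Return the conventional st_nlink value for a directory.

    Hypothetical helper for illustration only; not part of Ceph.
    """
    if num_subdirs < 0:
        raise ValueError("num_subdirs must be non-negative")
    if unlinked:
        # An unlinked directory reports no remaining links.
        return 0
    # One link for the entry in the parent, one for the directory's own ".",
    # plus one ".." link from each subdirectory.
    return 2 + num_subdirs
```

For example, under this convention an empty directory reports 2 links and a directory with three subdirectories reports 5, which is what applications that count subdirectories via st_nlink rely on.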


86249 Commits