Commit Graph

91819 Commits

Author SHA1 Message Date
Ricardo Marques
fbd4d9502c mgr/dashboard: Fix ts error on iSCSI page
This error only happens until initiator is connected to the target.

Fixes: https://tracker.ceph.com/issues/36564

Signed-off-by: Ricardo Marques <rimarques@suse.com>
2018-10-29 15:08:56 +00:00
Jason Dillaman
abba091352 include/types: fixed compile warning for signed/unsigned comparison
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2018-10-29 10:48:49 -04:00
Jason Dillaman
eed3163f29 osd/PrimaryLogPG: uncommitted dup ops should respond with logged return code
Fixes: http://tracker.ceph.com/issues/36408
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2018-10-29 10:48:45 -04:00
Jason Dillaman
45b3cedb48 osd/PrimaryLogPG: propagate error return codes on object copy_get ops
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2018-10-29 10:48:41 -04:00
Jason Dillaman
881669f007 osd/PGLog: optionally record error return codes for extra_reqids
When a cache tier promotes an object with one or more error PG log
entries, these errors need to be propagated and recorded for dup
op detection.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2018-10-29 10:48:37 -04:00
Jason Dillaman
645c3122b2 osd/osd_types: include PG log return codes in object copy data
If the base tier records an error against an operation, the cache
tier currently might incorrectly respond with a success return code.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2018-10-29 10:48:32 -04:00
Sage Weil
0d166f49dc Merge PR #24686 into master
* refs/pull/24686/head:
	os/bluestore: show compress and buffered from WriteContext
	os/bluestore: fix rename race with trim on replacement onode at old name

Reviewed-by: Jianpeng Ma <jianpeng.ma@intel.com>
Reviewed-by: Igor Fedotov <ifedotov@suse.com>
2018-10-29 08:46:43 -05:00
Sage Weil
1c081dde77 mgr/diskprediction: use global device_failure_prediction_mode setting
Signed-off-by: Sage Weil <sage@redhat.com>
2018-10-29 08:44:17 -05:00
Sage Weil
6c744ea582 Merge PR #24229 into master
* refs/pull/24229/head:
	common: drop BL_BACKWARD_COMPAT in bufferlist.
	osd: OSDMap encoding uses bufferlist::contiguous_filler.
	mds: encode_xattrs() uses buffer::list::contiguous_filler.
	common: introduce contiguous_filler to optimize ENCODE_START.
	common: optimize hole appending in bufferlist.
	common: duplicate an encoding macro to suppress warnings.
	common: drop backward iteration from bufferlist.

Reviewed-by: Gregory Farnum <gfarnum@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2018-10-29 08:42:48 -05:00
Sage Weil
c40685ebdd Merge PR #24787 into master
* refs/pull/24787/head:
	Merge PR #24796 into nautilus
	osd: fix heartbeat_reset unlock
	Merge PR #24780 into nautilus
	Merge PR #24761 into nautilus
	Merge PR #24651 into nautilus
	osd: fix race between op_wq and context_queue
	test: Make sure kill_daemons failure will be easy to find
	test: Add flush_pg_stats to make test more deterministic
2018-10-29 08:36:34 -05:00
Igor Fedotov
5d38f8b49b qa/standtalone/osd-bluefs-volume-ops: remove redundant code.
Signed-off-by: Igor Fedotov <ifedotov@suse.com>
2018-10-29 16:30:36 +03:00
Lenz Grimmer
6352009a0e
Merge pull request #24763 from zmc/wip-36416
mgr/dashboard: Map dev 'releases' to master

Reviewed-by: Tiago Melo <tmelo@suse.com>
Reviewed-by: Volker Theile <vtheile@suse.com>
2018-10-29 13:07:25 +01:00
Sage Weil
bc7cfe0885 Merge PR #24796 into nautilus
* refs/pull/24796/head:
	osd: fix heartbeat_reset unlock

Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
2018-10-28 21:25:55 -05:00
Sage Weil
1a0e2f7e15 osd: fix heartbeat_reset unlock
Fixes 51d8e2457d, which moved to lock_guard
but didn't remove the unlock call on this exit path.

Signed-off-by: Sage Weil <sage@redhat.com>
2018-10-28 20:22:37 -05:00
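
The kind of bug this commit describes can be illustrated with a minimal sketch. The `HeartbeatState` struct and the `heartbeat_reset` signature below are hypothetical stand-ins, not the actual OSD code. The sketch assumes only that the function was converted to `std::lock_guard`, as the message states, and shows why an early exit path must not also unlock by hand.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for the OSD heartbeat state; names are illustrative.
struct HeartbeatState {
  std::mutex heartbeat_lock;
  bool has_session = false;
  int resets = 0;
};

// Returns true if the reset was handled, false on the early exit path.
bool heartbeat_reset(HeartbeatState& s) {
  std::lock_guard<std::mutex> l(s.heartbeat_lock);
  if (!s.has_session) {
    // Early exit: the guard's destructor releases the mutex here.
    // A leftover manual s.heartbeat_lock.unlock() on this path would
    // unlock the mutex a second time, which is undefined behavior.
    return false;
  }
  ++s.resets;
  return true;
}
```

Calling `heartbeat_reset` twice in a row on the same state is a quick sanity check: if the early exit path failed to release the lock, the second call would block.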
Sage Weil
039e29b5dd Merge PR #24774 into master
* refs/pull/24774/head:
	cmake: update dpdk drivers/modules to accommodate dpdk submodule
	spdk: update to latest v18.07

Reviewed-by: Neha Ojha <nojha@redhat.com>
2018-10-28 20:11:10 -05:00
Sage Weil
9e068b9108 Merge PR #24666 into master
* refs/pull/24666/head:
	include/types: fixed compile warning for signed/unsigned comparison
	osd/PrimaryLogPG: uncommitted dup ops should respond with logged return code
	osd/PrimaryLogPG: propagate error return codes on object copy_get ops
	osd/PGLog: optionally record error return codes for extra_reqids
	osd/osd_types: include PG log return codes in object copy data

Reviewed-by: Sage Weil <sage@redhat.com>
2018-10-28 20:08:32 -05:00
Sage Weil
708f19438c Merge PR #24688 into master
* refs/pull/24688/head:
	common: make ceph_abort store same crash info as ceph_assert
	global: store assert msg in global and dump to crash meta
	pybind/mgr: make 'ceph crash ls' output sorted list
	log: don't clear ring when dump_recent is called
	ceph-crash: make clear to user that 'posted' should be directory

Reviewed-by: Sage Weil <sage@redhat.com>
2018-10-28 09:41:31 -05:00
Sage Weil
b17398aea8 Merge PR #24780 into nautilus
* refs/pull/24780/head:
	osd: take heartbeat_lock before checking for session

Reviewed-by: Gregory Farnum <gfarnum@redhat.com>
2018-10-28 09:41:01 -05:00
Sage Weil
b21f436c2d Merge PR #24780 into master
* refs/pull/24780/head:
	osd: take heartbeat_lock before checking for session
	Merge PR #24725 into nautilus
	qa/tasks/qemu: use unique clone directory to avoid race with workunit
	mds: add missing mds_lock

Reviewed-by: Gregory Farnum <gfarnum@redhat.com>
2018-10-28 09:39:50 -05:00
Sage Weil
01f0818163 Merge PR #24778 into master
* refs/pull/24778/head:
	rpm: use %license macro for packaging license file

Reviewed-by: Nathan Cutler <ncutler@suse.com>
2018-10-28 09:39:34 -05:00
Kefu Chai
e4fcc3887a
Merge pull request #24792 from falcon78921/wip-doc-grammar1
doc: fixed typo in man page

Reviewed-by: Kefu Chai <kchai@redhat.com>
2018-10-28 22:26:49 +08:00
Kefu Chai
d72842e82e
Merge pull request #24702 from wjwithagen/wjw-fix-blkdev-serial-const
common: mark BlkDev::serial() const to match with its declaration

Reviewed-by: Kefu Chai <kchai@redhat.com>
2018-10-28 22:24:04 +08:00
James McClune
3b63679472
doc: fixed minor grammar error
Changed buchket to bucket

Signed-off-by: James McClune <jmcclune@mcclunetechnologies.net>
2018-10-28 02:02:59 -04:00
James McClune
b8317f5e1e
doc: added demo document changes section
Added a brief section about how to demo Ceph documentation
changes. 

Signed-off-by: James McClune <jmcclune@mcclunetechnologies.net>
2018-10-28 01:44:17 -04:00
Radoslaw Zarzynski
5b4bb65021 osd: slightly refactor PrimaryLogPG::do_op.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2018-10-27 08:32:51 +02:00
Kefu Chai
53e1daf177
Merge pull request #24785 from falcon78921/wip-docs-36605
doc: purge subcommand link broken

Reviewed-by: Kefu Chai <kchai@redhat.com>
2018-10-27 13:07:21 +08:00
James McClune
8d41cfd093 doc: used ceph osd command ref label
Referenced purge subcommand info via ceph osd command label.
Fixes: https://tracker.ceph.com/issues/36605

Signed-off-by: James McClune <jmcclune@mcclunetechnologies.net>
2018-10-26 23:32:06 -04:00
Sage Weil
f755bed3e4 Merge PR #24761 into nautilus
* refs/pull/24761/head:
	osd: fix race between op_wq and context_queue

Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
Reviewed-by: Jianpeng Ma <jianpeng.ma@intel.com>
2018-10-26 22:07:27 -05:00
Sage Weil
143f601155 Merge PR #24651 into nautilus
* refs/pull/24651/head:
	test: Make sure kill_daemons failure will be easy to find
	test: Add flush_pg_stats to make test more deterministic

Reviewed-by: Neha Ojha <nojha@redhat.com>
2018-10-26 21:07:09 -05:00
Xie Xingguo
19b2995378
Merge pull request #24743 from rzarzynski/wip-osd-avoid-osdmap-refcounting
core: avoid unnecessary refcounting of OSDMap on OSD's hot paths

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
2018-10-27 09:51:13 +08:00
Xie Xingguo
e6f9241aeb
Merge pull request #24657 from xiexingguo/wip-rm-device-class-fix
mon/OSDMonitor: two "ceph osd crush class rm" fixes

Reviewed-by: Sage Weil <sage@redhat.com>
2018-10-27 09:49:57 +08:00
Patrick Donnelly
6da295f3cb
Merge PR #24585 into master
* refs/pull/24585/head:
	doc: add developer documentation on new cephfs reclaim interfaces

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Zheng Yan <zyan@redhat.com>
2018-10-26 16:28:43 -07:00
Yuri Weinstein
64db174368
Merge pull request #21094 from guzhongyan/dne-cleanup
src: no 'dne' acronym in user cmd output

Reviewed-by: Neha Ojha <nojha@redhat.com>
2018-10-26 15:25:04 -04:00
Yuri Weinstein
135edcc55a
Merge pull request #22633 from dongbula/add-dbstatistics-for-filestore
OSD: add impl for filestore to get dbstatistics

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2018-10-26 15:24:09 -04:00
Yuri Weinstein
1998c7a542
Merge pull request #23528 from jcsp/wip-osdmon-application
mon: don't commit osdmap on no-op application ops

Reviewed-by: Joao Eduardo Luis <joao@suse.de>
Reviewed-by: Greg Farnum <gfarnum@redhat.com>
2018-10-26 15:23:18 -04:00
Zack Cerza
2887f8e973 mgr/dashboard: Map dev 'releases' to master
In CephReleaseNamePipe, we used to blindly return the "release name" portion of
the version string. This ends up e.g. returning 'nautilus' for master right
now, which causes us to link to nonexistent documentation on ceph.com.  This
change causes builds marked as 'dev' (as opposed to 'stable') to report
'master' as their release name.

Fixes: https://tracker.ceph.com/issues/36416

Signed-off-by: Zack Cerza <zack@redhat.com>
2018-10-26 11:22:53 -06:00
J. Eric Ivancich
4891ae5931 rgw: recover from incomplete reshard attempt
In case a reshard attempt is left in an incomplete state, i.e., flags
still show resharding even though the bucket reshard lock isn't being
held, try to recover by taking the bucket reshard lock and clearing
flags associated with resharding.

This change requires access to an RGWBucketInfo object, so callers of
this function should provide it to prevent unnecessary work. Changes
were made to provide this object.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
2018-10-26 12:17:54 -04:00
Sage Weil
51d8e2457d osd: take heartbeat_lock before checking for session
When we open a connection, there is a short window before we attach
the session.  If a fault happens quickly, we won't get the reset, and
will persistently fail to send osd pings.

Move the lock up to avoid this.  Note that we should rarely really see
connections without sessions here anyway (except when this specific
race happens), so this should have no negative impact (by taking the lock
when we weren't before).

Fixes: http://tracker.ceph.com/issues/36602
Signed-off-by: Sage Weil <sage@redhat.com>
2018-10-26 10:39:28 -05:00
J. Eric Ivancich
18ab99cd54 rgw: move RGWReshardBucket lock to its own separate class
There are other processes beyond resharding that would need to take a
bucket reshard lock (e.g., correcting bucket resharding flags in event
of crash, tools to remove bucket shard information from earlier
versions of ceph). Pulling this logic outside of RGWReshardBucket
allows this code to be re-used.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
2018-10-26 11:38:55 -04:00
Alfredo Deza
e828c56d08
Merge pull request #24504 from ErwanAliasr1/evelu-ceph-volume-choose_disk
Additional work on ceph-volume to add some choose_disk capabilities

Reviewed-by: Alfredo Deza <adeza@redhat.com>
2018-10-26 11:32:37 -04:00
Sage Weil
d76789444c osd: fix race between op_wq and context_queue
ThreadA                                 ThreadB
  sdata->shard_lock.Lock();
  if (sdata->pqueue->empty() &&
      !(is_smallest_thread_index && !sdata->context_queue.empty())) {

                                        void queue(list<Context *>& ls) {
                                          bool empty = false;
                                          {
                                            std::scoped_lock l(q_mutex);
                                            if (q.empty()) {
                                              q.swap(ls);
                                              empty = true;
                                            } else {
                                              q.insert(q.end(), ls.begin(), ls.end());
                                            }
                                          }

                                          if (empty) {
                                            mutex.Lock();
                                            cond.Signal();
                                            mutex.Unlock();
                                          }
                                        }

    sdata->sdata_wait_lock.Lock();
    if (!sdata->stop_waiting) {

Fix by simply rechecking that context_queue is empty after taking the
wait lock.  We still check it without taking that lock to keep the hot/busy
path fast (we avoid the wait lock in general) at the expense of taking
the context_queue qlock twice in the idle/wait path (where we don't care
so much about additional latency/cycles).

Fixes: http://tracker.ceph.com/issues/36473
Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
Signed-off-by: Sage Weil <sage@redhat.com>
2018-10-26 10:27:09 -05:00
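
The recheck-under-the-wait-lock pattern described in this commit can be sketched as follows. This is a simplified illustration, not the OSD code: `Shard`, `queue_contexts`, `queue_empty`, and `wait_for_work` are hypothetical names, plain `int` values stand in for `Context` pointers, and the standard library primitives replace Ceph's own `Mutex` and `Cond`.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <list>
#include <mutex>

// Hypothetical, simplified shard; ints stand in for Context pointers.
struct Shard {
  std::mutex q_mutex;            // protects q (the context queue)
  std::list<int> q;
  std::mutex wait_mutex;         // plays the role of sdata_wait_lock
  std::condition_variable cond;
};

// Producer side, mirroring queue() from the commit message.
void queue_contexts(Shard& s, std::list<int>& ls) {
  bool was_empty = false;
  {
    std::lock_guard<std::mutex> l(s.q_mutex);
    if (s.q.empty()) {
      s.q.swap(ls);
      was_empty = true;
    } else {
      s.q.insert(s.q.end(), ls.begin(), ls.end());
    }
  }
  if (was_empty) {
    // The signal is sent while holding the wait lock, so it cannot land
    // between a waiter's recheck and the start of its wait.
    std::lock_guard<std::mutex> l(s.wait_mutex);
    s.cond.notify_one();
  }
}

bool queue_empty(Shard& s) {
  std::lock_guard<std::mutex> l(s.q_mutex);
  return s.q.empty();
}

// Consumer side: returns true if work is available, false on timeout.
bool wait_for_work(Shard& s, std::chrono::milliseconds timeout) {
  // Hot path: check without the wait lock to keep the busy case cheap.
  if (!queue_empty(s)) return true;

  std::unique_lock<std::mutex> wl(s.wait_mutex);
  // Recheck after taking the wait lock. Without this, an item queued
  // (and signalled) between the first check and this point would be missed.
  if (!queue_empty(s)) return true;

  // The predicate also guards against spurious wakeups.
  return s.cond.wait_for(wl, timeout, [&s] { return !queue_empty(s); });
}
```

The cost is the one named in the commit message: the idle path takes the queue lock twice, while the busy path never touches the wait lock.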
J. Eric Ivancich
4577801271 rgw: failed resharding clears resharding status from shard heads
Previously, when resharding failed, we restored the shard status on
the bucket info object. However the status on each of the shards was
left indicating a reshard was underway. This prevented some write
operations from taking place, as they would wait for resharding to
complete. This adds the missing functionality. It also makes the
functionality available to other classes via static functions in
RGWBucketReshard.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
2018-10-26 11:19:22 -04:00
J. Eric Ivancich
bc0a5ff952 rgw: change the bucket reshard lock to exclusive-ephemeral
The bucket reshard lock was simply an exclusive lock that existed on
an object solely for the purpose of representing the lock. This is now
changed to an exclusive-ephemeral lock, so as not to leave these objects
behind.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
2018-10-26 11:19:22 -04:00
J. Eric Ivancich
a289f2d865 cls: add exclusive ephemeral locks that auto-clean
Add a new type of cls lock -- exclusive ephemeral for which the
object only exists to represent the lock and for which the object
should be deleted at unlock. This is to prevent the accumulation of
unneeded objects in the cluster by automatically cleaning them up.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
2018-10-26 11:19:22 -04:00
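
The semantics described here can be modelled with a small in-memory sketch. It is not the cls lock API: `ObjectStore`, `LockType`, and the method names are hypothetical, chosen only to show the difference between a plain exclusive lock (the object survives unlock) and an exclusive ephemeral lock (the object is deleted at unlock).

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical lock types; only the two relevant to this commit are modelled.
enum class LockType { Exclusive, ExclusiveEphemeral };

struct LockInfo {
  std::string owner;
  LockType type;
};

// Hypothetical in-memory model of objects and the locks held on them.
class ObjectStore {
 public:
  // Takes the lock, creating the object if it does not exist yet.
  bool lock(const std::string& name, const std::string& owner, LockType type) {
    if (locks_.count(name) != 0) return false;  // already locked
    objects_.insert(name);
    locks_[name] = LockInfo{owner, type};
    return true;
  }

  // Releases the lock; an ephemeral lock also removes its object.
  bool unlock(const std::string& name, const std::string& owner) {
    auto it = locks_.find(name);
    if (it == locks_.end() || it->second.owner != owner) return false;
    assert(objects_.count(name) == 1);  // a locked object always exists
    const LockType type = it->second.type;
    locks_.erase(it);
    if (type == LockType::ExclusiveEphemeral) {
      objects_.erase(name);  // auto-clean: nothing is left behind
    }
    return true;
  }

  bool exists(const std::string& name) const {
    return objects_.count(name) != 0;
  }

 private:
  std::set<std::string> objects_;
  std::map<std::string, LockInfo> locks_;
};
```

In this model, locking and unlocking a name with `LockType::ExclusiveEphemeral` leaves the store as it was before, whereas `LockType::Exclusive` leaves the lock object behind after unlock.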
Sage Weil
72daf28986 Merge PR #24703 into master
* refs/pull/24703/head:
	common: prefer std::size to ARRAY_SIZE macro

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2018-10-26 09:18:50 -05:00
Sage Weil
c98bd0b026 Merge PR #24712 into master
* refs/pull/24712/head:
	mgr/DaemonServer: "osd safe-to-destroy" - more verbose output

Reviewed-by: Sage Weil <sage@redhat.com>
2018-10-26 09:18:33 -05:00
Sage Weil
80bb0664d6 Merge PR #24713 into master
* refs/pull/24713/head:
	mon: drop repeated 'goodchars' and add osd crush ls testcase

Reviewed-by: João Eduardo Luís <joao@suse.de>
2018-10-26 09:18:13 -05:00
Sage Weil
1227bf5f6d Merge PR #24724 into master
* refs/pull/24724/head:
	test/librados/aio: remove unused callbacks
	test/librados/aio: wait for all completions properly

Reviewed-by: Kefu Chai <kchai@redhat.com>
2018-10-26 09:17:54 -05:00
Sage Weil
dc9de8ce02 Merge PR #24736 into master
* refs/pull/24736/head:
	os/tests: should read size_t options using get_val<Option::size_t>()

Reviewed-by: Sage Weil <sage@redhat.com>
2018-10-26 09:17:23 -05:00
John Spray
c5fd31dfcc
Merge pull request #24767 from votdev/issue_36581
mgr/dashboard/qa: Fix various vstart_runner.py issues

Reviewed-by: Tatjana Dehler <tdehler@suse.com>
Reviewed-by: John Spray <john.spray@redhat.com>
2018-10-26 15:09:59 +01:00