Commit Graph

90543 Commits

Author SHA1 Message Date
Alfredo Deza
7e52bc559b ceph-volume lvm.batch.bluestore consume --block-db-size
Signed-off-by: Alfredo Deza <adeza@redhat.com>
2018-09-20 11:02:13 -04:00
Sage Weil
1bf449cdba Merge PR #23415 into master
* refs/pull/23415/head:
	msgr/async: huge refactoring of protocol V1
	msgr/async: fix forward declaration of DispatchQueue

Reviewed-by: Sage Weil <sage@redhat.com>
2018-09-20 09:30:41 -05:00
Xie Xingguo
a3e58900bb
Merge pull request #24175 from xiaomanh/master
doc: Fix Spelling Error In File dynamicresharding.rst

Reviewed-by: Kefu Chai <kchai@redhat.com>
2018-09-20 22:03:14 +08:00
Sage Weil
fe29c6ff52 osd: simplify init of fabricated pg
This was similar (but different) to the logic in PG::merge_from().  Do not
do any initialization here, and instead rely on merge_from() to do the
right thing.

Signed-off-by: Sage Weil <sage@redhat.com>
2018-09-20 08:35:53 -05:00
Volker Theile
859b40c8c7 mgr/dashboard: Increase usability of role management by enabling the user to check/uncheck all rows/columns of permissions
Fixes: https://tracker.ceph.com/issues/35695

Signed-off-by: Volker Theile <vtheile@suse.com>
2018-09-20 15:35:16 +02:00
Sage Weil
e7f4291fe7 osd/PG: inherit pg history from merge source, if necessary
Having an accurate(ish) same_interval_since is important for making sure
any subsequent PastIntervals we add are consistent with the
last_epoch_clean value that the bounds are tested against.  Otherwise we
might have lec 100 and merge in 150; an interval change then gives us a pi of
[150,something) and we fail the bounds check.

Signed-off-by: Sage Weil <sage@redhat.com>
2018-09-20 08:35:15 -05:00
Sage Weil
2bfa241f44 osd/osd_types: increasing pg_num_pending is also an interval change
If we move into the premerge period (pg_num_pending < pg_num), that is a
new interval.  Moving out again (canceling the merge) is also a new
interval.

Signed-off-by: Sage Weil <sage@redhat.com>
2018-09-20 08:35:15 -05:00
Sage Weil
b8ac4361be osd: cancel pg merge if PGs are undersized
If the PG is undersized, cancel the PG merge attempt early.  Undersized is
a bad thing because it makes merge more dangerous.

It's also bad because the PG won't be fully clean when it finishes
peering, which means last_epoch_clean can be something far in the past,
and past_intervals won't be empty.  Since we also take the past_intervals
from the source PG, we want to be confident that it is valid.  It *should*
match up with the target PG since they should have mapped to the same
OSDs since they were both clean at the ReadyToMerge point--in fact, they
should both be empty.  If a PG mapping change snuck in such that they did
map somewhere else, though, the same set of mapping changes will have
applied to both the source and target, so it should be safe.

(It would be better if the mon rejected the ReadyToMerge if the
mapping with the latest OSDMap has changed since the message was sent.
If we do that the situation is even better, but this change is still
appropriate.)

Signed-off-by: Sage Weil <sage@redhat.com>
2018-09-20 08:35:15 -05:00
Sage Weil
6c6cf6360c mon/OSDMonitor: handle ready_to_merge message that cancels the merge
Signed-off-by: Sage Weil <sage@redhat.com>
2018-09-20 08:35:15 -05:00
Sage Weil
5b945583ef osd/PG: only signal ready_to_merge if we have all replicas
Only signal we are ready to merge if all replicas are in good shape.  If
they aren't, do nothing (yet).

Signed-off-by: Sage Weil <sage@redhat.com>
2018-09-20 08:35:15 -05:00
Sage Weil
4d1b78704a osd/PG: move all mark_clean-ish activity into try_mark_clean()
Keep it all in one place (try_mark_clean()).  The key behavioral change
is that we update last_epoch_clean and last_epoch_started when we are
peered too, not only when we are active.

Signed-off-by: Sage Weil <sage@redhat.com>
2018-09-20 08:35:15 -05:00
Sage Weil
15701e8c95 osd/PG: use last_epoch_clean from ReadyToMerge point in time for fabricated history
If we are fabricating the pg history values, we need something that is
reasonably valid, but that won't screw up peering of the PG by indicating
that the PG has peered at some point later than when it really has.
Otherwise we can end up in a situation where everyone thinks there is a
newer pg info out there that doesn't actually exist, and the PG will end
up as incomplete.

Signed-off-by: Sage Weil <sage@redhat.com>
2018-09-20 08:35:15 -05:00
Sage Weil
142317b2ac osd: send last_epoch_clean when indicating PG is ready to merge
The mon can put this in the pg_pool_t.

Signed-off-by: Sage Weil <sage@redhat.com>
2018-09-20 08:35:15 -05:00
Sage Weil
8af0ee92fa osd/osd_types: rename pg_num_pending_dec_epoch -> pg_num_dec_last_epoch_clean
Change the content of this field to be the last_epoch_clean for the PG
when it tells the mon it is ready to be merged.

This will later be used to populate the last_epoch_clean and
last_epoch_started fields in PG::merge_from() in certain corner cases.

Signed-off-by: Sage Weil <sage@redhat.com>
2018-09-20 08:35:15 -05:00
Sage Weil
f302558de4 common: Lock -> lock, Unlock -> unlock, TryLock -> try_lock
Use the Lockable convention.

Signed-off-by: Sage Weil <sage@redhat.com>
2018-09-20 08:11:36 -05:00
Sage Weil
b5c9948e38 common: Mutex::Locker -> std::lock_guard<Mutex>
Signed-off-by: Sage Weil <sage@redhat.com>
2018-09-20 08:11:36 -05:00
Sage Weil
ea9a72512d common/Mutex: typedef lock_guard<Mutex> Locker
Signed-off-by: Sage Weil <sage@redhat.com>
2018-09-20 08:11:36 -05:00
Sage Weil
145a205c7a common/Mutex: Lock -> lock, Unlock -> unlock
This aligns us with the Lockable concept, which means we can use
lock_guard<>, unique_lock<>, etc.

Signed-off-by: Sage Weil <sage@redhat.com>
2018-09-20 08:11:35 -05:00
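Note on 145a205c7a: a minimal sketch of why the lower-case names matter. `DemoMutex` is a hypothetical stand-in, not Ceph's actual `Mutex` class; the helper functions are also illustrative. Once a type exposes `lock()`, `unlock()`, and `try_lock()`, the standard guards can drive it directly.

```cpp
#include <mutex>

// Hypothetical stand-in for a project mutex wrapper; not Ceph's Mutex.
class DemoMutex {
public:
  // lock/unlock/try_lock are the member names the standard Lockable
  // requirements expect.
  void lock() { m_.lock(); locked_ = true; }
  void unlock() { locked_ = false; m_.unlock(); }
  bool try_lock() {
    if (!m_.try_lock()) return false;
    locked_ = true;
    return true;
  }
  // Debug helper; only meaningful when no other thread is using the mutex.
  bool is_locked() const { return locked_; }

private:
  std::mutex m_;
  bool locked_ = false;
};

// Increments a counter under the mutex using std::lock_guard.
int guarded_increment(DemoMutex& m, int& counter) {
  std::lock_guard<DemoMutex> guard(m);  // m.lock() now, m.unlock() at scope exit
  return ++counter;
}

// Takes the mutex without blocking using std::unique_lock.
bool try_increment(DemoMutex& m, int& counter) {
  std::unique_lock<DemoMutex> guard(m, std::try_to_lock);  // calls m.try_lock()
  if (!guard.owns_lock()) return false;
  ++counter;
  return true;
}
```

With the old `Lock()`/`Unlock()` spelling, neither guard would compile against the type, which is presumably why a project-specific `Locker` helper was needed before.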
Sage Weil
03b519114e common/Mutex: kill mutex_perf_counter
This has a measurable overhead even when turned off, and we do not use it.

Signed-off-by: Sage Weil <sage@redhat.com>
2018-09-20 08:11:35 -05:00
Casey Bodley
52054cedc5
Merge pull request #23429 from joke-lee/wip-website-rule-num-limit
rgw: website routing rules num limit

Reviewed-by: Matt Benjamin <mbenjami@redhat.com>
2018-09-20 08:50:06 -04:00
Ricardo Marques
7871dc4ee5
Merge pull request #24028 from votdev/fix_rgw_status
mgr/dashboard: Catch LookupError when checking the RGW status

Reviewed-by: Patrick Nawracay <pnawracay@suse.com>
Reviewed-by: Sebastian Wagner <sebastian.wagner@suse.com>
2018-09-20 13:44:24 +01:00
John Spray
c25b26c8a5 doc: remove Calamari content from ceph-deploy
Signed-off-by: John Spray <john.spray@redhat.com>
2018-09-20 13:13:04 +01:00
Jason Dillaman
98e5354d95
Merge pull request #23823 from dillaman/wip-namespace-osd-check
librbd: prevent use of namespaces on pre-nautilus OSDs

Reviewed-by: Mykola Golub <mgolub@suse.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2018-09-20 08:07:21 -04:00
Jason Dillaman
fc809e6969
Merge pull request #22579 from dillaman/wip-pybind-rados
pybind/rados: new methods for manipulating self-managed snapshots

Reviewed-by: Mykola Golub <mgolub@suse.com>
2018-09-20 08:06:32 -04:00
xie xingguo
3654d56985 osd/PrimaryLogPG: fix potential pg-log overtrimming
In https://github.com/ceph/ceph/pull/21580 I set a trap to catch some weird
and random segmentation faults, and in a recent QA run I observed that it was
successfully triggered by one of the test cases; see:

```
http://qa-proxy.ceph.com/teuthology/xxg-2018-07-30_05:25:06-rados-wip-hb-peers-distro-basic-smithi/2837916/teuthology.log
```

The root cause is that there might be holes in the log versions, so the
approx_size() method should (almost) always overestimate the actual number of log entries.
As a result, we risk overtrimming log entries.

https://github.com/ceph/ceph/pull/18338 reveals a probably easier way
to fix the above problem, but unfortunately it can also cause a big performance regression,
hence this PR.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2018-09-20 17:17:49 +08:00
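Note on 3654d56985: a small sketch of how holes in the version sequence can lead to overtrimming. It assumes the size estimate is derived from the version range (head minus tail); `DemoLog` and `entries_to_trim` are hypothetical names for illustration, not the actual pg-log code.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified log: only the entry versions are modeled.
struct DemoLog {
  std::uint64_t tail = 0;               // version just before the oldest entry
  std::vector<std::uint64_t> versions;  // strictly increasing, may have holes

  std::uint64_t head() const {
    return versions.empty() ? tail : versions.back();
  }

  // Estimate from the version range; holes are counted as if they were entries.
  std::uint64_t approx_size() const { return head() - tail; }

  // Exact number of entries actually stored.
  std::uint64_t actual_size() const { return versions.size(); }
};

// Number of entries to trim so that at most max_entries remain,
// given some size figure (estimated or exact).
std::uint64_t entries_to_trim(std::uint64_t size, std::uint64_t max_entries) {
  return size > max_entries ? size - max_entries : 0;
}
```

For a log holding versions 1, 2, 5, 9, 10 and a limit of 6 entries, the range-based estimate is 10 and asks for 4 entries to be trimmed, while the exact count of 5 asks for none.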
Kefu Chai
f03dd73df2
Merge pull request #22739 from majianpeng/osd-shardthread-do-bluestore-oncommits
os/bluestore: make osd shard-thread do oncommits

Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2018-09-20 16:47:07 +08:00
Jianpeng Ma
6c583fe756 osd/OSD: choose a fixed thread to do the oncommit callbacks
Currently the bluestore oncommit callbacks are executed by the osd op threads.
If a shard has multiple threads, the callbacks can run out of order.
For example, with threads_per_shard=2:
              Thread1                                 Thread2
    swap_oncommits(op1_oncommit)
                                            swap_oncommits(op2_oncommit)
    OpQueueItem.run(Op3)
                                            op2_oncommit.complete();
    op1_oncommit.complete()

This makes the oncommits run out of order.
To avoid this, we choose a fixed thread, the one with the smallest
thread_index in the shard, to run the oncommit callbacks.

Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
2018-09-20 22:10:20 +08:00
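Note on 6c583fe756: one way the "smallest thread_index of the shard" rule could be expressed. This sketch assumes threads are assigned to shards round-robin (thread i serves shard i % num_shards); that assignment and all function names here are assumptions for illustration, not taken from the commit.

```cpp
#include <cstddef>

// Assumption: worker threads are assigned to shards round-robin.
std::size_t shard_of(std::size_t thread_index, std::size_t num_shards) {
  return thread_index % num_shards;
}

// Under that assignment, the smallest thread_index serving a shard is the
// one below num_shards. Only that thread runs the shard's oncommit callbacks.
bool runs_oncommits(std::size_t thread_index, std::size_t num_shards) {
  return thread_index < num_shards;
}

// Counts how many threads of one shard are allowed to run oncommits.
std::size_t oncommit_threads_in_shard(std::size_t shard,
                                      std::size_t num_shards,
                                      std::size_t threads_per_shard) {
  std::size_t count = 0;
  for (std::size_t t = 0; t < num_shards * threads_per_shard; ++t) {
    if (shard_of(t, num_shards) == shard && runs_oncommits(t, num_shards)) {
      ++count;
    }
  }
  return count;
}
```

Because exactly one thread per shard completes the callbacks, they complete in the order in which that thread picked them up, regardless of what the other threads in the shard are doing.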
Jianpeng Ma
754f3aa445 common/Finisher: only wake up the waiter when the queue is empty
Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
2018-09-20 21:53:47 +08:00
Jianpeng Ma
d7ca34e12f common/Finisher: only wake up the waiter when the queue is empty
Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
2018-09-20 21:52:53 +08:00
Xie Xingguo
03abaf98e3
Merge pull request #24004 from xiexingguo/wip-yet-more-async-fixes
osd/PG: async-recovery should respect historical missing objects

Reviewed-by: Yan Jun <yan.jun8@zte.com.cn>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2018-09-20 14:49:31 +08:00
Mahati Chamarthy
4b848c205a src/include: fix unused parameter
Fixes an unused parameter.

Signed-off-by: Mahati Chamarthy <mahati.chamarthy@intel.com>
2018-09-20 10:04:22 +05:30
Kefu Chai
a778cc701f
Merge pull request #24130 from tchaikov/wip-gcc-7.3
rpm: bump up required GCC version to 7.3.1

Reviewed-by: Casey Bodley <cbodley@redhat.com>
2018-09-20 12:25:57 +08:00
yuliyang
3275dffa45 rgw: website routing rules num limit
According to AWS S3, a website configuration can
have up to 50 routing rules.

Signed-off-by: yuliyang <yuliyang@cmss.chinamobile.com>
2018-09-20 08:50:32 +08:00
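Note on 3275dffa45: a minimal sketch of the kind of check the message describes. `DemoRoutingRule`, `kMaxRoutingRules`, and `routing_rules_within_limit` are hypothetical names; the actual rgw types and the way the limit is configured are not shown in the commit message.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified routing rule; not the rgw data structure.
struct DemoRoutingRule {
  std::string key_prefix;
  std::string redirect_host;
};

// Limit quoted in the commit message (50 rules, following AWS S3).
constexpr std::size_t kMaxRoutingRules = 50;

// Returns true when the rule set is small enough to be accepted.
bool routing_rules_within_limit(const std::vector<DemoRoutingRule>& rules,
                                std::size_t max_rules = kMaxRoutingRules) {
  return rules.size() <= max_rules;
}
```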
Neha Ojha
38ef3da8d2 qa: install build dependencies for cfuse_workunit_kernel_untar_build.yaml
Fixes: https://tracker.ceph.com/issues/36076
Signed-off-by: Neha Ojha <nojha@redhat.com>
2018-09-19 15:22:16 -07:00
Alfredo Deza
0a16dbd7a8 ceph-volume lvm.batch add sizing flags for journal and block.db LVs
Signed-off-by: Alfredo Deza <adeza@redhat.com>
2018-09-19 17:08:16 -04:00
Ricardo Marques
20cfc0212a
Merge pull request #23779 from Devp00l/wip-table-actions-component
mgr/dashboard: Add table actions component

Reviewed-by: Ricardo Marques <rimarques@suse.com>
Reviewed-by: Tiago Melo <tmelo@suse.com>
Reviewed-by: Volker Theile <vtheile@suse.com>
2018-09-19 20:51:41 +01:00
Jason Dillaman
e8eee15518 librbd: use the correct error code when the exclusive lock isn't locked
If the client is currently blacklisted, use -EBLACKLISTED, otherwise
use -EROFS.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2018-09-19 15:03:02 -04:00
Jason Dillaman
a84fbb2565 librbd: helper to retrieve the correct error code for read-only op
When the exclusive lock is unlocked, the error code should be
-EBLACKLISTED when the client is blacklisted, otherwise -EROFS.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2018-09-19 14:52:48 -04:00
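Note on a84fbb2565: the selection rule from the message, written out as a tiny function. `read_only_op_error` is a hypothetical name, and the `EBLACKLISTED` definition below is a local stand-in (assumed here to alias `ESHUTDOWN`) so the sketch compiles outside the Ceph tree.

```cpp
#include <cerrno>

// Assumption: local stand-in for the project's EBLACKLISTED constant.
#ifndef EBLACKLISTED
#define EBLACKLISTED ESHUTDOWN
#endif

// Error code for an operation rejected because the exclusive lock is not
// held: -EBLACKLISTED if the client is blacklisted, otherwise -EROFS.
int read_only_op_error(bool client_blacklisted) {
  return client_blacklisted ? -EBLACKLISTED : -EROFS;
}
```

Keeping the choice in one helper means every read-only failure path reports the same code for the same client state.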
Jason Dillaman
2057d99f45 librbd: reacquire lock should properly handle failed watcher
If the watch has been lost, assume the lock has been lost but attempt
to reacquire it if and when the watch is re-established.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2018-09-19 14:52:48 -04:00
Jason Dillaman
b4fc7b520a librbd: fix improper indentation of 'ceph_assert' statements
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2018-09-19 14:52:48 -04:00
Jason Dillaman
60064f68f5 librbd: assume lock is unlocked if blacklisted or object deleted
This will ensure that it's possible to potentially re-acquire the
lock should the blacklist expire before the image is closed.

Fixes: http://tracker.ceph.com/issues/34534
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2018-09-19 14:52:48 -04:00
Jason Dillaman
9ea94f2840 librbd: watcher should internally track blacklisted state
Since it will periodically attempt to re-acquire the watch,
it will know when the RADOS client has been blacklisted and
when the blacklist has been removed.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2018-09-19 14:52:48 -04:00
Jason Dillaman
23b7447f6b librbd: attempt to recover lost image watcher upon all failures
For example, if an image is blacklisted and the blacklist eventually
expires, the image should recover its watch.

Fixes: http://tracker.ceph.com/issues/34534
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2018-09-19 14:52:48 -04:00
Sage Weil
41c67ab1be
Merge pull request #24160 from jdurgin/wip-rados-lead
doc/dev/index: update rados lead

Reviewed-by: Sage Weil <sage@redhat.com>
2018-09-19 13:45:32 -05:00
Jason Dillaman
bbdc545ec1 rbd-mirror: instantiate the status formatter before changing state
This will avoid a possible race between pre-queued status updates
firing between the time the state has been changed and the formatter
has been instantiated.

Fixes: http://tracker.ceph.com/issues/36084
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2018-09-19 14:27:24 -04:00
Mykola Golub
ce6e35a81e
Merge pull request #23662 from dillaman/wip-24412
librbd: support v2 cloning across namespaces

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Mykola Golub <mgolub@suse.com>
Reviewed-by: songweibin <song.weibin@zte.com.cn>
2018-09-19 21:20:58 +03:00
Casey Bodley
bd6d3f61e1
Merge pull request #24059 from cbodley/wip-rgw-opstate-rm
rgw, cls: remove cls_statelog and rgw opstate tracking

Reviewed-by: Yehuda Sadeh <yehuda@redhat.com>
2018-09-19 14:01:04 -04:00
Yuri Weinstein
eaca188733 qa/tests: removed knfs suite
Fixes: http://tracker.ceph.com/issues/36075
Signed-off-by: Yuri Weinstein <yweinste@redhat.com>
2018-09-19 09:36:34 -07:00
Sage Weil
fe14970768 Merge PR #24143 into master
* refs/pull/24143/head:
	qa/workunits/cephtool/test_kvstore_tool.sh: run test in ., not /tmp

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2018-09-19 10:41:04 -05:00
Stephan Müller
62b85cc9a2 mgr/dashboard: Use table actions component for roles
Signed-off-by: Stephan Müller <smueller@suse.com>
2018-09-19 17:17:01 +02:00