Commit Graph

69398 Commits

Author SHA1 Message Date
John Spray
6cf9c2956c qa: add TestStrays.test_purge_queue_op_rate
For ensuring that the PurgeQueue code is not generating
too many extra IOs.

Signed-off-by: John Spray <john.spray@redhat.com>
2017-03-08 10:27:02 +00:00
John Spray
207846f89f mds: write_head when reading in PurgeQueue
Previously write_head calls were only generated
on the write side, so if you had a big queue
and were just working through consuming it, you
wouldn't record your progress, and on a daemon
restart would end up repeating a load of work.

Signed-off-by: John Spray <john.spray@redhat.com>
2017-03-08 10:27:02 +00:00
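The effect described in the write_head commit above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyQueue`, `head_interval`, and `replayed_after_restart` are hypothetical names, and the "persist every N entries" rule stands in for whatever condition Journaler actually applies, which this log does not spell out.

```cpp
#include <cstddef>
#include <cstdint>

// Toy stand-in for a read/write journal consumer (not Ceph's Journaler).
struct ToyQueue {
  std::uint64_t read_pos = 0;       // in-memory consume position
  std::uint64_t persisted_pos = 0;  // position a restarted daemon would resume from
  std::uint64_t head_interval = 4;  // hypothetical: persist the head every 4 entries
  std::size_t head_writes = 0;      // number of head writes issued so far

  // Assumed condition: enough progress has accumulated since the last head write.
  bool write_head_needed() const {
    return read_pos - persisted_pos >= head_interval;
  }

  // Record the current consume position as durable progress.
  void write_head() {
    persisted_pos = read_pos;
    ++head_writes;
  }

  // Consume one entry and, on the read side, persist progress when needed.
  void consume_one() {
    ++read_pos;
    if (write_head_needed()) {
      write_head();
    }
  }
};

// Entries that would be processed a second time if the daemon restarted now.
inline std::uint64_t replayed_after_restart(const ToyQueue& q) {
  return q.read_pos - q.persisted_pos;
}
```

Without the check in `consume_one`, `persisted_pos` would never advance while the queue is only being drained, so a restart would repeat every entry consumed so far. With it, the repeated work is bounded by `head_interval`.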
John Spray
92bf85efd4 osdc: expose Journaler::write_head_needed
So that callers on the read side can optionally
do their own write_head calls according to
the same condition that Journaler uses
internally for its write_head during _flush().

Signed-off-by: John Spray <john.spray@redhat.com>
2017-03-08 10:27:01 +00:00
John Spray
0933f61d82 mds: remove unnecessary flush() from PurgeQueue
We can drive all flushing from the read side.

Signed-off-by: John Spray <john.spray@redhat.com>
2017-03-08 10:27:01 +00:00
John Spray
335bdc126d mds: update for removing Timer from Journaler
Signed-off-by: John Spray <john.spray@redhat.com>
2017-03-08 10:27:01 +00:00
John Spray
8d4f6b92cb osdc: less aggressive prefetch in read/write Journaler
Previously, if doing a write/is_readable/write/is_readable sequence,
you'd end up doing a flush after every write, even though there
was already a flush in flight that would advance the readable-ness
of the journal.

Because this flush-during-read path is only active when using
a read/write journal such as in PurgeQueue, tweak the behaviour
to suit this case.

Signed-off-by: John Spray <john.spray@redhat.com>
2017-03-08 10:27:00 +00:00
John Spray
199e5b465a osdc: remove Journaler "journaler_batch_*" settings
This was an unused code path.  If anyone set a nonzero
value here the MDS would crash because the Timer implementation
has changed since this code was written, and now requires
add_event_after callers to hold the right lock.

Signed-off-by: John Spray <john.spray@redhat.com>
2017-03-08 10:27:00 +00:00
John Spray
6d59f15e12 mds: add error handling in PurgeQueue
For decode errors, and for Journaler errors.
Both are considered damage to the MDS rank, as
with other per-rank data structures.

Signed-off-by: John Spray <john.spray@redhat.com>
2017-03-08 10:27:00 +00:00
John Spray
0952ce9910 mds: expose progress during PurgeQueue drain
We don't track an item count, but we do have
a number of bytes left in the Journaler, so
can use that to give an indication of progress
while the MDS rank shutdown is waiting for
the PurgeQueue to do its thing.

Also lift the ops limit on the PurgeQueue
when it goes into the drain phase.

Signed-off-by: John Spray <john.spray@redhat.com>
2017-03-08 10:26:59 +00:00
John Spray
4427aed62b mds: update PurgeQueue for single-ack OSD change
Signed-off-by: John Spray <john.spray@redhat.com>
2017-03-08 10:26:59 +00:00
John Spray
3e66de2182 mds: create purge queue if it's not found
Signed-off-by: John Spray <john.spray@redhat.com>
2017-03-08 10:26:59 +00:00
John Spray
f826c7e8aa qa/cephfs: add TestStrays.test_purge_on_shutdown
...and change test_migration_on_shutdown to
specifically target non-purgeable strays (i.e.
hardlink-ish things).

Signed-off-by: John Spray <john.spray@redhat.com>
2017-03-08 10:26:55 +00:00
John Spray
0c9a69a8d8 mds: wait for purgequeue on rank shutdown
Also, move shutdown_pass call from dispatch
to tick, so that it doesn't rely on incoming
messages to make progress.

Signed-off-by: John Spray <john.spray@redhat.com>
2017-03-08 10:20:59 +00:00
John Spray
3970502c9b qa: update test_strays for purgequeue
Signed-off-by: John Spray <john.spray@redhat.com>
2017-03-08 10:20:59 +00:00
John Spray
f2fb1874ca mds: implement PurgeQueue throttling
Signed-off-by: John Spray <john.spray@redhat.com>
2017-03-08 10:20:59 +00:00
John Spray
d96d0b0aa0 mds: add stats to PurgeQueue
Signed-off-by: John Spray <john.spray@redhat.com>
2017-03-08 10:20:58 +00:00
John Spray
ed4c7cbea8 mds: move dir purge and truncate into purgequeue
Signed-off-by: John Spray <john.spray@redhat.com>
2017-03-08 10:20:58 +00:00
John Spray
8a1b3e1b2d mds: move throttling code out of StrayManager
This will belong in PurgeQueue from now on.  We assume
that there is no need to throttle the rate of insertions
into purge queue as it is an efficient sequentially-written
journal.

Signed-off-by: John Spray <john.spray@redhat.com>
2017-03-08 10:20:58 +00:00
John Spray
7189b53b41 mds: move PurgeQueue up to MDSRank
To better reflect its lifecycle: it has a part to play
in create/open and has an init/shutdown, unlike StrayManager.

Signed-off-by: John Spray <john.spray@redhat.com>
2017-03-08 10:20:57 +00:00
John Spray
8ebf7d95a9 mds: use a persistent queue for purging deleted files
To avoid creating stray directories of unbounded size
and all the associated pain, use a more appropriate
datastructure to store a FIFO of inodes that need
purging.

Fixes: http://tracker.ceph.com/issues/11950
Signed-off-by: John Spray <john.spray@redhat.com>
2017-03-08 10:20:57 +00:00
John Spray
1b36be9850 osdc/Journaler: wrap recover() completion in finisher
Otherwise, the callback will deadlock if it in turn
calls into any Journaler functions.  Don't care
about performance because we do this once at startup.

Signed-off-by: John Spray <john.spray@redhat.com>
2017-03-08 10:20:56 +00:00
John Spray
459c745a70 mds: const snaprealm getters on CInode
Signed-off-by: John Spray <john.spray@redhat.com>
2017-03-08 10:20:56 +00:00
John Spray
6cc01307d3 mds: const methods on SnapRealm
Signed-off-by: John Spray <john.spray@redhat.com>
2017-03-08 10:20:56 +00:00
John Spray
50f4783459 osdc/Filer: const fix for passed layouts
...so that const references can be passed into
purge calls.

Signed-off-by: John Spray <john.spray@redhat.com>
2017-03-08 10:20:55 +00:00
John Spray
050dc5cc03 common/lockdep: clearer log messages
Previously these were contextless "using id..." messages with
no indication of what subsystem the message came from.

Signed-off-by: John Spray <john.spray@redhat.com>
2017-03-08 10:20:55 +00:00
John Spray
059a877f05 osdc/Journaler: add have_waiter()
Allows users of wait_for_readable to conveniently
see if there is already a waiter.  Yes, they could
do this themselves, but I'd rather peek at an existing
variable than add a new one caller-side.

Signed-off-by: John Spray <john.spray@redhat.com>
2017-03-08 10:20:55 +00:00
John Spray
58ec1c6aeb osdc/Journaler: remove incorrect assertion
This asserted that flush_pos would be ahead of
safe_pos after calling _flush.  However, this
is not guaranteed to be the case because
prezeroing might prevent us from flushing
right now.

Signed-off-by: John Spray <john.spray@redhat.com>
2017-03-08 10:20:54 +00:00
John Spray
e8c6e74f20 osdc/Journaler: assign a name for logging
Now that we have an MDLog journaler and a PurgeQueue journaler,
this is needed to avoid confusion.

Signed-off-by: John Spray <john.spray@redhat.com>
2017-03-08 10:20:54 +00:00
John Spray
6f6ef708b0 compact_set: add #includes for dependencies
This was previously working by side effect; I happened
to include it somewhere that its dependencies weren't
already included.

Signed-off-by: John Spray <john.spray@redhat.com>
2017-03-08 10:20:53 +00:00
Kefu Chai
f46b327bb0 Merge pull request #13397 from SUSE/doc-fix-qa-links
doc: update links to point to ceph/qa instead of ceph-qa-suite

Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-03-08 18:19:40 +08:00
John Spray
8bd12b2c14 Merge pull request #13816 from batrick/i19201
mds: print rank as int

Reviewed-by: Yan, Zheng <zyan@redhat.com>
2017-03-08 10:14:19 +00:00
John Spray
8f6324f1d5 Merge pull request #13830 from jcsp/wip-doc-multimds
doc: instructions and guidance for multimds

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2017-03-08 10:13:49 +00:00
Nathan Cutler
6a0ffa22ae doc: mention interactive task in developer guide
Signed-off-by: Nathan Cutler <ncutler@suse.com>
2017-03-08 10:12:10 +01:00
Nathan Cutler
266fe30654 doc: rewrite Deploy a cluster for manual testing section
Signed-off-by: Nathan Cutler <ncutler@suse.com>
2017-03-08 10:07:42 +01:00
Jan Fajerski
e0a00b3948 doc: update links to point to ceph/qa instead of ceph-qa-suite
Also fix two broken links to install task and two typos.

Signed-off-by: Jan Fajerski <jfajerski@suse.com>
2017-03-08 10:07:41 +01:00
Nathan Cutler
799fd70ee3 Merge pull request #12506 from SUSE/wip-18259
Revert "dummy: reduce run time, run user.yaml playbook"

Reviewed-by: Loic Dachary <ldachary@redhat.com>
2017-03-08 09:33:46 +01:00
Sage Weil
a68106934c Merge pull request #13615 from liewegas/wip-osd-full
mon,osd: new mechanism for managing full and nearfull OSDs for luminous

Reviewed-by: David Zafman <dzafman@redhat.com>
2017-03-07 21:33:14 -06:00
Sage Weil
7fbe8fb085 Merge pull request #13759 from liewegas/wip-19133
osdc/Objecter: resend RWORDERED ops on full

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Greg Farnum <gfarnum@redhat.com>
2017-03-07 21:31:50 -06:00
Sage Weil
71db343636 Merge pull request #13734 from liewegas/wip-jewel-x
qa/suite/upgrade/jewel-x: various fixes

Reviewed-by: Yuri Weinstein <yweinstei@redhat.com>
2017-03-07 21:25:13 -06:00
Yehuda Sadeh
1bb5ea860f Merge pull request #13846 from rzarzynski/wip-qa-rgw-start-apache-first
qa/tasks/rgw.py: start Apache before RadosGW.

Reviewed-by: Yehuda Sadeh <yehuda@redhat.com>
2017-03-07 15:13:35 -08:00
Sage Weil
296708091c qa/tasks/ceph_manager: use new luminous set-full-ratio etc
Signed-off-by: Sage Weil <sage@redhat.com>
2017-03-07 16:39:09 -05:00
Yehuda Sadeh
e9228f3460 Merge pull request #13410 from yehudasa/wip-tracing-fix
tracing: don't include oid when tracing at dequeue_op()

Reviewed-by: Sage Weil <sage@redhat.com>
2017-03-07 13:31:47 -08:00
Sage Weil
4272214136 Merge pull request #13839 from theanalyst/release/10.2.6/changelog
doc: add changelog for v10.2.6 Jewel release
2017-03-07 15:30:04 -06:00
Abhishek Lekshmanan
32e128c093 doc: add changelog for v10.2.6 Jewel release
Signed-off-by: Abhishek Lekshmanan <abhishek@suse.com>
2017-03-07 21:44:23 +01:00
John Spray
92e7e890c3 Merge pull request #13704 from batrick/mds-counter-unify
mds: remove some redundant object counters

Reviewed-by: Yan, Zheng <zyan@redhat.com>
2017-03-07 19:50:11 +00:00
Sage Weil
c4b73f19a7 osdc/Objecter: resend RWORDERED ops on full
Our condition for respecting the FULL flag is complex, and involves
the WRITE | RWORDERED flags vs the FULL_FORCE | FULL_TRY flags.  Previously,
we could block a read because of RWORDERED but not resend it later.

Fix by capturing the complex condition in a respects_full() bool and using
it both for the blocking-on-send and resending-on-possibly-notfull-later
checks.

Fixes: http://tracker.ceph.com/issues/19133
Signed-off-by: Sage Weil <sage@redhat.com>
2017-03-07 13:33:44 -05:00
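A minimal sketch of the idea in the Objecter commit above: one predicate shared by both decision points. The flag constants and the two helper functions are hypothetical, and the exact way the flags combine is an assumption based on the commit message, not a copy of the Objecter code.

```cpp
#include <cstdint>

// Hypothetical flag bits for illustration; the real values live in Ceph's headers.
constexpr std::uint32_t FLAG_WRITE      = 1u << 0;
constexpr std::uint32_t FLAG_RWORDERED  = 1u << 1;
constexpr std::uint32_t FLAG_FULL_TRY   = 1u << 2;
constexpr std::uint32_t FLAG_FULL_FORCE = 1u << 3;

// Assumed rule: an op respects the FULL flag if it is a write or an ordered
// read/write, and the caller has not opted out via FULL_TRY or FULL_FORCE.
constexpr bool respects_full(std::uint32_t flags) {
  return (flags & (FLAG_WRITE | FLAG_RWORDERED)) != 0 &&
         (flags & (FLAG_FULL_TRY | FLAG_FULL_FORCE)) == 0;
}

// Decision point 1: hold the op back while the cluster is full.
constexpr bool should_block_on_send(std::uint32_t flags, bool cluster_full) {
  return cluster_full && respects_full(flags);
}

// Decision point 2: resend once the cluster may no longer be full.
// Using the same predicate means anything that could be blocked can be resent.
constexpr bool should_resend_after_full(std::uint32_t flags, bool cluster_full) {
  return !cluster_full && respects_full(flags);
}
```

Because both helpers consult `respects_full`, an ordered read that is blocked while full is also eligible for resend afterwards, which is the mismatch the commit message describes.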
Sage Weil
a202b68d18 qa/tasks/thrashosds: chance_thrash_cluster_full
Induce a momentarily full cluster.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-03-07 13:33:44 -05:00
Daniel Gryniewicz
0007adb5b7 Merge pull request #13832 from linuxbox2/wip-rgw-fs_inst
rgw_file: fix fs_inst progression
2017-03-07 12:52:44 -05:00
Yuri Weinstein
05412184b5 Merge pull request #10240 from songbaisen/b2
mon: remove the redundant judgement in paxosservice is_writeable function

Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-03-07 08:57:40 -08:00
Matt Benjamin
0e988edfb6 rgw_file: fix fs_inst progression
Reported by Gui Hecheng <guimark@126.com>.  This change is a
variation on the fix proposed by Dan Gryniewicz <dang@redhat.com>
to take root_fh.state.dev as fs_inst for new handles.

Fixes: http://tracker.ceph.com/issues/19214

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
2017-03-07 11:43:39 -05:00