Commit Graph

90130 Commits

Author SHA1 Message Date
Alfredo Deza
71fcd35c3d ceph-volume lvm.batch.filestore capture SizeAllocationErrors
Signed-off-by: Alfredo Deza <adeza@redhat.com>
2018-09-10 09:17:11 -04:00
Alfredo Deza
d2ea49a4e6 ceph-volume lvm.batch make sure data devices don't have existing LVs on bluestore
Signed-off-by: Alfredo Deza <adeza@redhat.com>
2018-09-10 09:17:11 -04:00
Casey Bodley
dfc1c78889
Merge pull request #21271 from cbodley/wip-rgw-beast-async
rgw: beast frontend reworks pause/stop and yields during body io

Reviewed-by: Adam C. Emerson <aemerson@redhat.com>
2018-09-10 09:05:57 -04:00
Sage Weil
4d2a73c7f1 Merge PR #23845 into master
* refs/pull/23845/head:
	osd/OSDMap: include age in up and in counts for ceph status
	mon/OSDMonitor: set new_last_{up,in}_change
	osd/OSDMap: store last_up_change and last_in_change
	mgr/MgrMap: include mgr age in map printer
	mon/MgrMap: track active_changed timestamp
	mon: include mon quorum age in status
	include/utime: add utimespan_str helper

Reviewed-by: John Spray <john.spray@redhat.com>
2018-09-10 07:45:58 -05:00
Sage Weil
ff826e69c7 Merge PR #23949 into master
* refs/pull/23949/head:
	mon/OSDMonitor: invalidate max_failed_since on cancel_report

Reviewed-by: Sage Weil <sage@redhat.com>
2018-09-10 07:41:44 -05:00
Sage Weil
838958daa4 Merge PR #23968 into master
* refs/pull/23968/head:
	dout: add basic prefix providers
	dout: add DoutPrefixPipe for composing prefix providers

Reviewed-by: Kefu Chai <kchai@redhat.com>
2018-09-10 07:41:25 -05:00
Sage Weil
9f30b12e39 Merge PR #23971 into master
* refs/pull/23971/head:
	cls/numops: fix cls_numops.cc log add to mul

Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
2018-09-10 07:41:03 -05:00
Sage Weil
f9d45d06f9 Merge PR #23975 into master
* refs/pull/23975/head:
	common/buffer.cc: add create_small_page_aligned to avoid wasting memory when allocating small buffers on an OS with a big page size (e.g. 64k)

Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2018-09-10 07:40:37 -05:00
Lenz Grimmer
51604c6c78
Merge pull request #23939 from votdev/bug_35685
mgr/dashboard: Fix bug in user form when changing password

Reviewed-by: Stephan Müller <smueller@suse.com>
2018-09-10 14:29:24 +02:00
xie xingguo
7bc97797eb osd/OSD: kick right merge source
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2018-09-10 20:01:24 +08:00
Jason Dillaman
95861114d6
Merge pull request #23839 from trociny/wip-migration-commit-race
librbd: fix potential live migration after commit issues due to not refreshed image header

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
2018-09-10 07:27:24 -04:00
xie xingguo
4c0804ad02 osd/OSD: clear ping_history on heartbeat_reset
The old connections are gone, so we should not leave behind a long
list of obsolete ping_history entries, which is misleading.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2018-09-10 18:50:50 +08:00
xie xingguo
79f480442f mon/OSDMonitor: share new maps with even non-active osds
OSDs may be unaware that they have been marked down and remain stuck
at an obsolete map in which they are still marked up:

```
host        osd     down_at     stuck_at
ceph-03     9       e712        e711
ceph-03     13      e700        e699
ceph-03     28      e697        e696
ceph-03     48      e697        e696
ceph-03     52      e707        e704
ceph-03     61      e710        e708
ceph-03     73      e712        e710
ceph-03     77      e708        e707

ceph-05     12      e711        e710
ceph-05     21      e703        e702
ceph-05     24      e700        e699
ceph-05     29      e703        e699
ceph-05     41      e711        e710
ceph-05     53      e711        e710
ceph-05     72      e712        e711

```

Since https://github.com/ceph/ceph/pull/23958, an OSD pings the monitor
periodically if it is stuck at __waiting_for_healthy__. In the case
above, however, the OSDs still consider themselves __active__ and
therefore miss that fix.

Since these OSDs may still be able to contact the monitors (otherwise
there would be no way for them to be marked up again) and keep sending
beacons, we can get them out of the trap simply by sharing some new
maps with them.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
Signed-off-by: runsisi <runsisi@zte.com.cn>
2018-09-10 18:48:21 +08:00
Volker Theile
f3d25fb3a8 mgr/dashboard: Refactor autofocus directive
Refactor the autofocus directive and add some unit tests.

Signed-off-by: Volker Theile <vtheile@suse.com>
2018-09-10 12:08:34 +02:00
Volker Theile
7f3a982d6f mgr/dashboard: Unable to edit user when making an accidental change to the password field
Fixes: https://tracker.ceph.com/issues/35685

Signed-off-by: Volker Theile <vtheile@suse.com>
2018-09-10 12:03:25 +02:00
xie xingguo
4b10ed1035 mgr/DaemonServer: split should respect inflight creating pgs
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2018-09-10 16:58:03 +08:00
Kefu Chai
6cf6615225
Merge pull request #23993 from badone/wip-fedora-build-Cython3-error
rpm: Fix Fedora error "No matching package to install: 'Cython3'"

Reviewed-by: Kefu Chai <kchai@redhat.com>
2018-09-10 16:46:26 +08:00
Kefu Chai
69d81c55ce
Merge pull request #23833 from falcon78921/wip-docs-34539
doc/rados: fixed hit set type link

Reviewed-by: Kefu Chai <kchai@redhat.com>
2018-09-10 15:13:20 +08:00
Xie Xingguo
cfa8591128
Merge pull request #24000 from libingyang-zte/master
doc: Fix Spelling Error of Radosgw

Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
2018-09-10 10:56:01 +08:00
James McClune
f38426a766 doc: fixed hit set type link
Fixed reference link for hit set type value. Restructured wording in description.
Fixes: https://tracker.ceph.com/issues/34539

Signed-off-by: James McClune <jmcclune@mcclunetechnologies.net>
2018-09-09 21:41:08 -04:00
李丙洋 10208981
bf56495b98 doc: Fix Spelling Error of Radosgw
Signed-off-by: Li Bingyang <li.bingyang1@zte.com.cn>
2018-09-10 09:21:31 +08:00
Patrick Donnelly
a45852f8fd
qa: fix symlink
Introduced-by: 6ac1882dc4

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2018-09-08 19:21:57 -07:00
Xie Xingguo
5a3344f0e5
Merge pull request #23895 from xiexingguo/wip-more-async-fixes
osd/PrimaryLogPG: update missing_loc more carefully

Reviewed-by: Neha Ojha <nojha@redhat.com>
2018-09-08 09:51:04 +08:00
Xie Xingguo
65a238a0b1
Merge pull request #23958 from xiexingguo/wip-heartbeat-stuck
osd/OSD: ping monitor if we are stuck at __waiting_for_healthy__

Reviewed-by: Sage Weil <sage@redhat.com>
2018-09-08 09:49:27 +08:00
xie xingguo
91a2d408a9 mon/OSDMonitor: invalidate max_failed_since on cancel_report
max_failed_since might reference the very failure-report which is to be
cancelled. We can simply invalidate it here to make **get_failed_since()**
recalculate if necessary.

Fixes: http://tracker.ceph.com/issues/35860
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2018-09-08 09:45:16 +08:00
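
The invalidation described in 91a2d408a9 can be illustrated with a minimal Python sketch. This is not the monitor's C++ code: the `FailureInfo` class, its field layout, and the use of `None` as the "recompute" marker are assumptions made for illustration. Only the names `max_failed_since`, `cancel_report` and `get_failed_since` come from the commit message.

```python
from typing import Dict, Optional


class FailureInfo:
    """Hypothetical stand-in for a per-OSD failure record kept by the monitor."""

    def __init__(self) -> None:
        # reporter id -> time at which that reporter says the target failed
        self.reporters: Dict[int, float] = {}
        # cached maximum over the reports; None means "recompute on demand"
        self.max_failed_since: Optional[float] = None

    def add_report(self, reporter: int, failed_since: float) -> None:
        replaced = reporter in self.reporters
        self.reporters[reporter] = failed_since
        if replaced:
            # the old value may have been the cached maximum
            self.max_failed_since = None
        elif self.max_failed_since is not None:
            self.max_failed_since = max(self.max_failed_since, failed_since)

    def cancel_report(self, reporter: int) -> None:
        if self.reporters.pop(reporter, None) is not None:
            # the cancelled report may be the one the cache refers to,
            # so drop the cache instead of trying to patch it
            self.max_failed_since = None

    def get_failed_since(self) -> Optional[float]:
        if self.max_failed_since is None and self.reporters:
            self.max_failed_since = max(self.reporters.values())
        return self.max_failed_since
```

Dropping the cache on every cancel keeps `cancel_report` simple. The cost is one pass over the remaining reports the next time `get_failed_since` is called.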
Sage Weil
3eaf937459 Merge PR #23449 into master
* refs/pull/23449/head:
	osd/OSDMap: cleanup: s/tmpmap/nextmap/
	qa/standalone/osd/osd-backfill-stats: fixes
	osd/OSDMap: clean out pg_temp mappings that exceed pool size
	mon/OSDMonitor: clean temps and upmaps in encode_pending, efficiently
	osd/OSDMapMapping: do not crash if acting > pool size

Reviewed-by: David Zafman <dzafman@redhat.com>
Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
Reviewed-by: Neha Ojha <nojha@redhat.com>
2018-09-07 19:34:00 -05:00
Patrick Donnelly
17f79b0745
Merge PR #23984 into master
* refs/pull/23984/head:
	mon: test if gid exists in pending for prepare_beacon

Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Greg Farnum <gfarnum@redhat.com>
2018-09-07 15:35:55 -07:00
Sage Weil
d249fa8675 osd/OSDMap: cleanup: s/tmpmap/nextmap/
Be consistent with OSDMap.h

Signed-off-by: Sage Weil <sage@redhat.com>
2018-09-07 17:11:18 -05:00
Sage Weil
f47921f293 qa/standalone/osd/osd-backfill-stats: fixes
Grep from the primary's log, not every osd's log.

For the backfill_remapped task in particular, after the pg_temp change it
just so happens that the primary changes across the pool size change and
thus two different primaries do (some) backfill.  Fix that test to pass
the correct primary.

Other tests are unaffected as they do not (happen to) trigger a primary
change and already satisfied the (removed) check that only one OSD does
backfill.

Signed-off-by: Sage Weil <sage@redhat.com>
2018-09-07 17:11:18 -05:00
Sage Weil
daf53f423d osd/OSDMap: clean out pg_temp mappings that exceed pool size
If the pool size is reduced, we can end up with pg_temp mappings that are
too big.  This can trigger bad behavior elsewhere (e.g., OSDMapMapping,
which assumes that acting and up are always <= pool size).

Fixes: http://tracker.ceph.com/issues/26866
Signed-off-by: Sage Weil <sage@redhat.com>
2018-09-07 17:11:18 -05:00
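
As a rough illustration of the cleanup in daf53f423d, the sketch below drops pg_temp entries that list more OSDs than the pool size allows. It is written in Python rather than Ceph's C++. The dictionary layout, the `(pool id, seed)` key and the function name `clean_oversized_pg_temp` are hypothetical, and removing entries for unknown pools is an added assumption.

```python
from typing import Dict, List, Tuple

PgId = Tuple[int, int]  # (pool id, placement seed); hypothetical representation


def clean_oversized_pg_temp(
    pg_temp: Dict[PgId, List[int]],
    pool_sizes: Dict[int, int],
) -> List[PgId]:
    """Remove pg_temp entries longer than their pool's size.

    Entries whose pool is unknown are removed as well (an assumption of
    this sketch). Returns the list of removed PG ids.
    """
    removed = [
        pgid
        for pgid, osds in pg_temp.items()
        if pgid[0] not in pool_sizes or len(osds) > pool_sizes[pgid[0]]
    ]
    for pgid in removed:
        del pg_temp[pgid]
    return removed
```

For example, after a pool shrinks from size 3 to size 2, a mapping such as `(1, 0) -> [0, 1, 2]` would be removed while `(1, 1) -> [3, 4]` would be kept.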
Sage Weil
1c2eb40651 mon/OSDMonitor: clean temps and upmaps in encode_pending, efficiently
- do not rebuild the next map when we already have it
- do this work in encode_pending, not create_pending, so we catch bad
values before they are published.

Signed-off-by: Sage Weil <sage@redhat.com>
2018-09-07 17:11:18 -05:00
Sage Weil
f793118656 osd/OSDMapMapping: do not crash if acting > pool size
Existing oversized pg_temp mappings (or some other bug) might make acting
exceed the pool size.  Avoid overrunning our buffer if that happens.

Note that the mapping won't be completely accurate in that case!

Signed-off-by: Sage Weil <sage@redhat.com>
2018-09-07 17:10:18 -05:00
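
The guard described in f793118656 amounts to clamping the copy length to the size of the preallocated slot. A minimal Python sketch follows, assuming a fixed-length `row` sized to the pool size and using `-1` for an empty slot. Both conventions and the function name `store_acting` are hypothetical and not taken from OSDMapMapping.

```python
from typing import List


def store_acting(row: List[int], acting: List[int]) -> bool:
    """Copy `acting` into the fixed-length `row` without writing past its end.

    Returns True if `acting` had to be truncated, in which case the stored
    mapping is incomplete.
    """
    n = min(len(acting), len(row))  # never write more than the row can hold
    for i in range(n):
        row[i] = acting[i]
    for i in range(n, len(row)):
        row[i] = -1  # mark unused slots as empty
    return len(acting) > len(row)
```

The boolean result reflects the caveat in the commit message: when truncation happens, the stored mapping is not completely accurate.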
Patrick Donnelly
f26752a8a8
mon: test if gid exists in pending for prepare_beacon
If it does not, send a null map. Bug introduced by
624efc6432 which made preprocess_beacon only look
at the current fsmap (correctly). prepare_beacon relied on preprocess_beacon
doing that check on pending.

Running:

    while sleep 0.5; do bin/ceph mds fail 0; done

is sufficient to reproduce this bug. You will see:

    2018-09-07 15:33:30.350 7fffe36a8700  5 mon.a@0(leader).mds e69 preprocess_beacon mdsbeacon(24412/a up:reconnect seq 2 v69) v7 from mds.0 127.0.0.1:6813/2891525302 compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table,9=file layout v2,10=snaprealm v2}
    2018-09-07 15:33:30.350 7fffe36a8700 10 mon.a@0(leader).mds e69 preprocess_beacon: GID exists in map: 24412
    2018-09-07 15:33:30.350 7fffe36a8700  5 mon.a@0(leader).mds e69 _note_beacon mdsbeacon(24412/a up:reconnect seq 2 v69) v7 noting time
    2018-09-07 15:33:30.350 7fffe36a8700  7 mon.a@0(leader).mds e69 prepare_update mdsbeacon(24412/a up:reconnect seq 2 v69) v7
    2018-09-07 15:33:30.350 7fffe36a8700 12 mon.a@0(leader).mds e69 prepare_beacon mdsbeacon(24412/a up:reconnect seq 2 v69) v7 from mds.0 127.0.0.1:6813/2891525302
    2018-09-07 15:33:30.350 7fffe36a8700 15 mon.a@0(leader).mds e69 prepare_beacon got health from gid 24412 with 0 metrics.
    2018-09-07 15:33:30.350 7fffe36a8700  5 mon.a@0(leader).mds e69 mds_beacon mdsbeacon(24412/a up:reconnect seq 2 v69) v7 is not in fsmap (state up:reconnect)

in the mon leader log. The last line indicates the problem was safely handled.

Fixes: http://tracker.ceph.com/issues/35848

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2018-09-07 14:00:46 -07:00
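
The two-stage check described in f26752a8a8 can be sketched in Python as follows. The sets of GIDs and the string results are hypothetical simplifications; only the function names `preprocess_beacon` and `prepare_beacon`, and the idea of answering with a null map, come from the commit message.

```python
from typing import Set


def preprocess_beacon(current_gids: Set[int], gid: int) -> bool:
    """Read-only stage: looks only at the currently published fsmap."""
    return gid in current_gids


def prepare_beacon(pending_gids: Set[int], gid: int) -> str:
    """Update stage: must check the pending fsmap itself.

    The GID may be present in the published map but already removed from
    the pending one (for example after `ceph mds fail`).
    """
    if gid not in pending_gids:
        return "send_null_map"
    return "update_pending"
```

With a GID that is still in the current map but has been failed in the pending map, `preprocess_beacon` lets the beacon through and `prepare_beacon` then returns `"send_null_map"` instead of touching state that no longer exists.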
Sage Weil
09ee3f3538 Merge PR #20469 into master
* refs/pull/20469/head:
	osd/PG: remove warn on delete+merge race
	osd: base project_pg_history on is_new_interval
	osd: make project_pg_history handle concurrent osdmap publish
	osd: handle pg delete vs merge race
	osd/PG: do not purge strays in premerge state
	doc/rados/operations/placement-groups: a few minor corrections
	doc/man/8/ceph: drop enumeration of pg states
	doc/dev/placement-groups: drop old 'splitting' reference
	osd: wait for laggy pgs without osd_lock in handle_osd_map
	osd: drain peering wq in start_boot, not _committed_maps
	osd: kick split children
	osd: no osd_lock for finish_splits
	osd/osd_types: remove is_split assert
	ceph-objectstore-tool: prevent import of pg that has since merged
	qa/suites: test pg merging
	qa/tasks/thrashosds: support merging pgs too
	mon/OSDMonitor: mon_inject_pg_merge_bounce_probability
	doc/rados/operations/placement-groups: update to describe pg_num reductions too
	doc/rados/operations: remove reference to lpgs
	osd: implement pg merge
	osd/PG: implement merge_from
	osdc/Objecter: resend ops on pg merge
	osd: collect and record pg_num changes by pool
	osd: make load_pgs remove message more accurate
	osd/osd_types: pg_t: add is_merge_target()
	osd/osd_types: pg_t::is_merge -> is_merge_source
	osd/osd_types: adding or subtracting invalid stats -> invalid stats
	osd/PG: clear_ready_to_merge on_shutdown (or final merge source prep)
	osd: debug pending_creates_from_osd cleanup, don't use cbegin
	ceph-objectstore-tool: debug intervals update
	mgr/ClusterState: discard pg updates for pgs >= pg_num
	mon/OSDMonitor: fix long line
	mon/OSDMonitor: move pool created check into caller
	mon/OSDMonitor: adjust pgp_num_target down along with pg_num_target as needed
	mon/OSDMonitor: add mon_osd_max_initial_pgs to cap initial pool pgs
	osd/OSDMap: set pg[p]_num_target in build_simple*() methods
	mon/PGMap: adjust SMALLER_PGP_NUM warning to use *_target values
	mon/OSDMonitor: set CREATING flag for force-create-pg
	mon/OSDMonitor: start sending new-style pg_create2 messages
	mon/OSDMonitor: set last_force_resend_prenautilus for pg_num_pending changes
	osd: ignore pg creates when pool FLAG_CREATING is not set
	mgr: do not adjust pg_num until FLAG_CREATING removed from pool
	mon/OSDMonitor: add FLAG_CREATING on upgrade if pools still creating
	mon/OSDMonitor: prevent FLAG_CREATING from getting set pre-nautilus
	mon/OSDMonitor: disallow pg_num changes while CREATING flag is set
	mon/OSDMonitor: set POOL_CREATING flag until initial pool pgs are created
	osd/osd_types: add pg_pool_t FLAG_POOL_CREATING
	osd/osd_types: introduce last_force_resend_prenautilus
	osd/PGLog: merge_from helper
	osd: no cache agent or snap trimming during premerge
	osd: notify mon when pending PGs are ready to merge
	mgr: add simple controller to adjust pg[p]_num_actual
	mon/OSDMonitor: MOSDPGReadyToMerge to complete a pg_num change
	mon/OSDMonitor: allow pg_num to be adjusted up or down via pg[p]_num_target
	osd/osd_types: make pg merge an interval boundary
	osd/osd_types: add pg_t::is_merge() method
	osd/osd_types: add pg_num_pending to pg_pool_t
	osd: allow multiple threads to block on wait_min_pg_epoch
	osd: restructure advance_pg() call mechanism
	mon/PGMap: prune merged pgs
	mon/PGMap: track pgs by state for each pool
	osd/SnapMapper: allow split_bits to decrease (merge)
	os/bluestore: fix osr_drain before merge
	os/bluestore: allow reuse of osr from existing collection
	os/filestore: (re)implement merge
	os/filestore: add _merge_collections post-check
	os: implement merge_collection
	os/ObjectStore: add merge_collection operation to Transaction
2018-09-07 15:55:21 -05:00
Yuri Weinstein
61a2a74eb2
Merge pull request #23894 from xiexingguo/wip-complete-to-2
osd/PrimaryLogPG: avoid dereferencing invalid complete_to

Reviewed-by: Sage Weil <sage@redhat.com>
2018-09-07 13:03:28 -07:00
Ilya Dryomov
478aca82eb
Merge pull request #23976 from idryomov/wip-cram-git-clone
qa/tasks/cram: tasks now must live in the repository

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
2018-09-07 19:57:42 +02:00
Casey Bodley
1af938a99a
Merge pull request #23828 from cbodley/wip-rgw-sync-trace-cleanup
rgw: cleanups for sync tracing

Reviewed-by: Yehuda Sadeh <yehuda@redhat.com>
2018-09-07 13:46:37 -04:00
Casey Bodley
97e4db983d
Merge pull request #23571 from cbodley/wip-26938
rgw: data sync respects error_retry_time for backoff on error_repo

Reviewed-by: Yehuda Sadeh <yehuda@redhat.com>
2018-09-07 13:45:36 -04:00
Casey Bodley
93de3367a7 rgw: beast frontend closes connections on stop
the strategy for stop relies on the fact that process_request() is
completely synchronous, so that io_context.stop() would still complete
each request and clean up properly

to tolerate an asynchronous process_request(), we instead need to drain
all outstanding work on the io_context so that io_context.run() can
return control naturally to all of the worker threads. that would allow
us to suspend our coroutine in the middle of process_request(), and
still guarantee that process_request() will resume and run to completion
before the worker threads exit

each connected socket also counts as outstanding work, and needs to be
closed in order to drain the io_context. each connection now adds itself
to a connection list so that stop() can close its socket

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2018-09-07 13:11:36 -04:00
Casey Bodley
378b01064c rgw: beast frontend uses async SharedMutex for pause
the strategy for pause relied on stopping the io_context and waiting for
io_context.run() to return control to all of the worker threads. this
relies on the fact that process_request() is completely synchronous (so
considered a single unit of work in the io_context) - otherwise, pause
could complete in the middle of a call to process_request(), and destroy
the RGWRados instance while it's still in use

calling io_context.stop() to pause the worker threads also assumes that
no other work will be scheduled on these threads

to decouple pause from worker threads, handle_connection() now uses an
async shared mutex to synchronize with pause/unpause

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2018-09-07 13:11:36 -04:00
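
The pause scheme in 378b01064c can be approximated with a small asyncio sketch: each connection handler holds the mutex in shared mode for the duration of a request, and pause takes it exclusively, so pause cannot complete in the middle of a request. This is Python, not the Boost.Asio code used by the beast frontend. The `AsyncSharedMutex` class, its method names and the `demo` driver are hypothetical.

```python
import asyncio
from typing import List


class AsyncSharedMutex:
    """Minimal shared/exclusive lock for coroutines (illustrative only)."""

    def __init__(self) -> None:
        self._cond = asyncio.Condition()
        self._readers = 0
        self._writer = False

    async def lock_shared(self) -> None:
        async with self._cond:
            await self._cond.wait_for(lambda: not self._writer)
            self._readers += 1

    async def unlock_shared(self) -> None:
        async with self._cond:
            self._readers -= 1
            self._cond.notify_all()

    async def lock(self) -> None:
        async with self._cond:
            await self._cond.wait_for(lambda: not self._writer)
            self._writer = True  # block new shared holders first
            await self._cond.wait_for(lambda: self._readers == 0)

    async def unlock(self) -> None:
        async with self._cond:
            self._writer = False
            self._cond.notify_all()


async def handle_connection(mutex: AsyncSharedMutex, events: List[str], name: str) -> None:
    await mutex.lock_shared()
    try:
        events.append(f"{name}:start")
        await asyncio.sleep(0.01)  # stands in for an asynchronous process_request()
        events.append(f"{name}:done")
    finally:
        await mutex.unlock_shared()


async def demo() -> List[str]:
    mutex = AsyncSharedMutex()
    events: List[str] = []
    first = asyncio.create_task(handle_connection(mutex, events, "req1"))
    await asyncio.sleep(0)  # let req1 take the shared lock
    await mutex.lock()      # pause: waits until req1 has finished
    events.append("paused")
    second = asyncio.create_task(handle_connection(mutex, events, "req2"))
    await asyncio.sleep(0.02)  # req2 cannot start while paused
    events.append("unpaused")
    await mutex.unlock()
    await asyncio.gather(first, second)
    return events
```

Running `asyncio.run(demo())` shows the ordering this design is after: the in-flight request finishes before pause completes, and a request arriving during the pause only starts after unpause.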
Sage Weil
564212ce56 osd/PG: remove warn on delete+merge race
This was there just to confirm that this path was exercised by the
rados suite (it is, several hits per rados run of 1/666).

Signed-off-by: Sage Weil <sage@redhat.com>
2018-09-07 12:09:42 -05:00
Sage Weil
4bc01379bb osd: base project_pg_history on is_new_interval
Signed-off-by: Sage Weil <sage@redhat.com>
2018-09-07 12:09:42 -05:00
Sage Weil
cfe6ca82ed osd: make project_pg_history handle concurrent osdmap publish
The class's osdmap may be updated while we are in our loop.  Pass it in
explicitly instead.

Fixes: http://tracker.ceph.com/issues/26970
Signed-off-by: Sage Weil <sage@redhat.com>
2018-09-07 12:09:42 -05:00
Sage Weil
93b6283829 osd: handle pg delete vs merge race
Deletion involves an awkward dance between the pg lock and shard locks,
while the merge prep and tracking is "shard down".  If the delete has
finished its work we may find that a merge has since been prepped.

Unwinding the merge tracking is nontrivial, especially because it might
involve a second PG, possibly even a fabricated placeholder one. Instead,
if we delete and find that a merge is coming, undo our deletion and let
things play out in the future map epoch.

Signed-off-by: Sage Weil <sage@redhat.com>
2018-09-07 12:09:42 -05:00
Sage Weil
ce53eb3329 osd/PG: do not purge strays in premerge state
The point of premerge is to ensure that the constituent parts of the
target PG are fully clean.  If there is an intervening PG migration and
one of the halves finishes migrating before the other, one half could
get removed and the final merge could result in an incomplete PG.  In the
worst case, the two halves (let's call them A and B) could have started
out together on say [0,1,2], A moves to [3,4,5] and gets deleted from
[0,1,2], and then the final merge happens such that *all* copies of the PG
are incomplete.

We could construct a clever check that does allow removal of strays when
the sibling PG is also ready to go, but it would be complicated.  Do the
simple thing.  In reality, this would be an extremely hard case to hit
because the premerge window is generally very short.

Signed-off-by: Sage Weil <sage@redhat.com>
2018-09-07 12:09:42 -05:00
Sage Weil
5eba9ba074 doc/rados/operations/placement-groups: a few minor corrections
Signed-off-by: Sage Weil <sage@redhat.com>
2018-09-07 12:09:42 -05:00
Sage Weil
856a01fcfc doc/man/8/ceph: drop enumeration of pg states
This is more maintainable.

Signed-off-by: Sage Weil <sage@redhat.com>
2018-09-07 12:09:42 -05:00
Sage Weil
5ff6bbf63d doc/dev/placement-groups: drop old 'splitting' reference
Signed-off-by: Sage Weil <sage@redhat.com>
2018-09-07 12:09:05 -05:00
Sage Weil
62a208b423 osd: wait for laggy pgs without osd_lock in handle_osd_map
We can't hold osd_lock while blocking because other objectstore completions
need to take osd_lock (e.g., _committed_osd_maps), and those objectstore
completions need to complete in order to finish_splits.  Move the blocking
to the top before we establish any local state in this stack frame since
both the public and cluster dispatchers may race in handle_osd_map and
we are dropping and retaking osd_lock.

Signed-off-by: Sage Weil <sage@redhat.com>
2018-09-07 12:09:05 -05:00
Sage Weil
47d627736a osd: drain peering wq in start_boot, not _committed_maps
We can't safely block in _committed_osd_maps because we are being run
by the store's finisher threads, and we may have to wait for a PG to split
and then merge via that same queue and deadlock.

Do not hold osd_lock while waiting as this can interfere with *other*
objectstore completions that take osd_lock.

Signed-off-by: Sage Weil <sage@redhat.com>
2018-09-07 12:09:05 -05:00