Commit Graph

101213 Commits

Author SHA1 Message Date
Yingxin Cheng
d1cd196981 crimson/net: handle fault for READY, CONNECTING and ACCEPTING
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2019-08-12 17:22:45 +08:00
Yingxin Cheng
d75c9e884a crimson/net: WAIT state and backoff for client
The client enters the WAIT state when its reconnect is delayed, or when it
wants to be replaced by a newly established socket.

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2019-08-12 17:22:45 +08:00
Yingxin Cheng
8ea10a6a75 crimson/net: SERVER_WAIT state for accepting server
The server waits for the peer client to close the socket in SERVER_WAIT.

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2019-08-12 17:22:45 +08:00
Yingxin Cheng
f8053d08ee crimson/net: STANDBY state for lossless server or peer
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2019-08-12 17:22:38 +08:00
Yingxin Cheng
dd59586ef0 crimson/net: allow REPLACING state wait for protocol exit
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2019-08-12 17:22:23 +08:00
Yingxin Cheng
49a08e8bc3 crimson/net: send AckFrame for lossless policy
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2019-08-12 17:18:15 +08:00
Yingxin Cheng
6cacf1f7b2 crimson/net: maintain the sent queue for lossless policy
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2019-08-12 17:17:43 +08:00
Yingxin Cheng
492263962c crimson/net: reset write state with reset_write()
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2019-08-12 17:02:52 +08:00
Yingxin Cheng
babc9c24fd crimson/net: allow connecting state reentrant
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2019-08-12 17:02:52 +08:00
Yingxin Cheng
675a50326c crimson/net: reset handshake status when connecting/accepting
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2019-08-12 17:02:52 +08:00
Yingxin Cheng
b7c7dc0b26 crimson/net: pending_q to store the pending(sending) messages
We cannot leave the pending messages in the out_q, because with the lossless
policy they can be partially sent and even acknowledged.

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2019-08-12 17:02:52 +08:00
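The queue separation described in b7c7dc0b26 (together with the sent queue and AckFrame commits above) can be illustrated with a minimal sketch. This is not the crimson implementation: the class `LosslessQueues`, its method names, and the `sent_q` attribute are hypothetical, and only `out_q` and `pending_q` are names taken from the commit messages. It shows why messages with a write in flight need their own queue: they can be acknowledged, or need re-sending after a fault, independently of messages that have not been touched yet.

```python
from collections import deque


class LosslessQueues:
    """Hypothetical sketch of message bookkeeping under a lossless policy."""

    def __init__(self):
        self.out_q = deque()      # queued, not yet handed to the socket
        self.pending_q = deque()  # write in flight (may be partially sent)
        self.sent_q = deque()     # fully written, waiting for an ack
        self._next_seq = 1

    def enqueue(self, payload):
        seq = self._next_seq
        self._next_seq += 1
        self.out_q.append((seq, payload))
        return seq

    def begin_write(self):
        # Move everything queued so far into pending_q before writing.
        self.pending_q.extend(self.out_q)
        self.out_q.clear()
        return list(self.pending_q)

    def end_write(self):
        # The write completed: messages now only wait for acknowledgement.
        self.sent_q.extend(self.pending_q)
        self.pending_q.clear()

    def handle_ack(self, acked_seq):
        # An ack may arrive while a message is still in pending_q.
        for q in (self.sent_q, self.pending_q):
            while q and q[0][0] <= acked_seq:
                q.popleft()

    def requeue_after_fault(self):
        # Unacknowledged messages go back to the front of out_q, in order.
        unacked = list(self.sent_q) + list(self.pending_q)
        self.sent_q.clear()
        self.pending_q.clear()
        self.out_q.extendleft(reversed(unacked))
```

In this sketch a fault during a write re-queues only the messages that were never acknowledged, ahead of anything enqueued later.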
Yingxin Cheng
4fa1c4c07d crimson/net: wait_write_exit() to wait for writer stopped
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2019-08-12 17:02:51 +08:00
Yingxin Cheng
b3f1e56d6c crimson/net: is_queued() to check if there's any pending writes
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2019-08-12 17:02:51 +08:00
Yingxin Cheng
04f8a35d79 crimson/net: fix variables for stateful connection
server_cookie, client_cookie, connect_seq and global_seq are identifiers
of a stateful connection.

We already have some related implementations, but they are stub code
written when implementing the lossy policy and cannot work properly.

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2019-08-12 17:02:42 +08:00
Kefu Chai
4913510173 qa/tasks/cbt.py: use "git clone --depth 1" for faster clone
we don't need the full history for performing the test.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2019-08-12 16:58:12 +08:00
Jeegn Chen
3bfb5c2621 osd: support osd_scrub_extended_sleep
1. always take osd_scrub_sleep for manually initiated
   scrubs
2. when scrub_time_permit() returns true for scheduled
   ones, the existing osd_scrub_sleep is used
3. when scrub_time_permit() returns false for scheduled
   ones, there are 2 scenarios
   3.1 if osd_scrub_extended_sleep <= osd_scrub_sleep,
       take osd_scrub_sleep
   3.2 otherwise, take osd_scrub_extended_sleep

Fixes: http://tracker.ceph.com/issues/40955
Signed-off-by: Jeegn Chen <jeegnchen@tencent.com>
2019-08-12 16:54:36 +08:00
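The three rules listed in 3bfb5c2621 amount to a small decision function. The sketch below restates them in Python for illustration only; the function name `scrub_sleep_for` and its parameters are hypothetical, and the real logic lives in the OSD's C++ code.

```python
def scrub_sleep_for(manual: bool,
                    time_permitted: bool,
                    osd_scrub_sleep: float,
                    osd_scrub_extended_sleep: float) -> float:
    """Return the sleep value to use, following rules 1-3 of the commit."""
    # Rule 1: manually initiated scrubs always use osd_scrub_sleep.
    if manual:
        return osd_scrub_sleep
    # Rule 2: scheduled scrub while scrub_time_permit() is true.
    if time_permitted:
        return osd_scrub_sleep
    # Rule 3: scheduled scrub outside the permitted time. Rules 3.1 and 3.2
    # together select the larger of the two values.
    return max(osd_scrub_sleep, osd_scrub_extended_sleep)
```

Because rules 3.1 and 3.2 reduce to a `max`, setting osd_scrub_extended_sleep to a value at or below osd_scrub_sleep leaves the behaviour unchanged.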
Yingxin Cheng
469a9cda73 crimson/net: clean up, exsiting_conn and existing_proto
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2019-08-12 16:34:45 +08:00
Yingxin Cheng
014a662b20 crimson/net: next_step_t for explicit decision of next state
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2019-08-12 16:34:45 +08:00
Yingxin Cheng
b41af731b4 crimson/net: introduce 3 ways to abort the active protocol state
* abort_in_fault(): a fault is happening and needs to be handled.
* abort_protocol(): abort the current protocol state due to preemptive
                    state change.
* abort_in_close(): close this connection and abort the current protocol
                    state due to some fatal error.

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2019-08-12 16:34:38 +08:00
Kefu Chai
2e2414b3df ceph-objectstore-tool: update-mon-db: do not fail if incmap is missing
There is a chance that we use an OSD which does not have the incmap of a
certain epoch for rebuilding the monstore. An OSD does not read and store
the incmap if the MOSDMap message already has the fullmap of that epoch,
and if an OSD does not have the previous fullmap, the monitor just sends
it the fullmap. So it's not unusual that an OSD has a fullmap of some
epoch without the corresponding incmap.

Fixes: https://tracker.ceph.com/issues/41177
Signed-off-by: Kefu Chai <kchai@redhat.com>
2019-08-12 13:06:01 +08:00
Zengran Zhang
ff32a1d4ae os/bluestore: more aggressive deferred submit when onode trim skipping
Currently the deferred txcs hang until their count reaches
bluestore_deferred_batch_ops (default value is 64).
In some extreme cases, the client may generate only non-deferred txcs
for a long time, while metadata updates such as osdmap generate a few
deferred txcs at low frequency, so these txcs may hang for a long time.

The problem is that an osdmap update usually writes 3 objects: full, inc,
and superblock. While these txcs hang, the refs on these onodes are held.
When the bluestore cache trims onodes, there is an option called
bluestore_cache_trim_max_skip_pinned (default value is 64), so the deferred
txcs of 22 osdmap updates hold 66 onodes. If these onodes are at the end
of the LRU waiting to be trimmed, the trim is skipped, which leads to more
and more onodes being cached.

This is a more aggressive approach to avoid skipping the trim for too long.

Fixes: http://tracker.ceph.com/issues/21531

Signed-off-by: Zengran Zhang <zhangzengran@sangfor.com.cn>
2019-08-12 09:35:33 +08:00
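The figures quoted in ff32a1d4ae (3 objects per osdmap update, a skip limit of 64, 22 updates holding 66 onodes) can be checked with a few lines of arithmetic. The constant and function names below are hypothetical, and the sketch assumes that trimming is affected once the number of pinned onodes exceeds the limit, which is how the commit message reads.

```python
# Values quoted in the commit message.
OBJECTS_PER_OSDMAP_UPDATE = 3   # full, inc, superblock
TRIM_MAX_SKIP_PINNED = 64       # bluestore_cache_trim_max_skip_pinned default


def pinned_onodes(stalled_osdmap_updates: int) -> int:
    """Onodes held by the deferred txcs of the given number of updates."""
    return stalled_osdmap_updates * OBJECTS_PER_OSDMAP_UPDATE


def min_updates_to_exceed(limit: int = TRIM_MAX_SKIP_PINNED) -> int:
    """Smallest number of stalled updates whose onodes exceed the limit."""
    return limit // OBJECTS_PER_OSDMAP_UPDATE + 1


print(pinned_onodes(22), min_updates_to_exceed())
```

With the defaults, 21 stalled updates pin 63 onodes and stay within the limit, while the 22nd pushes the count to 66.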
Matt Benjamin
801d2f0449
Merge pull request #28157 from Kriechi/docs-rgw-ldap
docs: improve rgw ldap auth options
2019-08-11 20:45:29 -04:00
Tianshan Qu
bc82637f54 rgw: fix list bucket with delimiter wrongly skip some special keys
Listing with a delimiter skips subfiles under directory + after_delim_s,
but the code wrongly adds after_delim_s to the next marker regardless of
whether it has a directory.

Fixes: http://tracker.ceph.com/issues/40905

Signed-off-by: Tianshan Qu <tianshan@xsky.com>
2019-08-12 00:19:54 +08:00
Yuval Lifshitz
929c062ae9 rgw: don't throw when accept errors are happening on frontend
Signed-off-by: Yuval Lifshitz <yuvalif@yahoo.com>
2019-08-11 10:06:05 +03:00
Josh Durgin
3f18ed55aa
Merge pull request #28227 from sseshasa/monCachePriority
mon/OSDMonitor: Use generic priority cache tuner for mon caches

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2019-08-09 14:23:39 -07:00
Casey Bodley
bc45261470
Merge pull request #29540 from cbodley/wip-rgw-user-rename
rgw: followup for 'user rename'

Reviewed-by: Shilpa Jagannath <smanjara@redhat.com>
Reviewed-by: Matt Benjamin <mbenjamin@redhat.com>
2019-08-09 16:57:25 -04:00
Sage Weil
9346d3c3bc os/bluestore: do not set osd_memory_target default from cgroup limit
On the aarch64 box I'm testing, this gives us a value of
7378697629483768832, which is not what we want.

I think we are better off relying on this limit being explicitly set via
environment variables (POD_* by Kubernetes/Rook) or via the command line.

This partially reverts 5c6b533697, but not
all of it, since we want to keep the option itself, as it is now used by
common/config.cc when dealing with the POD_MEMORY_LIMIT env var.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-09 12:25:59 -05:00
Casey Bodley
13f1595335
Merge pull request #29558 from theanalyst/rgw-cache-lock
rgw: fix unlock of shared lock in RGWCache

Reviewed-by: Adam C. Emerson <aemerson@redhat.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>
2019-08-09 13:03:35 -04:00
Sage Weil
377fdb1484 os/bluestore: refuse to mkfs or mount if osd_max_object_size >= MAX_OBJECT_SIZE
BlueStore has its own object size limit (2^32-1).  Make sure the cluster
limit is below that or refuse to mkfs or mount.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-09 10:57:14 -05:00
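The guard described in 377fdb1484 is a simple bounds check performed before mkfs or mount. The Python sketch below only illustrates the comparison; the function name is hypothetical, and it assumes MAX_OBJECT_SIZE equals the 2^32-1 limit quoted in the commit message.

```python
# BlueStore's own object size limit, as quoted in the commit message.
# Assumption: this is the value the title calls MAX_OBJECT_SIZE.
MAX_OBJECT_SIZE = 2**32 - 1


def check_osd_max_object_size(osd_max_object_size: int) -> None:
    """Raise if the cluster-wide limit is not below BlueStore's limit."""
    if osd_max_object_size >= MAX_OBJECT_SIZE:
        raise ValueError(
            f"osd_max_object_size {osd_max_object_size} must be below "
            f"{MAX_OBJECT_SIZE}; refusing to mkfs or mount"
        )


check_osd_max_object_size(128 * 1024 * 1024)  # a 128 MiB limit passes
```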
Abhishek Lekshmanan
2b6dbe31c8 rgw: fix unlock of shared lock in RGWCache
Similar to https://github.com/ceph/ceph/pull/29538/, we unlock a shared_lock
with unlock, causing a crash. Also scope the single-line if statements to make
the code more concise.

Signed-off-by: Abhishek Lekshmanan <abhishek@suse.com>
2019-08-09 17:54:22 +02:00
Sage Weil
f011c13547 Merge PR #29292 into master
* refs/pull/29292/head:
	os/bluestore: warn on no per-pool omap
	os/bluestore: fsck: warning (not error) by default on no per-pool omap
	os/bluestore: fsck: int64_t for error count
	os/bluestore: default size of 1 TB for testing
	os/bluestore: behave if we *do* set PGMETA and PERPOOL flags
	os/bluestore: do not set both PGMETA_OMAP and PERPOOL_OMAP
	os/bluestore: fsck: only generate 1 error per omap_head
	os/bluestore: make fsck repair convert to per-pool omap
	os/bluestore: teach fsck to tolerate per-pool omap
	os/bluestore: ondisk format change to 3 for per-pool omap
	mon/PGMap: add data/omap breakouts for 'df detail' view
	osd/osd_types: separate get_{user,allocated}_bytes() into data and omap variants
	mon/PGMap: fix stored_raw calculation
	mon/PGMap: add in actual omap usage into per-pool stats
	osd: report per-pool omap support via store_statfs_t
	os/bluestore: set per_pool_omap key on mkfs
	osd/osd_types: count per-pool omap capable OSDs
	os/bluestore: report omap_allocated per-pool
	os/bluestore: add pool prefix to omap keys
	kv/KeyValueDB: take key_prefix for estimate_prefix_size()
	os/bluestore: fix manual omap key manipulation to use Onode::get_omap_key()
	os/bluestore: make omap key helpers Onode methods
	os/bluestore: add Onode::get_omap_prefix() helper
	os/bluestore: change _do_omap_clear() args

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2019-08-09 10:40:45 -05:00
Alfredo Deza
8363d89a4d
Merge pull request #29528 from tchaikov/wip-build-doc-with-python3
admin/build-doc: use python3

Reviewed-by: Alfredo Deza <adeza@redhat.com>
2019-08-09 11:17:19 -04:00
Sage Weil
9426974195 os/bluestore/BlueFS: fix device_migrate_to_* to handle varying alloc sizes
The previous implementation moved extents individually.  This caused
problems when moving an extent with a small alloc_size that wasn't
a multiple of the target device's alloc_size.

Instead, identify files with extents that need to be moved, and then read
the file in its entirety and rewrite it in its entirety.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-09 10:10:12 -05:00
Sage Weil
e8b5a458c3 os/bluestore/BlueFS: apply shared_alloc_size to shared device
Keep an alloc_size vector so that we have this value handy at all times.
Allow bluestore to fetch this value directly instead of looking at the
bluefs_* config options since this encapsulates things a bit better, and
also isn't vulnerable to the config setting changing at runtime.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-09 10:10:12 -05:00
Abhishek Lekshmanan
fac4ab71fb rgw: url decode PutUserPolicy params
Since these are sent as part of a POST request, which is usually URL-encoded,
the JSON parser would later report invalid JSON for documents containing
whitespace.

Fixes: https://tracker.ceph.com/issues/41189
Signed-off-by: Abhishek Lekshmanan <abhishek@suse.com>
2019-08-09 16:57:25 +02:00
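The failure mode behind fac4ab71fb is easy to reproduce outside RGW. The sketch below uses Python's standard library rather than the RGW code: a JSON document that has been form-encoded for a POST body is rejected by a JSON parser until it is URL-decoded. The helper name `parse_policy_param` and the sample policy text are made up for illustration.

```python
import json
from urllib.parse import quote_plus, unquote_plus


def parse_policy_param(raw_param: str) -> dict:
    """URL-decode a form-encoded parameter, then parse it as JSON."""
    return json.loads(unquote_plus(raw_param))


# A sample policy document containing whitespace (illustrative only).
policy_text = '{"Version": "2012-10-17", "Statement": []}'
encoded = quote_plus(policy_text)  # what a form-encoded POST body carries

try:
    json.loads(encoded)            # parsing the still-encoded text fails
except json.JSONDecodeError:
    print("encoded parameter is not valid JSON")

print(parse_policy_param(encoded))
```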
Varsha Rao
c023adf195 mds: Reorganize class members in FSMap header
Fixes: https://tracker.ceph.com/issues/41181
Signed-off-by: Varsha Rao <varao@redhat.com>
2019-08-09 20:01:41 +05:30
Sage Weil
39db4d7c4b os/bluestore/KernelDevice: print aio error extent in hex
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-09 09:23:25 -05:00
Matt Benjamin
a7b29647fd
Merge pull request #29560 from linuxbox2/wip-rgwf-advance
rgw_file: dont deadlock in advance_mtime()
2019-08-09 10:01:03 -04:00
Sage Weil
b8501164ef os/bluestore: warn on no per-pool omap
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-09 08:21:18 -05:00
Jan Fajerski
b8d6dcbe9f ceph-volume: never log to stdout, use stderr instead
We should never print log messages to stdout, as this should be reserved
for output of ceph-volume.

Fixes: https://tracker.ceph.com/issues/41158

Signed-off-by: Jan Fajerski <jfajerski@suse.com>
2019-08-09 14:26:16 +02:00
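The separation that b8d6dcbe9f asks for, log messages on stderr and program output on stdout, can be sketched with Python's standard logging module. This is not the ceph-volume code; `make_logger` and the logger name are hypothetical.

```python
import logging
import sys


def make_logger(name: str = "example_tool", stream=None) -> logging.Logger:
    """Build a logger that writes to stderr (or an explicit stream)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # Log records go to stderr so stdout carries only the tool's output.
    handler = logging.StreamHandler(stream if stream is not None else sys.stderr)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False  # avoid duplicates via root logger handlers
    return logger


log = make_logger()
log.info("scanning devices")        # goes to stderr
print('{"devices": []}')            # machine-readable output on stdout
```

A caller piping the tool's stdout into a JSON parser then never sees log lines mixed into the output.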
Sebastian Krah
9edf36f464 mgr/dashboard: Update language files
Signed-off-by: Sebastian Krah <skrah@suse.com>
2019-08-09 10:50:29 +02:00
Sebastian Krah
d33b5d33b5 mgr/dashboard: Add transifex-i18ntool
Adds the npm package transifex-i18ntool which manages the translation
files of the ceph dashboard

Signed-off-by: Sebastian Krah <skrah@suse.com>
2019-08-09 10:50:29 +02:00
alfonsomthd
bc8e811b08 mgr/dashboard: adapt bucket tenant tests to new behaviour
Fixes: https://tracker.ceph.com/issues/41175
Signed-off-by: alfonsomthd <almartin@redhat.com>
2019-08-09 10:05:46 +02:00
Kefu Chai
2db496017a
Merge pull request #29400 from wjwithagen/wjw-fix-do_freeBSD.sh
do_freebsd.sh: update build scripts to resemble Jenkins scripts

Reviewed-by: Kefu Chai <kchai@redhat.com>
2019-08-09 14:09:42 +08:00
Kefu Chai
ed8ff905fc
Merge pull request #29495 from ifed01/wip-ifed-finisher-improve
common/Finisher: remove some lock acquisitions.

Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Jianpeng Ma <jianpeng.ma@intel.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2019-08-09 12:24:28 +08:00
Kefu Chai
e5165db985
Merge pull request #29477 from tchaikov/wip-osd-pg-as-mutex
osd: pg as a mutex

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
2019-08-09 12:22:26 +08:00
Kefu Chai
abce85bbd2
Merge pull request #29486 from runsisi/wip-fix-verbose
ceph.in: fix verbose print

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2019-08-09 12:21:29 +08:00
Kefu Chai
32b33d3258
Merge pull request #29465 from penglaiyxy/wip_bluestore_caculated_revert
os/bluestore: no need to add tail length (revert PR#29185)

Reviewed-by: Igor Fedotov <ifedotov@suse.com>
2019-08-09 12:20:46 +08:00
Kefu Chai
88c6c039d1
Merge pull request #29488 from majianpeng/bluestore-remove-lock
os/bluestore: no need protected by OpSequencer::qlock.

Reviewed-by: Igor Fedotov <ifedotov@suse.com>
2019-08-09 12:19:27 +08:00
Kefu Chai
3efc51fa1d
Merge pull request #29385 from kamoltat/wip-qa-tasks-mgr-test-progress-bug-fix
qa/tasks/mgr/test_progress.py: fix bug in 9b4dbf0

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2019-08-09 12:18:40 +08:00