124541 Commits

Author SHA1 Message Date
Kefu Chai
07143c4dc3
Merge pull request #41946 from liewegas/fix-51294
mgr/devicehealth: fix _get_device_metrics ValueError

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-06-26 22:17:30 +08:00
Sage Weil
1bfa812f5d qa/tasks/vstart_runner: add LocalCluster.run
Signed-off-by: Sage Weil <sage@newdream.net>
2021-06-26 08:47:06 -04:00
Sage Weil
8788af5663 qa/tasks/cephfs/test_nfs: fiddle with sudo
- no sudo for 'ceph' commands
- explicit sudo for _sys_cmd (things like 'rados' don't need sudo!)

Signed-off-by: Sage Weil <sage@newdream.net>
2021-06-26 08:47:06 -04:00
Sage Weil
d1c20f8003 mgr/nfs/export: some cleanup, minor refactoring
Signed-off-by: Sage Weil <sage@newdream.net>
2021-06-26 08:47:06 -04:00
Sage Weil
51bb1703f1 mgr/nfs/cluster: remove unused @cluster_setter
Signed-off-by: Sage Weil <sage@newdream.net>
2021-06-26 08:47:06 -04:00
Kefu Chai
8533dbe4f9
Merge pull request #41977 from rzarzynski/wip-crimson-common-print-more-on-crash
crimson/common: dump more on faults

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2021-06-26 09:08:34 +08:00
Patrick Donnelly
c82379c28a
mds: fix compile warning
../src/mds/Server.cc: In member function ‘void Server::handle_set_vxattr(MDRequestRef&, CInode*)’:
    ../src/mds/Server.cc:5703:18: warning: unused variable ‘realm’ [-Wunused-variable]
           SnapRealm *realm = cur->find_snaprealm();
                      ^~~~~

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-06-25 17:55:47 -07:00
Sage Weil
14cf8c7174 nfs/mgr: fix help message case
Signed-off-by: Sage Weil <sage@newdream.net>
2021-06-25 19:13:09 -04:00
Sage Weil
fc304f2d56 doc/cephfs/fs-nfs-export: add note about export update behavior
Signed-off-by: Sage Weil <sage@newdream.net>
2021-06-25 19:13:09 -04:00
Sage Weil
10786a6380 mgr/nfs: move user create/delete into helper
- Do user create or delete via a helper
- Defer until after we have validated the Export (on create or update)
- Support updates to user_id, which is needed to keep the naming consistent
and to also support changing the bucket, since the user_id is derived
from that.

Signed-off-by: Sage Weil <sage@newdream.net>
2021-06-25 19:13:09 -04:00
Ernesto Puerta
62e3a5c41c
Merge pull request #41838 from p-se/grafana-clean-up
monitoring: Clean up Grafana dashboards

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: jan--f <NOT@FOUND>
Reviewed-by: p-se <NOT@FOUND>
Reviewed-by: Paul Cuzner <pcuzner@redhat.com>
2021-06-25 20:45:28 +02:00
Sage Weil
3edc04a46b qa/suites/rados/mgr: whitelist module crash during selftest
One of the selftests triggers an exception from serve().

Signed-off-by: Sage Weil <sage@newdream.net>
2021-06-25 13:48:45 -04:00
Ernesto Puerta
26df5df247
Merge pull request #41721 from aaryanporwal/telemetry-ident-fix
mgr/dashboard: telemetry activate: show ident fields when checked

Reviewed-by: aaryanporwal <NOT@FOUND>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
2021-06-25 18:48:34 +02:00
Daniel Gryniewicz
806480eaa0
Merge pull request #41991 from dang/wip-dang-bucket-delete
RGW - Bucket Remove Op: Pass in user

Reviewed-by: Casey Bodley <cbodley@redhat.com>
2021-06-25 12:00:37 -04:00
Neha Ojha
4b6619a41d
Merge pull request #41993 from ronen-fr/wip-ronenf-50346
osd/scrub: replace a ceph_assert() with a test

Reviewed-by: Neha Ojha <nojha@redhat.com>
2021-06-25 08:48:45 -07:00
Sebastian Wagner
8ef5657cf6
cephadm: Fix normalize_image_digest for local registries
Because they typically don't have dots in them.

Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
2021-06-25 17:26:49 +02:00
Kefu Chai
82b78ab27e
Merge pull request #42024 from rzarzynski/wip-crimson-load_obc_nocpy
crimson/osd: don't extra copy hobject in PG::load_head_obc().

Reviewed-by: Kefu Chai <kchai@redhat.com>
2021-06-25 21:02:47 +08:00
Radoslaw Zarzynski
45a173f79a crimson/osd: don't extra copy hobject in PG::load_head_obc().
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2021-06-25 11:45:13 +00:00
Kefu Chai
7213811dd0 ceph.spec.in: increase memory per core to 3000MB on SUSE distros
in the KVM instance offered by OBS, we have

[  346s] + cat /proc/meminfo
[  347s] MemTotal:       10167736 kB
[  347s] MemFree:         4983964 kB
[  347s] MemAvailable:    9826800 kB
[  347s] Buffers:           85856 kB
[  347s] Cached:          4615192 kB
[  347s] SwapCached:            0 kB
...
[  347s] SwapTotal:       2097148 kB

and its number of hardware threads is

[  346s] ++ /usr/bin/getconf _NPROCESSORS_ONLN
[  346s] + _threads=8

so ($MemTotal+$SwapTotal)/1024/2600 = 4.6, which is less
than the # of threads, so "4" was used for the number of jobs.

but per our recent observation in
38be14bc0f, some compiling jobs could
take up to 3GB. in the OOM failure in OBS, we had

[24915s] [24848.843594] Out of memory: Killed process 16894 (cc1plus) total-vm:4293756kB, anon-rss:2970012kB, file-rss:0kB, shmem-rss:0kB, UID:399 pgtables:8324kB oom_score_adj:0

where about 4GiB of virtual memory was allocated, of which about 3GiB
was resident. this matches our findings.

in this change, the memory per core is bumped up to 3000MB
in the hope of addressing the OOM. the downside of this change is
that it would take even longer to finish the build if the
building host is limited in memory.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2021-06-25 19:27:34 +08:00
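The job-count arithmetic described in the ceph.spec.in commit above can be sketched as follows. This is a minimal illustration, assuming the spec file takes the integer part of (MemTotal + SwapTotal) / 1024 / memory-per-job and caps it at the number of hardware threads. The function name `build_jobs` and its parameters are hypothetical and are not taken from ceph.spec.in.

```python
def build_jobs(mem_total_kb, swap_total_kb, threads, mb_per_job):
    """Estimate the number of parallel build jobs (hypothetical helper).

    Assumption: the job count is limited both by the number of hardware
    threads and by how many jobs of `mb_per_job` MB fit into RAM plus swap.
    """
    total_mb = (mem_total_kb + swap_total_kb) // 1024
    jobs_by_memory = total_mb // mb_per_job
    # Never go below one job, never above the number of threads.
    return max(1, min(threads, jobs_by_memory))


# Figures quoted in the commit message: MemTotal, SwapTotal, 8 threads.
old_jobs = build_jobs(10167736, 2097148, 8, 2600)
new_jobs = build_jobs(10167736, 2097148, 8, 3000)
print(old_jobs, new_jobs)
```

With the figures from the commit message, 2600MB per job yields 4 jobs and 3000MB per job yields 3, which illustrates the trade-off the message mentions: fewer concurrent compile jobs, so a longer build on memory-limited hosts.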
Kefu Chai
74df5af8e2
Merge pull request #41615 from tchaikov/wip-avl-alloc-ff
os/bluestore/AvlAllocator: introduce bluestore_avl_alloc_ff_max_* options

Reviewed-by: Igor Fedotov <ifedotov@suse.com>
Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>
2021-06-25 17:01:11 +08:00
Kefu Chai
7b07baf457
Merge pull request #38939 from ronen-fr/wip-ronenf-scrub-blocked
osd: issue a warning if the scrubber blocks for too long on an object

Reviewed-by: David Zafman <dzafman@redhat.com>
2021-06-25 14:57:31 +08:00
Kefu Chai
2eca09ef5f
Merge pull request #40850 from varshar16/wip-vstart-support-cephadm-rgw
src/vstart: deploy rgw service with cephadm and create rgw user with system flag

Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2021-06-25 14:51:25 +08:00
Samuel Just
daf27534fe
Merge pull request #42020 from athanatos/sjust/wip-cache-assert
crimson/os/seastore: transaction conflict handling improvements

Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
2021-06-24 23:08:02 -07:00
Kefu Chai
174aa6b163
Merge pull request #42003 from cyx1231st/wip-seastore-fix-onode-tree
crimson/onode-staged-tree: fix ref-counter assert failures

Reviewed-by: Kefu Chai <kchai@redhat.com>
2021-06-25 12:52:23 +08:00
Radoslaw Zarzynski
16dd71bdab crimson/common: dump entire siginfo on segmentation fault.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2021-06-25 04:16:30 +00:00
Radoslaw Zarzynski
9250882347 crimson/common: FatalSignal::signaled() takes siginfo by a reference.
There is no point in having the distinct `nullptr` value.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2021-06-25 04:16:30 +00:00
Radoslaw Zarzynski
231a5e7e5c crimson/common: dump /proc/self/maps on crash.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2021-06-25 04:16:04 +00:00
Xiubo Li
c854a4eea4 mds: just respawn mds daemon when osd op requests timeout
Fixes: https://tracker.ceph.com/issues/51280
Signed-off-by: Xiubo Li <xiubli@redhat.com>
2021-06-25 12:05:41 +08:00
Kevin Zhao
8c7729af08 ceph.spec.in, debian/rules: Set rbd-rwl-cache optional on arm64 and ppc64le
Make the rwl cache optional on arm64 and ppc64le, as PMDK is not well supported there.
Currently, PMDK supports only 64-bit Linux* and Windows* on x86.

Reference:
1. Experimental support on Arm64, but lacking librpmem:
See: https://github.com/pmem/pmdk#experimental-support-for-64-bit-arm
2. No RPM for PMDK on Arm64:
See: https://bugzilla.redhat.com/show_bug.cgi?id=1340635
3. > Does PMDK support ARM64*?
   > Currently only 64-bit Linux* and Windows* on x86 are supported.
See: https://software.intel.com/content/www/us/en/develop/articles/persistent-memory-faq.html
4. `make check` fails on Arm64
See: https://github.com/pmem/pmdk/issues/5255

Fixes: https://tracker.ceph.com/issues/51339
Signed-off-by: Kevin Zhao <kevin.zhao@linaro.org>
2021-06-25 11:53:18 +08:00
Kefu Chai
423f8d3c23
Merge pull request #41889 from ChenFanTony/mkfs_wait_complete
osd/OSD: mkfs needs to wait for the transaction to finish completely

Reviewed-by: Kefu Chai <kchai@redhat.com>
2021-06-25 10:59:55 +08:00
Yingxin Cheng
d2454022f0 crimson/onode-staged-tree: reset root node after lookup
Otherwise there could be unexpected references that will break the
asserts when removing nodes during insert/delete.

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2021-06-25 10:40:28 +08:00
Yingxin Cheng
be96437157 crimson/onode-staged-tree: add missing mutable keyword
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2021-06-25 10:37:21 +08:00
Kefu Chai
d45f9e469e
Merge pull request #42004 from tchaikov/wip-crimson-osd-fsm
crimson/osd: shutdown if osdmap forces us to do so

Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2021-06-25 08:27:41 +08:00
Samuel Just
6df608159e seastore/.../staged_fltree/node: check for conflict in Node::load
This will be unnecessary once converted to interruptible_future.

Signed-off-by: Samuel Just <sjust@redhat.com>
2021-06-24 16:30:15 -07:00
Samuel Just
9a514cccaa crimson/os/seastore/lba_manager/btree/lba_btree_node_impl: add debugging
Signed-off-by: Samuel Just <sjust@redhat.com>
2021-06-24 16:30:15 -07:00
Samuel Just
0cb483303c seastore/.../node_extent_manager/seastore: detect transaction conflicts in read_extent
This won't be necessary once converted to interruptible_future.

Signed-off-by: Samuel Just <sjust@redhat.com>
2021-06-24 16:29:52 -07:00
Samuel Just
2bc257beb2 crimson/os/seastore/cache: mark conflict in get_extent
After wait_io, the extent may have been mutated again, so it may be
invalid.  Check in the caller and mark the transaction conflicted as
needed.

Signed-off-by: Samuel Just <sjust@redhat.com>
2021-06-24 16:29:23 -07:00
Samuel Just
add641a286 crimson/os/seastore/transaction: expose is_conflicted
Useful for components not yet converted to use interruptible_future.

Signed-off-by: Samuel Just <sjust@redhat.com>
2021-06-24 16:29:23 -07:00
Samuel Just
94d5650157
Merge pull request #41963 from athanatos/sjust/wip-interruptible-tm
crimson/os/seastore: refactor transaction_manager and below to use interruptible_future

Reviewed-by: Kefu Chai <kchai@redhat.com>
2021-06-24 13:19:47 -07:00
Sage Weil
73b9653cfe mgr/devicehealth: fix _get_device_metrics ValueError
This appears to have broken with abd35d4769

The SQL OR doesn't work because in the case that sample is passed,
_t2epoch(min_sample) is 0 and the 0 <= time portion of the expression
is always true.

Fixes: https://tracker.ceph.com/issues/51294
Signed-off-by: Sage Weil <sage@newdream.net>
2021-06-24 14:55:32 -05:00
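The failure mode described in the mgr/devicehealth commit above can be reproduced in isolation. The sketch below is only an illustration: the table name `metrics`, its columns, the function `fetch`, and both queries are hypothetical and are not the module's actual schema or code. It assumes the broken query combined "exact sample" and "minimum time" with a single OR, so that a minimum of 0 made the second operand true for every row.

```python
import sqlite3


def fetch(conn, sample=None, min_sample=0, buggy=False):
    """Return matching timestamps from a hypothetical `metrics` table."""
    if buggy:
        # Single OR expression: when min_sample is 0, `0 <= time` holds
        # for every row, so the `time = ?` filter has no effect.
        sql = "SELECT time FROM metrics WHERE time = ? OR ? <= time ORDER BY time"
        args = (sample, min_sample)
    elif sample is not None:
        # One possible fix: choose the condition in Python instead of SQL.
        sql = "SELECT time FROM metrics WHERE time = ? ORDER BY time"
        args = (sample,)
    else:
        sql = "SELECT time FROM metrics WHERE ? <= time ORDER BY time"
        args = (min_sample,)
    return [row[0] for row in conn.execute(sql, args)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metrics (time INTEGER, data TEXT)")
conn.executemany(
    "INSERT INTO metrics VALUES (?, ?)",
    [(100, "a"), (200, "b"), (300, "c")],
)

buggy_rows = fetch(conn, sample=200, buggy=True)
fixed_rows = fetch(conn, sample=200)
print(buggy_rows, fixed_rows)
```

In this sketch the OR-based query returns all three rows even though a single sample was requested, while the split query returns only the requested one.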
Samuel Just
45f42b82a3 test/crimson/test_interruptible_future: disable handle_error
Seems to cause a linker hang with gcc-9 in bionic.

Signed-off-by: Samuel Just <sjust@redhat.com>
2021-06-24 11:49:27 -07:00
Samuel Just
3ef4040c41 crimson/os/seastore/transaction_manager: pass t by ref to submit_transaction
Signed-off-by: Samuel Just <sjust@redhat.com>
2021-06-24 11:49:25 -07:00
Casey Bodley
d7a6c47026
Merge pull request #39934 from Jeegn-Chen/wip-tracker-49128
rgw: write meta of a MP part to a correct pool

Reviewed-by: Casey Bodley <cbodley@redhat.com>
2021-06-24 12:17:53 -04:00
Casey Bodley
6c75be40fd
Merge pull request #41739 from liewegas/rgw-realm-metadata
radosgw: include realm_{id,name} in service map

Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>
2021-06-24 12:16:19 -04:00
Ronen Friedman
d232c4e8d8 qa/suites/rados: add simultaneous scrubs (multiple options) to the thrasher
Setting osd-max-scrubs to either 2 or 3.

Triggered by https://tracker.ceph.com/issues/50346

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
2021-06-24 18:53:50 +03:00
Daniel Gryniewicz
a77775caa4 RGW - Bucket Remove Op: Pass in user
When a bucket remove op is called on the non-master zone, the op is
forwarded to the master zone, but this needs a user, so pass the user
in.

Signed-off-by: Daniel Gryniewicz <dang@redhat.com>
2021-06-24 11:42:45 -04:00
zdover23
5dce141b79
Merge pull request #41994 from anthonyeleven/anthonyeleven/adjust-rados-operations-pools
doc/rados/operations: Update pools.rst

Reviewed-by: Zac Dover <zac.dover@gmail.com>
2021-06-24 23:51:30 +10:00
Ilya Dryomov
7641537c7d
Merge pull request #42005 from trociny/wip-51342
test/librbd: use really invalid domain

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
2021-06-24 14:48:13 +02:00
Kefu Chai
3c1cae8dd5 ceph.spec.in: enable --with-rbd_ssd_cache by default
unlike rbd_rwl_cache, rbd_ssd_cache does not depend on pmdk (libpmem),
so let's enable it on all supported architecture and rpm based distros.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2021-06-24 19:54:50 +08:00
Mykola Golub
7a2a3fed4c test/librbd: use really invalid domain
in TestMockMigrationHttpClient.OpenResolveFail

Fixes: https://tracker.ceph.com/issues/51342

Signed-off-by: Mykola Golub <mgolub@suse.com>
2021-06-24 12:25:09 +01:00