Commit Graph

137521 Commits

Author SHA1 Message Date
Xiubo Li
ef2cdfdefa qa: introduce postmerge for fuse/kclient mounts
Suggested by Patrick; this checks the mounter's type.

Fixes: https://tracker.ceph.com/issues/57591
Signed-off-by: Xiubo Li <xiubli@redhat.com>
2023-03-29 09:07:58 +08:00
Xiubo Li
e123fcaadc qa: remove the '0-' prefix
Both nautilus and pacific will be run in parallel.

Fixes: https://tracker.ceph.com/issues/57591
Signed-off-by: Xiubo Li <xiubli@redhat.com>
2023-03-29 09:07:58 +08:00
Zac Dover
71ee225d7b doc/start: documenting-ceph - add squash procedure
Add a procedure to doc/start/documenting-ceph.rst that explains how to
perform an interactive rebase to squash commits.

Signed-off-by: Zac Dover <zac.dover@proton.me>
2023-03-29 09:57:48 +10:00
Avan Thakkar
51a89906df exporter: use only counter dump/schema commands for extracting counters
Fixes: https://tracker.ceph.com/issues/59191
Signed-off-by: Avan Thakkar <athakkar@redhat.com>

The Ceph exporter no longer requires the output of perf dump/schema, as the ``counter dump`` command
returns both labeled and unlabeled perf counters, which the exporter can fetch and export.
The ``exporter_get_labeled_counters`` config option is removed, as the exporter now exports
all counters, labeled or unlabeled.
The fix also adds support for renaming the metric names of rgw multi-site and
adding labels to them, similar to what the prometheus module does.
2023-03-28 23:42:09 +05:30
Anthony D'Atri
cec15a5992
Merge pull request #50713 from zdover23/wip-doc-2023-03-28-glossary-cephx
doc/glossary: improve "CephX" entry
2023-03-28 08:28:49 -04:00
Aashish Sharma
9a28ba2a89
Merge pull request #50529 from rhcs-dashboard/dashboard-edit-rgw-multisite
mgr/dashboard: edit realm in rgw-multisite


Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
2023-03-28 17:08:17 +05:30
Pere Diaz Bou
bd0eb20c67 mgr/dashboard: rgw role creation form
Fixes: https://tracker.ceph.com/issues/59187
Signed-off-by: Pere Diaz Bou <pdiazbou@redhat.com>
Signed-off-by: Nizamudeen A <nia@redhat.com>
2023-03-28 11:19:10 +02:00
Zac Dover
02e3a5cb76 doc/glossary: improve "CephX" entry
Improve the glossary entry for "CephX".

Signed-off-by: Zac Dover <zac.dover@proton.me>
2023-03-28 18:54:07 +10:00
Aashish Sharma
eb56f2680c mgr/dashboard: Add unit test for realm
Fixes: https://tracker.ceph.com/issues/59171
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
2023-03-28 12:26:17 +05:30
Yingxin
0793495b9d
Merge pull request #50653 from xxhdx1985126/wip-exist-clean
crimson/os/seastore/cache: consider EXIST_CLEAN extents as pending ones

Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
2023-03-28 14:07:43 +08:00
Yingxin Cheng
865285a53c crimson/os/seastore/cache: use CachedExtent::is_mutable() where appropriate
CachedExtent::is_mutable() should only be used to check whether we need to
call duplicate_for_write(extent).

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2023-03-28 09:41:55 +08:00
zdover23
6a4088a9c8
Merge pull request #50697 from zdover23/wip-doc-2023-03-28-glossary-scrubbing
doc/glossary: add "Scrubbing"

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
2023-03-28 10:39:16 +10:00
Yaarit Hatuka
d51947fe8a mgr/telemetry: add leaderboard description and documentation
Users who have opted in to telemetry can also opt in to participating in
a leaderboard on the public telemetry dashboards
(https://telemetry-public.ceph.com/).

Users can also add a description of the cluster to publicly appear in
the leaderboard.

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
2023-03-27 22:51:51 +00:00
Zac Dover
4a66819da4 doc/glossary: add "Scrubbing"
Add "Scrubbing" to the glossary.

Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
2023-03-28 08:07:30 +10:00
Patrick Donnelly
d8b6d45184
tools/cephfs: include lost+found in scan_links
Otherwise, any injected dentries have incorrect first snapids.

Fixes: https://tracker.ceph.com/issues/59183
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2023-03-27 14:55:48 -04:00
Adam King
5493e2d330 qa/cephadm: add check that iscsi daemon /etc/hosts matches host /etc/hosts
This makes sure we aren't affected by any podman-introduced
changes to the /etc/hosts file and tests that we're properly
mounting /etc/hosts in our daemon containers.

Signed-off-by: Adam King <adking@redhat.com>
2023-03-27 14:01:30 -04:00
Adam King
dd8627bbe3 cephadm: mount host /etc/hosts for containers in podman deployments
Podman messes with the /etc/hosts file in certain versions. There
was already a past issue with it placing the container name
there fixed by https://github.com/ceph/ceph/pull/42242. This time
it is adding an entry for "host.containers.internal" (seems to be
podman 4.1 onward currently). Iscsi figures out the FQDN for a
host by running

python3 -c 'import socket; print(socket.getfqdn())'

which is resolving to "host.containers.internal" when run in
the container with the podman modified /etc/hosts.

There is also an issue with the grafana dashboard when
this entry is present.

Passing --no-hosts resolves this, but I think in the past
we avoided that due to not wanting to break deployments
where host name resolution was handled using /etc/hosts.
That's why we had that workaround previously linked. This
time I'm not sure such a workaround exists. The approach here
is to mount a copy of the host's version of /etc/hosts
into the iscsi container. That copy won't have the extra
entry podman adds in but will have any user created entries in
case they were actually using it for host name resolution.
If /etc/hosts file isn't present for whatever reason, we're
assuming that this user isn't using /etc/hosts for hostname
resolution, and just going back to passing --no-hosts.
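
The fallback described above could be sketched roughly as follows. This is only an illustration, not the actual cephadm code: the helper name `hosts_args`, the `-v` bind-mount form, and the `:ro` suffix are assumptions made for the example. Only the `--no-hosts` flag and the /etc/hosts path come from the description.

```python
import os
from typing import List


def hosts_args(hosts_path: str = "/etc/hosts") -> List[str]:
    """Return extra podman arguments for /etc/hosts handling (hypothetical helper)."""
    if os.path.exists(hosts_path):
        # Mount the host's file into the container so that user-created
        # entries remain available for host name resolution.
        return ["-v", f"{hosts_path}:/etc/hosts:ro"]
    # No hosts file on the host: assume it is not used for name resolution
    # and fall back to --no-hosts.
    return ["--no-hosts"]


print(hosts_args())
```

Returning a list of arguments keeps the decision in one place, so the caller can append the result to whatever container command line it builds.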

Fixes: https://tracker.ceph.com/issues/58532
Fixes: https://tracker.ceph.com/issues/57018

Signed-off-by: Adam King <adking@redhat.com>
2023-03-27 14:01:11 -04:00
Adam King
c989b0a351
Merge pull request #48937 from adk3798/device-ls-size
mgr/orchestrator: fix device size in `orch device ls` output

Reviewed-by: Redouane Kachach <rkachach@redhat.com>
2023-03-27 13:55:31 -04:00
Adam King
a57fc000cc cephadm: handle exceptions applying secondary services during bootstrap
Otherwise we risk hitting a mismatch between the cephadm binary version
and the container image version we're bootstrapping on, resulting in
bootstrap failing. Example in the tracker.

Fixes: https://tracker.ceph.com/issues/59082

Signed-off-by: Adam King <adking@redhat.com>
2023-03-27 13:34:54 -04:00
Casey Bodley
85032c74a4 rgw/admin: 'data sync status' formats binary error repo entries
Fixes: https://tracker.ceph.com/issues/59174

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2023-03-27 10:56:40 -04:00
Xiubo Li
043d4abccf
Merge pull request #50681 from lxbsz/qa-fscrypt-020
qa: fscrypt enable xfstests-dev generic/020 test case
2023-03-27 21:31:06 +08:00
avanthakkar
f658ac2670 disable default check if already set to true for selected realm
Fixes: https://tracker.ceph.com/issues/59171
Signed-off-by: avanthakkar <avanjohn@gmail.com>
2023-03-27 17:42:08 +05:30
avanthakkar
d42ea1d5af disable create zonegroup if no master zone exists for existing master zonegroup
Fixes: https://tracker.ceph.com/issues/59171
Signed-off-by: avanthakkar <avanjohn@gmail.com>
2023-03-27 17:42:01 +05:30
avanthakkar
e804800432 mgr/dashboard: edit rgw-multisite
Fixes: https://tracker.ceph.com/issues/59171
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
2023-03-27 17:41:44 +05:30
Redouane Kachach
a2927e0cdd
mgr/cephadm: fixing ceph-exporter prometheus's job section
Fixes: https://tracker.ceph.com/issues/59170

Signed-off-by: Redouane Kachach <rkachach@redhat.com>
2023-03-27 13:16:03 +02:00
Yuval Lifshitz
e100d392a0 rgw/notifications: support bucket notification with bucket policy
The following policy should be used to allow any user to get, put and delete
bucket notifications on a bucket called "my-bucket":
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Statement",
      "Effect": "Allow",
      "Principal": "*",
      "Action": ["s3:GetBucketNotification", "s3:PutBucketNotification"],
      "Resource": "arn:aws:s3:::my-bucket"
    }
  ]
}

Note that notification deletion uses the "PUT" permission.
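
As a minimal sketch, the same policy document can be generated for any bucket name. The function name `notification_policy` is hypothetical and not part of any RGW tooling. How the resulting JSON is applied to the bucket is left out.

```python
import json


def notification_policy(bucket: str) -> str:
    """Build the policy shown above for the given bucket name (hypothetical helper)."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "Statement",
                "Effect": "Allow",
                "Principal": "*",
                # No separate delete action is listed: per the note above,
                # deletion is covered by s3:PutBucketNotification.
                "Action": ["s3:GetBucketNotification", "s3:PutBucketNotification"],
                "Resource": f"arn:aws:s3:::{bucket}",
            }
        ],
    }
    return json.dumps(policy, indent=2)


print(notification_policy("my-bucket"))
```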

Fixes: https://tracker.ceph.com/issues/59136

Signed-off-by: Yuval Lifshitz <ylifshit@redhat.com>
2023-03-27 10:26:06 +00:00
Xuehan Xu
f34faf363e crimson/os/seastore/cache: consider EXIST_CLEAN extents as pending ones
Signed-off-by: Xuehan Xu <xxhdx1985126@gmail.com>
2023-03-27 15:32:34 +08:00
Xiubo Li
585481f343 qa: fscrypt enable xfstests-dev generic/020 test case
Since https://git.ceph.com/xfstests-dev.git has pulled the
corresponding fix for the long attribute in the generic/020 test case,
we can enable it now.

Signed-off-by: Xiubo Li <xiubli@redhat.com>
2023-03-27 14:25:52 +08:00
zdover23
f5c5009eab
Merge pull request #50675 from zdover23/wip-doc-2023-03-27-rados-operations-bluestore-migration-cleanup
doc/rados: clean up ops/bluestore-migration.rst

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
2023-03-27 05:06:45 +10:00
Zac Dover
b28be76d0f doc/rados: clean up ops/bluestore-migration.rst
Clean up internal links, fix the numbering of a procedure, and implement
Anthony D'Atri's suggestions in
https://github.com/ceph/ceph/pull/50487 and
https://github.com/ceph/ceph/pull/50488.

https://tracker.ceph.com/issues/58485

Signed-off-by: Zac Dover <zac.dover@proton.me>
2023-03-27 02:07:30 +10:00
zdover23
9792102cb3
Merge pull request #50654 from zdover23/wip-doc-2023-03-24-glossary-user
doc/glossary: add "User"

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
2023-03-26 05:43:00 +10:00
Zac Dover
fd6bfaf3fe doc/glossary: add "User"
Add "User" to glossary.rst.

Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
2023-03-25 06:50:00 +10:00
Anthony D'Atri
68df405e53
Merge pull request #50660 from zdover23/wip-doc-2023-03-25-rados-operations-bluestore-migration-prompt-fix
doc/operations: fix prompt in bluestore-migration
2023-03-24 13:54:06 -04:00
Zac Dover
5e54641aec doc/operations: fix prompt in bluestore-migration
Fix a single prompt in bluestore-migration.rst.

Signed-off-by: Zac Dover <zac.dover@proton.me>
2023-03-25 03:47:10 +10:00
Laura Flores
9fbedc6c9b qa/crontab: add reef upgrade tests and teuthology/nop
Signed-off-by: Laura Flores <lflores@redhat.com>
2023-03-24 11:15:13 -05:00
Redouane Kachach
b431e308a7
qa: adding logic to wait for rgw realm tokens before testing
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
2023-03-24 15:59:10 +01:00
Redouane Kachach
17bcfa8b99
mgr/cephadm: increasing container stop timeout for OSDs
Fixes: https://tracker.ceph.com/issues/58158

Signed-off-by: Redouane Kachach <rkachach@redhat.com>
2023-03-24 13:09:14 +01:00
Ilya Dryomov
b89782a369
Merge pull request #50302 from weirdwiz/rbd-perf-counters
rbd-mirror: switch to labeled perf counters

Reviewed-by: Juan Miguel Olmo <jolmomar@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
2023-03-24 13:00:10 +01:00
Rishabh Dave
76177ab1a9
Merge pull request #50497 from rishabh-d-dave/fs-qa-caps-helper
qa/cephfs: add more helper methods to caps_helper.py

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2023-03-24 16:28:38 +05:30
Ilya Dryomov
4431be49fc
Merge pull request #49302 from petrutlucian94/adapter_resets
rbd-wnbd: optionally handle wnbd adapter restart events

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
2023-03-24 11:41:04 +01:00
Avan
5976b1f21d
Merge pull request #50369 from rhcs-dashboard/exporter-labeled-counters
exporter: add support for exposing labeled perf counters
2023-03-24 14:56:28 +05:30
Lucian Petrut
98a7aff741 rbd-wnbd: consistently use negative error codes in rbd-wnbd
The rbd-wnbd iterators return positive errors, which is why
in certain cases we may end up with both positive and negative
error codes.

This change ensures that we'll consistently use negative
error codes.
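
The normalization idea can be illustrated with a short sketch. The actual rbd-wnbd code is C++; this Python version and the name `to_negative_errno` are assumptions made purely for illustration.

```python
import errno


def to_negative_errno(err: int) -> int:
    """Map a positive or negative error code to its negative form; 0 stays 0."""
    return -abs(err)


# A positive code (as an iterator might return) and an already-negative
# code both end up in the same negative convention.
print(to_negative_errno(errno.ENOENT), to_negative_errno(-errno.EINVAL))
```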

Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>
2023-03-24 09:00:21 +00:00
Lucian Petrut
3d8afc0021 common, rbd-wnbd: bump Windows log level
We're increasing the log level for certain Windows operational log
messages.

Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>
2023-03-24 08:59:49 +00:00
Lucian Petrut
0c25ca6564 rbd-wnbd: optionally handle wnbd adapter restart events
The WNBD adapter may be reset in certain situations (e.g. driver
upgrade, MS WHQL tests, etc).

We're going to monitor the WNBD adapter using WMI[1] events, restarting
the rbd-wnbd disk mappings whenever necessary. Adapter monitoring can be
enabled by passing the --adapter-monitoring-enabled flag to the service.

This feature is optional for the following reasons:

* it's mainly used during development / driver certification
* we had to use a relatively small polling interval, which might imply
  additional resource usage. WMI quotas also have to be considered.

While at it, we're updating two lambdas that are submitted to thread pools,
avoiding default reference capturing and explicitly specifying the variables
that get copied.

[1] https://learn.microsoft.com/en-us/windows/win32/wmisdk/wmi-start-page

Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>
2023-03-24 08:58:15 +00:00
Rishabh Dave
969a93d0dc qa/cephfs: add more helper methods to caps_helper.py
Add methods that will accept read/write permissions, CephFS names and
CephFS mount point and in return will generate string form of MON, OSD
and MDS caps exactly as it is reported in Ceph keyrings.
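
A rough sketch of what such helpers might look like is shown here. The function names and the exact cap string layouts are assumptions modeled on the style of `ceph fs authorize` output. They are not taken from caps_helper.py and should be checked against a real keyring.

```python
def mon_cap(fsname: str) -> str:
    """MON cap string (assumed format); read access only."""
    return f"allow r fsname={fsname}"


def osd_cap(perm: str, fsname: str) -> str:
    """OSD cap string (assumed format) for the data pools of one file system."""
    return f"allow {perm} tag cephfs data={fsname}"


def mds_cap(perm: str, fsname: str, cephfs_mntpt: str = "/") -> str:
    """MDS cap string (assumed format), restricted to a path when one is given."""
    cap = f"allow {perm} fsname={fsname}"
    if cephfs_mntpt and cephfs_mntpt != "/":
        cap += f" path={cephfs_mntpt}"
    return cap


print(mon_cap("a"))
print(osd_cap("rw", "a"))
print(mds_cap("rw", "a", "/dir1"))
```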

Replace similar code in test_multifs_auth.py with calls to these helper
methods.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2023-03-24 11:32:58 +05:30
Anthony D'Atri
b5d6ca7a8f
Merge pull request #50649 from Thingee/foundation-mem-update-20230323
doc/foundation: Update Foundation members
2023-03-23 20:11:04 -04:00
Mike Perez
73aa44aa44 doc/foundation: Update Foundation members
Removing EasyStack, Vexxhost and adding 42on

Signed-off-by: Mike Perez <thingee@gmail.com>
2023-03-23 15:52:12 -07:00
Conrad Hoffmann
402d2eacbc doc: account for PG autoscaling being the default
The current documentation tries really hard to convince people to set
both `osd_pool_default_pg_num` and `osd_pool_default_pgp_num` in their
configs, but at least the latter has undesirable side effects on any
Ceph version that has PG autoscaling enabled by default (at least quincy
and beyond).

Assume a cluster with defaults of `64` for `pg_num` and `pgp_num`.
Starting `radosgw` will fail as it tries to create various pools without
providing values for `pg_num` or `pgp_num`. This triggers the following
in `OSDMonitor::prepare_new_pool()`:

- `pg_num` is set to `1`, because autoscaling is enabled
- `pgp_num` is set to `osd pool default pgp_num`, which we set to `64`
- This is an invalid setup, so the pool creation fails

Likewise, `ceph osd pool create mypool` (without providing values for
`pg_num` or `pgp_num`) does not work.
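
The failure mode can be modeled with a small sketch. It is a simplified illustration, not the logic of `OSDMonitor::prepare_new_pool()`: the function name is hypothetical, an unset default is modeled as `None`, and the rule that `pgp_num` must not exceed `pg_num` is assumed as the reason the setup is invalid.

```python
from typing import Optional, Tuple


def resolve_pool_pg_counts(
    pg_num: Optional[int],
    pgp_num: Optional[int],
    autoscale: bool,
    default_pg_num: int,
    default_pgp_num: Optional[int],
) -> Tuple[int, int]:
    """Pick pg_num/pgp_num for a new pool, mimicking the steps listed above."""
    if pg_num is None:
        # With autoscaling enabled the pool starts at 1 PG.
        pg_num = 1 if autoscale else default_pg_num
    if pgp_num is None:
        # A configured default wins; otherwise follow pg_num.
        pgp_num = default_pgp_num if default_pgp_num is not None else pg_num
    if pgp_num > pg_num:
        # Assumed validity rule for this sketch.
        raise ValueError(f"pgp_num {pgp_num} exceeds pg_num {pg_num}")
    return pg_num, pgp_num


# No pgp_num default configured: pool creation succeeds.
print(resolve_pool_pg_counts(None, None, True, 64, None))

# Default pgp_num of 64 with autoscaling: pool creation fails.
try:
    resolve_pool_pg_counts(None, None, True, 64, 64)
except ValueError as exc:
    print("pool creation failed:", exc)
```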

Following this rationale:

- Not providing a default value for `pgp_num` will always do the right
  thing, unless you use advanced features, in which case you can be
  expected to set both values on pool creation
- Setting `osd_pool_default_pgp_num` in your config breaks pool creation
  for various cases

This commit:

- Removes `osd_pool_default_pgp_num` from all example configs
- Adds mentions of autoscaling and how it interacts with the default
  values in various places

For each file that was touched, the following maintenance was also
performed:

- Change internal spaces to underscores for config values
- Remove mentions of filestore or any of its settings
- Fix minor inconsistencies, like indentation etc.

There is also a ticket which I think is very relevant and fixed by this,
though it only captures part of the broader issue addressed here:

Fixes: https://tracker.ceph.com/issues/47176
Signed-off-by: Conrad Hoffmann <ch@bitfehler.net>
2023-03-23 22:15:25 +01:00
J. Eric Ivancich
f086510440
Merge pull request #50545 from ivancich/wip-fix-bi-restore-script-installation
rgw: install rgw scripts with common files rather than radosgw files

Reviewed-by: Casey Bodley <cbodley@redhat.com>
2023-03-23 15:29:19 -04:00
J. Eric Ivancich
4e9b8fa4bd
Merge pull request #50617 from ivancich/wip-add-unordered-list-restore-index
rgw: add unordered listing to reindex to force stats update

Reviewed-by: Casey Bodley <cbodley@redhat.com>
2023-03-23 15:17:34 -04:00