Commit Graph

140860 Commits

Author SHA1 Message Date
Xuehan Xu
4ff02f53fe crimson/os/seastore/onode_manager: avoid unnecessary delta related
overhead

Signed-off-by: Xuehan Xu <xuxuehan@qianxin.com>
2023-10-10 12:08:55 +08:00
Venky Shankar
a8e3a32d6c Merge PR #53885 into main
* refs/pull/53885/head:
	Revert "mds: disable delegating inode ranges to clients"

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2023-10-10 09:35:37 +05:30
Yuri Weinstein
a9359c5c56
Merge pull request #53517 from cbodley/wip-qa-distros-s
qa/distros: remove centos/rhel8 and ubuntu20.04 from supported distros

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
Reviewed-by: Ali Maredia <amaredia@redhat.com>
2023-10-09 14:04:54 -07:00
Laura Flores
9c5755a4ac qa/suites/upgrade: fix env indentation in stress-split upgrade tests
This is an issue with the stress-split yaml files, as introduced in https://github.com/ceph/ceph/pull/51889.

The stress-split tests have an incorrectly indented "env" section, which teuthology detects as an entry for "clients".

Fixes: https://tracker.ceph.com/issues/63158
Signed-off-by: Laura Flores <lflores@ibm.com>
2023-10-09 20:27:25 +00:00
Ilya Dryomov
425704acdf
Merge pull request #53829 from ajarr/wip-63009
librbd: kick ExclusiveLock state machine stalled waiting for lock from reacquire_lock()

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
2023-10-09 21:38:21 +02:00
Rishabh Dave
f9626b5969
Merge pull request #53722 from rishabh-d-dave/mon-authmon
mon/AuthMonitor: clean up AuthMonitor

Reviewed-by: Leonid Usov <leonid.usov@ibm.com>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Reviewed-by: Laura Flores <lflores@redhat.com>
2023-10-09 22:53:42 +05:30
Rishabh Dave
e7200f584e
Merge pull request #53721 from rishabh-d-dave/mon-mdsmon
mon/MDSMonitor: clean up MDSMonitor

Reviewed-by: Leonid Usov <leonid.usov@ibm.com>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Reviewed-by: Laura Flores <lflores@redhat.com>
2023-10-09 22:52:44 +05:30
Rishabh Dave
1f047664e1
Merge pull request #53405 from rishabh-d-dave/ceph-auth-caps-val-caps
mon/AuthMonitor: make "ceph auth caps" print error messages

Reviewed-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Laura Flores <lflores@redhat.com>
2023-10-09 22:49:28 +05:30
Patrick Donnelly
d84042aa29
Merge PR #53099 into main
* refs/pull/53099/head:
	script: update ceph-debug-docker for centos 9.stream

Reviewed-by: Laura Flores <lflores@redhat.com>
2023-10-09 13:03:37 -04:00
Venky Shankar
c9d67526b2 Revert "mds: disable delegating inode ranges to clients"
This isn't necessary -- the MDS handles delegating inode ranges
to clients from its preallocated inode set properly - the suspected
bug involving not persisting the sessionmap and causing asserts
during replay isn't an issue. The preallocated set is persisted
with the log event and the MDS correctly rebuilds the set from
this information during replay.

Signed-off-by: Venky Shankar <vshankar@redhat.com>
2023-10-09 22:22:21 +05:30
Patrick Donnelly
e90f0e9e00
Merge PR #53206 into main
* refs/pull/53206/head:
	mds: use LogSegment dump for debugging

Reviewed-by: Leonid Usov <leonid.usov@ibm.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
2023-10-09 11:53:31 -04:00
Yuval Lifshitz
d05fa0bea9
Merge pull request #52254 from vedanshbhartia/coverity_uninit
rgw: Add coverity uninitialized variable and initialize RGWBucketEntryMetadataObject

reviewed-by: yuvalif
2023-10-09 17:58:42 +03:00
Yuval Lifshitz
94d69a11ab
Merge pull request #52328 from vedanshbhartia/coverity_1512267
rgw: fix potential null dereference in rgw_iam_policy.c: ParseState::do_string

reviewed-by: yuvalif
2023-10-09 17:57:47 +03:00
Yuval Lifshitz
4bf3b6b3ca
Merge pull request #52472 from vedanshbhartia/coverity_1510724
rgw: Remove unnecessary null check from valid_s3_bucket_name

reviewed-by: soumyakoduri, yuvalif
2023-10-09 17:56:40 +03:00
Yuval Lifshitz
47c77bcf8a
Merge pull request #52734 from vedanshbhartia/coverity_ostream
rgw: Restore ostream format state after changing it

reviewed-by: yuvalif
2023-10-09 17:54:34 +03:00
Yuval Lifshitz
7b774c4e51
Merge pull request #52326 from yuvalif/wip-yuval-lua-reload
rgw/lua: support reloading lua packages on all RGWs

reviewed-by: dang, cbodle, anthonyeleven
2023-10-09 17:53:15 +03:00
zdover23
7d85410aa1
Merge pull request #53890 from zdover23/wip-doc-2023-10-09-troubleshooting-troubleshooting-mon-4-of-x
doc/rados: edit troubleshooting-mon.rst (4 of x)

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
2023-10-09 23:53:05 +10:00
Rishabh Dave
b3811c90ee
Merge pull request #53892 from rishabh-d-dave/fix-test_cephfs.py
src/test/pybind: don't use decorator "with_setup"

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2023-10-09 19:17:07 +05:30
Zac Dover
99e92fb94b doc/rados: edit troubleshooting-mon.rst (4 of x)
Edit doc/rados/troubleshooting/troubleshooting-mon.rst.

Follows https://github.com/ceph/ceph/pull/53875

Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
2023-10-09 23:10:26 +10:00
Daniel Gryniewicz
7c204e7da9
Merge pull request #53884 from leonid-s-usov/rgw-posix-test
test/rgw: don't compile POSIX test unless enabled

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
2023-10-09 08:57:13 -04:00
Venky Shankar
1568321c17 Merge PR #53873 into main
* refs/pull/53873/head:
	qa: typo fix when checking for perf counter - s/md_thresh_evicted/mdthresh_evicted
	qa: lower mds_session_metadata_threshold for tests

Reviewed-by: Milind Changire <mchangir@redhat.com>
Reviewed-by: Rishabh Dave <ridave@redhat.com>
Reviewed-by: Leonid Usov <leonid.usov@ibm.com>
2023-10-09 17:49:30 +05:30
Rishabh Dave
bda6d195af src/test/pybind: don't use decorator "with_setup"
Signed-off-by: Rishabh Dave <ridave@redhat.com>
2023-10-09 16:58:16 +05:30
Venky Shankar
5856a1e6b7 qa: typo fix when checking for perf counter - s/md_thresh_evicted/mdthresh_evicted
Signed-off-by: Venky Shankar <vshankar@redhat.com>
2023-10-09 14:59:18 +05:30
Venky Shankar
92200d9d10 qa: lower mds_session_metadata_threshold for tests
... and increase the number of files that are created so as to
hit the threshold with a high probability.

Fixes: http://tracker.ceph.com/issues/62873
Signed-off-by: Venky Shankar <vshankar@redhat.com>
2023-10-09 14:59:18 +05:30
Venky Shankar
5be9213738 doc/cephfs-shell: drop installing packaging module
Signed-off-by: Venky Shankar <vshankar@redhat.com>
2023-10-09 12:19:19 +05:30
Venky Shankar
7f7858748d cephfs-shell: use pkg_resources rather than packaging module
`pkg_resources` is already being used by other py scripts.

Fixes: https://tracker.ceph.com/issues/62739
Signed-off-by: Venky Shankar <vshankar@redhat.com>
2023-10-09 12:19:19 +05:30
Nizamudeen A
6ed5a2884f
Merge pull request #53817 from cloudbehl/active-alert-filter
mgr/dashboard: Filter active alerts

Reviewed-by: Nizamudeen A <nia@redhat.com>
2023-10-09 10:41:21 +05:30
Leonid Usov
b18174d96a test/rgw: don't compile POSIX test unless enabled
Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
2023-10-08 17:33:20 +03:00
Ramana Raja
18b018578c librbd/ManagedLock: kickstart ExclusiveLock state machine
... that is stalled waiting for lock. Do this when trying to reacquire
lock in the ImageWatcher's rewatch mechanism. This would enable the
ExclusiveLock state machine to propagate the blocklist error to the
caller trying to perform an image operation requiring an exclusive
lock.

Previous attempt, e66db763, to fix the hang due to exclusive lock
acquisition (stuck waiting for lock) racing with client blocklisting
did not always work. e66db763 kickstarted the ExclusiveLock state
machine when the ImageWatcher tried to schedule an exclusive lock
request and the blocklisting was detected. However, there is a short
window between a watch getting deregistered and client blocklisting
getting detected as part of rewatching. If hit when trying to schedule
a lock request, the ExclusiveLock state machine wasn't kickstarted,
blocklist error wasn't propagated, and the hang resurfaced.

A more robust approach is taken to resume the ExclusiveLock state
machine stuck waiting for lock during client blocklisting. Whenever
a client's ImageWatcher loses connection to the cluster, as it happens
during blocklisting, the ImageWatcher initiates a mechanism to rewatch
the image and tries to reacquire the lock. Piggyback on this rewatch
mechanism that gets triggered during client blocklisting. And when
trying to reacquire the lock, kickstart the ExclusiveLock state
machine stalled waiting for lock (STATE_WAITING_FOR_LOCK).

Fixes: https://tracker.ceph.com/issues/63009
Signed-off-by: Ramana Raja <rraja@redhat.com>
2023-10-08 06:16:48 -04:00
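
The kick described in the commit above can be pictured with a toy state machine. This is only a sketch under loud assumptions: `ToyLock`, its methods, and the `BLOCKLISTED` placeholder value are hypothetical names for illustration and are not librbd's ManagedLock or ExclusiveLock code. It shows the one idea from the message: when the rewatch path tries to reacquire the lock and finds a request parked in a waiting state, it resumes that request so the error reaches the caller.

```python
from enum import Enum, auto

# Placeholder error value for "client is blocklisted"; an assumption,
# not taken from the commit.
BLOCKLISTED = -108


class State(Enum):
    UNLOCKED = auto()
    WAITING_FOR_LOCK = auto()
    LOCKED = auto()


class ToyLock:
    """Toy model of a lock state machine; not librbd's implementation."""

    def __init__(self):
        self.state = State.UNLOCKED
        self.waiters = []  # callbacks for callers blocked on the lock

    def request_lock(self, on_finish):
        # Assume the lock is held elsewhere, so the request parks here.
        self.waiters.append(on_finish)
        self.state = State.WAITING_FOR_LOCK

    def reacquire_lock(self, rewatch_result):
        # Runs on the rewatch path. If a request is parked, kick it so the
        # rewatch outcome reaches the callers instead of leaving them hung.
        if self.state is not State.WAITING_FOR_LOCK:
            return
        waiters, self.waiters = self.waiters, []
        self.state = State.UNLOCKED
        for on_finish in waiters:
            on_finish(rewatch_result)


def demo():
    results = []
    lock = ToyLock()
    lock.request_lock(results.append)
    lock.reacquire_lock(BLOCKLISTED)
    return results, lock.state
```

In this toy, a caller that would otherwise wait forever receives the placeholder error as soon as `reacquire_lock()` runs.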
Xuehan Xu
544985f089 crimson/os/seastore/onode_manager: drop write_dirty
Signed-off-by: Xuehan Xu <xuxuehan@qianxin.com>
2023-10-08 17:09:51 +08:00
Xuehan Xu
bfd1236597 crimson/os/seastore/onode_manager: populate delta recorders for each
onode modification

Signed-off-by: Xuehan Xu <xuxuehan@qianxin.com>
2023-10-08 17:09:51 +08:00
zdover23
03e43c5bb0
Merge pull request #53874 from zdover23/wip-doc-2023-10-07-rados-troubleshooting-community
doc/rados: edit troubleshooting/community.rst

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
2023-10-08 16:10:20 +11:00
zdover23
9c383d15ad
Merge pull request #53875 from zdover23/wip-doc-2023-10-07-troubleshooting-troubleshooting-mon-3-of-x
doc/rados: edit troubleshooting-mon.rst (3 of x)

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
2023-10-08 15:50:45 +11:00
Zac Dover
fabfec2734 doc/rados: edit troubleshooting/community.rst
Edit doc/rados/troubleshooting/community.rst.

Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
2023-10-08 14:43:58 +10:00
Zac Dover
fc45a0c4dd doc/rados: edit troubleshooting-mon.rst (3 of x)
Edit doc/rados/troubleshooting/troubleshooting-mon.rst.

Follows https://github.com/ceph/ceph/pull/52827

Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
2023-10-08 14:35:33 +10:00
Anthony D'Atri
25775d5442
Merge pull request #53876 from zdover23/wip-doc-2023-10-08-architecture-rbd-sentence-repair
doc/architecture: repair RBD sentence
2023-10-07 20:45:38 -04:00
Zac Dover
5abd530460 doc/architecture: repair RBD sentence
Improve an ambiguous sentence in doc/architecture.rst.

The problem presented by the original sentence is that the phrasal verb
"to provide with" is implicated in one of its possible readings.
Interpreted in that way, the sentence seems to express the incorrect
idea that RBD furnishes block devices with snapshotting and cloning, as
though snapshotting and cloning are being delivered to the block
devices. In fact, snapshotting and cloning are just features of RBD, and
are features that are described on this page:
https://docs.ceph.com/en/quincy/rbd/rbd-snapshot/.

Signed-off-by: Zac Dover <zac.dover@proton.me>
2023-10-08 07:47:00 +10:00
zdover23
b05d167b48
Merge pull request #53790 from zdover23/wip-doc-2023-10-03-architecture-17-of-x
doc/architecture: edit "Peering and Sets"
2023-10-07 15:51:27 +11:00
Zac Dover
c69b111966 doc/architecture: edit "Peering and Sets"
Edit the English in the section "Peering and Sets" in the file
doc/architecture.rst.

Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
2023-10-07 14:24:30 +10:00
Patrick Donnelly
617f7153d7
Merge PR #53855 into main
* refs/pull/53855/head:
	script: add option for debug build

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
2023-10-06 20:15:23 -04:00
Patrick Donnelly
ec720e94e9
script: add option for debug build
See: https://github.com/ceph/ceph-build/pull/2167

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2023-10-06 12:06:01 -04:00
Adam King
43d822dd5a mgr/cephadm: fix upgrades with nvmeof
Previously, nvmeof was treated as if it used
a ceph image during upgrades. This would cause logging
of messages like (I've removed the nvmeof daemon id)

log [WRN] :     Upgrade daemon: nvmeof.<id>: Cannot redeploy
nvmeof.<id> with a new image: Supported types are: mgr, mon,
crash, osd, mds, rgw, rbd-mirror, cephfs-mirror, ceph-exporter,
iscsi, nfs

and if you had set a custom image for the
mgr/cephadm/container_image_nvmeof setting, this would
be undone as part of the upgrade process.

Fixes: https://tracker.ceph.com/issues/63127

Signed-off-by: Adam King <adking@redhat.com>
2023-10-06 11:30:21 -04:00
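
As a rough illustration of the distinction the commit above draws, the sketch below separates daemon types that take the Ceph image from those that keep their own per-type image setting. The type list is copied from the warning quoted in the message, and the setting name follows the `mgr/cephadm/container_image_nvmeof` pattern it mentions. The helper `image_for_upgrade` and the plain-dict `config` are hypothetical and are not cephadm's actual code.

```python
# Daemon types the quoted warning lists as redeployable with a new Ceph image.
CEPH_IMAGE_TYPES = {
    "mgr", "mon", "crash", "osd", "mds", "rgw", "rbd-mirror",
    "cephfs-mirror", "ceph-exporter", "iscsi", "nfs",
}


def image_for_upgrade(daemon_type, target_image, config):
    """Hypothetical helper: choose the image a daemon is redeployed with."""
    if daemon_type in CEPH_IMAGE_TYPES:
        return target_image
    # Other daemons keep their own per-type setting, if one is configured,
    # so an upgrade does not overwrite a custom image.
    key = f"mgr/cephadm/container_image_{daemon_type}"
    return config.get(key)
```

Under these assumptions, an `osd` daemon would move to the upgrade target image, while `nvmeof` would keep whatever custom image is stored under its own setting.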
Yuri Weinstein
080768f77c
Merge pull request #53417 from jrchyang/fix_mclock_scheduling_slow_main
osd: fix: slow scheduling when item_cost is large

Reviewed-by: Sridhar Seshasayee <sseshasa@redhat.com>
2023-10-06 06:58:11 -07:00
Yuval Lifshitz
9a0a855fb0
Merge pull request #52430 from vedanshbhartia/coverity_datarace
rgw: Add coverity annotations for missing mutex locks

reviewed-by: yuvalif, mkogan1
2023-10-06 13:07:31 +03:00
Yuval Lifshitz
7a11f1d574 rgw/lua/doc: support reloading lua packages on all RGWs
without requiring a restart of the RGWs
test instructions:
https://gist.github.com/yuvalif/95b8ed9ea73ab4591c59644a050e01e2
also use capitalized "Lua" in logs/doc

Signed-off-by: Yuval Lifshitz <ylifshit@ibm.com>
2023-10-06 12:54:56 +03:00
Rishabh Dave
a669cd6422 mon/AuthMonitor: check if entity is absent before creating it
Although this code path is not used for creating entities yet, it is
better to fix the bug sooner rather than later. Method
AuthMonitor::_update_or_create_entity() must exit (with an appropriate
error code) when the entity to be created on the Ceph cluster is already
present.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2023-10-06 14:22:14 +05:30
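
The check described in the commit above amounts to a guard before creation. The following is a minimal sketch, not the AuthMonitor code: `create_entity`, the dict-based store, and the choice of `EEXIST` as the "appropriate error code" are all assumptions made for illustration.

```python
import errno


def create_entity(entities, name, caps):
    """Hypothetical create path: refuse to overwrite an existing entity."""
    if name in entities:
        # Assumed error code; the commit only says "appropriate error code".
        return -errno.EEXIST
    entities[name] = dict(caps)
    return 0
```

A second call with the same name returns the error and leaves the stored caps unchanged.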
Rishabh Dave
4228df3f35 mds/MDSAuthCaps: re-word an error message for better clarity
Signed-off-by: Rishabh Dave <ridave@redhat.com>
2023-10-06 14:22:14 +05:30
zdover23
be8824907d
Merge pull request #53834 from dparmar18/remove-egg-fragment-from-doc
doc: remove egg fragment from dev/developer_guide/running-tests-locally

Reviewed-by: John Mulligan <jmulligan@redhat.com>
Reviewed-by: Rishabh Dave <ridave@redhat.com>
Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
Reviewed-by: Zac Dover <zac.dover@proton.me>
2023-10-06 10:21:14 +11:00
John Mulligan
ffe1f2f8f1 cephadm: update test to avoid using exception handling as an assertion
The use of an exception as an assertion mostly works but has the side
effect of hiding other errors. Hiding these errors can make it hard to
debug problems in this code path, as it did for me recently. Update the
test to use a standard assertion and to assert that the function
containing the assertion was actually called.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
2023-10-05 17:05:33 -04:00
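
The testing pattern described in the commit above might look like the sketch below. It uses only `unittest.mock` from the standard library. The function under test (`deploy`), the checker (`check_args`), and the command contents are hypothetical stand-ins, not the cephadm test itself.

```python
from unittest import mock


def deploy(daemon_type, runner):
    """Hypothetical code under test: hands a command line to runner()."""
    runner(["podman", "run", daemon_type])


def check_args(cmd):
    # Standard assertions on what the code under test passed in.
    assert cmd[0] == "podman"
    assert cmd[-1] == "prometheus"


def run_test():
    runner = mock.Mock(side_effect=check_args)
    deploy("prometheus", runner)
    # Without this, a code path that never reaches runner() would pass
    # silently, because check_args() would never run.
    runner.assert_called_once()
    return runner.call_count
```

Because the checks are plain assertions instead of a raised sentinel exception, an unrelated error inside `deploy()` surfaces with its own traceback instead of being swallowed by a broad `except`.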
John Mulligan
9015edc3f3 cephadm: convert monitoring type to a ContainerDaemonForm
Signed-off-by: John Mulligan <jmulligan@redhat.com>
2023-10-05 17:05:33 -04:00