This reverts commit 1a128a8d8c.
With commit fcbf7367d2 ("rbd-nbd: map using netlink interface by
default") backported to reef, this reef-only fixup limited to fsx is no
longer needed.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Mapping rbd images to nbd devices using ioctl interface is not
robust. It was discovered that the device size or the md5 checksum
of the nbd device was incorrect immediately after mapping using
ioctl method. When using the nbd netlink interface to map RBD images
the issue was not encountered. Switch to using nbd netlink interface
for mapping.
Fixes: https://tracker.ceph.com/issues/64063
Signed-off-by: Ramana Raja <rraja@redhat.com>
(cherry picked from commit fcbf7367d2)
Conflicts:
PendingReleaseNotes [ moved to >=18.2.5 section ]
Random data is written and write zeroes is invoked on 0~256, but the
read is done on 256~256. This means that if write zeroes malfunctions
the test wouldn't catch it (especially in the thick provision case).
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit d41f0fa01f)
Allow a diff to start from a non-user snapshot. This would be used by
"rbd du" command to account for non-user snapshots which are currently
just skipped potentially resulting in underreported space usage and in
other places.
Fixes: https://tracker.ceph.com/issues/65720
Co-authored-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Vinay Bhaskar Varada <vvarada@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 54f47cc28f)
Conflicts:
src/include/rbd/librbd.h [ commit e5ccce14c4 ("rbd: add group
snap info command") not in reef ]
src/test/pybind/test_rbd.py [ commit d7fd66ec99 ("librbd: add
rbd_clone4() API to take parent snapshot by ID") not in reef ]
Add a new markdown file in the root of the tree, ContainerBuild.md, that
can serve as a basic introduction to the new container build tools
recently merged to ceph.
Add a small 'breadcrumb' section to the project README.md to help find
this new document.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit 313546146c)
The source dir (aka homedir, default /ceph) is mounted in the container
read-write. This is needed as the various ceph build scripts expect to
write things into the tree - often this is in the build directory - but
not always. This can lead to small messes and/or situations that are
confusing to debug, especially if one is jumping between distros often.
Add an option to use an overlay volume for the homedir - by default we
enable a persistent overlay with a supplied "upper dir" where files that
were written will appear. One can also enable a temporary overlay that
forgets the writes when the container exits - maybe useful when doing
experiments in 'interactive' mode.
To use this option run the command with the `--overlay=<dir>` option.
For example: `./src/script/build-with-container.py -b build.inner
--overlay-dir build.ovr`. This will create a directory
`build.ovr/content` automatically and all new files will appear there.
For example the build directory will appear at
`build.ovr/content/build.inner`.
To use the temporary overlay use a `-` as the directory name. For
example: `./src/script/build-with-container.py -b build.inner
--overlay-dir -`
Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit 794e3d0b25)
When using docker the --volume option is not available during build
(docker [buildx] build), unlike podman. Since passing these volumes must
be conditional on them being set up I see no way to handle this short of
just disabling the option on docker. Log the fact that it's being
skipped - the only other issue is that we pointlessly set up some dirs
and the build may be a bit slower.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit 4208a73665)
On the original github pr #59841 user fayak kindly informed us that the
--volume option was not supported by docker build. Since this section
was a leftover from a previous way of constructing the builder image and
was no longer needed we simply removed it.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit 612a9d6808)
Construct the builder image using the --pull=always flag to initiate a
pull of the base image (centos, ubuntu, etc) in order to avoid using a
stale base image. Since the script automatically (by default) avoids
building if a matching tag is in local container storage it is handy to
use a fresh base when it *is* time to build something. Otherwise, you
end up in a situation like I sometimes do - using a months old base
unintentionally.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit f6e6188e30)
Add a `packages` target to build-with-container.py that requests a build
of packages, whatever package type is native to the distro selected.
For example `./src/script/build-with-container.py -d ubuntu22.04 -e
packages` will automatically select a deb packages build where
`./src/script/build-with-container.py -d centos9 -e packages` will
trigger rpm packages to be built. The underlying package-type specific
targets remain unchanged.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit 37b7d509c5)
Previously, one could use the `--tag` option to completely override the
container tag generated by the script. However, there are cases where
one may want to add information to the tag rather than override it.
Allow the tag value to start with a plus (+) character that indicates
that the remainder of the string is to be suffixed to the generated tag.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit 30836c4ed4)
Add a command line option --base-branch that allows the user to supply a
custom base branch name. git doesn't make determining this easy so we
always assume a base branch of 'main' by default - but this option lets
one change that.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit ff34bf7241)
Previously, we were passing build argument of CEPH_BRANCH, but that was
a bit misleading as we expect the current branch to vary a bit (as users
will be using branches to develop and test the code). What we actually
care about is the base branch ('main', 'squid', etc) as that is fed into
our bootstrap script and we want the option to simple variations based
on the name of said base branch.
Rename CEPH_BRANCH to CEPH_BASE_BRANCH for clarity.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit a1d49d557c)
Add a new --current-branch argument that lets the user supply a name for
the current branch. This allows the automatic tag generation to avoid
calling git - something useful if the tree is not using a git checkout
(like a tarball). It also allows you to pull a temporary branch in git
but ignore it and act like the temporary branch is the base branch.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit c1713c5bc3)
Add a system to define distro name aliases and use that to define some
additional aliases, primarily to match ubuntu codenames rather than
version numbers. Requested by Zack.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit 65f055f0d8)
After the last set of fixes and enhancements I forgot to reformat the
file. This applies standard `black` formatting to the file.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit de855aec1c)
With Mirror::image_disable() taking image_lock for write and calling
list_children() under it, the following deadlock is possible:
1. Mirror::image_disable() takes image_lock for write and calls
list_children()
2. AbstractWriteLog::periodic_stats() timer fires (it runs every
5 seconds) and ImageCacheState::write_image_cache_state() is called
under a global timer_lock
3. ImageCacheState::write_image_cache_state() successfully takes
owner_lock and blocks attempting to take image_lock for read because
it's already held for write by Mirror::image_disable()
4. list_children() blocks inside of a call to ImageState::close() on
a descendant image
5. The descendant image close can't proceed because TokenBucketThrottle
requires a global timer_lock to complete QosImageDispatch shutdown
6. safe_timer thread which is holding timer_lock can't proceed because
ImageCacheState::write_image_cache_state() is effectively blocked on
the descendant image close through Mirror::image_disable()
Until commit 281a64acf9 ("librbd: remove snapshot mirror image-meta
when disabling"), Mirror::image_disable() was taking image_lock only for
read meaning that this deadlock wasn't possible. The only other change
that commit 281a64acf9 made to the code block protected by image_lock
was using child_mirror_image_internal for cls_client::mirror_image_get()
call on descendant images instead of mirror_image_internal to preserve
the value of mirror_image_internal for later. Both are local variables
that have nothing to do with image_lock, so I'm going back and making
Mirror::image_disable() take image_lock only for read again.
Fixes: https://tracker.ceph.com/issues/70190
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit ff9aa20bc3)
Optionally, for those that want to run build.sh locally and
use the images. The default is to remove, for Jenkins builders,
which will build, push, and rmi.
Fixes: https://tracker.ceph.com/issues/70196
Signed-off-by: Dan Mick <dan.mick@redhat.com>
(cherry picked from commit 642e5f2da0)
Add a reproducer for the crash on a bad variant access which was fixed
in commit 7d75161051 ("librbd: fix a crash in get_rollback_snap_id").
The reproducer deliberately works around many other issues with force
promote in snapshot-based mirroring: stopping rbd-mirror daemon
shouldn't be necessary (let alone with SIGKILL), get_rollback_snap_id()
and its caller can_create_primary_snapshot() are flawed and can pick
the wrong snapshot to roll back to or skip rollback when it's actually
required, the user snapshot in this scenario should be removed as part
of force promoting because it's incomplete and won't be usable after
the image is promoted, etc.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 0f4a37dd9f)
Conflicts:
qa/workunits/rbd/rbd_mirror_journal.sh [ commits 3fd8a03887
("qa/workunits/rbd: merge journal and snapshot test scripts")
and 3fdbc160bb ("rbd-mirror: allow mirroring to a different
namespace") not in reef ]
qa/workunits/rbd/rbd_mirror_snapshot.sh [ duplicated/cloned for
snapshot-based mirroring ]
Improve the guidance around setting pg_num, and clear up confusion
around whether pgp_num should be set manually or, indeed, if it even can
be set manually.
This PR was raised in response to Mark Schouten's email here: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/CBDJTLTTIEZVG7GVZBX37UAWGYNSSMPD/
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit c43e733721)
while authenticating AssumeRoleWithWebIdentity using JWT obtained
from an external IDP.
fixes: https://tracker.ceph.com/issues/68836
Signed-off-by: Pritha Srivastava <prsrivas@redhat.com>
(cherry picked from commit 919da36966)
get_rollback_snap_id() did not check if the snapshot it was
accessing was a mirror snapshot, causing it to crash if it wasn't.
Fixes: https://tracker.ceph.com/issues/70075
Signed-off-by: N Balachandran <nithya.balachandran@ibm.com>
(cherry picked from commit 7d75161051)