Since both the sentences in the note point aren't strictly related to
each other, it's better to split that note point into two.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
mgr/cephadm: retry mgr fail over in case of transient failure
Reviewed-by: Michael Fritch <mfritch@suse.com>
Reviewed-by: Redouane Kachach <rkachach@redhat.com>
The global snaprealm would be created and then destroyed immediately
every time when updating it.
Fixes: https://tracker.ceph.com/issues/54362
Signed-off-by: Xiubo Li <xiubli@redhat.com>
The type of 'num_fwd' in ceph 'MClientRequestForward' is 'int32_t',
while in 'ceph_mds_request_head' the type is '__u8'. So in case
the request bounces between MDSes exceeding 256 times, the client
will get stuck.
In this case it's ususally a bug in MDS and continue bouncing the
request makes no sense.
Fixes: https://tracker.ceph.com/issues/55129
Signed-off-by: Xiubo Li <xiubli@redhat.com>
os/bluestore: Always update the cursor position in AVL near-fit search.
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Igor Fedotov <ifedotov@suse.com>
- Avoid crashing when a call out to an external program expectedly does
not return exit status zero.
There are programs that communicate other information than error/no
error through exit status. E.g. `systemctl status` will return different
exit codes depending on the actual status of the units in question.
In cases where this is expected crashing with a RuntimeError exception
is inappropriate and should be avoided.
Fixes: https://tracker.ceph.com/issues/55117
Signed-off-by: Moritz Röhrich <moritz.rohrich@suse.com>
Commit 403f1ec288 ("cmake: make "WITH_CEPH_DEBUG_MUTEX" depend on
CMAKE_BUILD_TYPE") made WITH_CEPH_DEBUG_MUTEX depend on build type
being set to Debug, in CMakeLists.txt. However, if CMAKE_BUILD_TYPE
isn't specified by the user, we may still set it to Debug later, in
src/CMakeLists.txt, and in that case WITH_CEPH_DEBUG_MUTEX doesn't
get enabled. The result is that
$ do_cmake.sh -DCMAKE_BUILD_TYPE=Debug ...
debug builds have mutex debugging enabled, while
$ do_cmake.sh ...
builds, which are supposed to be the same, don't. Jenkins builders
don't pass -DCMAKE_BUILD_TYPE=Debug so that commit effectively turned
off all ceph_mutex_is_locked* asserts in "make check".
Fixes: https://tracker.ceph.com/issues/55318
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
add_event_after() expects an externally provided mutex to be held
for the call. This was missed in commit 8965a0f2a6 ("rbd-mirror:
synchronize with in-flight stop in ImageReplayer::stop()").
Fixes: https://tracker.ceph.com/issues/55317
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Following sequence of operation lead to deadlock
1. Created subvolume
2. Written some I/O on the subvolume
3. Create snapshot of the subvolume
4. Create clone of the snapshot
5. Delete snapshot from back end (don't use subvolume interface) before
clone completes
6. Delete clone with force
7. Delete subvolume
8. Delete fs and associated pools
9. Created new fs
10 Created new subvolume,
11. Written some I/O on the subvolume
12. Create snapshot of the subvolume
13. Create clone of the snapshot <---------------THIS OPERATION HANGS -----------------
Root Cause:
Since the snapshot is deleted from the back end, the clone fails. But it
also fails to remove the clone index at '/volumes/_index/clone'. The
cloner thread goes to infinite loop of starting the clone and failing.
This involves taking 'self.async_job.lock()' and reads the clone index
to get the job and registers the above job.
While the 'cloner thread' is in above loop, the fs is destroyed. The
cloner threads which lives till the mgr/volumes is enabled in mgr, takes
the 'self.async_job.lock()' and hangs while reading the clone index.
Any further clone operations which also requires above lock hangs.
Fix:
Remove the clone index even though snapshot is not present.
Fixes: https://tracker.ceph.com/issues/55217
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Always return std::nullopt rather than an empty buffer -- this way users
can rely on this as an invariant.
Signed-off-by: Samuel Just <sjust@redhat.com>