If load_schedules() (i.e. periodic refresh) races with add_schedule()
invoked by the user for a fresh image, that image's schedule may get
lost until the next rebuild (not refresh!) of the queue:
1. periodic refresh invokes load_schedules()
2. load_schedules() creates a new Schedules instance and loads
schedules from rbd_mirror_snapshot_schedule object
3. add_schedule() is invoked for a new image (an image that isn't
present in self.images) by the user
4. before load_schedules() can grab self.lock, add_schedule() commits
the new schedule to rbd_mirror_snapshot_schedule object and adds it
to self.schedules
5. load_schedules() grabs self.lock and reassigns self.schedules with
a Schedules instance that is now stale
6. periodic refresh invokes load_pool_images() which discovers the new
image; eventually it is added to self.images
7. periodic refresh invokes refresh_queue() which attempts to enqueue()
the new image; this fails because a matching schedule isn't present
The next periodic refresh recovers the discarded schedule from
rbd_mirror_snapshot_schedule object but no attempt to enqueue() that
image is made since it is already "known" at that point. Despite the
schedule being in place, no snapshots are created until the queue is
rebuilt from scratch or the rbd_support module is reloaded.
To fix that, extend self.lock critical sections so that add_schedule()
and remove_schedule() can't get stepped on by load_schedules().
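For illustration, a minimal sketch of the widened critical sections; this is
a simplified stand-in, not the actual rbd_support handler code, and the class
name, helpers and dict-backed store below are assumptions:
```
import threading

class ScheduleHandler:
    """Sketch only: a simplified stand-in for the rbd_support handler."""

    def __init__(self):
        self.lock = threading.Lock()
        self.schedule_object = {}   # stands in for rbd_mirror_snapshot_schedule
        self.schedules = {}
        self.images = set()

    def load_schedules(self):
        # The whole read-and-swap now happens under self.lock, so a concurrent
        # add_schedule()/remove_schedule() can no longer commit in between and
        # then be overwritten by a stale Schedules instance.
        with self.lock:
            self.schedules = dict(self.schedule_object)

    def add_schedule(self, image_id, schedule):
        # Commit to the schedule object and update the in-memory state in a
        # single critical section.
        with self.lock:
            self.schedule_object[image_id] = schedule
            self.schedules[image_id] = schedule

    def remove_schedule(self, image_id):
        with self.lock:
            self.schedule_object.pop(image_id, None)
            self.schedules.pop(image_id, None)
```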
Fixes: https://tracker.ceph.com/issues/56090
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
The existing logic often leads to refresh_pools() and refresh_images()
being invoked after a 120-second delay instead of after the intended
60-second delay.
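For illustration only, a sketch of how the refresh can slip to roughly twice
the intended interval; it assumes the refresh is gated on more than a full
interval having elapsed (truncated to whole seconds) while the main loop also
waits a full interval between checks -- the structure is an assumption, not
the actual module code:
```
import time

REFRESH_INTERVAL = 60  # seconds

class Refresher:
    def __init__(self):
        self.last_refresh = time.monotonic()

    def maybe_refresh(self):
        elapsed = int(time.monotonic() - self.last_refresh)  # truncates
        # Right after a full-interval wait this is typically 59 or 60, so the
        # check fails and the refresh slips one more iteration, i.e. to about
        # 120 seconds after the previous refresh.
        if elapsed <= REFRESH_INTERVAL:
            return
        self.last_refresh = time.monotonic()
        print("refreshed after", elapsed, "seconds")

    def run(self):
        while True:
            self.maybe_refresh()
            time.sleep(REFRESH_INTERVAL)
```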
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
The deadlock is best illustrated by the following snippet provided by
jianwei zhang, who also did the problem analysis (many thanks!).
```
thread-35
AsyncMessenger::shutdown_connections hold AsyncMessenger::lock std::lock_guard l{lock}
AsyncConnection::stop wait AsyncConnection::lock lock.lock()
thread-3
ProtocolV2::handle_existing_connection hold AsyncConnection::lock std::lock_guard<std::mutex> l(existing->lock)
AsyncMessenger::accept_conn wait AsyncMessenger::lock std::lock_guard l{lock}
```
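For readers who want to reproduce the shape of the problem, here is a minimal
Python illustration of the same ABBA lock inversion (the real code is C++;
the sleeps only make the interleaving deterministic):
```
import threading
import time

messenger_lock = threading.Lock()    # plays AsyncMessenger::lock
connection_lock = threading.Lock()   # plays AsyncConnection::lock

def shutdown_connections():          # thread-35 in the trace above
    with messenger_lock:
        time.sleep(0.2)
        with connection_lock:        # AsyncConnection::stop()
            pass

def handle_existing_connection():    # thread-3 in the trace above
    with connection_lock:
        time.sleep(0.2)
        with messenger_lock:         # AsyncMessenger::accept_conn()
            pass

t1 = threading.Thread(target=shutdown_connections, daemon=True)
t2 = threading.Thread(target=handle_existing_connection, daemon=True)
t1.start(); t2.start()
t1.join(timeout=2); t2.join(timeout=2)
print("deadlocked:", t1.is_alive() and t2.is_alive())   # prints True
```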
Fixes: https://tracker.ceph.com/issues/55355
Signed-off-by: Radosław Zarzyński <rzarzyns@redhat.com>
Fixes a scenario where the scrub machine gets stuck when a deep scrub
is started while the noscrub flag is set. The machine was dropping a
scrub-reschedule op without clearing the scrub state, leaving the FSM
stuck in the ActiveScrubbing,PendingTimer state.
Fixes: https://tracker.ceph.com/issues/54172
Signed-off-by: Cory Snyder <csnyder@iland.com>
Add a one-click button, shown in the case of an orchestrator-managed
cluster, for configuring rbd-mirroring when it is not properly set up.
The button creates an rbd-mirror service and also an rbd-labelled
replicated pool (size 3), if they don't already exist.
Fixes: https://tracker.ceph.com/issues/55646
Signed-off-by: Nizamudeen A <nia@redhat.com>
Whenever a scrub session is waiting for an excessive length
of time for a locked object to be unlocked, the total
number of concurrent scrubs in the system is reduced.
The existing cluster warning issued on such occurrences is
easily overlooked. Here we add a constant reminder each time
the OSD tries to schedule scrubs.
Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
This changes "master" to "main" in a title. If we lived in an
ideal world, this would have been a part of PR#46678.
Signed-off-by: Zac Dover <zac.dover@gmail.com>
Support folk have asked if we can have a timestamp on the output of
multisite status commands so they can see at a glance how the reported
status relates to other events and changes.
As such, we now add a timestamp to any output where it doesn't disrupt
things. In practice this means anything whose output isn't a single
JSON array.
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
mon/Elector: notify_rank_removed: erase rank from both live_pinging and dead_pinging sets for the highest ranked MON
Reviewed-by: Greg Farnum <gfarnum@redhat.com>
TestMDSMetrics.test_delayed_metrics is failing due to the absence of
the omit_sudo parameter in the remote.run() call made by
set_inter_mds_block() in qa/tasks/cephfs/filesystem.py.
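A hedged sketch of the kind of call involved; the helper name, arguments and
iptables rule below are illustrative assumptions, not the actual
set_inter_mds_block() body:
```
def block_inter_mds_traffic(remote, peer_addr):
    # Sketch only: with a vstart-style test runner, an explicit 'sudo' in
    # args is stripped unless omit_sudo=False is passed to remote.run().
    remote.run(
        args=['sudo', 'iptables', '-A', 'INPUT', '-s', peer_addr, '-j', 'REJECT'],
        omit_sudo=False,
    )
```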
Fixes: https://tracker.ceph.com/issues/56065
Signed-off-by: Neeraj Pratap Singh <neesingh@redhat.com>
Also pin sphinx-autodoc-typehints to 1.18.3, to address the following
error:
ERROR: sphinx-autodoc-typehints 1.18.3 has requirement Sphinx>=4.5, but you'll have sphinx 4.4.0 which is incompatible.
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
This fixes the fino collision bug, which is triggered when the snapid
is larger than 0xffff.
From the MDS 'mds_max_snaps_per_dir' option we can see that the maximum
number of snapshots per directory is 4K, while in ceph-fuse around 64K
(0xffff - 2) stags are available to build the fake fuse inode numbers
for each directory.
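The following is a rough Python illustration of the stag idea, assuming a
per-mount table that hands out 16-bit stags for live snapids and packs the
stag into the high bits of the fake ino; the bit layout and names are
assumptions, not the actual ceph-fuse code:
```
CEPH_SNAPDIR = (1 << 64) - 1   # pseudo snapid of the hidden .snap dir
CEPH_NOSNAP = (1 << 64) - 2    # "head" / no-snapshot snapid

class StagTable:
    """Sketch only: map arbitrary 64-bit snapids onto small 16-bit stags."""

    def __init__(self):
        # stags 0 and 1 are reserved so the head and snapdir inos stay
        # consistent across all ceph-fuse mounts
        self.snap_stag_map = {CEPH_NOSNAP: 0, CEPH_SNAPDIR: 1}
        self.stag_snap_map = {0: CEPH_NOSNAP, 1: CEPH_SNAPDIR}
        self.next_stag = 2     # ~64K (0xffff - 2) stags left for real snapids

    def get_stag(self, snapid):
        stag = self.snap_stag_map.get(snapid)
        if stag is None:
            stag = self.next_stag
            self.next_stag += 1          # stag reuse/wraparound not shown here
            self.snap_stag_map[snapid] = stag
            self.stag_snap_map[stag] = snapid
        return stag

    def make_fake_ino(self, ino, snapid):
        # Pack the 16-bit stag, not the raw snapid, into the high bits of the
        # fake ino, so snapids larger than 0xffff can no longer collide.
        return (self.get_stag(snapid) << 48) | (ino & 0xffffffffffff)
```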
Fixes: https://tracker.ceph.com/issues/54653
Signed-off-by: Xiubo Li <xiubli@redhat.com>
All the snap ids of the finos returned to libfuse from libcephfs are
recorded in the 'stag_snap_map' map and are never erased before
unmounting. So if libfuse passes an invalid fino, ceph-fuse should
return the EINVAL errno instead of crashing.
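A minimal sketch of the intended behaviour; the function name and the
negative-errno convention are assumptions used for illustration:
```
import errno

def lookup_snapid(stag_snap_map, stag):
    # Sketch only: an unknown stag means libfuse handed us a fino that we
    # never issued, so fail with EINVAL instead of asserting/crashing.
    if stag not in stag_snap_map:
        return -errno.EINVAL
    return stag_snap_map[stag]
```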
Fixes: https://tracker.ceph.com/issues/54653
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Two stags are reserved: 0 for CEPH_NOSNAP and 1 for CEPH_SNAPDIR.
This makes sure that the non-snap and snapdir inode numbers are always
consistent across all ceph-fuse mounts.
Signed-off-by: Xiubo Li <xiubli@redhat.com>
If the flags are empty, can_update_at_runtime() in option.h returns
true. That means this option could be changed at runtime, which is
buggy: when the option is false, ceph-fuse uses its own fake inos
instead of libcephfs', so if it is flipped at runtime we will hit
'ino doesn't exist' assert bugs.
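As an illustration of the problem (the real code is C++ in option.h; the
simplified flag logic and the option name below are assumptions): an option
that carries no flags at all is treated as updatable at runtime, so the fix
is to tag it explicitly as startup-only.
```
# Hypothetical, simplified model of the flag check; not the actual Option API.
FLAG_RUNTIME = 0x1
FLAG_STARTUP = 0x2

class Option:
    def __init__(self, name, flags=0):
        self.name = name
        self.flags = flags

    def can_update_at_runtime(self):
        # With an empty flag set the option is considered runtime-updatable.
        return self.flags == 0 or bool(self.flags & FLAG_RUNTIME)

fake_ino_opt = Option('example_fuse_fake_ino_option', flags=FLAG_STARTUP)
assert not fake_ino_opt.can_update_at_runtime()
```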
Fixes: https://tracker.ceph.com/issues/54653
Signed-off-by: Xiubo Li <xiubli@redhat.com>
This PR changes "master" to "main" in the
basic_workflow.rst file. I have even changed
"master" to "main" in some terminal output from
several years ago. This isn't historically accurate,
of course, but my hope is that this change will
prevent someone in the future from being confused
about why an antiquated branch name is referred to.
Signed-off-by: Zac Dover <zac.dover@gmail.com>