For a ZNS device, a open/full zone has to be reset before it can be
reused to write from start. Seastore releases a segment/zone and marks
it empty and expects to be able to write to it from start. So as a part
of release reset the zone, so it moves to empty state on the device.
Signed-off-by: Aravind Ramesh <aravind.ramesh@wdc.com>
Zones in IMP-OPEN, EXP-OPEN, CLOSED states in a ZNS device are
counted as active resources. ZNS SSDs can have a limit on the
number of zones that can be active at the same time (max_active_resources).
If CLOSED zones reach max_active_zones supported by the device, then
opening/writing to newer zones will fail.
So a close_segment() from Seastore is essentially a FINISH
operation on a ZNS zone.
Do FINISH operation on a zone instead of CLOSE from segment_close().
Signed-off-by: Aravind Ramesh <aravind.ramesh@wdc.com>
SegmentAllocator::close_segment() writes tail information to a
segment before closing the segment, and this is written at the
end of segment. However, for ZNS SSDs, the writes have to always happen
at write pointer, so writing tail info at the end of a zone fails if
the WP is not at the offset requested by close_segment().
If the write pointer is not at lba where the tail information is written,
then advance write pointer by writing zeroes to the zone from it's current
write pointer. Then write the tail information at the end of zone.
Added advance_wp() function which advances the write pointer and then write
tail information, in case of ZNS devices but for a regular device it
continues to write at the end of segment.
Do close_segment() call after writing tail information, closing a segment
first and then writing tail information can cause potential race conditions
on a zns backed segment.
Signed-off-by: Aravind Ramesh <aravind.ramesh@wdc.com>
For ZNS SSDs, every write to a segment/zone has to happen at the zone's
write pointer. Any write request to an offset which is not at
write pointer will be failed by the drive.
In ZNSSegment::write() error out if write offset is not same as WP.
Signed-off-by: Aravind Ramesh <aravind.ramesh@wdc.com>
librbd: use actual monitor addresses when creating a peer bootstrap token
Reviewed-by: Mykola Golub <mgolub@suse.com>
Reviewed-by: Christopher Hoffman <choffman@redhat.com>
install-deps: script exit on "/ValueError" in centos_stream8
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Ken Dreyer <kdreyer@redhat.com>
common/ceph_context: leak some memory fail to show in valgrind
Reviewed-by: Laura Flores <lflores@redhat.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Each compilation unit should be able to define its own dout_subsys
without generating a redefinition warning. When dout_subsys is defined
in header files, it complicates this matter. This commit removes
definitions and header files and makes sure definitions are added to
.cc files as needed.
Additionally, at Adam Emerson's suggestion, use "static constexpr"
rather than "#define" to set "dout_subsys" in a few places as a
reminder to ultimately do it more broadly.
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
This is regarding failures in unregister_remote_update_watcher() and
unregister_local_update_watcher(). handle_replay_complete() can't be
called in these cases anymore as it would blindly attempt to unregister
watchers from scratch again. Dropping handle_replay_complete() calls
there means that these failures would only be logged and would not be
surfaced by snapshot replayer. But the only caller ignores them
anyway:
void ImageReplayer<I>::shut_down(int r) {
...
// close the replayer
if (m_replayer != nullptr) {
ctx = new LambdaContext([this, ctx](int r) {
m_replayer->destroy();
m_replayer = nullptr;
ctx->complete(0); <------
});
ctx = new LambdaContext([this, ctx](int r) {
m_replayer->shut_down(ctx);
});
}
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
If a shutdown is requested, e.g. by update_pool_replayers() because
remote RADOS instance got blocklisted, and Replayer::shut_down() pends
it on completion of current snapshot sync, it gets stuck if replayer
encounters an error in the interim. This is particularly likely in the
blocklist case: a higher layer may detect that client got blocklisted
and request a shutdown first, and then when replayer sees EBLOCKLISTED
in turn, it calls handle_replay_complete() -- which does not resume
a pending shutdown. Because update_pool_replayers() blocks on shutdown
with Mirror::m_lock held, eventually the entire daemon hangs in
perpetuity.
Fixes: https://tracker.ceph.com/issues/56154
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
This PR adds unselectable prompts to three files that are
transcluded in the doc/mgr/dashboard.rst file. These three
files are:
1. debug.inc.rst
2. feature_toggles.inc.rst
3. motd.inc.rst
The addition of unselectable prompts to these three files
completes the work begun in PR#47810 (d8064b4), which sought
to bring dashboard.rst into line with the unselectable prompt
standard introduced by Kefu Chai in 2020.
Signed-off-by: Zac Dover <zac.dover@gmail.com>
1. fixing the output to show local-time instead of UTC format, matching
operator<<() handling (and all the rest of our logs)
2. adding a 'short' mode (as {:s}) for when, e.g. in most scrub logs,
we only need 3 digits for the sub-second, and do not need the
trailing TZ designation.
Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
so we can use the formatter defined for `LogEntry` in fmtlib v9.
in this new version of fmtlib, it is required to define a specialization
for the formatted type even when it comes to the types with an override of
operator<<(). since we already have an override for `LogEntry`, let's define
the specialization for `fmt::formatter<LogEntry>`.
this change should address the FTBFS when building with fmtlib v9.
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
so we can use the formatter defined for `entity_name_t`. in fmtlib v9,
it is required to define a specialization for the formatted type even
the type has an override of operator<<(). now that we already have a
formatter for `entity_name_t`, let's just use it.
this change should address the FTBFS when building with fmtlib v9.
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
Relying on mon_host config option is fragile, as the user may confuse
v1 and v2 addresses, group them incorrectly, etc. Get mon_host value
only as a fallback.
Fixes: https://tracker.ceph.com/issues/57317
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
in 23c3f76018, the change to fail the mgr
is proposed immediately. but `MgrMonitor::prepare_command()` method still
returns `true` in this case. its indirect caller of
`PaxosService::dispatch()` considers this as a sign that it needs to
propose the change with `propose_pending()`. but the pending change has
already been proposed by `MgrMonitor::prepare_command()`, and
`have_pending` is also cleared by this call. as we don't allow
consecutive paxos proposals, the second `propose_pending()` call is
delayed with a configured latency. but when the timer is fired, this
poseponed call would find itself trying to propose nothing. the change
to fail the mgr has been proposed. that's why we have
`ceph_assert(have_pending)` assertion failures.
in this change, the second proposal is not proposed anymore if the
proposal is proposed immediately. this should avoid the assertion
failure.
this change should address the regression introduced by
23c3f76018.
Fixes: https://tracker.ceph.com/issues/56850
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
so the `DOWNLOAD_EXTRACT_TIMESTAMP` property of
`ExternalProject_Add()` command is set by default on CMake v3.24 and up.
it helps to set the a more accurate timestamp for the downloaded
content, hence the targets depending on the extracted content can be
rebuilt if the URL changes.
see also https://cmake.org/cmake/help/latest/policy/CMP0135.html
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
we were using a for loop for this purpose, but the for loop was unrolled
when we bumped up the required cmake version.
this change paves the road to setting "CMP0135" to "NEW". this policy
is a new one introduced by CMake v3.24.
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
mgr/dashboard: Improve level A accessibility for pagination component
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Laura Flores <lflores@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
client: specify the quota type when finding the root quota realm
Reviewed-by: Rishabh Dave <ridave@redhat.com>
Reviewed-by: Milind Changire <mchangir@redhat.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
crimson/os/seastore: generalize journal tail calculations with CircularBoundedJournal
Reviewed-by: Myoungwon Oh <myoungwon.oh@samsung.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
mgr/orchestrator/tests: don't match exact whitespace in table output
Reviewed-by: John Mulligan <jmulligan@redhat.com>
Reviewed-by: Laura Flores <lflores@redhat.com>