Prior to this set of commits, the MDS would write the ESubtreeMap to the
journal, trim everything up to that segment, then finally force the
trimming of that last segment (`MDLog::trim(0)`). This is awkward in the
new code which preserves a major segment boundary at the beginning of
the journal during trimming. Instead of writing a special case for this
situation, it seems more natural to just use a new "lid" or "cap" event
to mark the beginning of the journal when no subtree map can yet be
written but we need sequence numbers to tie in other MDS tables.
Like ESegment, ELid doesn't actually contain any state. It's just a
marker for the beginning the log after rank deactivation or rank
creation. It can appear in the middle of the log if the shutdown
sequence is interrupted while writing the event but the MDS will skip it
during replay in that case.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
When the MDS journal is wiped, EResetJournal is a major segment boundary
as it necessarily begins the journal.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
When the ESubtreeMap is very large (~5k+ subtrees), the MDS will
end up logging only a few events (as bad as 1) per segment as the
subtree map dominates the segment size.
This test simply creates an artificially large subtree and confirms that
other file system activity completes in a timely manner. This is now
taking advantage of the minor segments which allows for a normal set of
events per log segment (and fewer subtree maps). The test fails on the
current main HEAD.
Historical note: when I first observed this abberant behavior, the
vstart cluster was actually using mds_debug_subtrees = True (the default
for every vstart cluster). This caused the MDS to write out the subtree
map (for debugging reasons) with every event. When testing the MDS with
large subtrees (distributed ephemeral pinning), this caused the MDS to
slow to a trickle of operations per second. Despite this unintentional
misconfiguration, the problem still exists but the number of auth
subtrees must be large for a particlar rank to replicate the behavior.
On main HEAD, the creation of 10k files (workload stage) takes ~110
seconds. On this branch, it takes ~30 seconds.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
This commit adds a new ESegment event type which can delineate
LogSegments. This event can be used as an alternative to the heavy
weight ESubtreeMap which can be very expensive to generate when the MDS
has a large subtree map.
Fixes: https://tracker.ceph.com/issues/58154
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
The major problem here is that the MDLog::_start_entry method puts the
current event sequence number in the EMetaBlob of the event (if
present). Because of this, no other event can be submitted as this would
invalidate the event sequence. Instead, fixup the event sequence during
submission and simplify related logic that uses it during EMetaBlob
construction.
Secondarily, for the purposes of this commit series, _start_entry
introduced recursive locks when generating the ESubtreeMap within
MDLog::_segment_upkeep. So, this commit is a necessary cleanup.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
We're currently setting FMT_USE_TZSET=0 when building libfmt
in order to avoid the _tzset function, which is unavailable
under Mingw:
aa5769ecf1
The issue is that it still gets used by fmt/chrono.h, which is
why we'll move this definition to the top level cmake file.
Note that the Windows build is currently failing as a result of
a recent change: https://github.com/ceph/ceph/pull/52590/files
In file included from ceph/src/common/ceph_time.h:22,
from ceph/src/include/encoding.h:31,
from ceph/src/include/uuid.h:9,
from ceph/src/include/types.h:21,
from ceph/src/crush/CrushWrapper.h:14,
from ceph/src/crush/CrushCompiler.h:7,
from ceph/src/crush/CrushCompiler.cc:4:
ceph/src/fmt/include/fmt/chrono.h: In lambda function:
ceph/src/fmt/include/fmt/chrono.h:953:5: error: ‘_tzset’ was
not declared in this scope; did you mean ‘tzset’?
953 | _tzset();
| ^~~~~~
| tzset
Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>
mgr/dashboard: empty grafana panels for performance of daemons
Reviewed-by: Pegonzal <NOT@FOUND>
Reviewed-by: cloudbehl <NOT@FOUND>
Reviewed-by: Juan Miguel Olmo <jolmomar@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
Fixes: https://tracker.ceph.com/issues/61792
Signed-off-by: avanthakkar <avanjohn@gmail.com>
Removing the `ceph-` prefix from ceph_daemon label to adopt it with the label
format used by queries in grafana dashboards. Also changing the
`instance_id` label for rgw to match the values coming from
exporter and prometheus module
Use the more modern prompt block instead of
using code blocks for example commands.
Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
The start time and end time CLI option specification is missing a space between the date and the optional time value. Also expand the text to talk about "optional time" after the date.
Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
cephx: initializing two member variables in the ctors
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Adam C. Emerson <aemerson@redhat.com>
exporter: ceph-exporter scrapes failing on multi-homed server
Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Juan Miguel Olmo Martínez <jolmomar@ibm.com>
Reviewed-by: Redouane Kachach <rkachach@redhat.com>
Add instructions directing the reader to install the "python3-routes"
package. This package is required in order to launch the dashboard after
the installation procedure has completed, but is not yet included in the
install-deps.sh script.
Signed-off-by: Zac Dover <zac.dover@proton.me>