Problem:
In the monitors we hold 2 copies of disallowed_leader:
1. MonMap class
2. Elector class
When computing the ConnectivityScore for the monitors during
the election, we use the `disallowed_leader` from Elector
class to determine which monitors we shouldn't allow to lead.
Now, we rely on the function `set_elector_disallowed_leaders`
to set the `disallowed_leader` of the Elector class. The MonMap
class copy of the `disallowed_leader` contains the
`tiebreaker_monitor`, so we inherit that and also add the
monitors that are dead due to a zone failure.
Currently, the `adding dead monitors` phase is only allowed if we can
enter stretch_mode. This is a problem when failing over a stretch cluster
zone and then reviving the entire zone: the revived monitors
can't enter stretch_mode while they are in the "probing" state,
since PaxosServices like osdmon become unreadable (this is expected).
Solution:
We unconditionally add monitors that are in
`monmap->stretch_marked_down_mons` to the
`disallowed_leaders` list in
`Monitor::set_elector_disallowed_leaders`, since
if the monitors are in `monmap->stretch_marked_down_mons`
we know that they probably belong to a marked down
zone and are not fit to lead.
This will fix the problem of newly revived monitors
having a different disallowed_leaders set
and getting stuck in election.
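
A minimal sketch of the change in behavior, written in Python rather than
the monitor's C++ and using hypothetical stand-in names (`MonMap` here is a
toy dataclass, `in_stretch_mode` is an assumed flag). It only illustrates
why a probing monitor used to compute a different set than its peers:

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class MonMap:
    # Hypothetical stand-in for the two monmap fields named in the text.
    disallowed_leaders: Set[str] = field(default_factory=set)
    stretch_marked_down_mons: Set[str] = field(default_factory=set)


def elector_disallowed_leaders_old(monmap: MonMap, in_stretch_mode: bool) -> Set[str]:
    """Previous behavior: dead monitors are added only in stretch mode."""
    result = set(monmap.disallowed_leaders)
    if in_stretch_mode:
        result |= monmap.stretch_marked_down_mons
    return result


def elector_disallowed_leaders_new(monmap: MonMap) -> Set[str]:
    """New behavior: marked-down monitors are always added."""
    return set(monmap.disallowed_leaders) | monmap.stretch_marked_down_mons


monmap = MonMap(
    disallowed_leaders={"tiebreaker"},
    stretch_marked_down_mons={"d", "e"},
)

# A revived monitor that is still probing cannot enter stretch mode yet,
# so under the old logic its set differs from a surviving monitor's set.
revived_old = elector_disallowed_leaders_old(monmap, in_stretch_mode=False)
survivor_old = elector_disallowed_leaders_old(monmap, in_stretch_mode=True)

# Under the new logic both sides compute the same set.
revived_new = elector_disallowed_leaders_new(monmap)
survivor_new = elector_disallowed_leaders_new(monmap)
```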
Fixes: https://tracker.ceph.com/issues/63183
Signed-off-by: Kamoltat <ksirivad@redhat.com>
crimson/os/seastore: return ghobject_t::max as the end when list_objects reaches the end of the listing
Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
this structure should be created at the frontend and trickle all the way
to the RADOS layer. it holds: dout prefix, optional yield and trace.
in this commit, it was only added to the "complete()" sal interface,
and to the "write_meta()" rados interface.
in the future, it should be added to more sal interfaces, replacing the
current way where dpp and optional yield are passed as separate
arguments to all functions.
in addition, if more information is needed, it should be possible
to add that information to the request context struct without changing
many function prototypes
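
a rough illustration of the idea, in Python rather than C++ and with
hypothetical names throughout (`RequestContext`, its field names, and the
toy `complete`/`write_meta` signatures are assumptions, not the real
interfaces): one context object is passed down instead of several
separate arguments.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class RequestContext:
    # Hypothetical stand-in; field names are assumptions for illustration.
    log_prefix: str                   # plays the role of the dout prefix
    yield_ctx: Optional[Any] = None   # plays the role of the optional yield
    trace: Optional[Any] = None       # plays the role of the trace


def write_meta(rctx: RequestContext, key: str, value: str) -> str:
    """Toy lower-layer call: takes one context argument instead of several."""
    return f"{rctx.log_prefix} write_meta {key}={value}"


def complete(rctx: RequestContext, etag: str) -> str:
    """Toy upper-layer call that hands the same context down unchanged."""
    return write_meta(rctx, "etag", etag)


# The context is created once at the top and trickles down.
rctx = RequestContext(log_prefix="req 42:")
line = complete(rctx, "abc")
```

adding another field to the struct later would leave both toy signatures
untouched, which is the property the text is after.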
basic test instructions:
https://gist.github.com/yuvalif/1c7f1e80126bed5fa79345efb27fe1b1
Signed-off-by: Yuval Lifshitz <ylifshit@redhat.com>
CephExporter was being (partially) over-shadowed by the Ceph class as
the Ceph class listed 'ceph-exporter' as one of the daemon types it
handled. This change updates CephExporter to a ContainerDaemonForm while
simultaneously breaking the link between Ceph and 'ceph-exporter',
allowing CephExporter to handle all the duties of managing ceph-exporter,
continuing the process of having clearer logical responsibilities and
class hierarchy in cephadm.
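
The shadowing can be pictured with a toy first-match registry. This is a
sketch only: the class names, the `handled` tuples, and the `choose`
helper below are simplified stand-ins and should not be read as cephadm's
actual code.

```python
from typing import Sequence, Tuple, Type


class DaemonForm:
    """Toy base class; not cephadm's real interface."""
    handled: Tuple[str, ...] = ()

    @classmethod
    def for_daemon_type(cls, daemon_type: str) -> bool:
        return daemon_type in cls.handled


def choose(forms: Sequence[Type[DaemonForm]], daemon_type: str) -> Type[DaemonForm]:
    """Return the first registered form that claims the daemon type."""
    for form in forms:
        if form.for_daemon_type(daemon_type):
            return form
    raise ValueError(f"no form handles {daemon_type!r}")


class CephBefore(DaemonForm):
    # Illustrative subset: the generic class also claims ceph-exporter.
    handled = ("mon", "mgr", "ceph-exporter")


class CephAfter(DaemonForm):
    # Same subset with the ceph-exporter link removed.
    handled = ("mon", "mgr")


class CephExporter(DaemonForm):
    handled = ("ceph-exporter",)


# With the overlap, the generic class wins and the dedicated one is shadowed.
before = choose([CephBefore, CephExporter], "ceph-exporter")
# Without the overlap, the dedicated class is selected.
after = choose([CephAfter, CephExporter], "ceph-exporter")
```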
Signed-off-by: John Mulligan <jmulligan@redhat.com>
Prevent classes that want to check the filesystem from breaking the
simple daemon forms instantiation test case. A better future fix would
be avoiding checking the file system during __init__ of the class but
that is left for future improvements.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
The Ceph.daemons property has two unfortunate behaviors. Most importantly,
it includes ceph-exporter, which causes the separate CephExporter class to
be over-shadowed in the DaemonForms mechanism. Second, it couples all
functions that want to know the names of ceph daemon types to the Ceph
class, preventing future refactoring of that class.
Break the existing coupling by adding a new `ceph_daemons` function
similar to `get_supported_daemons` but returning the same value that
Ceph.daemons used to provide. This will permit future fixes and
improvements.
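
A small sketch of the decoupling, assuming a module-level function is
enough for callers. The daemon type values are an illustrative subset and
`is_ceph_daemon_type` is a hypothetical caller, not a function from
cephadm:

```python
from typing import List

# Illustrative subset only; the real list lives in cephadm.
_CEPH_DAEMON_TYPES = ("mon", "mgr", "osd", "ceph-exporter")


def ceph_daemons() -> List[str]:
    """Return the ceph daemon type names without touching the Ceph class."""
    return list(_CEPH_DAEMON_TYPES)


def is_ceph_daemon_type(daemon_type: str) -> bool:
    """Hypothetical caller: it depends on the function, not on a class."""
    return daemon_type in ceph_daemons()


exporter_is_ceph = is_ceph_daemon_type("ceph-exporter")
grafana_is_ceph = is_ceph_daemon_type("grafana")
```

Because callers no longer read a class attribute, the class itself can
later drop 'ceph-exporter' without changing what these callers see.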
Signed-off-by: John Mulligan <jmulligan@redhat.com>
Eliminate the _dispatch_deploy function, folding it into the
_common_deploy function, because the mass of if-elif lines has
been replaced and keeping it as a separate function no longer
serves a useful purpose.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
CACHE_POOL_NO_HIT_SET is retained in *api_tests*.yaml and
rbd_mirror.yaml snippets for TestLibRBD.ListChildrenTiered and
TestClusterWatcher.CachePools tests.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
With cache tiering facets gone, "pool" facets are strictly about
--data-pool option now. Rename to "data-pool" and create symlinks
to a common directory.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Cache tiering facets have been a constant source of job timeouts
accompanied by "slow request" warnings on the OSDs for at least two
years. The same workloads pass without pool/small-cache-pool.yaml or
thrashers/cache.yaml.
See cache tiering deprecation note added in commit 535b8db33e ("doc:
deprecate the cache tiering").
Fixes: https://tracker.ceph.com/issues/63149
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>