on_replica_init() might be legitimately called twice,
if the replica was waiting for updates to complete
before servicing the request.
Fixes: https://tracker.ceph.com/issues/49867
Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
At the end of the lost_unfound tests, add an additional wait_for_clean()
check to ensure that recoveries get enough time to complete before
proceeding, and so avoid failures down the line. For example, a failure like
"Scrubbing terminated -- not all pgs were active and clean." occurs because
recoveries on the PGs did not get sufficient time to complete, even though
they were bound to complete eventually.
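The waiting pattern described above can be sketched roughly as follows.
This is an illustrative Python sketch only, not the actual test helper:
wait_until(), all_pgs_clean() and the timeout values are hypothetical
names and numbers chosen for the example.

```python
import time


def wait_until(predicate, timeout=300.0, interval=3.0,
               clock=time.monotonic, sleep=time.sleep):
    """Poll predicate() until it returns True or `timeout` seconds elapse.

    Hypothetical helper for illustration. Returns True on success and
    False if the deadline passed first.
    """
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Stand-in for a real "are all PGs active+clean?" query (assumption):
# in this toy example, recovery "finishes" on the third poll.
polls = {"count": 0}


def all_pgs_clean():
    polls["count"] += 1
    return polls["count"] >= 3


recovered = wait_until(all_pgs_clean, timeout=5.0, interval=0.01)
```

Polling with a deadline lets slow recoveries finish while still failing
the test if they never do.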
Fixes: https://tracker.ceph.com/issues/49844
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
If we are deploying a daemon that binds to a specific port and there is
an existing daemon we are removing that also binds to that port, stop
it first, unless the two daemons bind to different IPs.
This resolves the case where daemons bind to * and we redeploy with a
subnet to bind to. It would eventually converge before, but would
throw a bind error in the process and take longer.
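The rule above can be sketched as a small predicate. This is a minimal
illustration under assumptions, not the cephadm code: Binding and
must_stop_first() are hypothetical names, and an ip of None is used
here to model a daemon that binds to *.

```python
from typing import NamedTuple, Optional


class Binding(NamedTuple):
    """Hypothetical description of what a daemon listens on."""
    port: int
    ip: Optional[str] = None  # None models "bind to *" (all addresses)


def must_stop_first(old: Binding, new: Binding) -> bool:
    """Return True if the daemon being removed should be stopped
    before the new daemon is deployed."""
    if old.port != new.port:
        # Different ports never collide.
        return False
    if old.ip is not None and new.ip is not None and old.ip != new.ip:
        # Same port, but each daemon binds to its own specific IP.
        return False
    # Same port and at least one side binds to * (or both use the
    # same IP): the new daemon would hit a bind error.
    return True


# Old daemon bound to *, new one bound to an address in a subnet.
conflict = must_stop_first(Binding(9100), Binding(9100, "10.0.0.5"))
```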
Signed-off-by: Sage Weil <sage@newdream.net>
crimson/onode-staged-tree: fix tree_cursor_t::Cursor to be aware of extent duplication
Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
This patch ensures that if a device has GPT headers it will
not show up in `ceph-volume inventory` as available.
Fixes: https://tracker.ceph.com/issues/48697
Resolves: rhbz#1908065
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
Add the latency of a request to the end of the rgw request
summary log line. This summary line also contains
information about the request, such as the op, bucket,
object, and HTTP status.
Signed-off-by: Ali Maredia <amaredia@redhat.com>
* refs/pull/40160/head:
qa/suites/rados/cephadm/orchestrator_cli: random-distro$ -> 0-random-distro$
qa/suites/rados/cephadm/smoke-roleless: distro -> 0-distro
qa/distros/podman: install kubic once per host, in parallel
qa/suites/fs/multiclient: use clients: not all: for pexec
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
PR #40062 tweaked the behavior of lockdep to compile it out
of the code entirely for release builds. This fixes several
gtests where lockdep was force-enabled.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
This option enables 3 conversions:
1) pool stats, added in nautilus
2) per-pool omap, added in octopus
3) per-pg omap (replacing (2)), added in pacific
Upgrading the long-running cluster in sepia from octopus to pacific
resulted in conversion (3). This conversion isn't particularly useful
at this point, since the follow-on optimizations of pg removal aren't
in pacific yet.
This took 25 minutes for the SSD-based osds with <10GB of omap. That's
a lot of disruption, and some clusters have 10x that much omap data.
Upgrades going from nautilus to pacific will miss the finer-grained
stats granularity, but that isn't such an important feature that it's
worth causing potential availability problems.
In the future we can orchestrate these format changes via cephadm/rook
to minimize the impact on the whole cluster, e.g. going one osd at a
time or doing it during an off-peak period, and not necessarily at the
same time as an upgrade.
Fixes: https://tracker.ceph.com/issues/45265
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
The encryption format API now also implicitly loads the encryption
layer. This tweaks the tests to account for this functional
difference.
Fixes: https://tracker.ceph.com/issues/49848
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
podman on centos 8 at least doesn't accept the Dockerfile being fed to
it via stdin. Change that branch of the script to use the same method
that the ubuntu side does.
This gets the script working on senta03 for me.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
mgr/cephadm: when the device size contains a decimal, it cannot match the size exactly
Reviewed-by: Juan Miguel Olmo Martínez <jolmomar@redhat.com>
Reviewed-by: Sebastian Wagner <sebastian.wagner@suse.com>
mgr/cephadm: add info to 'ceph orch upgrade status' in cephadm
Reviewed-by: Juan Miguel Olmo Martínez <jolmomar@redhat.com>
Reviewed-by: Sebastian Wagner <sebastian.wagner@suse.com>
We're still occasionally hitting file descriptor limits when running
this test. Reduce the thread count to 32 for now, since it was possible
to reproduce the original problem with 10 or so threads.
Fixes: https://tracker.ceph.com/issues/49559
Signed-off-by: Jeff Layton <jlayton@redhat.com>
* refs/pull/40177/head:
doc: update Windows MSI link
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Be sure to note that python 3 is a prerequisite. Minimal centos 8
installs don't have it, for instance.
Also, we probably don't want to hardcode an octopus URL into the
suggested curl command. Change it to fill that in with
"|stable-release|", which should always point to the latest released
version name.
Fixes: https://tracker.ceph.com/issues/49806
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Fix a braino that came with commit f6854ac65d ("krbd: make sure the
device node is accessible after the mapping").
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Instead of having a direct download link, we'll point to the
download page, which will eventually contain other MSI versions as
well (e.g. Quincy).
While at it, we're simplifying the document a bit, dropping
information that's also included in the manual install guide.
Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>
Since kernel 5.12, hardware read-only state and user read-only
policy (BLKROGET/SET ioctls) are tracked separately in the block
layer. As the purpose of our ->set_read_only() method was exactly
that, it was removed.
As a side effect, BLKROSET no longer returns EROFS on an attempt
to make a read-only mapping read-write with "blockdev --setrw".
The policy gets updated, but the device remains read-only as before
because the hardware (== mapping) state is controlled by the driver.
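For reference, the read-only flag that these ioctls expose can be
queried from userspace roughly as follows. This is a sketch, not part
of the change: is_read_only() and _io() are hypothetical helper names,
and the request numbers are computed the way _IO(0x12, nr) expands on
x86 and arm (an assumption; some other architectures encode _IO
differently).

```python
import fcntl
import os
import struct


def _io(ioc_type: int, nr: int) -> int:
    """_IO(type, nr) as it expands on x86/arm (no direction or size bits)."""
    return (ioc_type << 8) | nr


BLKROSET = _io(0x12, 93)  # set the read-only policy
BLKROGET = _io(0x12, 94)  # get the read-only flag


def is_read_only(dev_path: str) -> bool:
    """Return True if BLKROGET reports the block device as read-only."""
    fd = os.open(dev_path, os.O_RDONLY)
    try:
        # The kernel writes an int into the buffer we pass in.
        buf = fcntl.ioctl(fd, BLKROGET, struct.pack("i", 0))
        return struct.unpack("i", buf)[0] != 0
    finally:
        os.close(fd)
```

Calling is_read_only() on a mapped device before and after
"blockdev --setrw" is one way to observe the behavior described above.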
Fixes: https://tracker.ceph.com/issues/49858
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
We switched from make to ninja but we're using the wrong target
when building the tests.
"ninja test" tries to actually run the tests. We'll have to use
"ninja tests" when targeting Windows.
Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>