the lba btree root leaf is empty after an osd reboot, because SegmentStateTracker's states are wrong.
this happens when seastore is closed before tracker->do_write has finished.
as a result, read_extent in the transaction manager can't read the extent and hits:
ceph_assert(0 == "Should be impossible");
Signed-off-by: chunmei-liu <chunmei.liu@intel.com>
This PR removes the run-promtool-unittests.sh script, as CMakeLists.txt already handles the promtool execution.
It also adds a description of how to run these tests to Readme.md.
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
The version is not applied to methods like list(), create(), get(), etc. Also, for the endpoints whose version was changed, the docs and the request header still carry the version v1.0+. So with the version reduced, a request fails with a 415 error. This PR fixes this issue.
Fixes: https://tracker.ceph.com/issues/50855
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
and a simple REPL client allowing developers to peek and poke the
selftest module. if this turns out to be useful, we can promote this
method into a dedicated mix-in class, so other modules can use it if
a developer wants to test them manually.
Signed-off-by: Kefu Chai <kchai@redhat.com>
there is a chance that the stop() and umount() methods get called in the
error handling path even if start() is not called. in that case, just make
these methods no-op, to ensure that OSD behaves in that case.
Signed-off-by: Kefu Chai <kchai@redhat.com>
thread pool is not needed until AlienStore::start(). with this change,
we are able to tell if the AlienStore is actually started or not in
AlienStore::stop().
as seastar::sharded<Service> starts a service in two phases:
1. construct the shard instances
2. actually start them
and it stops a service in a single shot, which both stops the services
and destructs the service instance(s).
so we have to implement a proper stop() method for services whose
start() might not be called after its instance is created by
seastar::sharded<Service>::start() in case of error handling or if
we just don't want to call start().
to ensure we can skip the steps to clean up the stuff created by
start(), we need to have a flag in the sharded service, because
AlienStore is a member variable of OSD, and when we do mkfs, AlienStore
is not start()'ed, and as explained above, we have to call OSD::stop()
to ensure OSD instance is destructed properly. but OSD::stop()
calls store->umount() and store->stop() unconditionally. these methods
in AlienStore rely on a functional thread pool.
fortunately, we don't need to call these methods if the store is never
mounted or started. in a case of failed "mkfs", store is not mounted at
all but the store and osd instances are created.
so, in this change, thread pool is created in AlienStore::start(), and
we will use it to tell if AlienStore is started or not in the following
change which makes the related method no-op if AlienStore is not started
yet.
also, postpone the creation of `store` until AlienStore::start(), so
we don't need to destroy it in the dtor of AlienStore. otherwise,
BlueStore::~BlueStore() would need to reference resources which are only
available in alien threads, but when OSD::~OSD() is called, we are in
seastar's reactor.
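the lifecycle described above can be sketched roughly as follows. this is a Python illustration only, not the C++ AlienStore code: the `LazyStore` class and its members are hypothetical, and `ThreadPoolExecutor` merely stands in for the alien thread pool. the point is that the pool is created in start(), so its presence doubles as the "started" flag.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Optional


class LazyStore:
    """Hypothetical stand-in for a service whose start() may never be called."""

    def __init__(self) -> None:
        # Phase 1: the instance exists, but nothing expensive is created yet.
        self._pool: Optional[ThreadPoolExecutor] = None
        self.mounted = False

    def start(self) -> None:
        # Phase 2: the pool is created here, so "pool exists" means "started".
        self._pool = ThreadPoolExecutor(max_workers=2)

    def mount(self) -> None:
        if self._pool is None:
            raise RuntimeError("mount() called before start()")
        # Run the (empty) mount work on the pool and wait for it.
        self._pool.submit(lambda: None).result()
        self.mounted = True

    def umount(self) -> None:
        # No-op when never started: there is no pool to run the work on.
        if self._pool is None:
            return
        self._pool.submit(lambda: None).result()
        self.mounted = False

    def stop(self) -> None:
        # Safe to call unconditionally, e.g. from an error handling path.
        if self._pool is None:
            return
        self._pool.shutdown(wait=True)
        self._pool = None
```

with this shape, a caller that only constructed the instance (as in a failed "mkfs") can still call umount() and stop() without touching a pool that was never created.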
Signed-off-by: Kefu Chai <kchai@redhat.com>
otherwise the sharded_service's dtor complains if we destruct it without
stopping it first, like:
FATAL: startup failed: std::system_error (error crimson::net:3, negotiation failure)
crimson-osd: ../src/seastar/include/seastar/core/sharded.hh:523: seastar::sharded<T>::~sharded() [with Service = crimson::osd::OSD]: Assertion `_instances.empty()' failed.
Aborting on shard 0.
Signed-off-by: Kefu Chai <kchai@redhat.com>
* use seastar::app_template::run() instead of
seastar::app_template::run_deprecated(), so the continuation
passed to run() can return an int explicitly instead of
returning `void`. more readable this way.
* wrap the whole block in run() in a giant try-catch block,
so the exceptions thrown by the startup code can be captured
and handled.
* do not capture the exceptions individually in the try-catch
block anymore. the outer catch block takes care of them.
this change improves the error handling when crimson-osd launches.
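the shape of this error handling can be sketched as below. this is a Python illustration under assumptions, not the crimson-osd code: the `run` helper and the `startup` callable are hypothetical names, and a plain int return value stands in for the value returned from the continuation.

```python
import sys
from typing import Callable


def run(startup: Callable[[], None]) -> int:
    """Run the startup code and translate any failure into an exit status."""
    try:
        startup()
    except Exception as e:
        # One outer handler replaces the per-step handlers.
        print(f"FATAL: startup failed: {e}", file=sys.stderr)
        return 1
    return 0
```

a caller would then pass the result to the process exit, e.g. `sys.exit(run(startup))`.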
Signed-off-by: Kefu Chai <kchai@redhat.com>
Since it is possible there is no podman process running when launching
vstart, use 'command -v' instead of 'pgrep -f'.
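The difference is that 'pgrep -f' looks for a running process, while 'command -v' only asks whether the executable can be found. A rough Python analogue of the latter, given purely as an illustration (vstart itself is a shell script, and `have_command` is a hypothetical helper), is `shutil.which`:

```python
import shutil


def have_command(name: str) -> bool:
    """Return True if `name` resolves to an executable on PATH."""
    # shutil.which() searches PATH; it does not care whether a
    # process with that name is currently running.
    return shutil.which(name) is not None
```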
Signed-off-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com>
with the recent support for async rbd operations from pacific+, a hang
can occur when an older client (without async support) is being upgraded
and simultaneously interacts with a newer client which expects the
requests to be async: the newer client considers the return code for
request completion to be the acknowledgement of the async request, and
then keeps waiting for another acknowledgement of request completion.
this should be rare, happening only when the lock owner is an old client,
and should be deferred if compatibility issues arise.
see also: 541230475d3b25ab18c4eb9bc5011060462594a6(octopus)
Signed-off-by: Deepika <dupadhya@redhat.com>
As there is no inherent ordering, there may be multiple removable
images past the unremovable image. On top of that, removing a clone
may make its parent removable so perform an additional pass if any
image gets removed.
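A minimal sketch of this multi-pass approach follows, in Python for illustration only (it is not the rbd implementation). The `purge` function, the `removable` predicate and the `parent_of` mapping are all hypothetical names introduced for the example.

```python
from typing import Callable, Dict, List, Set


def purge(images: List[str],
          removable: Callable[[str, Set[str]], bool]) -> Set[str]:
    """Remove every image that is, or becomes, removable.

    Returns the set of images that could not be removed.
    """
    remaining = set(images)
    progress = True
    while progress:
        progress = False
        # Visit every image; do not stop at the first unremovable one.
        for image in images:
            if image in remaining and removable(image, remaining):
                remaining.discard(image)
                progress = True  # a removal may unblock a parent
    return remaining


# Hypothetical example: "clone" is a child of "base".
parent_of: Dict[str, str] = {"clone": "base"}


def no_children(image: str, remaining: Set[str]) -> bool:
    # An image is removable once no remaining image names it as parent.
    return all(parent_of.get(other) != image for other in remaining)


left = purge(["base", "other", "clone"], no_children)
```

In the first pass "base" is skipped because "clone" still exists, but "other" and "clone" are removed; the additional pass then removes "base", so `left` ends up empty.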
Fixes: https://tracker.ceph.com/issues/51021
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
This PR adds parallel construction to the "Service
Specification" section of the "Service Management"
chapter of the cephadm documentation.
Signed-off-by: Zac Dover <zac.dover@gmail.com>
This PR creates parallel structure for the
text in the "Daemon Status" section of the
cephadm Service Management chapter.
Signed-off-by: Zac Dover <zac.dover@gmail.com>
rgw: add the description of blocking io during index resharding
Reviewed-by: Matt Benjamin <mbenjamin@redhat.com>
Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>
print a more verbose error message when monc fails to connect to the
monitor, for better user experience.
also, unregister all dispatchers by calling msgr->stop() before calling
monc.stop() to ensure the messenger can be shut down gracefully.
Signed-off-by: Kefu Chai <kchai@redhat.com>
* refs/pull/41483/head:
cephadm: stop passing --no-hosts to podman
mgr/nfs: use host.addr for backend IP where possible
mgr/cephadm: convert host addr if non-IP to IP
mgr/dashboard,prometheus: new method of getting mgr IP
doc/cephadm: remove any reference to the use of DNS or /etc/hosts
mgr/cephadm: use known host addr
mgr/cephadm: resolve IP at 'orch host add' time
Reviewed-by: Sebastian Wagner <swagner@suse.com>
This reverts cfc1f914ce, which is no longer
necessary because (1) we don't use socket.getfqdn(), and (2) we generally
do not rely on DNS or /etc/hosts at all anymore (with the exception of
the upgrade transition).
Signed-off-by: Sage Weil <sage@newdream.net>
Previously we allowed the host.addr to be a DNS name (short or fqdn).
This is problematic because of the inconsistent way that docker and podman
handle /etc/hosts, and undesirable because relying on external DNS is
an external source of failure for the cluster without any benefit in
return (simply updating DNS is not sufficient to make ceph behave).
So: update any non-IP to an IP as soon as we start up (presumably on
upgrade). If we get a loopback address (127.0.0.1 or 127.0.1.1), then
wait and hope that the next instance of the manager has better luck.
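The conversion rule can be sketched as below. This is a simplified illustration, not the mgr/cephadm code: `to_ip` is a hypothetical helper, and the `resolve` parameter is an assumption added so the lookup can be swapped out (it defaults to the standard `socket.gethostbyname`).

```python
import ipaddress
import socket
from typing import Callable, Optional


def to_ip(addr: str,
          resolve: Callable[[str], str] = socket.gethostbyname) -> Optional[str]:
    """Return an IP for `addr`, or None if the lookup yields a loopback address."""
    try:
        ipaddress.ip_address(addr)
        return addr  # already an IP, nothing to convert
    except ValueError:
        pass  # not an IP literal, so treat it as a name
    ip = resolve(addr)  # may raise socket.gaierror on lookup failure
    if ipaddress.ip_address(ip).is_loopback:
        # e.g. 127.0.0.1 or 127.0.1.1: keep the name and retry later
        return None
    return ip
```

Returning None on a loopback result leaves the stored name untouched, so a later attempt can try the lookup again.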
Signed-off-by: Sage Weil <sage@newdream.net>
- Use a centralized method get_mgr_ip()
- Look up the hostname via DNS. This is a bit more reliable than
getfqdn() since it will work even when podman adds the container
name to /etc/hosts.
Signed-off-by: Sage Weil <sage@newdream.net>
If the host IP/addr is known, use that. The addr might even be a FQDN
instead of an IP address, in which case we want to look that up instead
of the bare hostname.
Signed-off-by: Sage Weil <sage@newdream.net>
just for the sake of correctness, as they don't need a full-blown
std::string; all they need is a string-like object. and they always
create a std::string instance as a member variable if they want to have
a copy of it.
Signed-off-by: Kefu Chai <kchai@redhat.com>
before this change, cot never destructs the created ObjectStore
instances.
after this change, they are destructed upon returning from main().
Signed-off-by: Kefu Chai <kchai@redhat.com>