Although it is not necessary to mark_down() the connection in its
ms_handle_reset() event, it can be more convenient to allow it, and
Heartbeat already hits this assertion failure.
So move the assertion to close_clean(), which will help identify problems
if we happen to make ms_handle_reset() wait for messenger shutdown.
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
* be explicit that mark_down() won't trigger reset event;
* return void so no deadlock is possible; memory safety is still
guaranteed by Messenger::shutdown();
* related changes in crimson/osd;
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
When a new connection tries to replace the old one, the event order
should be like:
1. reset(old);
2. accept(new);
This means we cannot simply reschedule the reset event asynchronously. We
also still need to make sure the internal state is consistent when the
reset event is dispatched.
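The ordering requirement can be sketched as follows. This is an illustration in Python only (crimson itself is C++); the `RecordingDispatcher` class and the `replace_connection()` helper are hypothetical names invented for this sketch, and only the reset/accept event names are taken from the description above:

```python
class RecordingDispatcher:
    """Hypothetical dispatcher that records the events it receives, in order."""

    def __init__(self):
        self.events = []

    def ms_handle_reset(self, conn):
        self.events.append(("reset", conn))

    def ms_handle_accept(self, conn):
        self.events.append(("accept", conn))


def replace_connection(dispatcher, old, new):
    # Deliver reset(old) inline, before accept(new). If the reset were
    # deferred to a later task, the dispatcher could observe accept(new)
    # first, which is the ordering this change is meant to rule out.
    dispatcher.ms_handle_reset(old)
    dispatcher.ms_handle_accept(new)
```

With this shape, a dispatcher that tracks one connection per peer always drops the old connection before it learns about the replacement.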
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
* ms_handle_reset() should not be able to contaminate the internal
atomic messenger status, so make it an asynchronous event along
with close();
* add is_closed_clean() for the messenger unit test, because the reset
event now happens after the connection is closed.
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
Build Boost.Context (and other libraries) with Valgrind support so that
they can be run under Valgrind usefully, and include the define needed
to link against them.
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
* refs/pull/34023/head:
mgr/test_orchestrator: fix service filtering when using dummy data
mgr/dashboard: fix adding/removing host errors
Reviewed-by: Lenz Grimmer <lgrimmer@suse.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
* refs/pull/34030/head:
cephadm: env over last used
cephadm: fall back to default for infer_image
cephadm: remove outdated check
cephadm: consolidate default image logic
cephadm: only infer image for shell, run, inspect-image, pull, ceph-volume
Reviewed-by: Sebastian Wagner <swagner@suse.com>
Reviewed-by: Ricardo Marques <rimarques@suse.com>
* refs/pull/34060/head:
Merge PR #34027 into octopus
Merge PR #34045 into octopus
Merge pull request #34035 from dillaman/wip-rbd-permissions
mgr/progress: fix duration strings
Merge PR #34014 into octopus
Merge PR #34001 into octopus
Merge PR #34011 into octopus
qa/workunits/rbd: use context managers to control Rados lifespan
Merge pull request #34032 from dillaman/wip-rbd-octopus-docs
doc/releases/octopus: add additional RBD improvements
qa/workunits/cephadm/test_cephadm: mark services unmanaged for test
mgr/cephadm: do not reconfig unmanaged services
Merge PR #33981 into octopus
Merge pull request #34018 from ajarr/octopus-subvolume-clone-cancel
qa/workunits/cephadm/test_cephadm: output file for pub key
Merge PR #33866 into octopus
Merge PR #34005 into octopus
Merge PR #34013 into octopus
mgr/cephadm: pytest: Enable SpecStore
mgr/orchestrator: add test for default implementation for apply()
python-common: validate ServiceSpec.service_type
fixup mgr/cephadm: Fix ceph orch apply -i
mgr/dashbaord: orchestrator service: Revert wait_api_result to a single completion
mgr/orchestrator: `orch daemon add` accepts a yaml
mgr/cephadm: apply_drivegroups() returns a single Completion
mgr/cephadm: remove `trivial_result()`
mgr/cephadm: Fix `ceph orch apply -i`
Merge pull request #33994 from dillaman/wip-librbd-poll-event-race
doc: document `clone cancel` command
test: add `clone cancel` tests
mgr/volumes: introduce "clone cancel" volume command
mgr/volumes: allow canceling a single asynchronous job for a volume
mgr/volumes: helper for looking up a clone entry index
mgr/volumes: periodically check if clone operations should be canceled
mgr/volumes: periodically check if copy operations should be canceled
mgr/volumes: introduce 'canceled' state in clone op state machine
qa/suites/rados/verify/validater/valgrind: tolerate SLOW_OPS
qa/suites/rados/verify/validater/valgrind: less bluestore logging
qa/suites/rados/verify/validater: increase heartbeat grace
Revert "qa/suites/rados/verify: debug_ms = 1, osd_heartbeat_grace = 60"
Revert "qa/suites/rados/verify/validator/valgrind: debug refs = 5"
ceph_test_watch_notify: try notify 10x if ALLOW_TIMEOUTS is set
ceph_test_rados_api_misc: ShutdownRace timeout if ALLOW_TIMEOUTS is set
qa/suites/rados/verify: set ALLOW_TIMEOUTS for workunits
doc/install: edits
doc/cephadm: more edits
doc/cephadm/install: edits
doc/cephadm/adoption: improvements
doc/cephadm/install: a few edits
doc/cephadm/install: do not install ceph-common on host (by default)
doc/cephadm: drop os recs link
doc/cephadm/upgrade: improvements
doc/cephadm/upgrade: document upgrade
doc/cephadm/install: revamp install docs
doc: reorganize cephadm docs
doc/cephadm/administration: update docs on customizing SSH config
doc/cephadm/administration: add a note about the 'removed' dir
mgr/balancer: tolerate pgs outside of target weight map
qa/workunits/cephadm/test_cephadm: --skip-monitoring-stack
Merge PR #33974 into octopus
Merge PR #33442 into octopus
Merge PR #33997 into octopus
Merge PR #34000 into octopus
use quay octopus tip until 15.2 tag is available
python-common: reduce output of ServiceSpec.to_json()
python-common,mgr/cephadm: move assert_valid_host to service_spec
mgr/cephadm: add HostAssignment.validate()
mgr/dashboard: adapt create_osds interface change
mon/MgrMonitor: make 'mgr fail' work with no arguments
cephadm: add allow_ptrace option to enable SYS_PTRACE
update default container images
mgr/cephadm: limit number of times check host is performed in the serve loop
Merge PR #33961 into octopus
Merge PR #33952 into octopus
Merge PR #33990 into octopus
Merge PR #33955 into octopus
Merge PR #33936 into octopus
mgr/orch: add --all-available-devices to 'orch apply osd'
qa/workunits/cephadm: --skip-mon-network when using 127.0.0.1
cephadm: add tests
qa/tasks/cephadm: pass -v to bootstrap
mgr/cephadm: only try to place mons on hosts matching public_network
mgr/cephadm: keep track of host networks, ips
cephadm: automatically infer mon public_network, if we can
cephadm: add list-networks command
cephadm: bootstrap: deploy monitoring stack by default
librbd: defer event socket completion until after callback issued
cephadm: add-repo: add --version
mgr/cephadm: respect 'unmanaged' flag in spec
mgr/orch: orch ls: show <no spec> or <unmanaged> as appropriate
mgr/orch: orch ls: rename SPEC -> PLACEMENT
mgr/orch: add 'unmanaged' property to ServiceSpec
cephadm: rename distro args in repo methods
mgr/orch: combine 'orch daemon add <type> ...' into one command
mgr/orch: combine 'orch apply <type> [<placement>]' into one command
Reviewed-by: Laura Paduano <lpaduano@suse.com>
* refs/pull/34027/head:
qa/workunits/cephadm/test_cephadm: mark services unmanaged for test
mgr/cephadm: do not reconfig unmanaged services
qa/workunits/cephadm/test_cephadm: output file for pub key
Reviewed-by: Sebastian Wagner <swagner@suse.com>
Specify either --release name (to get the latest) or --version x.y.z to
get a specific version.
Adapt to updated locations on download.ceph.com so that we don't need to
know the release name for a specific x.y.z release.
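A rough sketch of the option handling described above, not the actual cephadm code. The parser name, the `repo_dir()` helper, the `rpm-<name>` directory naming, and the choice to make the two options mutually exclusive are all assumptions made for illustration:

```python
import argparse


def build_parser():
    # Hypothetical stand-in for the add-repo argument parser.
    parser = argparse.ArgumentParser(prog="add-repo-sketch")
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("--release",
                       help="release name, e.g. octopus (latest of that release)")
    group.add_argument("--version",
                       help="specific version in x.y.z form")
    return parser


def repo_dir(args, pkg="rpm"):
    # Assumed layout: '<pkg>-<release>' or '<pkg>-<x.y.z>', so a specific
    # version can be located without knowing its release name.
    if args.version:
        parts = args.version.split(".")
        if len(parts) != 3 or not all(p.isdigit() for p in parts):
            raise ValueError("version must be in x.y.z form")
        return "%s-%s" % (pkg, args.version)
    return "%s-%s" % (pkg, args.release)
```

For example, `repo_dir(build_parser().parse_args(["--version", "15.2.0"]))` yields a directory name keyed only by the version string.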
Signed-off-by: Sage Weil <sage@redhat.com>
The mon might fail to share the newest map with any of the up OSDs, e.g.,
due to an injected broken pipe. Since there is no client activity during
the osd-markdown tests, the OSDs might be unaware of map changes made
through the CLI. Make sure the OSDs have pulled the newest map before we
test their reaction to it.
Fixes: https://tracker.ceph.com/issues/44662
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
Currently, Heartbeat::send_failures() invokes monc.send_message() in a
continuation which may run asynchronously, risking a dangling "monc"
reference when the OSD shuts down and the MonClient is destroyed.
Signed-off-by: Xuehan Xu <xxhdx1985126@163.com>
- simplify the code to just calculate the durations when we need them
(I'm not sure why we had those temporary strings!)
- use a nicer time delta format
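One way to compute the duration on demand, as a sketch only: the `ProgressEvent` class and its `started_at` field are hypothetical names, and `datetime.timedelta` is just one possible choice of "nicer" format, not necessarily the one used in the module:

```python
import datetime
import time


class ProgressEvent:
    """Hypothetical event that stores only its start time."""

    def __init__(self, started_at):
        self.started_at = started_at

    def duration_str(self, now=None):
        # Derive the duration when it is requested instead of caching a
        # preformatted string on the event.
        now = time.time() if now is None else now
        elapsed = int(now - self.started_at)
        return str(datetime.timedelta(seconds=elapsed))
```

Here `ProgressEvent(100.0).duration_str(now=225.9)` gives `0:02:05`, since fractional seconds are truncated before formatting.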
Fixes: https://tracker.ceph.com/issues/44672
Signed-off-by: Sage Weil <sage@redhat.com>
This is an old test and it is buggy; we have good watch/notify coverage
in the newer tests.
Fixes: https://tracker.ceph.com/issues/43861
Signed-off-by: Sage Weil <sage@redhat.com>