RepoMirrors/ceph

mirror of https://github.com/ceph/ceph synced 2025-01-10 13:10:46 +00:00

Author	SHA1	Message	Date
Radoslaw Zarzynski	bf6404e2b1	crimson/osd: sending EVENT_DISCONNECT becomes implementation detail of Watch. In contrast to ceph-osd crimson sends CEPH_WATCH_EVENT_DISCONNECT directly from the timeout handler and after CEPH_WATCH_EVENT_NOTIFY_COMPLETE. This simplifies the Watch::remove() interface as callers aren't obliged anymore to decide whether EVENT_DISCONNECT needs to be sent or not -- it becomes an implementation detail of Watch. Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>	2021-05-12 13:29:28 +00:00
Radoslaw Zarzynski	4070a7d557	crimson/osd: wire up handling of watch timeouts. Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>	2021-05-12 13:29:28 +00:00
Radoslaw Zarzynski	7c80fcdae0	crimson/osd: s/do_timeout/do_notify_timeout/ per the upcoming do_watch_timeout(). Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>	2021-05-12 13:26:50 +00:00
Radoslaw Zarzynski	b5f1eb879e	crimson/osd: introduce the InternalClientRequest infrastructure. Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>	2021-05-12 13:26:50 +00:00
Radoslaw Zarzynski	42425f8cd3	crimson/osd: PG::with_locked_obc() doesn't depend on MOSDOp anymore. Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>	2021-05-12 13:26:49 +00:00
Radoslaw Zarzynski	35b03463dc	crimson/osd: PG::get_oid() doesn't depend on MOSDOp anymore. Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>	2021-05-10 18:03:55 +02:00
Radoslaw Zarzynski	27d5ac3327	osd: introduce OpInfo filling from a vector of OSDOps. Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>	2021-05-10 18:03:55 +02:00
Radoslaw Zarzynski	a4823d0420	crimson/osd: expose the non-MOSDOp-taking variant of do_osd_ops() externally. Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>	2021-05-10 18:03:55 +02:00
Radoslaw Zarzynski	eaa82d5823	crimson/osd: PG::do_osd_ops_execute() doesn't depend on MOSDOp anymore. Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>	2021-05-10 18:03:55 +02:00
Radoslaw Zarzynski	970cd1a29f	crimson/osd: pass std::vector<OSDOp>& to PG::submit_transaction(). This will allow in a moment to get rid of the dependency on `MOSDOp` on all paths of `PG::do_osd_ops_execute()`. Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>	2021-05-10 18:01:32 +02:00
Radoslaw Zarzynski	8fca471d14	crimson/osd: PG::do_osd_ops_execute() doesn't directly take ObjectContextRef. Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>	2021-05-10 18:01:32 +02:00
Radoslaw Zarzynski	21e31280cf	crimson/osd: PG::repair_object() doesn't depend on MOSDOp anymore. Before this commit the method was depending on `MOSDOp::get_min_epoch()` to start an `UrgentRecovery`. However, it seems `PG::get_osdmap_epoch()` would be sufficient here as the very early stages of the processing in `ClientRequest` ensure the PG fits the `get_min_epoch()` requirement. In the classical OSD the counterpart code looks like below: ``` int PrimaryLogPG::rep_repair_primary_object(const hobject_t& soid, OpContext *ctx) { // ... queue_peering_event( PGPeeringEventRef( std::make_shared<PGPeeringEvent>( get_osdmap_epoch(), get_osdmap_epoch(), PeeringState::DoRecovery()))); return -EAGAIN; } ``` In addition to the dependency minimalisation, the commit reformats the code around `PG::repair_object()` to fit our guidelines. Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>	2021-05-10 18:01:32 +02:00
Radoslaw Zarzynski	dd6dec306c	crimson/osd: reload obc also when handling ct_error::object_corrupted. Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>	2021-05-10 18:01:32 +02:00
Radoslaw Zarzynski	55a6d3f95c	crimson/osd: introduce RollbackOrchestrator to OpsExecuter. If the execution of an `OSDOp` fails, we're left with potentially altered `ObjectContext`. We deal with that by reloading `obc` if there was any modification to it. To figure this out, `has_seen_write()` on `OpsExecuter` is being called. Unfortunately, the current impl. has the following drawbacks: * `has_seen_write()` can be called after `std::move(ox).flush_...()` which is very inelegant; * it requires catching both `ObjectContext` and `OpsExecuter` while the latter already references the former; * there is no explicitly given reason in the header for justifying the presence of `has_seen_writes()`. Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>	2021-05-10 18:01:32 +02:00
Radoslaw Zarzynski	f96a7f0acf	crimson/osd: split PG::do_osd_ops() to facilitate InternalClientRequest. This commit brings `PG::do_osd_ops_execute()`, a subset of `PG::do_osd_ops()`; it handles the ops execution through `OpsExecuter` and the `submit_transaction()` but it stays independent from `MOSDOp` and `MOSDOpReply`. This trait facilitates the `InternalClientRequest`. Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>	2021-05-10 18:01:32 +02:00
Radoslaw Zarzynski	bca91e658c	crimson/osd: erase the message type in OpsExecuter. The reason is unification of infrastructure between external client requests (everything represented by the `ClientRequest`) and internal requests. Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>	2021-05-10 18:01:32 +02:00
Radoslaw Zarzynski	d6b0e5e0ec	crimson/osd: drop namespace for arg in PG::with_locked_obc(). It's unnecessary. Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>	2021-05-10 18:01:32 +02:00
Radoslaw Zarzynski	e3c64648c6	crimson/osd: split ClientRequest::PGPipeline into CommonPGPipeline. This is another step towards the `InternalClientRequest`. Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>	2021-05-10 18:01:32 +02:00
Radoslaw Zarzynski	a6fc6ad174	crimson/osd: the ClientRequest::do_recover_missing() takes oid externally. This refactor is a first step towards sharing the recovery bits with `InternalClientRequest`. Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>	2021-05-10 18:01:32 +02:00
Radoslaw Zarzynski	6235e8d52f	crimson/osd: implement ObjectContext relocking. This commit introduces a `ObjectContext`-taking variant of `PG::with_locked_obc()`. The upcoming internal counterpart for the `ClientRequest` is the intended audience. Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>	2021-05-10 18:01:32 +02:00
Radoslaw Zarzynski	43966de10c	crimson/osd: ObjectContext allows the hobject_t to be std::moved in ctor. Taken with "crimson/osd: use obc->get_oid() instead of passing hobject_t around" and enriched with the move-constructing down the `ObjectState` path this should allow saving some work in e.g. `std::string` instances that are part of the `hobject_t`. Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>	2021-05-10 18:01:32 +02:00
Radoslaw Zarzynski	b296180146	crimson/osd: OpsExecuter retrieves PG when doing op effects. This will be necessary to spawn the upcoming `InternalClientRequest` from the `Watch`'s timeout handler. Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>	2021-05-10 18:01:32 +02:00
Patrick Donnelly	8594b4f9a5	Merge PR #41128 into master * refs/pull/41128/head: qa/crontab: reduce frequency of pacific nightlies Reviewed-by: Josh Durgin <jdurgin@redhat.com>	2021-05-10 07:43:43 -07:00
J. Eric Ivancich	c48fe06e3f	Merge pull request #40563 from BryceCao/wip_add_check_for_sync_url rgw : add check empty for sync url Reviewed-by: Casey Bodley <cbodley@redhat.com>	2021-05-10 10:32:39 -04:00
J. Eric Ivancich	0132accc1c	Merge pull request #38729 from rosinL/fix-rgw-file-read rgw/rgw_file: Fix the return value of read() and readlink() Reviewed-by: Matt Benjamin <mbenjamin@redhat.com>	2021-05-10 10:32:11 -04:00
J. Eric Ivancich	89125de281	Merge pull request #36305 from ivancich/wip-ordered-list-map-efficiency rgw: ordered list map efficiency Reviewed-by: Casey Bodley <cbodley@redhat.com>	2021-05-10 10:31:42 -04:00
Daniel Gryniewicz	5dd603946f	Merge pull request #41108 from dang/wip-dang-zipper-link RGW Zipper - Remove link/unlink from API	2021-05-10 10:15:57 -04:00
Kefu Chai	4029ac1d87	Merge pull request #37720 from ifed01/wip-ifed-alloc-tool-fixes os/bluestore: some minor fixes/improvements for allocator's stats inquiries Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>	2021-05-10 21:26:17 +08:00
Amnon Hanuhov	7860c48b22	Merge pull request #40931 from AmnonHanuhov/wip-refactor_conn_send crimson/net: Refactor conn::send()	2021-05-10 13:37:29 +03:00
Ernesto Puerta	932b294147	Merge pull request #41218 from rhcs-dashboard/revert-base-href mgr/dashboard: fix base-href: revert it to previous approach Reviewed-by: Aashish Sharma <aasharma@redhat.com> Reviewed-by: Alfonso Martínez <almartin@redhat.com> Reviewed-by: Nizamudeen A <nia@redhat.com>	2021-05-10 10:28:41 +02:00
Ilya Dryomov	907044dc59	Merge pull request #41185 from idryomov/wip-rbd-pwl-reopen librbd/cache/pwl: fix parsing of cache_type in create_image_cache_state() Reviewed-by: Mahati Chamarthy <mahati.chamarthy@intel.com> Reviewed-by: Yin Congmin <congmin.yin@intel.com>	2021-05-09 21:49:53 +02:00
J. Eric Ivancich	2a30142d8a	Merge pull request #41141 from ivancich/wip-listing-initial-marker rgw: fix bucket object listing when marker matches prefix Reviewed-by: Matt Benjamin <mbenjamin@redhat.com>	2021-05-08 10:55:37 -04:00
J. Eric Ivancich	0e24f05d7e	Merge pull request #41140 from ivancich/wip-bucket-purge-paging rgw: radosgw_admin remove bucket not purging past 1,000 objects Reviewed-by: Daniel Gryniewicz <dang@redhat.com> Reviewed-by: Casey Bodley <cbodley@redhat.com>	2021-05-08 10:54:58 -04:00
J. Eric Ivancich	d568416961	Merge pull request #40886 from pritha-srivastava/wip-rgw-mfa-pin-check rgw: fix for mfa resync crash when supplied with only one totp_pin. Reviewed-by: Matt Benjamin <mbenjamin@redhat.com>	2021-05-08 10:54:08 -04:00
Kefu Chai	078b30e1f1	Merge pull request #41166 from tchaikov/wip-cmake-cython-cflags cmake: remove cflags from CC Reviewed-by: Josh Durgin <jdurgin@redhat.com>	2021-05-08 21:17:19 +08:00
Kefu Chai	467932a430	Merge pull request #41234 from tchaikov/wip-crimson-common crimson/common: use string_view when appropriate Reviewed-by: Ronen Friedman <rfriedma@redhat.com> Reviewed-by: Xuehan Xu <xxhdx1985126@gmail.com>	2021-05-08 20:00:37 +08:00
Kefu Chai	f3be0d8d81	crimson/common: use string_view when appropriate. The typical use case of get_val() passes a literal string as the key; in that case, there is no need to create a std::string, as md_config_t::get_val() always accepts a string_view as the option name. Signed-off-by: Kefu Chai <kchai@redhat.com>	2021-05-08 16:46:00 +08:00
Kefu Chai	0edba988f1	Merge pull request #41080 from t-msn/readdir-fix2 os/FileStore: fix to handle readdir error correctly Reviewed-by: Kefu Chai <kchai@redhat.com> Reviewed-by: Mykola Golub <mgolub@suse.com> Reviewed-by: Josh Durgin <jdurgin@redhat.com> Reviewed-by: Neha Ojha <nojha@redhat.com>	2021-05-08 16:36:51 +08:00
Kefu Chai	a46db0c127	Merge pull request #40993 from neha-ojha/wip-50466 osd/PG.cc: handle removal of pgmeta object Reviewed-by: Josh Durgin <jdurgin@redhat.com> Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>	2021-05-08 16:31:52 +08:00
Kefu Chai	912850d084	Merge pull request #41143 from idryomov/wip-posix-memalign-fix common/buffer: adjust align before calling posix_memalign() Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>	2021-05-08 16:29:37 +08:00
Kefu Chai	a490e7f67a	Merge pull request #41155 from rzarzynski/wip-global-backtrace-bug-50653 log: fix the formatting when dumping thread IDs. Reviewed-by: Kefu Chai <kchai@redhat.com>	2021-05-08 16:29:00 +08:00
Kefu Chai	eb7c3c54df	Merge pull request #41220 from rzarzynski/wip-crimson-monc-honor-cancel crimson/monc: honor auth_result_t::canceled as the result of do_auth(). Reviewed-by: Kefu Chai <kchai@redhat.com>	2021-05-07 23:52:44 +08:00
Kefu Chai	bfa2f3fa22	Merge pull request #41222 from tchaikov/wip-crimson-cleanups crimson/os: cleanups Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com> Reviewed-by: Xuehan Xu <xxhdx1985126@gmail.com>	2021-05-07 23:39:32 +08:00
Kefu Chai	6c0f8d1fef	Merge pull request #41223 from rzarzynski/wip-crimson-alienstore-sighup crimson/alienstore: block SIGHUP to coexist with Seastar's signal handling Reviewed-by: Kefu Chai <kchai@redhat.com>	2021-05-07 23:38:32 +08:00
Guillaume Abrioux	b4668be672	Merge pull request #41177 from dsavineau/cv_remove_legacy_release_check ceph-volume: remove legacy release check	2021-05-07 17:33:47 +02:00
Guillaume Abrioux	327aab9a05	Merge pull request #41178 from dsavineau/cv_tox_py3 ceph-volume: remove duplicate py3 env	2021-05-07 17:31:55 +02:00
Neha Ojha	d3692a3e92	Merge pull request #40016 from neha-ojha/wip-default-mclock use mclock_scheduler as the default scheduler Reviewed-by: Josh Durgin <jdurgin@redhat.com> Reviewed-by: Sridhar Seshasayee <sseshasa@redhat.com> Reviewed-by: Samuel Just <sjust@redhat.com> Reviewed-by: Sunny Kumar <sunkumar@redhat.com>	2021-05-07 08:08:39 -07:00
Radoslaw Zarzynski	71fd807990	crimson/monc: honor auth_result_t::canceled as the result of do_auth(). An attempt to `Connection::do_auth()` may finish in one of three states: _success_, _failure_ and _cancellation_. Unfortunately, its callers were missing the third treating cancellation like a failure. This was the root cause of the following failure at Sepia: ``` rzarzynski@teuthology:/home/teuthworker/archive/rzarzynski-2021-05-06_22:08:43-rados-master-distro-basic-smithi/6102605$ less ./remote/smithi204/log/ceph-osd.3.log.gz ... WARN 2021-05-06 22:35:40,464 [shard 0] osd - ms_handle_reset ... INFO 2021-05-06 22:35:40,465 [shard 0] monc - do_auth_single: connection closed INFO 2021-05-06 22:35:40,465 [shard 0] ms - [osd.3(client) v2:172.21.15.204:6808/31418@57568 >> mon.? v2:172.21.15.204:3300/0] execute_connecting(): protocol aborted at CLOSING -- std::system_error (error crimson::net:6, protocol aborted) ... ERROR 2021-05-06 22:35:40,465 [shard 0] osd - mon.osd.3 dispatch() ms_handle_reset caught exception: std::system_error (error crimson::net:3, negotiation failure) ceph-osd: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-3909-g81233a18/rpm/el8/BUILD/ceph-17.0.0-3909-g81233a18/src/crimson/common/gated.h:36: crimson::common::Gated::dispatch(const char, T&, Func&&) [with Func = crimson::mon::Client::ms_handle_reset(crimson::net::ConnectionRef, bool)::<lambda()>&; T = crimson::mon::Client]::<lambda(std::__exception_ptr::exception_ptr)>: Assertion `eptr.__cxa_exception_type() == typeid(seastar::gate_closed_exception)' failed. Aborting on shard 0. 
Backtrace: 0# 0x00005618C973932F in ceph-osd 1# FatalSignal::signaled(int, siginfo_t const) in ceph-osd 2# FatalSignal::install_oneshot_signal_handler<6>()::{lambda(int, siginfo_t, void)#1}::_FUN(int, siginfo_t, void*) in ceph-osd 3# 0x00007F7BB592EB20 in /lib64/libpthread.so.0 4# gsignal in /lib64/libc.so.6 5# abort in /lib64/libc.so.6 6# 0x00007F7BB3F29B09 in /lib64/libc.so.6 7# 0x00007F7BB3F37DE6 in /lib64/libc.so.6 8# 0x00005618C9FF295C in ceph-osd 9# 0x00005618C3907313 in ceph-osd 10# 0x00005618CCA2F84F in ceph-osd 11# 0x00005618CCA34D90 in ceph-osd 12# 0x00005618CCBEC9BB in ceph-osd 13# 0x00005618CC744E9A in ceph-osd 14# main in ceph-osd 15# __libc_start_main in /lib64/libc.so.6 16# _start in ceph-osd daemon-helper: command crashed with signal 6 ``` Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>	2021-05-07 13:55:32 +00:00
Radoslaw Zarzynski	9b7c026b93	crimson/alienstore: block SIGHUP to coexist with Seastar's signal handling. In `crimson/osd/main.cc` we instruct Seastar to handle `SIGHUP`. ``` // just ignore SIGHUP, we don't reread settings seastar::engine().handle_signal(SIGHUP, [] {}) ``` This happens using Seastar's signal handling infrastructure, which is incompatible with the alien world. ``` void reactor::signals::handle_signal(int signo, noncopyable_function<void ()>&& handler) { // ... struct sigaction sa; sa.sa_sigaction = [](int sig, siginfo_t *info, void *p) { engine()._backend->signal_received(sig, info, p); }; // ... } ``` ``` extern __thread reactor* local_engine; extern __thread size_t task_quota; inline reactor& engine() { return *local_engine; } ``` The low-level signal handler above assumes `local_engine._backend` is not null, which stays true only for threads from Seastar's world. Unfortunately, as we don't block the `SIGHUP` for alien threads, the kernel is perfectly authorized to pick one of them to run the handler, leading to weirdly-looking segfaults like this one: ``` INFO 2021-04-23 07:06:57,807 [shard 0] bluestore - stat DEBUG 2021-04-23 07:06:58,753 [shard 0] ms - [osd.1(client) v2:172.21.15.100:6802/30478@51064 >> mgr.4105 v2:172.21.15.109:6800/29891] --> #7 === pg_stats(0 pgs seq 55834574872 v 0) v2 (87) ... 
INFO 2021-04-23 07:06:58,813 [shard 0] bluestore - stat DEBUG 2021-04-23 07:06:59,753 [shard 0] osd - AdminSocket::handle_client: incoming asok string: {"prefix": "get_command_descriptions"} INFO 2021-04-23 07:06:59,753 [shard 0] osd - asok response length: 2947 INFO 2021-04-23 07:06:59,817 [shard 0] bluestore - stat DEBUG 2021-04-23 07:06:59,865 [shard 0] osd - AdminSocket::handle_client: incoming asok string: {"prefix": "get_command_descriptions"} INFO 2021-04-23 07:06:59,866 [shard 0] osd - asok response length: 2947 DEBUG 2021-04-23 07:07:00,020 [shard 0] osd - AdminSocket::handle_client: incoming asok string: {"prefix": "get_command_descriptions"} INFO 2021-04-23 07:07:00,020 [shard 0] osd - asok response length: 2947 INFO 2021-04-23 07:07:00,820 [shard 0] bluestore - stat ... Backtrace: 0# 0x00005600CD0D6AAF in ceph-osd 1# FatalSignal::signaled(int) in ceph-osd 2# FatalSignal::install_oneshot_signal_handler<11>()::{lambda(int, siginfo_t, void)#1}::_FUN(int, siginfo_t, void) in ceph-osd 3# 0x00007F5877C7EB20 in /lib64/libpthread.so.0 4# 0x00005600CD830B81 in ceph-osd 5# 0x00007F5877C7EB20 in /lib64/libpthread.so.0 6# pthread_cond_timedwait in /lib64/libpthread.so.0 7# crimson::os::ThreadPool::loop(std::chrono::duration<long, std::ratio<1l, 1000l> >, unsigned long) in ceph-osd 8# 0x00007F5877999BA3 in /lib64/libstdc++.so.6 9# 0x00007F5877C7414A in /lib64/libpthread.so.0 10# clone in /lib64/libc.so.6 daemon-helper: command crashed with signal 11 ``` Ultimately, it turned out the thread came out from a syscall (`futex`) and started crunching the `SIGHUP` handler's code in which a nullptr dereference happened. This patch blocks `SIGHUP` for all threads spawned by `AlienStore`. Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>	2021-05-07 13:43:40 +00:00
Kefu Chai	bdaa8bd05f	crimson/os: use this explicitly to silence the warning from clang. it fails to figure out that this is actually used, and complains that this is captured but not used. Signed-off-by: Kefu Chai <kchai@redhat.com>	2021-05-07 21:40:46 +08:00
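
Commit 9b7c026b93 above says that `SIGHUP` gets blocked for all threads spawned by `AlienStore`, so that the kernel never picks a non-Seastar thread to run the handler. A minimal sketch of that general technique follows, using only POSIX `pthread_sigmask()` and `std::thread`. The helper names (`block_sighup_in_this_thread`, `sighup_is_blocked`, `spawn_worker_without_sighup`) are hypothetical and are not taken from the Ceph source; the actual patch may apply the mask at a different point.

```cpp
#include <cassert>
#include <csignal>
#include <pthread.h>
#include <thread>
#include <utility>

// Hypothetical helper (not the AlienStore code): block SIGHUP in the calling
// thread. Returns 0 on success or an error number, like pthread_sigmask().
inline int block_sighup_in_this_thread() {
  sigset_t mask;
  sigemptyset(&mask);
  sigaddset(&mask, SIGHUP);
  return pthread_sigmask(SIG_BLOCK, &mask, nullptr);
}

// Reports whether SIGHUP is currently blocked in the calling thread.
// With a null `set` argument, pthread_sigmask() only reads the current mask.
inline bool sighup_is_blocked() {
  sigset_t current;
  sigemptyset(&current);
  pthread_sigmask(SIG_BLOCK, nullptr, &current);
  return sigismember(&current, SIGHUP) == 1;
}

// Hypothetical wrapper: start a worker thread that blocks SIGHUP before it
// runs any user-provided work, so a process-directed SIGHUP is delivered to
// some other thread that still has the signal unblocked.
template <typename Func>
std::thread spawn_worker_without_sighup(Func work) {
  return std::thread([work = std::move(work)]() mutable {
    int rc = block_sighup_in_this_thread();
    assert(rc == 0);
    (void)rc;  // silence the unused-variable warning when NDEBUG is set
    work();
  });
}
```

A caller would use `spawn_worker_without_sighup([] { /* blocking I/O */ }).join();`. Since a new thread inherits the signal mask of its creator, another option is to block the signal in the parent just before spawning and restore the mask afterwards, which also closes the short window before the worker applies its own mask.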

122923 Commits
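
Commit 71fd807990 in the log above notes that `Connection::do_auth()` may finish in one of three states (success, failure, cancellation) and that callers used to treat cancellation like a failure. The sketch below illustrates handling the third state separately. The `AuthResult` and `NextStep` enums and the `next_step_after_auth` function are hypothetical names chosen for illustration; what the real callers do in each case is an assumption, not something the commit message spells out.

```cpp
// Hypothetical three-state result, loosely modeled on the
// auth_result_t::canceled name mentioned in the commit message.
enum class AuthResult { success, failure, canceled };

// Hypothetical follow-up actions a caller might choose from.
enum class NextStep { proceed, report_error, abandon_quietly };

// Maps every auth outcome to a distinct follow-up. The point of the sketch is
// that `canceled` gets its own branch instead of falling into the failure path.
constexpr NextStep next_step_after_auth(AuthResult result) {
  switch (result) {
  case AuthResult::success:
    return NextStep::proceed;
  case AuthResult::failure:
    return NextStep::report_error;
  case AuthResult::canceled:
    return NextStep::abandon_quietly;
  }
  return NextStep::report_error;  // not reached for valid enum values
}
```

Listing all enumerators in the `switch` without a `default` label lets compilers that support `-Wswitch` warn when a new state is added but not handled.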