RepoMirrors/ceph

mirror of https://github.com/ceph/ceph synced 2025-02-19 17:08:05 +00:00

Author	SHA1	Message	Date
Sage Weil	72c4fc75ad	qa/standalone: default to disable insecure global id reclaim Signed-off-by: Sage Weil <sage@newdream.net>	2021-04-06 17:29:23 -04:00
Sage Weil	3e80f61efe	qa/suites/upgrade/octopus-x: disable insecure global_id reclaim health warnings These will trigger on upgrade; suppress them so that our health gates will still work. Signed-off-by: Sage Weil <sage@newdream.net>	2021-04-06 17:29:23 -04:00
Sage Weil	9f6fd4fe56	qa/tasks/ceph[adm].conf[.template]: disable insecure global_id reclaim health alerts Turn these off everywhere for our tests so they don't interfere with our health checks. Signed-off-by: Sage Weil <sage@newdream.net>	2021-04-06 17:29:21 -04:00
Sage Weil	7ca7418322	cephadm: set auth_allow_insecure_global_id_reclaim for mon on bootstrap If this is a fresh pacific cluster, let's assume that there won't be legacy clients connecting. (And if there are, let's put the burden on the user to enable them to do so insecurely.) This is in contrast to upgrades, where our focus is on not breaking anything. Signed-off-by: Sage Weil <sage@newdream.net>	2021-04-06 17:28:55 -04:00
Sage Weil	18b343b06e	mon/HealthMonitor: raise AUTH_INSECURE_GLOBAL_ID_RENEWAL[_ALLOWED] Two new alerts: - AUTH_INSECURE_GLOBAL_ID_RENEWAL_ALLOWED if we are allowing clients to reclaim global_ids in an insecure manner (for backwards compatibility until clients are upgraded) - AUTH_INSECURE_GLOBAL_ID_RENEWAL if there are currently clients connected that do not know how to securely renew their global_id, as exposed by auth_expose_insecure_global_id_reclaim=true. The client auth names and IPs are listed in the alert details (up to a limit, at least). The docs recommend operators mute these alerts instead of silencing, but we still include options that allow the alerts to be disabled entirely. Signed-off-by: Sage Weil <sage@newdream.net>	2021-04-06 17:28:55 -04:00
Ilya Dryomov	05772ab612	auth/cephx: ignore CEPH_ENTITY_TYPE_AUTH in requested keys When handling CEPHX_GET_AUTH_SESSION_KEY requests from nautilus+ clients, ignore CEPH_ENTITY_TYPE_AUTH in CephXAuthenticate::other_keys. Similarly, when handling CEPHX_GET_PRINCIPAL_SESSION_KEY requests, ignore CEPH_ENTITY_TYPE_AUTH in CephXServiceTicketRequest::keys. These fields are intended for requesting service tickets, the auth ticket (which is really a ticket granting ticket) must not be shared this way. Otherwise we end up sharing an auth ticket that a) isn't encrypted with the old session key even if needed (should_enc_ticket == true) and b) has the wrong validity, namely auth_service_ticket_ttl instead of auth_mon_ticket_ttl. In the CEPHX_GET_AUTH_SESSION_KEY case, this undue ticket immediately supersedes the actual auth ticket already encoded in the same reply (the reply frame ends up containing two auth tickets). Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-04-06 17:28:55 -04:00
Ilya Dryomov	522a52e6c2	auth/cephx: rotate auth tickets less often If unauthorized global_id (re)use is disallowed, a client that has been disconnected from the network long enough for keys to rotate and its auth ticket to expire (i.e. become invalid/unverifiable) would not be able to reconnect. The default TTL is 12 hours, resulting in a 12-24 hour reconnect window (the previous key is kept around, so the actual window can be up to double the TTL). The setting has stayed the same since 2009, but it also hasn't been enforced. Bump it to get a 72 hour reconnect window to cover for something breaking on Friday and not getting fixed until Monday. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-04-06 17:28:55 -04:00
Ilya Dryomov	08766a17ed	mon: fail fast when unauthorized global_id (re)use is disallowed When unauthorized global_id (re)use is disallowed, we don't want to let unpatched clients in because they wouldn't be able to reestablish their monitor session later, resulting in subtle hangs and disrupted user workloads. Denying the initial connect for all legacy (CephXAuthenticate < v3) clients is not feasible because a large subset of them never stopped presenting their ticket on reconnects and are therefore compatible with enforcing mode: most notably all kernel clients but also pre-luminous userspace clients. They don't need to be patched and excluding them would significantly hamper the adoption of enforcing mode. Instead, force clients that we are not sure about to reconnect shortly after they go through authentication and obtain global_id. This is done in Monitor::dispatch_op() to capture both msgr1 and msgr2, most likely instead of dispatching mon_subscribe. We need to let mon_getmap through for "ceph ping" and "ceph tell" to work. This does mean that we share the monmap, which lets the client return from MonClient::authenticate() considering authentication to be finished and causing the potential reconnect error to not propagate to the user -- the client would hang waiting for remaining cluster maps. For msgr1, this is unavoidable because the monmap is sent immediately after the final MAuthReply. But for msgr2 this is rare: most of the time we get to their mon_subscribe and cut the connection before they process the monmap! Regardless, the user doesn't get a chance to start a workload since there is no proper higher-level session at that point. To help with identifying clients that need patching, add global_id and global_id_status to "sessions" output. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-04-06 17:28:55 -04:00
Ilya Dryomov	abebd643cc	auth/cephx: option to disallow unauthorized global_id (re)use global_id is a cluster-wide unique id that must remain stable for the lifetime of the client instance. The cephx protocol has a facility to allow clients to preserve their global_id across reconnects: (1) the client should provide its global_id in the initial handshake message/frame and later include its auth ticket proving previous possession of that global_id in CEPHX_GET_AUTH_SESSION_KEY request (2) the monitor should verify that the included auth ticket is valid and has the same global_id and, if so, allow the reclaim (3) if the reclaim is allowed, the new auth ticket should be encrypted with the session key of the included auth ticket to ensure authenticity of the client performing reclaim. (The included auth ticket could have been snooped when the monitor originally shared it with the client or any time the client provided it back to the monitor as part of requesting service tickets, but only the genuine client would have its session key and be able to decrypt.) Unfortunately, all (1), (2) and (3) have been broken for a while: - (1) was broken in 2016 by commit `a2eb6ae3fb` ("mon/monclient: hunt for multiple monitor in parallel") and is addressed in patch "mon/MonClient: preserve auth state on reconnects" - it turns out that (2) has never been enforced. When cephx was being designed and implemented in 2009, two changes to the protocol raced with each other pulling it in different directions: commits `0669ca21f4` ("auth: reuse global_id when requesting tickets") and `fec31964a1` ("auth: when renewing session, encrypt ticket") added the reclaim mechanism based strictly on auth tickets, while commit `5eeb711b6b` ("auth: change server side negotiation a bit") allowed the client to provide global_id in the initial handshake. These changes didn't get reconciled and as a result a malicious client can assign itself any global_id of its choosing by simply passing something other than 0 in MAuth message or AUTH_REQUEST frame and not even bother supplying any ticket. This includes getting a global_id that is being used by another client. - (3) was broken in 2019 with addition of support for msgr2, where the new auth ticket ends up being shared unencrypted. However the root cause is deeper and a malicious client can coerce msgr1 into the same. This also goes back to 2009 and is addressed in patch "auth/cephx: ignore CEPH_ENTITY_TYPE_AUTH in requested keys". Because (2) has never been enforced, no one noticed when (1) got broken and we began to rely on this flaw for normal operation in the face of reconnects due to network hiccups or otherwise. As of today, only pre-luminous userspace clients and kernel clients are not exercising it on a daily basis. Bump CephXAuthenticate version and use a dummy v3 to distinguish between legacy clients that don't (may not) include their auth ticket and new clients. For new clients, unconditionally disallow claiming global_id without a corresponding auth ticket. For legacy clients, introduce a choice between permissive (current behavior, default for the foreseeable future) and enforcing mode. If the reclaim is disallowed, return EACCES. While MonClient does have some provision for global_id changes and we could conceivably implement enforcement by handing out a fresh global_id instead of the provided one, those code paths have never been tested and there are too many ways a sudden global_id change could go wrong. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-04-06 17:28:55 -04:00
Ilya Dryomov	6b860684c6	auth/cephx: make cephx_decode_ticket() take a const ticket_blob Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-04-06 17:28:55 -04:00
Ilya Dryomov	b50b6abd60	auth/AuthServiceHandler: keep track of global_id and whether it is new AuthServiceHandler already has global_id field, but it is unused. Revive it and let the handler know whether global_id is newly assigned by the monitor or provided by the client. Lift the setting of entity_name into AuthServiceHandler. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-04-06 17:28:55 -04:00
Ilya Dryomov	49cba02a75	auth/AuthServiceHandler: build_cephx_response_header() is cephx-specific Make the one in CephxServiceHandler private and drop the stub in AuthNoneServiceHandler. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-04-06 17:28:54 -04:00
Ilya Dryomov	c151c9659b	auth/AuthServiceHandler: drop unused start_session() args session_key, connection_secret and connection_secret_required_length aren't material for start_session() across all three implementations. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-04-06 17:28:54 -04:00
Ilya Dryomov	a71f6e90d4	mon/MonClient: drop global_id arg from _add_conn() and _add_conns() Passing anything but MonClient instance's global_id doesn't make sense. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-04-06 17:28:54 -04:00
Ilya Dryomov	c9b022e073	mon/MonClient: reset auth state in shutdown() Destroying AuthClientHandler and not resetting global_id is another way to get MonClient to send CEPHX_GET_AUTH_SESSION_KEY requests with CephXAuthenticate::old_ticket not populated. This is particularly pertinent to get_monmap_and_config() which shuts down the bootstrap MonClient between retry attempts. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-04-06 17:28:54 -04:00
Ilya Dryomov	236b536b28	mon/MonClient: preserve auth state on reconnects Commit `a2eb6ae3fb` ("mon/monclient: hunt for multiple monitor in parallel") introduced a regression where auth state (global_id and AuthClientHandler) was no longer preserved on reconnects. The ensuing breakage was quickly noticed and prompted a follow-on fix `8bb6193c8f` ("mon/MonClient: persist global_id across re-connecting"). However, as evident from the subject, the follow-on fix only took care of the global_id part. AuthClientHandler is still destroyed and all cephx tickets are discarded. A new from-scratch instance is created for each MonConnection and CEPHX_GET_AUTH_SESSION_KEY requests end up with CephXAuthenticate::old_ticket not populated. The bug is in MonClient, so both msgr1 and msgr2 are affected. This should have resulted in a similar sort of breakage but didn't because of a much larger bug. The monitor should have denied the attempt to reclaim global_id with no valid ticket proving previous possession of that global_id presented. Alas, it appears that this aspect of the cephx protocol has never been enforced. This is dealt with in the next patch. To fix the issue at hand, clone AuthClientHandler into each MonConnection so that each respective CEPHX_GET_AUTH_SESSION_KEY request gets a copy of the current auth ticket. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-04-06 17:28:54 -04:00
Ilya Dryomov	eec24e4d11	mon/MonClient: claim active_con's auth explicitly Eliminate confusion by moving auth from active_con into MonClient instead of swapping them. The existing MonClient::auth can be destroyed right away -- I don't see why active_con would need it or a reason to delay its destruction (which is what stashing in active_con effectively does). Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-04-06 17:28:54 -04:00
Ilya Dryomov	6faa18e0a8	mon/MonClient: resurrect "waiting for monmap\|config" timeouts This fixes a regression introduced in commit `85157d5aae` ("mon: s/Mutex/ceph::mutex/"). Waiting for monmap and config indefinitely is not just bad UX, it actually masks other more serious bugs. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-04-06 17:28:54 -04:00
Sage Weil	9e96806aea	Merge PR #40603 into master * refs/pull/40603/head: qa/tasks/ceph.conf: shorten cephx TTL for testing Reviewed-by: Ilya Dryomov <idryomov@redhat.com>	2021-04-06 17:28:08 -04:00
Casey Bodley	fb760da0ed	Merge pull request #40190 from cbodley/wip-qa-rgw-sigv4-warnings rgw: silence some unused variable warnings Reviewed-by: Yehuda Sadeh <yehuda@redhat.com>	2021-04-06 16:08:28 -04:00
Sage Weil	3b391cbf7a	Merge PR #40602 into master * refs/pull/40602/head: qa/tasks/cephadm: add apply() method/task Reviewed-by: Sebastian Wagner <swagner@suse.com>	2021-04-06 16:04:27 -04:00
Sage Weil	2218c9473f	Merge PR #40597 into master * refs/pull/40597/head: cephadm: pass '-i' to docker\|podman run for shell\|enter Reviewed-by: Adam King <adking@redhat.com>	2021-04-06 16:04:17 -04:00
J. Eric Ivancich	b38cdb58dc	Merge pull request #40553 from dang/wip-dang-zipper-list RGW Zipper - Make sure bucket list progresses Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>	2021-04-06 13:49:35 -04:00
Patrick Donnelly	61b014c4a6	Merge PR #39939 into master * refs/pull/39939/head: cephfs: ceph-dokan - properly log the mounted root cephfs: Update ceph-dokan "--removable" flag cephfs: document using multiple fs on Windows cephfs: provide additional volume details on Windows cephfs: add ceph-dokan unmap command Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>	2021-04-06 10:46:23 -07:00
Patrick Donnelly	d0e3b7129d	Merge PR #40418 into master * refs/pull/40418/head: test: unmount when finished ino_release_cb test: wait a time for inode release qa: move ino_release_cb to libcephfs sub-suite qa: simplify recall triggers for bug qa: fix name for qa task referencing tracker issue Reviewed-by: Jeff Layton <jlayton@redhat.com>	2021-04-06 10:45:00 -07:00
Patrick Donnelly	eb38b924ff	Merge PR #40460 into master * refs/pull/40460/head: client: only check pool permissions for regular files Reviewed-by: Sidharth Anupkrishnan <sanupkri@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>	2021-04-06 10:43:54 -07:00
Patrick Donnelly	414b5593f0	Merge PR #40465 into master * refs/pull/40465/head: test: bump up retries for `test_mirroring_init_failure_with_recovery` test test: fix typo test: disable mgr/mirroring for `test_mirroring_init_failure_with_recovery` test Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>	2021-04-06 10:43:12 -07:00
Patrick Donnelly	532532e4ce	Merge PR #40468 into master * refs/pull/40468/head: mds/metrics: add one whitespace between metric type the metainfo Reviewed-by: Varsha Rao <varao@redhat.com> Reviewed-by: Rishabh Dave <ridave@redhat.com>	2021-04-06 10:42:42 -07:00
Patrick Donnelly	42d7da5f12	Merge PR #40469 into master * refs/pull/40469/head: qa: check mounts attribute in ctx Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>	2021-04-06 10:41:55 -07:00
Patrick Donnelly	40dfede339	Merge PR #40501 into master * refs/pull/40501/head: client: fix the opened inodes counter increasing Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>	2021-04-06 10:41:07 -07:00
Yuval Lifshitz	36e03d79cd	Merge pull request #40113 from yuvalif/wip-yuval-fix-49799 rgw/notification: set correct type to "post" and "copy" notifications	2021-04-06 19:45:57 +03:00
Patrick Donnelly	c94f0a50a5	Merge PR #40620 into master * refs/pull/40620/head: cephfs-top: fix typo in help Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>	2021-04-06 08:25:49 -07:00
Patrick Donnelly	9f68564bda	Merge PR #40613 into master * refs/pull/40613/head: doc/cephfs/nfs: add user id, fs name and key to FSAL block Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>	2021-04-06 08:24:59 -07:00
Ernesto Puerta	c448dfad84	Merge pull request #40616 from rhcs-dashboard/fix-50044-master mgr/dashboard: debug nodeenv hangs Reviewed-by: Alfonso Martínez <almartin@redhat.com> Reviewed-by: David Galloway <dgallowa@redhat.com> Reviewed-by: Kefu Chai <kchai@redhat.com>	2021-04-06 16:33:24 +02:00
Kefu Chai	39a23d9cd5	Merge pull request #40615 from tchaikov/wip-doc-header-css doc/_themes: fix the styling of section header Reviewed-by: Sebastian Wagner <swagner@suse.com>	2021-04-06 22:04:54 +08:00
Kefu Chai	bd6f810a81	doc/_themes: fix the styling of section header and list items in the latest document generated from RtD, the section headers are now in <section> tags instead of <div class="section">, so update the css accordingly. also tweak the style of the list items in unordered list to be the same as it was. Signed-off-by: Kefu Chai <kchai@redhat.com>	2021-04-06 21:38:01 +08:00
Kefu Chai	90a79647fb	Merge pull request #40587 from tchaikov/wip-addr-parse-cleanup src: use entity_addr_t::parse(string_view) when possible Reviewed-by: Sage Weil <sage@redhat.com>	2021-04-06 20:47:17 +08:00
Ernesto Puerta	2c2a397f84	mgr/dashboard: debug nodeenv hangs Increase verbosity in nodeenv command for debugging purposes. Fixes: https://tracker.ceph.com/issues/50044 Signed-off-by: Ernesto Puerta <epuertat@redhat.com>	2021-04-06 13:45:15 +02:00
Varsha Rao	08f1d906c2	doc/cephfs/nfs: add user id, fs name and key to FSAL block Fixes: https://tracker.ceph.com/issues/50161 Signed-off-by: Varsha Rao <varao@redhat.com>	2021-04-06 15:52:04 +05:30
Yuval Lifshitz	21b4f5aaf8	rgw/notification: set correct type to "post" and "copy" notifications Fixes: https://tracker.ceph.com/issues/49799 Signed-off-by: Yuval Lifshitz <ylifshit@redhat.com>	2021-04-06 11:01:18 +03:00
Jos Collin	96c7e07799	cephfs-top: fix typo in help Signed-off-by: Jos Collin <jcollin@redhat.com>	2021-04-06 12:08:25 +05:30
Kefu Chai	9f94d27752	Merge pull request #40582 from a16bitsysop/32bit src/common/buffer.cc: change cast to static_cast Reviewed-by: Kefu Chai <kchai@redhat.com>	2021-04-06 14:35:18 +08:00
Kefu Chai	8c37039c95	Merge pull request #40578 from tchaikov/wip-cmake-pmem cmake: require libpmem 1.7 and cleanups Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>	2021-04-06 14:31:51 +08:00
Kefu Chai	4ae1ab90fe	Merge pull request #40572 from athanatos/sjust/wip-48613 osd/PeeringState: fix acting_set_writeable min_size check Reviewed-by: Greg Farnum <gfarnum@redhat.com> Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com> Reviewed-by: Neha Ojha <nojha@redhat.com>	2021-04-06 14:30:20 +08:00
Kefu Chai	57ab82be38	Merge pull request #40475 from tchaikov/wip-qa-focal qa/tests: replaced ubuntu_latest.yaml with ubuntu 20.04 Reviewed-by: Yuri Weinstein <yweins@redhat.com>	2021-04-06 14:28:42 +08:00
Kefu Chai	38946dab0d	Merge pull request #40591 from tchaikov/wip-dencoder tools/ceph-dencoder: link against libtcmalloc Reviewed-by: Willem Jan Withagen <wjw@digiware.nl> Reviewed-by: Sage Weil <sage@redhat.com>	2021-04-06 10:31:07 +08:00
Samuel Just	642a1c1654	osd/PeeringState: fix acting_set_writeable min_size check acting.size() >= pool.info.min_size is meant to check min_size against acting set participants, but acting is a vector with placeholders. actingset is the representation with placeholders removed. The upshot of this bug is that the activation process will basically ignore min_size for an ec pool, allowing writes in cases where it shouldn't. PastIntervals::check_new_interval, however, performs the check correctly, and will therefore discount intervals in which we really did serve writes as not writeable. This can trigger many different problem conditions including but not limited to: - Unfound objects due to accepting a last_update with insufficient osds - Lost writes - Crashes due to peering rules being violated This bug was originally introduced with recovery below min_size in `e5a96fd`, and then preserved through refactors in `749a13d` and `95bec9`. `7cb818a` exposed it with the expansion of recovery below min_size to include ec pools (acting.size() is sufficient for replicated pools). Fixes: https://tracker.ceph.com/issues/48613 Fixes: https://tracker.ceph.com/issues/48417 Signed-off-by: Samuel Just <sjust@redhat.com>	2021-04-05 17:43:18 -07:00
Sage Weil	0f1aa79a4a	Merge PR #40599 into master * refs/pull/40599/head: rpm: add missing % in %dir directive Reviewed-by: Kefu Chai <kchai@redhat.com>	2021-04-05 16:50:23 -04:00
Yuval Lifshitz	bede749630	Merge pull request #40598 from TRYTOBE8TME/wip-rgw-kafka-tests-fix src/rgw: Issue #50138 fix	2021-04-05 21:30:40 +03:00
Sage Weil	94df762447	qa/tasks/ceph.conf: shorten cephx TTL for testing Rotate tickets frequently to exercise those code paths during testing. Signed-off-by: Sage Weil <sage@newdream.net>	2021-04-05 13:19:57 -05:00
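
Commit abebd643cc describes when a monitor should let a client reclaim a global_id. The decision can be sketched roughly as below. This is an illustration only, not the Ceph implementation: `OldTicket`, `check_reclaim` and `allow_insecure_reclaim` are hypothetical names, ticket verification is reduced to a boolean, and treating a legacy client with an invalid ticket the same as one with no ticket is an assumption.

```python
import errno
from dataclasses import dataclass
from typing import Optional


@dataclass
class OldTicket:
    """Hypothetical stand-in for the auth ticket a client presents on reconnect."""
    global_id: int
    valid: bool  # placeholder for real cryptographic verification


def check_reclaim(requested_global_id: int,
                  authenticate_version: int,
                  old_ticket: Optional[OldTicket],
                  allow_insecure_reclaim: bool) -> int:
    """Return 0 if the request may proceed, or -EACCES if the reclaim is denied."""
    if requested_global_id == 0:
        # The client is not claiming an existing id; the monitor assigns a new one.
        return 0
    if (old_ticket is not None and old_ticket.valid
            and old_ticket.global_id == requested_global_id):
        # Step (2) of the protocol: a valid ticket for the same global_id.
        return 0
    if authenticate_version >= 3:
        # New clients: never allow a claim without a corresponding ticket.
        return -errno.EACCES
    # Legacy clients: permissive or enforcing, depending on configuration.
    return 0 if allow_insecure_reclaim else -errno.EACCES
```

For example, `check_reclaim(42, 2, None, True)` models a legacy client in permissive mode and is allowed, while the same call with `authenticate_version=3` is denied regardless of the mode.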

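The min_size bug fixed in commit 642a1c1654 comes down to counting placeholders as participants. The following is a minimal sketch of that difference only; the `NONE` sentinel value and both function names are hypothetical, and the real check in PeeringState is not reproduced here.

```python
from typing import List

NONE = -1  # hypothetical placeholder for a missing shard in the acting vector


def writeable_by_vector_size(acting: List[int], min_size: int) -> bool:
    """Buggy form: counts placeholder slots as if they were participating OSDs."""
    return len(acting) >= min_size


def writeable_by_actingset(acting: List[int], min_size: int) -> bool:
    """Corrected form: compare min_size against the set with placeholders removed."""
    actingset = {osd for osd in acting if osd != NONE}
    return len(actingset) >= min_size


# An EC-style acting vector with one missing shard and min_size = 3.
acting = [3, NONE, 7]
buggy = writeable_by_vector_size(acting, 3)   # True: the placeholder is counted
fixed = writeable_by_actingset(acting, 3)     # False: only two real OSDs
```

For a vector without placeholders the two checks agree, which matches the commit's remark that acting.size() is sufficient for replicated pools.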

121772 Commits
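
The reconnect-window arithmetic in commit 522a52e6c2 can be written out as a small sketch. The function name is hypothetical, and the 72 hour input is an assumption that reads "a 72 hour reconnect window" as the guaranteed lower bound; the commit message as listed here does not state the new TTL value.

```python
from typing import Tuple


def reconnect_window_hours(ticket_ttl_hours: float) -> Tuple[float, float]:
    """Return the (minimum, maximum) time a disconnected client can stay away.

    The previous rotating key is kept around, so a ticket stays verifiable
    for at least one TTL and for up to two TTLs.
    """
    return (ticket_ttl_hours, 2 * ticket_ttl_hours)


old_window = reconnect_window_hours(12)   # the 12-24 hour window from the commit message
new_window = reconnect_window_hours(72)   # assumed TTL giving at least 72 hours
```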