Commit Graph

143035 Commits

Author SHA1 Message Date
Laura Flores
61e721c9f1 PendingReleaseNotes: add note about read balancer mgr module integration
Signed-off-by: Laura Flores <lflores@ibm.com>
2024-01-28 13:15:49 -06:00
Laura Flores
e2ce8ed1ff mgr: add read balancer support inside the balancer module
Read balancing may now be managed automatically via the balancer
manager module. Users may choose between two new modes: ``upmap-read``, which
offers upmap and read optimization simultaneously, or ``read``, which may be used
to only optimize reads. Existing balancer commands have also been extended to
include more information about read balancing.

Run the following commands to test the new automatic behavior:
`ceph balancer on` (on by default)
`ceph balancer mode <read|upmap-read>`
`ceph balancer status`

Run the following commands to test the new supervised behavior:
`ceph balancer off`
`ceph balancer mode <read|upmap-read>`
`ceph balancer eval` | `ceph balancer eval <pool-name>`
`ceph balancer eval-verbose` | `ceph balancer eval-verbose <pool-name>`
`ceph balancer optimize <plan-name>`
`ceph balancer show <plan-name>`
`ceph balancer eval <plan-name>`
`ceph balancer execute <plan-name>`

In the balancer module, there is also a new "self_test" function which tests
the module's basic functionality. This test can be triggered with the following
commands:
`ceph mgr module enable selftest`
`ceph mgr self-test module balancer`

Related Trello: https://trello.com/c/sWoKctzL/859-add-read-balancer-support-inside-the-balancer-module
Signed-off-by: Laura Flores <lflores@ibm.com>
2024-01-28 13:15:38 -06:00
Ronen Friedman
5970ff6637 osd/scrub: add required sub-states to handle queued reservation requests
The scrub async reserver is not yet used. All requests are treated as
'legacy' requests, i.e. requests that expect an immediate grant/deny
reply.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
2024-01-28 09:40:02 -06:00
Ronen Friedman
c61bca6d6b osd/scrub: add "queue my request" flag to replica reservation messages
Up-to-date primaries will set this flag when sending a reservation
request. The replica OSD, if too busy to handle the request immediately, will queue
it until such time that the number of concurrent reservations is below the
configured limit. The queued requests are honored in FIFO order.

Old primaries will not set this flag, and will receive the expected
grant or deny reply immediately.
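
The queuing behavior described above can be modeled with a small sketch. This is an illustrative Python model only, not the OSD's C++ code: the class name `ReplicaReserver`, its methods, and the `queue_if_busy` parameter are hypothetical stand-ins for the "queue my request" flag.

```python
from collections import deque
from typing import Deque, List, Set


class ReplicaReserver:
    """Toy model of a replica that grants, queues, or denies reservations."""

    def __init__(self, max_concurrent: int) -> None:
        self.max_concurrent = max_concurrent
        self.granted: Set[str] = set()
        self.queue: Deque[str] = deque()

    def request(self, primary: str, queue_if_busy: bool) -> str:
        # Below the configured limit: grant immediately.
        if len(self.granted) < self.max_concurrent:
            self.granted.add(primary)
            return "grant"
        # Up-to-date primaries set the flag and wait in FIFO order.
        if queue_if_busy:
            self.queue.append(primary)
            return "queued"
        # Old primaries expect an immediate answer, so deny.
        return "deny"

    def release(self, primary: str) -> List[str]:
        # Free the slot, then serve queued requests oldest first.
        self.granted.discard(primary)
        newly_granted: List[str] = []
        while self.queue and len(self.granted) < self.max_concurrent:
            nxt = self.queue.popleft()
            self.granted.add(nxt)
            newly_granted.append(nxt)
        return newly_granted
```

In this model a request without the flag never waits, which mirrors the immediate grant/deny reply that old primaries expect.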

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
2024-01-28 09:40:02 -06:00
Ronen Friedman
c6c05ab639 osd/scrub: add synchronous request to AsyncReserver API
To be used when handling replica reservation requests from "old"
primaries, that expect an immediate grant/deny reply.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
2024-01-28 09:40:02 -06:00
Matan Breizman
7eb9e33f53
Merge pull request #55281 from Matan-B/wip-matanb-crimson-cyanstore-rmcoll
crimson/os/cyanstore: support OP_RMCOLL

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: chunmei-liu <chunmei.liu@intel.com>
2024-01-28 11:22:39 +02:00
zdover23
d55f4b4a8d
Merge pull request #55333 from zdover23/wip-doc-2024-01-27-radosgw-index-verb-disagreement
doc/radosgw: fix verb disagreement - index.html

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
2024-01-28 18:17:52 +10:00
Zac Dover
9f271093f4 doc/radosgw: fix verb disagreement - index.html
Fix a tricky verb disagreement and rewrite a few sentences for what I
hope is greater clarity.

Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
2024-01-28 18:04:57 +10:00
Guillaume Abrioux
327ec975e9
Merge pull request #54423 from guits/dmcrypt-optim
ceph-volume: use 'no workqueue' options with dmcrypt
2024-01-27 12:27:42 +01:00
Guillaume Abrioux
f72100bbd1 ceph-volume: fix partitions support in disk.get_devices()
The following:
```
is_part = get_file_contents(os.path.join(_sys_dev_block_path, item, 'partition')) == "1"
```
assumes that any `/sys/dev/block/x:y/partition` file contains '1', which is wrong.
This file actually contains the corresponding partition number.
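
A minimal sketch of a check that accepts any partition number is shown here. It is not the code from this commit: `read_file` and `is_partition` are hypothetical helpers, and the fake sysfs tree exists only for the demonstration.

```python
import os
import tempfile


def read_file(path: str) -> str:
    """Return the stripped contents of path, or '' if it cannot be read."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return ""


def is_partition(sys_dev_block_path: str, item: str) -> bool:
    """Treat a device as a partition when its 'partition' file holds a number.

    The file contains the partition number (1, 2, 3, ...), so comparing it
    to "1" would only recognize the first partition of a disk.
    """
    content = read_file(os.path.join(sys_dev_block_path, item, "partition"))
    return content.isdigit()


# Demonstration against a fake sysfs tree (all paths are made up).
with tempfile.TemporaryDirectory() as root:
    for dev, number in (("8:1", "1"), ("8:2", "2")):
        os.makedirs(os.path.join(root, dev))
        with open(os.path.join(root, dev, "partition"), "w") as f:
            f.write(number + "\n")
    os.makedirs(os.path.join(root, "8:0"))  # whole disk: no 'partition' file

    results = {dev: is_partition(root, dev) for dev in ("8:0", "8:1", "8:2")}
```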

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
2024-01-27 01:29:19 +01:00
Laura Flores
906bf69521
Merge pull request #55323 from ceph/dependabot-github_actions-gregsdennis-dependencies-action-1.3.2
.github: Bump gregsdennis/dependencies-action from 1.2.3 to 1.3.2
2024-01-26 16:46:36 -06:00
Laura Flores
15bd38eece mgr: add CephReleases class to sustainably compare releases
Changes how the upmap balancer compares min_mon_release
to account for release names eventually wrapping around the alphabet.
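
One way to compare releases without relying on alphabetical order is to map each name to its major version number. The sketch below illustrates that idea only; it is not the `CephReleases` class added by this commit, and `Release` and `release_at_least` are hypothetical names.

```python
from enum import IntEnum


class Release(IntEnum):
    """Release names mapped to their major version numbers."""
    nautilus = 14
    octopus = 15
    pacific = 16
    quincy = 17
    reef = 18
    squid = 19


def release_at_least(current: str, required: str) -> bool:
    """Compare by version number, so name ordering never matters."""
    return Release[current] >= Release[required]
```

Comparing the names as strings only works while they happen to sort alphabetically; a numeric comparison keeps working after the names wrap around.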

Signed-off-by: Laura Flores <lflores@ibm.com>
2024-01-26 22:41:23 +00:00
Laura Flores
702cb64e87
Merge pull request #55331 from ceph/revert-55096-sjust/for-review/wip-crush-msr
Revert "crush: add multistep retry rules"
2024-01-26 16:15:46 -06:00
Guillaume Abrioux
0985e20134 ceph-volume: use 'no workqueue' options with dmcrypt
Cloudflare engineers did some testing and found that using
workqueues with encryption on flash devices has a negative effect.

See [1] for details.

With this patch, ceph-volume calls cryptsetup with the
`--perf-no_read_workqueue` and `--perf-no_write_workqueue` options
when the device is not rotational.
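
The selection logic can be sketched as follows. This is an assumption-laden illustration, not ceph-volume's code: `is_rotational` and `dmcrypt_perf_options` are hypothetical helpers, and the sketch assumes the rotational state is read from `/sys/block/<dev>/queue/rotational`.

```python
from typing import List


def is_rotational(sysfs_value: str) -> bool:
    """Interpret the content of /sys/block/<dev>/queue/rotational ('1' or '0')."""
    return sysfs_value.strip() == "1"


def dmcrypt_perf_options(rotational: bool) -> List[str]:
    """Return extra cryptsetup options; bypass workqueues on flash devices only."""
    if rotational:
        return []
    return ["--perf-no_read_workqueue", "--perf-no_write_workqueue"]
```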

[1] https://blog.cloudflare.com/speeding-up-linux-disk-encryption/

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Co-Authored-by: Stefan Kooman <stefan@kooman.org>
2024-01-26 22:05:30 +01:00
Samuel Just
a5ce9c3863 Revert "crush: add multistep retry rules"
This PR was merged by accident before it was ready.
Let's revert for now and open a new PR.

Signed-off-by: Samuel Just <sjust@redhat.com>
2024-01-26 20:32:05 +00:00
Yuri Weinstein
37d5d931b0
Merge pull request #55096 from athanatos/sjust/for-review/wip-crush-msr
crush: add multistep retry rules

Reviewed-by: Laura Flores <lflores@redhat.com>
2024-01-26 11:57:53 -08:00
Alexander Indenbaum
11a37da053 build dependencies: centos9
- ceph.spec.in: declare git as build dependency
- install-deps.sh: enable CRB repo

Test procedure:
    docker run --rm -ti  -v /home/baum/ceph-ci:/home/ceph quay.io/centos/centos:stream9 bash
    [root@a3c4b1545e93 /]# cd /home/ceph/
    [root@a3c4b1545e93 ceph]# ./install-deps.sh 2>&1 | tee install-deps.log

Signed-off-by: Alexander Indenbaum <aindenba@redhat.com>
2024-01-26 19:56:31 +00:00
Laura Flores
b5ad8cb325 .github/workflows: update comment to reflect version change
Signed-off-by: Laura Flores <lflores@ibm.com>
2024-01-26 09:59:56 -06:00
Casey Bodley
4bdc5d18dd rgw/rest: fix url decode of post params for iam/sts/sns
add the `in_query=true` argument to `url_decode()` to replace '+' with ' '
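
The difference between the two decoding modes can be illustrated with Python's standard library, used here only as an analogy for RGW's C++ `url_decode()`. The sample value is made up.

```python
from urllib.parse import unquote, unquote_plus

# A form-encoded POST parameter: '+' stands for a space, %2B for a literal '+'.
encoded = "my+role%2Bname"

plain = unquote(encoded)      # percent-decoding only, '+' is left untouched
form = unquote_plus(encoded)  # also replaces '+' with ' ' before decoding
```

With plain percent-decoding the encoded space and the literal plus sign become indistinguishable, which is the kind of problem the `in_query=true` argument avoids.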

Fixes: https://tracker.ceph.com/issues/64189

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2024-01-26 09:53:33 -05:00
Casey Bodley
1112689da4
Merge pull request #55303 from cbodley/wip-63130-debug
cmake/arrow: don't treat warnings as errors

Reviewed-by: Leonid Usov <leonid.usov@ibm.com>
2024-01-26 14:18:48 +00:00
Ilya Dryomov
2b11aa38ea
Merge pull request #55234 from ajarr/wip-64063
rbd-nbd: use netlink interface by default

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
2024-01-26 12:37:52 +01:00
Kefu Chai
d813ce1923
Merge pull request #55121 from zhscn/fix-ambiguous-error
common: fix ambiguous error when using gcc 13

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Kefu Chai <tchaikov@gmail.com>
2024-01-26 14:42:06 +08:00
Yingxin
3e190e5614
Merge pull request #54896 from cyx1231st/wip-crimson-save-conn-foreign-copy
crimson/osd: drop a foreign-copy to shard-0 for every pg operation

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
Reviewed-by: Matan Breizman <mbreizma@redhat.com>
2024-01-26 13:47:37 +08:00
Casey Bodley
ecb4eb14e5
Merge pull request #52496 from adamemerson/wip-rgw-surface-neorados
rgw: Surface neorados

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>
2024-01-26 02:43:44 +00:00
dependabot[bot]
e03f8a8c16
.github: Bump actions/labeler from 4.0.2 to 5.0.0
Bumps [actions/labeler](https://github.com/actions/labeler) from 4.0.2 to 5.0.0.
- [Release notes](https://github.com/actions/labeler/releases)
- [Commits](5c7539237e...8558fd7429)

---
updated-dependencies:
- dependency-name: actions/labeler
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-01-25 23:57:22 +00:00
dependabot[bot]
5ae5925a92
.github: Bump gregsdennis/dependencies-action from 1.2.3 to 1.3.2
Bumps [gregsdennis/dependencies-action](https://github.com/gregsdennis/dependencies-action) from 1.2.3 to 1.3.2.
- [Release notes](https://github.com/gregsdennis/dependencies-action/releases)
- [Commits](80b5ffec56...f98d55eee1)

---
updated-dependencies:
- dependency-name: gregsdennis/dependencies-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-01-25 23:57:13 +00:00
Laura Flores
fc4ff1796e
Merge pull request #55308 from ljflores/wip-dependabot
2024-01-25 17:56:29 -06:00
zdover23
77fbe9ead3
Merge pull request #55307 from zdover23/wip-doc-2024-01-25-radosgw-admin-usage
doc/radosgw: edit "Usage" admin.rst

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
2024-01-26 09:24:13 +10:00
Zac Dover
d8df6f61e8 doc/radosgw: edit "Usage" admin.rst
Edit "Usage" in doc/radosgw/admin.rst.

Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
2024-01-26 09:12:59 +10:00
Casey Bodley
93d158711e
Merge pull request #55315 from cbodley/wip-moncommand-dencoder
mon: zero-initialize MonCommand::flags

Reviewed-by: Kefu Chai <tchaikov@gmail.com>
2024-01-25 17:12:07 +00:00
Casey Bodley
5d0477eb1b qa/tempest: use default tempurl_digest_hashlib=sha256
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2024-01-25 11:26:07 -05:00
Patrick Donnelly
5acd763010
qa: use centos 9.stream for cephfs stock kernel testing
RHEL8 is no longer supported in Squid. RHEL9 is not yet available in FOG.

Fixes: https://tracker.ceph.com/issues/64085
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2024-01-25 11:15:52 -05:00
Ramana Raja
fcbf7367d2 rbd-nbd: map using netlink interface by default
Mapping RBD images to nbd devices using the ioctl interface is not
robust. It was discovered that the device size or the md5 checksum
of the nbd device was incorrect immediately after mapping with the
ioctl method. The issue was not encountered when the nbd netlink
interface was used to map RBD images. Switch to the nbd netlink
interface for mapping.

Fixes: https://tracker.ceph.com/issues/64063
Signed-off-by: Ramana Raja <rraja@redhat.com>
2024-01-25 11:00:59 -05:00
Matan Breizman
6a130a7007 crimson/os/cyanstore: support OP_RMCOLL
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
2024-01-25 15:33:45 +00:00
Guillaume Abrioux
c07482a86a node-proxy: collect LocationIndicatorActive property (storage)
This makes node-proxy collect the `LocationIndicatorActive`
property for the storage component.
This may be needed for the Blinkenlight feature.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
2024-01-25 15:07:21 +00:00
Guillaume Abrioux
316d032148 node-proxy: add new attribute to BaseRedfishSystem()
This adds `self.component_list()` in order to parametrize
which categories the agent will collect.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
2024-01-25 15:07:21 +00:00
Guillaume Abrioux
0dd7364364 node-proxy: add packaging related changes
This adds the required changes to build an RPM of node-proxy.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
2024-01-25 15:07:21 +00:00
Guillaume Abrioux
a724e4cfcf node-proxy: reduce log level in reporter agent
The following messages are logged very often, although they are
not very useful in a normal situation:

```
2024-01-12 09:09:40,604 - reporter - INFO - data ready to be sent to the mgr.
2024-01-12 09:09:40,604 - reporter - INFO - no diff, not sending data to the mgr.
2024-01-12 09:10:15,022 - reporter - INFO - data ready to be sent to the mgr.
2024-01-12 09:10:15,022 - reporter - INFO - no diff, not sending data to the mgr.
...
```

This commit changes the log level to DEBUG.
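
The effect of such a change can be sketched with Python's `logging` module. The logger name `reporter_sketch` and the second message are hypothetical; only the "no diff" message comes from the log excerpt above.

```python
import io
import logging

# Capture output in memory so the effect of the level change is visible.
stream = io.StringIO()
logger = logging.getLogger("reporter_sketch")
logger.addHandler(logging.StreamHandler(stream))
logger.setLevel(logging.INFO)
logger.propagate = False

# Routine chatter is emitted at DEBUG, so an INFO-level logger drops it.
logger.debug("no diff, not sending data to the mgr.")
# A hypothetical message that is still worth reporting at INFO.
logger.info("configuration reloaded")

output = stream.getvalue()
```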

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
2024-01-25 15:07:21 +00:00
Guillaume Abrioux
c3682f0e66 node-proxy: fix a thread/locking issue
This `sleep(5)` should start *after* the lock is released.
Otherwise, it can cause trouble for the reporter loop, which may
never be able to acquire the lock.
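
The pattern can be sketched with two threads that share one lock. This is an illustration under assumptions, not the node-proxy code: the function names, the loop counts, and the short pause are all made up.

```python
import threading
import time

lock = threading.Lock()
updates = []
reports = []


def updater(rounds: int, pause: float) -> None:
    for i in range(rounds):
        with lock:
            updates.append(i)  # keep the critical section short
        time.sleep(pause)      # sleep only once the lock is released


def reporter(rounds: int, pause: float) -> None:
    for _ in range(rounds):
        with lock:
            reports.append(len(updates))
        time.sleep(pause)


threads = [
    threading.Thread(target=updater, args=(3, 0.01)),
    threading.Thread(target=reporter, args=(3, 0.01)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

If the updater slept inside the `with lock:` block instead, it would hold the lock for almost its entire loop and leave the reporter very little opportunity to acquire it.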

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
2024-01-25 15:07:21 +00:00
Guillaume Abrioux
c4675a6c97 node-proxy: address a typo
While checking logs, I noticed the following message:

```
2024-01-12 09:08:03,751 - reporter - INFO - Reporter url set to https:10.10.10.11:7150/node-proxy/data
```

Although this is only a cosmetic issue as this variable
is only used for logging messages, let's fix it.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
2024-01-25 15:07:21 +00:00
Guillaume Abrioux
3b8c945a6a node-proxy: make it a separate daemon
The current implementation requires the inclusion of all the recent
modifications in the cephadm binary, which won't be backported.

Since we need the node-proxy code backported to reef, let's move the
code and make it a separate daemon.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Co-authored-by: Adam King <adking@redhat.com>
2024-01-25 15:07:21 +00:00
Guillaume Abrioux
f99decff89 node-proxy: rename attribute and class
This renames the mgr's NodeProxyCache attribute from
`self.node_proxy` to `self.node_proxy_cache`, and renames the
class `NodeProxy` in agent.py to `NodeProxyEndpoint`,
to make the code clearer and avoid confusion.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
2024-01-25 15:07:21 +00:00
Guillaume Abrioux
da347d235d node-proxy: enhance debug log messages for locking operations
This commit updates the debug log messages in the BaseRedfishSystem
and Reporter classes. The adjustments made enhance the clarity and
precision of the messages by specifically identifying acquired
and released locks, detailing their context, thereby improving the
understanding of the control flow during locking operations
in these components.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
2024-01-25 15:07:21 +00:00
Guillaume Abrioux
a77a13f6af node-proxy: explicitly set NodeProxy's attributes
The current logic using `setattr()` makes mypy complain:

"NodeProxy" has no attribute "xxx"

Using `self.__dict__['xxx']` addresses this mypy error, but the
downside is that the code becomes less clear and less readable.

Explicitly setting the different attributes makes the code clearer
and more readable.
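
The contrast can be sketched as follows. Both classes and their attribute names (`host`, `port`) are hypothetical and are not taken from node-proxy.

```python
from typing import Any


class DynamicConfig:
    """Attributes created with setattr(): a static checker cannot see them."""

    def __init__(self, **kwargs: Any) -> None:
        for key, value in kwargs.items():
            setattr(self, key, value)


class ExplicitConfig:
    """Attributes declared one by one: visible to mypy and to readers."""

    def __init__(self, host: str, port: int) -> None:
        self.host: str = host
        self.port: int = port


dynamic = DynamicConfig(host="localhost", port=8080)
explicit = ExplicitConfig(host="localhost", port=8080)
```

Both objects behave the same at run time; the difference is that a static checker such as mypy can verify `explicit.port`, whereas it reports `dynamic.port` as a missing attribute.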

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
2024-01-25 15:07:21 +00:00
Guillaume Abrioux
56a939af49 cephadm/tests: add pyyaml dependency
node-proxy requires this dependency, so it needs to be added as a
dependency for tox testing.

Typical failure:

```
ImportError while importing test module '/root/ceph/src/cephadm/tests/test_agent.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/lib64/python3.9/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/test_agent.py:10: in <module>
    _cephadm = import_cephadm()
tests/fixtures.py:14: in import_cephadm
    import cephadm as _cephadm
cephadm.py:32: in <module>
    from cephadmlib.node_proxy.main import NodeProxy
cephadmlib/node_proxy/main.py:2: in <module>
    from .redfishdellsystem import RedfishDellSystem
cephadmlib/node_proxy/redfishdellsystem.py:2: in <module>
    from .baseredfishsystem import BaseRedfishSystem
cephadmlib/node_proxy/baseredfishsystem.py:2: in <module>
    from .basesystem import BaseSystem
cephadmlib/node_proxy/basesystem.py:2: in <module>
    from .util import Config
cephadmlib/node_proxy/util.py:2: in <module>
    import yaml
E   ModuleNotFoundError: No module named 'yaml'
```

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
2024-01-25 15:07:21 +00:00
Guillaume Abrioux
17c9a52507 node-proxy: send oob management requests to the MgrListener()
Note that this is not true out-of-band management:
if the host hangs, it won't work. The OOB management
interface should be reached directly, but most of the time
the OOB network is isolated. The idea is to send queries to
the TCP server exposed by the cephadm agent (MgrListener) so that
it can itself send queries to the Redfish API using the IP address
exposed on the OS.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
2024-01-25 15:07:21 +00:00
Guillaume Abrioux
04f8b5b85c cephadm: add types-PyYAML dependency in mypy testing
In order to address the following error:

```
cephadmlib/node_proxy/util.py:2: error: Library stubs not installed for "yaml" (or incompatible with Python 3.9)
cephadmlib/node_proxy/util.py:2: note: Hint: "python3 -m pip install types-PyYAML"
cephadmlib/node_proxy/util.py:2: note: (or run "mypy --install-types" to install all missing stub packages)
cephadmlib/node_proxy/util.py:2: note: See https://mypy.readthedocs.io/en/stable/running_mypy.html#missing-imports
```

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
2024-01-25 15:07:21 +00:00
Guillaume Abrioux
f9a6467fdc node-proxy: address flake8 errors in tests
This addresses a lot of flake8 errors in node-proxy tests:

E121 continuation line under-indented for hanging indent

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
2024-01-25 15:07:21 +00:00
Guillaume Abrioux
f262579ef0 node-proxy: move the output formatting logic to orchestrator
Implementing this in the cephadm module doesn't follow the general idea
of the orchestrator interface. This is where the output formatting should
be done so let's move the logic to the orchestrator module.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
2024-01-25 15:07:21 +00:00
Guillaume Abrioux
e45cf32511 node-proxy: address a typing issue in agent.NodeProxy.query()
The current logic supports both str and bytes types for the parameter
`data`. This doesn't make sense, so let's drop this logic.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
2024-01-25 15:07:21 +00:00