mirror of https://github.com/ceph/ceph synced 2025-03-11 02:39:05 +00:00

125095 Commits

Author SHA1 Message Date
Radoslaw Zarzynski
76e5f5caad crimson/osd: prevent premature OSD activation.
In contrast to the classical OSD:

```
int OSD::init()
{
  // ...

  {
    epoch_t bind_epoch = osdmap->get_epoch();
    service.set_epochs(NULL, NULL, &bind_epoch);
  }

  // ...

  // load up pgs (as they previously existed)
  load_pgs();
```

crimson doesn't set the `bind_epoch` when initializing. As a
result, the OSD goes active prematurely, because the 3rd
condition (`bind_epoch < osdmap->get_up_from(whoami)`) is always
true.

```
    if (osdmap->is_up(whoami) &&
        osdmap->get_addrs(whoami) == public_msgr->get_myaddrs() &&
        bind_epoch < osdmap->get_up_from(whoami)) {
      if (state.is_booting()) {
        logger().info("osd.{}: activating...", whoami);
```

Nullifying it effectively turns the "is it activated?" check
into an "is it up?" check. This is problematic in a situation
like:

1. Primary got new OSDMap but replica has not.
2. Replica restarts, sends `MOSDBoot` and receives the newer map
   from the previous point.
3. Primary sends a message that is unexpected by replica.
4. Monitor publishes a new OSDMap driven by the `MOSDBoot`.
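
The effect of an unset `bind_epoch` can be modelled with a minimal Python sketch. It is not crimson code: `OsdMapView`, `should_activate`, and the epoch numbers are hypothetical. It assumes the map the restarted replica receives still lists it as up from its previous incarnation.

```python
from dataclasses import dataclass


@dataclass
class OsdMapView:
    """Hypothetical, simplified stand-in for the OSDMap fields used by the check."""
    is_up: bool
    addrs_match: bool
    up_from: int  # epoch at which the map marked this OSD up


def should_activate(osdmap: OsdMapView, bind_epoch: int) -> bool:
    # Mirrors the three-part condition quoted above.
    return osdmap.is_up and osdmap.addrs_match and bind_epoch < osdmap.up_from


# Map seen by the restarted replica before the monitor handles its MOSDBoot
# (assumption: it is still marked up since epoch 10).
stale_map = OsdMapView(is_up=True, addrs_match=True, up_from=10)

# bind_epoch left unset (0): the third condition is trivially true.
premature = should_activate(stale_map, bind_epoch=0)

# bind_epoch recorded at init (epoch 12): activation waits for a newer up_from.
guarded = should_activate(stale_map, bind_epoch=12)

# Map published after the monitor processed MOSDBoot (up_from = 13).
fresh_map = OsdMapView(is_up=True, addrs_match=True, up_from=13)
activated = should_activate(fresh_map, bind_epoch=12)
```

With `bind_epoch` recorded at init, only the map driven by `MOSDBoot` lets the OSD go active.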

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2021-07-13 14:38:54 +00:00
Kefu Chai
21882e5baf
Merge pull request from ifed01/wip-fix-missing-shared-blob
os/bluestore: fix erroneous SharedBlob record removal during repair.

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2021-07-13 22:15:34 +08:00
Sebastian Wagner
507ee67848
Merge pull request from sebastian-philipp/doc-dev-cephadm-define-vars
doc/dev/cephadm: Define variables

Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2021-07-13 16:06:01 +02:00
Avan Thakkar
f6afed5aa5 mgr/dashboard: remove usage of 'rgw_frontend_ssl_key'
Fixes: https://tracker.ceph.com/issues/51643
Signed-off-by: Avan Thakkar <athakkar@redhat.com>

Removing the usage of rgw_frontend_ssl_key from the rgw service form.
2021-07-13 19:00:38 +05:30
Kefu Chai
e9a18cc0b1
Merge pull request from liewegas/cleanup-blkdev
common/blkdev: remove stray debug output

Reviewed-by: Kefu Chai <kchai@redhat.com>
2021-07-13 21:29:29 +08:00
Kefu Chai
2b8c85965d
Merge pull request from ronen-fr/wip-ronenf-list-object
common/hobject: a minor fix and performance gain to hobjects listing

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2021-07-13 21:23:43 +08:00
Or Ozeri
3a1edf0c32 msg/async/ProtocolV2: optimize append_frame
The commonly used append_frame function currently copies
frame data, incurring expensive heap allocation and data copying.
Instead, switch to claiming the frame data, re-using it without copying.

Signed-off-by: Or Ozeri <oro@il.ibm.com>
2021-07-13 16:03:44 +03:00
Kefu Chai
23180bf437
Merge pull request from sebastian-philipp/options-ms-bind-port-max
common/options: global.yaml: change ms_bind_port_max to 7568

Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2021-07-13 20:25:15 +08:00
Kefu Chai
83906eded6
Merge pull request from neha-ojha/wip-health-cleanup
mon/PGMap: remove get_stuck_counts because there are no callers

Reviewed-by: Kefu Chai <kchai@redhat.com>
2021-07-13 20:24:31 +08:00
Kefu Chai
f6cd3d6341
Merge pull request from tchaikov/tools/kvstore-tool
tools/kvstore_tool: add "std::" before ostream and string

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2021-07-13 20:23:07 +08:00
Kefu Chai
92b02120e4 crimson/tools/store_nbd: handle ECONNABORTED returned by accept()
If we abort the accept() call, an ECONNABORTED is expected and we should
handle it. Otherwise the unhandled exception is noticed by seastar's
reactor, which complains about it.
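
The same pattern can be sketched in Python rather than seastar. `accept_loop` and `make_fake_accept` are hypothetical helpers: an aborted accept ends the loop cleanly, and any other error still propagates.

```python
import errno


def accept_loop(accept):
    """Collect connections until accept() is aborted.

    `accept` is a hypothetical callable standing in for the listener.
    """
    accepted = []
    while True:
        try:
            accepted.append(accept())
        except OSError as e:
            if e.errno == errno.ECONNABORTED:
                # Expected when the pending accept() is aborted on shutdown.
                break
            raise  # anything else is a real failure
    return accepted


def make_fake_accept(connections, final_errno):
    pending = list(connections)

    def fake_accept():
        if pending:
            return pending.pop(0)
        raise OSError(final_errno, "accept failed")

    return fake_accept


result = accept_loop(make_fake_accept(["conn-1", "conn-2"], errno.ECONNABORTED))

try:
    accept_loop(make_fake_accept([], errno.EMFILE))
    other_error_propagated = False
except OSError:
    other_error_propagated = True
```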

Signed-off-by: Kefu Chai <kchai@redhat.com>
2021-07-13 19:43:42 +08:00
Kefu Chai
7094254c4a crimson/tools/store_nbd: call segment_manager->close() after tm->close()
TransactionManager::close() calls into journal->close(), which in turn
calls BlockSegmentManager::segment_close(), which then calls
SegmentStateTracker::write_out().

But BlockSegmentManager::close() closes the underlying seastar::file,
and we cannot write to the file after closing it.

To ensure that a segment can be closed correctly in
TMDriver::close(), this change calls tm->close() before
segment_manager->close().
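
The ordering constraint can be illustrated with a small Python model. All `Fake*` classes and `close_driver` are hypothetical stand-ins, not the crimson types. Closing the transaction manager still has to write segment state, so it must run while the file is open.

```python
class FakeFile:
    """Hypothetical stand-in for the underlying seastar::file."""

    def __init__(self):
        self.closed = False
        self.writes = []

    def write(self, data):
        if self.closed:
            raise ValueError("write after close")
        self.writes.append(data)

    def close(self):
        self.closed = True


class FakeSegmentManager:
    def __init__(self, file):
        self.file = file

    def segment_close(self):
        # Persisting the segment state requires the file to still be open.
        self.file.write("segment-state")

    def close(self):
        self.file.close()


class FakeTransactionManager:
    def __init__(self, segment_manager):
        self.segment_manager = segment_manager

    def close(self):
        # Stands in for journal->close() reaching segment_close().
        self.segment_manager.segment_close()


def close_driver(tm, segment_manager):
    tm.close()               # may still write segment state
    segment_manager.close()  # only now close the file


def make_pair():
    sm = FakeSegmentManager(FakeFile())
    return FakeTransactionManager(sm), sm


tm, sm = make_pair()
close_driver(tm, sm)
correct_order_writes = sm.file.writes

tm2, sm2 = make_pair()
sm2.close()
try:
    tm2.close()
    wrong_order_failed = False
except ValueError:
    wrong_order_failed = True
```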

Signed-off-by: Kefu Chai <kchai@redhat.com>
2021-07-13 19:43:42 +08:00
myoungwon oh
7b669e4af4 osd: fix to recover adjacent clone when set_chunk is called
set_chunk needs adjacent clones to calculate reference count

fixes: https://tracker.ceph.com/issues/51627

Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>
2021-07-13 19:53:17 +09:00
Varsha Rao
387369373c doc/mgr/nfs: update about RGW exports
This patch just moves the section on RGW exports created using the nfs
module to the mgr/nfs document. The RGW requirements will be updated in a
different PR.

Signed-off-by: Varsha Rao <varao@redhat.com>
2021-07-13 15:04:54 +05:30
Varsha Rao
65489e8d3e doc/cephfs/nfs: update about nfs module
Signed-off-by: Varsha Rao <varao@redhat.com>
2021-07-13 15:04:42 +05:30
Varsha Rao
5db2b48f14 doc/mgr/nfs: update cephfs export create command about client and squash arguments
Signed-off-by: Varsha Rao <varao@redhat.com>
2021-07-13 14:37:39 +05:30
Varsha Rao
ce7fef695b doc/mgr/nfs: update nfs links
Signed-off-by: Varsha Rao <varao@redhat.com>
2021-07-13 14:37:39 +05:30
Varsha Rao
29f28563ec doc/mgr/nfs: add missing cluster_id to export info command
Signed-off-by: Varsha Rao <varao@redhat.com>
2021-07-13 14:37:39 +05:30
Varsha Rao
c79b797465 doc/cephfs: move nfs doc under mgr docs
Fixes: https://tracker.ceph.com/issues/51428
Signed-off-by: Varsha Rao <varao@redhat.com>
2021-07-13 14:37:39 +05:30
zdover23
6a44ae0c33
Merge pull request from zdover23/wip-doc-dev-essentials-irc-2021-07-10
doc/dev: add IRC information to dev guide

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-07-13 06:20:29 +10:00
zdover23
cba553b2f5
Merge pull request from zdover23/wip-doc-upgrading-ceph-potential-problems-2021-06-30
doc/cephadm: improve "Potential Problems"

Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
2021-07-13 06:19:30 +10:00
zdover23
64b42b06e0
Merge pull request from zdover23/wip-doc-upgrading-ceph-starting-the-upgrade-2021-06-29
doc/cephadm: improving "Starting the Upgrade"

Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
2021-07-13 06:18:56 +10:00
Laura Flores
cdbd4af471 mgr/telemetry: add debug log for potential empty keys and combine unique rgw perf counters
There was initially a problem with PriorityCache perf counters where part of the name was missing (i.e.
"mon.a.cache_bytes" instead of "mon.a.prioritycache.cache_bytes"). This was fixed in another part of the src code,
but I added a log here to record any future instances of a similar occurrence.

Also: Not every rgw daemon has the same schema. Specifically, each rgw daemon has a uniquely-named collection that starts off identically (i.e. "objecter-0x...") then diverges (i.e. "...55f4e778e140.op_rmw"). I added a bit of code that combines these unique counters all under one rgw instance. Without this check, the schema would remain separated out in the final report.
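
One way to picture the combining step is the sketch below. It is not the telemetry module's code: `normalize`, `combine`, the sample counter names, and the choice to sum values are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Matches the per-daemon hex suffix of an "objecter-0x..." collection name.
_HEX_SUFFIX = re.compile(r"^(objecter)-0x[0-9a-f]+(?=\.)")


def normalize(counter_path):
    """Drop the unique hex part so equivalent collections share one name."""
    return _HEX_SUFFIX.sub(r"\1", counter_path)


def combine(counters):
    """Merge counters whose names differ only by the hex suffix (summing is an assumption)."""
    combined = defaultdict(int)
    for path, value in counters.items():
        combined[normalize(path)] += value
    return dict(combined)


sample = {
    "objecter-0x55f4e778e140.op_rmw": 3,
    "objecter-0x55f4e7791c00.op_rmw": 4,
    "rgw.req": 10,
}
merged = combine(sample)
```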

Signed-off-by: Laura Flores <lflores@redhat.com>
2021-07-12 19:01:38 +00:00
Laura Flores
9e07175b3c common: fix missing name in PriorityCache perf counters
There was a problem with PriorityCache perf counters, where part of the name was missing (i.e. "mon.a.cache_bytes" instead of "mon.a.prioritycache.cache_bytes"). The problem was happening because a 'this' pointer was missing in the original implementation.

Signed-off-by: Laura Flores <lflores@redhat.com>
2021-07-12 18:47:53 +00:00
Sage Weil
9a49a5819e doc/man/8/cephadm: add --log-to-file (and --single-host-defaults)
Signed-off-by: Sage Weil <sage@newdream.net>
2021-07-12 13:45:35 -04:00
Dimitri Savineau
71ba01f018 cephadm: ensure sysctl_dir exist
The sysctl directory might not exist if no package that drops
a custom sysctl file is installed on the host.
Create the directory if it doesn't exist.
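
A minimal sketch of the idea, not the cephadm code itself: `write_sysctl_conf`, the file name, and the setting are hypothetical. It creates the directory idempotently before writing into it.

```python
import os
import tempfile


def write_sysctl_conf(sysctl_dir, name, lines):
    # exist_ok=True makes this safe whether or not the directory is present.
    os.makedirs(sysctl_dir, exist_ok=True)
    path = os.path.join(sysctl_dir, name)
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")
    return path


with tempfile.TemporaryDirectory() as root:
    # The directory does not exist yet, as on a host without such packages.
    target = os.path.join(root, "usr", "lib", "sysctl.d")
    conf = write_sysctl_conf(target, "90-example.conf", ["fs.aio-max-nr = 1048576"])
    created = os.path.isfile(conf)
    # A second call must not fail now that the directory exists.
    write_sysctl_conf(target, "90-example.conf", ["fs.aio-max-nr = 1048576"])
```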

Closes: https://tracker.ceph.com/issues/51620

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2021-07-12 11:40:12 -04:00
Ernesto Puerta
7988c25a3f
Merge pull request from clwluvw/osd-device-details-grafana
monitoring: fix Physical Device Latency unit

Reviewed-by: Waad Alkhoury <walkhour@redhat.com>
Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: mykaul <NOT@FOUND>
Reviewed-by: p-se <NOT@FOUND>
2021-07-12 17:30:19 +02:00
Ernesto Puerta
c17f019e61
Merge pull request from nSedrickm/auth-storage-directive
mgr/dashboard: create directive for AuthStorage service

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2021-07-12 17:03:30 +02:00
Ngwa Sedrick Meh
8f809bd95b mgr/dashboard: create directive for AuthStorage service
This commit adds a directive that can be used to conditionally display elements based on authorization/scopes criteria

Fixes: https://tracker.ceph.com/issues/47355
Signed-off-by: Ngwa Sedrick Meh <nsedrick101@gmail.com>
2021-07-12 13:47:01 +01:00
Michael Fritch
f853ce7e9a
cephadm: use pyfakefs during test_create_daemon_dirs_prometheus
convert test to use the `cephadm_fs` fixture

Signed-off-by: Michael Fritch <mfritch@suse.com>
2021-07-11 22:37:12 -06:00
Sage Weil
ee1ba18606 common/blkdev: remove stray debug output
Signed-off-by: Sage Weil <sage@newdream.net>
2021-07-11 16:42:46 -04:00
Gal Salomon
3e2c8e94fb
Merge pull request from grajoria/master
doc: Correction and improvisation for Timestamp part of the doc
2021-07-11 23:03:02 +03:00
Kefu Chai
c70cee47fc
Merge pull request from smithfarm/wip-51622
rpm: remove macro invocation from comment line

Reviewed-by: Kefu Chai <kchai@redhat.com>
2021-07-10 22:09:49 +08:00
Nathan Cutler
b9c14266a7 rpm: remove macro invocation from comment line
In RPM spec files, comment lines should not include macro invocations,
because RPM can and will expand them, with unpredictable results.
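
A check for this could look like the hypothetical lint below. `find_macros_in_comments` and the sample spec text are illustrative only, and the regular expression is a simplification of RPM macro syntax. It flags comment lines that contain an unescaped `%` starting a macro, and treats `%%` as escaped.

```python
import re

# An unescaped '%' followed by something that can start a macro.
_MACRO = re.compile(r"(?<!%)%(?!%)[{(\[A-Za-z_]")


def find_macros_in_comments(spec_text):
    """Return (line_number, line) for comment lines containing a macro."""
    hits = []
    for lineno, line in enumerate(spec_text.splitlines(), start=1):
        if line.lstrip().startswith("#") and _MACRO.search(line):
            hits.append((lineno, line))
    return hits


sample_spec = "\n".join([
    "# binaries go to %{_bindir}",  # macro in a comment: flagged
    "# 100%% escaped, harmless",    # escaped percent sign: ignored
    "Name: example",
    "%build",                       # not a comment: ignored
])
hits = find_macros_in_comments(sample_spec)
```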

Fixes: https://tracker.ceph.com/issues/51622

Signed-off-by: Nathan Cutler <ncutler@suse.com>
2021-07-10 15:04:07 +02:00
Mark Kogan
e0577da9c4
Merge pull request from wjwithagen/wjw-fix-signal
rgw: Use signaling compatible with POSIX
2021-07-10 15:09:35 +03:00
Zac Dover
1443d486b9 doc/dev: add IRC information to dev guide
In days of yore, the Developer Guide linked to the
IRC page at ceph.io. After the 2021 rewriting of
ceph.io, a new era began, and that page was no
longer easily accessible (it could still be found
at old.ceph.com, but this was deprecated).

Anyway, that IRC information should be included in
the docs. This PR makes sure that the IRC
information is included in the docs.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
2021-07-10 07:32:45 +10:00
Laura Flores
4583cf3203 mgr/telemetry: update gather_perf_counters() to be more accurate
The gather_perf_counters() method has been updated to capture unprocessed, raw data instead of derived averages. Certain areas, such as initializations, have also been improved for readability and consistency.

Signed-off-by: Laura Flores <lflores@redhat.com>
2021-07-09 21:06:52 +00:00
Neha Ojha
7df0d56ac0 mon/PGMap: remove get_stuck_counts because there are no callers
The only callers were removed in 729a08923f,
as a part of the broader health reporting improvements.

Signed-off-by: Neha Ojha <nojha@redhat.com>
2021-07-09 19:26:01 +00:00
Michael Fritch
4a99b771a4
cephadm: use CephadmContext rather than MagicMock
MagicMock hides attribute errors:

```
self = <cephadm.CephadmContext object at 0x7f1121e62370>, name = 'config_json'

    def __getattr__(self, name: str) -> Any:
        if '_conf' in self.__dict__ and hasattr(self._conf, name):
            return getattr(self._conf, name)
        elif '_args' in self.__dict__ and hasattr(self._args, name):
            return getattr(self._args, name)
        else:
>           return super().__getattribute__(name)
E           AttributeError: 'CephadmContext' object has no attribute 'config_json'
```
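
The masking effect can be reproduced in isolation with this sketch. `Context` and `read_config_json` are hypothetical and much simpler than `CephadmContext`. A `MagicMock` returns a new mock for any attribute, while a real object raises `AttributeError`.

```python
from unittest.mock import MagicMock


class Context:
    """Hypothetical minimal context exposing a single real attribute."""

    def __init__(self):
        self.fsid = "example-fsid"


def read_config_json(ctx):
    # Code under test that relies on an attribute the context may lack.
    return ctx.config_json


# With MagicMock the missing attribute is silently created.
mock_result = read_config_json(MagicMock())
mock_hid_error = isinstance(mock_result, MagicMock)

# With a real object the same access fails, exposing the bug.
try:
    read_config_json(Context())
    real_raised = False
except AttributeError:
    real_raised = True
```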

Signed-off-by: Michael Fritch <mfritch@suse.com>
2021-07-09 13:13:24 -06:00
Michael Fritch
25d62794fc
cephadm: use CephadmContext rather than MagicMock
MagicMock hides attribute errors:

```
ctx = <cephadm.CephadmContext object at 0x7f0a12f58eb0>, container_id = 'container_id', daemon_type = 'node-exporter'

    @staticmethod
    def get_version(ctx, container_id, daemon_type):
        # type: (CephadmContext, str, str) -> str
        """
        :param: daemon_type Either "prometheus", "alertmanager" or "node-exporter"
        """
        assert daemon_type in ('prometheus', 'alertmanager', 'node-exporter')
        cmd = daemon_type.replace('-', '_')
        code = -1
        err = ''
        version = ''
        if daemon_type == 'alertmanager':
            for cmd in ['alertmanager', 'prometheus-alertmanager']:
                _, err, code = call(ctx, [
                    ctx.container_engine.path, 'exec', container_id, cmd,
                    '--version'
                ], verbosity=CallVerbosity.DEBUG)
                if code == 0:
                    break
            cmd = 'alertmanager'  # reset cmd for version extraction
        else:
            _, err, code = call(ctx, [
>               ctx.container_engine.path, 'exec', container_id, cmd, '--version'
            ], verbosity=CallVerbosity.DEBUG)
E           AttributeError: 'NoneType' object has no attribute 'path'
```

Signed-off-by: Michael Fritch <mfritch@suse.com>
2021-07-09 13:09:18 -06:00
Igor Fedotov
7090930d4a os/bluestore: fix erroneous SharedBlob record removal during repair.
Fixes: https://tracker.ceph.com/issues/51619

Signed-off-by: Igor Fedotov <ifedotov@suse.com>
2021-07-09 21:31:42 +03:00
Ilya Dryomov
4dae3915a8
Merge pull request from ceph/1625
doc: 16.2.5 Release Notes

Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
2021-07-09 18:06:28 +02:00
Sage Weil
4185376b2b mgr/restful: ignore min/max_size
Signed-off-by: Sage Weil <sage@newdream.net>
2021-07-09 12:00:09 -04:00
David Galloway
fd41af70ef doc: 16.2.5 Release Notes
Signed-off-by: David Galloway <dgallowa@redhat.com>
2021-07-09 17:42:03 +02:00
Sage Weil
a0ec8945f6 test/crush: drop min/max_size refs
Signed-off-by: Sage Weil <sage@newdream.net>
2021-07-09 11:41:55 -04:00
Michael Fritch
53d07362ff
cephadm: add infer_config unit test
Signed-off-by: Michael Fritch <mfritch@suse.com>
2021-07-09 08:11:07 -06:00
Michael Fritch
e6dca29ae7
cephadm: add shell command tests
Signed-off-by: Michael Fritch <mfritch@suse.com>
2021-07-09 08:11:06 -06:00
Michael Fritch
c19fb2568e
cephadm: add infer_fsid unit test
Signed-off-by: Michael Fritch <mfritch@suse.com>
2021-07-09 08:11:03 -06:00
Paul Reece
c83afb4359 Amend b7621625ed to not call url_decode excessively
Fixes: 

Signed-off-by: Paul Reece <paul@servercloud.com>
2021-07-09 10:10:55 -04:00
Sage Weil
df32fa6eff
Merge pull request from liewegas/doc-linode
doc/foundation: add linode
2021-07-09 08:44:51 -05:00