Commit Graph

126754 Commits

Author SHA1 Message Date
Adam King
f0139dd983 mgr/cephadm: offline host handling improvements for agent
Signed-off-by: Adam King <adking@redhat.com>
2021-09-24 07:23:51 -04:00
Adam King
c8b41564cd mgr/cephadm: handle use_agent being turned on and off
Signed-off-by: Adam King <adking@redhat.com>
2021-09-24 07:23:51 -04:00
Adam King
f15e1ec255 mgr/cephadm: better handling of offline hosts with agent
Signed-off-by: Adam King <adking@redhat.com>
2021-09-24 07:23:51 -04:00
Adam King
c35375948e mgr/cephadm: convert networks from set to list + don't reset con on down agent hosts
networks need to be a list so they can be encoded in a json string
resetting con on the hosts where agent isn't reporting (possibly offline hosts)

Signed-off-by: Adam King <adking@redhat.com>
2021-09-24 07:23:51 -04:00
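
The first point in the `c35375948e` message above can be illustrated with a small Python sketch: the standard `json` module refuses to encode a `set`, so a collection of networks has to be turned into a list first. The sample network values and variable names are made up for illustration and are not taken from the cephadm code.

```python
import json

# Hypothetical host networks held in a set (sample values, not from the commit).
networks = {"10.0.0.0/24", "192.168.1.0/24"}

try:
    json.dumps({"networks": networks})
    encoded_as_set = True
except TypeError:
    # json.dumps raises TypeError because set is not JSON serializable.
    encoded_as_set = False

# Converting to a (sorted) list makes the payload JSON-serializable.
payload = json.dumps({"networks": sorted(networks)})
decoded = json.loads(payload)
```

Sorting is optional; it only keeps the encoded string stable between runs.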
Adam King
eeb18384c3 mgr/cephadm: add ceph volume to metadata agent reports
Signed-off-by: Adam King <adking@redhat.com>
2021-09-24 07:23:51 -04:00
Adam King
d7072d87ec mgr/cephadm: implement 2-way ssl in mgr -> MgrListener comm line
Signed-off-by: Adam King <adking@redhat.com>
2021-09-24 07:23:51 -04:00
Adam King
be72463d03 cephadm: allow mgr listener to handle variable length json strings
Signed-off-by: Adam King <adking@redhat.com>
2021-09-24 07:23:50 -04:00
Adam King
78983ad0d0 mgr/cephadm: cephadm agent 2.0
Creates an http endpoint in mgr/cephadm to receive
http requests and an agent that can be deployed on
each host that will gather metadata on the host and
send it to the mgr/cephadm http endpoint. Should save the
cephadm mgr module a lot of time it would have to spend
repeatedly ssh-ing into each host to gather the metadata
and help performance on larger clusters.

Fixes: https://tracker.ceph.com/issues/51004

Signed-off-by: Adam King <adking@redhat.com>
2021-09-24 07:23:50 -04:00
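
The push model described in `78983ad0d0` above (an agent on each host sends metadata to an HTTP endpoint, instead of the manager polling every host over ssh) can be sketched roughly as follows. This is a minimal standard-library illustration, not the cephadm implementation: the handler class, the `agent_report` helper, the payload layout and the sample metadata are all assumptions, and the sketch leaves out SSL entirely.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

received = {}  # host name -> latest metadata reported by that host's agent


class MetadataHandler(BaseHTTPRequestHandler):
    """Hypothetical endpoint: accepts one JSON report per POST request."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        report = json.loads(self.rfile.read(length))
        received[report["host"]] = report["metadata"]
        self.send_response(200)
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the example quiet


def agent_report(url, host, metadata):
    """Hypothetical agent side: POST this host's metadata to the endpoint."""
    body = json.dumps({"host": host, "metadata": metadata}).encode()
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"})
    # An empty ProxyHandler avoids routing the loopback request via a proxy.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request) as response:
        return response.status


# Port 0 lets the OS pick a free port for the demonstration.
server = HTTPServer(("127.0.0.1", 0), MetadataHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

status = agent_report(f"http://127.0.0.1:{server.server_port}/",
                      "host1", {"networks": ["10.0.0.0/24"]})

server.shutdown()
server.server_close()
```

With this layout the endpoint only stores whatever arrives, so gathering cost is paid on each host rather than by one process connecting to every host in turn.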
Ernesto Puerta
17f3685d40
Merge pull request #43037 from ljflores/perf-histogram-formatting
mgr/dashboard: improve formatting of histograms in Telemetry preview form

Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
Reviewed-by: Yaarit Hatuka <yaarithatuka@gmail.com>
2021-09-24 11:55:59 +02:00
Samuel Just
080b0e12fd
Merge pull request #43295 from liu-chunmei/fix_store_nbd
crimson/store-nbd: fix store_nbd build error for futurized store mkfs

Reviewed-by: Samuel Just <sjust@redhat.com>
2021-09-23 23:29:12 -07:00
chunmei-liu
75d95dcda3 crimson/store-nbd: fix store_nbd build error for futurized store mkfs
Signed-off-by: chunmei-liu <chunmei.liu@intel.com>
2021-09-23 21:20:39 -07:00
Samuel Just
4a9f1d7909
Merge pull request #43261 from rzarzynski/wip-crimson-mkfs_ertr
crimson/{common,os,osd}: errorate the FuturizedStore::mkfs() paths

Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2021-09-23 19:49:08 -07:00
Samuel Just
d8ee8f03b9
Merge pull request #43288 from rzarzynski/wip-crimson-skip-our-frames-in-bt
crimson/common: skip first 4 frames when dumping a backtrace.

Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
Reviewed-by: Ronen Friedman <rfriedma@redhat.com>
2021-09-23 19:48:04 -07:00
Sebastian Wagner
166cd9976c
Merge pull request #43275 from adk3798/maint-target
mgr/cephadm: base maintenance mode enter/exit success/failure on returned message

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
2021-09-23 22:15:28 +02:00
Sebastian Wagner
d5d1f154bc
Merge pull request #43121 from mgfritch/cephadm-pull-error
cephadm: raise error during `pull` failure

Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Daniel Pivonka <dpivonka@redhat.com>
2021-09-23 22:12:01 +02:00
Sebastian Wagner
44eb16dd13
Merge pull request #43062 from strenuous-life/wip-cephadm-zap-device
mgr/cephadm: osd should not be zap when it is running

Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
2021-09-23 22:11:30 +02:00
Kamoltat Sirivadhna
e3bbfcfd2e
Merge pull request #37544 from kamoltat/wip-mgr-progress-global-efficiency
mgr/progress: optimize global recovery module

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2021-09-23 14:12:16 -04:00
Laura Flores
94fbcefaf1 mgr/dashboard: improve unittest
Signed-off-by: Laura Flores <lflores@redhat.com>
2021-09-23 17:34:05 +00:00
Radoslaw Zarzynski
295268a11a tests/crimson: make the virtual methods of SeaStoreTestState final.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2021-09-23 17:27:39 +00:00
Radoslaw Zarzynski
c04f5a2acf crimson/os: workaround the segfaulting GCC 11 issue.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2021-09-23 17:27:39 +00:00
Radoslaw Zarzynski
e5f7ff249d crimson/osd, crimson/os: errorate the FuturizedStore::mkfs() paths.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2021-09-23 17:27:27 +00:00
Radoslaw Zarzynski
56208ed326 crimson/common: skip first 4 frames when dumping a backtrace.
It's all about these items:

```
 0# print_backtrace(std::basic_string_view<char, std::char_traits<char> >) at /home/rzarzynski/ceph1/build/../src/crimson/common/fatal_signal.cc:80
 1# FatalSignal::signaled(int, siginfo_t const&) at /opt/rh/gcc-toolset-9/root/usr/include/c++/9/ostream:570
 2# FatalSignal::install_oneshot_signal_handler<11>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) at /home/rzarzynski/ceph1/build/../src/crimson/common/fatal_signal.cc:62
 3# 0x00007F16BBA13B30 in /lib64/libpthread.so.0
```

They are part of our backtrace handling and typically developers
are not interested in them. Let's be consistent with the classical
OSD and hide them.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2021-09-23 16:44:50 +00:00
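
The idea in `56208ed326` above, hiding the frames that belong to the backtrace machinery itself, can be shown with a Python analogue. This is only an illustration of the technique: the crimson change is C++, and the function names below (`capture_backtrace`, `signal_handler`, `buggy_code`) are hypothetical.

```python
import traceback


def capture_backtrace(skip_innermost=0):
    """Return function names of the current call stack, innermost first,
    dropping the `skip_innermost` frames closest to this helper."""
    stack = traceback.extract_stack()  # oldest frame first, this function last
    names = [frame.name for frame in reversed(stack)]
    return names[skip_innermost:]


def signal_handler():
    # Stand-in for the handler machinery. Frame 0 is capture_backtrace and
    # frame 1 is this function, so skipping 2 hides both from the report.
    return capture_backtrace(skip_innermost=2)


def buggy_code():
    return signal_handler()


frames = buggy_code()
```

Here `frames` starts at `buggy_code`, the first frame a developer would actually care about.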
Samuel Just
2612268e0d
Merge pull request #43262 from rzarzynski/wip-crimson-alienstore-fix-oncommit
crimson/os/alienstore: fix nullptr deref in OnCommit::finish().

Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2021-09-22 22:59:57 -07:00
Adam King
393330bae6 mgr/cephadm: base maintenance enter/exit success off of return message
rather than on whether there was any stdout from the command

Signed-off-by: Adam King <adking@redhat.com>
2021-09-22 20:21:10 -04:00
Adam King
5350d58b9a mgr/cephadm: unit tests for maintenance enter/exit properly handling success/failure messages
Signed-off-by: Adam King <adking@redhat.com>
2021-09-22 20:21:10 -04:00
Adam King
73e6aa8d9c cephadm: unit tests for maintenance mode return values
Signed-off-by: Adam King <adking@redhat.com>
2021-09-22 20:21:02 -04:00
Adam King
8bdddfa02e cephadm: fix mypy complaints for ThreadedChildWatcher class
Signed-off-by: Adam King <adking@redhat.com>
2021-09-22 17:50:44 -04:00
Adam King
3a15f1dc81 cephadm: fix exiting maintenance when systemd target doesn't exist
If the systemd target doesn't exist we need to just bypass enabling
it and return success or the host will just be stuck in maintenance
mode.

Signed-off-by: Adam King <adking@redhat.com>
2021-09-22 17:50:44 -04:00
Laura Flores
9158cbe9da mgr/dashboard: fix linting in unittest
Signed-off-by: Laura Flores <lflores@redhat.com>
2021-09-22 21:36:58 +00:00
Laura Flores
f7d6642f5e mgr/dashboard: clarify comment and variable name
Signed-off-by: Laura Flores <lflores@redhat.com>
2021-09-22 19:29:11 +00:00
Kamoltat
fa92db1b37 mgr/progress: optimize global recovery module
Instead of fetching `pg_stats` from the python
part of manager module, we filter out the pgs
that are in active + clean state in ActivePyModules.cc
then parse these pgs along with `reported_epoch` and
the `total_num_pgs` of the clusters to global recovery
module.

Signed-off-by: Kamoltat <ksirivad@redhat.com>
2021-09-22 19:18:31 +00:00
Laura Flores
484126b988 mgr/dashboard: add unit test for telemetry replacer method
This unit test checks that the "replacer" method in telemetry.component.ts works as it should. The replacer method takes the telemetry report and changes the ranges and values of the 'osd_perf_histograms' field from arrays to strings, thereby making the report more readable in the Dashboard Telemetry Preview.

This unit test needs improvement since it currently uses a test report rather than the real one.

Signed-off-by: Laura Flores <lflores@redhat.com>
2021-09-22 18:49:57 +00:00
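
The transformation described in `484126b988` above (turning the ranges and values arrays of `osd_perf_histograms` into strings so the preview is easier to read) can be sketched as below. The real replacer lives in `telemetry.component.ts`; this is a Python analogue, and the helper name and the sample report layout are assumptions made for the example, not the actual telemetry schema.

```python
import json


def stringify_histogram_arrays(node):
    """Return a copy of `node` in which list values stored under the keys
    'ranges' or 'values' are rendered as compact one-line strings."""
    if isinstance(node, dict):
        result = {}
        for key, value in node.items():
            if key in ("ranges", "values") and isinstance(value, list):
                result[key] = json.dumps(value, separators=(",", ":"))
            else:
                result[key] = stringify_histogram_arrays(value)
        return result
    if isinstance(node, list):
        return [stringify_histogram_arrays(item) for item in node]
    return node


# Hypothetical, heavily simplified report used only for this sketch.
report = {
    "osd_perf_histograms": [
        {
            "axes": [{"name": "latency", "ranges": [[0, 99], [100, 199]]}],
            "values": [[1, 0], [3, 2]],
        }
    ],
    "other": {"values": [1, 2]},
}

# Only the histogram field is rewritten; the rest of the report is untouched.
preview = dict(
    report,
    osd_perf_histograms=stringify_histogram_arrays(report["osd_perf_histograms"]),
)
```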
Radoslaw Zarzynski
9c3f51e99b crimson/os/alienstore: fix nullptr deref in OnCommit::finish().
`seastar::engine()` is available only for Seastar's threads;
it shouldn't be called outside of a reactor thread.
Unfortunately, this assumption is violated in `AlienStore`
where `OnCommit::finish()`, executed from a finisher thread
of `BlueStore`, calls `alien()` on `seastar::engine()`.
The net effect is crashes like the following one:

```
INFO  2021-09-22 14:26:33,214 [shard 0] osd - operator() writing superblock cluster_fsid 1d8f7908-2ebf-4a91-ae70-f445668c126b osd_fsid 4da9fe9a-1da5-4ea9-aa79-a1178165ede5         [381/1839]
Segmentation fault.
Backtrace:
 0# print_backtrace(std::basic_string_view<char, std::char_traits<char> >) at /home/rzarzynski/ceph1/build/../src/crimson/common/fatal_signal.cc:80
 1# FatalSignal::signaled(int, siginfo_t const&) at /opt/rh/gcc-toolset-9/root/usr/include/c++/9/ostream:570
 2# FatalSignal::install_oneshot_signal_handler<11>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) at /home/rzarzynski/ceph1/build/../src/crimson/common/fatal_signal.cc:62
 3# 0x00007F16BBA13B30 in /lib64/libpthread.so.0
 4# (anonymous namespace)::OnCommit::finish(int) at /home/rzarzynski/ceph1/build/../src/crimson/os/alienstore/alien_store.cc:53
 5# Context::complete(int) at /home/rzarzynski/ceph1/build/../src/include/Context.h:100
 6# Finisher::finisher_thread_entry() at /home/rzarzynski/ceph1/build/../src/common/Finisher.cc:65
 7# 0x00007F16BBA0915A in /lib64/libpthread.so.0
 8# clone in /lib64/libc.so.6
Dump of siginfo:
  ...
  si_addr: 0x10
```

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2021-09-22 17:04:12 +00:00
Sébastien Han
d855dbb4d8
Merge pull request #40728 from guits/doc_fix_pattern_rgw_nfs
doc/rgw-nfs: use same pattern for keyring name
2021-09-22 17:08:31 +02:00
Samuel Just
e89ed326b5
Merge pull request #43254 from cyx1231st/wip-seastore-fix-onode-order
crimson/onode-staged-tree: convert hash to the reversed version

Reviewed-by: Samuel Just <sjust@redhat.com>
2021-09-22 07:05:01 -07:00
Kefu Chai
143fa9b3f6
Merge pull request #43249 from cyx1231st/wip-seastore-fix-omap-hint
crimson/os/seastore: add missing hints in omap tree

Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
2021-09-22 21:47:37 +08:00
Daniel Gryniewicz
254bf8e883
Merge pull request #43055 from soumyakoduri/wip-skoduri-lua
rgw/lua: Install the packages only for RadosStore

Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
2021-09-22 08:12:23 -04:00
Daniel Gryniewicz
a5ffc44435
Merge pull request #43054 from soumyakoduri/wip-skoduri-dbstore-vstart
rgw: Add option to configure backend store

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
2021-09-22 08:11:30 -04:00
Daniel Gryniewicz
2bdef857fc
Merge pull request #42911 from soumyakoduri/wip-skoduri-dbstore-object
rgw/dbstore object APIs

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
2021-09-22 08:10:52 -04:00
Sebastian Wagner
a2d9839f58
Merge pull request #43241 from sebastian-philipp/suites-orch-labelere
.github: fix path to cephadm suite

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-09-22 11:34:35 +02:00
Yingxin Cheng
0c1c972866 crimson/onode-staged-tree: consolidate laddr hint calculation
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2021-09-22 15:55:09 +08:00
Yingxin Cheng
179f826919 crimson/onode-staged-tree: convert hash to the reversed version
Store the reversed version of object hash to make sure that onodes in
the same PG are sorted together.

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2021-09-22 15:55:09 +08:00
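
Why reversing the hash groups onodes by PG, as `179f826919` above states, can be seen in a small Python sketch. It assumes a simplified model in which the PG of an object is given by the low bits of its 32-bit hash; the constant `PG_BITS`, the helper names and the sample hashes are all invented for the illustration and are not taken from the crimson code.

```python
def reverse_bits32(value):
    """Reverse the bit order of a 32-bit unsigned integer."""
    result = 0
    for _ in range(32):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


PG_BITS = 2  # assumption: 4 PGs, PG id = low 2 bits of the object hash


def pg_of(hash_value):
    return hash_value & ((1 << PG_BITS) - 1)


# Made-up object hashes belonging to PG 1 and PG 2.
hashes = [0x00000001, 0x80000002, 0x40000001,
          0x00000002, 0xC0000001, 0x10000002]

# Sorted by the raw hash, objects from different PGs interleave.
by_raw = [pg_of(h) for h in sorted(hashes)]

# Sorted by the reversed hash, the PG bits become the most significant
# bits of the sort key, so each PG forms one contiguous run.
by_reversed = [pg_of(h) for h in sorted(hashes, key=reverse_bits32)]
```

In this model `by_raw` alternates between the two PGs while `by_reversed` lists all objects of one PG before any object of the other.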
Yingxin Cheng
18bf246672 crimson/onode-staged-tree: print crush-hash in hex mode
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2021-09-22 15:55:01 +08:00
Yingxin Cheng
30f22ecb02 crimson/os/seastore: add missing hints in omap tree
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2021-09-22 11:23:34 +08:00
Samuel Just
86b9f03094
Merge pull request #43247 from rzarzynski/wip-crimson-ertr-safe_then_unpack
crimson/common: add safe_then_unpack() to errorated futures

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
2021-09-21 17:12:58 -07:00
Radoslaw Zarzynski
710b928f9d crimson/common: implement singleton_ec.
Unfortunately, GCC explodes when it sees that.

```
[rzarzynski@o06 build]$ ninja crimson-osd vstart
[1/3] Building CXX object src/crimson/os/cyanstore/CMakeFiles/crimson-cyanstore.dir/cyan_store.cc.o
FAILED: src/crimson/os/cyanstore/CMakeFiles/crimson-cyanstore.dir/cyan_store.cc.o
/usr/bin/ccache /opt/rh/gcc-toolset-9/root/usr/bin/c++ -DBOOST_ASIO_DISABLE_THREAD_KEYWORD_EXTENSION -DBOOST_ASIO_USE_TS_EXECUTOR_AS_DEFAULT -DHAVE_CONFIG_H -DSEASTAR_API_LEVEL=6 -DWITH_SEASTAR=1 -D_FILE_OFFSET_BITS=64 -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -D_REENTRANT -D_THREAD_SAFE -D__CEPH__ -D__STDC_FORMAT_MACROS -D__linux__ -Isrc/include -I../src -I../src/seastar/include -Isrc/seastar/gen/include -isystem boost/include -isystem include -isystem ../src/xxHash -isystem ../src/rapidjson/include -O2 -g -DNDEBUG -fPIC   -U_FORTIFY_SOURCE -Wall -fno-strict-aliasing -fsigned-char -Wtype-limits -Wignored-qualifiers -Wpointer-arith -Werror=format-security -Winit-self -Wno-unknown-pragmas -Wnon-virtual-dtor -Wno-ignored-qualifiers -ftemplate-depth-1024 -Wpessimizing-move -Wredundant-move -Wstrict-null-sentinel -Woverloaded-virtual -fno-new-ttp-matching -fstack-protector-strong -fdiagnostics-color=auto -fno-builtin-malloc -fno-builtin-calloc -fno-builtin-realloc -fno-builtin-free -ftemplate-backtrace-limit=0 -Wno-non-virtual-dtor -std=gnu++17 -U_FORTIFY_SOURCE -Wno-maybe-uninitialized -DSEASTAR_SSTRING -Wno-error=unused-result -std=c++17 -MD -MT src/crimson/os/cyanstore/CMakeFiles/crimson-cyanstore.dir/cyan_store.cc.o -MF src/crimson/os/cyanstore/CMakeFiles/crimson-cyanstore.dir/cyan_store.cc.o.d -o src/crimson/os/cyanstore/CMakeFiles/crimson-cyanstore.dir/cyan_store.cc.o -c ../src/crimson/os/cyanstore/cyan_store.cc
during IPA pass: inline
../src/crimson/os/cyanstore/cyan_store.cc: In member function ‘std::string crimson::os::singleton_ec<MsgV>::this_error_category::message(int) const [with const char* MsgV = (& msg)]’:
../src/crimson/os/cyanstore/cyan_store.cc:97:17: internal compiler error: Segmentation fault
   97 |     std::string message([[maybe_unused]] const int ev) const final {
      |                 ^~~~~~~
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://bugzilla.redhat.com/bugzilla> for instructions.
Preprocessed source stored into /tmp/ccFi4GGw.out file, please attach this to your bugreport.
ninja: build stopped: subcommand failed.
```

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2021-09-21 22:59:09 +00:00
Radoslaw Zarzynski
7f99a88ea9 crimson/common: add safe_then_unpack() to errorated futures.
It was a prerequisite for another commit I finally threw
away. However, this little bit can still be useful even
for the sake of compliance with the interruptible variant.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2021-09-21 21:34:00 +00:00
Radoslaw Zarzynski
06e19d817e crimson/common: assert_moveable() doesn't depend on 3rd party's always_false<>.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2021-09-21 21:28:02 +00:00
Samuel Just
2ec096b19c
Merge pull request #43243 from rzarzynski/wip-crimson-net-dangling-bindvec2
crimson/net: fix dangling addrvec in bind(), the repeat_until_value() part

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
2021-09-21 11:56:12 -07:00
Samuel Just
ed9b233db0
Merge pull request #43209 from rzarzynski/wip-crimson-silent-check-bot
tests/crimson: don't be so verbose when run by the 'make check' bot.

Reviewed-by: Samuel Just <sjust@redhat.com>
2021-09-21 11:55:30 -07:00