to format size options in the same format supported by our C++
strict_iec_cast() parser, so they are more consistent from the user's
perspective.
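As an illustration only, a formatter of this kind could look like the sketch below. The helper name `format_iec` is hypothetical, and the suffix set (K, M, G, T, P, E as powers of 1024) is an assumption about what the parser accepts, not something taken from this change.

```python
def format_iec(size: int) -> str:
    """Render a byte count using the largest binary suffix that divides it evenly.

    Hypothetical helper: the suffix list below is an assumption about what
    the C++ parser accepts.
    """
    units = ["", "K", "M", "G", "T", "P", "E"]
    if size < 0:
        raise ValueError("size must be non-negative")
    idx = 0
    # Divide by 1024 only while the value stays exact, so no precision is lost.
    while size and size % 1024 == 0 and idx < len(units) - 1:
        size //= 1024
        idx += 1
    return f"{size}{units[idx]}"
```

Values that are not an exact multiple of 1024 are left as plain byte counts, so parsing the formatted string gives back the original number.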
Signed-off-by: Kefu Chai <kchai@redhat.com>
the loop of proc.communicate() on python3.6, where we are always able to
get something out of the stdout and/or stderr PIPEs, so `stdout` and
`stderr` keep growing until we run out of memory, and teuthology considers
the command crashed after a while.
Fixes: https://tracker.ceph.com/issues/50393
Signed-off-by: Kefu Chai <kchai@redhat.com>
Actually, `OpsExecuter` already holds `ObjectContextRef` and even
has a (private till now) getter for `hobject_t` extraction.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
AddCephTest and googletest's CMake scripts also call
find_package(Python3...), but they do not specify the required minor
version of Python3. by default, find_package(Python3...) picks the highest
available python3. so, if we have multiple python3 versions installed in the
system, and the highest python3 version is not the one specified by the
-DWITH_PYTHON3=3.x.y in the cmake command line, we might end up using a
different python3 for the ceph CLI. even worse, the required python3
package might not be available for the python3 interpreter picked by
googletest, as, in general, only a single python3 has full access to the
prepackaged python3-* packages shipped by a GNU/Linux distro.
in this change, the configure_file() calls are rearranged to the top of
src/CMakeLists.txt, so they have less chance to use the "polluted" cmake
variable for their subvars.
this change addresses the test failure where we have, for instance, python3.8
installed on RHEL8/CentOS8, where python3.6 is the python3 which has
access to the python3-* packages.
Signed-off-by: Kefu Chai <kchai@redhat.com>
should leave it to do_cmake.sh to decide which python3 version to use.
there are cases where we have multiple python3 versions installed, but only
one of them is fully supported by the distro, in the sense that python3-*
packages are packaged for that python3.
Signed-off-by: Kefu Chai <kchai@redhat.com>
cmake: let WITH_MGR_ROOK_CLIENT depend on WITH_MGR
Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Willem Jan Withagen <wjw@digiware.nl>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
it does not depend on WITH_MGR_DASHBOARD_FRONTEND, which is disabled by
default and is used to enable/disable the inclusion of dashboard
support, while the rook client is used by the orchestrator. so it should
depend on WITH_MGR, not WITH_MGR_DASHBOARD_FRONTEND.
this change addresses the regression introduced by
1003f1ffee
Signed-off-by: Kefu Chai <kchai@redhat.com>
read_extents in all except one case was used to read a known single extent
-- replace those users with read_extent. store-nbd uses read_extents as
intended, but other users will need to be able to deal with zero mappings.
Signed-off-by: Samuel Just <sjust@redhat.com>
Manifest objects try to recover their clones if the clones
are unreadable when calculating the ref count.
In some cases, the recovery takes more than 150s,
so this commit extends the time from 150s to 300s.
Fixes: https://tracker.ceph.com/issues/50352
Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>
The hosts-overview Grafana dashboard JSON file contains a repeated element, making
it invalid JSON. Some JSON parsers tolerate the repetition, but Jsonnet does not
and fails to parse the dashboard, which prevents the deployment of this dashboard
via Jsonnet.
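For reference, a repetition like this can be spotted with a short script. This is a sketch that assumes the repeated element is a duplicated object key; the helper name `find_duplicate_keys` is hypothetical. Python's `json` module is one of the tolerant parsers: by default it silently keeps the last value for a repeated key, so the check hooks into `object_pairs_hook` to see every key/value pair before they are merged.

```python
import json


def find_duplicate_keys(text: str) -> list:
    """Return the object keys that occur more than once within the same object."""
    duplicates = []

    def check_pairs(pairs):
        # Called once per JSON object with its (key, value) pairs in order.
        seen = set()
        for key, _ in pairs:
            if key in seen:
                duplicates.append(key)
            seen.add(key)
        return dict(pairs)

    json.loads(text, object_pairs_hook=check_pairs)
    return duplicates
```

Running it over the contents of a dashboard file returns an empty list when every object has unique keys.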
Fixes: https://tracker.ceph.com/issues/50410
Signed-off-by: Malcolm Holmes <mdh@odoko.co.uk>
During a teuthology run [1] the following crash happened:
```
rzarzynski@teuthology:/home/teuthworker/archive/rzarzynski-2021-04-08_10:14:11-rados-master-distro-basic-smithi/6028696$ less remote/smithi052/log/ceph-osd.3.log.gz
...
DEBUG 2021-04-08 10:32:58,548 [shard 0] ms - [osd.3(client) v2:172.21.15.52:6813/30889@62168 >> mon.0 v2:172.21.15.52:3300/0] <== #3 === mgrmap(e 4) v1 (1796)
INFO 2021-04-08 10:32:58,549 [shard 0] ms - [osd.3(client) v2:172.21.15.52:6813/30889@62056 >> mgr.4100 v2:172.21.15.52:6800/30259] closing: reset no, replace no
DEBUG 2021-04-08 10:32:58,549 [shard 0] ms - [osd.3(client) v2:172.21.15.52:6813/30889@62056 >> mgr.4100 v2:172.21.15.52:6800/30259] TRIGGER CLOSING, was READY
INFO 2021-04-08 10:32:58,549 [shard 0] ms - [osd.3(client) v2:172.21.15.52:6813/30889@62056 >> mgr.4100 v2:172.21.15.52:6800/30259] execute_ready(): protocol aborted at CLOSING -- std::system_error (error crimson::net:4, read eof)
DEBUG 2021-04-08 10:32:58,549 [shard 0] ms - [osd.3(client) v2:172.21.15.52:6813/30889@62056 >> mgr.4100 v2:172.21.15.52:6800/30259] closed!
Segmentation fault on shard 0.
Backtrace:
0x000000000151765c
0x00000000014d9600
0x00000000014d9902
0x00000000014d9972
/lib64/libpthread.so.0+0x0000000000012b1f
0x0000000000e59cba
0x00000000014dc8a6
0x00000000014cdd1c
0x0000000001503053
0x000000000149fab7
0x00000000006e0ef5
/lib64/libc.so.6+0x00000000000237b2
0x000000000072a23d
daemon-helper: command crashed with signal 11
```
[1]: http://pulpito.front.sepia.ceph.com/rzarzynski-2021-04-08_10:14:11-rados-master-distro-basic-smithi/6028696/
GDB shows that `conn` was null during the execution of `ceph::mgr::report()`:
```
(gdb) frame 7
154 in /usr/src/debug/ceph-17.0.0-2935.g4153f8c2.el8.x86_64/src/crimson/mgr/client.cc
(gdb) print conn
$1 = {_b = 0x0, _p = 0x0}
```
Taken together with the `mgr.4100 v2:172.21.15.52:6800/30259] closed!`
debug line, this suggests that a call to `report()` occurred (likely from
the timer) while we were in the middle of the non-atomic reconnect sequence:
```cpp
seastar::future<> Client::reconnect()
{
  if (conn) {
    conn->mark_down();
    conn = {};
  }
  // ...
  return seastar::sleep(a_while).then([this] {
    // ...
    conn = msgr.connect(peer, CEPH_ENTITY_TYPE_MGR);
  });
}
```
This commit alters `mgr::report()` to skip reporting if the `conn`
is unavailable.
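The window and the guard can be modelled outside of crimson. The sketch below is a Python asyncio analogue, not the actual C++ code: the `Client` class, the `sent` list and the string standing in for a connection are all made up for illustration. It shows a `report()` that fires while `reconnect()` is sleeping and is skipped because `conn` is unset.

```python
import asyncio


class Client:
    """Toy model of the mgr client; every name here is illustrative."""

    def __init__(self):
        self.conn = None
        self.sent = []

    async def reconnect(self, delay):
        self.conn = None             # the old connection is dropped first
        await asyncio.sleep(delay)   # window in which conn is unavailable
        self.conn = "connected"      # stand-in for a real connection object

    def report(self):
        if self.conn is None:
            return False             # skip the report instead of dereferencing
        self.sent.append("report")
        return True


async def demo():
    client = Client()
    task = asyncio.ensure_future(client.reconnect(0.01))
    await asyncio.sleep(0)           # let reconnect() run up to its sleep
    during = client.report()         # fires in the middle of the reconnect
    await task
    after = client.report()          # fires once the connection is back
    return during, after
```

Calling `asyncio.run(demo())` returns a pair: whether the report was sent during the reconnect window and whether it was sent afterwards.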
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
* refs/pull/40888/head:
qa/tasks/cephadm: ignore --keep-logs failure
qa/tasks/cephadm: use yaml.dump_all()
qa/suites/rados/cephadm/smoke-*: use cephadm.wait_for_service
qa/tasks/cephadm: tear down cluster before gathering logs
qa/suites/rados/cephadm/smoke-roleless: test rgw-ingress
mgr/cephadm: remove virtual_ip check during scheduling
mgr/orchestrator: orch ls: leave off virtual_ip prefixlen
qa/tasks/cephadm: add wait_for_service
qa/tasks/cephadm: allow skip_monitor_stack=true
qa/tasks/cephadm: do subst_vip for cephadm.shell and .apply
qa/tasks/vip: add vip task to allocate virtual IPs
qa/suites/rados/cephadm/smoke-roleless: add rgw-ingress test case
qa/tasks/cephadm: shell: take 'all-roles' or 'all-hosts'
qa/tasks/cephadm: let cephadm.shell take string or list
Reviewed-by: Sebastian Wagner <swagner@suse.com>