Commit Graph

128309 Commits

Author SHA1 Message Date
Or Ozeri
044280dcbe librbd/crypto: fix memory leak in ShutDownCryptoRequest
If crypto object dispatch does not exist, a context pointer is leaked.
This commit fixes this issue.

Signed-off-by: Or Ozeri <oro@il.ibm.com>
2021-11-28 13:06:34 +02:00
Or Ozeri
0f61c82d2e test/librbd: fix memory leak in TestMockParentCacheObjectDispatch
fix memory leak in TestMockParentCacheObjectDispatch.

Signed-off-by: Or Ozeri <oro@il.ibm.com>
2021-11-28 13:06:34 +02:00
Or Ozeri
bcca300d26 test/librbd: fix memory leak in TestMockCryptoLuksFormatRequest
fix memory leak in TestMockCryptoLuksFormatRequest.

Signed-off-by: Or Ozeri <oro@il.ibm.com>
2021-11-28 13:06:33 +02:00
Or Ozeri
79501173b7 test/librbd: fix memory leak in TestMockCryptoLuksLoadRequest
fix memory leak in TestMockCryptoLuksLoadRequest.

Signed-off-by: Or Ozeri <oro@il.ibm.com>
2021-11-28 13:06:33 +02:00
Or Ozeri
91c3b0314c test/librbd: fix bad TearDown in TestCryptoOpensslDataCryptor
Fix the TearDown function in TestCryptoOpensslDataCryptor
to call the right class parent function.

Signed-off-by: Or Ozeri <oro@il.ibm.com>
2021-11-28 13:06:33 +02:00
Or Ozeri
09ae3bd03d test/librbd: fix memory leak in TestCryptoOpensslDataCryptor
One of the tests leaks an encryption context.
This commit fixes this issue.

Signed-off-by: Or Ozeri <oro@il.ibm.com>
2021-11-28 13:06:33 +02:00
Or Ozeri
3af5bb7c61 librbd/crypto: fix memory leak when DataCryptor fails
If DataCryptor fails, either in init_context or update_context,
the encryption context is not returned, which causes a memory leak.
This commit fixes this issue.

Signed-off-by: Or Ozeri <oro@il.ibm.com>
2021-11-28 13:06:33 +02:00
Or Ozeri
78abde0d25 test/librbd: fix memory leak in TestMockCryptoBlockCrypto
fix memory leak in TestMockCryptoBlockCrypto.

Signed-off-by: Or Ozeri <oro@il.ibm.com>
2021-11-28 13:06:33 +02:00
Sage Weil
69b04de293 Merge PR #43997 into master
* refs/pull/43997/head:
	mgr/cephadm: make logging about agent less verbose

Reviewed-by: Adam King <adking@redhat.com>
2021-11-26 15:15:51 -05:00
Sage Weil
cf046f78da Merge PR #44079 into master
* refs/pull/44079/head:
	mgr/cephadm: skip osd_stats check if osd removal queue is empty

Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
2021-11-26 15:15:42 -05:00
Sage Weil
45312c8627 Merge PR #44075 into master
* refs/pull/44075/head:
	mgr/cephadm: drop osdspec_affinity tracking

Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
2021-11-26 15:15:27 -05:00
Sage Weil
131212254f Merge PR #44073 into master
* refs/pull/44073/head:
	pybind/mgr/mgr_module: cache mgr_ip

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-11-26 15:15:12 -05:00
Sebastian Wagner
32fdb84956
mgr/cephadm: simplify HostCache.get_daemon_types
Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
2021-11-26 18:00:10 +01:00
Sebastian Wagner
d770bb5c45
mgr/cephadm: Inventory: Fix dictionary changed size during iteration
Use `.copy()` for that.

Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
2021-11-26 17:53:14 +01:00
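The "dictionary changed size during iteration" error named in the commit above can be reproduced with a small sketch. The `inventory` dict and both helper functions below are hypothetical stand-ins, not the cephadm `Inventory` code; they only illustrate why iterating over `.copy()` avoids the error.

```python
# Hypothetical stand-in for a host inventory: host name -> metadata.
def make_inventory():
    return {
        "host1": {"addr": "10.0.0.1"},
        "host2": {"addr": "10.0.0.2"},
        "host3": {"addr": "10.0.0.3"},
    }


def drop_hosts_unsafe(inventory, unwanted):
    # Deleting from the dict while iterating over it makes the next
    # iteration step raise RuntimeError.
    for host in inventory:
        if host in unwanted:
            del inventory[host]


def drop_hosts_safe(inventory, unwanted):
    # Iterate over a shallow copy; the original can then be mutated freely.
    for host in inventory.copy():
        if host in unwanted:
            del inventory[host]


def raises_runtime_error(func, *args):
    # Small helper that reports whether func(*args) raised RuntimeError.
    try:
        func(*args)
    except RuntimeError:
        return True
    return False
```

The copy is shallow, so it costs one extra dict of references per loop; that is usually acceptable for small maps such as a host list.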
Sage Weil
22c402b84e Merge PR #43936 into master
* refs/pull/43936/head:
	qa/tasks/cephadm: pull image to all hosts in parallel
	qa/tasks/cephadm: add hosts via mon remote
	qa/tasks/cephadm: use shortname for remote directory
	qa/tasks/cephadm: deploy no more than 5 mons in roleless mode
	qa/tasks/radosbench: default clients to all clients (not client.0)
	qa/tasks/ceph_manager: parallelize flush_pg_stats()
	qa/suites/big: remove thrasher
	qa/suites/big: update for cephadm

Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
2021-11-26 10:38:58 -05:00
Sage Weil
63f986641d Merge PR #44080 into master
* refs/pull/44080/head:
	mgr/cephadm: record when finished with scheduled daemon action

Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
2021-11-26 10:37:27 -05:00
Sebastian Wagner
e90433b8f9
mgr/cephadm: grafana.ini: Set cookie_secure = true
Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
2021-11-26 11:20:13 +01:00
Sebastian Wagner
fdae665a2f
mgr/cephadm: Add GrafanaSpec.initial_admin_password
By default, we're not creating any admin account for Grafana now,
but we're adding an option to set the grafana password manually using:

```yaml
service_type: grafana
spec:
  initial_admin_password: mypassword
```

Users can then easily log into Grafana with the given password.

Fixes: https://tracker.ceph.com/issues/48291
Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
2021-11-26 11:20:11 +01:00
Sebastian Wagner
f2b0b45176
python-common: Reparent AlertManagerSpec to MonitoringSpec
And remove duplicated members

Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
2021-11-26 11:19:50 +01:00
Sebastian Wagner
208ce50b92
python-common: Move AlertManagerSpec below MonitoringSpec
Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
2021-11-26 11:19:50 +01:00
Sebastian Wagner
e7035f1a54
python-common: test_yaml(): add a few tests
Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
2021-11-26 11:19:50 +01:00
Sebastian Wagner
f76c02a658
python-common: prettify yaml.dump(MonitoringSpec())
Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
2021-11-26 11:19:50 +01:00
Sebastian Wagner
6f90f0fa2e
python-common: move some tests from cephadm/test_spec.py
Because they don't have any dependencies on cephadm

Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
2021-11-26 11:19:50 +01:00
Avan Thakkar
071c3b68a1 mgr/dashboard: avoid tooltip if disk_usage=null and fast-diff enabled
Fixes: https://tracker.ceph.com/issues/53404
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
2021-11-26 13:32:39 +05:30
wangbo-yw
aba8c83f06 mgr/dashboard: add some test for controllers/pool.py
Signed-off-by: wangbo-yw <wangbo2_yewu@cmss.chinamobile.com>
2021-11-26 01:53:39 -05:00
Sebastian Wagner
f3d3dcee87
Merge pull request #44106 from sebastian-philipp/mgr-tox-37
mgr/tox.ini: Add python 3.7 environment 

Reviewed-by: Adam King <adking@redhat.com>
2021-11-25 17:54:26 +01:00
Sebastian Wagner
d93e8beab3
Merge pull request #43943 from sebastian-philipp/osd-memeory-hyperconverged
doc/cephadm: OSD memory autotuning for hyperconverged

Reviewed-by: Adam King <adking@redhat.com>
2021-11-25 17:27:26 +01:00
Radoslaw Zarzynski
5a7fc07933 crimson/os: fix a shutdown-related race condition in AlienStore.
This is supposed to tackle crashes like the following one:

```
INFO  2021-11-17 16:33:12,048 [shard 0] alienstore - stat
...
DEBUG 2021-11-17 16:33:12,789 [shard 0] ms - [osd.2(hb_front) v2:0.0.0.0:6813/34383 >> osd.0 v2:127.0.0.1:6809/34293@56992] closed!
DEBUG 2021-11-17 16:33:12,791 [shard 0] ms - [osd.2(hb_front) v2:0.0.0.0:6813/34383@53359 >> osd.7 v2:0.0.0.0:6815/34448] closed!
INFO  2021-11-17 16:33:12,795 [shard 0] alienstore - umount
INFO  2021-11-17 16:33:12,804 [shard 0] osd - osd.2: committed_osd_maps(23, 62)
ceph-osd: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-8896-gf35358f1/rpm/el8/BUILD/ceph-17.0.0-8896-gf35358f1/src/rocksdb/db/db_impl/db_impl.cc:1615: rocksdb::Status rocksdb::DBImpl::GetImpl(const rocksdb::ReadOptions&, const rocksdb::Slice&, rocksdb::DBImpl::GetImplOptions&): Assertion `get_impl_options.column_family' failed.
Aborting.
Backtrace:
INFO  2021-11-17 16:33:13,542 [shard 0] ms - [osd.2(cluster) v2:172.21.15.17:6804/34383 >> osd.3 v2:172.21.15.17:6806/34387@50001] execute_ready(): fault at READY with nothing to send, going to STANDBY -- std::system_error (error crimson::net:4, read eof)
DEBUG 2021-11-17 16:33:13,542 [shard 0] ms - [osd.2(cluster) v2:172.21.15.17:6804/34383 >> osd.3 v2:172.21.15.17:6806/34387@50001] TRIGGER STANDBY, was READY
 0# gsignal in /lib64/libc.so.6
 1# abort in /lib64/libc.so.6
 2# 0x00007F12FA13FC89 in /lib64/libc.so.6
 3# 0x00007F12FA14DA76 in /lib64/libc.so.6
 4# rocksdb::DBImpl::GetImpl(rocksdb::ReadOptions const&, rocksdb::Slice const&, rocksdb::DBImpl::GetImplOptions&) in ceph-osd
 5# rocksdb::DBImpl::Get(rocksdb::ReadOptions const&, rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&, rocksdb::PinnableSlice*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*) in ceph-osd
 6# rocksdb::DBImpl::Get(rocksdb::ReadOptions const&, rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&, rocksdb::PinnableSlice*) in ceph-osd
 7# RocksDBStore::get(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, char const*, unsigned long, ceph::buffer::v15_2_0::list*) in ceph-osd
 8# BlueStore::Collection::get_onode(ghobject_t const&, bool, bool) in ceph-osd
 9# BlueStore::read(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ghobject_t const&, unsigned long, unsigned long, ceph::buffer::v15_2_0::list&, unsigned int) in ceph-osd
10# 0x00005584E516577F in ceph-osd
11# crimson::os::ThreadPool::loop(std::chrono::duration<long, std::ratio<1l, 1000l> >, unsigned long) in ceph-osd
12# 0x00005584E54E71E9 in ceph-osd
13# 0x00007F12FB861BA3 in /lib64/libstdc++.so.6
14# 0x00007F12FBB3C14A in /lib64/libpthread.so.0
15# clone in /lib64/libc.so.6
Content of /proc/self/maps:
7fff7000-8fff7000 rw-p 00000000 00:00 0
```

The problem happened in RocksDB:

```cpp
Status DBImpl::GetImpl(const ReadOptions& read_options, const Slice& key,
                       GetImplOptions& get_impl_options) {
  assert(get_impl_options.value != nullptr ||
         get_impl_options.merge_operands != nullptr);

  assert(get_impl_options.column_family);
  // ...
```

```cpp
Status DBImpl::Get(const ReadOptions& read_options,
                   ColumnFamilyHandle* column_family, const Slice& key,
                   PinnableSlice* value, std::string* timestamp) {
  GetImplOptions get_impl_options;
  get_impl_options.column_family = column_family;
  get_impl_options.value = value;
  get_impl_options.timestamp = timestamp;
  Status s = GetImpl(read_options, key, get_impl_options);
  return s;
}
```

```cpp
int RocksDBStore::get(
  const string& prefix,
  const char *key,
  size_t keylen,
  bufferlist *out)
{
  ceph_assert(out && (out->length() == 0));
  utime_t start = ceph_clock_now();
  int r = 0;
  rocksdb::PinnableSlice value;
  rocksdb::Status s;
  auto cf = get_cf_handle(prefix, key, keylen);
  if (cf) {
    s = db->Get(rocksdb::ReadOptions(),
                cf,
                rocksdb::Slice(key, keylen),
                &value);
  } else {
    string k;
    combine_strings(prefix, key, keylen, &k);
    s = db->Get(rocksdb::ReadOptions(),
                default_cf,
                rocksdb::Slice(k),
                &value);
  }
  // ...
```

It may be explained by a race condition between `AlienStore::stat()`
and `AlienStore::umount()`. Umounting a BlueStore means nullifying
`default_cf`:

```cpp
void RocksDBStore::close()
{
  // ...
  default_cf = nullptr;
  delete db;
  db = nullptr;
}
```

```
INFO  2021-11-17 16:33:12,048 [shard 0] alienstore - stat
...
INFO  2021-11-17 16:33:12,795 [shard 0] alienstore - umount
INFO  2021-11-17 16:33:12,804 [shard 0] osd - osd.2: committed_osd_maps(23, 62)
```

Although `AlienStore` synchronizes `umount()` and `do_transaction()`
with a `seastar::gate`, it lacks a similar mechanism for read-like operations.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2021-11-25 15:05:34 +00:00
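The gate idea described in the commit above can be sketched with plain threads. This is an illustration of the concept only: it is neither Seastar's `seastar::gate` API nor AlienStore's code, and the `Gate`, `GateClosed` and `Store` names are hypothetical. Read-like operations enter the gate before touching the backend, and `umount()` closes the gate and waits for in-flight operations before tearing the backend down.

```python
import threading


class GateClosed(Exception):
    """Raised when an operation tries to enter a gate that is closed."""


class Gate:
    def __init__(self):
        self._cond = threading.Condition()
        self._in_flight = 0
        self._closed = False

    def enter(self):
        with self._cond:
            if self._closed:
                raise GateClosed()
            self._in_flight += 1

    def leave(self):
        with self._cond:
            self._in_flight -= 1
            if self._in_flight == 0:
                self._cond.notify_all()

    def close(self):
        # Refuse new entries, then wait until in-flight operations finish.
        with self._cond:
            self._closed = True
            while self._in_flight > 0:
                self._cond.wait()


class Store:
    """Hypothetical store whose backend disappears on umount()."""

    def __init__(self):
        self._gate = Gate()
        self._db = {"key": "value"}

    def stat(self, key):
        # A read-like operation holds the gate while it uses the backend.
        self._gate.enter()
        try:
            return self._db.get(key)
        finally:
            self._gate.leave()

    def umount(self):
        # Only drop the backend once no reader can still be using it.
        self._gate.close()
        self._db = None
```

With this arrangement a `stat()` that arrives after `umount()` fails with `GateClosed` instead of dereferencing a backend that has already been released.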
Sebastian Wagner
ee7ed53df8
doc/cephadm: host location: add link to types
Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
2021-11-25 15:52:20 +01:00
Sage Weil
9d50154a93 qa/tasks/cephadm: pull image to all hosts in parallel
This doesn't affect bootstrap, but it does mean we avoid any delay
the first time we cephadm.shell on some non-bootstrap host.

Signed-off-by: Sage Weil <sage@newdream.net>
2021-11-25 07:52:56 -06:00
Sage Weil
3a110f6c00 qa/tasks/cephadm: add hosts via mon remote
If we use a new remote for each shell command, we end up waiting
for the image to pull on every host in sequence.

Signed-off-by: Sage Weil <sage@newdream.net>
2021-11-25 07:52:56 -06:00
Sage Weil
0e40064d31 qa/tasks/cephadm: use shortname for remote directory
This aligns with what the ceph and syslog tasks do.

Signed-off-by: Sage Weil <sage@newdream.net>
2021-11-25 07:52:56 -06:00
Sage Weil
689d7ceabd qa/tasks/cephadm: deploy no more than 5 mons in roleless mode
Signed-off-by: Sage Weil <sage@newdream.net>
2021-11-25 07:52:55 -06:00
Sage Weil
e7bf9242c4 qa/tasks/radosbench: default clients to all clients (not client.0)
Signed-off-by: Sage Weil <sage@newdream.net>
2021-11-25 07:52:55 -06:00
Sage Weil
99cdaaba70 qa/tasks/ceph_manager: parallelize flush_pg_stats()
Signed-off-by: Sage Weil <sage@newdream.net>
2021-11-25 07:52:55 -06:00
Sage Weil
9559fea8b2 qa/suites/big: remove thrasher
This doesn't work with roleless (yet)

Signed-off-by: Sage Weil <sage@newdream.net>
2021-11-25 07:52:55 -06:00
Sage Weil
0514b0a323 qa/suites/big: update for cephadm
Signed-off-by: Sage Weil <sage@newdream.net>
2021-11-25 07:52:55 -06:00
Sebastian Wagner
a503e7dc21
mgr/cephadm/tests: remove _deploy_cephadm_binary
(not needed)

Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
2021-11-25 13:29:01 +01:00
Sebastian Wagner
6f7ea4af3e
mgr/tox.ini: Add python 3.7 environment
Plus fixes.

Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
2021-11-25 13:22:06 +01:00
Adam King
549a077514 mgr/cephadm: agent: allow agent down multiplier to be configured
Signed-off-by: Adam King <adking@redhat.com>
2021-11-24 18:52:10 -05:00
Adam King
f64b8a34c4 cephadm: only infer conf from mon if fsid matches
Fixes: https://tracker.ceph.com/issues/53394

Signed-off-by: Adam King <adking@redhat.com>
2021-11-24 17:27:20 -05:00
Yehuda Sadeh
91a3276aca mgr/rgw: ignore mypy errors
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2021-11-24 12:54:30 -08:00
Yehuda Sadeh
1426061815 mgr/rgw: use tool_exec instead of directly spawning commands
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2021-11-24 12:54:30 -08:00
Yehuda Sadeh
af402c41e3 docs: document mgr/rgw module
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2021-11-24 12:54:30 -08:00
Yehuda Sadeh
a0e8230bfe mgr/rgw: change zone-creds cli commands
- ceph rgw zone-creds create
- ceph rgw zone-creds remove

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2021-11-24 12:54:30 -08:00
Yehuda Sadeh
d90d380e17 rgwam: reorganize code
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2021-11-24 12:54:30 -08:00
Yehuda Sadeh
289e5f096a mgr/rgw: start_radosgw is true by default
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2021-11-24 12:54:30 -08:00
Yehuda Sadeh
24874dfdf8 mgr/rgw: pass realm_id via token
Add realm_id to token.
Also, use realm_id from token instead of requiring realm_name
for command that uses realm token.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2021-11-24 12:54:30 -08:00
Yehuda Sadeh
626e6bebf4 mgr/rgw: realm remove zone-creds
Add a command to remove zone creds. It removes either the access key
or, if that was the user's only access key, the entire user.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2021-11-24 12:54:30 -08:00
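The removal rule stated in the commit above (drop the key, or drop the whole user when it was the user's only key) can be sketched as follows. The in-memory `users` mapping and the `remove_zone_creds` function are assumptions made for illustration; the actual mgr/rgw module is not shown in this listing and may work differently.

```python
def remove_zone_creds(users, uid, access_key):
    """Remove one access key, or the whole user if it was the last key.

    `users` is a hypothetical model: a dict mapping a user id to the set
    of access keys that user holds. Returns a short string describing
    what was removed.
    """
    keys = users[uid]
    if access_key not in keys:
        raise KeyError(access_key)
    if len(keys) == 1:
        # The key being removed is the user's only one: remove the user.
        del users[uid]
        return "user-removed"
    keys.remove(access_key)
    return "key-removed"
```

For example, removing one of two keys leaves the user in place, while removing the remaining key afterwards deletes the user entry entirely.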
Yehuda Sadeh
f64754a898 pybind/argparse: handle None sequence
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2021-11-24 12:54:30 -08:00