librados2 and librbd1 are installed as a dependency of qemu-kvm.
qemu-kvm is installed by ceph-cm-ansible, see [1].
in thrash-old-clients, jewel packages are installed, but yum does
not allow a downgrade unless it's requested explicitly. in this change,
we downgrade librbd1 and librados2 to address this issue.
currently, the ceph packages shipped by CentOS/RHEL 7 are still an old
version of jewel. so this issue only kicks in when we try to install
hammer.
this change should address failures like
Command failed on smithi136 with status 1: '\n sudo yum -y install
rbd-fuse\n '
found in rados/thrash-old-clients tests.
---
[1]
3db1cbdc22 (diff-f2b05d775fedff6c5c6689f564b32f1c)
Fixes: http://tracker.ceph.com/issues/37618
Signed-off-by: Kefu Chai <kchai@redhat.com>
The current solution fails on our CI system because some outputs can
contain more values, and some parameters like 'w' can vary between
environments.
It worked before because it had only been tested in a vstart cluster
environment.
With this commit, only the attributes we know to be present are
tested.
Fixes: https://tracker.ceph.com/issues/37275
Signed-off-by: Stephan Müller <smueller@suse.com>
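
A minimal sketch of the approach described in the commit above: assert only
on the attributes known to be present and ignore anything extra or
environment-dependent. The helper name `assert_contains` and the sample keys
are hypothetical and are not taken from the dashboard test code.

```python
def assert_contains(actual, expected):
    """Check that every expected key/value pair is present in `actual`.

    Extra keys in `actual` are ignored, so output that carries more
    values or environment-dependent fields does not fail the check.
    """
    for key, value in expected.items():
        assert key in actual, "missing attribute: %s" % key
        assert actual[key] == value, "unexpected value for %s" % key


# Hypothetical output: 'w' varies between environments, 'extra' is an
# additional value that only some environments report.
output = {"k": "2", "m": "1", "w": "8", "extra": "value"}
assert_contains(output, {"k": "2", "m": "1"})
```

Comparing the whole dictionary for equality would fail here, while the
subset check passes as long as the known attributes match.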
* refs/pull/24940/head:
qa: add test for getfattr ceph.dir.pin
client: support getfattr ceph.dir.pin extended attribute
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
mgr/dashboard: add profiles to set cluster's rebuild performance
Reviewed-by: Tiago Melo <tmelo@suse.com>
Reviewed-by: Sebastian Krah <skrah@suse.com>
Reviewed-by: Patrick Nawracay <pnawracay@suse.com>
The new default is bitmap, so we were testing bitmap twice. Instead,
explicitly call out stupid and bitmap cases so a future default change
won't break coverage.
Signed-off-by: Sage Weil <sage@redhat.com>
The change in b3e69a9609 broke the test's assumption that the endpoint
wouldn't be readable by block-manager. It doesn't look as though that's
actually problematic for the ECP controller, so just update the test to
use rgw-manager instead.
Signed-off-by: Zack Cerza <zack@redhat.com>
* refs/pull/25308/head:
osd/OSD: OSD::mkfs asserts when reusing disk with existing superblock.
os/bluestore: add main device expand capability.
Reviewed-by: Sage Weil <sage@redhat.com>
Some classes should still be imported directly from collections;
only OrderedDict, Iterable and Callable (in the context of the
ceph codebase) are found in collections.abc.
The current code works due to the fallback support for Python 2.
Signed-off-by: James Page <james.page@ubuntu.com>
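
A minimal sketch of the import pattern discussed above, assuming code that
must run on both Python 2 and Python 3. The function `is_iterable_of_callables`
is a hypothetical example and not part of the ceph codebase.

```python
try:
    # Python 3: the abstract base classes live in collections.abc
    from collections.abc import Callable, Iterable
except ImportError:
    # Python 2 fallback: the same names are exposed by collections
    from collections import Callable, Iterable


def is_iterable_of_callables(obj):
    """Return True if `obj` is iterable and every item is callable."""
    if not isinstance(obj, Iterable):
        return False
    return all(isinstance(item, Callable) for item in obj)
```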
* add qa/releases/nautilus.yaml so it can be reused.
* use releases/nautilus.yaml in luminous-x upgrade test, so
test_librbd_python.sh is able to use the feature introduced in
nautilus.
Fixes: http://tracker.ceph.com/issues/37432
Signed-off-by: Kefu Chai <kchai@redhat.com>
This splits out the collection of health and log data from the
/api/dashboard/health controller into /api/health/{full,minimal} and
/api/logs/all.
/health/full contains all the data (minus logs) that /dashboard/health
did, whereas /health/minimal contains only what is needed for the health
component to function. /logs/all contains exactly what the logs portion
of /dashboard/health did.
By using /health/minimal, on a vstart cluster we pull ~1.4KB of data
every 5s, where we used to pull ~6KB; those numbers would get larger
with larger clusters. Once we split out log data, that will drop to
~0.4KB.
Fixes: http://tracker.ceph.com/issues/36675
Signed-off-by: Zack Cerza <zack@redhat.com>
* refs/pull/17526/head:
qa/tasks/ceph_manager: avoid test_map_discontinuity stall with too few up osds
Reviewed-by: Gregory Farnum <gfarnum@redhat.com>
Some tests have m=2,k=2 and this will break them. Sometimes even if we
have 5 up osds, we end up with 4 and CRUSH gets picky, so build in a
buffer and only do this if we have 6 up.
We don't have an easy way from here to see what the min up osds for healthy
is... basically this map discontinuity test just sucks.
Signed-off-by: Sage Weil <sage@redhat.com>
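
The guard described above can be sketched roughly as follows. The names
`MIN_UP_OSDS` and `can_test_map_discontinuity` are hypothetical; only the
threshold of 6 up osds comes from the commit message.

```python
# Threshold from the commit message: with 5 up osds we can end up with 4,
# so keep a buffer and only run the test with at least 6 up.
MIN_UP_OSDS = 6


def can_test_map_discontinuity(up_osds):
    """Return True if enough osds are up to run the test safely."""
    return len(up_osds) >= MIN_UP_OSDS


if can_test_map_discontinuity([0, 1, 2, 3, 4]):
    print("running map discontinuity test")
else:
    print("skipping: too few up osds")
```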
The behavior of `safe-to-destroy` has changed in
432f194355 (PR#24799) and the backend
needs to be adapted accordingly.
Fixes: http://tracker.ceph.com/issues/37290
Signed-off-by: Patrick Nawracay <pnawracay@suse.com>
Enabled ctx.managers to take cluster name from config in restart() method instead of default 'ceph'.
Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>
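
As a rough illustration of the change above, the cluster name can be read
from the task config with 'ceph' as the fallback. The helper name
`cluster_name_from` and the placeholder `managers` dict are assumptions for
this sketch, not the actual task code.

```python
def cluster_name_from(config):
    """Return the cluster name from a task config, defaulting to 'ceph'."""
    config = config or {}
    return config.get('cluster', 'ceph')


# Placeholder stand-in for ctx.managers, keyed by cluster name.
managers = {'ceph': 'manager-for-ceph', 'backup': 'manager-for-backup'}

manager = managers[cluster_name_from({'cluster': 'backup'})]
default_manager = managers[cluster_name_from(None)]
```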
Separate the local and cloud predictors from the diskprediction plugin.
Devicehealth invokes the device prediction function according to the
global configuration option "device_failure_prediction_mode".
Signed-off-by: Rick Chen <rick.chen@prophetstor.com>
before this change, we assume that the variable set if rados::radospp is
found will be radospp_FOUND, but this is a feature of cmake 3, see
https://cmake.org/cmake/help/v3.3/module/FindPackageHandleStandardArgs.html
while the cmake shipped by centos is cmake 2.8.12, where the variable
name will be <UPPERCASED_NAME>_FOUND, see
https://cmake.org/cmake/help/v2.8.12/cmake.html#module:FindPackageHandleStandardArgs
in the test of test_envlibrados_for_rocksdb.sh, we are using cmake, not
the cmake3 offered by EPEL7, so RADOSPP_FOUND will be set instead. that's
why executable env_librados_test will fail to link against
rados::radospp, as rados::radospp won't be defined if radospp_FOUND is
not defined/set.
after this change, the 2nd mode of FIND_PACKAGE_HANDLE_STANDARD_ARGS()
is used instead to ensure that radospp_FOUND is defined even if cmake
2.8.12 is used.
also, the message() commands for debugging purposes are removed.
Signed-off-by: Kefu Chai <kchai@redhat.com>
we use the playbook of "testnodes.yml" defined by ceph-cm-ansible for
initializing test nodes, and the role of "testnode" is used by
testnodes.yml. "testnode" requires "qemu-system-x86" or "qemu-kvm"
package to be installed. the qemu in turn depends on librbd1 and
librados2.
before librados3 was introduced, this worked perfectly. because in ceph
repo, qa/packages/packages.yaml defines the default set of packages the
"install" tasks should install. and in that yaml file, librados2 was
listed. so the package management system will overwrite the librados2
installed by ansible playbook with the version specified by the
"install" task, as apt/yum thinks this is what user requires explicitly,
so it's fine to install a different version of librados2.
after librados3 was introduced, librados2 was removed from
qa/packages/packages.yaml. because, by default, we need to install
librados3 instead of librados2 to get a nautilus cluster ready. but the
problem is, the package list also applies to "install" tasks installing
releases before nautilus, where we still need to replace the librados2
installed by ansible.
so, to address this issue, "librados2" is added to "extra_packages" of
the "install" tasks of tests installing old releases to install
librados2 explicitly instead of as a dependency of other ceph packages
like librbd1.
Signed-off-by: Kefu Chai <kchai@redhat.com>
The new info endpoint will provide the frontend with the necessary
information it needs to create new profiles.
Fixes: https://tracker.ceph.com/issues/25156
Signed-off-by: Stephan Müller <smueller@suse.com>
Use --rmtype snapmap with new obj16 to remove snapmap only, check for repair message
Use --rmtype nosnapmap to remove obj5 while leaving snapmap behind
Signed-off-by: David Zafman <dzafman@redhat.com>
* refs/pull/24828/head:
qa/osd-bluefs-volume-ops: use ceph-bluestore-tool for fsck
qa/osd-bluefs-volume-ops: reduce space usage for the test case
Reviewed-by: David Zafman <dzafman@redhat.com>
mgr/dashboard: tasks.mgr.dashboard.test_osd.OsdTest failures
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Patrick Nawracay <pnawracay@suse.com>
Reviewed-by: Sebastian Wagner <swagner@suse.com>
we have switched from tmap to omap long ago.
but keep the server side implementation around, in case an ancient
client is still using these tmap APIs.
also, tmap_update() is kept, because librbd is using it for v1 image
backward compatibility.
Signed-off-by: Kefu Chai <kchai@redhat.com>
- Fix bug in Dashboard QA unit test framework. Don't set the application type header manually, this is done by the requests library if required.
- Enhance QA unit test helper: Print the response of the API request if it fails. This should help to identify the problem more easily.
- Fix bug in the OSD controller. A parameter needs to be converted to integer.
- Take care that the params of the request object are not modified.
The issue was introduced by PR https://github.com/ceph/ceph/pull/24475. The CherryPy json_in plugin disclosed the erroneous unit test helper implementation.
Fixes: https://tracker.ceph.com/issues/36708
Signed-off-by: Volker Theile <vtheile@suse.com>
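
A minimal sketch of two of the fixes listed above: converting a parameter to
an integer and leaving the caller's params untouched. The function name
`normalized_params` and the key `osd_id` are hypothetical; the commit message
does not say which parameter was affected.

```python
import copy


def normalized_params(params):
    """Return a converted copy of `params` without modifying the input."""
    result = copy.deepcopy(params)
    if 'osd_id' in result:
        # Values parsed from a request may arrive as strings.
        result['osd_id'] = int(result['osd_id'])
    return result


request_params = {'osd_id': '3', 'action': 'scrub'}
converted = normalized_params(request_params)
```

Working on a copy keeps the request object usable for later code that still
expects the original values.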
It is particularly useful when running multiple rbd-mirror instances
in Active-Passive or Active-Active mode.
Signed-off-by: Mykola Golub <mgolub@suse.com>
rgw: Return tenant field in bucket_stats function
Reviewed-by: Lenz Grimmer <lgrimmer@suse.com>
Reviewed-by: Matt Benjamin <mbenjami@redhat.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>
include a patch so rocksdb can use libradospp instead of librados. will
upstream the patch and make it work for both pre-nautilus librados and
nautilus libradospp
Signed-off-by: Kefu Chai <kchai@redhat.com>
the goal is to decouple C++ API from C API, and to version them
differently, as they are targeting different consumers.
this allows us to change the C++ API and bump up its soversion
without requiring consumers to recompile the librados client for
using the new librados. in this way, C++ API can move faster than
C API. for example, if bufferlist interface is changed for better
performance, and this breaks existing API/ABI, we can bump up
the C++ library's soversion, and keep the C library's version unchanged
but ship the new librados's C binding. so the librados client linked
against librados's C library will be able to take advantage of
the improvement in C++ library. while the librados client
linked against C++ library won't break at runtime due to unresolved
symbol or changed structure layout.
this is a massive change, the general idea is to
* split librados.cc into two source files: librados_c.cc and
librados_cxx.cc, the former for implementing C APIs, the latter
for C++ APIs.
* extract the C++ API in librados into librados-cxx, the library
name will be libradospp. but we can change it before nautilus
is released.
* link these librados libraries with static libraries which it
depends on, so "-Wl,--exclude-libs,ALL" link flags can help
hide the non-public symbols.
* extract the tests exercising librados' C++ API into a different
source file named *_cxx.cc. for instance, to move the C++ tests
in aio.cc into aio_cxx.cc
* extract the shared helper functions which do not use any librados
or librados-cxx APIs into test_shared{.cc,h}. the "shared" here
means, *shared* by C++ and C tests.
* extract the test fixtures, i.e., the subclasses of testing::Test,
for testing C++ APIs into testcase_cxx.cc.
* update qa/workunits/rados/test.sh accordingly to add the split
tests
* update the consumers of librados to link against librados-cxx
instead, if they are using the C++ API.
Signed-off-by: Kefu Chai <kchai@redhat.com>
* refs/pull/24809/head:
os/bluestore: omit redundant '/' in OSD path for ceph-bluestore-tool if
os/bluestore: improve error handling for migrate ops in
qa/standtalone/osd-bluefs-volume-ops: remove redundant code.
Reviewed-by: Sage Weil <sage@redhat.com>
* refs/pull/24787/head:
Merge PR #24796 into nautilus
osd: fix heartbeat_reset unlock
Merge PR #24780 into nautilus
Merge PR #24761 into nautilus
Merge PR #24651 into nautilus
osd: fix race between op_wq and context_queue
test: Make sure kill_daemons failure will be easy to find
test: Add flush_pg_stats to make test more deterministic
Python 3.7 now shows a warning as below.
/usr/bin/ceph:128: DeprecationWarning: Using or importing the ABCs from
'collections' instead of from 'collections.abc' is deprecated, and in
3.8 it will stop working
import rados
This patch addresses that particular issue.
Signed-off-by: Ganesh Maharaj Mahalingam <ganesh.mahalingam@intel.com>
* refs/pull/24651/head:
test: Make sure kill_daemons failure will be easy to find
test: Add flush_pg_stats to make test more deterministic
Reviewed-by: Neha Ojha <nojha@redhat.com>
This is related to http://tracker.ceph.com/issues/36453. It is far from
a complete solution, but seems like a positive move.
I tested this change by first disabling my browser cache, and then used
the /docs endpoint to query /api/dashboard/health. Before compression:
Content-Length: 60748
Time: 615ms
After:
Content-Length: 7505
Time: 92ms
Then, I logged into the dashboard as normal and reloaded the page once I
was in. Some values for the reload operation before compression:
Total page load time: 58.48s
vendor.js Content-Length: 6486025
vendor.js time: 48.09s
After:
Total page load time: 14.55s
vendor.js Content-Length: 1143178
vendor.js time: 4.50s
Signed-off-by: Zack Cerza <zack@redhat.com>
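
The effect measured above can be illustrated with the standard library alone.
This sketch compresses a made-up, repetitive JSON payload with `gzip`; it does
not use the dashboard's server configuration, and the payload contents are
hypothetical.

```python
import gzip
import json

# Made-up payload: JSON from an API tends to repeat keys and values,
# which is why it compresses well.
payload = json.dumps({
    "logs": [
        {"message": "osd.%d booted" % (i % 4), "priority": "[INF]"}
        for i in range(500)
    ]
}).encode("utf-8")

compressed = gzip.compress(payload)

print("uncompressed bytes:", len(payload))
print("compressed bytes:  ", len(compressed))
```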
This fixes "TypeError: admin_socket() got an unexpected keyword argument
'timeout'". The value is never used.
Signed-off-by: Zack Cerza <zack@redhat.com>
If there is a workunit task associated with the same client, the two
tasks will attempt to clone the suite repo to the same directory.
Worse, if it's parallel tasks, the two clones will clobber each
other.
Fixes: http://tracker.ceph.com/issues/36542
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 5d56014c61)
For EC pools we have a lot of shards, and a 30% probability on each one
means we are very likely to repeatedly fail backfill reservations, long
enough that teuthology gives up waiting.
Signed-off-by: Sage Weil <sage@redhat.com>
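
A short worked example of the reasoning above, under two assumptions that the
commit message does not state: rejections on different shards are independent,
and a single rejected shard fails the whole reservation attempt. The function
name `p_any_rejected` is hypothetical.

```python
def p_any_rejected(num_shards, p_reject=0.30):
    """Probability that at least one of `num_shards` shards rejects."""
    return 1.0 - (1.0 - p_reject) ** num_shards


for shards in (1, 3, 6):
    print("%d shard(s): %.2f" % (shards, p_any_rejected(shards)))
```

With 6 shards the chance of at least one rejection per attempt is
1 - 0.7^6, or about 0.88, so repeated failures are the expected outcome.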