librados2 and librbd1 are installed as dependencies of qemu-kvm.
qemu-kvm is installed by ceph-cm-ansible, see [1].
in thrash-old-clients, jewel packages are installed, but yum does
not allow a downgrade unless it is requested explicitly. in this change,
we downgrade librbd1 and librados2 explicitly to address this issue.
currently, the ceph packages shipped by CentOS/RHEL 7 are still an old
version of jewel, so this issue only kicks in when we try to install
hammer.
this change should address failures like
  Command failed on smithi136 with status 1: '\n sudo yum -y install rbd-fuse\n '
found in rados/thrash-old-clients tests.
---
[1]
3db1cbdc22 (diff-f2b05d775fedff6c5c6689f564b32f1c)
Fixes: http://tracker.ceph.com/issues/37618
Signed-off-by: Kefu Chai <kchai@redhat.com>
The current solution fails on our CI system because some outputs can
contain additional values, and some parameters, such as 'w', vary
between environments.
It worked before only because it had been tested solely in a vstart
cluster environment.
With this commit, only the attributes we know to be present are
tested.
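The idea of testing only the known attributes can be sketched as follows. This is a minimal illustration, not the helper used by the actual tests: the function name `assert_contains` and the sample output are hypothetical.

```python
def assert_contains(actual, expected):
    """Recursively check that every key/value in `expected` appears in `actual`.

    Keys that exist only in `actual` are ignored, so outputs that carry
    extra or environment-dependent values still pass.
    """
    if isinstance(expected, dict):
        assert isinstance(actual, dict), f"expected a dict, got {type(actual).__name__}"
        for key, value in expected.items():
            assert key in actual, f"missing key: {key!r}"
            assert_contains(actual[key], value)
    else:
        assert actual == expected, f"{actual!r} != {expected!r}"


# Hypothetical output: 'w' and 'extra' differ between environments.
output = {"name": "osd.0", "stats": {"r": 0, "w": 17}, "extra": True}

# Only the attributes we know to be there are compared.
assert_contains(output, {"name": "osd.0", "stats": {"r": 0}})
```

A strict equality check on `output` would break as soon as 'w' changes, whereas the subset check keeps passing.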
Fixes: https://tracker.ceph.com/issues/37275
Signed-off-by: Stephan Müller <smueller@suse.com>
* add qa/releases/nautilus.yaml so it can be reused.
* use releases/nautilus.yaml in luminous-x upgrade test, so
test_librbd_python.sh is able to use the feature introduced in
nautilus.
Fixes: http://tracker.ceph.com/issues/37432
Signed-off-by: Kefu Chai <kchai@redhat.com>
This splits out the collection of health and log data from the
/api/dashboard/health controller into /api/health/{full,minimal} and
/api/logs/all.
/health/full contains all the data (minus logs) that /dashboard/health
did, whereas /health/minimal contains only what is needed for the health
component to function. /logs/all contains exactly what the logs portion
of /dashboard/health did.
By using /health/minimal, on a vstart cluster we pull ~1.4KB of data
every 5s, where we used to pull ~6KB; those numbers would get larger
with larger clusters. Once we split out log data, that will drop to
~0.4KB.
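To put the numbers above in perspective, here is a small back-of-the-envelope calculation of the data pulled per hour by one dashboard client. The payload sizes and the 5s interval are the ones quoted above; the labels and the per-hour framing are only illustrative.

```python
POLL_INTERVAL_S = 5  # polling interval quoted in this commit message

# Approximate payload sizes in KB, as measured on a vstart cluster.
payload_kb = {
    "/dashboard/health (before)": 6.0,
    "/health/minimal (this change)": 1.4,
    "/health/minimal (after log split)": 0.4,
}

polls_per_hour = 3600 // POLL_INTERVAL_S


def kb_per_hour(size_kb):
    """Data transferred in one hour when polling at POLL_INTERVAL_S."""
    return size_kb * polls_per_hour


for endpoint, size in payload_kb.items():
    print(f"{endpoint}: ~{kb_per_hour(size):.0f} KB/hour")
```

That is roughly 4320 KB per hour before, 1008 KB with this change, and 288 KB once logs are split out, per open dashboard session on a vstart-sized cluster.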
Fixes: http://tracker.ceph.com/issues/36675
Signed-off-by: Zack Cerza <zack@redhat.com>
we use the playbook of "testnodes.yml" defined by ceph-cm-ansible for
initializing test nodes, and the role of "testnode" is used by
testnodes.yml. "testnode" requires "qemu-system-x86" or "qemu-kvm"
package to be installed. the qemu in turn depends on librbd1 and
librados2.
before librados3 was introduced, this worked perfectly, because in the
ceph repo, qa/packages/packages.yaml defines the default set of packages
the "install" tasks should install, and librados2 was listed in that
yaml file. so the package management system would overwrite the
librados2 installed by the ansible playbook with the version specified
by the "install" task, as apt/yum considers this to be what the user
requested explicitly, so it's fine to install a different version of
librados2.
after librados3 was introduced, librados2 was removed from
qa/packages/packages.yaml, because, by default, we need to install
librados3 instead of librados2 to set up a nautilus cluster. but the
problem is, the package list also applies to "install" tasks installing
releases before nautilus, where we still need to replace the librados2
installed by ansible.
so, to address this issue, "librados2" is added to "extra_packages" of
the "install" tasks of tests installing old releases to install
librados2 explicitly instead of as a dependency of other ceph packages
like librbd1.
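the effect of "extra_packages" can be modelled roughly as below. this is only a toy model of the list of packages requested by name, not the code of the "install" task; the default list shown here is hypothetical and much shorter than the real one in qa/packages/packages.yaml.

```python
# Hypothetical default list; the real one lives in qa/packages/packages.yaml
# and no longer names librados2.
default_packages = ["ceph", "librbd1"]


def explicit_install_list(defaults, extra_packages=()):
    """Packages requested by name, in order and without duplicates."""
    result = []
    for pkg in list(defaults) + list(extra_packages):
        if pkg not in result:
            result.append(pkg)
    return result


# Default case: librados2 is never requested by name, so the copy pulled in
# by qemu stays as it is.
default_case = explicit_install_list(default_packages)

# Old release: librados2 is requested explicitly, so apt/yum may replace the
# copy installed by the ansible playbook with the requested version.
old_release = explicit_install_list(default_packages, extra_packages=["librados2"])

print(default_case)
print(old_release)
```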
Signed-off-by: Kefu Chai <kchai@redhat.com>
For EC pools we have a lot of shards, and a 30% probability on each one
means we are very likely to fail backfill reservations repeatedly, long
enough that teuthology gives up waiting.
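A quick calculation shows why this gets bad with many shards. The sketch assumes that one backfill attempt needs a reservation from every shard and that each shard rejects independently with the 30% probability mentioned above; the shard counts are illustrative.

```python
REJECT_PROBABILITY = 0.3  # per-shard rejection probability from this commit message


def p_attempt_fails(num_shards, p_reject=REJECT_PROBABILITY):
    """Probability that at least one of `num_shards` independent reservations is rejected."""
    return 1.0 - (1.0 - p_reject) ** num_shards


# Shard counts are illustrative; 6 could correspond to an EC profile with k=4, m=2.
for shards in (1, 3, 6, 10):
    print(f"{shards:2d} shards: {p_attempt_fails(shards):.1%} chance a backfill attempt fails")
```

Under these assumptions an attempt involving six shards fails about 88% of the time, compared with 30% for a single shard.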
Signed-off-by: Sage Weil <sage@redhat.com>
* refs/pull/24359/head:
qa/tests: update ansible version to 2.6 for master branch testing.
qa/tests: use lvm as default for ceph-ansible testing, this should also work with raw devices
Reviewed-by: Alfredo Deza <adeza@redhat.com>
Henceforth, we'll require explicit `allow` caps for commands, or for the
config-key service. Blanket caps are no longer allowed for the
config-key service, except for 'allow *'.
Signed-off-by: Joao Eduardo Luis <joao@suse.de>
This makes it easier to re-run tests against a suite branch without
requiring a full ceph-ci build and repo.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
We can't (easily) build updated hammer packages, but all this sh script does
is run this one test binary with --gtest_filter arguments, so just do
it directly and skip the test explicitly here. (Newer versions of the .sh
understand the environment variable but the hammer version does not.)
Fixes: http://tracker.ceph.com/issues/36104
Signed-off-by: Sage Weil <sage@redhat.com>
* refs/pull/23985/head:
ceph-objectstore-tool: add back pool dne check
qa/suites/rados/singleton/reg11184: remove old test
ceph-objectstore-tool: import pg at original epoch
osd: handle null pg slot on startup
ceph-objectstore-tool: drop support for ancient export files
osd: avoid dropping osd_lock when pg osdmaps are not laggy
qa/standalone/osd/pg-merge.sh: add merge vs pg import test
Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
This bug was about filtering missing and divergent when doing a partial
PG import. We don't support partial PG imports any more, so this can
go away!
Signed-off-by: Sage Weil <sage@redhat.com>
* refs/pull/20469/head:
osd/PG: remove warn on delete+merge race
osd: base project_pg_history on is_new_interval
osd: make project_pg_history handle concurrent osdmap publish
osd: handle pg delete vs merge race
osd/PG: do not purge strays in premerge state
doc/rados/operations/placement-groups: a few minor corrections
doc/man/8/ceph: drop enumeration of pg states
doc/dev/placement-groups: drop old 'splitting' reference
osd: wait for laggy pgs without osd_lock in handle_osd_map
osd: drain peering wq in start_boot, not _committed_maps
osd: kick split children
osd: no osd_lock for finish_splits
osd/osd_types: remove is_split assert
ceph-objectstore-tool: prevent import of pg that has since merged
qa/suites: test pg merging
qa/tasks/thrashosds: support merging pgs too
mon/OSDMonitor: mon_inject_pg_merge_bounce_probability
doc/rados/operations/placement-groups: update to describe pg_num reductions too
doc/rados/operations: remove reference to lpgs
osd: implement pg merge
osd/PG: implement merge_from
osdc/Objecter: resend ops on pg merge
osd: collect and record pg_num changes by pool
osd: make load_pgs remove message more accurate
osd/osd_types: pg_t: add is_merge_target()
osd/osd_types: pg_t::is_merge -> is_merge_source
osd/osd_types: adding or subtracting invalid stats -> invalid stats
osd/PG: clear_ready_to_merge on_shutdown (or final merge source prep)
osd: debug pending_creates_from_osd cleanup, don't use cbegin
ceph-objectstore-tool: debug intervals update
mgr/ClusterState: discard pg updates for pgs >= pg_num
mon/OSDMonitor: fix long line
mon/OSDMonitor: move pool created check into caller
mon/OSDMonitor: adjust pgp_num_target down along with pg_num_target as needed
mon/OSDMonitor: add mon_osd_max_initial_pgs to cap initial pool pgs
osd/OSDMap: set pg[p]_num_target in build_simple*() methods
mon/PGMap: adjust SMALLER_PGP_NUM warning to use *_target values
mon/OSDMonitor: set CREATING flag for force-create-pg
mon/OSDMonitor: start sending new-style pg_create2 messages
mon/OSDMonitor: set last_force_resend_prenautilus for pg_num_pending changes
osd: ignore pg creates when pool FLAG_CREATING is not set
mgr: do not adjust pg_num until FLAG_CREATING removed from pool
mon/OSDMonitor: add FLAG_CREATING on upgrade if pools still creating
mon/OSDMonitor: prevent FLAG_CREATING from getting set pre-nautilus
mon/OSDMonitor: disallow pg_num changes while CREATING flag is set
mon/OSDMonitor: set POOL_CREATING flag until initial pool pgs are created
osd/osd_types: add pg_pool_t FLAG_POOL_CREATING
osd/osd_types: introduce last_force_resend_prenautilus
osd/PGLog: merge_from helper
osd: no cache agent or snap trimming during premerge
osd: notify mon when pending PGs are ready to merge
mgr: add simple controller to adjust pg[p]_num_actual
mon/OSDMonitor: MOSDPGReadyToMerge to complete a pg_num change
mon/OSDMonitor: allow pg_num to be adjusted up or down via pg[p]_num_target
osd/osd_types: make pg merge an interval boundary
osd/osd_types: add pg_t::is_merge() method
osd/osd_types: add pg_num_pending to pg_pool_t
osd: allow multiple threads to block on wait_min_pg_epoch
osd: restructure advance_pg() call mechanism
mon/PGMap: prune merged pgs
mon/PGMap: track pgs by state for each pool
osd/SnapMapper: allow split_bits to decrease (merge)
os/bluestore: fix osr_drain before merge
os/bluestore: allow reuse of osr from existing collection
os/filestore: (re)implement merge
os/filestore: add _merge_collections post-check
os: implement merge_collection
os/ObjectStore: add merge_collection operation to Transaction
Commit 0d8887652d ("qa/tasks/cram: use suite_repo repository for all
cram jobs") removed hardcoded git.ceph.com links, but as it turned out
git.ceph.com is still used for nightlies. There is no good way to
accommodate the different URL schemes, so let's get rid of URLs altogether.
Fixes: https://tracker.ceph.com/issues/27211
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
It's no longer necessary to pass `-k testing` to teuthology-suite. We're also
now regularly testing the RHEL 7.5 kernel in upstream testing.
This work is prep for eventually integrating kclient into fs.
Fixes: http://tracker.ceph.com/issues/26995
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
mgr/dashboard: Add support for managing individual OSD settings in the backend
Reviewed-by: Sebastian Wagner <swagner@suse.com>
Reviewed-by: Stephan Müller <smueller@suse.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
Reviewed-by: Volker Theile <vtheile@suse.com>
Currently git.ceph.com is hardcoded for all cram jobs. Testing
modifications is a pain: one needs to push to either ceph/ceph.git or
ceph/ceph-ci.git (depending on where the ceph branch is at, triggering
unnecessary builds in the latter case) and wait for the mirror to sync.
Runs scheduled against branches in developers' forks fail.
Move away from git.ceph.com to allow mixing branches and repositories,
similar to workunits.
Fixes: https://tracker.ceph.com/issues/27211
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Add options to mark OSDs in/out/down/reweight/lost/remove/destroy/create
Fixes: http://tracker.ceph.com/issues/24270
Signed-off-by: Patrick Nawracay <pnawracay@suse.com>
* refs/pull/23540/head:
include/ceph_fs: rename old auid field
PendingReleaseNotes: note about auid support removal
radosgw-admin: remove -a --auth-uid arg
rgw: remove auid member from RGWUserInfo
auth: remove auid member from EntityAuth
osd: remove auid session member
mon: remove auid session member
doc/dev/cephx_protocol: drop auid reference
auth: remove auid args from handle_request and verify_authorizer
mon/OSDMonitor: remove 'osd pool {get,set} <name> auid ...'
mon/OSDMonitor: remove auid arg for 'osd lspools' and deprecate
osd/OSDCap: remove auid from grammar
osd/OSDCap: remove auid from is_capable() etc args
auth: clean up cap parse error messages
mon/AuthMonitor: raise health warning on invalid caps
mon/AuthMonitor: drop ancient auth inc encoding compat
messages/MPoolOp: drop auid member
osdc/Objecter: drop change_pool_auid
pybind/rados: drop auid arg to pool_create
pybind/rados: drop change_auid
rados: drop mkpool, rmpool commands
rados: remove 'chown' command
librados: deprecate calls that take auid
librados: mark all auid calls deprecated
mon/OSDMonitor: drop variable pool auid for prepare_new_pool
mon/OSDMonitor: remove pool auid change support
osdc/Objecter: do not pass auid to create_pool
ceph-authtool: remove auid options
qa/workunits/cephtool: remove auid tests
Reviewed-by: Gregory Farnum <gfarnum@redhat.com>
* refs/pull/23439/head:
qa: whitelist cap revoke warning
doc: document cap revoke non-responders client eviction
test: validate client eviction for cap revoke non-responders
mds: add counter for tracking cap non-responding clients
mds: evict clients that do not respond to cap revoke by MDS
mds: pass timeout argument for fetching late clients
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Zheng Yan <zyan@redhat.com>
* refs/pull/23240/head:
qa/suites/rados, qa/workunits/rados: Add suite/workunit for ceph-crash
add ceph-crash service
common/options: enable mgr 'crash' module by default
global/signal_handler: add 'done' file to signal crashdump is ready
Reviewed-by: Sage Weil <sage@redhat.com>
radosgw now uses 512 frontend threads by default, and valgrind won't
start with its default --max-threads=500
Fixes: http://tracker.ceph.com/issues/25214
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Drop unused suites, which ATM means all of them except upgrade/luminous-x
which recently got a cleanup in https://github.com/ceph/ceph/pull/23162
Signed-off-by: Nathan Cutler <ncutler@suse.com>
* refs/pull/21885/head:
qa: update cluster log health warning message
qa: add tests for client features
mds: evict clients that lack required features
mds: cleanup MDSRank::evict_client
mds: infer client version by client metadata and connection's features
mds: introduce "ceph fs set <fs_name> min_compat_client <release_name>"
mds: tell client why it's rejected
mds: introduce cephfs' own feature bits
mds: make Server::prepare_force_open_sessions() update client metadata
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Having lots of deletes will mean deletes on objects that don't exist,
which will in turn mean error log entries and more coverage of the
append_log_entries_update_missing code. Hopefully this will trigger
http://tracker.ceph.com/issues/24597
Signed-off-by: Sage Weil <sage@redhat.com>
* refs/pull/22740/head:
qa: create common conf for all cephfs suites
qa: remove wrongly created random distro conf
Reviewed-by: Zheng Yan <zyan@redhat.com>
This will be followed by removing common CephFS configurations in the
ceph.conf.template in teuthology.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
This utilizes the recent feature in teuthology [1] to skip hidden files in
suites when building the job matrix.
The idea of this change is to enable referring to the top-level qa directory in a
position-independent way such that copies of a suite to another location do not
break any symlinks.
[1] https://github.com/ceph/teuthology/pull/1185
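One way to get such position-independent references is a chain of hidden relative symlinks. The sketch below assumes a hidden link named `.qa` in every directory that points one level up, ending in a self-referencing link at the top; the link name, the directory layout, and the suite paths are assumptions made for illustration and are not taken from this commit message.

```python
import os
import shutil
import tempfile

root = tempfile.mkdtemp()
qa = os.path.join(root, "qa")
suite = os.path.join(qa, "suites", "fs", "basic")  # hypothetical suite path
os.makedirs(suite)

# The top-level link points at the directory itself; every other directory
# links to its parent's hidden link, so no link encodes its own depth.
os.symlink(".", os.path.join(qa, ".qa"))
for parts in (("suites",), ("suites", "fs"), ("suites", "fs", "basic")):
    os.symlink("../.qa", os.path.join(qa, *parts, ".qa"))

# Copy the suite to a different depth, keeping symlinks as symlinks.
dest = os.path.join(qa, "suites", "basic_copy")
shutil.copytree(suite, dest, symlinks=True)

# The copied link still resolves to the top-level qa directory.
resolved = os.path.realpath(os.path.join(dest, ".qa"))
print(resolved == os.path.realpath(qa))
```

A link such as `../../../something` would have broken in the copy, because the copy sits one level higher than the original suite.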
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
* refs/pull/22596/head:
os/bluestore: use vector instead of set for zombies
os/bluestore: reuse zombie OpSequencers by collection id
qa/suites/rados/objectstore/backends/objectstore: capture coredumps
os/bluestore: more debug output
os/bluestore: print cnode from _open_collections
os/bluestore: print cnode on fsck
qa/suites/rados/objectstore: preserve data dir for ceph_test_objectstore
Reviewed-by: Igor Fedotov <ifedotov@suse.com>
Add ability to list, set and unset cluster-wide OSD flags.
Flags can be listed and changed through the `/api/osd/flags` API
resource. By using a GET request, the list is retrieved. By using a PUT
request, the flags are updated (all at once). Flags not contained in the
data of the PUT are removed, additional ones are added. Note that the
PUT requests require a JSON body with the data contained as value of the
'flags' key like so:
{"flags": ["flag1", "flag2", ...]}
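The replace-all semantics of the PUT request can be sketched as a set difference. This is an illustration of the behaviour described above, not the dashboard's implementation; the helper name `plan_flag_update` is hypothetical and the flag names are only examples.

```python
import json


def plan_flag_update(current_flags, request_body):
    """Return (to_set, to_unset) so the cluster ends up with exactly the requested flags."""
    requested = set(json.loads(request_body)["flags"])
    current = set(current_flags)
    return sorted(requested - current), sorted(current - requested)


# Example: two flags are currently set on the cluster.
current = ["noout", "noscrub"]
body = json.dumps({"flags": ["noout", "nodown"]})

to_set, to_unset = plan_flag_update(current, body)
print("set:", to_set)      # flags in the request that are not set yet
print("unset:", to_unset)  # flags that are set but missing from the request
```

Because the body always describes the complete desired state, a client that only wants to add one flag has to GET the current list first and send it back with the new flag appended.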
Fixes: http://tracker.ceph.com/issues/24056
Signed-off-by: Patrick Nawracay <pnawracay@suse.com>