* refs/pull/39724/head:
qa: skip exit-on-first-failure option for valgrind on ubuntu
mds,qa: exit instead of respawn under valgrind
qa: skip chdir for fuse_mount
qa: ignore all slow request warnings
qa: add new mds beacon grace mon config
qa: wait for MDS to join fsmap
qa: move get_valgrind_args to qa
Reviewed-by: Rishabh Dave <ridave@redhat.com>
* refs/pull/38684/head:
qa: add _check_scrub_status helper to simplify the code
qa: add run_scrub helper in filesystem class
qa: add get_scrub_status helper in filesystem class
qa: wait the scrub task to complete
qa: remove passed_validation check for test_damage
qa: move wait_until_scrub_complete helper to filesystem class
mds: simplify the C_MDS_EnqueueScrub finish code
Reviewed-by: Rishabh Dave <ridave@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
* refs/pull/39832/head:
mgr/DaemonServer: osd ok-to-stop: return json when there are unknown PGs
Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Sebastian Wagner <swagner@suse.com>
* refs/pull/39726/head:
mgr/cephadm: document ok_to_stop output argument for clarity
mgr/DaemonServer: make warning language a bit friendlier
mgr/cephadm/upgrade: improve language a bit
mgr/cephadm/upgrade: restart multiple osds at once
mgr/cephadm: gather other osds that are safe to stop
mgr/cephadm: optional pass 'known' through to ok_to_stop
mgr/cephadm/upgrade: log start/stop/pause/resume
Reviewed-by: Sebastian Wagner <swagner@suse.com>
In 791952cc01 we switched to returning JSON
on both success and failure to describe which PGs are affected or are blocking
the ability to stop/restart OSDs. Do the same for the case where
some PG states are unknown (e.g., just after a mgr restart) so that
the cephadm upgrade process can unconditionally expect a JSON result.
Signed-off-by: Sage Weil <sage@newdream.net>
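A minimal sketch of what "unconditionally expect a JSON result" allows on the caller side, assuming the command's return code and stdout are already in hand. The function name parse_ok_to_stop and the keys in the sample payload are illustrative assumptions, not the actual cephadm code or the exact mgr output schema.

```python
import json
from typing import Any, Dict, Tuple


def parse_ok_to_stop(retcode: int, stdout: str) -> Tuple[bool, Dict[str, Any]]:
    """Hypothetical helper: decode the reply the same way on every path."""
    # Because success, failure, and unknown-PG replies are all JSON,
    # no special case is needed for a plain-text error message.
    details = json.loads(stdout)
    return retcode == 0, details


# Illustrative payload only; the real field names may differ.
sample = '{"ok_to_stop": false, "osds": [3]}'
safe, details = parse_ok_to_stop(1, sample)
```

The caller decides from the return code and uses the decoded details only for reporting, so a non-zero exit never needs separate text parsing.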
This is being done from ansible now. Also, it breaks when
the conf file has unqualified-search-registries but not 'registry'
entries.
Signed-off-by: Sage Weil <sage@newdream.net>
crimson/osd: do not pass lvalue of the lambda to seastar::futurize_invoke
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
crimson/osd: capture error_code by value in PG::handle_failed_op
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Xuehan Xu <xxhdx1985126@gmail.com>
As of a49d1dbb32, when the rbd_rwl_cache and
rbd_ssd_cache bconds are enabled and WITH_SYSTEM_PMDK is disabled (as it is by
default), the RPM build attempts to
git clone https://github.com/ceph/pmdk.git
but of course that won't work in the OBS, where the build workers have no
Internet connectivity.
Fortunately, the openSUSE/SLE versions targeted by Ceph master and pacific ship
the necessary PMDK libraries as RPM packages.
Fixes: a49d1dbb32
Fixes: https://tracker.ceph.com/issues/49550
Signed-off-by: Nathan Cutler <ncutler@suse.com>
* refs/pull/39780/head:
qa/vstart_runner: dont log "not Ceph bin" msg too often
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
* refs/pull/39681/head:
vstart_runner: define path to ceph binary and use it
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
valgrind can't handle execve of /proc/self/exe:
2021-02-27T05:52:37.813 INFO:tasks.ceph.mds.d.smithi073.stderr:==00:01:03:20.556 41218== execve(0x18546740(/proc/self/exe), 0x18546670, 0x133ef310) failed, errno 2
2021-02-27T05:52:37.813 INFO:tasks.ceph.mds.d.smithi073.stderr:==00:01:03:20.556 41218== EXEC FAILED: I can't recover from execve() failing, so I'm dying.
2021-02-27T05:52:37.813 INFO:tasks.ceph.mds.d.smithi073.stderr:==00:01:03:20.556 41218== Add more stringent tests in PRE(sys_execve), or work out how to recover.
So configure the MDS to just exit so it can be restarted by QA infra (the
daemon watchdog).
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Using chdir breaks nsenter when running under valgrind:
2021-03-03T02:13:49.897 DEBUG:teuthology.orchestra.run.smithi144:> sudo nsenter --net=/var/run/netns/ceph-ns--home-ubuntu-cephtest-mnt.0 cd /home/ubuntu/cephtest && sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage daemon-helper term env 'OPENSSL_ia32cap=~0x1000000000000000' valgrind --trace-children=no --child-silent-after-fork=yes '--soname-synonyms=somalloc=*tcmalloc*' --num-callers=50 --suppressions=/home/ubuntu/cephtest/valgrind.supp --xml=yes --xml-file=/var/log/ceph/valgrind/client.0.log --time-stamp=yes --vgdb=yes --exit-on-first-error=yes --error-exitcode=42 --tool=memcheck --leak-check=full --show-reachable=yes ceph-fuse -f --admin-socket '/var/run/ceph/$cluster-$name.$pid.asok' --id 0 /home/ubuntu/cephtest/mnt.0
2021-03-03T02:13:49.899 DEBUG:teuthology.orchestra.run.smithi144:> sudo modprobe fuse
2021-03-03T02:13:49.914 INFO:teuthology.orchestra.run:Running command with timeout 30
2021-03-03T02:13:49.914 DEBUG:teuthology.orchestra.run.smithi144:> sudo mount -t fusectl /sys/fs/fuse/connections /sys/fs/fuse/connections
2021-03-03T02:13:49.919 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi144.stderr:nsenter: failed to execute cd: No such file or directory
It's not necessary to chdir at all to do the mount, so don't.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
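The failure in the log above comes from nsenter trying to execute `cd` as a program. A minimal sketch of building the command without the chdir prefix follows; nsenter_command and the paths are hypothetical stand-ins, not the fuse_mount code itself.

```python
from typing import List


def nsenter_command(netns_path: str, daemon_args: List[str]) -> List[str]:
    """Hypothetical helper: wrap a daemon command line in nsenter."""
    # nsenter executes its first non-option argument directly, so that
    # argument must be a real program, not a shell builtin such as `cd`.
    return ["sudo", "nsenter", f"--net={netns_path}"] + daemon_args


# Example paths are placeholders for illustration.
cmd = nsenter_command(
    "/var/run/netns/example",
    ["ceph-fuse", "-f", "--id", "0", "/mnt/example"],
)
```

With no `cd ... &&` in front, the first thing nsenter runs is the daemon command itself.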
When running under valgrind, an MDS may be slow to be added to the FSMap
(especially if the mons are under valgrind too). The file system creation that
follows will throw unnecessary warnings about insufficient standbys if
no MDS is available.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
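The wait described above can be sketched as a polling loop with a deadline. This is an illustration under stated assumptions: wait_for_mds and the count_mds callable are hypothetical and stand in for however the QA code queries the FSMap.

```python
import time
from typing import Callable


def wait_for_mds(count_mds: Callable[[], int], wanted: int = 1,
                 timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Hypothetical helper: poll until enough MDS daemons are visible."""
    deadline = time.monotonic() + timeout
    while True:
        # count_mds is assumed to return the number of MDS daemons
        # currently known to the cluster.
        if count_mds() >= wanted:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Creating the file system only after this returns True avoids the spurious insufficient-standby warnings; a generous timeout makes sense when the daemons run under valgrind.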
This method is unused in the teuthology repo. The helper
belongs here, where it is easier to modify.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>