Also remove the src/test/gtest-parallel submodule, because gtest-parallel is
only useful for running tests, and not all end users are interested in
running tests, let alone running them in parallel. So, to avoid
including the gtest-parallel scripts in the dist tarball, it is better to
make it optional and pull it in as an external project.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Unit tests run sequentially and can take a long while to complete.
This commit enables gtest-parallel on some of them that are
known to be very slow due to this sequentiality.
To enable the parallel features, the 'parallel' argument just has to be
added to the add_ceph_unittest() call, as in:
-add_ceph_unittest(unittest_throttle)
+add_ceph_unittest(unittest_throttle parallel)
This commit impacts the following tests:
Test name                       Before  After (in seconds)
unittest_erasure_code_shec_all     212      43
unittest_throttle                   15       5
unittest_crush                       9       6
unittest_rbd_mirror                 79      21
Total                              315      75
This commit saves 240 seconds (4 minutes) per build.
Note that several other long tests exist but cannot be parallelized,
since their subtests have explicit dependencies on the order they run in.
Those stay sequential.
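The mechanism can be illustrated with a minimal sketch (run_case is a hypothetical stand-in for one gtest case, not the actual gtest-parallel invocation): independent cases are launched concurrently instead of one after another, which is where the wall-clock savings come from.

```shell
#!/bin/sh
# Minimal sketch of the idea behind gtest-parallel: shard independent
# test cases across concurrent jobs instead of running them serially.
run_case() {
    sleep 0.2              # simulate the work done by one test case
    echo "case $1: OK"
}
for i in 1 2 3 4; do
    run_case "$i" &        # launch each case as a background job
done
wait                       # only report success once every case finished
```

With four cases of 0.2s each, the serial run would take ~0.8s while the sharded run takes ~0.2s, mirroring the Before/After ratios in the table above.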
Signed-off-by: Erwan Velu <erwan@redhat.com>
Adding a dependency on cls_otp for the radosgw so
that when the vstart target is made, `radosgw-admin
mfa` commands work.
Signed-off-by: Ali Maredia <amaredia@redhat.com>
* refs/pull/22554/head:
qa/standalone/ceph-helpers.sh: Fixing comment for wait_for_health()
tests: Protecting rados bench against endless loop
qa/standalone/ceph-helpers.sh: Defining custom timeout for wait_for_clean()
Reviewed-by: Piotr Dałek <piotr.dalek@corp.ovh.com>
Reviewed-by: Willem Jan Withagen <wjw@digiware.nl>
mgr/dashboard: Add help menu entry
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Patrick Nawracay <pnawracay@suse.com>
Reviewed-by: Ricardo Dias <rdias@suse.com>
Reviewed-by: Stephan Müller <smueller@suse.com>
Reviewed-by: Volker Theile <vtheile@suse.com>
Enables token authentication for the Grafana proxy as an additional
option to username/password authentication. The authentication method
has to be set, too.
$ ceph dashboard set-grafana-api-token <token> # default: ''
$ ceph dashboard set-grafana-api-auth-method <method> # default: ''
Possible values for the authentication method are 'password' and
'token'.
Signed-off-by: Patrick Nawracay <pnawracay@suse.com>
If the cluster dies during rados bench, the maximum running time is
no longer honored and all issued aios stay pending.
rados bench never quits, and the global testing timeout (3600 sec, i.e.
1 hour) has to be reached to get a failure.
This situation is dramatic for a background test or a CI run, as it locks
the whole job far too long for an event that will never occur.
The ideal solution would be for 'rados bench' to report a failure
once the timeout is reached while aios are still pending.
A possible workaround here is to use the system command 'timeout'
before calling rados bench and fail if rados did not complete in time.
To avoid side effects, this patch doubles the rados timeout: if rados
has not completed after twice the expected time, it has to fail to avoid
locking the whole testing job.
Please find below how it behaved on a real test case.
We can see no IO after t>2, but despite timeout=4 the bench continues.
Thanks to this patch, the bench is stopped at t=8 and returns 1.
5: /home/erwan/ceph/src/test/smoke.sh:55: TEST_multimon: timeout 8 rados -p foo bench 4 write -b 4096 --no-cleanup
5: hints = 1
5: Maintaining 16 concurrent writes of 4096 bytes to objects of size 4096 for up to 4 seconds or 0 objects
5: Object prefix: benchmark_data_mr-meeseeks_184960
5: sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
5: 0 0 0 0 0 0 - 0
5: 1 16 1144 1128 4.40538 4.40625 0.00412965 0.0141116
5: 2 16 2147 2131 4.16134 3.91797 0.00985654 0.0109079
5: 3 16 2147 2131 2.77424 0 - 0.0109079
5: 4 16 2147 2131 2.0807 0 - 0.0109079
5: 5 16 2147 2131 1.66456 0 - 0.0109079
5: 6 16 2147 2131 1.38714 0 - 0.0109079
5: 7 16 2147 2131 1.18897 0 - 0.0109079
5: /home/erwan/ceph/src/test/smoke.sh:55: TEST_multimon: return 1
5: /home/erwan/ceph/src/test/smoke.sh:18: run: return 1
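The guard can be sketched generically as follows, using `sleep` as a hypothetical stand-in for the benched command (the real test wraps `rados bench` the same way, as shown in the log above):

```shell
#!/bin/sh
# Sketch of the workaround: give the command twice its expected runtime,
# then let `timeout` kill it, turning a hang into a hard failure.
expected=4                          # seconds the bench is asked to run
timeout $((expected * 2)) sleep 1   # stand-in for: rados -p foo bench $expected write
status=$?
if [ "$status" -ne 0 ]; then
    echo "bench did not complete within $((expected * 2))s: fail" >&2
    exit 1
fi
echo "bench completed, exit=$status"
```

When the wrapped command hangs, `timeout` kills it after 2x the expected time and the non-zero exit status fails the test immediately instead of stalling the job for an hour.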
Signed-off-by: Erwan Velu <erwan@redhat.com>
wait_for_clean() uses the default timeout, i.e. 300 sec = 5 min.
wait_for_clean() tries to find a clean status within that timeout,
_or_ resets its counter if any progress was made between loops.
If the cluster is sane, the recovery should complete in well under
5 minutes, but if the cluster died, waiting 5 minutes for nothing is
inefficient.
This patch defines a custom timeout for wait_for_clean() so it does
not wait much more than 90 seconds (1m30). If no progress is made in that
period, there is very little chance of ever reaching a valid state
anyway.
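A minimal sketch of such a bounded wait loop (check_done is a hypothetical stand-in for the cluster status probe used by wait_for_clean(); here it succeeds on the third call so the sketch terminates quickly):

```shell
#!/bin/sh
# Sketch of a wait loop with a caller-supplied timeout instead of the
# hard-coded 300s default.
tries=0
check_done() {                      # hypothetical probe, clean on 3rd call
    tries=$((tries + 1))
    [ "$tries" -ge 3 ]
}
wait_for_condition() {
    timeout=${1:-300}               # default stays 300s; callers may pass 90
    elapsed=0
    until check_done; do
        elapsed=$((elapsed + 1))
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1                # no valid state reached in time: give up
        fi
        sleep 1
    done
    return 0
}
wait_for_condition 90 && echo "cluster clean" || echo "timed out"
```

Passing 90 instead of relying on the default caps the wasted time on a dead cluster at a minute and a half rather than five minutes.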
Signed-off-by: Erwan Velu <erwan@redhat.com>
In spdk v18.05, libuuid is linked by libspdk_util.a, where it is
used by lib/util/uuid.c, and libspdk_vol.a uses the wrapper
function exposed by libspdk_util.a, so update the CMake script to
reflect the change.
Signed-off-by: Kefu Chai <kchai@redhat.com>
The Message classes are shared by the OSD and other components of Ceph,
and the throttle in the Policy class differs between the seastar and
non-seastar worlds. We will have different implementations for
seastar applications and non-seastar apps; to consolidate these
two implementations, we need to introduce a common interface for
them.
Signed-off-by: Kefu Chai <kchai@redhat.com>
This commit adds the config options stored by the MON database to the
configuration documentation page.
One can filter for these config options by setting the 'Source' filter
to 'mon' on the configuration documentation page.
Signed-off-by: Tatjana Dehler <tdehler@suse.com>
The Policy implementation takes care of evenly balancing images
across rbd-mirror instances. This is done when images are
added to the map and/or instances are added or removed, with
the exception of image removal: removing images does not
reshuffle other (mapped) images, which can leave some of
the instances underloaded (in the worst case, if one removes
images which all map to a particular instance, that instance
would remain idle until more images are added or a shuffle is
triggered).
We could possibly trigger a map shuffle when images are removed,
but that would change the interface between the Policy and ImageMap
classes (in the form of changes to Policy::remove_images()). Also,
the policy (and its implementations) would have to do more work when
the above class method is invoked.
Therefore, an interval-based rebalancer is added to ImageMap for
periodic rebalancing of images, but only if the following conditions
are met:
- the policy has been idle for a configured time duration
- there are no scheduled or in-transit operations
Signed-off-by: Venky Shankar <vshankar@redhat.com>
The final state transition when disassociating (removing) images
does not purge the image state map for a given image. This can
also result in an uneven balance of images across instances, as the
policy implementation relies on this structure to figure out the
total number of images tracked.
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Also templatize it, as we need to share Policy between the seastar app and
non-seastar apps, and the Throttle interface for seastar differs
from the non-seastar one, so we should templatize the Policy and
PolicySet.
Signed-off-by: Kefu Chai <kchai@redhat.com>