If we have an OSD with a weight that's not 1.0 and mark it out,
we should restore the same weight when we mark it back in. We
already do this when an OSD is automatically marked out, just
not when it is explicitly marked out.
Signed-off-by: Sage Weil <sage@redhat.com>
The variable 'pgs_per_osd' set value from 'new_pgs' divided by 'expected_osds',
and its type is integer. So it would remove the decimal point and get smaller value.
This would have problem in some situations, for exmaple:
The limitation of pg creating for one OSD is '32'.
There have 3 OSDs and I want to increase pgs for a pool.
It should be the limitation for creating new pgs up to '96(32 * 3)' at once.
Now, I create '98' pgs for a pool.
In original code, '98' would be divided by 'expected_osds' and get the floating value '32....'
Because of the type which is integer, the 'pgs_per_osd' would be set to 32.
Then the value won't bigger than the limitation and get the wrong result.
Signed-off-by: DesmondS <desmond.s@inwinstack.com>
Fixes: http://tracker.ceph.com/issues/17169
For newly created cluster the CEPH_OSDMAP_REQUIRE_KRAKEN will be
automatically set, while for existing clusters it will not.
This change add "require_jewel_osds" to white list, so user
can access it by the "ceph osd set *" command family.
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
create temp directory and files in $TMPDIR. the $TMPDIR is hard-wired to
/tmp before this change, we'd better respect the env variable $TMPDIR,
so it would be more consistent, and easier to do the cleanup if any.
Signed-off-by: Kefu Chai <kchai@redhat.com>
this fixes failures like,
/home/jenkins-build/build/workspace/ceph-pull-requests/qa/workunits/cephtool/test.sh:
line 32: ceph osd blacklist ls | grep 192.168.0.1: command not found
where the failure is not the "failure" we are expecting.
in our tests, following command
expect_false "ceph osd blacklist ls | grep 192.168.0.1"
is designed to to verify that "ceph osd blacklist ls | grep 192.168.0.1"
fails with non-zero return code. but expect_false() evaluates the command
line using plain "$@", which will send the arguments direct to the shell,
and $0 is "ceph auth get client.xx | grep caps | grep mon", which does
not exist and is not built-in command. so we need to check the grep
command instead.
for multiple piped command line, use
expect_false sh <<< "echo foo | grep bar | grep baz"
Signed-off-by: Kefu Chai <kchai@redhat.com>
we set the CEPH_CLI_TEST_DUP_COMMAND enn var to verify the successful
commands are idempotent. but some of them are just not. among the other
things:
- ceph tell mds.a exit
- ceph tell mds.a respawn
the respawn command restart the mds daemon, its bind port changes and
all run-time status are reset. so strictly speaking, even the from the
point of view of client, this command is not idempotent. further more,
it fails the test, if the client sends the 2nd command too soon. because
the monitor might not able to update the re-spawned mds address before
the client asking for the new fsmap. so the cephfs client will just
use the old address of the specified mds, and hence will send the
request to port no one is listening anymore.
Signed-off-by: Kefu Chai <kchai@redhat.com>
When this test is failing and reach the limits, reading the log doesn't make
obvious that we reach them.
This simple patch adds the iterations numbers inside the output log.
Signed-off-by: Erwan Velu <erwan@redhat.com>
* test_osd_bench: the injectargs call actually fails due to the spaces
at the beginning. so remove the spaces in args before sending it to
injectargs
* update/add some injectargs tests accordingly
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
in cf24535, we use $CEPH_ROOT to specify the $top_srcdir to unify
cmake and autotools, but this breaks ceph-qa-suite/tasks/workunit.py,
as it only clones the necessary qa/workunits directory, and does not
pass $CEPH_ROOT to the test scripts. so we need to set a default
$CEPH_ROOT if it is not set.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Replaced relative paths in test/cephtool-test-mon.sh,
qa/workunits/cephtool/test.sh, and test/cephtool-test-mon.sh
to work with CEPH_FOO environment variables set in cmake.
Signed-off-by: Ali Maredia <amaredia@redhat.com>
Protect a number of unstable/experimental features behind durable flags
https://github.com/ceph/ceph/pull/8383
Reviewed-by: John Spray <john.spray@redhat.com>
Method preprocess_remove_snaps() is designed to fast check whether
we can safely handle a remove-snaps-request without changing the osdmap.
The original design is to be able to handle snaps from multiple pools,
including those snaps even from a non-existent pool by simply skipping
over them. However, this method will quit on successfully detecting
any vaild snap which is truly needed to be removed and forward this
request to prepare_remove_snaps() for further processing.
From the above analysis, the prepare_remove_snaps() method will
theoretically also encounter some snaps which possibly belong to
non-existent pools.
This pr solves the above problem by adding a sanity check against
pool existense associated with the specified snap to be removed, which
shall be considered as a defensive move and makes prepare_remove_snaps()
stronger.
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
The current code was waiting 10s to expect the file being put.
If the file was put in a shorter time than 10s, the test just waits for
nothing reducing the execution speed of that test.
This patch simply check if the file is actually available every second
during 10sec to exit prematurely.
This patch saves exactly 10 sec on a local system, surely a little bit
less on an infra but still saves time.
Signed-off-by: Erwan Velu <erwan@redhat.com>
The actual code double the wait time between two calls leading to a
possible 511s of waiting time which sounds a little bit excessive.
This patch offer to reduce the global wait time to 300s and test more
often the rados status to exit the loop earlier. In a local test, that
saves 6 secs per run.
Signed-off-by: Erwan Velu <erwan@redhat.com>
ceph_watch_wait() is doing a sleep _before_ doing the test which could
stop this loop.
It's better doing the action first as it could exit immediately and
avoid a useless sleep.
That's a minor optimization but everything count when trying to get
something smooth.
Signed-off-by: Erwan Velu <erwan@redhat.com>
OSDs are taking some time to be up but waiting 10 secs seems execessive
here between two loops. In the worst case, we can be in a situation of
waiting 10secs for nothing as we are just a few microsecs after the osd
is up.
This patch simply reduce the sleep from 10 to 1 seconds.
Signed-off-by: Erwan Velu <erwan@redhat.com>
Since the merge of pr #7693, 'ceph command' to get the help is invalid.
As a result, 'test/cephtool-test-mon.sh' test was broken
This patch simply change the 'ceph command' by a 'ceph --help command'
Since this change the test is passing again.
Signed-off-by: Erwan Velu <erwan@redhat.com>