The current code of kill_daemons() was killing daemons one after the
other, waiting for each to actually die before switching to the next one.
This patch makes the kill_daemons() loop run in parallel to avoid
this bottleneck.
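In shape, the change amounts to something like the following (a simplified
sketch of the idea, not the exact ceph-helpers.sh code; the $dir variable and
the *.pid file layout are assumptions made for the illustration):

kill_daemons() {
    local dir=$1
    local pids=""
    for pidfile in $(find "$dir" -name '*.pid' 2>/dev/null); do
        (
            pid=$(cat "$pidfile")
            kill "$pid" 2>/dev/null
            # wait for this particular daemon to actually die
            while kill -0 "$pid" 2>/dev/null; do
                sleep 1
            done
        ) &
        pids="$pids $!"
    done
    # a single wait for all the background killers, instead of one per daemon
    wait $pids
}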
Signed-off-by: Erwan Velu <erwan@redhat.com>
This script had the following performance issues:
- 4 ceph-dencoders were spawned sequentially
- the same dencoder command was run twice
This patch adds parallelism around the 4 sequential calls and also avoids
testing the deterministic feature twice.
On a recent laptop, this patch drops the running time from 7m to 3m46
while keeping the loadavg < 2.
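The shape of the change is roughly the following (an illustrative sketch only;
the real script spawns four such invocations, and $type/$file stand in for the
actual loop variables):

ceph-dencoder type "$type" import "$file" decode dump_json > out1 &
pid1=$!
ceph-dencoder type "$type" import "$file" decode encode decode dump_json > out2 &
pid2=$!
# wait for both background invocations instead of running them back to back
wait $pid1 $pid2
# the dumps are compared once the background jobs have finished
cmp out1 out2 || echo "dump mismatch for $type"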
Signed-off-by: Erwan Velu <erwan@redhat.com>
The current code was running two ceph-dencoder calls sequentially.
Each call executes pretty fast, but run one after the other and multiplied by
the number of loops to execute, they have a cost.
This patch simply makes these two calls run in parallel.
As a result, the test/encoding/readable.sh test runs in 4m50 instead of 6m.
The associated loadavg isn't impacted, as it stays at 6 while being run with
nproc=8.
This patch saves 1/6th of the build time without impacting the loadavg.
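The generic pattern is to background both calls and collect both exit statuses
(a rough sketch; check_a/check_b are placeholders for the two dencoder
invocations):

check_a > out_a & pid_a=$!
check_b > out_b & pid_b=$!
wait $pid_a ; status_a=$?
wait $pid_b ; status_b=$?
# only succeed if both background commands succeeded
[ "$status_a" -eq 0 ] && [ "$status_b" -eq 0 ]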
Signed-off-by: Erwan Velu <erwan@redhat.com>
When running make -j x check, we face a weird situation where the makefile
targets are spawned in parallel up to "x", but one of those targets is very
long and sequential.
The "readable.sh" test tries to run ~7.9K tests, of which 5.3K are actually
executed.
The current code takes 23 minutes on a recent laptop (Intel(R) Core(TM)
i7-4810MQ CPU @ 2.80GHz, 32GB of RAM & SSD).
This patch implements parallelism to speed up this process, which is neither
really CPU bound nor IO bound.
By default, readable.sh now uses the number of logical processors (as reported
by nproc) to determine the level of parallelism. If needed, defining the
MAX_PARALLEL_JOBS variable will override this default value.
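The throttling logic can be sketched as follows (a simplified illustration;
$all_tests and run_one_test are placeholders, and the real script may differ):

MAX_PARALLEL_JOBS=${MAX_PARALLEL_JOBS:-$(nproc)}
for test in $all_tests; do
    # throttle: never keep more than MAX_PARALLEL_JOBS tests in flight
    while [ "$(jobs -pr | wc -l)" -ge "$MAX_PARALLEL_JOBS" ]; do
        sleep 0.1
    done
    run_one_test "$test" &
done
# wait for the remaining background tests to complete
wait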
On the same system, where nproc=8, the resulting execution time is 5m55:
4x faster than the original code.
The global 'make check' therefore gets faster too, dropping from 30 to
16 minutes: 2x faster than the original code.
Signed-off-by: Erwan Velu <erwan@redhat.com>
This commit introduces two new functions in ceph-helpers.sh to ease
parallelism in tests: run_in_background() & wait_background().
The first one allows you to spawn processes or functions in the background and
saves the associated pid in a variable passed as first argument.
The second one waits for those pids to complete and reports their exit status.
If one or more of them failed, wait_background() reports a failure.
A typical usage looks like:
pids1=""
run_in_background pids1 bash -c 'sleep 5; exit 0'
run_in_background pids1 bash -c 'sleep 1; exit 1'
run_in_background pids1 my_bash_function
wait_background pids1
The variable that contains the pids is local, which makes it possible to nest
calls to these two new functions.
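For reference, a minimal sketch of what such helpers can look like (an
illustration of the idea only, not necessarily the exact ceph-helpers.sh
implementation):

function run_in_background() {
    local pid_variable=$1
    shift
    # run the command (or function) in the background and record its pid
    "$@" &
    eval "$pid_variable+=\" $!\""
}

function wait_background() {
    local pid_variable=$1
    local return_code=0
    # wait for every recorded pid and remember if any of them failed
    for pid in ${!pid_variable}; do
        wait "$pid" || return_code=1
    done
    eval "$pid_variable=''"
    return $return_code
}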
Signed-off-by: Erwan Velu <erwan@redhat.com>
These commands are registered in the set_up_admin_socket() method,
but don't get unregistered correctly in clean_up_admin_socket(),
which is not appropriate.
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
From the call stack handle_core_message()->handle_mds_map(),
handle_core_message() expects that handle_mds_map() will definitely
take good care of the CEPH_MSG_MDS_MAP message and will drop
the message reference on returning.
However, in the code path below, it does not do such a tidy-up.
Although this is not strictly necessary because we are going to shut down
anyway, it is a good habit to do the cleanup.
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
On successfully handling a core message, the handle_core_message()
method shall return true and is responsible for dropping the corresponding
message's reference so that it is correctly released.
However, if we receive a CEPH_MSG_OSD_MAP message, we won't drop
its reference unless it is neither from a monitor nor an OSD, which means
in most cases we won't release this kind of message correctly.
This PR solves the above problem by dropping the message reference
appropriately.
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
Back in 8290536d7d we moved the
apply_changes (and, indirectly, config var expansion) to happen
after we drop privileges, but we need the metavar
expansion for setuser_match_path (which the docs suggest setting to
/var/lib/ceph/$type/$cluster-$id).
Fixes: http://tracker.ceph.com/issues/15365
Signed-off-by: Sage Weil <sage@redhat.com>
Since the merge of pr #7693, using 'ceph command' to get the help is no longer
valid. As a result, the 'test/cephtool-test-mon.sh' test was broken.
This patch simply changes the 'ceph command' invocation to 'ceph --help command'.
With this change, the test passes again.
Signed-off-by: Erwan Velu <erwan@redhat.com>
time_t's width varies between machines. It also fails to compile on 32-bit
Linux.
Fixes: http://tracker.ceph.com/issues/15330
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Since we're decoding 32-bit integers, just use uint32_t and then cast them to
what utime_t expects.
Fixes: http://tracker.ceph.com/issues/15330
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
The operation completion could finish and be freed before we
do the info->register_tid assignment. Avoid this by doing the
assignment in _op_submit itself.
Fixes: #14364
Signed-off-by: Sage Weil <sage@redhat.com>
This breaks commands like
COMMAND("fs flag set name=flag_name,type=CephChoices,strings=enable_multiple "
"name=val,type=CephString", \
"Set a global CephFS flag", \
"fs", "rw", "cli,rest")
with only one option:
PUT fs/flag/set?flag_name=enable_multiple&val=true: 400
FAILURE: url http://localhost:5000/api/v0.1/fs/flag/set?flag_name=enable_multiple&val=true
expected 200, got 400
Response content: <html><body><table border=1><th>Possible commands:</th><th>Method</th><th>Description</th><tr><td>fs/flag/set?flag_name=enable_multiple&va
l=val(<string>)
</td><td>PUT</td><td>Set a global CephFS flag</td></tr>
</table></body></html>
...and I can't tell why it's there.
Signed-off-by: Sage Weil <sage@redhat.com>
This should only happen with a buggy client, but we should avoid crashing,
and send a polite error message back.
Signed-off-by: Sage Weil <sage@redhat.com>