The upper limit for OSD/MDS ports changed from 7100 to 7300 in commit
f9ec5a7945. Update the Quick Start
Preflight documentation to reflect this change.
Signed-off-by: Ken Dreyer <kdreyer@redhat.com>
The xio_msg pointers to be freed in XioPortal::release_xio_rsp() are no
longer valid after a call to xio_connection_destroy(). We were already
avoiding the call to xio_release_msg() in this case, but were still
dereferencing the xio_msg for its user_context pointer. Moved the check
for is_connected() outside of the loop to avoid any access to msg.
Suggested-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Casey Bodley <casey@cohortfs.com>
Accelio uses rdtsc to generate xio_msg.timestamp, which can't be
reliably converted to a timeval. Use ceph_clock_now() instead to assign
Message::recv_stamp and recv_complete_stamp.
Signed-off-by: Casey Bodley <casey@cohortfs.com>
A missing nonce in the osd addrs was preventing the monitor from
detecting osd restarts. XioMessenger::bind() now sets the nonce in the
same way that SimpleMessenger and AsyncMessenger do.
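Why the nonce matters can be sketched as follows. This is a minimal illustration, not Ceph code: the `Addr` class and the `restarted` helper are hypothetical, and the idea of a per-process, PID-like nonce value is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Addr:
    """Hypothetical stand-in for a daemon address: ip, port and nonce."""
    ip: str
    port: int
    nonce: int = 0  # 0 models the bug: bind() never set a nonce


def restarted(known: Addr, reported: Addr) -> bool:
    """Same ip:port but a different nonce means a new daemon instance."""
    same_endpoint = (known.ip, known.port) == (reported.ip, reported.port)
    return same_endpoint and known != reported


# Without a nonce, the address before and after a restart is identical,
# so the restart goes unnoticed.
missed = restarted(Addr("10.0.0.1", 6800), Addr("10.0.0.1", 6800))

# With a per-process nonce (assumed PID-like here), the restart is visible.
detected = restarted(Addr("10.0.0.1", 6800, nonce=4242),
                     Addr("10.0.0.1", 6800, nonce=4310))

print(missed, detected)
```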
Signed-off-by: Casey Bodley <casey@cohortfs.com>
Signed-off-by: Vu Pham <vu@mellanox.com>
Assign connections to a specific lane of a portal in a better way,
avoiding lane competition/hogging.
This change resolves the slow ramp-up and spiky behavior seen while
clients start and run I/O.
Signed-off-by: Vu Pham <vu@mellanox.com>
Prior to this commit, if a user installed the "ceph-common" Debian
package without installing "ceph", then /usr/bin/ceph would crash
because it was missing the ceph_argparse library.
Ship the ceph_argparse library in "ceph-common" instead of "ceph". (This
was the intention of the original commit that moved argparse to "ceph",
2a23eac54957e596d99985bb9e187a668251a9ec)
http://tracker.ceph.com/issues/11388
Refs: #11388
Reported-by: Jens Rosenboom <j.rosenboom@x-ion.de>
Signed-off-by: Ken Dreyer <kdreyer@redhat.com>
When doing seq and rand read benchmarks using rados bench, a large
portion of CPU time is consumed by object verification. This patch
adds an option to disable this verification when it's not needed, in turn
giving better cluster utilization. Results of "rados -p storage bench 600 rand"
without --no-verify:
Total time run: 600.228901
Total reads made: 144982
Read size: 4194304
Bandwidth (MB/sec): 966
Average IOPS: 241
Stddev IOPS: 38
Max IOPS: 909522486
Min IOPS: 0
Average Latency: 0.0662
Max latency: 1.51
Min latency: 0.004
real 10m1.173s
user 5m41.162s
sys 11m42.961s
Same command, but with --no-verify:
Total time run: 600.161379
Total reads made: 174142
Read size: 4194304
Bandwidth (MB/sec): 1.16e+03
Average IOPS: 290
Stddev IOPS: 20
Max IOPS: 909522486
Min IOPS: 0
Average Latency: 0.0551
Max latency: 1.12
Min latency: 0.00343
real 10m1.172s
user 4m13.792s
sys 13m38.556s
Note the decreased latencies, increased bandwidth and more reads performed.
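The comparison can be made concrete with a little arithmetic. The sketch below only re-derives numbers from the two runs quoted above; the helper names are made up for illustration, and the reported MB/sec is assumed to use 1 MB = 2**20 bytes, which is consistent with the printed bandwidth.

```python
MIB = 1024 * 1024
READ_SIZE = 4194304  # bytes per read, from the output above

# Figures copied from the two runs above (user CPU converted to seconds).
verify = {"seconds": 600.228901, "reads": 144982,
          "latency": 0.0662, "user_cpu": 5 * 60 + 41.162}
no_verify = {"seconds": 600.161379, "reads": 174142,
             "latency": 0.0551, "user_cpu": 4 * 60 + 13.792}


def bandwidth_mb_s(run):
    """Bandwidth as reads * read size / elapsed time, in MB/sec."""
    return run["reads"] * READ_SIZE / MIB / run["seconds"]


def pct_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


reads_gain = pct_change(verify["reads"], no_verify["reads"])
latency_change = pct_change(verify["latency"], no_verify["latency"])
user_cpu_change = pct_change(verify["user_cpu"], no_verify["user_cpu"])

print(f"bandwidth: {bandwidth_mb_s(verify):.0f} -> {bandwidth_mb_s(no_verify):.0f} MB/sec")
print(f"reads: {reads_gain:+.1f}%  avg latency: {latency_change:+.1f}%  "
      f"user CPU: {user_cpu_change:+.1f}%")
```

For these two runs this works out to about 20% more reads, about 17% lower average latency, and about 26% less user CPU time.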
Signed-off-by: Piotr Dałek <piotr.dalek@ts.fujitsu.com>
- to avoid a wave of scrubs when osd_scrub_max_interval is reached on a
high-load OSD, the scrub time is randomized.
- extract scrub_load_below_threshold() out of scrub_should_schedule()
- schedule an automatic scrub job at a time which is uniformly distributed
over [now+osd_scrub_min_interval,
now+osd_scrub_min_interval*(1+osd_scrub_time_limit)]. Before
this change, such scrubs were performed once the hard interval
had ended or the system load was below the threshold. With this change, the
jobs are performed as long as the load is low or the interval of
the scheduled scrubs is longer than conf.osd_scrub_max_interval. All
automatic jobs should be performed in the configured time period; otherwise
they are postponed.
- a requested scrub job is scheduled right away. Before this change,
it was queued with the timestamp of `now` and postponed until after
osd_scrub_min_interval.
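The randomized scheduling window described above can be sketched as follows. This is an illustration only, not the OSD code: `schedule_scrub` is a hypothetical helper, its parameters stand in for osd_scrub_min_interval and osd_scrub_time_limit, and the numeric values are made up.

```python
import random


def scrub_window(now, min_interval, time_limit):
    """Return (earliest, latest) for an automatic scrub job."""
    earliest = now + min_interval
    latest = now + min_interval * (1.0 + time_limit)
    return earliest, latest


def schedule_scrub(now, min_interval, time_limit, rng=random):
    """Pick a scrub time uniformly distributed over the window."""
    earliest, latest = scrub_window(now, min_interval, time_limit)
    return rng.uniform(earliest, latest)


# Made-up values: a one-day minimum interval and a limit ratio of 0.5.
now = 1_000_000.0
min_interval = 24 * 3600.0
time_limit = 0.5

rng = random.Random(0)  # seeded only to make the sketch repeatable
times = [schedule_scrub(now, min_interval, time_limit, rng) for _ in range(1000)]
earliest, latest = scrub_window(now, min_interval, time_limit)

# The jobs are spread across the window instead of all firing at once.
print(min(times) - now, max(times) - now)
```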
Fixes: #10973
Signed-off-by: Kefu Chai <kchai@redhat.com>
This tests RENAME_WHITEOUT, which was enabled for xfs in kernel commit
7dcf5c3e4527cfa2807567b00387cf2ed5e07f00. At first execution, it throws a BUG.
Subsequent executions appear to work correctly. This issue manifests for disks
and RBD instances.
Signed-off-by: Douglas Fuller <dfuller@redhat.com>
Debian's debug packages ought to depend on their respective binary
packages. This was the case for many of our ceph packages, but it was
not the case for ceph-test-dbg or rest-bench-dbg.
Add the dependencies on the relevant binary packages, pinned to
"= ${binary:Version}" per convention.
http://tracker.ceph.com/issues/11673
Fixes: #11673
Signed-off-by: Ken Dreyer <kdreyer@redhat.com>
This was broken by 96992466, aka "mds: handle missing mydir dirfrag".
The previous code was mistakenly treating a not-yet-loaded
dirfrag as a non-existent dirfrag, resulting in
inconsistent fragstats even when no objects had
actually been lost.
Fixes: #11641
Signed-off-by: John Spray <john.spray@redhat.com>
Move the check-local scripts
  src/test/run-cli-tests
  encode-decode-non-regression.sh
  test/encoding/readable.sh
to check_SCRIPTS. Their output is captured in a .log file when running
with a recent automake. This reduces the output of make check by an
order of magnitude.
Signed-off-by: Loic Dachary <ldachary@redhat.com>
Because of the | grep, the exit status of run-tox.sh is no longer the
exit status of tox, so errors are not reported as they should be.
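The underlying shell behavior is that a pipeline's exit status is that of its last command. The sketch below demonstrates this with a stand-in for tox; the failing command and the `exit_status` helper are hypothetical, and it assumes `sh` and `grep` are available on the PATH.

```python
import subprocess


def exit_status(script):
    """Run a POSIX sh snippet and return its exit status."""
    result = subprocess.run(["sh", "-c", script], stdout=subprocess.DEVNULL)
    return result.returncode


# Hypothetical stand-in for a failing tox run: prints a line, exits with 3.
FAILING = '(echo "ERROR: boom"; exit 3)'

direct = exit_status(FAILING)                       # status of the command itself
piped = exit_status(FAILING + " | grep -v noise")   # status of grep, not of the command

print(direct, piped)
```

In bash, `set -o pipefail` or `${PIPESTATUS[0]}` are common ways to recover the status of the first command; how run-tox.sh itself was changed is not shown here.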
Signed-off-by: Loic Dachary <ldachary@redhat.com>
This program contains a collection of low-level performance measurements
for Ceph, which can be run either individually or all together. These
tests measure performance in a single stand-alone process, not in a cluster
with multiple servers. Invoke the program like this:
Perf test1 test2 ...
test1 and test2 are the names of individual performance measurements to
run. If no test names are provided then all of the performance tests
are run.
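The selection rule above (named tests only, or everything when no names are given) can be sketched as follows. This is not the Perf program itself: the registry, the test bodies, and the `main` function are hypothetical placeholders, and what the real program does with an unknown test name is not covered.

```python
import time


def _time_it(fn, count=1000):
    """Average seconds per call of fn over count iterations."""
    start = time.perf_counter()
    for _ in range(count):
        fn()
    return (time.perf_counter() - start) / count


# Hypothetical registry; the names mirror the placeholders in the text.
TESTS = {
    "test1": lambda: _time_it(lambda: 7 // 3),
    "test2": lambda: _time_it(lambda: sorted([3, 1, 2])),
}


def select_tests(names):
    """Return the requested test names, or all of them if none are given."""
    return list(names) if names else list(TESTS)


def main(argv):
    """Run the selected tests and return {name: seconds per iteration}."""
    return {name: TESTS[name]() for name in select_tests(argv)}


print(main(["test2"]))  # run a single measurement
print(main([]))         # no names: run every measurement
```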
Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
Remove the helpers because they are no longer used. They have been
superseded by ceph-helpers.sh.
Signed-off-by: Loic Dachary <ldachary@redhat.com>
Use ceph-helpers.sh instead of mon/mon-test-helpers.sh.
* modify the .asok and .log names to match the ceph-helpers.sh
conventions
* use explicit ports 7300 and 7301 instead of +1 so that grep
will show that 7301 is used. This reduces the odds of a
port collision when looking for a port that's not already
used by an existing test.
Signed-off-by: Loic Dachary <ldachary@redhat.com>