Commit Graph

192 Commits

Author SHA1 Message Date
David Zafman fbc8bcfe05 test: test_get_timeout_delays() fix
Caused by: 7b0d1c8b8a

Signed-off-by: David Zafman <dzafman@redhat.com>
2018-07-03 14:01:36 -07:00
Josh Durgin 9106dc56c2
Merge pull request #22761 from fullerdj/wip-djf-24686
osd/filestore: Change default filestore_merge_threshold to -10

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2018-07-02 17:36:00 -07:00
Douglas Fuller 75f55f2dfc osd/filestore: Change default filestore_merge_threshold to -10
Performance evaluations of medium to large size Ceph clusters have
demonstrated negligible performance impact from unnecessarily deep
directory hierarchies but significant performance impact from filestore
split and merge activity. Disable merges by default.

Fixes: http://tracker.ceph.com/issues/24686
Signed-off-by: Douglas Fuller <dfuller@redhat.com>
2018-06-29 11:45:12 -04:00
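As a hedged illustration of the change above (not part of the commit itself): an operator could override the new default at runtime, and the negative value is what disables merging:

    # illustrative override; the option can also be set in ceph.conf
    ceph config set osd filestore_merge_threshold -10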
David Zafman 663d96e934
Merge pull request #22727 from dzafman/wip-21664
qa/standalone/scrub: When possible show side-by-side diff in addition to regular diff

Reviewed-by: Kefu Chai <kchai@redhat.com>
2018-06-28 19:59:21 -04:00
David Zafman 3ff56a82a4
Merge pull request #22763 from dzafman/wip-remove-sudo
qa: Don't use sudo when moving logs

Reviewed-by: Neha Ojha <nojha@redhat.com>
2018-06-28 18:37:24 -04:00
David Zafman 23ed63e15f
Merge pull request #22441 from ErwanAliasr1/evelu-makecheck
Improving make check reliability

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: David Zafman <dzafman@redhat.com>
2018-06-28 14:55:12 -04:00
David Zafman 808c628304 qa: Don't use sudo when moving logs
Caused by: f0964beac5

Signed-off-by: David Zafman <dzafman@redhat.com>
2018-06-28 09:17:06 -07:00
David Zafman ebb05b2542 test: When possible show side-by-side diff in addition to regular diff
Fixes: https://tracker.ceph.com/issues/21664

Signed-off-by: David Zafman <dzafman@redhat.com>
2018-06-26 18:23:07 -07:00
David Zafman f0964beac5 qa: For teuthology copy logs to teuthology expected location
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-06-25 18:06:01 -07:00
Erwan Velu 57df91380b qa/standalone/ceph-helpers.sh: Setup ulimit in setup()
If ulimit is set to a low value such as 1024, ceph-osd will segfault with the
following error:
    filestore(td/smoke/0)  error (24) Too many open files not handled on operation 0x55565d1fd004 (2182.1.0, or op 0, counting from 0)

This patch ensures that a valid ulimit value is set before the ceph daemons are started in tests (a sketch of the guard follows this entry).

Signed-off-by: Erwan Velu <erwan@redhat.com>
2018-06-25 22:09:14 +02:00
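A minimal sketch of the guard described above, assuming a hypothetical helper name and threshold (the actual setup() change may differ):

    # Illustrative only: raise the open-file soft limit before starting
    # daemons if it is too low for ceph-osd.
    ensure_nofile_limit() {
        local min_nofile=4096   # assumed threshold, for illustration
        local cur
        cur=$(ulimit -n)
        if [ "$cur" != "unlimited" ] && [ "$cur" -lt "$min_nofile" ]; then
            ulimit -n "$min_nofile" || return 1
        fi
    }

Calling such a guard at the top of setup() gives every test a sane limit instead of a mid-run segfault.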
Erwan Velu 7b0d1c8b8a qa/standalone/ceph-helpers.sh: Thinner resolution in get_timeout_delays()
get_timeout_delays() is a generic function that computes a series of sleep
delays, letting a test wait a long period of time without saturating the CPU
in a busy loop.

It works well when the timeout is short: requesting a 20-second timeout yields
the series "0.1 0.2 0.4 0.8 1.6 3.2 6.4 7.3". Here the maximum sleep between
two loop iterations is 7.3 seconds, which is perfectly fine.

When the timeout reaches 300 seconds, however, the same code produces the
series "0.1 0.2 0.4 0.8 1.6 3.2 6.4 12.8 25.6 51.2 102.4 95.3", in which some
delays are nearly two minutes long.

That is inefficient: the expected event could arrive just after one of those
long sleeps begins, making the loop sleep a minute or more for nothing. On a
local system that may be acceptable, but on a CI system where many jobs run
this way, it adds up to a lot of useless CPU waits.

This patch adds a maximum acceptable delay between two loop iterations while
keeping the same ramp-up behavior (see the sketch after this entry).

On the same 300-second example, with MAX_TIMEOUT set to 10, we now get the
following series: "0.1 0.2 0.4 0.8 1.6 3.2 6.4 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 7.3".
The long 12.8/25.6/51.2/102.4/95.3 values vanish, replaced by a series of
10-second sleeps. Each test can tune this according to how soon it expects its
event to complete.

MAX_TIMEOUT is set to 15 seconds.

Signed-off-by: Erwan Velu <erwan@redhat.com>
2018-06-25 22:09:14 +02:00
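A minimal sketch of the capped backoff described above, assuming a hypothetical function name (the real ceph-helpers.sh implementation differs in detail):

    # Print a series of sleep delays that doubles up to MAX_TIMEOUT and
    # sums to the requested total timeout; bc(1) handles the float math.
    get_timeout_delays_sketch() {
        local timeout=$1
        local delay=${2:-0.1}
        local max=${MAX_TIMEOUT:-15}
        local total=0 remaining
        while [ "$(bc <<< "$total < $timeout")" -eq 1 ]; do
            # cap each step, and never sleep past the requested total
            [ "$(bc <<< "$delay > $max")" -eq 1 ] && delay=$max
            remaining=$(bc <<< "$timeout - $total")
            [ "$(bc <<< "$delay > $remaining")" -eq 1 ] && delay=$remaining
            printf '%s ' "$delay"
            total=$(bc <<< "$total + $delay")
            delay=$(bc <<< "$delay * 2")
        done
    }

A caller iterates over the series: for d in $(get_timeout_delays_sketch 300); do check_done && break; sleep $d; done.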
Sage Weil 3cd7d5eb22 Merge PR #22343 into master
* refs/pull/22343/head:
	qa/standalone remove ceph-disk from activate_osd helper
	cmake: remove subman.sh tests
	test remove ceph-disk directory
	debian: remove ceph_detect_init python files from base
	qa/standalone remove virtualenv paths for ceph-disk and ceph-detect-init
	debian: remove ceph-disk ceph-detect-init python files
	rpm: remove ceph-disk ceph-detect-init python files
	alpine: remove ceph-disk ceph-detect-init python files
	alpine: remove ceph-osd and parttypeuuid udev rules
	debian: remove ceph-osd and parttypeuuid udev rules
	rpm: remove ceph-osd and parttypeuuid udev rules
	ceph-helpers.sh: remove ceph-disk, set up osds directly
	CMakeLists.txt: add back CEPH_BUILD_VIRTUALENV
	alpine: remove ceph-disk, add ceph-volume in APKBUILD.in
	upstart: remove ceph-disk activation call
	doc/install add anchor for manual osd deployment in freebsd guide
	doc/dev remove ceph-disk from freebsd guide, link to manual reference
	doc/dev/config-key remove ceph-disk references
	doc/dev remove ceph-disk.rst
	doc/dev: change ceph-disk suite examples for ceph-deploy
	doc/man_index: remove ceph-disk, ceph-detect-init refs
	doc/install: remove ceph-disk from freebsd examples
	doc/rados remove ceph-disk from man references
	doc/man remove ceph-disk ref from ceph-volume-systemd
	doc/man: update reference from ceph-disk to ceph-volume
	doc/man: remove ceph-disk, ceph-detect-init from cmake
	doc/man/ceph-volume remove doc reference to ceph-disk
	doc/man: remove ceph-disk, ceph-detect-init
	qa/suites: remove ceph-disk
	qa/run-standalone.sh: remove requirement for ceph-detect-init virtualenv
	qa/workunits: remove ceph-detect-init from rbdmapfile test
	qa/workunits: remove ceph-detect-init from ceph-helpers-root.sh
	qa/workunits: remove ceph-disk
	build: remove ceph-disk from freebsd script
	cmake: remove ceph-disk, ceph-detect-init tox tests
	init-ceph: remove ceph-disk
	cmake: remove top-level entries for ceph-disk, ceph-detect-init
	debian: remove ceph-detect-init references
	debian: remove ceph-disk references
	src: remove ceph-detect-init tool
	rpm: remove ceph-disk, ceph-detect-init from spec file
	test: remove subman script
	script: remove subman script
	udev: remove parttypeuuid rules for ceph-disk
	tool remove ceph-disk from ps-ceph.pl
	upstart: remove ceph-disk conf file
	systemd: remove ceph-disk from CMakeLists
	systemd: remove ceph-disk service
	udev: remove ceph-disk rules
	src: remove ceph-disk tool
2018-06-19 07:07:55 -05:00
David Zafman fe09fc5e9d test: Fail immediately if some operations fail
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-06-18 14:09:14 -07:00
David Zafman 33538aca35 test: Fix standalone main usage
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-06-18 14:09:14 -07:00
David Zafman f886ebba08 test: Fix some function descriptions
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-06-18 14:09:14 -07:00
David Zafman 39fc43556f test: Put files in private test directory
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-06-18 14:08:23 -07:00
Erwan Velu 2ce480b8fd qa/standalone/ceph-helpers.sh: Fixing comment for wait_for_health()
wait_for_health() doesn't check whether the cluster is making progress, so
let's adjust the comment accordingly.

Signed-off-by: Erwan Velu <erwan@redhat.com>
2018-06-14 11:06:52 +02:00
Erwan Velu e6e10246c6 tests: Protecting rados bench against endless loop
If the cluster dies during a rados bench run, the maximum running time is no
longer honored and all emitted aios remain pending.

rados bench never quits, and the global testing timeout (3600 sec = 1 hour)
has to be reached before a failure is reported.

This situation is dramatic for a background test or a CI run, as it locks the
whole job for too long waiting for an event that will never occur.

The ideal solution would be for 'rados bench' itself to report a failure once
the timeout is reached while aios are still pending.

A possible workaround is to prefix the rados bench call with the system
command 'timeout' and fail if rados did not complete in time.

To avoid side effects, this patch doubles the rados timeout: if rados has not
completed after twice the expected time, the run is killed and fails rather
than locking up the whole testing job (see the sketch after this entry).

The trace below shows how this works on a real test case. No IO occurs after
t>2, yet despite timeout=4 the bench keeps running; with this patch, the
bench is stopped at t=8 and returns 1.

5: /home/erwan/ceph/src/test/smoke.sh:55: TEST_multimon:  timeout 8 rados -p foo bench 4 write -b 4096 --no-cleanup
5: hints = 1
5: Maintaining 16 concurrent writes of 4096 bytes to objects of size 4096 for up to 4 seconds or 0 objects
5: Object prefix: benchmark_data_mr-meeseeks_184960
5:   sec Cur ops   started  finished  avg MB/s  cur MB/s last lat(s)  avg lat(s)
5:     0       0         0         0         0         0           -           0
5:     1      16      1144      1128   4.40538   4.40625  0.00412965   0.0141116
5:     2      16      2147      2131   4.16134   3.91797  0.00985654   0.0109079
5:     3      16      2147      2131   2.77424         0           -   0.0109079
5:     4      16      2147      2131    2.0807         0           -   0.0109079
5:     5      16      2147      2131   1.66456         0           -   0.0109079
5:     6      16      2147      2131   1.38714         0           -   0.0109079
5:     7      16      2147      2131   1.18897         0           -   0.0109079
5: /home/erwan/ceph/src/test/smoke.sh:55: TEST_multimon:  return 1
5: /home/erwan/ceph/src/test/smoke.sh:18: run:  return 1

Signed-off-by: Erwan Velu <erwan@redhat.com>
2018-06-14 11:06:52 +02:00
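A sketch of the wrapper pattern described above, reusing the invocation from the trace (pool name and durations come from that example, not fixed requirements):

    BENCH_SECS=4
    # give rados bench twice its expected runtime; a hard kill by
    # timeout(1) or a rados error both count as failure
    if ! timeout $((BENCH_SECS * 2)) \
            rados -p foo bench "$BENCH_SECS" write -b 4096 --no-cleanup; then
        echo "rados bench did not complete in time" >&2
        return 1
    fi

timeout(1) exits with status 124 when it kills the command, so the same branch catches both a hang and an ordinary rados failure.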
Erwan Velu 62d2646c30 qa/standalone/ceph-helpers.sh: Defining custom timeout for wait_for_clean()
wait_for_clean() uses the default timeout, i.e. 300 sec (5 min).

It tries to reach a clean status within that timeout, resetting its counter
whenever progress is made between loops.

When the cluster is sane, recovery should complete in well under 5 minutes,
but if the cluster has died, waiting the full 5 minutes for nothing is
inefficient.

This patch defines a custom timeout so that wait_for_clean() waits no more
than about 90 seconds (1m30). If no progress is made in that period, there is
very little chance a valid state will ever be reached anyhow (a hypothetical
sketch of the pattern follows this entry).

Signed-off-by: Erwan Velu <erwan@redhat.com>
2018-06-14 11:06:52 +02:00
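A hypothetical sketch of the progress-aware wait with a shorter cap; the function name, polling interval, and clean-state check are all illustrative, not the actual helper:

    wait_for_clean_sketch() {
        local timeout=${1:-90}   # cap at ~90s instead of the default 300s
        local elapsed=0 last_state="" state
        while [ "$elapsed" -lt "$timeout" ]; do
            state=$(ceph pg stat 2>/dev/null) || return 1
            # crude completion check: the PG summary reports active+clean
            echo "$state" | grep -q 'pgs: .*active+clean' && return 0
            # any change in reported state counts as progress: reset the clock
            if [ "$state" != "$last_state" ]; then
                last_state=$state
                elapsed=0
            fi
            sleep 5
            elapsed=$((elapsed + 5))
        done
        return 1
    }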
Alfredo Deza 5b3a540045 qa/standalone remove ceph-disk from activate_osd helper
Signed-off-by: Alfredo Deza <adeza@redhat.com>
2018-06-13 15:16:27 -04:00
Alfredo Deza aa4f5569c3 qa/standalone remove virtualenv paths for ceph-disk and ceph-detect-init
Signed-off-by: Alfredo Deza <adeza@redhat.com>
2018-06-13 15:16:27 -04:00
Dan Mick 50f2b72f2f ceph-helpers.sh: remove ceph-disk, set up osds directly
Signed-off-by: Dan Mick <dan.mick@redhat.com>
2018-06-13 15:16:26 -04:00
David Zafman c1e96ae7cb test: Use a file that should be on all OSes
Also, create temporary files in the test-specific directory and remove them afterwards

Caused by: 154330fd68

Signed-off-by: David Zafman <dzafman@redhat.com>
2018-06-05 11:27:12 -07:00
Sage Weil 154330fd68 osd/PrimaryLogPG: fix on_local_recover crash on stray clone
If there is a stray clone (one that does not appear in the SnapSet) and
we do any sort of recovery on it, the OSD will crash.  Log an error instead
but continue.

This addresses a problem where a cluster has both (1) an unexpected clone
and (2) the clone is not present on all replicas.  Doing repair on that
PG will both not fix the unexpected clone and also cause the remaining
OSDs to crash trying to recover it.

Include a test.

Fixes: https://tracker.ceph.com/issues/24396
Signed-off-by: Sage Weil <sage@redhat.com>
2018-06-05 11:09:01 -05:00
Kefu Chai 333068b208
Merge pull request #22346 from dzafman/wip-scrub-omap
osd: Handle omap and data digests independently

Reviewed-by: Kefu Chai <kchai@redhat.com>
2018-06-04 19:53:18 +08:00
Kefu Chai 0829e83fde
Merge pull request #22196 from thinkercui/bugfix
osd: read object attrs failed at EC recovery

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2018-06-03 01:52:24 +08:00
cuixf 3eb1679b1f osd: retry to read object attrs at EC recovery
During an EC recovery read, if reading the object's attrs fails or returns
errors, we erase the attrs we have read and retry the read from the remaining
shards. This lets the primary OSD obtain the object's attrs correctly and
avoids an assert.

Signed-off-by: xiaofei cui <cuixiaofei@sangfor.com>
2018-06-01 06:26:56 -04:00
David Zafman 843598b69b Revert "qa/standalone/scrub/osd-scrub-repair.sh: drop omap_digest flag"
This reverts commit 886606bfd7.

Signed-off-by: David Zafman <dzafman@redhat.com>

Conflicts:
	qa/standalone/scrub/osd-scrub-repair.sh (manually made equivalent changes)
2018-05-31 12:01:53 -07:00
Sage Weil c3164df959 qa/standalone/mon/misc: fix features test
Signed-off-by: Sage Weil <sage@redhat.com>
2018-05-25 17:02:49 -05:00
David Zafman 1a7fa9a62a test: Add test cases for multiple copy pool and snapshot errors
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-04-28 16:42:19 -07:00
David Zafman 2fa596dc0c test: Prepare for second test and minor improvements
Check list-inconsistent-obj output
Check how many _scan_snap groupings
Use more general check for crashed osd(s)

Signed-off-by: David Zafman <dzafman@redhat.com>
2018-04-28 16:42:19 -07:00
David Zafman bae4940574 test: Fix comment at end of scrub test scripts
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-04-28 16:42:19 -07:00
Sage Weil 27e91a99f5
Merge pull request #21273 from jdurgin/wip-23195
osd/ECBackend: only check required shards when finishing recovery reads

Reviewed-by: David Zafman <dzafman@redhat.com>
Reviewed-by: Greg Farnum <gfarnum@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2018-04-24 17:20:25 -05:00
Josh Durgin d4808256d2 osd/ECBackend: preserve requests for other objects when sending extra reads
When multiple objects are in flight for the same ReadOp, swap() on the
map<hobject_t, read_request_t> would remove requests for all objects.

We just want to replace the requests for the single object we're
dealing with in send_all_remaining_reads().

This prevents crashing trying to look up rop.to_read[hoid] when another
object in the same ReadOp gets an EIO and tries to send more requests.

Test this by using osd-recovery-max-single-start to bundle multiple
reads into one ReadOp. Save and restore CEPH_ARGS so custom settings
are reset for each test.

Fixes: http://tracker.ceph.com/issues/23195 (the 2nd crash there)
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2018-04-20 19:42:15 -04:00
Josh Durgin b162a5478d osd/ECBackend: recover from EIO based on the minimum data necessary
Discount shards that already returned EIO, and use minimum_to_decode()
to request just what is necessary to recover or read the originally
requested extents of the object.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2018-04-20 19:42:14 -04:00
Josh Durgin 468ad4b410 osd/ECBackend: only check required shards when finishing recovery reads
1235810c2a allowed recovery to use
multiple passes of reads to handle EIO, but the end condition for
checking whether we finished reading requires the full data to be
decodable (this is what get_want_to_read_shards returns).

This is just a loss of efficiency normally, since when there is only
one object the subsequent read works, and grabs all the data
necessary. The crash comes from having multiple objects in the same
ReadOp - in this case the sequence of events is:

- start recovery of two objects (osd_recovery_max_single_start > 1)
- read object a shard 3
- read object b shard 3
- fail minimum_to_decode because shard 3 can't reconstruct all of object a
- re-read all of object a, marking more reads in progress
- fail minimum_to_decode because shard 3 can't reconstruct all of object b
- skip re-reading object because there are now reads in progress
- finish reading k shards of object a
- still fail minimum_to_decode for object b, so no extra data was read
- send_all_remaining_reads tries to lookup object b in ReadOp object
- crash dereferencing to_read[object b], since this was cleared after handling the original object b read reply

This patch fixes the immediate inefficiency and crash by only checking
for the missing shards that were requested, rather than the entire
object, for recovery reads.

Fixes: http://tracker.ceph.com/issues/23195 (first crash)
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2018-04-20 19:42:14 -04:00
Nathan Cutler f03b9028f5 qa/standalone/ceph-helpers.sh: provide argument to dirname
Fixes: http://tracker.ceph.com/issues/23805
Signed-off-by: Nathan Cutler <ncutler@suse.com>
2018-04-20 10:10:15 +02:00
David Zafman 458babe7ee test: Use jq in a compatible way and for easier diff analysis
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-04-16 08:11:24 -07:00
David Zafman c6207d21a8
Merge pull request #21362 from dzafman/wip-hex-digest
osd: Change shard digests to hex like object info digests

Reviewed-by: Kefu Chai <kchai@redhat.com>
2018-04-12 16:07:36 -07:00
David Zafman 22ddc6da5f osd: Change shard digests to hex like object info digests
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-04-12 07:59:21 -07:00
Kefu Chai 4cc3dab070
Merge pull request #21318 from badone/wip-qa-mon-misc-add-osdmap-prune-tests
qa/standalone/mon/misc.sh: Add osdmap-prune tests

Reviewed-by: Joao Eduardo Luis <joao@suse.de>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2018-04-11 23:08:33 +08:00
David Zafman 9c5ef19f93 test: Be smarter about when jsonschema can be used
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-04-10 13:52:10 -07:00
David Zafman 60ae2b8eb3 osd rados command: Show snapset in list-inconsistent-snapset
Add SnapSet bufferlist to inconsistent_snapset_t

Partial fix for http://tracker.ceph.com/issues/23428

Signed-off-by: David Zafman <dzafman@redhat.com>
2018-04-10 13:51:48 -07:00
David Zafman 1b1d45bf51 test: Add getjson variable to save output
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-04-10 13:26:08 -07:00
David Zafman 007cb45fe5 osd rados command: Change error name snapset_mismatch to snapset_error
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-04-10 13:26:08 -07:00
David Zafman 0c7ac9db3b test: Clean-up test and use local values for number of objects and osds
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-04-10 13:26:08 -07:00
David Zafman 982509514c osd rados command: list-inconsistent-obj attribute improvements
System attributes shown as "object_info", "snapset" and "hashinfo"
Only output user attributes as "attrs"
	Drop leading underscore "_" for user attribute keys
Improve logic as to when to show user attributes or specific system attributes

Signed-off-by: David Zafman <dzafman@redhat.com>
2018-04-10 13:26:08 -07:00
David Zafman 01687b052f osd rados command: Change "oi" to "info" in scrub handling errors
data_digest_mismatch_oi -> data_digest_mismatch_info
omap_digest_mismatch_oi -> omap_digest_mismatch_info
size_mismatch_oi -> size_mismatch_info
obj_size_oi_mismatch -> obj_size_info_mismatch

Signed-off-by: David Zafman <dzafman@redhat.com>
2018-04-10 13:26:08 -07:00
David Zafman 273f6213ea osd rados command: Change "oi_attr" to "info" in scrub handling errors
oi_attr_missing -> info_missing
oi_attr_corrupted -> info_corrupted

Signed-off-by: David Zafman <dzafman@redhat.com>
2018-04-10 13:26:08 -07:00
David Zafman bec67e3d40 osd rados command: Rename ss_attr_missing/ss_attr_corrupted to snapset_missing/snapset_corrupted
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-04-10 13:26:08 -07:00