Merge pull request #14496 from jdurgin/wip-ec-overwrites-cleanup

osd: make ec overwrites ready to use

Reviewed-by: Sage Weil <sage@redhat.com>
Josh Durgin 2017-04-19 17:56:28 -07:00 committed by GitHub
commit 535b2cff57
42 changed files with 489 additions and 164 deletions

View File

@ -78,7 +78,6 @@
still works. See the documentation page "Mount CephFS in your
file systems table" for details.
12.0.1
------
@ -106,3 +105,14 @@
* The RGW api for getting object torrent has changed its params from 'get_torrent'
to 'torrent' so that it can be compatible with Amazon S3. Now the request for
object torrent is like 'GET /ObjectName?torrent'.
* The configuration option "osd pool erasure code stripe width" has
been replaced by "osd pool erasure code stripe unit", and given the
ability to be overridden by the erasure code profile setting
"stripe_unit". For more details see "Erasure Code Profiles" in the
documentation.
* rbd and cephfs can use erasure coding with bluestore. This may be
enabled by setting 'allow_ec_overwrites' to 'true' for a pool. Since
this relies on bluestore's checksumming to do deep scrubbing,
enabling this on a pool stored on filestore is not allowed.
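A minimal shell sketch of the workflow this note enables (profile, pool, and
image names here are hypothetical)::

    ceph osd erasure-code-profile set ec21profile k=2 m=1
    ceph osd pool create ec_data 64 64 erasure ec21profile
    ceph osd pool set ec_data allow_ec_overwrites true
    # rbd keeps its metadata in a replicated pool; only the data goes to the EC pool
    rbd create --size 1G --data-pool ec_data rbd/myimage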

View File

@ -85,14 +85,18 @@ Ceph configuration file.
make pool creation work in the absence of ruleset 0.
``osd pool erasure code stripe width``
``osd pool erasure code stripe unit``
:Description: Sets the desired size, in bytes, of an object stripe on every
erasure coded pools. Every object if size S will be stored as
N stripes and each stripe will be encoded/decoded individually.
:Description: Sets the default size, in bytes, of a chunk of an object
stripe for erasure coded pools. Every object of size S
will be stored as N stripes, with each data chunk
receiving ``stripe unit`` bytes. Each stripe of ``N *
stripe unit`` bytes will be encoded/decoded
individually. This option is overridden by the
``stripe_unit`` setting in an erasure code profile.
:Type: Unsigned 32-bit Integer
:Default: ``4096``
:Default: ``4096``
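To confirm the default in effect on a node, the same ``ceph-conf`` query used
by the tests later in this diff works from the shell::

    ceph-conf --show-config-value osd_pool_erasure_code_stripe_unit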
``osd pool default size``

View File

@ -39,6 +39,7 @@ To create a new erasure code profile::
ceph osd erasure-code-profile set {name} \
[{directory=directory}] \
[{plugin=plugin}] \
[{stripe_unit=stripe_unit}] \
[{key=value} ...] \
[--force]
@ -60,23 +61,40 @@ Where:
plugins`_ for more information.
:Type: String
:Required: No.
:Required: No.
:Default: jerasure
``{stripe_unit=stripe_unit}``
:Description: The amount of data in a data chunk, per stripe. For
example, a profile with 2 data chunks and stripe_unit=4K
would put the range 0-4K in chunk 0, 4K-8K in chunk 1,
then 8K-12K in chunk 0 again. This should be a multiple
of 4K for best performance. The default value is taken
from the monitor config option
``osd_pool_erasure_code_stripe_unit`` when a pool is
created. The stripe_width of a pool using this profile
will be the number of data chunks multiplied by this
stripe_unit.
:Type: String
:Required: No.
``{key=value}``
:Description: The semantic of the remaining key/value pairs is defined
by the erasure code plugin.
:Type: String
:Required: No.
:Required: No.
``--force``
:Description: Override an existing profile by the same name.
:Description: Override an existing profile by the same name, and allow
setting a non-4K-aligned stripe_unit.
:Type: String
:Required: No.
:Required: No.
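A hypothetical profile illustrating the ``stripe_unit`` option above, and how
it determines the stripe_width of pools created with it::

    ceph osd erasure-code-profile set wideprofile k=2 m=1 stripe_unit=16K
    ceph osd pool create widepool 12 12 erasure wideprofile
    ceph osd dump | grep stripe_width    # 2 data chunks * 16K = stripe_width 32768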
osd erasure-code-profile rm
============================

View File

@ -113,12 +113,41 @@ no two *chunks* are stored in the same rack.
More information can be found in the `erasure code profiles
<../erasure-code-profile>`_ documentation.
Erasure Coding with Overwrites
------------------------------
By default, erasure coded pools only support workloads like RGW that
perform full object writes and appends.
Since Luminous, partial writes for an erasure coded pool may be
enabled with a per-pool setting. This lets RBD and CephFS store their
data in an erasure coded pool::
ceph osd pool set ec_pool allow_ec_overwrites true
This can only be enabled on a pool residing on bluestore OSDs, since
bluestore's checksumming is used during deep scrubs to detect bitrot
or other corruption. Using filestore with ec overwrites is not only
unsafe, it also yields lower performance than bluestore.
Erasure coded pools do not support omap, so to use them with RBD and
CephFS you must instruct them to store their data in an ec pool, and
their metadata in a replicated pool. For RBD, this means using the
erasure coded pool as the ``--data-pool`` during image creation::
rbd create --size 1G --data-pool ec_pool replicated_pool/image_name
For CephFS, using an erasure coded pool means setting that pool in
a `file layout <../../cephfs/file-layouts>`_.
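A hypothetical CephFS sequence matching the paragraph above, assuming the
filesystem is named ``cephfs``, is mounted at ``/mnt/cephfs``, and ``ec_pool``
already has overwrites enabled::

    ceph fs add_data_pool cephfs ec_pool
    mkdir /mnt/cephfs/ecdir
    setfattr -n ceph.dir.layout.pool -v ec_pool /mnt/cephfs/ecdir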
Erasure coded pool and cache tiering
------------------------------------
Erasure coded pools require more resources than replicated pools and
lack some functionalities such as partial writes. To overcome these
limitations, it is recommended to set a `cache tier <../cache-tiering>`_
lack some functionality such as omap. To overcome these
limitations, one can set up a `cache tier <../cache-tiering>`_
before the erasure coded pool.
For instance, if the pool *hot-storage* is made of fast storage::
@ -131,13 +160,6 @@ will place the *hot-storage* pool as tier of *ecpool* in *writeback*
mode so that every write and read to the *ecpool* are actually using
the *hot-storage* and benefit from its flexibility and speed.
It is not possible to create an RBD image on an erasure coded pool
because it requires partial writes. It is however possible to create
an RBD image on an erasure coded pools when a replicated pool tier set
a cache tier::
$ rbd create --size 10G ecpool/myvolume
More information can be found in the `cache tiering
<../cache-tiering>`_ documentation.

View File

@ -301,6 +301,16 @@ You may set values for the following keys:
:Description: The ruleset to use for mapping object placement in the cluster.
:Type: Integer
.. _allow_ec_overwrites:
``allow_ec_overwrites``
:Description: Whether writes to an erasure coded pool can update part
of an object, so cephfs and rbd can use it. See
`Erasure Coding with Overwrites`_ for more details.
:Type: Boolean
:Version: ``12.2.0`` and above
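A sketch of both outcomes of this flag (pool name hypothetical; the error text
comes from the monitor check added later in this diff)::

    ceph osd pool set ecpool allow_ec_overwrites true
    # succeeds when the pool's PGs map to bluestore OSDs; on filestore it fails:
    #   Error EINVAL: pool must only be stored on bluestore for scrubbing to work: osd.N uses filestore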
.. _hashpspool:
``hashpspool``
@ -731,3 +741,4 @@ a size of 3).
.. _Pool, PG and CRUSH Config Reference: ../../configuration/pool-pg-config-ref
.. _Bloom Filter: http://en.wikipedia.org/wiki/Bloom_filter
.. _setting the number of placement groups: ../placement-groups#set-the-number-of-placement-groups
.. _Erasure Coding with Overwrites: ../erasure-code#erasure-coding-with-overwrites

View File

@ -1,9 +1,21 @@
overrides:
thrashosds:
bdev_inject_crash: 2
bdev_inject_crash_probability: .5
ceph:
fs: xfs
conf:
global:
enable experimental unrecoverable data corrupting features: debug_white_box_testing_ec_overwrites
osd: # force bluestore since it's required for ec overwrites
osd objectstore: bluestore
bluestore block size: 96636764160
debug bluestore: 30
debug bdev: 20
debug bluefs: 20
debug rocksdb: 10
enable experimental unrecoverable data corrupting features: "*"
osd debug randomize hobject sort order: false
# this doesn't work with failures bc the log writes are not atomic across the two backends
# bluestore bluefs env mirror: true
tasks:
- workunit:
clients:

View File

@ -0,0 +1 @@
../thrash-erasure-code/objectstore/bluestore.yaml

View File

@ -0,0 +1 @@
../thrash-erasure-code/clusters

View File

@ -0,0 +1 @@
../thrash-erasure-code/fast

View File

@ -0,0 +1 @@
../thrash-erasure-code/leveldb.yaml

View File

@ -0,0 +1 @@
../thrash-erasure-code/msgr-failures

View File

@ -0,0 +1 @@
../thrash-erasure-code/rados.yaml

View File

@ -0,0 +1 @@
../thrash-erasure-code/thrashers

View File

@ -12,7 +12,7 @@ tasks:
objects: 50
pool_snaps: true
ec_pool: true
erasure_code_use_hacky_overwrites: true
erasure_code_use_overwrites: true
op_weights:
read: 100
write: 100

View File

@ -14,7 +14,7 @@ tasks:
objects: 1024
size: 16384
ec_pool: true
erasure_code_use_hacky_overwrites: true
erasure_code_use_overwrites: true
fast_read: true
op_weights:
read: 100

View File

@ -14,7 +14,7 @@ tasks:
objects: 1024
size: 16384
ec_pool: true
erasure_code_use_hacky_overwrites: true
erasure_code_use_overwrites: true
op_weights:
read: 100
write: 100

View File

@ -11,7 +11,7 @@ tasks:
ops: 4000
objects: 50
ec_pool: true
erasure_code_use_hacky_overwrites: true
erasure_code_use_overwrites: true
op_weights:
read: 100
write: 100

View File

@ -3,12 +3,25 @@ tasks:
client.0:
- sudo ceph osd erasure-code-profile set teuthologyprofile ruleset-failure-domain=osd m=1 k=2
- sudo ceph osd pool create datapool 4 4 erasure teuthologyprofile
- sudo ceph osd pool set datapool debug_white_box_testing_ec_overwrites true
- sudo ceph osd pool set datapool allow_ec_overwrites true
overrides:
thrashosds:
bdev_inject_crash: 2
bdev_inject_crash_probability: .5
ceph:
fs: xfs
conf:
global:
enable experimental unrecoverable data corrupting features: debug_white_box_testing_ec_overwrites
client:
rbd default data pool: datapool
osd: # force bluestore since it's required for ec overwrites
osd objectstore: bluestore
bluestore block size: 96636764160
debug bluestore: 30
debug bdev: 20
debug bluefs: 20
debug rocksdb: 10
enable experimental unrecoverable data corrupting features: "*"
osd debug randomize hobject sort order: false
# this doesn't work with failures bc the log writes are not atomic across the two backends
# bluestore bluefs env mirror: true

View File

@ -3,12 +3,25 @@ tasks:
client.0:
- sudo ceph osd erasure-code-profile set teuthologyprofile ruleset-failure-domain=osd m=1 k=2
- sudo ceph osd pool create datapool 4 4 erasure teuthologyprofile
- sudo ceph osd pool set datapool debug_white_box_testing_ec_overwrites true
- sudo ceph osd pool set datapool allow_ec_overwrites true
overrides:
thrashosds:
bdev_inject_crash: 2
bdev_inject_crash_probability: .5
ceph:
fs: xfs
conf:
global:
enable experimental unrecoverable data corrupting features: debug_white_box_testing_ec_overwrites
client:
rbd default data pool: datapool
osd: # force bluestore since it's required for ec overwrites
osd objectstore: bluestore
bluestore block size: 96636764160
debug bluestore: 30
debug bdev: 20
debug bluefs: 20
debug rocksdb: 10
enable experimental unrecoverable data corrupting features: "*"
osd debug randomize hobject sort order: false
# this doesn't work with failures bc the log writes are not atomic across the two backends
# bluestore bluefs env mirror: true

View File

@ -3,12 +3,25 @@ tasks:
client.0:
- sudo ceph osd erasure-code-profile set teuthologyprofile ruleset-failure-domain=osd m=1 k=2
- sudo ceph osd pool create datapool 4 4 erasure teuthologyprofile
- sudo ceph osd pool set datapool debug_white_box_testing_ec_overwrites true
- sudo ceph osd pool set datapool allow_ec_overwrites true
overrides:
thrashosds:
bdev_inject_crash: 2
bdev_inject_crash_probability: .5
ceph:
fs: xfs
conf:
global:
enable experimental unrecoverable data corrupting features: debug_white_box_testing_ec_overwrites
client:
rbd default data pool: datapool
osd: # force bluestore since it's required for ec overwrites
osd objectstore: bluestore
bluestore block size: 96636764160
debug bluestore: 30
debug bdev: 20
debug bluefs: 20
debug rocksdb: 10
enable experimental unrecoverable data corrupting features: "*"
osd debug randomize hobject sort order: false
# this doesn't work with failures bc the log writes are not atomic across the two backends
# bluestore bluefs env mirror: true

View File

@ -1479,7 +1479,7 @@ class CephManager:
def create_pool_with_unique_name(self, pg_num=16,
erasure_code_profile_name=None,
min_size=None,
erasure_code_use_hacky_overwrites=False):
erasure_code_use_overwrites=False):
"""
Create a pool named unique_pool_X where X is unique.
"""
@ -1492,7 +1492,7 @@ class CephManager:
pg_num,
erasure_code_profile_name=erasure_code_profile_name,
min_size=min_size,
erasure_code_use_hacky_overwrites=erasure_code_use_hacky_overwrites)
erasure_code_use_overwrites=erasure_code_use_overwrites)
return name
@contextlib.contextmanager
@ -1504,15 +1504,14 @@ class CephManager:
def create_pool(self, pool_name, pg_num=16,
erasure_code_profile_name=None,
min_size=None,
erasure_code_use_hacky_overwrites=False):
erasure_code_use_overwrites=False):
"""
Create a pool named from the pool_name parameter.
:param pool_name: name of the pool being created.
:param pg_num: initial number of pgs.
:param erasure_code_profile_name: if set and !None create an
erasure coded pool using the profile
:param erasure_code_use_hacky_overwrites: if true, use the hacky
overwrites mode
:param erasure_code_use_overwrites: if true, allow overwrites
"""
with self.lock:
assert isinstance(pool_name, basestring)
@ -1531,10 +1530,10 @@ class CephManager:
'osd', 'pool', 'set', pool_name,
'min_size',
str(min_size))
if erasure_code_use_hacky_overwrites:
if erasure_code_use_overwrites:
self.raw_cluster_cmd(
'osd', 'pool', 'set', pool_name,
'debug_white_box_testing_ec_overwrites',
'allow_ec_overwrites',
'true')
self.pools[pool_name] = pg_num
time.sleep(1)

View File

@ -55,9 +55,7 @@ def task(ctx, config):
rollback: 2
snap_remove: 0
ec_pool: create an ec pool, defaults to False
erasure_code_use_hacky_overwrites: use the whitebox
testing experimental
overwrites mode
erasure_code_use_overwrites: test overwrites, default false
erasure_code_profile:
name: teuthologyprofile
k: 2
@ -139,9 +137,7 @@ def task(ctx, config):
'ceph_test_rados']
if config.get('ec_pool', False):
args.extend(['--no-omap'])
if config.get('erasure_code_use_hacky_overwrites', False):
args.extend(['--no-sparse'])
else:
if not config.get('erasure_code_use_overwrites', False):
args.extend(['--ec-pool'])
if config.get('write_fadvise_dontneed', False):
args.extend(['--write-fadvise-dontneed'])
@ -230,8 +226,8 @@ def task(ctx, config):
else:
pool = manager.create_pool_with_unique_name(
erasure_code_profile_name=profile_name,
erasure_code_use_hacky_overwrites=
config.get('erasure_code_use_hacky_overwrites', False)
erasure_code_use_overwrites=
config.get('erasure_code_use_overwrites', False)
)
created_pools.append(pool)
if config.get('fast_read', False):

View File

@ -520,6 +520,27 @@ function run_osd() {
activate_osd $dir $id "$@"
}
function run_osd_bluestore() {
local dir=$1
shift
local id=$1
shift
local osd_data=$dir/$id
local ceph_disk_args
ceph_disk_args+=" --statedir=$dir"
ceph_disk_args+=" --sysconfdir=$dir"
ceph_disk_args+=" --prepend-to-path="
mkdir -p $osd_data
ceph-disk $ceph_disk_args \
prepare --bluestore $osd_data || return 1
local ceph_osd_args
ceph_osd_args+=" --enable-experimental-unrecoverable-data-corrupting-features=bluestore"
activate_osd $dir $id $ceph_osd_args "$@"
}
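A minimal usage sketch for this helper, mirroring the call pattern the
reworked scrub tests later in this diff use (``$dir`` is the test directory)::

    setup $dir || return 1
    run_mon $dir a || return 1
    run_mgr $dir x || return 1
    run_osd_bluestore $dir 0 || return 1
    wait_for_clean || return 1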
function test_run_osd() {
local dir=$1
@ -635,6 +656,7 @@ function activate_osd() {
ceph_disk_args+=" --prepend-to-path="
local ceph_args="$CEPH_ARGS"
ceph_args+=" --enable-experimental-unrecoverable-data-corrupting-features=bluestore"
ceph_args+=" --osd-failsafe-full-ratio=.99"
ceph_args+=" --osd-journal-size=100"
ceph_args+=" --osd-scrub-load-threshold=2000"
@ -993,10 +1015,18 @@ function objectstore_tool() {
shift
local osd_data=$dir/$id
local osd_type=$(cat $osd_data/type)
kill_daemons $dir TERM osd.$id >&2 < /dev/null || return 1
local journal_args
if [ "$objectstore_type" == "filestore" ]; then
journal_args=" --journal-path $osd_data/journal"
fi
ceph-objectstore-tool \
--enable-experimental-unrecoverable-data-corrupting-features=bluestore \
--data-path $osd_data \
--journal-path $osd_data/journal \
$journal_args \
"$@" || return 1
activate_osd $dir $id $ceph_osd_args >&2 || return 1
wait_for_clean >&2

View File

@ -989,7 +989,7 @@ function test_mon_mds()
# We should be permitted to use an EC pool with overwrites enabled
# as the data pool...
ceph osd pool set mds-ec-pool debug_white_box_testing_ec_overwrites true --yes-i-really-mean-it
ceph osd pool set mds-ec-pool allow_ec_overwrites true
ceph fs new $FS_NAME fs_metadata mds-ec-pool --force 2>$TMPFILE
fail_all_mds $FS_NAME
ceph fs rm $FS_NAME --yes-i-really-mean-it
@ -1320,7 +1320,19 @@ function test_mon_osd_pool()
# should fail because the type is not the same
expect_false ceph osd pool create replicated 12 12 erasure
ceph osd lspools | grep replicated
ceph osd pool create ec_test 1 1 erasure
set +e
ceph osd metadata | grep osd_objectstore | grep -qc bluestore
if [ $? -eq 1 ]; then # no bluestore OSDs: enabling ec overwrites must fail
ceph osd pool set ec_test allow_ec_overwrites true >& $TMPFILE
check_response $? 22 "pool must only be stored on bluestore for scrubbing to work"
else
ceph osd pool set ec_test allow_ec_overwrites true || return 1
expect_false ceph osd pool set ec_test allow_ec_overwrites false
fi
set -e
ceph osd pool delete replicated replicated --yes-i-really-really-mean-it
ceph osd pool delete ec_test ec_test --yes-i-really-really-mean-it
}
function test_mon_osd_pool_quota()

View File

@ -87,7 +87,7 @@ run_expect_nosignal "$RADOS_TOOL" --object_locator "asdf" ls
run_expect_nosignal "$RADOS_TOOL" --namespace "asdf" ls
run_expect_succ "$RADOS_TOOL" mkpool "$POOL"
run_expect_succ "$CEPH_TOOL" osd erasure-code-profile set myprofile k=2 m=1 ruleset-failure-domain=osd
run_expect_succ "$CEPH_TOOL" osd erasure-code-profile set myprofile k=2 m=1 stripe_unit=2K ruleset-failure-domain=osd --force
run_expect_succ "$CEPH_TOOL" osd pool create "$POOL_EC" 100 100 erasure myprofile

View File

@ -84,7 +84,7 @@ function count_data_objects() {
ceph osd pool create repdata 24 24
ceph osd erasure-code-profile set teuthologyprofile ruleset-failure-domain=osd m=1 k=2
ceph osd pool create ecdata 24 24 erasure teuthologyprofile
ceph osd pool set ecdata debug_white_box_testing_ec_overwrites true
ceph osd pool set ecdata allow_ec_overwrites true
ceph osd pool create rbdnonzero 24 24
ceph osd pool create clonesonly 24 24

View File

@ -38,8 +38,6 @@ enum {
#define OSD_REP_SPLAY 1
#define OSD_REP_CHAIN 2
#define OSD_POOL_ERASURE_CODE_STRIPE_WIDTH 4096
class CephContext;
extern const char *CEPH_CONF_FILE_DEFAULT;

View File

@ -675,7 +675,7 @@ OPTION(osd_crush_update_on_start, OPT_BOOL, true)
OPTION(osd_crush_initial_weight, OPT_DOUBLE, -1) // if >=0, the initial weight is for newly added osds.
OPTION(osd_pool_default_crush_rule, OPT_INT, -1) // deprecated for osd_pool_default_crush_replicated_ruleset
OPTION(osd_pool_default_crush_replicated_ruleset, OPT_INT, CEPH_DEFAULT_CRUSH_REPLICATED_RULESET)
OPTION(osd_pool_erasure_code_stripe_width, OPT_U32, OSD_POOL_ERASURE_CODE_STRIPE_WIDTH) // in bytes
OPTION(osd_pool_erasure_code_stripe_unit, OPT_U32, 4096) // in bytes
OPTION(osd_pool_default_size, OPT_INT, 3)
OPTION(osd_pool_default_min_size, OPT_INT, 0) // 0 means no specific default; ceph will use size-size/2
OPTION(osd_pool_default_pg_num, OPT_INT, 8) // number of PGs for new pools. Configure in global or mon section of ceph.conf

View File

@ -849,7 +849,7 @@ int FileSystemCommandHandler::_check_pool(
<< " is an erasure-coded pool. Use of erasure-coded pools"
<< " for CephFS metadata is not permitted";
return -EINVAL;
} else if (pool->is_erasure() && !pool->is_hacky_ecoverwrites()) {
} else if (pool->is_erasure() && !pool->allows_ecoverwrites()) {
// non-overwriteable EC pools are only acceptable with a cache tier overlay
if (!pool->has_tiers() || !pool->has_read_tier() || !pool->has_write_tier()) {
*ss << "pool '" << pool_name << "' (id '" << pool_id << "')"

View File

@ -737,7 +737,7 @@ COMMAND("osd pool get " \
"get pool parameter <var>", "osd", "r", "cli,rest")
COMMAND("osd pool set " \
"name=pool,type=CephPoolname " \
"name=var,type=CephChoices,strings=size|min_size|crash_replay_interval|pg_num|pgp_num|crush_rule|crush_ruleset|hashpspool|nodelete|nopgchange|nosizechange|write_fadvise_dontneed|noscrub|nodeep-scrub|hit_set_type|hit_set_period|hit_set_count|hit_set_fpp|use_gmt_hitset|debug_fake_ec_pool|target_max_bytes|target_max_objects|cache_target_dirty_ratio|cache_target_dirty_high_ratio|cache_target_full_ratio|cache_min_flush_age|cache_min_evict_age|auid|min_read_recency_for_promote|min_write_recency_for_promote|fast_read|hit_set_grade_decay_rate|hit_set_search_last_n|scrub_min_interval|scrub_max_interval|deep_scrub_interval|recovery_priority|recovery_op_priority|scrub_priority|compression_mode|compression_algorithm|compression_required_ratio|compression_max_blob_size|compression_min_blob_size|csum_type|csum_min_block|csum_max_block|debug_white_box_testing_ec_overwrites " \
"name=var,type=CephChoices,strings=size|min_size|crash_replay_interval|pg_num|pgp_num|crush_rule|crush_ruleset|hashpspool|nodelete|nopgchange|nosizechange|write_fadvise_dontneed|noscrub|nodeep-scrub|hit_set_type|hit_set_period|hit_set_count|hit_set_fpp|use_gmt_hitset|debug_fake_ec_pool|target_max_bytes|target_max_objects|cache_target_dirty_ratio|cache_target_dirty_high_ratio|cache_target_full_ratio|cache_min_flush_age|cache_min_evict_age|auid|min_read_recency_for_promote|min_write_recency_for_promote|fast_read|hit_set_grade_decay_rate|hit_set_search_last_n|scrub_min_interval|scrub_max_interval|deep_scrub_interval|recovery_priority|recovery_op_priority|scrub_priority|compression_mode|compression_algorithm|compression_required_ratio|compression_max_blob_size|compression_min_blob_size|csum_type|csum_min_block|csum_max_block|allow_ec_overwrites " \
"name=val,type=CephString " \
"name=force,type=CephChoices,strings=--yes-i-really-mean-it,req=false", \
"set pool parameter <var> to <val>", "osd", "rw", "cli,rest")

View File

@ -1232,6 +1232,48 @@ int OSDMonitor::load_metadata(int osd, map<string, string>& m, ostream *err)
return 0;
}
int OSDMonitor::get_osd_objectstore_type(int osd, string *type)
{
map<string, string> metadata;
int r = load_metadata(osd, metadata, nullptr);
if (r < 0)
return r;
auto it = metadata.find("osd_objectstore");
if (it == metadata.end())
return -ENOENT;
*type = it->second;
return 0;
}
bool OSDMonitor::is_pool_currently_all_bluestore(int64_t pool_id,
const pg_pool_t &pool,
ostream *err)
{
// just check a few pgs for efficiency - this can't give a guarantee anyway,
// since filestore osds could always join the pool later
set<int> checked_osds;
for (unsigned ps = 0; ps < MIN(8, pool.get_pg_num()); ++ps) {
vector<int> up, acting;
pg_t pgid(ps, pool_id, -1);
osdmap.pg_to_up_acting_osds(pgid, up, acting);
for (int osd : up) {
if (checked_osds.find(osd) != checked_osds.end())
continue;
string objectstore_type;
int r = get_osd_objectstore_type(osd, &objectstore_type);
// allow with missing metadata, e.g. due to an osd never booting yet
if (r < 0 || objectstore_type == "bluestore") {
checked_osds.insert(osd);
continue;
}
*err << "osd." << osd << " uses " << objectstore_type;
return false;
}
}
return true;
}
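The OSD metadata this helper consults can also be inspected by hand from the
CLI (the osd id is illustrative)::

    ceph osd metadata 0 | grep osd_objectstore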
int OSDMonitor::dump_osd_metadata(int osd, Formatter *f, ostream *err)
{
map<string,string> m;
@ -4893,8 +4935,9 @@ void OSDMonitor::check_legacy_ec_plugin(const string& plugin, const string& prof
}
}
int OSDMonitor::normalize_profile(const string& profilename,
ErasureCodeProfile &profile,
int OSDMonitor::normalize_profile(const string& profilename,
ErasureCodeProfile &profile,
bool force,
ostream *ss)
{
ErasureCodeInterfaceRef erasure_code;
@ -4904,10 +4947,39 @@ int OSDMonitor::normalize_profile(const string& profilename,
int err = instance.factory(plugin->second,
g_conf->get_val<std::string>("erasure_code_dir"),
profile, &erasure_code, ss);
if (err)
if (err) {
return err;
}
return erasure_code->init(profile, ss);
err = erasure_code->init(profile, ss);
if (err) {
return err;
}
auto it = profile.find("stripe_unit");
if (it != profile.end()) {
string err_str;
uint32_t stripe_unit = strict_si_cast<uint32_t>(it->second.c_str(), &err_str);
if (!err_str.empty()) {
*ss << "could not parse stripe_unit '" << it->second
<< "': " << err_str << std::endl;
return -EINVAL;
}
uint32_t data_chunks = erasure_code->get_data_chunk_count();
uint32_t chunk_size = erasure_code->get_chunk_size(stripe_unit * data_chunks);
if (chunk_size != stripe_unit) {
*ss << "stripe_unit " << stripe_unit << " does not match ec profile "
<< "alignment. Would be padded to " << chunk_size
<< std::endl;
return -EINVAL;
}
if ((stripe_unit % 4096) != 0 && !force) {
*ss << "stripe_unit should be a multiple of 4096 bytes for best performance."
<< "use --force to override this check" << std::endl;
return -EINVAL;
}
}
return 0;
}
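In CLI terms the two new checks behave roughly as follows (profile names are
hypothetical; 2K satisfies the jerasure chunk alignment but not the 4K rule)::

    ceph osd erasure-code-profile set p_ok k=2 m=1 stripe_unit=8K           # accepted
    ceph osd erasure-code-profile set p_odd k=2 m=1 stripe_unit=2K          # rejected: not a multiple of 4096
    ceph osd erasure-code-profile set p_odd k=2 m=1 stripe_unit=2K --force  # accepted with --force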
int OSDMonitor::crush_ruleset_create_erasure(const string &name,
@ -5130,12 +5202,22 @@ int OSDMonitor::prepare_pool_stripe_width(const unsigned pool_type,
break;
case pg_pool_t::TYPE_ERASURE:
{
ErasureCodeProfile profile =
osdmap.get_erasure_code_profile(erasure_code_profile);
ErasureCodeInterfaceRef erasure_code;
err = get_erasure_code(erasure_code_profile, &erasure_code, ss);
uint32_t desired_stripe_width = g_conf->osd_pool_erasure_code_stripe_width;
if (err == 0)
*stripe_width = erasure_code->get_data_chunk_count() *
erasure_code->get_chunk_size(desired_stripe_width);
if (err)
break;
uint32_t data_chunks = erasure_code->get_data_chunk_count();
uint32_t stripe_unit = g_conf->osd_pool_erasure_code_stripe_unit;
auto it = profile.find("stripe_unit");
if (it != profile.end()) {
string err_str;
stripe_unit = strict_si_cast<uint32_t>(it->second.c_str(), &err_str);
assert(err_str.empty());
}
*stripe_width = data_chunks *
erasure_code->get_chunk_size(stripe_unit * data_chunks);
}
break;
default:
@ -5728,18 +5810,13 @@ int OSDMonitor::prepare_command_pool_set(map<string,cmd_vartype> &cmdmap,
ss << "expecting value 'true' or '1'";
return -EINVAL;
}
} else if (var == "debug_white_box_testing_ec_overwrites") {
} else if (var == "allow_ec_overwrites") {
if (!p.is_erasure()) {
ss << "ec overwrites can only be enabled for an erasure coded pool";
return -EINVAL;
}
if (val == "true" || (interr.empty() && n == 1)) {
if (cct->check_experimental_feature_enabled(
"debug_white_box_testing_ec_overwrites")) {
p.flags |= pg_pool_t::FLAG_EC_OVERWRITES;
} else {
ss << "debug_white_box_testing_ec_overwrites is an experimental feature "
<< "and must be enabled. Note, this feature does not yet actually "
<< "work. This flag merely enables some of the preliminary support "
<< "for testing purposes.";
return -ENOTSUP;
}
} else if (val == "false" || (interr.empty() && n == 0)) {
ss << "ec overwrites cannot be disabled once enabled";
return -EINVAL;
@ -5747,6 +5824,11 @@ int OSDMonitor::prepare_command_pool_set(map<string,cmd_vartype> &cmdmap,
ss << "expecting value 'true', 'false', '0', or '1'";
return -EINVAL;
}
stringstream err;
if (!is_pool_currently_all_bluestore(pool, p, &err)) {
ss << "pool must only be stored on bluestore for scrubbing to work: " << err.str();
return -EINVAL;
}
} else if (var == "target_max_objects") {
if (interr.length()) {
ss << "error parsing int '" << val << "': " << interr;
@ -6831,14 +6913,14 @@ bool OSDMonitor::prepare_command_impl(MonOpRequestRef op,
if (err)
goto reply;
}
err = normalize_profile(name, profile_map, &ss);
err = normalize_profile(name, profile_map, force, &ss);
if (err)
goto reply;
if (osdmap.has_erasure_code_profile(name)) {
ErasureCodeProfile existing_profile_map =
osdmap.get_erasure_code_profile(name);
err = normalize_profile(name, existing_profile_map, &ss);
err = normalize_profile(name, existing_profile_map, force, &ss);
if (err)
goto reply;
@ -6892,7 +6974,7 @@ bool OSDMonitor::prepare_command_impl(MonOpRequestRef op,
&ss);
if (err)
goto reply;
err = normalize_profile(name, profile_map, &ss);
err = normalize_profile(name, profile_map, true, &ss);
if (err)
goto reply;
dout(20) << "erasure code profile set " << profile << "="

View File

@ -297,6 +297,7 @@ private:
const string& profile) const;
int normalize_profile(const string& profilename,
ErasureCodeProfile &profile,
bool force,
ostream *ss);
int crush_ruleset_create_erasure(const string &name,
const string &profile,
@ -410,6 +411,9 @@ private:
OpTracker op_tracker;
int load_metadata(int osd, map<string, string>& m, ostream *err);
int get_osd_objectstore_type(int osd, std::string *type);
bool is_pool_currently_all_bluestore(int64_t pool_id, const pg_pool_t &pool,
ostream *err);
// when we last received PG stats from each osd
map<int,utime_t> last_osd_report;

View File

@ -954,7 +954,7 @@ void ECBackend::handle_sub_read(
++i) {
int r = 0;
ECUtil::HashInfoRef hinfo;
if (!get_parent()->get_pool().is_hacky_ecoverwrites()) {
if (!get_parent()->get_pool().allows_ecoverwrites()) {
hinfo = get_hash_info(i->first);
if (!hinfo) {
r = -EIO;
@ -989,7 +989,7 @@ void ECBackend::handle_sub_read(
);
}
if (!get_parent()->get_pool().is_hacky_ecoverwrites()) {
if (!get_parent()->get_pool().allows_ecoverwrites()) {
// This shows that we still need deep scrub because large enough files
// are read in sections, so the digest check here won't be done here.
// Do NOT check osd_read_eio_on_bad_digest here. We need to report
@ -1732,7 +1732,7 @@ bool ECBackend::try_state_to_reads()
Op *op = &(waiting_state.front());
if (op->requires_rmw() && pipeline_state.cache_invalid()) {
assert(get_parent()->get_pool().is_hacky_ecoverwrites());
assert(get_parent()->get_pool().allows_ecoverwrites());
dout(20) << __func__ << ": blocking " << *op
<< " because it requires an rmw and the cache is invalid "
<< pipeline_state
@ -1786,7 +1786,7 @@ bool ECBackend::try_state_to_reads()
dout(10) << __func__ << ": " << *op << dendl;
if (!op->remote_read.empty()) {
assert(get_parent()->get_pool().is_hacky_ecoverwrites());
assert(get_parent()->get_pool().allows_ecoverwrites());
objects_read_async_no_cache(
op->remote_read,
[this, op](map<hobject_t,pair<int, extent_map> > &&results) {
@ -1859,7 +1859,7 @@ bool ECBackend::try_reads_to_commit()
dout(20) << __func__ << ": written: " << written << dendl;
dout(20) << __func__ << ": op: " << *op << dendl;
if (!get_parent()->get_pool().is_hacky_ecoverwrites()) {
if (!get_parent()->get_pool().allows_ecoverwrites()) {
for (auto &&i: op->log_entries) {
if (i.requires_kraken()) {
derr << __func__ << ": log entry " << i << " requires kraken"
@ -2365,7 +2365,7 @@ void ECBackend::be_deep_scrub(
o.digest_present = false;
return;
} else {
if (!get_parent()->get_pool().is_hacky_ecoverwrites()) {
if (!get_parent()->get_pool().allows_ecoverwrites()) {
assert(hinfo->has_chunk_hash());
if (hinfo->get_total_chunk_size() != pos) {
dout(0) << "_scan_list " << poid << " got incorrect size on read" << dendl;

View File

@ -3894,6 +3894,13 @@ int OSD::handle_pg_peering_evt(
switch (result) {
case RES_NONE: {
const pg_pool_t* pp = osdmap->get_pg_pool(pgid.pool());
if (pp->has_flag(pg_pool_t::FLAG_EC_OVERWRITES) &&
store->get_type() != "bluestore") {
clog->warn() << "pg " << pgid
<< " is at risk of silent data corruption: "
<< "the pool allows ec overwrites but is not stored in "
<< "bluestore, so deep scrubbing will not detect bitrot";
}
PG::_create(*rctx.transaction, pgid, pgid.get_split_bits(pp->get_pg_num()));
PG::_init(*rctx.transaction, pgid, pp);

View File

@ -1448,7 +1448,7 @@ public:
}
uint64_t required_alignment() const { return stripe_width; }
bool is_hacky_ecoverwrites() const {
bool allows_ecoverwrites() const {
return has_flag(FLAG_EC_OVERWRITES);
}

View File

@ -18,7 +18,7 @@
source $(dirname $0)/detect-build-env-vars.sh
CEPH_CLI_TEST_DUP_COMMAND=1 \
MDS=1 MON=1 OSD=3 MGR=0 CEPH_PORT=7200 $CEPH_ROOT/src/test/vstart_wrapper.sh \
MDS=1 MON=1 OSD=3 MGR=0 CEPH_PORT=7200 CEPH_OBJECTSTORE="bluestore" $CEPH_ROOT/src/test/vstart_wrapper.sh \
$CEPH_ROOT/qa/workunits/cephtool/test.sh \
--test-mds \
--asok-does-not-need-root

View File

@ -615,8 +615,8 @@ TEST(ErasureCodeLrc, encode_decode)
profile["layers"] = description_string;
EXPECT_EQ(0, lrc.init(profile, &cerr));
EXPECT_EQ(4U, lrc.get_data_chunk_count());
unsigned int stripe_width = g_conf->osd_pool_erasure_code_stripe_width;
unsigned int chunk_size = stripe_width / lrc.get_data_chunk_count();
unsigned int chunk_size = g_conf->osd_pool_erasure_code_stripe_unit;
unsigned int stripe_width = lrc.get_data_chunk_count() * chunk_size;
EXPECT_EQ(chunk_size, lrc.get_chunk_size(stripe_width));
set<int> want_to_encode;
map<int, bufferlist> encoded;
@ -745,8 +745,8 @@ TEST(ErasureCodeLrc, encode_decode_2)
profile["layers"] = description_string;
EXPECT_EQ(0, lrc.init(profile, &cerr));
EXPECT_EQ(4U, lrc.get_data_chunk_count());
unsigned int stripe_width = g_conf->osd_pool_erasure_code_stripe_width;
unsigned int chunk_size = stripe_width / lrc.get_data_chunk_count();
unsigned int chunk_size = g_conf->osd_pool_erasure_code_stripe_unit;
unsigned int stripe_width = lrc.get_data_chunk_count() * chunk_size;
EXPECT_EQ(chunk_size, lrc.get_chunk_size(stripe_width));
set<int> want_to_encode;
map<int, bufferlist> encoded;

View File

@ -259,8 +259,9 @@ function TEST_alignment_constraints() {
# imposed by the stripe width
# See http://tracker.ceph.com/issues/8622
#
local stripe_width=$(ceph-conf --show-config-value osd_pool_erasure_code_stripe_width)
local block_size=$((stripe_width - 1))
local stripe_unit=$(ceph-conf --show-config-value osd_pool_erasure_code_stripe_unit)
eval local $(ceph osd erasure-code-profile get myprofile | grep k=)
local block_size=$((stripe_unit * k - 1))
dd if=/dev/zero of=$dir/ORIGINAL bs=$block_size count=2
rados --block-size=$block_size \
--pool ecpool put UNALIGNED $dir/ORIGINAL || return 1
@ -268,9 +269,7 @@ function TEST_alignment_constraints() {
}
function chunk_size() {
local stripe_width=$(ceph-conf --show-config-value osd_pool_erasure_code_stripe_width)
eval local $(ceph osd erasure-code-profile get default | grep k=)
echo $(($stripe_width / $k))
echo $(ceph-conf --show-config-value osd_pool_erasure_code_stripe_unit)
}
#

View File

@ -122,30 +122,32 @@ function TEST_erasure_code_profile_default() {
ceph osd erasure-code-profile ls | grep default || return 1
}
function TEST_erasure_crush_stripe_width() {
function TEST_erasure_crush_stripe_unit() {
local dir=$1
# the default stripe width is used to initialize the pool
# the default stripe unit is used to initialize the pool
run_mon $dir a --public-addr $CEPH_MON
stripe_width=$(ceph-conf --show-config-value osd_pool_erasure_code_stripe_width)
stripe_unit=$(ceph-conf --show-config-value osd_pool_erasure_code_stripe_unit)
eval local $(ceph osd erasure-code-profile get myprofile | grep k=)
stripe_width=$((stripe_unit * k))
ceph osd pool create pool_erasure 12 12 erasure
ceph --format json osd dump | tee $dir/osd.json
grep '"stripe_width":'$stripe_width $dir/osd.json > /dev/null || return 1
}
function TEST_erasure_crush_stripe_width_padded() {
function TEST_erasure_crush_stripe_unit_padded() {
local dir=$1
# setting osd_pool_erasure_code_stripe_width modifies the stripe_width
# setting osd_pool_erasure_code_stripe_unit modifies the stripe_width
# and it is padded as required by the default plugin
profile+=" plugin=jerasure"
profile+=" technique=reed_sol_van"
k=4
profile+=" k=$k"
profile+=" m=2"
expected_chunk_size=2048
actual_stripe_width=$(($expected_chunk_size * $k))
desired_stripe_width=$(($actual_stripe_width - 1))
actual_stripe_unit=2048
desired_stripe_unit=$((actual_stripe_unit - 1))
actual_stripe_width=$((actual_stripe_unit * k))
run_mon $dir a \
--osd_pool_erasure_code_stripe_width $desired_stripe_width \
--osd_pool_erasure_code_stripe_unit $desired_stripe_unit \
--osd_pool_default_erasure_code_profile "$profile" || return 1
ceph osd pool create pool_erasure 12 12 erasure
ceph osd dump | tee $dir/osd.json

View File

@ -1,4 +1,4 @@
#!/bin/bash
#!/bin/bash -x
#
# Copyright (C) 2014 Red Hat <contact@redhat.com>
#
@ -154,11 +154,6 @@ function corrupt_and_repair_one() {
function corrupt_and_repair_erasure_coded() {
local dir=$1
local poolname=$2
local profile=$3
ceph osd pool create $poolname 1 1 erasure $profile \
|| return 1
wait_for_clean || return 1
add_something $dir $poolname || return 1
@ -176,8 +171,25 @@ function corrupt_and_repair_erasure_coded() {
}
function TEST_auto_repair_erasure_coded() {
function create_ec_pool() {
local poolname=$1
local allow_overwrites=$2
ceph osd erasure-code-profile set myprofile ruleset-failure-domain=osd $3 $4 $5 $6 $7 || return 1
ceph osd pool create "$poolname" 1 1 erasure myprofile || return 1
if [ "$allow_overwrites" = "true" ]; then
ceph osd pool set "$poolname" allow_ec_overwrites true || return 1
fi
wait_for_clean || return 1
return 0
}
function auto_repair_erasure_coded() {
local dir=$1
local allow_overwrites=$2
local poolname=ecpool
# Launch a cluster with 5 seconds scrub interval
@ -190,15 +202,16 @@ function TEST_auto_repair_erasure_coded() {
--osd-scrub-min-interval=5 \
--osd-scrub-interval-randomize-ratio=0"
for id in $(seq 0 2) ; do
run_osd $dir $id $ceph_osd_args
if [ "$allow_overwrites" = "true" ]; then
run_osd_bluestore $dir $id $ceph_osd_args || return 1
else
run_osd $dir $id $ceph_osd_args || return 1
fi
done
wait_for_clean || return 1
# Create an EC pool
ceph osd erasure-code-profile set myprofile \
k=2 m=1 ruleset-failure-domain=osd || return 1
ceph osd pool create $poolname 8 8 erasure myprofile || return 1
wait_for_clean || return 1
create_ec_pool $poolname $allow_overwrites k=2 m=1 || return 1
# Put an object
local payload=ABCDEF
@ -222,69 +235,95 @@ function TEST_auto_repair_erasure_coded() {
teardown $dir || return 1
}
function TEST_corrupt_and_repair_jerasure() {
function TEST_auto_repair_erasure_coded_appends() {
auto_repair_erasure_coded $1 false
}
function TEST_auto_repair_erasure_coded_overwrites() {
auto_repair_erasure_coded $1 true
}
function corrupt_and_repair_jerasure() {
local dir=$1
local allow_overwrites=$2
local poolname=ecpool
local profile=myprofile
setup $dir || return 1
run_mon $dir a || return 1
run_mgr $dir x || return 1
for id in $(seq 0 3) ; do
run_osd $dir $id || return 1
if [ "$allow_overwrites" = "true" ]; then
run_osd_bluestore $dir $id || return 1
else
run_osd $dir $id || return 1
fi
done
wait_for_clean || return 1
ceph osd erasure-code-profile set $profile \
k=2 m=2 ruleset-failure-domain=osd || return 1
corrupt_and_repair_erasure_coded $dir $poolname $profile || return 1
create_ec_pool $poolname $allow_overwrites k=2 m=2 || return 1
corrupt_and_repair_erasure_coded $dir $poolname || return 1
teardown $dir || return 1
}
function TEST_corrupt_and_repair_lrc() {
function TEST_corrupt_and_repair_jerasure_appends() {
corrupt_and_repair_jerasure $1
}
function TEST_corrupt_and_repair_jerasure_overwrites() {
corrupt_and_repair_jerasure $1 true
}
function corrupt_and_repair_lrc() {
local dir=$1
local allow_overwrites=$2
local poolname=ecpool
local profile=myprofile
setup $dir || return 1
run_mon $dir a || return 1
run_mgr $dir x || return 1
for id in $(seq 0 9) ; do
run_osd $dir $id || return 1
if [ "$allow_overwrites" = "true" ]; then
run_osd_bluestore $dir $id || return 1
else
run_osd $dir $id || return 1
fi
done
wait_for_clean || return 1
ceph osd erasure-code-profile set $profile \
pluing=lrc \
k=4 m=2 l=3 \
ruleset-failure-domain=osd || return 1
corrupt_and_repair_erasure_coded $dir $poolname $profile || return 1
create_ec_pool $poolname $allow_overwrites k=4 m=2 l=3 plugin=lrc || return 1
corrupt_and_repair_erasure_coded $dir $poolname || return 1
teardown $dir || return 1
}
function TEST_unfound_erasure_coded() {
function TEST_corrupt_and_repair_lrc_appends() {
corrupt_and_repair_lrc $1
}
function TEST_corrupt_and_repair_lrc_overwrites() {
corrupt_and_repair_lrc $1 true
}
function unfound_erasure_coded() {
local dir=$1
local allow_overwrites=$2
local poolname=ecpool
local payload=ABCDEF
setup $dir || return 1
run_mon $dir a || return 1
run_mgr $dir x || return 1
run_osd $dir 0 || return 1
run_osd $dir 1 || return 1
run_osd $dir 2 || return 1
run_osd $dir 3 || return 1
for id in $(seq 0 3) ; do
if [ "$allow_overwrites" = "true" ]; then
run_osd_bluestore $dir $id || return 1
else
run_osd $dir $id || return 1
fi
done
wait_for_clean || return 1
ceph osd erasure-code-profile set myprofile \
k=2 m=2 ruleset-failure-domain=osd || return 1
ceph osd pool create $poolname 1 1 erasure myprofile \
|| return 1
wait_for_clean || return 1
create_ec_pool $poolname $allow_overwrites k=2 m=2 || return 1
add_something $dir $poolname || return 1
@ -324,27 +363,35 @@ function TEST_unfound_erasure_coded() {
teardown $dir || return 1
}
function TEST_unfound_erasure_coded_appends() {
unfound_erasure_coded $1
}
function TEST_unfound_erasure_coded_overwrites() {
unfound_erasure_coded $1 true
}
#
# list_missing for EC pool
#
function TEST_list_missing_erasure_coded() {
function list_missing_erasure_coded() {
local dir=$1
local allow_overwrites=$2
local poolname=ecpool
local profile=myprofile
setup $dir || return 1
run_mon $dir a || return 1
run_mgr $dir x || return 1
for id in $(seq 0 2) ; do
run_osd $dir $id || return 1
if [ "$allow_overwrites" = "true" ]; then
run_osd_bluestore $dir $id || return 1
else
run_osd $dir $id || return 1
fi
done
wait_for_clean || return 1
ceph osd erasure-code-profile set $profile \
k=2 m=1 ruleset-failure-domain=osd || return 1
ceph osd pool create $poolname 1 1 erasure $profile \
|| return 1
wait_for_clean || return 1
create_ec_pool $poolname $allow_overwrites k=2 m=1 || return 1
# Put an object and remove the two shards (including primary)
add_something $dir $poolname MOBJ0 || return 1
@ -360,17 +407,17 @@ function TEST_list_missing_erasure_coded() {
done
id=${osds0[0]}
ceph-objectstore-tool --data-path $dir/$id --journal-path $dir/$id/journal \
ceph-objectstore-tool --data-path $dir/$id --enable-experimental-unrecoverable-data-corrupting-features=bluestore \
MOBJ0 remove || return 1
id=${osds0[1]}
ceph-objectstore-tool --data-path $dir/$id --journal-path $dir/$id/journal \
ceph-objectstore-tool --data-path $dir/$id --enable-experimental-unrecoverable-data-corrupting-features=bluestore \
MOBJ0 remove || return 1
id=${osds1[1]}
ceph-objectstore-tool --data-path $dir/$id --journal-path $dir/$id/journal \
ceph-objectstore-tool --data-path $dir/$id --enable-experimental-unrecoverable-data-corrupting-features=bluestore \
MOBJ1 remove || return 1
id=${osds1[2]}
ceph-objectstore-tool --data-path $dir/$id --journal-path $dir/$id/journal \
ceph-objectstore-tool --data-path $dir/$id --enable-experimental-unrecoverable-data-corrupting-features=bluestore \
MOBJ1 remove || return 1
for id in $(seq 0 2) ; do
@ -394,6 +441,14 @@ function TEST_list_missing_erasure_coded() {
teardown $dir || return 1
}
function TEST_list_missing_erasure_coded_appends() {
list_missing_erasure_coded $1 false
}
function TEST_list_missing_erasure_coded_overwrites() {
list_missing_erasure_coded $1 true
}
#
# Corrupt one copy of a replicated pool
#
@ -1513,25 +1568,25 @@ EOF
#
# Test scrub errors for an erasure coded pool
#
function TEST_corrupt_scrub_erasure() {
function corrupt_scrub_erasure() {
local dir=$1
local allow_overwrites=$2
local poolname=ecpool
local profile=myprofile
local total_objs=5
setup $dir || return 1
run_mon $dir a || return 1
run_mgr $dir x || return 1
for id in $(seq 0 2) ; do
run_osd $dir $id || return 1
if [ "$allow_overwrites" = "true" ]; then
run_osd_bluestore $dir $id || return 1
else
run_osd $dir $id || return 1
fi
done
wait_for_clean || return 1
ceph osd erasure-code-profile set $profile \
k=2 m=1 ruleset-failure-domain=osd || return 1
ceph osd pool create $poolname 1 1 erasure $profile \
|| return 1
wait_for_clean || return 1
create_ec_pool $poolname $allow_overwrites k=2 m=1 stripe_unit=2K --force || return 1
for i in $(seq 1 $total_objs) ; do
objname=EOBJ${i}
@ -2151,6 +2206,11 @@ EOF
EOF
jq "$jqfilter" $dir/json | python -c "$sortkeys" | sed -e "$sedfilter" > $dir/csjson
if [ "$allow_overwrites" = "true" ]
then
grep -v data_digest $dir/csjson | grep -v ec_size_error > $dir/csjson.tmp
mv $dir/csjson.tmp $dir/csjson
grep -v data_digest $dir/checkcsjson | grep -v ec_size_error > $dir/checkcsjson.tmp
mv $dir/checkcsjson.tmp $dir/checkcsjson
fi
diff -y $termwidth $dir/checkcsjson $dir/csjson || test $getjson = "yes" || return 1
if test $getjson = "yes"
then
@ -2166,6 +2226,14 @@ EOF
teardown $dir || return 1
}
function TEST_corrupt_scrub_erasure_appends() {
corrupt_scrub_erasure $1 false
}
function TEST_corrupt_scrub_erasure_overwrites() {
corrupt_scrub_erasure $1 true
}
#
# Test to make sure that a periodic scrub won't cause deep-scrub info to be lost
#

View File

@ -30,8 +30,13 @@ function vstart_setup()
trap "teardown $CEPH_DIR" EXIT
export LC_ALL=C # some tests are vulnerable to i18n
export PATH="$(pwd):${PATH}"
OBJSTORE_ARGS=""
if [ "bluestore" = "${CEPH_OBJECTSTORE}" ]; then
OBJSTORE_ARGS="-b"
fi
$CEPH_ROOT/src/vstart.sh \
--short \
$OBJSTORE_ARGS \
-o 'paxos propose interval = 0.01' \
-n -l || return 1
export CEPH_CONF=$CEPH_DIR/ceph.conf