Mirror of https://github.com/ceph/ceph (synced 2025-02-24 19:47:44 +00:00)

Merge pull request #55096 from athanatos/sjust/for-review/wip-crush-msr

crush: add multistep retry rules

Reviewed-by: Laura Flores <lflores@redhat.com>

commit 37d5d931b0
@ -419,7 +419,7 @@ centers for three-way replication, and yet another rule for erasure coding acros
|
||||
six storage devices. For a detailed discussion of CRUSH rules, see **Section 3.2**
|
||||
of `CRUSH - Controlled, Scalable, Decentralized Placement of Replicated Data`_.
|
||||
|
||||
A rule takes the following form::
|
||||
A normal CRUSH rule takes the following form::
|
||||
|
||||
rule <rulename> {
|
||||
|
||||
@ -430,6 +430,18 @@ A rule takes the following form::
|
||||
step emit
|
||||
}
|
||||
|
||||
CRUSH MSR rules are a distinct type of CRUSH rule which supports retrying steps
|
||||
and provides better support for configurations that require multiple OSDs within
|
||||
each failure domain. MSR rules take the following form::
|
||||
|
||||
rule <rulename> {
|
||||
|
||||
id [a unique integer ID]
|
||||
type [msr_indep|msr_firstn]
|
||||
step take <bucket-name> [class <device-class>]
|
||||
step choosemsr <N> type <bucket-type>
|
||||
step emit
|
||||
}
|
||||
|
||||
``id``
|
||||
:Description: A unique integer that identifies the rule.
|
||||
@ -441,12 +453,14 @@ A rule takes the following form::
|
||||
|
||||
``type``
|
||||
:Description: Denotes the type of replication strategy to be enforced by the
|
||||
rule.
|
||||
rule. ``msr_firstn`` and ``msr_indep`` use a distinct descent
algorithm that supports retrying steps within the rule and therefore
multiple OSDs per failure domain.
|
||||
:Purpose: A component of the rule mask.
|
||||
:Type: String
|
||||
:Required: Yes
|
||||
:Default: ``replicated``
|
||||
:Valid Values: ``replicated`` or ``erasure``
|
||||
:Valid Values: ``replicated``, ``erasure``, ``msr_firstn``, ``msr_indep``
|
||||
|
||||
|
||||
``step take <bucket-name> [class <device-class>]``
|
||||
@ -525,6 +539,16 @@ A rule takes the following form::
|
||||
final CRUSH mapping transformation is therefore 1, 2, 3, 4, 5
|
||||
→ 1, 2, 6, 4, 5.
|
||||
|
||||
``step choosemsr {num} type {bucket-type}``
|
||||
:Description: Selects ``{num}`` buckets of type ``{bucket-type}``. ``msr_firstn``
and ``msr_indep`` rules must use ``choosemsr`` rather than ``choose`` or ``chooseleaf``.
|
||||
|
||||
- If ``{num} == 0``, choose ``pool-num-replicas`` buckets (as many buckets as are available).
|
||||
- If ``pool-num-replicas > {num} > 0``, choose that many buckets.
|
||||
:Purpose: Choose step required for msr_firstn and msr_indep rules.
|
||||
:Prerequisite: Follows ``step take`` and precedes ``step emit``
|
||||
:Example: ``step choosemsr 3 type host``
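Review note: the two ``{num}`` cases listed for ``choosemsr`` can be sketched as a small function. This is a minimal illustration that follows only the documentation text in this hunk. The helper name `resolve_choosemsr_count` and the `-1` sentinel are assumptions made for the example and are not part of Ceph.

```cpp
#include <cassert>

// Hypothetical helper (not a Ceph function): returns how many buckets a
// "step choosemsr {num} type {bucket-type}" line requests, based only on
// the two cases listed in the documentation hunk above.
// Returns -1 for any input the documentation excerpt does not cover.
int resolve_choosemsr_count(int num, int pool_num_replicas) {
  assert(pool_num_replicas > 0);  // assumption: a pool has at least one replica
  if (num == 0) {
    // "{num} == 0": choose pool-num-replicas buckets.
    return pool_num_replicas;
  }
  if (num > 0 && num < pool_num_replicas) {
    // "pool-num-replicas > {num} > 0": choose that many buckets.
    return num;
  }
  return -1;  // not described in the excerpt; left undefined in this sketch
}
```

For a pool with six replicas, `step choosemsr 0 type host` would resolve to six buckets under this reading and `step choosemsr 3 type host` to three.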
|
||||
|
||||
.. _crush-reclassify:
|
||||
|
||||
Migrating from a legacy SSD rule to device classes
|
||||
|
@ -709,6 +709,13 @@ The relevant erasure-code profile properties are as follows:
|
||||
[default: ``default``].
|
||||
* **crush-failure-domain**: the CRUSH bucket type used in the distribution of
|
||||
erasure-coded shards [default: ``host``].
|
||||
* **crush-osds-per-failure-domain**: the maximum number of OSDs to place in each
failure domain [default: ``1``]. A value greater than one causes a CRUSH MSR
rule to be created (see the CRUSH MSR Rules section). Must be specified if
``crush-num-failure-domains`` is specified.
* **crush-num-failure-domains**: the number of failure domains to map. Must be
specified if ``crush-osds-per-failure-domain`` is specified. Results in
a CRUSH MSR rule being created.
|
||||
* **crush-device-class**: the device class on which to place data [default:
|
||||
none, which means that all devices are used].
|
||||
* **k** and **m** (and, for the ``lrc`` plugin, **l**): these determine the
|
||||
@ -726,6 +733,21 @@ The relevant erasure-code profile properties are as follows:
|
||||
argument is omitted, then Ceph will create the CRUSH rule automatically.
|
||||
|
||||
|
||||
CRUSH MSR Rules
|
||||
---------------
|
||||
|
||||
Creating an erasure-code profile with a ``crush-osds-per-failure-domain``
value greater than one causes a CRUSH MSR rule to be created instead of
a normal CRUSH rule. Normal CRUSH rules cannot retry prior steps when an
out OSD is encountered; they rely on CHOOSELEAF steps to permit moving
OSDs to new hosts. However, CHOOSELEAF rules do not support more than a
single OSD per failure domain. MSR rules, new in Squid, support multiple
OSDs per failure domain by retrying all prior steps when an out OSD is
encountered. Using MSR rules requires that OSDs and clients support the
``CRUSH_MSR`` feature bit (Squid or newer).
|
||||
|
||||
|
||||
Deleting rules
|
||||
--------------
|
||||
|
||||
|
@ -11,7 +11,9 @@ tasks:
|
||||
k: 4
|
||||
m: 2
|
||||
technique: reed_sol_van
|
||||
crush-failure-domain: osd
|
||||
crush-failure-domain: host
|
||||
crush-osds-per-failure-domain: 2
|
||||
crush-num-failure-domains: 3
|
||||
op_weights:
|
||||
read: 100
|
||||
write: 0
|
||||
|
@ -79,7 +79,7 @@ class ECPTest(DashboardTestCase):
|
||||
self.assertStatus(201)
|
||||
|
||||
self._get('/api/erasure_code_profile/lrc')
|
||||
self.assertJsonBody({
|
||||
self.assertJsonSubset({
|
||||
'crush-device-class': '',
|
||||
'crush-failure-domain': 'host',
|
||||
'crush-root': 'default',
|
||||
|
@ -321,6 +321,13 @@ int CrushCompiler::decompile(ostream &out)
|
||||
if (crush.get_allowed_bucket_algs() != CRUSH_LEGACY_ALLOWED_BUCKET_ALGS)
|
||||
out << "tunable allowed_bucket_algs " << crush.get_allowed_bucket_algs()
|
||||
<< "\n";
|
||||
if (crush.has_nondefault_tunables_msr()) {
|
||||
out << "tunable msr_descents " << crush.get_msr_descents()
|
||||
<< "\n";
|
||||
out << "tunable msr_collision_tries "
|
||||
<< crush.get_msr_collision_tries()
|
||||
<< "\n";
|
||||
}
|
||||
|
||||
out << "\n# devices\n";
|
||||
for (int i=0; i<crush.get_max_devices(); i++) {
|
||||
@ -363,12 +370,18 @@ int CrushCompiler::decompile(ostream &out)
|
||||
out << "\tid " << i << "\n";
|
||||
|
||||
switch (crush.get_rule_type(i)) {
|
||||
case CEPH_PG_TYPE_REPLICATED:
|
||||
case CRUSH_RULE_TYPE_REPLICATED:
|
||||
out << "\ttype replicated\n";
|
||||
break;
|
||||
case CEPH_PG_TYPE_ERASURE:
|
||||
case CRUSH_RULE_TYPE_ERASURE:
|
||||
out << "\ttype erasure\n";
|
||||
break;
|
||||
case CRUSH_RULE_TYPE_MSR_FIRSTN:
|
||||
out << "\ttype msr_firstn\n";
|
||||
break;
|
||||
case CRUSH_RULE_TYPE_MSR_INDEP:
|
||||
out << "\ttype msr_indep\n";
|
||||
break;
|
||||
default:
|
||||
out << "\ttype " << crush.get_rule_type(i) << "\n";
|
||||
}
|
||||
@ -422,6 +435,15 @@ int CrushCompiler::decompile(ostream &out)
|
||||
out << "\tstep set_chooseleaf_stable " << crush.get_rule_arg1(i, j)
|
||||
<< "\n";
|
||||
break;
|
||||
case CRUSH_RULE_SET_MSR_DESCENTS:
|
||||
out << "\tstep set_msr_descents " << crush.get_rule_arg1(i, j)
|
||||
<< "\n";
|
||||
break;
|
||||
case CRUSH_RULE_SET_MSR_COLLISION_TRIES:
|
||||
out << "\tstep set_msr_collision_tries "
|
||||
<< crush.get_rule_arg1(i, j)
|
||||
<< "\n";
|
||||
break;
|
||||
case CRUSH_RULE_CHOOSE_FIRSTN:
|
||||
out << "\tstep choose firstn "
|
||||
<< crush.get_rule_arg1(i, j)
|
||||
@ -450,6 +472,13 @@ int CrushCompiler::decompile(ostream &out)
|
||||
print_type_name(out, crush.get_rule_arg2(i, j), crush);
|
||||
out << "\n";
|
||||
break;
|
||||
case CRUSH_RULE_CHOOSE_MSR:
|
||||
out << "\tstep choosemsr "
|
||||
<< crush.get_rule_arg1(i, j)
|
||||
<< " type ";
|
||||
print_type_name(out, crush.get_rule_arg2(i, j), crush);
|
||||
out << "\n";
|
||||
break;
|
||||
}
|
||||
}
|
||||
out << "}\n";
|
||||
@ -532,6 +561,10 @@ int CrushCompiler::parse_tunable(iter_t const& i)
|
||||
crush.set_straw_calc_version(val);
|
||||
else if (name == "allowed_bucket_algs")
|
||||
crush.set_allowed_bucket_algs(val);
|
||||
else if (name == "msr_descents")
|
||||
crush.set_msr_descents(val);
|
||||
else if (name == "msr_collision_tries")
|
||||
crush.set_msr_collision_tries(val);
|
||||
else {
|
||||
err << "tunable " << name << " not recognized" << std::endl;
|
||||
return -1;
|
||||
@ -781,9 +814,13 @@ int CrushCompiler::parse_rule(iter_t const& i)
|
||||
string tname = string_node(i->children[start+2]);
|
||||
int type;
|
||||
if (tname == "replicated")
|
||||
type = CEPH_PG_TYPE_REPLICATED;
|
||||
type = CRUSH_RULE_TYPE_REPLICATED;
|
||||
else if (tname == "erasure")
|
||||
type = CEPH_PG_TYPE_ERASURE;
|
||||
type = CRUSH_RULE_TYPE_ERASURE;
|
||||
else if (tname == "msr_firstn")
|
||||
type = CRUSH_RULE_TYPE_MSR_FIRSTN;
|
||||
else if (tname == "msr_indep")
|
||||
type = CRUSH_RULE_TYPE_MSR_INDEP;
|
||||
else
|
||||
ceph_abort();
|
||||
|
||||
@ -905,6 +942,18 @@ int CrushCompiler::parse_rule(iter_t const& i)
|
||||
crush.set_rule_step_set_chooseleaf_stable(ruleno, step++, val);
|
||||
}
|
||||
break;
|
||||
case crush_grammar::_step_set_msr_descents:
|
||||
{
|
||||
int val = int_node(s->children[1]);
|
||||
crush.set_rule_step_set_msr_descents(ruleno, step++, val);
|
||||
}
|
||||
break;
|
||||
case crush_grammar::_step_set_msr_collision_tries:
|
||||
{
|
||||
int val = int_node(s->children[1]);
|
||||
crush.set_rule_step_set_msr_collision_tries(ruleno, step++, val);
|
||||
}
|
||||
break;
|
||||
|
||||
case crush_grammar::_step_choose:
|
||||
case crush_grammar::_step_chooseleaf:
|
||||
@ -932,6 +981,17 @@ int CrushCompiler::parse_rule(iter_t const& i)
|
||||
}
|
||||
break;
|
||||
|
||||
case crush_grammar::_step_choose_msr:
|
||||
{
|
||||
string type = string_node(s->children[3]);
|
||||
if (!type_id.count(type)) {
|
||||
err << "in rule '" << rname << "' type '" << type << "' not defined" << std::endl;
|
||||
return -1;
|
||||
}
|
||||
crush.set_rule_step_choose_msr(ruleno, step++, int_node(s->children[1]), type_id[type]);
|
||||
}
|
||||
break;
|
||||
|
||||
case crush_grammar::_step_emit:
|
||||
crush.set_rule_step_emit(ruleno, step++);
|
||||
break;
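Review note: the `_step_choose_msr` case above reads the count from child 1 and the type name from child 3 of the parsed `step choosemsr <N> type <name>` node. A minimal standalone sketch of the same token layout follows. `ChooseMsrStep` and `parse_choosemsr` are hypothetical names for illustration only; the sketch uses plain stream extraction rather than the Boost.Spirit grammar and omits the `type_id` lookup that the real parser performs.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical result type for this sketch; not a Ceph structure.
struct ChooseMsrStep {
  int num = 0;
  std::string bucket_type;
};

// Parses a line of the form "step choosemsr <N> type <bucket-type>".
// Returns true and fills *out on success; returns false on any mismatch.
bool parse_choosemsr(const std::string& line, ChooseMsrStep* out) {
  assert(out != nullptr);
  std::istringstream in(line);
  std::string step_kw, op_kw, type_kw;
  ChooseMsrStep parsed;
  // Token order mirrors the grammar: keyword, op, integer, "type", name.
  if (!(in >> step_kw >> op_kw >> parsed.num >> type_kw >> parsed.bucket_type)) {
    return false;
  }
  if (step_kw != "step" || op_kw != "choosemsr" || type_kw != "type") {
    return false;
  }
  std::string extra;
  if (in >> extra) {
    return false;  // reject trailing tokens
  }
  *out = parsed;
  return true;
}
```

Unlike `choose` and `chooseleaf`, the line carries no `firstn` or `indep` keyword; in the diff that distinction lives in the rule `type` instead.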
|
||||
|
@ -135,6 +135,29 @@ bool CrushWrapper::is_v5_rule(unsigned ruleid) const
|
||||
return false;
|
||||
}
|
||||
|
||||
bool CrushWrapper::has_msr_rules() const
|
||||
{
|
||||
for (unsigned i=0; i<crush->max_rules; i++) {
|
||||
if (is_msr_rule(i)) {
|
||||
return true;
|
||||
}
|
||||
}
|
||||
return false;
|
||||
}
|
||||
|
||||
bool CrushWrapper::is_msr_rule(unsigned ruleid) const
|
||||
{
|
||||
if (ruleid >= crush->max_rules)
|
||||
return false;
|
||||
|
||||
crush_rule *r = crush->rules[ruleid];
|
||||
if (!r)
|
||||
return false;
|
||||
|
||||
return r->type == CRUSH_RULE_TYPE_MSR_INDEP ||
|
||||
r->type == CRUSH_RULE_TYPE_MSR_FIRSTN;
|
||||
}
|
||||
|
||||
bool CrushWrapper::has_choose_args() const
|
||||
{
|
||||
return !choose_args.empty();
|
||||
@ -2238,6 +2261,7 @@ void CrushWrapper::reweight_bucket(
|
||||
int CrushWrapper::add_simple_rule_at(
|
||||
string name, string root_name,
|
||||
string failure_domain_name,
|
||||
int num_failure_domains,
|
||||
string device_class,
|
||||
string mode, int rule_type,
|
||||
int rno,
|
||||
@ -2309,17 +2333,19 @@ int CrushWrapper::add_simple_rule_at(
|
||||
}
|
||||
crush_rule_set_step(rule, step++, CRUSH_RULE_TAKE, root, 0);
|
||||
if (type)
|
||||
crush_rule_set_step(rule, step++,
|
||||
mode == "firstn" ? CRUSH_RULE_CHOOSELEAF_FIRSTN :
|
||||
CRUSH_RULE_CHOOSELEAF_INDEP,
|
||||
CRUSH_CHOOSE_N,
|
||||
type);
|
||||
crush_rule_set_step(
|
||||
rule, step++,
|
||||
mode == "firstn" ? CRUSH_RULE_CHOOSELEAF_FIRSTN :
|
||||
CRUSH_RULE_CHOOSELEAF_INDEP,
|
||||
num_failure_domains <= 0 ? CRUSH_CHOOSE_N : num_failure_domains,
|
||||
type);
|
||||
else
|
||||
crush_rule_set_step(rule, step++,
|
||||
mode == "firstn" ? CRUSH_RULE_CHOOSE_FIRSTN :
|
||||
CRUSH_RULE_CHOOSE_INDEP,
|
||||
CRUSH_CHOOSE_N,
|
||||
0);
|
||||
crush_rule_set_step(
|
||||
rule, step++,
|
||||
mode == "firstn" ? CRUSH_RULE_CHOOSE_FIRSTN :
|
||||
CRUSH_RULE_CHOOSE_INDEP,
|
||||
num_failure_domains <= 0 ? CRUSH_CHOOSE_N : num_failure_domains,
|
||||
0);
|
||||
crush_rule_set_step(rule, step++, CRUSH_RULE_EMIT, 0, 0);
|
||||
|
||||
int ret = crush_add_rule(crush, rule, rno);
|
||||
@ -2335,13 +2361,125 @@ int CrushWrapper::add_simple_rule_at(
|
||||
int CrushWrapper::add_simple_rule(
|
||||
string name, string root_name,
|
||||
string failure_domain_name,
|
||||
int num_failure_domains,
|
||||
string device_class,
|
||||
string mode, int rule_type,
|
||||
ostream *err)
|
||||
{
|
||||
return add_simple_rule_at(name, root_name, failure_domain_name, device_class,
|
||||
mode,
|
||||
rule_type, -1, err);
|
||||
return add_simple_rule_at(
|
||||
name, root_name, failure_domain_name, num_failure_domains,
|
||||
device_class,
|
||||
mode,
|
||||
rule_type, -1, err);
|
||||
}
|
||||
|
||||
int CrushWrapper::add_multi_osd_per_failure_domain_rule_at(
|
||||
string name, string root_name, string failure_domain_name,
|
||||
int num_failure_domains,
|
||||
int osds_per_failure_domain,
|
||||
string device_class,
|
||||
crush_rule_type rule_type,
|
||||
int rno,
|
||||
ostream *err)
|
||||
{
|
||||
if (rule_exists(name)) {
|
||||
if (err)
|
||||
*err << "rule " << name << " exists";
|
||||
return -EEXIST;
|
||||
}
|
||||
if (rno >= 0) {
|
||||
if (rule_exists(rno)) {
|
||||
if (err)
|
||||
*err << "rule with ruleno " << rno << " exists";
|
||||
return -EEXIST;
|
||||
}
|
||||
} else {
|
||||
for (rno = 0; rno < get_max_rules(); rno++) {
|
||||
if (!rule_exists(rno))
|
||||
break;
|
||||
}
|
||||
}
|
||||
if (!name_exists(root_name)) {
|
||||
if (err)
|
||||
*err << "root item " << root_name << " does not exist";
|
||||
return -ENOENT;
|
||||
}
|
||||
int root = get_item_id(root_name);
|
||||
int type = 0;
|
||||
if (failure_domain_name.length()) {
|
||||
type = get_type_id(failure_domain_name);
|
||||
if (type < 0) {
|
||||
if (err)
|
||||
*err << "unknown type " << failure_domain_name;
|
||||
return -EINVAL;
|
||||
}
|
||||
}
|
||||
if (device_class.size()) {
|
||||
if (!class_exists(device_class)) {
|
||||
if (err)
|
||||
*err << "device class " << device_class << " does not exist";
|
||||
return -EINVAL;
|
||||
}
|
||||
int c = get_class_id(device_class);
|
||||
if (class_bucket.count(root) == 0 ||
|
||||
class_bucket[root].count(c) == 0) {
|
||||
if (err)
|
||||
*err << "root " << root_name << " has no devices with class "
|
||||
<< device_class;
|
||||
return -EINVAL;
|
||||
}
|
||||
root = class_bucket[root][c];
|
||||
}
|
||||
if (rule_type != CRUSH_RULE_TYPE_MSR_INDEP &&
|
||||
rule_type != CRUSH_RULE_TYPE_MSR_FIRSTN) {
|
||||
if (err)
|
||||
*err << "unknown rule_type " << rule_type;
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
int steps = 4;
|
||||
crush_rule *rule = crush_make_rule(steps, rule_type);
|
||||
ceph_assert(rule);
|
||||
int step = 0;
|
||||
crush_rule_set_step(rule, step++, CRUSH_RULE_TAKE, root, 0);
|
||||
crush_rule_set_step(rule, step++,
|
||||
CRUSH_RULE_CHOOSE_MSR,
|
||||
num_failure_domains,
|
||||
type);
|
||||
crush_rule_set_step(rule, step++,
|
||||
CRUSH_RULE_CHOOSE_MSR,
|
||||
osds_per_failure_domain,
|
||||
0);
|
||||
crush_rule_set_step(rule, step++, CRUSH_RULE_EMIT, 0, 0);
|
||||
|
||||
int ret = crush_add_rule(crush, rule, rno);
|
||||
if(ret < 0) {
|
||||
*err << "failed to add rule " << rno << " because " << cpp_strerror(ret);
|
||||
return ret;
|
||||
}
|
||||
set_rule_name(rno, name);
|
||||
have_rmaps = false;
|
||||
return rno;
|
||||
}
|
||||
|
||||
|
||||
int CrushWrapper::add_indep_multi_osd_per_failure_domain_rule(
|
||||
string name, string root_name,
|
||||
string failure_domain_name,
|
||||
int num_failure_domains,
|
||||
int osds_per_failure_domain,
|
||||
string device_class,
|
||||
ostream *err)
|
||||
{
|
||||
return add_multi_osd_per_failure_domain_rule_at(
|
||||
name, root_name,
|
||||
failure_domain_name,
|
||||
num_failure_domains,
|
||||
osds_per_failure_domain,
|
||||
device_class,
|
||||
CRUSH_RULE_TYPE_MSR_INDEP,
|
||||
-1,
|
||||
err);
|
||||
}
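Review note: `add_multi_osd_per_failure_domain_rule_at` builds a four-step rule (take, `choosemsr` over failure domains, `choosemsr` over type 0, emit). The sketch below renders the text form such a rule would plausibly have when decompiled. `render_msr_indep_rule` is a hypothetical helper, and printing type 0 as `osd` is an assumption about the map's type names, not something this diff states.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper (not part of CrushWrapper): renders the text form of
// the four-step MSR rule built above. Assumes device type 0 is named "osd".
std::string render_msr_indep_rule(const std::string& name, int id,
                                  const std::string& root,
                                  const std::string& failure_domain,
                                  int num_failure_domains,
                                  int osds_per_failure_domain) {
  std::ostringstream out;
  out << "rule " << name << " {\n"
      << "\tid " << id << "\n"
      << "\ttype msr_indep\n"
      << "\tstep take " << root << "\n"
      // First choosemsr step: pick the failure domains.
      << "\tstep choosemsr " << num_failure_domains
      << " type " << failure_domain << "\n"
      // Second choosemsr step: pick OSDs inside each failure domain.
      << "\tstep choosemsr " << osds_per_failure_domain << " type osd\n"
      << "\tstep emit\n"
      << "}\n";
  return out.str();
}
```

With the test profile elsewhere in this diff (`k: 4`, `m: 2`, `crush-num-failure-domains: 3`, `crush-osds-per-failure-domain: 2`), the two `choosemsr` steps would select 3 hosts and 2 OSDs per host, covering the 6 shards.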
|
||||
|
||||
float CrushWrapper::_get_take_weight_osd_map(int root,
|
||||
@ -3080,6 +3218,10 @@ void CrushWrapper::encode(bufferlist& bl, uint64_t features) const
|
||||
}
|
||||
}
|
||||
}
|
||||
if (HAVE_FEATURE(features, CRUSH_MSR)) {
|
||||
encode(crush->msr_descents, bl);
|
||||
encode(crush->msr_collision_tries, bl);
|
||||
}
|
||||
}
|
||||
|
||||
static void decode_32_or_64_string_map(map<int32_t,string>& m, bufferlist::const_iterator& blp)
|
||||
@ -3230,6 +3372,12 @@ void CrushWrapper::decode(bufferlist::const_iterator& blp)
|
||||
choose_args[choose_args_index] = arg_map;
|
||||
}
|
||||
}
|
||||
if (!blp.end()) {
|
||||
decode(crush->msr_descents, blp);
|
||||
decode(crush->msr_collision_tries, blp);
|
||||
} else {
|
||||
set_default_msr_tunables();
|
||||
}
|
||||
update_choose_args(nullptr); // in case we decode a legacy "corrupted" map
|
||||
finalize();
|
||||
}
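Review note: the encode and decode hunks use a trailing-field pattern. The MSR tunables are appended only when the peer advertises `CRUSH_MSR`, and the decoder falls back to defaults when the buffer ends early. A minimal sketch of that pattern, using a plain byte vector instead of `bufferlist`, is below. All names here (`MsrTunables`, `encode_tail`, `decode_tail`) are hypothetical, and unlike the real decoder this sketch requires both bytes to be present before reading.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the two MSR fields of crush_map.
struct MsrTunables {
  uint8_t descents;
  uint8_t collision_tries;
};

// Default used by set_default_msr_tunables() in the diff.
const uint8_t kDefaultMsr = 100;

// Appends the MSR fields only if the peer supports them, so that older
// peers receive a buffer in the format they already understand.
void encode_tail(const MsrTunables& t, bool peer_has_msr,
                 std::vector<uint8_t>* buf) {
  assert(buf != nullptr);
  if (peer_has_msr) {
    buf->push_back(t.descents);
    buf->push_back(t.collision_tries);
  }
}

// Reads the MSR fields starting at pos if they are present; otherwise
// returns the defaults, mirroring the else-branch of the decode hunk.
MsrTunables decode_tail(const std::vector<uint8_t>& buf, std::size_t pos) {
  MsrTunables t = {kDefaultMsr, kDefaultMsr};
  if (pos + 2 <= buf.size()) {
    t.descents = buf[pos];
    t.collision_tries = buf[pos + 1];
  }
  return t;
}
```

The design choice is that new fields go at the end of the encoding, so presence can be detected by remaining length rather than by a version bump.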
|
||||
@ -3485,6 +3633,8 @@ void CrushWrapper::dump_tunables(Formatter *f) const
|
||||
f->dump_int("chooseleaf_descend_once", get_chooseleaf_descend_once());
|
||||
f->dump_int("chooseleaf_vary_r", get_chooseleaf_vary_r());
|
||||
f->dump_int("chooseleaf_stable", get_chooseleaf_stable());
|
||||
f->dump_int("msr_descents", get_msr_descents());
|
||||
f->dump_int("msr_collision_tries", get_msr_collision_tries());
|
||||
f->dump_int("straw_calc_version", get_straw_calc_version());
|
||||
f->dump_int("allowed_bucket_algs", get_allowed_bucket_algs());
|
||||
|
||||
@ -3515,6 +3665,7 @@ void CrushWrapper::dump_tunables(Formatter *f) const
|
||||
f->dump_int("has_v4_buckets", (int)has_v4_buckets());
|
||||
f->dump_int("require_feature_tunables5", (int)has_nondefault_tunables5());
|
||||
f->dump_int("has_v5_rules", (int)has_v5_rules());
|
||||
f->dump_int("has_msr_rules", (int)has_msr_rules());
|
||||
}
|
||||
|
||||
void CrushWrapper::dump_choose_args(Formatter *f) const
|
||||
@ -3613,6 +3764,11 @@ void CrushWrapper::dump_rule(int rule_id, Formatter *f) const
|
||||
f->dump_int("num", get_rule_arg1(rule_id, j));
|
||||
f->dump_string("type", get_type_name(get_rule_arg2(rule_id, j)));
|
||||
break;
|
||||
case CRUSH_RULE_CHOOSE_MSR:
|
||||
f->dump_string("op", "choosemsr");
|
||||
f->dump_int("num", get_rule_arg1(rule_id, j));
|
||||
f->dump_string("type", get_type_name(get_rule_arg2(rule_id, j)));
|
||||
break;
|
||||
case CRUSH_RULE_SET_CHOOSE_TRIES:
|
||||
f->dump_string("op", "set_choose_tries");
|
||||
f->dump_int("num", get_rule_arg1(rule_id, j));
|
||||
@ -3621,6 +3777,14 @@ void CrushWrapper::dump_rule(int rule_id, Formatter *f) const
|
||||
f->dump_string("op", "set_chooseleaf_tries");
|
||||
f->dump_int("num", get_rule_arg1(rule_id, j));
|
||||
break;
|
||||
case CRUSH_RULE_SET_MSR_DESCENTS:
|
||||
f->dump_string("op", "set_msr_descents");
|
||||
f->dump_int("num", get_rule_arg1(rule_id, j));
|
||||
break;
|
||||
case CRUSH_RULE_SET_MSR_COLLISION_TRIES:
|
||||
f->dump_string("op", "set_msr_collision_tries");
|
||||
f->dump_int("num", get_rule_arg1(rule_id, j));
|
||||
break;
|
||||
default:
|
||||
f->dump_int("opcode", get_rule_op(rule_id, j));
|
||||
f->dump_int("arg1", get_rule_arg1(rule_id, j));
|
||||
|
@ -125,6 +125,7 @@ public:
|
||||
crush->chooseleaf_vary_r = 0;
|
||||
crush->chooseleaf_stable = 0;
|
||||
crush->allowed_bucket_algs = CRUSH_LEGACY_ALLOWED_BUCKET_ALGS;
|
||||
set_default_msr_tunables();
|
||||
}
|
||||
void set_tunables_bobtail() {
|
||||
crush->choose_local_tries = 0;
|
||||
@ -134,6 +135,7 @@ public:
|
||||
crush->chooseleaf_vary_r = 0;
|
||||
crush->chooseleaf_stable = 0;
|
||||
crush->allowed_bucket_algs = CRUSH_LEGACY_ALLOWED_BUCKET_ALGS;
|
||||
set_default_msr_tunables();
|
||||
}
|
||||
void set_tunables_firefly() {
|
||||
crush->choose_local_tries = 0;
|
||||
@ -143,6 +145,7 @@ public:
|
||||
crush->chooseleaf_vary_r = 1;
|
||||
crush->chooseleaf_stable = 0;
|
||||
crush->allowed_bucket_algs = CRUSH_LEGACY_ALLOWED_BUCKET_ALGS;
|
||||
set_default_msr_tunables();
|
||||
}
|
||||
void set_tunables_hammer() {
|
||||
crush->choose_local_tries = 0;
|
||||
@ -156,6 +159,7 @@ public:
|
||||
(1 << CRUSH_BUCKET_LIST) |
|
||||
(1 << CRUSH_BUCKET_STRAW) |
|
||||
(1 << CRUSH_BUCKET_STRAW2);
|
||||
set_default_msr_tunables();
|
||||
}
|
||||
void set_tunables_jewel() {
|
||||
crush->choose_local_tries = 0;
|
||||
@ -169,6 +173,7 @@ public:
|
||||
(1 << CRUSH_BUCKET_LIST) |
|
||||
(1 << CRUSH_BUCKET_STRAW) |
|
||||
(1 << CRUSH_BUCKET_STRAW2);
|
||||
set_default_msr_tunables();
|
||||
}
|
||||
|
||||
void set_tunables_legacy() {
|
||||
@ -233,6 +238,24 @@ public:
|
||||
crush->straw_calc_version = n;
|
||||
}
|
||||
|
||||
int get_msr_descents() const {
|
||||
return crush->msr_descents;
|
||||
}
|
||||
void set_msr_descents(int n) {
|
||||
crush->msr_descents = n;
|
||||
}
|
||||
|
||||
int get_msr_collision_tries() const {
|
||||
return crush->msr_collision_tries;
|
||||
}
|
||||
void set_msr_collision_tries(int n) {
|
||||
crush->msr_collision_tries = n;
|
||||
}
|
||||
void set_default_msr_tunables() {
|
||||
set_msr_descents(100);
|
||||
set_msr_collision_tries(100);
|
||||
}
|
||||
|
||||
unsigned get_allowed_bucket_algs() const {
|
||||
return crush->allowed_bucket_algs;
|
||||
}
|
||||
@ -248,7 +271,8 @@ public:
|
||||
crush->chooseleaf_descend_once == 0 &&
|
||||
crush->chooseleaf_vary_r == 0 &&
|
||||
crush->chooseleaf_stable == 0 &&
|
||||
crush->allowed_bucket_algs == CRUSH_LEGACY_ALLOWED_BUCKET_ALGS;
|
||||
crush->allowed_bucket_algs == CRUSH_LEGACY_ALLOWED_BUCKET_ALGS &&
|
||||
!has_nondefault_tunables_msr();
|
||||
}
|
||||
bool has_bobtail_tunables() const {
|
||||
return
|
||||
@ -258,7 +282,8 @@ public:
|
||||
crush->chooseleaf_descend_once == 1 &&
|
||||
crush->chooseleaf_vary_r == 0 &&
|
||||
crush->chooseleaf_stable == 0 &&
|
||||
crush->allowed_bucket_algs == CRUSH_LEGACY_ALLOWED_BUCKET_ALGS;
|
||||
crush->allowed_bucket_algs == CRUSH_LEGACY_ALLOWED_BUCKET_ALGS &&
|
||||
!has_nondefault_tunables_msr();
|
||||
}
|
||||
bool has_firefly_tunables() const {
|
||||
return
|
||||
@ -268,7 +293,8 @@ public:
|
||||
crush->chooseleaf_descend_once == 1 &&
|
||||
crush->chooseleaf_vary_r == 1 &&
|
||||
crush->chooseleaf_stable == 0 &&
|
||||
crush->allowed_bucket_algs == CRUSH_LEGACY_ALLOWED_BUCKET_ALGS;
|
||||
crush->allowed_bucket_algs == CRUSH_LEGACY_ALLOWED_BUCKET_ALGS &&
|
||||
!has_nondefault_tunables_msr();
|
||||
}
|
||||
bool has_hammer_tunables() const {
|
||||
return
|
||||
@ -281,7 +307,8 @@ public:
|
||||
crush->allowed_bucket_algs == ((1 << CRUSH_BUCKET_UNIFORM) |
|
||||
(1 << CRUSH_BUCKET_LIST) |
|
||||
(1 << CRUSH_BUCKET_STRAW) |
|
||||
(1 << CRUSH_BUCKET_STRAW2));
|
||||
(1 << CRUSH_BUCKET_STRAW2)) &&
|
||||
!has_nondefault_tunables_msr();
|
||||
}
|
||||
bool has_jewel_tunables() const {
|
||||
return
|
||||
@ -294,7 +321,8 @@ public:
|
||||
crush->allowed_bucket_algs == ((1 << CRUSH_BUCKET_UNIFORM) |
|
||||
(1 << CRUSH_BUCKET_LIST) |
|
||||
(1 << CRUSH_BUCKET_STRAW) |
|
||||
(1 << CRUSH_BUCKET_STRAW2));
|
||||
(1 << CRUSH_BUCKET_STRAW2)) &&
|
||||
!has_nondefault_tunables_msr();
|
||||
}
|
||||
|
||||
bool has_optimal_tunables() const {
|
||||
@ -322,6 +350,11 @@ public:
|
||||
return
|
||||
crush->chooseleaf_stable != 0;
|
||||
}
|
||||
bool has_nondefault_tunables_msr() const {
|
||||
return
|
||||
crush->msr_descents != 100 ||
|
||||
crush->msr_collision_tries != 100;
|
||||
}
|
||||
|
||||
bool has_v2_rules() const;
|
||||
bool has_v3_rules() const;
|
||||
@ -329,13 +362,17 @@ public:
|
||||
bool has_v5_rules() const;
|
||||
bool has_choose_args() const; // any choose_args
|
||||
bool has_incompat_choose_args() const; // choose_args that can't be made compat
|
||||
bool has_msr_rules() const;
|
||||
|
||||
bool is_v2_rule(unsigned ruleid) const;
|
||||
bool is_v3_rule(unsigned ruleid) const;
|
||||
bool is_v5_rule(unsigned ruleid) const;
|
||||
bool is_msr_rule(unsigned ruleid) const;
|
||||
|
||||
std::string get_min_required_version() const {
|
||||
if (has_v5_rules() || has_nondefault_tunables5())
|
||||
if (has_msr_rules() || has_nondefault_tunables_msr())
|
||||
return "squid";
|
||||
else if (has_v5_rules() || has_nondefault_tunables5())
|
||||
return "jewel";
|
||||
else if (has_v4_buckets())
|
||||
return "hammer";
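Review note: the updated `get_min_required_version` checks the MSR conditions first, so a map with MSR rules or non-default MSR tunables reports `squid` before any older release is considered. The sketch below reproduces only the branches visible in this hunk. `MapFeatures` is a hypothetical flattened view of the wrapper state, and the final `"older"` return is a placeholder for branches the hunk does not show.

```cpp
#include <cassert>
#include <string>

// Hypothetical flattened view of the state consulted by the real method.
struct MapFeatures {
  bool has_msr_rules;
  int msr_descents;
  int msr_collision_tries;
  bool needs_jewel;   // stands in for has_v5_rules() || has_nondefault_tunables5()
  bool needs_hammer;  // stands in for has_v4_buckets()
};

// Mirrors has_nondefault_tunables_msr() from the diff: 100 is the default.
bool has_nondefault_tunables_msr(const MapFeatures& m) {
  return m.msr_descents != 100 || m.msr_collision_tries != 100;
}

// Newest requirement wins, so the checks run from newest to oldest release.
std::string min_required_version(const MapFeatures& m) {
  if (m.has_msr_rules || has_nondefault_tunables_msr(m)) {
    return "squid";
  }
  if (m.needs_jewel) {
    return "jewel";
  }
  if (m.needs_hammer) {
    return "hammer";
  }
  return "older";  // placeholder: remaining branches are outside this hunk
}
```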
|
||||
@ -565,6 +602,21 @@ public:
|
||||
if (have_rmaps)
|
||||
rule_name_rmap[name] = i;
|
||||
}
|
||||
bool rule_valid_for_pool_type(int rule_id, int ptype) const {
|
||||
auto rule_type = get_rule_type(rule_id);
|
||||
switch (ptype) {
|
||||
case CEPH_PG_TYPE_REPLICATED:
|
||||
return rule_type == CRUSH_RULE_TYPE_REPLICATED ||
|
||||
rule_type == CRUSH_RULE_TYPE_MSR_FIRSTN;
|
||||
case CEPH_PG_TYPE_ERASURE:
|
||||
return rule_type == CRUSH_RULE_TYPE_ERASURE ||
|
||||
rule_type == CRUSH_RULE_TYPE_MSR_INDEP;
|
||||
default:
|
||||
ceph_assert(0 == "impossible");
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
bool is_shadow_item(int id) const {
|
||||
const char *name = get_item_name(id);
|
||||
return name && !is_valid_crush_name(name);
|
||||
@ -1151,6 +1203,14 @@ public:
|
||||
int set_rule_step_set_chooseleaf_stable(unsigned ruleno, unsigned step, int val) {
|
||||
return set_rule_step(ruleno, step, CRUSH_RULE_SET_CHOOSELEAF_STABLE, val, 0);
|
||||
}
|
||||
|
||||
int set_rule_step_set_msr_descents(unsigned ruleno, unsigned step, int val) {
|
||||
return set_rule_step(ruleno, step, CRUSH_RULE_SET_MSR_DESCENTS, val, 0);
|
||||
}
|
||||
int set_rule_step_set_msr_collision_tries(unsigned ruleno, unsigned step, int val) {
|
||||
return set_rule_step(ruleno, step, CRUSH_RULE_SET_MSR_COLLISION_TRIES, val, 0);
|
||||
}
|
||||
|
||||
int set_rule_step_choose_firstn(unsigned ruleno, unsigned step, int val, int type) {
|
||||
return set_rule_step(ruleno, step, CRUSH_RULE_CHOOSE_FIRSTN, val, type);
|
||||
}
|
||||
@ -1163,22 +1223,61 @@ public:
|
||||
int set_rule_step_choose_leaf_indep(unsigned ruleno, unsigned step, int val, int type) {
|
||||
return set_rule_step(ruleno, step, CRUSH_RULE_CHOOSELEAF_INDEP, val, type);
|
||||
}
|
||||
int set_rule_step_choose_msr(unsigned ruleno, unsigned step, int val, int type) {
|
||||
return set_rule_step(ruleno, step, CRUSH_RULE_CHOOSE_MSR, val, type);
|
||||
}
|
||||
int set_rule_step_emit(unsigned ruleno, unsigned step) {
|
||||
return set_rule_step(ruleno, step, CRUSH_RULE_EMIT, 0, 0);
|
||||
}
|
||||
|
||||
int add_simple_rule(
|
||||
std::string name, std::string root_name, std::string failure_domain_type,
|
||||
int num_failure_domains,
|
||||
std::string device_class, std::string mode, int rule_type,
|
||||
std::ostream *err = 0);
|
||||
int add_simple_rule(
|
||||
std::string name, std::string root_name, std::string failure_domain_type,
|
||||
std::string device_class, std::string mode, int rule_type,
|
||||
std::ostream *err = 0) {
|
||||
return add_simple_rule(
|
||||
name, root_name, failure_domain_type, -1,
|
||||
device_class, mode, rule_type, err);
|
||||
}
|
||||
|
||||
int add_indep_multi_osd_per_failure_domain_rule(
|
||||
std::string name, std::string root_name, std::string failure_domain_type,
|
||||
int osds_per_failure_domain,
|
||||
int num_failure_domains,
|
||||
std::string device_class,
|
||||
std::ostream *err = 0);
|
||||
|
||||
/**
|
||||
* @param rno rule[set] id to use, -1 to pick the lowest available
|
||||
*/
|
||||
int add_simple_rule_at(
|
||||
std::string name, std::string root_name,
|
||||
std::string failure_domain_type, std::string device_class, std::string mode,
|
||||
std::string failure_domain_type,
|
||||
int num_failure_domains,
|
||||
std::string device_class, std::string mode,
|
||||
int rule_type, int rno, std::ostream *err = 0);
|
||||
int add_simple_rule_at(
|
||||
std::string name, std::string root_name,
|
||||
std::string failure_domain_type,
|
||||
std::string device_class, std::string mode,
|
||||
int rule_type, int rno, std::ostream *err = 0) {
|
||||
return add_simple_rule_at(
|
||||
name, root_name, failure_domain_type, -1,
|
||||
device_class, mode, rule_type, rno, err);
|
||||
}
|
||||
|
||||
int add_multi_osd_per_failure_domain_rule_at(
|
||||
std::string name, std::string root_name, std::string failure_domain_type,
|
||||
int osds_per_failure_domain,
|
||||
int num_failure_domains,
|
||||
std::string device_class,
|
||||
crush_rule_type rule_type,
|
||||
int rno,
|
||||
std::ostream *err = 0);
|
||||
|
||||
int remove_rule(int ruleno);
|
||||
|
||||
|
@ -65,7 +65,15 @@ enum crush_opcodes {
|
||||
CRUSH_RULE_SET_CHOOSE_LOCAL_TRIES = 10,
|
||||
CRUSH_RULE_SET_CHOOSE_LOCAL_FALLBACK_TRIES = 11,
|
||||
CRUSH_RULE_SET_CHOOSELEAF_VARY_R = 12,
|
||||
CRUSH_RULE_SET_CHOOSELEAF_STABLE = 13
|
||||
CRUSH_RULE_SET_CHOOSELEAF_STABLE = 13,
|
||||
|
||||
/* set choose_msr_total_tries */
|
||||
CRUSH_RULE_SET_MSR_DESCENTS = 14,
|
||||
/* set choose_msr_local_collision_tries */
|
||||
CRUSH_RULE_SET_MSR_COLLISION_TRIES = 15,
|
||||
|
||||
/* choose variant without FIRSTN|INDEP */
|
||||
CRUSH_RULE_CHOOSE_MSR = 16
|
||||
};
|
||||
|
||||
/*
|
||||
@ -87,7 +95,12 @@ struct crush_rule {
|
||||
#define crush_rule_size(len) (sizeof(struct crush_rule) + \
|
||||
(len)*sizeof(struct crush_rule_step))
|
||||
|
||||
|
||||
enum crush_rule_type {
|
||||
CRUSH_RULE_TYPE_REPLICATED = 1,
|
||||
CRUSH_RULE_TYPE_ERASURE = 3,
|
||||
CRUSH_RULE_TYPE_MSR_FIRSTN = 4,
|
||||
CRUSH_RULE_TYPE_MSR_INDEP = 5
|
||||
};
|
||||
|
||||
/*
|
||||
* A bucket is a named container of other items (either devices or
|
||||
@ -410,6 +423,12 @@ struct crush_map {
|
||||
*/
|
||||
__u8 chooseleaf_stable;
|
||||
|
||||
/*! Sets total descents for MSR rules */
|
||||
__u8 msr_descents;
|
||||
|
||||
/*! Sets local collision retries for MSR rules */
|
||||
__u8 msr_collision_tries;
|
||||
|
||||
/*! @cond INTERNAL */
|
||||
/* This value is calculated after decode or construction by
|
||||
the builder. It is exposed here (rather than having a
|
||||
|
@ -50,8 +50,11 @@ struct crush_grammar : public boost::spirit::grammar<crush_grammar>
|
||||
_step_set_choose_tries,
|
||||
_step_set_choose_local_tries,
|
||||
_step_set_choose_local_fallback_tries,
|
||||
_step_set_msr_descents,
|
||||
_step_set_msr_collision_tries,
|
||||
_step_choose,
|
||||
_step_chooseleaf,
|
||||
_step_choose_msr,
|
||||
_step_emit,
|
||||
_step,
|
||||
_crushrule,
|
||||
@ -91,8 +94,11 @@ struct crush_grammar : public boost::spirit::grammar<crush_grammar>
|
||||
boost::spirit::rule<ScannerT, boost::spirit::parser_context<>, boost::spirit::parser_tag<_step_set_chooseleaf_tries> > step_set_chooseleaf_tries;
|
||||
boost::spirit::rule<ScannerT, boost::spirit::parser_context<>, boost::spirit::parser_tag<_step_set_chooseleaf_vary_r> > step_set_chooseleaf_vary_r;
|
||||
boost::spirit::rule<ScannerT, boost::spirit::parser_context<>, boost::spirit::parser_tag<_step_set_chooseleaf_stable> > step_set_chooseleaf_stable;
|
||||
boost::spirit::rule<ScannerT, boost::spirit::parser_context<>, boost::spirit::parser_tag<_step_set_msr_descents> > step_set_msr_descents;
|
||||
boost::spirit::rule<ScannerT, boost::spirit::parser_context<>, boost::spirit::parser_tag<_step_set_msr_collision_tries> > step_set_msr_collision_tries;
|
||||
boost::spirit::rule<ScannerT, boost::spirit::parser_context<>, boost::spirit::parser_tag<_step_choose> > step_choose;
|
||||
boost::spirit::rule<ScannerT, boost::spirit::parser_context<>, boost::spirit::parser_tag<_step_chooseleaf> > step_chooseleaf;
|
||||
boost::spirit::rule<ScannerT, boost::spirit::parser_context<>, boost::spirit::parser_tag<_step_choose_msr> > step_choose_msr;
|
||||
boost::spirit::rule<ScannerT, boost::spirit::parser_context<>, boost::spirit::parser_tag<_step_emit> > step_emit;
|
||||
boost::spirit::rule<ScannerT, boost::spirit::parser_context<>, boost::spirit::parser_tag<_step> > step;
|
||||
boost::spirit::rule<ScannerT, boost::spirit::parser_context<>, boost::spirit::parser_tag<_crushrule> > crushrule;
|
||||
@ -149,6 +155,8 @@ struct crush_grammar : public boost::spirit::grammar<crush_grammar>
|
||||
step_set_chooseleaf_tries = str_p("set_chooseleaf_tries") >> posint;
|
||||
step_set_chooseleaf_vary_r = str_p("set_chooseleaf_vary_r") >> posint;
|
||||
step_set_chooseleaf_stable = str_p("set_chooseleaf_stable") >> posint;
|
||||
step_set_msr_descents = str_p("set_msr_descents") >> posint;
|
||||
step_set_msr_collision_tries = str_p("set_msr_collision_tries") >> posint;
|
||||
step_choose = str_p("choose")
|
||||
>> ( str_p("indep") | str_p("firstn") )
|
||||
>> integer
|
||||
@ -157,6 +165,9 @@ struct crush_grammar : public boost::spirit::grammar<crush_grammar>
|
||||
>> ( str_p("indep") | str_p("firstn") )
|
||||
>> integer
|
||||
>> str_p("type") >> name;
|
||||
step_choose_msr = str_p("choosemsr")
|
||||
>> integer
|
||||
>> str_p("type") >> name;
|
||||
step_emit = str_p("emit");
|
||||
step = str_p("step") >> ( step_take |
|
||||
step_set_choose_tries |
|
||||
@ -165,12 +176,15 @@ struct crush_grammar : public boost::spirit::grammar<crush_grammar>
|
||||
step_set_chooseleaf_tries |
|
||||
step_set_chooseleaf_vary_r |
|
||||
step_set_chooseleaf_stable |
|
||||
step_set_msr_descents |
|
||||
step_set_msr_collision_tries |
|
||||
step_choose |
|
||||
step_chooseleaf |
|
||||
step_choose_msr |
|
||||
step_emit );
|
||||
crushrule = str_p("rule") >> !name >> '{'
|
||||
>> (str_p("id") | str_p("ruleset")) >> posint
|
||||
>> str_p("type") >> ( str_p("replicated") | str_p("erasure") )
|
||||
>> str_p("type") >> ( str_p("replicated") | str_p("erasure") | str_p("msr_firstn") | str_p("msr_indep") )
|
||||
>> !(str_p("min_size") >> posint)
|
||||
>> !(str_p("max_size") >> posint)
|
||||
>> +step
|
||||
|
1070
src/crush/mapper.c
File diff suppressed because it is too large
@ -77,15 +77,11 @@ extern int crush_do_rule(const struct crush_map *map,
|
||||
const __u32 *weights, int weight_max,
|
||||
void *cwin, const struct crush_choose_arg *choose_args);
|
||||
|
||||
/* Returns the exact amount of workspace that will need to be used
|
||||
for a given combination of crush_map and result_max. The caller can
|
||||
then allocate this much on its own, either on the stack, in a
|
||||
per-thread long-lived buffer, or however it likes. */
|
||||
|
||||
static inline size_t crush_work_size(const struct crush_map *map,
|
||||
int result_max) {
|
||||
return map->working_size + result_max * 3 * sizeof(__u32);
|
||||
}
|
||||
/* Returns enough workspace for any crush rule within map to generate
|
||||
result_max outputs. The caller can then allocate this much on its own,
|
||||
either on the stack, in a per-thread long-lived buffer, or however it likes.*/
|
||||
extern size_t crush_work_size(const struct crush_map *map,
|
||||
int result_max);
|
||||
|
||||
extern void crush_init_workspace(const struct crush_map *m, void *v);
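The hunk above turns `crush_work_size` from an inline helper into an out-of-line function. The new definition lives in `src/crush/mapper.c`, whose diff is suppressed on this page, so the sketch below models only the removed inline formula and the size-then-allocate pattern described in the comments. `mock_crush_map`, `mock_work_size`, and `allocate_workspace` are hypothetical stand-ins, not part of the CRUSH API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for struct crush_map; only the field used by the removed inline
// helper is modeled here.
struct mock_crush_map {
  std::size_t working_size;
};

// Mirrors the removed inline formula: the map's fixed working size plus three
// 32-bit slots per requested result.
std::size_t mock_work_size(const mock_crush_map& map, int result_max) {
  assert(result_max >= 0);
  return map.working_size +
         static_cast<std::size_t>(result_max) * 3 * sizeof(std::uint32_t);
}

// Caller-side pattern: compute the size first, then allocate the buffer in
// whatever way the caller prefers (here, a heap-backed vector).
std::vector<unsigned char> allocate_workspace(const mock_crush_map& map,
                                              int result_max) {
  return std::vector<unsigned char>(mock_work_size(map, result_max));
}
```

With a `working_size` of 128 bytes and `result_max` of 4, the removed formula gives 128 + 4 * 3 * 4 = 176 bytes. The size computed by the new out-of-line function may differ, since it has to cover any rule in the map.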
|
||||
|
||||
|
@ -52,6 +52,12 @@ int ErasureCode::init(
|
||||
err |= to_string("crush-failure-domain", profile,
|
||||
&rule_failure_domain,
|
||||
DEFAULT_RULE_FAILURE_DOMAIN, ss);
|
||||
err |= to_int("crush-osds-per-failure-domain", profile,
|
||||
&rule_osds_per_failure_domain,
|
||||
"0", ss);
|
||||
err |= to_int("crush-num-failure-domains", profile,
|
||||
&rule_num_failure_domains,
|
||||
"0", ss);
|
||||
err |= to_string("crush-device-class", profile,
|
||||
&rule_device_class,
|
||||
"", ss);
|
||||
@ -66,19 +72,33 @@ int ErasureCode::create_rule(
|
||||
CrushWrapper &crush,
|
||||
std::ostream *ss) const
|
||||
{
|
||||
int ruleid = crush.add_simple_rule(
|
||||
name,
|
||||
rule_root,
|
||||
rule_failure_domain,
|
||||
rule_device_class,
|
||||
"indep",
|
||||
pg_pool_t::TYPE_ERASURE,
|
||||
ss);
|
||||
|
||||
if (ruleid < 0)
|
||||
return ruleid;
|
||||
|
||||
return ruleid;
|
||||
if (rule_osds_per_failure_domain <= 1) {
|
||||
return crush.add_simple_rule(
|
||||
name,
|
||||
rule_root,
|
||||
rule_failure_domain,
|
||||
rule_num_failure_domains,
|
||||
rule_device_class,
|
||||
"indep",
|
||||
pg_pool_t::TYPE_ERASURE,
|
||||
ss);
|
||||
} else {
|
||||
if (rule_num_failure_domains < 1) {
|
||||
if (ss) {
|
||||
*ss << "crush-num-failure-domains " << rule_num_failure_domains
|
||||
<< " must be >= 1 if crush-osds-per-failure-domain specified";
|
||||
return -EINVAL;
|
||||
}
|
||||
}
|
||||
return crush.add_indep_multi_osd_per_failure_domain_rule(
|
||||
name,
|
||||
rule_root,
|
||||
rule_failure_domain,
|
||||
rule_num_failure_domains,
|
||||
rule_osds_per_failure_domain,
|
||||
rule_device_class,
|
||||
ss);
|
||||
}
|
||||
}
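The branching in the new `ErasureCode::create_rule` can be summarized as a small decision function. This is a sketch with hypothetical names (`RuleKind`, `RuleChoice`, `choose_rule_kind`) that only reports which builder would be called; it does not touch `CrushWrapper`. One deliberate difference, stated as an assumption about intent: the sketch returns `-EINVAL` whether or not an error stream is supplied, whereas in the hunk above the `return -EINVAL` sits inside the `if (ss)` block.

```cpp
#include <cassert>
#include <cerrno>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical outcome labels; the real code calls CrushWrapper methods.
enum class RuleKind { Simple, MultiOsdPerFailureDomain, Invalid };

struct RuleChoice {
  RuleKind kind;
  int err;  // 0 on success, negative errno on failure
};

// Decides which rule builder would be used for the given profile values.
// Assumption: an invalid combination is rejected even when ss is null.
RuleChoice choose_rule_kind(int osds_per_failure_domain,
                            int num_failure_domains,
                            std::ostream* ss) {
  if (osds_per_failure_domain <= 1)
    return {RuleKind::Simple, 0};
  if (num_failure_domains < 1) {
    if (ss) {
      *ss << "crush-num-failure-domains " << num_failure_domains
          << " must be >= 1 if crush-osds-per-failure-domain specified";
    }
    return {RuleKind::Invalid, -EINVAL};
  }
  return {RuleKind::MultiOsdPerFailureDomain, 0};
}
```

With the profile defaults shown in `ErasureCode::init` (both values `"0"`), the function selects the simple rule path.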
|
||||
|
||||
int ErasureCode::sanity_check_k_m(int k, int m, ostream *ss)
|
||||
|
@ -37,6 +37,8 @@ namespace ceph {
|
||||
std::string rule_root;
|
||||
std::string rule_failure_domain;
|
||||
std::string rule_device_class;
|
||||
int rule_osds_per_failure_domain = -1;
|
||||
int rule_num_failure_domains = -1;
|
||||
|
||||
~ErasureCode() override {}
|
||||
|
||||
|
@ -137,7 +137,7 @@ DEFINE_CEPH_FEATURE(34, 3, RANGE_BLOCKLIST)
|
||||
DEFINE_CEPH_FEATURE(35, 1, OSD_CACHEPOOL) // 3.14
|
||||
DEFINE_CEPH_FEATURE(36, 1, CRUSH_V2) // 3.14
|
||||
DEFINE_CEPH_FEATURE(37, 1, EXPORT_PEER) // 3.14
|
||||
DEFINE_CEPH_FEATURE_RETIRED(38, 1, OSD_ERASURE_CODES, MIMIC, OCTOPUS)
|
||||
DEFINE_CEPH_FEATURE(38, 2, CRUSH_MSR) // X.XX TODOSAM kernel version?
|
||||
// available
|
||||
DEFINE_CEPH_FEATURE(39, 1, OSDMAP_ENC) // 3.15
|
||||
DEFINE_CEPH_FEATURE(40, 1, MDS_INLINE_DATA) // 3.19
|
||||
@ -218,6 +218,7 @@ DEFINE_CEPH_FEATURE_RETIRED(63, 1, RESERVED_BROKEN, LUMINOUS, QUINCY) // client-
|
||||
CEPH_FEATURE_OSD_CACHEPOOL | \
|
||||
CEPH_FEATURE_CRUSH_V2 | \
|
||||
CEPH_FEATURE_EXPORT_PEER | \
|
||||
CEPH_FEATURE_CRUSH_MSR | \
|
||||
CEPH_FEATURE_OSDMAP_ENC | \
|
||||
CEPH_FEATURE_MDS_INLINE_DATA | \
|
||||
CEPH_FEATURE_CRUSH_TUNABLES3 | \
|
||||
@ -265,9 +266,10 @@ DEFINE_CEPH_FEATURE_RETIRED(63, 1, RESERVED_BROKEN, LUMINOUS, QUINCY) // client-
|
||||
CEPH_FEATURE_CRUSH_TUNABLES2 | \
|
||||
CEPH_FEATURE_CRUSH_TUNABLES3 | \
|
||||
CEPH_FEATURE_CRUSH_TUNABLES5 | \
|
||||
CEPH_FEATURE_CRUSH_MSR | \
|
||||
CEPH_FEATURE_CRUSH_V2 | \
|
||||
CEPH_FEATURE_CRUSH_V4 | \
|
||||
CEPH_FEATUREMASK_CRUSH_CHOOSE_ARGS)
|
||||
CEPH_FEATUREMASK_CRUSH_MSR)
|
||||
|
||||
/*
|
||||
* make sure we don't try to use the reserved features
|
||||
|
@ -7562,6 +7562,12 @@ bool OSDMonitor::validate_crush_against_features(const CrushWrapper *newcrush,
|
||||
<< newmap.require_min_compat_client;
|
||||
return false;
|
||||
}
|
||||
if (mv > newmap.require_osd_release) {
|
||||
ss << "new crush map requires client version " << mv
|
||||
<< " but require_osd_release is "
|
||||
<< newmap.require_osd_release;
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
// osd compat
|
||||
@ -8072,7 +8078,7 @@ int OSDMonitor::prepare_new_pool(string& name,
|
||||
return r;
|
||||
}
|
||||
|
||||
if (osdmap.crush->get_rule_type(crush_rule) != (int)pool_type) {
|
||||
if (!osdmap.crush->rule_valid_for_pool_type(crush_rule, pool_type)) {
|
||||
*ss << "crush rule " << crush_rule << " type does not match pool";
|
||||
return -EINVAL;
|
||||
}
|
||||
@ -8344,7 +8350,7 @@ int OSDMonitor::prepare_command_pool_set(const cmdmap_t& cmdmap,
|
||||
return -EPERM;
|
||||
}
|
||||
}
|
||||
if (osdmap.crush->get_rule_type(p.get_crush_rule()) != (int)p.type) {
|
||||
if (!osdmap.crush->rule_valid_for_pool_type(p.get_crush_rule(), p.type)) {
|
||||
ss << "crush rule " << p.get_crush_rule() << " type does not match pool";
|
||||
return -EINVAL;
|
||||
}
|
||||
@ -8577,7 +8583,7 @@ int OSDMonitor::prepare_command_pool_set(const cmdmap_t& cmdmap,
|
||||
ss << cpp_strerror(id);
|
||||
return -ENOENT;
|
||||
}
|
||||
if (osdmap.crush->get_rule_type(id) != (int)p.get_type()) {
|
||||
if (!osdmap.crush->rule_valid_for_pool_type(id, p.get_type())) {
|
||||
ss << "crush rule " << id << " type does not match pool";
|
||||
return -EINVAL;
|
||||
}
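The three call sites above replace a strict equality check between rule type and pool type with `rule_valid_for_pool_type`. Its definition is not among the hunks visible on this page, so the mapping below is an assumption: `msr_firstn` rules are treated as valid for replicated pools and `msr_indep` rules for erasure pools, alongside the original exact matches. All names and enum values here are hypothetical.

```cpp
#include <cassert>

// Hypothetical enums; the values are not Ceph's.
enum class PoolType { Replicated, Erasure };
enum class RuleType { Replicated, Erasure, MsrFirstn, MsrIndep };

// Assumed compatibility table: each pool type accepts its classic rule type
// plus one MSR variant.
bool rule_valid_for_pool_type_sketch(RuleType rule, PoolType pool) {
  switch (pool) {
    case PoolType::Replicated:
      return rule == RuleType::Replicated || rule == RuleType::MsrFirstn;
    case PoolType::Erasure:
      return rule == RuleType::Erasure || rule == RuleType::MsrIndep;
  }
  return false;
}
```

Under this assumption, the old equality check is the special case in which neither MSR variant is present.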
|
||||
|
@ -1764,9 +1764,10 @@ uint64_t OSDMap::get_features(int entity_type, uint64_t *pmask) const
|
||||
features |= CEPH_FEATURE_CRUSH_V4;
|
||||
if (crush->has_nondefault_tunables5())
|
||||
features |= CEPH_FEATURE_CRUSH_TUNABLES5;
|
||||
if (crush->has_incompat_choose_args()) {
|
||||
if (crush->has_incompat_choose_args())
|
||||
features |= CEPH_FEATUREMASK_CRUSH_CHOOSE_ARGS;
|
||||
}
|
||||
if (crush->has_nondefault_tunables_msr())
|
||||
features |= CEPH_FEATURE_CRUSH_MSR;
|
||||
mask |= CEPH_FEATURES_CRUSH;
|
||||
|
||||
if (!pg_upmap.empty() || !pg_upmap_items.empty() || !pg_upmap_primaries.empty())
|
||||
@ -1789,6 +1790,8 @@ uint64_t OSDMap::get_features(int entity_type, uint64_t *pmask) const
|
||||
features |= CEPH_FEATURE_CRUSH_TUNABLES3;
|
||||
if (crush->is_v5_rule(ruleid))
|
||||
features |= CEPH_FEATURE_CRUSH_TUNABLES5;
|
||||
if (crush->is_msr_rule(ruleid))
|
||||
features |= CEPH_FEATURE_CRUSH_MSR;
|
||||
}
|
||||
}
|
||||
mask |= CEPH_FEATURE_OSDHASHPSPOOL | CEPH_FEATURE_OSD_CACHEPOOL;
|
||||
@ -1843,6 +1846,9 @@ ceph_release_t OSDMap::get_min_compat_client() const
|
||||
{
|
||||
uint64_t f = get_features(CEPH_ENTITY_TYPE_CLIENT, nullptr);
|
||||
|
||||
if (HAVE_FEATURE(f, CRUSH_MSR)) { // TODOSAM -- add version right before merge
|
||||
return ceph_release_t::squid; // v19.2.0
|
||||
}
|
||||
if (HAVE_FEATURE(f, OSDMAP_PG_UPMAP) || // v12.0.0-1733-g27d6f43
|
||||
HAVE_FEATURE(f, CRUSH_CHOOSE_ARGS)) { // v12.0.1-2172-gef1ef28
|
||||
return ceph_release_t::luminous; // v12.2.0
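The ordering in `get_min_compat_client` matters: the newest requirement is tested first, so a map that uses MSR rules reports squid even if it also uses older features. The sketch below shows that pattern in isolation. The names are hypothetical; bit 38 for CRUSH_MSR is taken from the `DEFINE_CEPH_FEATURE(38, 2, CRUSH_MSR)` line in the features hunk (incarnation bits are ignored here), and the second bit is a placeholder rather than a real Ceph feature bit.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified release labels for illustration.
enum class Release { older, luminous, squid };

// Bit 38 as declared for CRUSH_MSR; incarnation handling is omitted.
constexpr std::uint64_t FEATURE_CRUSH_MSR = 1ull << 38;
// Placeholder standing in for the upmap / choose_args checks; not the real bit.
constexpr std::uint64_t FEATURE_UPMAP_PLACEHOLDER = 1ull << 1;

// Checks the newest requirement first and falls through to older ones.
Release min_compat_client(std::uint64_t features) {
  if (features & FEATURE_CRUSH_MSR)
    return Release::squid;
  if (features & FEATURE_UPMAP_PLACEHOLDER)
    return Release::luminous;
  return Release::older;
}
```

A feature set containing both bits maps to `Release::squid`, because the MSR check runs before the luminous-era checks.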
|
||||
@ -4524,7 +4530,7 @@ int OSDMap::validate_crush_rules(CrushWrapper *newcrush,
|
||||
<< " but it is not present";
|
||||
return -EINVAL;
|
||||
}
|
||||
if (newcrush->get_rule_type(ruleno) != (int)pool.get_type()) {
|
||||
if (!newcrush->rule_valid_for_pool_type(ruleno, pool.get_type())) {
|
||||
*ss << "pool " << i.first << " type does not match rule " << ruleno;
|
||||
return -EINVAL;
|
||||
}
|
||||
|
@ -159,6 +159,8 @@
|
||||
"chooseleaf_descend_once": 0,
|
||||
"chooseleaf_vary_r": 0,
|
||||
"chooseleaf_stable": 0,
|
||||
"msr_descents": 100,
|
||||
"msr_collision_tries": 100,
|
||||
"straw_calc_version": 0,
|
||||
"allowed_bucket_algs": 22,
|
||||
"profile": "argonaut",
|
||||
@ -172,7 +174,8 @@
|
||||
"has_v3_rules": 0,
|
||||
"has_v4_buckets": 1,
|
||||
"require_feature_tunables5": 0,
|
||||
"has_v5_rules": 0
|
||||
"has_v5_rules": 0,
|
||||
"has_msr_rules": 0
|
||||
},
|
||||
"choose_args": {
|
||||
"1": [],
|
||||
|
@ -6,7 +6,7 @@
|
||||
osdmaptool: exported crush map to oc
|
||||
$ osdmaptool --import-crush oc myosdmap
|
||||
osdmaptool: osdmap file 'myosdmap'
|
||||
osdmaptool: imported 497 byte crush map from oc
|
||||
osdmaptool: imported 499 byte crush map from oc
|
||||
osdmaptool: writing epoch 3 to myosdmap
|
||||
$ osdmaptool --adjust-crush-weight 0:5 myosdmap
|
||||
osdmaptool: osdmap file 'myosdmap'
|
||||
|
File diff suppressed because it is too large
@ -176,6 +176,9 @@ zoned_enabled=0
|
||||
io_uring_enabled=0
|
||||
with_jaeger=0
|
||||
force_addr=0
|
||||
osds_per_host=0
|
||||
require_osd_and_client_version=""
|
||||
use_crush_tunables=""
|
||||
|
||||
with_mgr_dashboard=true
|
||||
if [[ "$(get_cmake_variable WITH_MGR_DASHBOARD_FRONTEND)" != "ON" ]] ||
|
||||
@ -599,6 +602,21 @@ case $1 in
|
||||
with_jaeger=1
|
||||
echo "with_jaeger $with_jaeger"
|
||||
;;
|
||||
--osds-per-host)
|
||||
osds_per_host="$2"
|
||||
shift
|
||||
echo "osds_per_host $osds_per_host"
|
||||
;;
|
||||
--require-osd-and-client-version)
|
||||
require_osd_and_client_version="$2"
|
||||
shift
|
||||
echo "require_osd_and_client_version $require_osd_and_client_version"
|
||||
;;
|
||||
--use-crush-tunables)
|
||||
use_crush_tunables="$2"
|
||||
shift
|
||||
echo "use_crush_tunables $use_crush_tunables"
|
||||
;;
|
||||
*)
|
||||
usage_exit
|
||||
esac
|
||||
@ -1095,6 +1113,15 @@ EOF
|
||||
if [ "$crimson" -eq 1 ]; then
|
||||
$CEPH_BIN/ceph osd set-allow-crimson --yes-i-really-mean-it
|
||||
fi
|
||||
|
||||
if [ -n "$require_osd_and_client_version" ]; then
|
||||
$CEPH_BIN/ceph osd set-require-min-compat-client $require_osd_and_client_version
|
||||
$CEPH_BIN/ceph osd require-osd-release $require_osd_and_client_version --yes-i-really-mean-it
|
||||
fi
|
||||
|
||||
if [ -n "$use_crush_tunables" ]; then
|
||||
$CEPH_BIN/ceph osd crush tunables $use_crush_tunables
|
||||
fi
|
||||
}
|
||||
|
||||
start_osd() {
|
||||
@ -1128,6 +1155,13 @@ start_osd() {
|
||||
[osd.$osd]
|
||||
host = $HOSTNAME
|
||||
EOF
|
||||
|
||||
if [ "$osds_per_host" -gt 0 ]; then
|
||||
wconf <<EOF
|
||||
crush location = root=default host=$HOSTNAME-$(echo "$osd / $osds_per_host" | bc)
|
||||
EOF
|
||||
fi
|
||||
|
||||
if [ "$spdk_enabled" -eq 1 ]; then
|
||||
wconf <<EOF
|
||||
bluestore_block_path = spdk:${bluestore_spdk_dev[$osd]}
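The `--osds-per-host` option in the vstart hunk groups OSDs into synthetic CRUSH hosts by integer division of the OSD id (`echo "$osd / $osds_per_host" | bc`). The same arithmetic, written as a small function for illustration; `crush_host_for_osd` is a hypothetical name and this is only a sketch of the naming scheme, not code from the script.

```cpp
#include <cassert>
#include <string>

// Builds the synthetic host name "<hostname>-<osd / osds_per_host>" using
// truncating integer division, so consecutive OSD ids share a host bucket.
std::string crush_host_for_osd(const std::string& hostname, int osd,
                               int osds_per_host) {
  assert(osd >= 0 && osds_per_host > 0);
  return hostname + "-" + std::to_string(osd / osds_per_host);
}
```

With `osds_per_host` set to 2, OSDs 0 and 1 land in `<hostname>-0` and OSDs 2 and 3 in `<hostname>-1`, which gives MSR rules several OSDs per failure domain to choose from on a single test machine.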
|
||||
|