And it is consistent with the above. Also use a slightly different
string to allow the caller to differentiate between the two cases.
Signed-off-by: Loic Dachary <loic@dachary.org>
And implement dump_rules() using dump_rule(). The indentation and
variable names are intentionally left as is, so that code being moved
around is not confused with code being changed.
Signed-off-by: Loic Dachary <loic@dachary.org>
Use the admin socket to create the conditions by which a pool creation
is made to wait for the next paxos proposal because the required crush
ruleset is pending.
It replaces a fragile, time-sensitive workaround that could fail because
of race conditions. It also speeds up the test, because there is no need
to wait for a long time just to accommodate the slowest machines.
Signed-off-by: Loic Dachary <loic@dachary.org>
It provides a developer path allowing functional tests to modify the
pending OSDMap without triggering a PaxosProposal.
It can be used as follows:
echo '{"prefix":"osdmonitor_prepare_command","prepare":"osd crush tunables","profile":"bobtail"}' | nc -U out/mon.a.asok
It will transform the command into:
{"prefix":"osd crush tunables","profile":"bobtail"}
and feed it to OSDMonitor::prepare_command_impl(). The pending OSDMap won't
be proposed because the command short-circuits PaxosService::dispatch(). It
will, however, be proposed next time PaxosService::dispatch() gets a chance.
It cannot be used via the ceph command line.
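The rewriting described above can be sketched in Python as follows. This
only illustrates the JSON transformation; it is not the monitor's C++
code, and the function name unwrap_prepare_command is hypothetical.

```python
import json


def unwrap_prepare_command(cmd_json):
    """Turn an osdmonitor_prepare_command payload into the wrapped command.

    Hypothetical helper: the "prepare" value becomes the new "prefix" and
    every other argument is passed through unchanged.
    """
    cmd = json.loads(cmd_json)
    if cmd.get("prefix") != "osdmonitor_prepare_command":
        raise ValueError("not an osdmonitor_prepare_command payload")
    inner = {"prefix": cmd["prepare"]}
    for key, value in cmd.items():
        if key not in ("prefix", "prepare"):
            inner[key] = value
    return inner


payload = ('{"prefix":"osdmonitor_prepare_command",'
           '"prepare":"osd crush tunables","profile":"bobtail"}')
print(json.dumps(unwrap_prepare_command(payload)))
```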
Signed-off-by: Loic Dachary <loic@dachary.org>
When this flag is true, the mon is expected to provide functionality
that is for developer-oriented debugging purposes only. It is meant to
be used by the developer and not the system administrator, because it
would allow a non-developer to break things in ways that would be very
difficult to diagnose. It should probably not be documented.
Signed-off-by: Loic Dachary <loic@dachary.org>
So that it is possible to call prepare_command without an established
session and with a cmdmap that has already been parsed.
Signed-off-by: Loic Dachary <loic@dachary.org>
Warn on legacy tunables, not on non-optimal tunables. Optimal is a moving
target, but it is really the legacy defaults that we want to push people
off of.
Fixes: #7399
Signed-off-by: Sage Weil <sage@inktank.com>
The binary file names have changed and need to be updated in the
packaging files for deb and rpm. Fix a few leftovers as well.
Fixing 1a588f18ba
Signed-off-by: Loic Dachary <loic@dachary.org>
Update inode format version to 10, and treat any previously created
inode as having no backtrace. When an inode with no backtrace is
modified, force an update of its backtrace.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
The number of objects is not a significant indicator of when data
should be written out for rbd. Use the highest possible value for
number of objects and just rely on the dirty data limits to trigger
flushing. When the number of objects is low, and many start being
flushed before they accumulate many requests, it hurts average request
size and performance for many concurrent sequential writes.
Fixes: #7385
Backport: emperor, dumpling
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
All the options are uint64_t, but the ObjectCacher was converting them
to int64_t. There's never any reason for these to be negative, so
change the type.
Adjust a few conditionals so that they only convert known-positive
signed values to uint64_t before comparing with the target and max
values. Leave the actual stats accounting as loff_t for now, since
bugs in accounting will have bad effects if negative values wrap
around.
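The reason for converting only known-positive values can be sketched in
Python by emulating the uint64_t conversion with a mask. This is an
illustration only, not the ObjectCacher code; the helper names and the
sample limit are hypothetical.

```python
U64_MASK = (1 << 64) - 1


def as_uint64(value):
    """Emulate a C conversion to uint64_t (two's complement wrap-around)."""
    return value & U64_MASK


def naive_exceeds(stat, limit):
    # Converts unconditionally: a negative stat wraps to a huge value
    # and wrongly appears to exceed the limit.
    return as_uint64(stat) > limit


def guarded_exceeds(stat, limit):
    # Converts only when the signed value is known to be positive.
    return stat > 0 and as_uint64(stat) > limit


limit = 1024  # hypothetical dirty-data limit
print(naive_exceeds(-1, limit), guarded_exceeds(-1, limit))
```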
Backport: emperor, dumpling
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
These illustrate the variation in mapping results as the vary_r tunable
is adjusted. Note:
1- For the vary_r=0 case, we have several inputs that map to only a single
output:
rule 3 (delltestrule) num_rep 4 result size == 1:\t27/1024 (esc)
rule 3 (delltestrule) num_rep 4 result size == 2:\t997/1024 (esc)
This is the behavior we are fixing. For all of the other values of
vary_r, we get 2 outputs for all inputs.
2- If we use vary_r=1, which is likely the most efficient computation,
we get lots of inputs that change. By setting larger values of vary_r,
we can trade a bit of extra computation to get a mapping that is more
similar to the legacy behavior. This is useful for legacy clusters:
$ for f in `seq 1 4` ; do diff -u test-map-vary-r-0.t test-map-vary-r-$f.t | grep -c -- + ; done
3030
1629
645
228
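For reference, the counting done by the loop above can be sketched in
Python with difflib. Like `grep -c -- +`, it counts every diff line that
contains a '+' character, which includes the '+++' file header and the
'@@' hunk headers, so the totals are a rough measure rather than an exact
count of changed mappings. The function name and sample inputs are
hypothetical.

```python
import difflib


def count_plus_lines(old_lines, new_lines):
    """Count unified-diff lines containing '+', as `grep -c -- +` would."""
    diff = difflib.unified_diff(old_lines, new_lines,
                                fromfile="old", tofile="new", lineterm="")
    return sum(1 for line in diff if "+" in line)


old = ["x", "y", "z"]
new = ["x", "Y", "z"]
print(count_plus_lines(old, new))
```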
The crushmap here comes from a user who was seeing a bad mapping for certain
pgs after some OSDs were reweighted by utilization.
Signed-off-by: Sage Weil <sage@inktank.com>