These illustrate the variation in mapping results as the vary_r tunable
is adjusted. Note:
1- For the vary_r=0 case, we have several inputs that map to only a single
output:
rule 3 (delltestrule) num_rep 4 result size == 1:\t27/1024 (esc)
rule 3 (delltestrule) num_rep 4 result size == 2:\t997/1024 (esc)
This is the behavior we are fixing. For all of the other values of
vary_r, we get 2 outputs for all inputs.
2- If we use vary_r 1, which is likely the most efficient computation,
we get lots of inputs that change. By setting larger values of vary_r,
we can trade a bit of extra computation to get a mapping that is more
similar to the legacy behavior. This is useful for legacy clusters:
$ for f in `seq 1 4` ; do diff -u test-map-vary-r-0.t test-map-vary-r-$f.t | grep -c -- + ; done
3030
1629
645
228
The crushmap here comes from a user who was seeing a bad mapping for certain
pgs after some OSDs were reweighted by utilization.
Signed-off-by: Sage Weil <sage@inktank.com>
The current crush_choose_firstn code will re-use the same 'r' value for
the recursive call. That means that if we are hitting a collision or
rejection for some reason (say, an OSD that is marked out) and need to
retry, we will keep making the same (bad) choice in that recursive
selection.
Introduce a tunable that fixes that behavior by incorporating the parent
'r' value into the recursive starting point, so that a different path
will be taken in subsequent placement attempts.
Note that this was done from the get-go for the new crush_choose_indep
algorithm.
This was exposed by a user who was seeing PGs stuck in active+remapped
after reweight-by-utilization because the up set mapped to a single OSD.
Signed-off-by: Sage Weil <sage@inktank.com>
Back in 27f4d1f6bc we refactored the CRUSH
code to allow adjustment of the retry counts on a per-pool basis. That
commit had an off-by-one bug: the previous "tries" counter was a *retry*
count, not a *try* count, but the new code was passing in 1 meaning
there should be no retries.
Fix the ftotal vs tries comparison to use < instead of <= to fix the
problem. Note that the original code used <= here, which means the
global "choose_total_tries" tunable is actually counting retries.
Compensate for that by adding 1 in crush_do_rule when we pull the tunable
into the local variable.
This was noticed looking at output from a user provided osdmap.
Unfortunately the map doesn't illustrate the change in mapping behavior
and I haven't managed to construct one yet that does. Inspection of the
crush debug output now aligns with prior versions, though.
Signed-off-by: Sage Weil <sage@inktank.com>
This fixes a valgrind error from OSD::handle_osd_map where primary is not
initialized and is compared after the call to pg_to_acting_osds().
We are still not distinguishing from "no mapping" to "pool doesn't exist,
no mapping". That is a somewhat larger change, though.
Signed-off-by: Sage Weil <sage@inktank.com>
Need to initialize the truncated variable, as we sometimes ignore error
response (e.g., with ENOENT), and in such cases we can't expect it to be
set.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
(cherry picked from commit 9ecf3467a3)
Fixes: #7336
The rgw_read_user_buckets() treated the max param as the max number of
entries to request in a single op, but always fetched the entire list
of buckets. This is wrong, as it should have treated it as the total
number of entries requested. All the callers assume the latter.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
client/Client.cc: In member function 'int Client::_read(Fh*, int64_t, uint64_t, ceph::bufferlist*)':
warning: client/Client.cc:5893:27: comparison between signed and unsigned integer expressions [-Wsign-compare]
client/Client.cc: In member function 'int Client::_write(Fh*, int64_t, uint64_t, const char*)':
warning: client/Client.cc:6235:30: comparison between signed and unsigned integer expressions [-Wsign-compare]
Signed-off-by: Sage Weil <sage@inktank.com>
We had already added this as a flag (set/unset) when I generalized the
'mds set_max_mds' to be 'ceph mds set <var> <val>'. Switch the snaps
flag to be a key/value to with true/false (similar to the hashpspool
pool flag) since there are fewer users and the var/val approach is more
general.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
The file size can jump to a value that is very much larger than our current
position (for example, it could be a disk image file that gets a sparse
write at a large offset). Use a 64-bit value so that 'some' doesn't
overflow.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: John Spray <john.spray@inktank.com>
Passed fs suite, sage-2014-02-01_22:18:10-fs-wip-inline-testing-basic-plana,
modulo a snap test error in the suite.
Reviewed-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>
- fix a couple of typo for repo configuration and service restart
- the ceph package must be installed on RPM distro since the init
script relies on ceph-conf
- Note on radosgw service name for RPM distro
Signed-off-by: Alexandre Marangone <alexandre.marangone@inktank.com>
Move enum scrub_error_type to osd_types.h
Move PG::_compare_scrub_objects to ReplicatedBackend::be_compare_scrub_objects
Move PG::_select_auth_object to ReplicatedBackend::be_select_auth_object
Move PG::_compare_scrubmaps to ReplicatedBackend::be_compare_scrubmaps
Signed-off-by: David Zafman <david.zafman@inktank.com>
This is a backport of 368852f6c0.
Make a deep copy of the OSDMap to avoid clobbering the in-memory copy with
the call to apply_incremental.
Fixes: #7060
Signed-off-by: Sage Weil <sage@inktank.com>
Make a deep(ish) copy of another OSDMap. Unfortunatley we can't make the
compiler-generated copy operator/constructors private until c++11. :(
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit bd54b9841b)
./os/KeyValueStore.h: In member function ‘std::string KeyValueStore::strip_object_key(uint64_t)’:
warning: ./os/KeyValueStore.h:173:31: format ‘%ld’ expects argument of type ‘long int’, but argument 4 has type ‘uint64_t {aka long long unsigned int}’ [-Wformat=]
Signed-off-by: Sage Weil <sage@inktank.com>