This is useful for demos and testing. Enables
creation of lots of OSDs on a single block device
by simply running ceph-disk prepare more than once,
with a --data-size argument set.
Signed-off-by: John Spray <john.spray@inktank.com>
Fixes: #8428
Backport: firefly
Cannot use verify_object_permission() to test ACLs, as the operation
here might be on either an object or a bucket.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
To calculate the PG ID, if I didn't get it wrong, CRUSH takes the hash
modulo the number of PGs rather than the number of OSDs, according to
osd/osd_types.cc:963: ceph_stable_mod(pg.ps(), pg_num, pg_num_mask).
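A minimal Python sketch of that calculation, written from my
recollection of ceph_stable_mod() and of how pg_num_mask is derived;
treat both as assumptions and check them against the headers. The
seed and pg_num values are made up for illustration.

```python
def ceph_stable_mod(x: int, b: int, bmask: int) -> int:
    """Map x into [0, b) so that growing b only remaps some values."""
    if (x & bmask) < b:
        return x & bmask
    return x & (bmask >> 1)


def pg_num_mask(pg_num: int) -> int:
    """Smallest (2**n - 1) covering pg_num - 1 (assumed derivation)."""
    return (1 << (pg_num - 1).bit_length()) - 1


# Hypothetical placement seed in a pool with 12 PGs.
ps = 0x1234ABCD
pg_num = 12
pg_index = ceph_stable_mod(ps, pg_num, pg_num_mask(pg_num))
print(pg_index)
```

The point of the mask is that the result depends only on pg_num, not
on how many OSDs exist.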
Signed-off-by: Kai Zhang <zakir.exe@gmail.com>
Speed up several cross-MDS operations by reducing the number of two-phase commit disk accesses we have to go through.
Reviewed-by: Greg Farnum <greg@inktank.com>
Older versions of the JNI interface expected non-const parameters
to their memory move functions. Doing a const_cast to satisfy those
older headers is unpleasant, but it won't actually change the memory
in question. (And even if it *did* modify the memory, that would be
okay given our single user.)
Signed-off-by: Greg Farnum <greg@inktank.com>
Currently we don't set MMonGetVersionReply tid even if the original
MMonGetVersion message had a non-zero tid. This is bad for the kernel
client, which has the infrastructure in place that relies on tids to
look up message buffers and contexts. To kick off transitioning away
from the workaround, set MMonGetVersionReply tid to the tid of the
original MMonGetVersion message.
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
KeyValueStore currently doesn't support the set_alloc_hint op, but the
implementation of _do_transaction still needs to decode its arguments.
Otherwise, the arguments will be interpreted as the next op.
Fix the same problem for MemStore.
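A rough Python sketch of why an unsupported op's arguments must still
be consumed. The op codes, wire layout, and function names here are
hypothetical and are not the real ObjectStore transaction encoding.

```python
import struct

# Hypothetical op codes and argument layouts, for illustration only.
OP_TOUCH = 1         # args: object id (u32)
OP_SETALLOCHINT = 2  # args: object id (u32), two u64 size hints

ARG_FORMATS = {OP_TOUCH: "<I", OP_SETALLOCHINT: "<IQQ"}
SUPPORTED = {OP_TOUCH}


def encode(op, *args):
    """Encode one op as a 1-byte code followed by its arguments."""
    return struct.pack("<B", op) + struct.pack(ARG_FORMATS[op], *args)


def do_transaction(buf: bytes) -> list:
    """Walk the op stream, applying supported ops and skipping the rest."""
    applied = []
    offset = 0
    while offset < len(buf):
        (op,) = struct.unpack_from("<B", buf, offset)
        offset += 1
        fmt = ARG_FORMATS[op]
        args = struct.unpack_from(fmt, buf, offset)
        # Always advance past the arguments, even for an ignored op;
        # otherwise they would be parsed as the next op code.
        offset += struct.calcsize(fmt)
        if op in SUPPORTED:
            applied.append((op, args))
    return applied


stream = (encode(OP_TOUCH, 7)
          + encode(OP_SETALLOCHINT, 7, 4096, 512)
          + encode(OP_TOUCH, 9))
result = do_transaction(stream)
print(result)
```

Both touch ops are applied, and the hint op between them is skipped
without desynchronizing the decoder.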
Fixes: #8381
Reported-by: Xinxin Shu <xinxin.shu5040@gmail.com>
Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
If we fail to set the CRUSH position for one OSD, continue on to try
starting others, just as we do when we fail to start the daemon.
Fixes: #8342
Signed-off-by: Sage Weil <sage@inktank.com>
- When creating the OSD data, specify osd-uuid so that it matches when the osd is first created.
- Modify caps when adding osd auth to match what ceph-deploy does.
to fix FTBFS due to undeclared atomic functions.
As reported at
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=748571
by John David Anglin <dave.anglin@bell.net>
~~~~
./include/atomic.h: In member function 'size_t ceph::atomic_t::inc()':
./include/atomic.h:42:36: error: 'AO_fetch_and_add1' was not declared in this scope
return AO_fetch_and_add1(&val) + 1;
^
./include/atomic.h: In member function 'size_t ceph::atomic_t::dec()':
./include/atomic.h:45:42: error: 'AO_fetch_and_sub1_write' was not declared in this scope
return AO_fetch_and_sub1_write(&val) - 1;
^
./include/atomic.h: In member function 'void ceph::atomic_t::add(size_t)':
./include/atomic.h:48:36: error: 'AO_fetch_and_add' was not declared in this scope
AO_fetch_and_add(&val, add_me);
^
./include/atomic.h: In member function 'void ceph::atomic_t::sub(int)':
./include/atomic.h:52:48: error: 'AO_fetch_and_add_write' was not declared in this scope
AO_fetch_and_add_write(&val, (AO_t)negsub);
^
./include/atomic.h: In member function 'size_t ceph::atomic_t::dec()':
./include/atomic.h:46:5: warning: control reaches end of non-void function [-Wreturn-type]
}
^
make[5]: *** [cls/user/cls_user_client.o] Error 1
~~~~
Signed-off-by: Dmitry Smirnov <onlyjob@member.fsf.org>
This code was using CrushWrapper::rule_exists, which
checks for a *rule* existing, whereas the value being
set is a *ruleset*.
Signed-off-by: John Spray <john.spray@inktank.com>
Specifically, in the case where the configured
default ruleset is CEPH_DEFAULT_CRUSH_REPLICATED_RULESET,
instead of assuming ruleset 0 exists, choose the lowest
numbered ruleset.
In the case where an explicit ruleset is passed to
OSDMonitor::prepare_pool_crush_ruleset, verify
that it really exists.
The idea is to eliminate cases where a pool could
exist with its crush ruleset set to something
other than a valid ruleset ID.
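The selection logic described above can be sketched in Python roughly
as follows. The function name, the sentinel value, and the use of a
plain set of ruleset IDs are assumptions for illustration; the real
code works on the CRUSH map inside OSDMonitor.

```python
# Stand-in for CEPH_DEFAULT_CRUSH_REPLICATED_RULESET (value assumed).
DEFAULT_RULESET_SENTINEL = -1


def pick_ruleset(existing, configured_default, requested=None):
    """Return a ruleset ID that is guaranteed to be in `existing`."""
    # An explicitly requested ruleset wins, else the configured default.
    candidate = requested if requested is not None else configured_default

    if candidate == DEFAULT_RULESET_SENTINEL:
        # Do not assume ruleset 0 exists; take the lowest one present.
        if not existing:
            raise ValueError("no rulesets defined")
        return min(existing)

    # An explicit ruleset must really exist.
    if candidate not in existing:
        raise ValueError("ruleset %d does not exist" % candidate)
    return candidate


rulesets = {3, 5}
print(pick_ruleset(rulesets, DEFAULT_RULESET_SENTINEL))
print(pick_ruleset(rulesets, DEFAULT_RULESET_SENTINEL, requested=5))
```

With rulesets {3, 5} the default case yields 3 rather than a
nonexistent ruleset 0.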
Fixes: #8373
Signed-off-by: John Spray <john.spray@inktank.com>
Use the server timestamp for the snapshot timestamp. This could arguably
be the client timestamp, but I think snapshot creation times are a bit
more important to have accurate timestamps on, and this should not be
something that existing client apps will strongly depend on.
Signed-off-by: Sage Weil <sage@inktank.com>
These ops do complicated work prior to starting the real operation, like
fetching missing directories, opening remote dirfrags, or creating
snaprealms. Reset the mds timestamp after this slow work has completed.
Signed-off-by: Sage Weil <sage@inktank.com>
Use the op_stamp from the MDRequest, populated by the MClientRequest when
possible, for setting timestamps on user-visible metadata (like ctime,
mtime).
Signed-off-by: Sage Weil <sage@inktank.com>
Use the op (client) timestamp for the recursive stats, for sanity's sake.
Note that since this is monotonically increasing, the danger here is
that we lose track of nested changes due to skewed client clocks.
Signed-off-by: Sage Weil <sage@inktank.com>
This is a catch-all that we are carrying over from before. It may not
be strictly necessary, but I'm not inclined to check the code for
Mutation users who didn't call acquire_locks().
Signed-off-by: Sage Weil <sage@inktank.com>
This was off by one (too few) in the case of a
trimmedpos->write_pos range that had length
layout_period+2 and a starting position one byte
before a period boundary.
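A small Python sketch of the boundary case. The "naive" formula is a
hypothetical length-only count, not necessarily what the old code did,
and the period size is made up.

```python
def periods_touched(start: int, end: int, period: int) -> int:
    """Number of layout periods overlapped by the byte range [start, end)."""
    if end <= start:
        return 0
    return (end - 1) // period - start // period + 1


def naive_periods(start: int, end: int, period: int) -> int:
    """Hypothetical length-only count: ceil(length / period)."""
    return -(-(end - start) // period)


period = 4096                  # hypothetical layout_period
start = period - 1             # one byte before a period boundary
end = start + period + 2       # range length is layout_period + 2

print(periods_touched(start, end, period))
print(naive_periods(start, end, period))
```

The range covers the last byte of one period, all of the next, and the
first byte of a third, so a count based on length alone comes up one
short.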
Signed-off-by: John Spray <john.spray@inktank.com>