In practice this tends to get bubbled up the stack as an error on
the caller, and they usually do not handle it properly. For example,
with librbd, this turns into EIO and break the VM.
Instead, this will manifest as a hung op on the client. That is
also not ideal, but given that the root cause here is generally a
bug, it's not clear what else would be better.
We already log an error in the cluster log, so teuthology runs will
continue to fail.
Signed-off-by: Sage Weil <sage@redhat.com>
Expose public methods that include a new output argument to indicate
whether there are more keys to fetch or not.
Mark the old interfaces deprecated.
Signed-off-by: Sage Weil <sage@redhat.com>
This change does prioritize backfill of PGs which don't
have min_size active copies. Such PGs would cause IO stalls
for clients and would increase throttlers usage.
This change also fixes few subtlle out-of-bounds bugs.
Signed-off-by: Bartłomiej Święcki <bartlomiej.swiecki@corp.ovh.com>
Tell users they need to set this to true before Monitors will allow
pools to be removed.
Also update the Pending Release Notes so that users can find this change
there.
This was changed with commit 5d7f4ea
Signed-off-by: Wido den Hollander <wido@42on.com>
osd: set server-side limits on omap get operations
Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
If we have an OSD with a weight that's not 1.0 and mark it out,
we should restore the same weight when we mark it back in. We
already do this when an OSD is automatically marked out, just
not when it is explicitly marked out.
Signed-off-by: Sage Weil <sage@redhat.com>
This assumes that if the mon does not explicitly specify
the kv type that it is leveldb. No prior version of
Ceph has had non-experimental rocksdb, so this is
relatively safe. It's also necessary because the
default is now 'rocksdb' and we shouldn't assume those
old mons are rocksdb.
This will break for users to explicitly specified
rocksdb for the mon despite it being experimental.
Signed-off-by: Sage Weil <sage@redhat.com>
Exclusive lock, object map, fast-diff, and deep-flatten have been
enabled by default for all new images.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
The rbd cli will warn about the deprecation when attempting to create
image format 1 images. librbd will log an error message when opening
a format 1 RBD image.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
the symbols of buffer::list::iterator_impl<> were wrongly exposed
in previous infernalis release, and the clients linked against
librados are very likely using them. so we need to document this
change.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Allow librados users to opt to receive ENOSPC or EDQUOT when they submit
an operation against a full cluster. This should only be used if the
librados app can handle those errors gracefully (librbd, for example,
cannot).
Also note that this allows savvy librados users to send delete operations;
they will get either a success or EDQUOT, depending on whether the
operation results in a net drop in space utilization.
Signed-off-by: Sage Weil <sage@redhat.com>
'ceph mon_metadata' was added still during this dev cycle, so there is
no need to deprecate it first.
Fixes: #11545
Signed-off-by: Joao Eduardo Luis <joao@suse.de>
Use a clean name for keyvaluestore (no -dev suffix), but mark as
experimental to ensure users know what they are signing up for.
Signed-off-by: Sage Weil <sage@redhat.com>
Recent versions of Python contain a change to thread shutdown that
causes ceph to hang on exit; see http://bugs.python.org/issue21963.
As it turns out, this is relatively easy to avoid by not spawning
threads on exit, as Rados.__del__() will certainly do by calling
shutdown(); I suspect, but haven't proven, that the problem is
that shutdown() tries to start() a threading.Thread() that never
makes it all the way back to signal start().
Also add a PendingReleaseNote and extra doc comments to clarify.
Fixes: #8797
Signed-off-by: Dan Mick <dan.mick@redhat.com>
Add release note
New librados interface
New pg_nls_response_t over the wire protocol
Ignore internal namespace (.ceph_internal)
Enhance ObjListCtx to keep independent IoCtxImpl so nspace won't change out from under listing code
Add ListObject with private implementation ListObjectImpl to return from iterator
Add EINVAL error for old librados interface when LIBRADOS_ALL_NSPACES set
Add throw to old librados c++ interface when all_nspaces set
Fixes: #9031
Signed-off-by: David Zafman <dzafman@redhat.com>
OSDs will now rely on 'leveldb_*' config options. We do keep however
leveldb's log enabled for OSDs by passing 'leveldb_log=""' as a default
argument to global_init() on ceph_osd.cc -- however, users will be able
to override this at their own discretion.
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
'leveldb_*' options are currently used both by the monitor and the osd.
However, the monitor has quite different requirements from those of the
osds.
We need to specify some default values that must squash the defaults we
have for 'leveldb_*' options, while allowing users to overriding them too.
We take this not-exactly-ideal-but-still-good-enough approach of
defining the monitor-specific defaults in the 'default arguments' to
global_init(), thus allowing the user's options to take precedence over
whatever we define.
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
From this point onward, users should use leveldb's options and add them
to the appropriate config sections of their configuration file.
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
A 'status' or 'health' request will return a HEALTH_WARN whenever the
monitor handling the request has the option set to zero.
Fixes: 7784
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
The FileStore's leveldb currently uses libleveldb's defaults for cache and
write buffer size, which are both 4 MB. Increase the cache size to 128MB and
the write buffer to 8MB.
Tested-by: Dmitry Smirnov <onlyjob@member.fsf.org>
Signed-off-by: Sage Weil <sage@inktank.com>
Reading past the end of a pointer returned by string.data() in c++98
is undefined. While we're fixing this, also allow comparison of xattrs
containing null bytes.
Fixes: #7250
Backport: dumpling
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Just before sending an op, prepare_mutate_op() is called, creating a
new Op. prepare_read_op() already copied over all the out-params
correctly, but for write operations the individual op return value
pointers were not copied, so they would not be filled in. With this
fixed, librados users can get the per-op return codes again.
Partially fixes: #6483
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>