If we don't specify the version up front, learn the version after the first
chunk and enforce it thereafter to ensure we do not get torn content.
Signed-off-by: Sage Weil <sage@inktank.com>
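The version handling described above can be sketched as follows. This is a minimal illustration, not Ceph's actual copy code: `ChunkReply`, `VersionPinnedCopy`, and `accept()` are hypothetical names, and the sketch assumes each chunk reply carries the version of the source object it was read from.

```cpp
#include <cstdint>
#include <string>

// Hypothetical reply for one chunk of a chunked copy.
struct ChunkReply {
  std::uint64_t version;  // version of the source object when this chunk was read
  std::string data;       // payload of this chunk
};

// Hypothetical helper that pins the source version for the whole copy.
class VersionPinnedCopy {
public:
  // No version given up front: learn it from the first chunk.
  VersionPinnedCopy() : have_version_(false), version_(0) {}
  // Version given up front: enforce it from the first chunk on.
  explicit VersionPinnedCopy(std::uint64_t v) : have_version_(true), version_(v) {}

  // Returns false (and drops the chunk) if the source changed mid-copy.
  bool accept(const ChunkReply& reply) {
    if (!have_version_) {
      version_ = reply.version;  // learned after the first chunk
      have_version_ = true;
    } else if (version_ != reply.version) {
      return false;              // chunks would span two versions: torn content
    }
    assembled_ += reply.data;
    return true;
  }

  const std::string& data() const { return assembled_; }

private:
  bool have_version_;
  std::uint64_t version_;
  std::string assembled_;
};
```

A caller that sees `accept()` return false would presumably restart or fail the copy; which of the two is appropriate is not stated in the text above.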
Block any request on an object (read or write) during the COPY_FROM
operation.
This could potentially be broken down into read vs write operations without
much difficulty, but blocking any op indiscriminately is sufficient for
now, so let's keep it simple.
Signed-off-by: Sage Weil <sage@inktank.com>
Add an is_blocked() method for the obc, and add infrastructure to block
any operations if it returns true. Clean up on_change(), and add a helper
to kick an obc when whatever condition leading to it being blocked is no
longer true.
For now, is_blocked() is always false...
Signed-off-by: Sage Weil <sage@inktank.com>
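A rough sketch of the block-and-kick pattern described above. `ObjectContextSketch`, `submit()`, `set_blocked()`, and `kick()` are hypothetical names used for illustration only; the real obc and its op queueing are more involved, and in the commit above is_blocked() always returns false.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical stand-in for an object context (obc).
class ObjectContextSketch {
public:
  bool is_blocked() const { return blocked_; }
  void set_blocked(bool b) { blocked_ = b; }  // assumption: set by whatever blocks the object

  // Run the op now, or park it while the object is blocked.
  // Returns true if the op ran immediately.
  bool submit(std::function<void()> op) {
    if (is_blocked()) {
      waiting_.push_back(std::move(op));
      return false;
    }
    op();
    return true;
  }

  // Call once the blocking condition is no longer true:
  // resubmit parked ops in their original order.
  void kick() {
    std::deque<std::function<void()> > ready;
    ready.swap(waiting_);
    for (auto& op : ready) {
      submit(std::move(op));  // parks again if the object is still blocked
    }
  }

  std::size_t waiting() const { return waiting_.size(); }

private:
  bool blocked_ = false;
  std::deque<std::function<void()> > waiting_;
};
```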
As we get each chunk of data during the COPY_FROM operation, write it out
to a temporary object on the replicas. When we get all the pieces, move
it into place.
Signed-off-by: Sage Weil <sage@inktank.com>
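The write-to-temp-then-move pattern above can be illustrated with a small in-memory model. `ReplicaStoreSketch` and its methods are hypothetical and stand in for a replica's object store; the sketch only shows why the final object never appears half-written.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical in-memory stand-in for a replica's object store.
class ReplicaStoreSketch {
public:
  // Append one chunk to the named object, creating it if needed.
  void append(const std::string& name, const std::string& chunk) {
    objects_[name] += chunk;
  }

  // Move src over dst, replacing any previous dst.
  // Returns false if src does not exist.
  bool move_into_place(const std::string& src, const std::string& dst) {
    auto it = objects_.find(src);
    if (it == objects_.end()) return false;
    std::string data = std::move(it->second);
    objects_.erase(it);
    objects_[dst] = std::move(data);
    return true;
  }

  bool exists(const std::string& name) const { return objects_.count(name) != 0; }
  const std::string& read(const std::string& name) const { return objects_.at(name); }

private:
  std::map<std::string, std::string> objects_;
};
```

Until `move_into_place()` is called, readers of the destination name see either the old object or nothing, never a partial copy.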
On btrfs, kb_used + kb_avail can be much smaller than total kb, and
what really matters to avoid filling up the disk is how much space is
available, not how much we've used. Thus, compute the ratio we use to
determine full or nearfull from kb_avail rather than from kb_used.
Signed-off-by: Alexandre Oliva <oliva@gnu.org>
Signed-off-by: Sage Weil <sage@inktank.com>
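To make the difference concrete, here is a small sketch of the two ways of computing the ratio. The struct and function names are hypothetical, and the exact formula is an assumption: the text above only says the ratio is derived from kb_avail rather than kb_used.

```cpp
#include <cstdint>

// Hypothetical stats record; field names follow the terms used in the text.
struct StatSketch {
  std::uint64_t kb;        // total size
  std::uint64_t kb_used;   // space reported as used
  std::uint64_t kb_avail;  // space reported as available
};

// Old approach: fraction of the disk reported as used.
inline double ratio_from_used(const StatSketch& s) {
  if (s.kb == 0) return 0.0;
  return static_cast<double>(s.kb_used) / static_cast<double>(s.kb);
}

// Assumed new approach: treat everything that is not available as consumed.
inline double ratio_from_avail(const StatSketch& s) {
  if (s.kb == 0 || s.kb_avail >= s.kb) return 0.0;  // avoid unsigned underflow
  return static_cast<double>(s.kb - s.kb_avail) / static_cast<double>(s.kb);
}
```

With kb = 1000, kb_used = 300 and kb_avail = 200 (used plus available well below the total, as can happen on btrfs), the used-based ratio is 0.3 while the avail-based ratio is 0.8, so only the latter would trip a nearfull threshold set between the two.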
When using the replica log, if the log pool doesn't exist, all operations
are going to fail. Try to create it if it doesn't exist.
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
The in-tree Hadoop shim was a combination of a libcephfs wrapper and the
bits to support Hadoop. It has been replaced by src/java, which
implements generic libcephfs wrappers, and by an external Hadoop shim
(see docs).
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
Fixes: #6175
Backport: dumpling
We get a buffer off the remote gateway which might
not be NULL terminated. The JSON parser needs the
buffer to be NULL terminated, even though we provide
a buffer length, because it calls strlen().
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
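One way to guarantee termination for a length-delimited buffer is sketched below. This is not necessarily how the fix above was implemented; `make_terminated()` and `parser_sees()` are hypothetical helpers, the latter standing in for a parser that ignores the supplied length and calls strlen().

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Copy exactly len bytes into a std::string. c_str() on the result is
// guaranteed to be followed by a terminating '\0'.
inline std::string make_terminated(const char* buf, std::size_t len) {
  return std::string(buf, len);
}

// Stand-in for a parser that relies on strlen() rather than a length.
inline std::size_t parser_sees(const char* s) {
  return std::strlen(s);
}
```

Passing the raw network buffer directly to such a parser would read past the end of the payload; passing `make_terminated(buf, len).c_str()` bounds the read at the payload length, provided the payload itself contains no embedded zero bytes.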
Backportable change to ensure that recovery is indeed complete even if
no new ops were started or are running. Prevents some error condition
or unforeseen code path from crashing an OSD.
Backport: dumpling, cuttlefish
Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
Caused by 944f3b7353
Fixes: #6291
Backport: dumpling
Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
Fixes: #6286
Use an external counter instead of calling list::size()
Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
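The commit above does not say why list::size() was avoided. A likely reason is that before C++11 the standard allowed std::list::size() to take linear time, and older libstdc++ releases implemented it that way. A minimal sketch of the external-counter idea, with the hypothetical wrapper `CountedList`:

```cpp
#include <cstddef>
#include <list>
#include <utility>

// Hypothetical wrapper that tracks the element count next to the list.
template <typename T>
class CountedList {
public:
  void push_back(T v) {
    items_.push_back(std::move(v));
    ++count_;
  }

  // Removes the first element, if any, and keeps the counter in sync.
  void pop_front() {
    if (!items_.empty()) {
      items_.pop_front();
      --count_;
    }
  }

  // Constant time regardless of the standard library implementation.
  std::size_t size() const { return count_; }
  bool empty() const { return count_ == 0; }
  const T& front() const { return items_.front(); }

private:
  std::list<T> items_;
  std::size_t count_ = 0;
};
```

The counter must be updated at every insertion and removal site, which is why wrapping the list is less error-prone than keeping a loose variable beside it.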
Fixes: #6268
When doing aio writes of objects (either regular or multipart parts) we
need to drain pending aio requests. Otherwise, if the gateway goes down,
the object might end up corrupted.
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
If the repop has no version set, skip the updates to last_update and
last_update_{applied,ondisk} and last_complete_ondisk.
Signed-off-by: Sage Weil <sage@inktank.com>
Allow us to mark when we start and stop using a temporary object in a
sub_op. If we start to use it, make sure the collection exists on the
replica.
Signed-off-by: Sage Weil <sage@inktank.com>
This is similar to a collection_add + collection_move sequence in that we
apply the same replay guards. The difference is that we roll it up into
a single operation, change the filename, and make the omap content carry
over by calling DBObjectMap->clone (as there is no rename function or
collection awareness in the DBObjectMap).
Signed-off-by: Sage Weil <sage@inktank.com>
Fixes: #6214
When a read from the client failed while putting an object, we
returned the wrong value (always 0), which in the chunked-upload
case led us to assume that the write had completed successfully.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
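The class of bug described above can be illustrated with a short sketch. `ReadFn` and `receive_chunks()` are hypothetical and do not correspond to the gateway's real functions; the sketch assumes a reader callback that returns the number of bytes read, 0 at end of input, or a negative errno value.

```cpp
#include <cerrno>
#include <functional>
#include <string>

// Hypothetical reader: bytes read (> 0), 0 at end of input, or negative errno.
using ReadFn = std::function<int(std::string& out)>;

// Collect a chunked upload into body.
// Returns 0 on success or the first negative error from the client read.
inline int receive_chunks(const ReadFn& read_chunk, std::string& body) {
  for (;;) {
    std::string chunk;
    int r = read_chunk(chunk);
    if (r < 0) return r;   // propagate the failure instead of returning 0
    if (r == 0) return 0;  // clean end of upload
    body += chunk;
  }
}
```

If the error branch returned 0 instead of `r`, a caller could not tell a truncated upload from a completed one, which matches the symptom described above.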
We don't want it binding to whatever willy-nilly, and as an OSD even
its "client" traffic should go on the cluster address.
Signed-off-by: Greg Farnum <greg@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>