list_plain_entries() was using encode_obj_versioned_data_key() to set
its end_key, which gives end_key a prefix of
BI_BUCKET_OBJ_INSTANCE_INDEX[=2]. The range between start_key and
end_key would then span not only the BI_BUCKET_OBJS_INDEX[=0] prefixes,
but the BI_BUCKET_LOG_INDEX[=1] prefixes as well. This can result in
list_plain_entries() trying, and failing, to decode a rgw_bi_log_entry
as a rgw_bucket_dir_entry.
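For illustration, a minimal self-contained sketch of the ordering
problem (the byte values and key bodies are stand-ins, not the real
cls_rgw key encoding):

  #include <cassert>
  #include <string>

  int main() {
    // bucket index keys sort by a one-byte namespace prefix
    const std::string plain = std::string(1, '\0') + "obj";   // BI_BUCKET_OBJS_INDEX=0
    const std::string log   = std::string(1, '\1') + "entry"; // BI_BUCKET_LOG_INDEX=1
    const std::string end   = std::string(1, '\2');           // BI_BUCKET_OBJ_INSTANCE_INDEX=2

    // the bi log key falls inside [plain, end), so a listing bounded by
    // the instance-index prefix also returns log entries, which cannot be
    // decoded as rgw_bucket_dir_entry
    assert(plain < log && log < end);
    return 0;
  }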
Fixes: http://tracker.ceph.com/issues/19876
Signed-off-by: Casey Bodley <cbodley@redhat.com>
This works out of the box with a vstart environment:

  RGW=1 ../src/vstart.sh -n -l
  PATH=bin:$PATH ../qa/workunits/rgw/run-s3tests.sh
Signed-off-by: Sage Weil <sage@redhat.com>
If this is a synchronous read, we don't need the counter: we don't
aio_wait for sync reads. If it is an aio read, it is already included in
the aio_running count, so the counter serves no purpose there either.

I'm a bit unsure about the NVMe use of this counter; I switched it to
use num_running (pretty sure we aren't mixing reads and writes on a
single IOContext), *but* it might make more sense to switch to a private
counter.
Signed-off-by: Sage Weil <sage@redhat.com>
Thread 1 (_do_read)              Thread 2 (_aio_thread)
queues aio
ioc->aio_wait()
  locks ioc->lock
  num_waiting++
                                 finishes aios
                                 ioc->aio_wake
                                   if (num_waiting)
                                     lock ioc->lock (blocks)
  num_running == 0, continue
  ioc->unlock
                                 ioc->lock taken
do_read destroys ioc
                                 use after free, may block forever
The last bit of the race may vary; the key thing is that thread 2 is
taking a lock that thread 1 can destroy; thread 2 doesn't have it pinned
in memory.
Fix this by simplifying aio_wake() and aio_wait(). Since the wake/wait
path is mutually exclusive with a callback completion, we can avoid
calling it at all when a callback is present, and focus on keeping it
simple. Avoid the use-after-free by making sure the last decrement
happens under the lock in the aio_wake/wait case.
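For illustration, a minimal sketch of the resulting wake/wait shape
(assumed names and layout; the real IOContext carries more state). The
point is that the decrement and notify happen entirely under the lock,
so a waiter can only see num_running == 0, return, and destroy the
context after the waker is done touching it:

  #include <condition_variable>
  #include <mutex>

  struct IOContext {
    std::mutex lock;
    std::condition_variable cond;
    int num_running = 0;  // aios in flight

    void aio_wait() {
      std::unique_lock<std::mutex> l(lock);
      cond.wait(l, [this] { return num_running == 0; });
    }

    // called by the aio completion thread for each finished aio
    void aio_wake() {
      std::lock_guard<std::mutex> l(lock);
      // last decrement under the lock: the waiter cannot wake, observe 0,
      // and free this IOContext until we release the lock below
      if (--num_running == 0)
        cond.notify_all();
    }
  };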
Signed-off-by: Sage Weil <sage@redhat.com>
We are unconditionally linking lttng-ust, which makes selinux complain.
We should either
- fix the selinux rules and unconditionally link
- dlopen at runtime based on an option (like we do for the current
tracepoints); see the sketch below
- ???
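For reference, a rough sketch of the dlopen option (purely
illustrative; the library name and error handling are assumptions):

  #include <dlfcn.h>
  #include <cstdio>

  int main() {
    // only pull in lttng-ust if tracing was requested at runtime
    void *h = dlopen("liblttng-ust.so.0", RTLD_NOW | RTLD_GLOBAL);
    if (!h) {
      std::fprintf(stderr, "lttng-ust unavailable: %s\n", dlerror());
      return 0;  // tracing disabled, but no hard link-time dependency
    }
    // ... resolve tracepoint provider symbols with dlsym() here ...
    dlclose(h);
    return 0;
  }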
Signed-off-by: Sage Weil <sage@redhat.com>
- encode the same regardless of whether it is compiled in (!)
- encode in an endian- and struct-alignment-safe way (sketched below).
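A generic sketch of what endian- and alignment-safe means here (not the
actual ceph encoder): write multi-byte values one byte at a time in a
fixed order, instead of memcpy'ing a host-layout struct.

  #include <cstdint>
  #include <string>

  // encode a u32 as little-endian bytes; the result is identical on every
  // host, and no aligned load/store of a raw struct is ever needed
  inline void encode_le32(uint32_t v, std::string &out) {
    out.push_back(static_cast<char>(v & 0xff));
    out.push_back(static_cast<char>((v >> 8) & 0xff));
    out.push_back(static_cast<char>((v >> 16) & 0xff));
    out.push_back(static_cast<char>((v >> 24) & 0xff));
  }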
Signed-off-by: Sage Weil <sage@redhat.com>
These are protocol features and cannot vary based on our compilation.
Encode and decode unconditionally. The callers have already guarded these
field additions with a message version bump and are conditionally calling
decode_trace.
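A self-contained sketch of the pattern with generic stand-ins (not
Ceph's actual encoding machinery): the field is always written, and the
reader consumes it only when the message version says it is present.

  #include <cstdint>
  #include <string>

  struct Header { uint8_t version = 2; };

  // encode: always write the trace field; the wire format must not
  // depend on whether tracing support was compiled in
  std::string encode_payload(const std::string &trace_info) {
    return trace_info + '\n';
  }

  // decode: the new field is guarded by the message version bump, so
  // old peers (version < 2) simply never reach the trace decode
  void decode_payload(const Header &h, const std::string &bl,
                      std::string &trace_info) {
    if (h.version >= 2)
      trace_info = bl.substr(0, bl.find('\n'));  // stand-in for decode_trace()
  }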
Signed-off-by: Sage Weil <sage@redhat.com>
when set, Message::decode_trace() will always create a trace for
incoming messages, even if they didn't pass trace information
Signed-off-by: Casey Bodley <cbodley@redhat.com>
zipkin_trace.h is a wrapper around ztracer.hpp, which provides a stub
implementation when WITH_BLKIN is not defined
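Roughly the usual compile-time stub pattern; a simplified sketch (the
stub interface shown here is an assumption, not the actual contents of
zipkin_trace.h):

  #ifdef WITH_BLKIN
  #include "ztracer.hpp"       // the real blkin tracer
  #else
  // no-op stand-in with the same call sites, so callers need no #ifdefs
  struct Trace {
    void event(const char *) {}
    bool valid() const { return false; }
  };
  #endif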
Signed-off-by: Casey Bodley <cbodley@redhat.com>
This lets us run multiple cleanup steps right before ceph
teardown.
Note that we drop the facet from multimon/ because it doesn't properly
factor out cluster creation before this step. That's fine because the
require_luminous cleanup shouldn't be related to the multimon tests.
Signed-off-by: Sage Weil <sage@redhat.com>