When proxying a write/cache op, if we decide not to promote the
object, we need to purge it from the object_contexts cache. Otherwise, it
causes problems for later ops on this object.
Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
In a scenario like below:
- A rollback op comes in, and is enqueued.
- Several other ops on the same object come in, and are enqueued.
- The rollback op dispatches and finds that the object it rolls back to is
degraded, so the op is pushed back onto a list to wait for the degraded
object to recover.
- The later ops are handled and their replies are sent to the client.
- The degraded object recovers. The rollback op is enqueued again and its
reply is finally sent to the client.
This breaks op ordering. We need to set the blocked_by relationship so that
the later ops stay queued until the degraded object recovers.
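
The ordering fix above can be illustrated with a minimal sketch. This is a
simplified model, not the OSD code: OrderedDispatcher, Op, block(), handle()
and recovered() are hypothetical names, and only the blocked_by idea comes
from the description.

```cpp
#include <deque>
#include <map>
#include <string>
#include <vector>

// Hypothetical op: an id plus the object it targets.
struct Op {
  int id;
  std::string oid;
};

// Hypothetical model: blocked_by_ maps a blocked object to the degraded
// object it waits on; ops on a blocked object are parked in arrival order.
class OrderedDispatcher {
 public:
  // Record that ops on `oid` must wait until `degraded_oid` recovers.
  void block(const std::string& oid, const std::string& degraded_oid) {
    blocked_by_[oid] = degraded_oid;
  }

  // Complete an op immediately, or park it if its object is blocked.
  void handle(const Op& op) {
    if (blocked_by_.count(op.oid))
      waiting_[op.oid].push_back(op);
    else
      completed_.push_back(op.id);
  }

  // The degraded object recovered: release parked ops in arrival order.
  void recovered(const std::string& degraded_oid) {
    for (auto it = blocked_by_.begin(); it != blocked_by_.end();) {
      if (it->second != degraded_oid) { ++it; continue; }
      for (const Op& op : waiting_[it->first]) completed_.push_back(op.id);
      waiting_.erase(it->first);
      it = blocked_by_.erase(it);
    }
  }

  const std::vector<int>& completed() const { return completed_; }

 private:
  std::map<std::string, std::string> blocked_by_;
  std::map<std::string, std::deque<Op>> waiting_;
  std::vector<int> completed_;
};
```

In this model the rollback op and the ops that arrive after it are parked on
the same queue, so they complete in arrival order once the degraded object
recovers.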
Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
When doing object overwrites on the ec base pool, the write op can't be
proxied. Always force promotion in this case.
Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
This is needed as in the following scenario:
- Client sends 3 writes and a read on the same object to base tier
- Set up cache tiering
- Client retries ops and sends the 3 writes and 1 read to the cache tier
- The 3 writes finished on the base tier, say with versions v1, v2 and v3
- Cache tier proxies the 1st write and starts to promote the object for the 2nd
write; the 2nd and 3rd writes and the read are blocked
- The proxied 1st write finishes on the base tier with version v4 and returns
to the cache tier, but the cache tier fails to send the reply because of an
injected socket failure
- Client retries the writes and the read again; the writes are identified as
dup ops
- The promotion finishes; it copies the pg_log entries from the base tier and
puts them in the cache tier's pg_log. This includes the 3 writes on the base tier
and the proxied write
- The writes dispatch after the promotion and are identified as completed
dup ops. Cache tier replies to these write ops with the versions from the base
tier (v1, v2 and v3)
- Finally, the read dispatches, reads the version of the proxied write
(v4) and replies to the client
- Client complains that 'racing read got wrong version'
Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
The cache tier needs to set the reqid to the original reqid from the client
when proxying the write op.
Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
The cache tier needs to set the reqid explicitly to the original reqid
from the client when proxying the write op to the base tier.
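
A minimal sketch of why the original reqid matters, assuming the base tier
detects retried ops by looking up the reqid in its log. ReqId, BaseTier and
proxy_write are hypothetical names for illustration and do not reflect the
actual OSD types.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <tuple>

// Hypothetical request id: (client name, transaction id).
struct ReqId {
  std::string client;
  std::uint64_t tid;
  bool operator<(const ReqId& o) const {
    return std::tie(client, tid) < std::tie(o.client, o.tid);
  }
};

// Hypothetical base tier that logs completed writes by reqid.
class BaseTier {
 public:
  // Apply a write once; a retry with the same reqid is treated as a dup op
  // and gets the version that was logged the first time.
  std::uint64_t write(const ReqId& reqid) {
    auto it = log_.find(reqid);
    if (it != log_.end()) return it->second;
    return log_[reqid] = ++version_;
  }

 private:
  std::map<ReqId, std::uint64_t> log_;
  std::uint64_t version_ = 0;
};

// The cache tier forwards the write under the client's reqid rather than
// one of its own, so the base tier can still recognize a retried op.
std::uint64_t proxy_write(BaseTier& base, const ReqId& client_reqid) {
  return base.write(client_reqid);
}
```

If the cache tier used a fresh reqid for each proxied write, the base tier in
this model would apply a retried write a second time instead of recognizing
it as a dup.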
Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
For older versions that don't support proxy write, promote the object.
Otherwise, we can proxy the write.
Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
When a proxy write comes back from the base tier, the write op may or may not
sit at the front of the list.
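
A minimal sketch of the consequence: the finished op has to be looked up in
the list rather than popped from the front. The text does not name the list
or its element type, so finish_op and the std::list<int> of op ids are
assumptions for illustration.

```cpp
#include <algorithm>
#include <list>

// Remove a finished op from the in-progress list wherever it sits.
// Returns false if the op is not in the list.
bool finish_op(std::list<int>& in_progress, int op_id) {
  auto it = std::find(in_progress.begin(), in_progress.end(), op_id);
  if (it == in_progress.end()) return false;
  in_progress.erase(it);
  return true;
}
```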
Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
This function is used to check if we need to initiate a promotion in
maybe_handle_cache.
Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
Conflicts:
src/osd/ReplicatedPG.cc
If min_write_recency_for_promote is
- 0: Promote when there is a write.
- 1: Check if the object is in the current hit set. Promote if yes.
- else: Check if the object is in the current and other in-memory hit sets.
Promote if yes.
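
The three cases can be sketched as a single decision function. This is a
simplified model: HitSet is a hypothetical stand-in (a plain set of object
names), should_promote_on_write is not the real function name, and the sketch
assumes that for values above 1 a hit in either the current hit set or any
other in-memory hit set is enough.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a hit set: object names seen in one period.
using HitSet = std::set<std::string>;

// Decide whether a write should trigger promotion.
// `current` is the hit set being filled; `in_memory` holds the other
// hit sets that are still kept in memory.
bool should_promote_on_write(const std::string& oid,
                             unsigned min_write_recency_for_promote,
                             const HitSet& current,
                             const std::vector<HitSet>& in_memory) {
  if (min_write_recency_for_promote == 0)
    return true;                        // 0: promote on any write
  if (current.count(oid))
    return true;                        // hit in the current hit set
  if (min_write_recency_for_promote == 1)
    return false;                       // 1: only the current hit set counts
  for (const HitSet& hs : in_memory)    // else: also check the other hit sets
    if (hs.count(oid))
      return true;
  return false;
}
```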
Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
Conflicts:
src/osd/ReplicatedPG.cc
Set the RWORDERED flag when doing the promote copy-from op. This is in case
there are proxy writes in flight.
Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
Conflicts:
src/osd/ReplicatedPG.cc
When moving to proxy write, this assertion doesn't hold any more.
Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
Conflicts:
src/osd/ReplicatedPG.cc
9db80da128 added an unconditional call to
restorecon after mounting the filesystem. It fails when restorecon is
not available and must be made conditional.
http://tracker.ceph.com/issues/12718
Fixes: #12718
Signed-off-by: Loic Dachary <ldachary@redhat.com>
On some test machines, /usr/lib/ltp/testcases/bin/fsstress is a
dangling symlink. 'cp -f' is ineffective in this case.
Fixes: #12710
Signed-off-by: Yan, Zheng <zyan@redhat.com>
rbd: improve the error handling of rbd; check the return value.
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Why do indirect dependencies seem to work randomly:
undefined symbol: _ZN4ceph3log3Log12create_entryEii
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
Total time run: 12.279167
Total writes made: 92
Write size: 4194304
Bandwidth (MB/sec): 30
Stddev Bandwidth: 23.4
Max bandwidth (MB/sec): 64
Min bandwidth (MB/sec): 2
Average IOPS: 7
Stddev IOPS: 6
Max IOPS: 32767
Min IOPS: -1537890352
Average Latency: 2.12
Stddev Latency: 1.35
Max latency: 6.05
Min latency: 0.501
Signed-off-by: xinxin shu <xinxin.shu@intel.com>