43553 Commits

Author SHA1 Message Date
Zhiqiang Wang
da68bb371f osd: purge the object from the cache when proxying and not promoting the op
When proxying a write/cache op, if the object is not going to be promoted,
it must be purged from the object_contexts cache. Otherwise, it
causes problems for later ops on this object.

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
2015-08-18 11:25:24 -07:00
Zhiqiang Wang
aff8aa1a7b osd: set the blocked_by relationship when rolling back to a degraded
object

In a scenario like the one below:
- A rollback op comes in, and is enqueued.
- Several other ops on the same object come in, and are enqueued.
- The rollback op dispatches, and finds that the object it rolls back to is
degraded, so this op is pushed back onto a list to wait for the degraded
object to recover.
- The later ops are handled and replied to the client.
- The degraded object recovers. The rollback op is enqueued again and finally
replied to the client.
This breaks the op order. The blocked_by relationship must be set so that
the later ops stay queued until the degraded object recovers.

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
2015-08-18 11:25:23 -07:00
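
The ordering problem described in aff8aa1a7b can be illustrated with a toy model. This is a minimal sketch, not the code from the commit: `ObjectOpQueue`, `dispatch`, and `on_recovered` are hypothetical names, and a single `blocked` flag stands in for the blocked_by relationship between object contexts.

```cpp
#include <deque>
#include <string>
#include <vector>

// Hypothetical per-object queue; it only models reply ordering.
struct ObjectOpQueue {
  std::deque<std::string> waiting;   // ops held back, in arrival order
  std::vector<std::string> replied;  // order in which the client got replies
  bool blocked = false;              // true while an earlier op waits on recovery

  // Dispatch an op. `needs_degraded` marks an op (such as a rollback)
  // whose target object is still degraded.
  void dispatch(const std::string& op, bool needs_degraded) {
    if (blocked || needs_degraded) {
      // Either this op must wait for recovery, or an earlier op is
      // already waiting and this one must not overtake it.
      blocked = true;
      waiting.push_back(op);
      return;
    }
    replied.push_back(op);
  }

  // Called once the degraded object has recovered: drain in arrival order.
  void on_recovered() {
    blocked = false;
    while (!waiting.empty()) {
      replied.push_back(waiting.front());
      waiting.pop_front();
    }
  }
};
```

With the `blocked` check in place, ops that arrive after a waiting rollback are replied to only after the rollback. Without it, they would be answered first, which is the reordering the commit message describes.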
Zhiqiang Wang
fa5b6e6d5e osd: skip promotion when proxying a delete op
When the object is deleted, there is no need to promote it.

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
2015-08-18 11:25:23 -07:00
Zhiqiang Wang
98fff96b8f osd: rename SKIP_PROMOTE to SKIP_HANDLE_CACHE
To match what it really means.

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
2015-08-18 11:25:23 -07:00
Zhiqiang Wang
626c569200 osd: force promote for object overwrites on a ec base pool
When doing object overwrites on the ec base pool, the write op can't be
proxied. Always force promotion in this case.

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
2015-08-18 11:25:23 -07:00
Zhiqiang Wang
bf5d51e288 osd: explicitly set the reqid when proxying the write op
This is needed in the following scenario:
- Client sends 3 writes and a read on the same object to the base tier
- Set up cache tiering
- Client retries the ops and sends the 3 writes and 1 read to the cache tier
- The 3 writes finish on the base tier, say with versions v1, v2 and v3
- Cache tier proxies the 1st write and starts to promote the object for the 2nd
write; the 2nd and 3rd writes and the read are blocked
- The proxied 1st write finishes on the base tier with version v4 and returns
to the cache tier, but the cache tier fails to send the reply because of
socket failure injection
- Client retries the writes and the read again; the writes are identified as
dup ops
- The promotion finishes; it copies the pg_log entries from the base tier and
puts them in the cache tier's pg_log. This includes the 3 writes on the base tier
and the proxied write
- The writes dispatch after the promotion and are identified as completed
dup ops. Cache tier replies to these write ops with the versions from the base tier
(v1, v2 and v3)
- Finally, the read dispatches; it reads the version of the proxied write
(v4) and replies to the client
- Client complains that 'racing read got wrong version'

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
2015-08-18 11:25:23 -07:00
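
One reading of the scenario in bf5d51e288 is that the proxied write reaches the base tier under a request id the base tier has not seen, so it is applied again and produces v4. The sketch below models that with a toy log keyed by request id. `BaseTierLog` and the reqid strings are hypothetical and are not Ceph types; the model covers only dup detection, not pg_log copying or promotion.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Toy model of one object's log on the base tier (hypothetical type).
struct BaseTierLog {
  std::map<std::string, std::uint64_t> completed;  // reqid -> version produced
  std::uint64_t head = 0;                          // latest object version

  // Apply a write. A reqid already in the log is a dup: return the
  // version recorded for it and leave the object unchanged.
  std::uint64_t write(const std::string& reqid) {
    auto it = completed.find(reqid);
    if (it != completed.end()) {
      return it->second;
    }
    completed[reqid] = ++head;
    return head;
  }
};
```

In this model, replaying the first client write under a new reqid bumps the object to version 4, while replaying it under the original client reqid is detected as a dup, returns version 1, and leaves the object at version 3.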
Zhiqiang Wang
ff658bd8ea Objecter: optionally setting the reqid in the mutate interface
The cache tier needs to set the reqid to the original reqid from client
when proxying the write op.

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
2015-08-18 11:25:23 -07:00
Zhiqiang Wang
58dd21c094 osd: add reqid in MOSDOp
The cache tier needs to set the reqid explicitly to the original reqid
from the client when proxying the write op to the base tier.

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
2015-08-18 11:25:23 -07:00
Zhiqiang Wang
0d7759f3e9 osd: turn on proxy write feature bit by default
Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
2015-08-18 11:25:22 -07:00
Zhiqiang Wang
b9ec7e64b7 osd: add proxy write perf counter
Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
2015-08-18 11:25:22 -07:00
Zhiqiang Wang
cb9390dc6f osd/ReplicatedPG: add the proxy write feature bit support
For older versions that don't support proxy write, promote instead.
Otherwise, we can proxy the write.

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
2015-08-18 11:25:22 -07:00
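
The fallback in cb9390dc6f can be sketched as a feature-mask check. This is an illustration under assumptions: `FEATURE_PROXY_WRITE` is a made-up bit position, `choose_write_action` is a hypothetical helper, and the commit message does not say which peers' features are consulted.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical bit position, chosen only for this example.
constexpr std::uint64_t FEATURE_PROXY_WRITE = 1ULL << 0;

enum class WriteAction { Proxy, Promote };

// Proxy only if every relevant peer advertises the feature bit;
// otherwise fall back to promoting the object.
// An empty peer list yields Proxy in this sketch.
WriteAction choose_write_action(const std::vector<std::uint64_t>& peer_features) {
  for (std::uint64_t features : peer_features) {
    if ((features & FEATURE_PROXY_WRITE) == 0) {
      return WriteAction::Promote;  // at least one peer is too old
    }
  }
  return WriteAction::Proxy;
}
```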
Zhiqiang Wang
ab39e03d59 osd/ReplicatedPG: don't check order in finish_proxy_write
When a proxy write comes back from the base tier, the write op may or may not
sit at the front of the list.

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
2015-08-18 11:25:22 -07:00
Zhiqiang Wang
7e27e61526 osd/ReplicatedPG: add helper function check_for_promote
This function is used to check if we need to initiate a promotion in
maybe_handle_cache.

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>

Conflicts:
	src/osd/ReplicatedPG.cc
2015-08-18 11:25:22 -07:00
Zhiqiang Wang
d836a645a8 osd/ReplicatedPG: minor updates on proxy write
Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
2015-08-18 11:25:22 -07:00
Zhiqiang Wang
257c851ec3 mon: add osd pool set/get for min_write_recency_for_promote
Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>

Conflicts:
	src/mon/MonCommands.h
	src/mon/OSDMonitor.cc
2015-08-18 11:25:21 -07:00
Zhiqiang Wang
3d5300bde1 osd/ReplicatedPG: promote on 2nd write
If min_write_recency_for_promote is
- 0: Promote when there is a write.
- 1: Check if the object is in the current hit set. Promote if yes.
- else: Check if the object is in the current or the other in-memory hit sets.
Promote if yes.

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>

Conflicts:
	src/osd/ReplicatedPG.cc
2015-08-18 11:25:21 -07:00
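
The three cases listed in 3d5300bde1 can be written as a small decision function. This is a sketch under stated assumptions: `HitSet` is a plain set of object names standing in for the real hit set type, `should_promote_on_write` is a hypothetical name, and the "else" case is read as "present in any of the current or other in-memory hit sets".

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a hit set: names of objects seen in one period.
using HitSet = std::set<std::string>;

// Decide whether a write to `oid` should trigger promotion.
// `current` is the hit set being filled now; `in_memory` holds the
// other hit sets that are still kept in memory.
bool should_promote_on_write(unsigned min_write_recency_for_promote,
                             const std::string& oid,
                             const HitSet& current,
                             const std::vector<HitSet>& in_memory) {
  if (min_write_recency_for_promote == 0) {
    return true;  // 0: promote on any write
  }
  if (current.count(oid) > 0) {
    return true;  // 1 and above: a hit in the current set is enough
  }
  if (min_write_recency_for_promote == 1) {
    return false;  // 1: only the current hit set is consulted
  }
  // else: also consult the other in-memory hit sets
  for (const HitSet& hs : in_memory) {
    if (hs.count(oid) > 0) {
      return true;
    }
  }
  return false;
}
```

With a value of 1, the first write to a cold object is not promoted, but a second write within the same hit set period is, which matches the "promote on 2nd write" subject line.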
Zhiqiang Wang
1099ec23f0 osd/osd_types: add min_write_recency_for_promote in pg_pool_t
This field stands for the minimum number of hit sets to check before
promoting on write.

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
2015-08-18 11:25:21 -07:00
Zhiqiang Wang
c01d20b60d osd/ReplicatedPG: set the RWORDERED flag for the promote copy-from op
Set the RWORDERED flag when doing the promote copy-from op. This is in case
there are proxy writes in flight.

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>

Conflicts:
	src/osd/ReplicatedPG.cc
2015-08-18 11:25:21 -07:00
Zhiqiang Wang
90e5f410ee osd: tiering: use proxy write in writeback mode
Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>

Conflicts:
	src/osd/ReplicatedPG.cc
2015-08-18 11:25:17 -07:00
Zhiqiang Wang
772617bf92 osd/ReplicatedPG: remove the peer_type assertion in eval_repop
When moving to proxy write, this assertion doesn't hold any more.

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>

Conflicts:
	src/osd/ReplicatedPG.cc
2015-08-18 11:25:17 -07:00
Zhiqiang Wang
f8b3a4098a osd: tiering: add proxy write support
Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>

Conflicts:
	src/osd/ReplicatedPG.cc
2015-08-18 11:25:17 -07:00
Sage Weil
f2cbbdd7a9 Merge remote-tracking branch 'gh/next'
2015-08-18 11:52:53 -04:00
branto1
7e02fb1f52 Merge pull request #5597 from dachary/wip-12718-restorecon
ceph-disk: only call restorecon when available

Reviewed-by: Boris Ranto <branto@redhat.com>
2015-08-18 15:05:02 +02:00
Loic Dachary
3aab146bb7 ceph-disk: only call restorecon when available
9db80da128 added an unconditional call to
restorecon after mounting the filesystem. It fails when restorecon is
not available and must be made conditional.

http://tracker.ceph.com/issues/12718 Fixes: #12718

Signed-off-by: Loic Dachary <ldachary@redhat.com>
2015-08-18 14:43:15 +02:00
Gregory Farnum
4751936548 Merge pull request #5595 from ceph/wip-12710
qa/fsstress.sh: fix 'cp not writing through dangling symlink'

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
2015-08-18 13:07:09 +01:00
Gregory Farnum
3cfb7e4ccc Merge pull request #5594 from ceph/wip-12711
mds: properly set client incarnation

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
2015-08-18 13:01:17 +01:00
Yan, Zheng
479f2a760b qa/fsstress.sh: fix 'cp not writing through dangling symlink'
On some test machines, /usr/lib/ltp/testcases/bin/fsstress is a
dangling symlink. 'cp -f' is ineffective in this case.

Fixes: #12710
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-08-18 15:26:34 +08:00
Yan, Zheng
6387ec9f6f mds: properly set client incarnation
Fixes: #12711
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-08-18 14:57:04 +08:00
Samuel Just
c3de0affcb Merge pull request #5593 from ceph/revert-4927-snapset-obc
Revert "osd/ReplicatedPG: snapset is not persisted"

Reviewed-by: Samuel Just <sjust@redhat.com>
2015-08-17 12:59:18 -07:00
Samuel Just
0ba2e145d0 Revert "osd/ReplicatedPG: snapset is not persisted"
2015-08-17 12:58:58 -07:00
Yehuda Sadeh
989820a908 Merge pull request #5577 from oritwas/wip-next-12363
rgw: we should not overide Swift sent content type

Reviewed-by: Yehuda Sadeh <yehuda@redhat.com>
2015-08-17 11:19:23 -07:00
Jason Dillaman
d43d6844ec Merge pull request #5558 from s09816/rbd-fix
rbd:improve the error handle of rbd,check the return value.

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2015-08-17 11:42:51 -04:00
Noah Watkins
a5e0d8a388 Merge pull request #5586 from ceph/wip-jni-loader
wip-jni-loader

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
2015-08-17 07:00:39 -06:00
Kefu Chai
7cf4b3a89c Merge pull request #5588 from zhouyuan/isal_yasm_fix
configure: Fix checking for yasm compatibility

Reviewed-by: Kefu Chai <kchai@redhat.com>
2015-08-17 13:20:34 +08:00
Yuan Zhou
0bb57f105f configure: Fix checking for yasm compability
Fix typo when checking yasm

Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
2015-08-17 10:48:14 +08:00
Noah Watkins
2743cc405a java: add libcommon to deps
Why do indirect dependencies seem to work randomly:

  undefined symbol: _ZN4ceph3log3Log12create_entryEii

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
2015-08-16 16:15:34 -06:00
Noah Watkins
5afa21d6c7 java: search for JNI bits in common dirs
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
2015-08-16 16:15:34 -06:00
Kefu Chai
128f5a2504 Merge pull request #5572 from xinxinsh/wip-rados-bench-error
fix print error of rados bench

Reviewed-by: Kefu Chai <kchai@redhat.com>
2015-08-15 11:56:22 +08:00
s09816
af0ebeeb0e rbd:improve the error handle of rbd,check the return value.
Signed-off-by: s09816 <shi.lu@h3c.com>
2015-08-14 22:32:18 -04:00
Josh Durgin
ae54a9fa22 Merge pull request #5443 from ceph/wip-wrlock
cleanup: remove all traces of rados 'lock' operations

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2015-08-14 15:50:53 -07:00
Josh Durgin
87fdba9c48 Merge remote-tracking branch 'origin/next'
2015-08-14 14:18:12 -07:00
Josh Durgin
ee909dcc7c Merge pull request #5537 from s09816/master
rbd:modify the log of purging snaps so that it is more appropriate

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2015-08-14 14:16:46 -07:00
Josh Durgin
1f2f27f172 Merge pull request #5560 from solesoul1127/master
rbd:Check the dest image name, if it is empty string, return error and give a message

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2015-08-14 14:03:15 -07:00
xinxin shu
4eaa9ea1c9 fix print error of rados bench
Total time run:         12.279167
Total writes made:      92
Write size:             4194304
Bandwidth (MB/sec):     30
Stddev Bandwidth:       23.4
Max bandwidth (MB/sec): 64
Min bandwidth (MB/sec): 2
Average IOPS:           7
Stddev IOPS:            6
Max IOPS:               32767
Min IOPS:               -1537890352
Average Latency:        2.12
Stddev Latency:         1.35
Max latency:            6.05
Min latency:            0.501

Signed-off-by: xinxin shu <xinxin.shu@intel.com>
2015-08-14 22:22:48 +08:00
Yan, Zheng
c577ea2eee Merge pull request #5569 from ceph/wip-unused-var
client: fix unused var warning
2015-08-14 15:45:47 +08:00
s09816
7f32a3de78 rbd:modify the log of purging snaps so that it is more appropriate.
Signed-off-by: s09816 <shi.lu@h3c.com>
2015-08-13 23:24:55 -04:00
Josh Durgin
7083ed5edf Merge pull request #4744 from ceph/wip-11625
librbd: diff_iterate should issue concurrent ops

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2015-08-13 13:04:19 -07:00
Loic Dachary
84007cf346 Merge pull request #5555 from dachary/wip-mon-test-timeouts
tests: be more generous with mon tests timeouts

Reviewed-by: Kefu Chai <kchai@redhat.com>
2015-08-13 21:11:03 +02:00
Sage Weil
c229f7f626 Merge pull request #5576 from liewegas/wip-dencoder
simplify handling for objects w/ nondeterministic encoding

Reviewed-by: John Spray <john.spray@redhat.com>
2015-08-13 15:03:34 -04:00
Orit Wasserman
423cf136f1 rgw: we should not overide Swift sent content type
Fixes: #12363
backport: hammer

Signed-off-by: Orit Wasserman <owasserm@redhat.com>
2015-08-13 20:50:53 +02:00