Commit Graph

29635 Commits

Author SHA1 Message Date
Sage Weil
028bb0d1e2 osd/osd_types: include num_objects_dirty, num_whiteouts in object_stat_sum_t
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-19 16:40:01 -08:00
Sage Weil
1403798327 osd/ReplicatedPG: EBUSY on cache-evict when watchers are present
Linger operations will follow the object to the cache pool when the pool
overlay process is set.  If we evict the object, the object_info_t will
go away along with the watch state and confusing things will happen.
Prevent that from happening by returning EBUSY when you try to evict a
watched object.

Note that you *can* flush a watched object, and the dirty flag will be
cleared.  But you still can't evict it.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-19 16:40:01 -08:00
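The rule described in commit 1403798327 can be sketched as a small Python model. This is only an illustration, not Ceph's C++ code: the `CachePool` class and its methods are hypothetical names, and the only behavior taken from the commit message is that evicting a watched object returns EBUSY while flushing it is still allowed and clears the dirty flag.

```python
import errno


class CachePool:
    """Toy cache tier: tracks dirty state and watchers per object (hypothetical)."""

    def __init__(self):
        self.dirty = {}     # object name -> bool
        self.watchers = {}  # object name -> set of watcher ids

    def write(self, name):
        # A write leaves the cached object dirty.
        self.dirty[name] = True
        self.watchers.setdefault(name, set())

    def watch(self, name, watcher_id):
        self.watchers[name].add(watcher_id)

    def unwatch(self, name, watcher_id):
        self.watchers[name].discard(watcher_id)

    def flush(self, name):
        # Flushing is allowed even while the object is watched.
        self.dirty[name] = False
        return 0

    def evict(self, name):
        # Evicting would drop the watch state, so refuse while watched.
        if self.watchers[name]:
            return -errno.EBUSY
        del self.dirty[name]
        del self.watchers[name]
        return 0
```

In this model a watched object can be flushed and then stays in the cache; eviction succeeds only after the last watcher is gone.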
Sage Weil
9ed6679a94 ceph_test_rados: test cache_flush, cache_try_flush, cache_evict
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-19 16:40:00 -08:00
Sage Weil
1dcbb663d8 ceph_test_rados_api_tier: fix HitSet* test names
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-19 16:40:00 -08:00
Sage Weil
99cee55c8a osd/osd_types: debug: include size in object_info_t operator<<
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-19 16:40:00 -08:00
Sage Weil
1af6723cdc osd/ReplicatedPG: debug: clean up oi printout
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-19 16:40:00 -08:00
Sage Weil
0d2d6a5faa osd/ReplicatedPG: debug: add an assert for copy-get
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-19 16:40:00 -08:00
Sage Weil
2fda4c0161 osd/ReplicatedPG: fix locking for promote
After we get the copy-from data and unblock the obc, we still need to take
the RWWRITE lock on the object for the duration of the repop while we
actually apply the change locally.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-19 16:40:00 -08:00
Sage Weil
927b0e60c2 osd/ReplicatedPG: fix user_version preservation for copy_from
In the process of fixing this for flush, we break promote, so we need to
adjust them both here.  Basic strategy: do not set user_modify, but handle
the user_version explicitly in the callbacks.

For copy_from, we don't have a clean way to pass the result through to
finish_copyfrom in do_osd_ops; do so by putting it in user_at_version. (If
we were to call finish_copyfrom directly from the callback this might
be simpler, but let's not go there right now.)

For promote, it is a trivial fix.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-19 16:39:59 -08:00
Sage Weil
bc05104149 osd/ReplicatedPG: handle ECANCELED in C_CopyFrom, C_Flush
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-19 16:39:59 -08:00
Sage Weil
2d5a7e2cad osd/ReplicatedPG: uninline CopyFromCallback, PromoteCallback
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-19 16:39:59 -08:00
Sage Weil
54f0c60c8b osd/osd_types: make object_info_t::dump() dump user_version
Backport: emperor
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-19 16:39:59 -08:00
Sage Weil
ba2f9e2996 osd/osd_types: include user_version in operator<< object_info_t
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-19 16:39:59 -08:00
Sage Weil
ffdaa5f415 vstart.sh: --cache <pool> to set up pool cache(s) on startup
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-19 16:39:59 -08:00
Sage Weil
57e91455be qa/workunits/rados/test_cache_pool.sh: fixes
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-19 16:39:58 -08:00
Sage Weil
1bde88f87c qa/workunits/rados: rename cache pool tests
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-19 16:39:58 -08:00
Sage Weil
ea519b48c0 qa/workunits/rados: test cache-{flush,evict,flush-evict-all}
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-19 16:39:58 -08:00
Sage Weil
71cd4a2278 rados: add cache-flush, cache-evict, cache-flush-evict-all commands
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-19 16:39:58 -08:00
Sage Weil
ad3b46665f osd/ReplicatedPG: implement cache-flush, cache-try-flush
Implement a rados operation that will flush a dirty object in the cache
tier by writing it back to the base tier.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-19 16:39:58 -08:00
Sage Weil
e6ad4d4a44 osd: make obc copyfrom blocking generic
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-18 11:30:18 -08:00
Sage Weil
8dec2b2735 librados, osd: add flags to COPY_FROM
If we initiate a COPY_FROM as part of a FLUSH operation, we will need to
set a flag so that the read side of the copy can join the existing
in-progress operation without taking additional locks.

Similarly, we need to pass flags from the client indicating whether we
should ignore overlay or cache logic while performing the copy.  These are
used by the promote and flush logic.

Note that none of these flags are exposed through librados (at least not
at this time).

Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-18 11:26:51 -08:00
Sage Weil
0dc59af993 osd/ReplicatedPG: fix promote: set oi.size
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-13 16:35:57 -08:00
Sage Weil
697151eaec osd/osd_types: fix operator<< on copy-get operation
This was missed in 15c8267e34.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-13 16:35:57 -08:00
Sage Weil
f50389d741 ceph_test_rados_api_tier: test undirty on non-existent object
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-13 16:35:57 -08:00
Sage Weil
f86d6e7794 osd/ReplicatedPG: debug: improve maybe_handle_cache() handling
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-13 16:35:57 -08:00
Sage Weil
81279e3bb6 osd/ReplicatedPG: rename invalidate_forward
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-13 16:35:57 -08:00
Sage Weil
87547bde9a ceph_test_rados: debug: include exists|dne in update_object_version
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-13 16:35:57 -08:00
Sage Weil
d1e63b3cfe ceph_test_rados: test is_dirty, undirty
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-13 16:35:57 -08:00
Sage Weil
14f76cc264 ceph_test_rados: fix CopyFromOp locking
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-13 16:35:56 -08:00
Sage Weil
41be4feb35 librados: seek during object iteration
Add ability to reset iterator to a specific hash position.  For now, we
just truncate this to the current PG.  In the future, this may be more
precise.

Signed-off-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Farnum <greg@inktank.com>
2013-12-13 16:35:56 -08:00
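The "truncate to the current PG" behavior in commit 41be4feb35 can be pictured with a toy iterator. This is a sketch under stated assumptions, not librados code: `ObjectIterator` and `seek` are hypothetical names, and the mapping from hash to placement group is assumed to be a simple bit mask over a power-of-two PG count, which is simpler than what Ceph actually does.

```python
class ObjectIterator:
    """Toy object iterator that can only be positioned at PG granularity."""

    def __init__(self, pg_num):
        # Assumption: pg_num is a power of two so a bit mask selects the PG.
        if pg_num <= 0 or pg_num & (pg_num - 1):
            raise ValueError("pg_num must be a power of two in this sketch")
        self.pg_num = pg_num
        self.current_pg = 0

    def seek(self, hash_pos):
        # The requested hash position is truncated to the PG that contains it;
        # iteration would resume from the start of that PG.
        self.current_pg = hash_pos & (self.pg_num - 1)
        return self.current_pg
```

Two different hash positions that fall in the same PG land on the same iterator position here, which is the imprecision the commit message says may be improved later.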
Sage Weil
330a13059f osdc/Objecter: remove honor_cache_redirects global flag
We can do this on a per-op basis with CEPH_OSD_FLAG_IGNORE_OVERLAY.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-13 16:35:56 -08:00
Sage Weil
42d6af1c30 osd/ReplicatedPG: use IGNORE_OVERLAY flag for copy-from
No need to use the Objecter-wide setting now.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-13 16:35:56 -08:00
Sage Weil
067536c2c6 osdc/Objecter: add CEPH_OSD_FLAG_IGNORE_OVERLAY flag
If the flag is set, send the op to the pool specified and ignore the
overlay.  Note that this obsoletes the global Objecter flag.

It also makes these correctly return EINVAL:

  rados -p base cache-flush
  rados -p base cache-evict

Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-13 16:35:56 -08:00
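The per-op targeting described in commit 067536c2c6 can be sketched roughly as follows. This is a toy model, not the Objecter: `Pool`, `target_pool`, and `cache_flush` are hypothetical names, the flag value is a placeholder rather than Ceph's real bit, and the EINVAL result assumes that a cache operation sent to a pool that is not a cache tier is rejected.

```python
import errno

IGNORE_OVERLAY = 0x1  # placeholder bit for this sketch, not Ceph's real value


class Pool:
    def __init__(self, name, overlay=None, is_cache=False):
        self.name = name
        self.overlay = overlay    # name of the overlay (cache) pool, if any
        self.is_cache = is_cache


def target_pool(pools, name, flags=0):
    """Pick the pool an op is sent to, honoring the overlay unless told not to."""
    pool = pools[name]
    if (flags & IGNORE_OVERLAY) or pool.overlay is None:
        return pool
    return pools[pool.overlay]


def cache_flush(pools, name, flags=0):
    # Assumption: a cache op only makes sense on a cache pool.
    pool = target_pool(pools, name, flags)
    if not pool.is_cache:
        return -errno.EINVAL
    return 0
```

With the flag set, an op addressed to the base pool stays on the base pool, so a cache-flush aimed there fails with EINVAL instead of being silently redirected to the cache.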
Sage Weil
3d9c49974e osd: rename IGNORE_OVERLAY -> IGNORE_CACHE
This is about skipping cache logic, not the tier pool overlay property.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-13 16:35:56 -08:00
Sage Weil
ea088fae6a osd/osd_types: operator<< for ObjectContext::RWState
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-13 16:35:56 -08:00
Sage Weil
c0e4ed3489 osd/ReplicatedPG: more verbose heading for process_copy_chunk
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-13 16:35:56 -08:00
Sage Weil
90eb1ec1e0 osd/ReplicatedPG: set ctx->obc in simple_repop_create
Strangely nobody has needed this yet, but we will shortly.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-13 16:35:55 -08:00
Sage Weil
ca86656e74 osd/ReplicatedPG: use finish_ctx for finish_promote
Use the common code here to avoid duplicating this logic.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-13 16:35:55 -08:00
Sage Weil
66263bb6ff osd/ReplicatedPG: use get_next_version() in finish_promote
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-13 16:35:55 -08:00
Sage Weil
56ad14ec1f osd/ReplicatedPG: split off finish_ctx from execute_ctx
The second part of execute_ctx() is doing some somewhat generic work to
make the prepared updates in the ctx apply, updating the obc's cached
values.  Factor it out.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-13 16:35:55 -08:00
Sage Weil
3ef731068c osd/ReplicatedPG: add SKIPRWLOCKS flag
Flush puts us in a conundrum:

 - the flush eventually writes, behaving like a write
 - writes take the write lock at the start
 - to flush, we send copy-from to the base pool, which does a copy-get on
   our object
 - the copy-get is a read, which blocks on the write.

This flag will allow an op to skip the initial locking step.  It will need
to take it later, of course.

Signed-off-by: Sage Weil <sage@inktank.com>

Conflicts:

	src/osd/ReplicatedPG.cc
2013-12-13 16:35:55 -08:00
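The conundrum in commit 3ef731068c can be reproduced with a toy reader/writer lock. This is a minimal sketch, not the ObjectContext locking code: `RWState` here is a simplified stand-in, and `start_op` with its `skip_rwlocks` parameter is a hypothetical helper that models only the "skip the initial locking step" behavior from the commit message.

```python
class RWState:
    """Simplified per-object lock: many readers or one writer."""

    def __init__(self):
        self.readers = 0
        self.writer = False

    def try_read(self):
        if self.writer:
            return False
        self.readers += 1
        return True

    def try_write(self):
        if self.writer or self.readers:
            return False
        self.writer = True
        return True

    def put_read(self):
        self.readers -= 1

    def put_write(self):
        self.writer = False


def start_op(state, is_write, skip_rwlocks=False):
    """Return True if the op may proceed past the initial locking step."""
    if skip_rwlocks:
        # The op proceeds without a lock and must take it later.
        return True
    return state.try_write() if is_write else state.try_read()
```

Without the flag, the flush takes the write lock first and the copy-get read it triggers cannot proceed. With the flag, the flush starts unlocked, the copy-get reads, and the flush can take the write lock afterwards.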
Sage Weil
5e547f8772 osd/ReplicatedPG: be consistent about ctx->obs vs ctx->obc->obs
Just for consistency (ctx->obs == &ctx->obc->obs).

Signed-off-by: Sage Weil <sage@inktank.com>

Conflicts:

	src/osd/ReplicatedPG.cc
2013-12-13 16:35:55 -08:00
Sage Weil
36bbcf8e55 osd/ReplicatedPG: drop unnecessary temp vars in execute_ctx()
Both of these are pulled out of ctx->obs, which is not updated until the
very end; use that instead!

Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-13 16:35:55 -08:00
Sage Weil
10c9be3401 osd/ReplicatedPG: allow osds to issue writes to osds
We asserted that the client was not an OSD years ago when we separated out
the client and cluster networks.  Now, we are about to allow an OSD to
trigger a copy_from on another pool (for cache flush) and the assert can
go away.  We've long since verified that the messages are going out on
the correct interfaces.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-13 16:35:55 -08:00
Sage Weil
20d149e198 osd/ReplicatedPG: maybe_handle_cache style
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-13 16:35:54 -08:00
Sage Weil
0b81ff68c0 osd/ReplicatedPG: skip promote for DELETE
If an op starts with DELETE there is no need to promote the old content
from the base tier.  Note that this only works if the FAILOK flag is
set.  Otherwise, we need to know whether the object existed or not to
return either 0 or -ENOENT.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-13 16:35:54 -08:00
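The decision in commit 0b81ff68c0 reduces to a small predicate. The following is a hedged sketch rather than the ReplicatedPG logic: `needs_promote` is a hypothetical function, ops are represented as plain strings, and it covers only the case the message describes (an op list that starts with DELETE, with or without FAILOK).

```python
def needs_promote(ops, failok):
    """Decide whether the old content must be promoted from the base tier.

    ops    -- list of op names in execution order, e.g. ["delete", "write"]
    failok -- True if the FAILOK flag is set on the leading op
    """
    starts_with_delete = bool(ops) and ops[0] == "delete"
    # Without FAILOK the caller must learn whether the object existed
    # (0 vs -ENOENT), so the old content still has to be looked up.
    return not (starts_with_delete and failok)
```

For example, `needs_promote(["delete"], failok=True)` is false, while the same op list without FAILOK still requires the promote.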
Sage Weil
4c014eddbe osd/ReplicatedPG: implement cache_evict
Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-13 16:35:54 -08:00
Sage Weil
8b9b7136ba librados: add an aio_operate that takes a write and flags
Until now you could only pass flags to read operations.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-13 16:35:54 -08:00
Greg Farnum
85282319ee osd/osd_types: introduce helper for osd op flags -> string conversion
Signed-off-by: Sage Weil <sage@inktank.com>

Conflicts:

	src/osd/osd_types.h
2013-12-13 16:35:54 -08:00
Sage Weil
181cb8e83c librados, osd: add IGNORE_OVERLAY flag
Add a flag that will make the OSD bypass the cache overlay logic.  This is
needed in order to handle operations like CACHE_EVICT and CACHE_FLUSH.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-12-13 16:35:54 -08:00