Commit Graph

32574 Commits

Author SHA1 Message Date
Yan, Zheng
1bd575e223 mds: fix CInode::get_approx_dirfrag
return NULL if there is no opened dirfrag

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-03-29 02:08:13 +08:00
Yan, Zheng
a1f5c645bb mds: don't trim ambiguous import dirfrags
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-03-29 02:08:13 +08:00
Yan, Zheng
598c5f18b2 mds: trim empty non-auth dirfrags
Fragmenting a non-auth dirfrag results in several smaller dirfrags. Some
of the resulting dirfrags can be empty; these are not used to connect
to the auth subtree.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-03-29 02:08:13 +08:00
Yan, Zheng
3c6c712414 mds: trim non-auth inode with remote parents
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-03-29 02:08:13 +08:00
Yan, Zheng
e811b07e19 mds: properly journal fragment rollback
If dirfrags are subtree roots, mark the dirfragtreelock as scattered
dirty, otherwise journal the dirfragtree change.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-03-29 02:08:13 +08:00
Yan, Zheng
6a548a97f8 mds: fix CDir::WAIT_ANY_MASK
Make sure CDir::WAIT_ANY_MASK includes MDSCacheObject::WAIT_UNFREEZE.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-03-29 02:08:13 +08:00
Yan, Zheng
e535f7f2b9 mds: avoid journaling non-auth opened inode
An inode that is being exported still has the AUTH bit set while the
EExport event is being journaled.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-03-29 02:08:13 +08:00
Yan, Zheng
ffcbcdd61f mds: handle race between cache rejoin and fragmenting
MDCache::handle_cache_expire() ignores mismatched dirfrags. This is
OK during normal operation because the MDS doesn't trim a replica inode
whose dirfrags are likely being fragmented (see commit 22535340).

During recovery, the recovering MDS can receive a survivor MDS' cache
expire message before it sends cache rejoin acks. In this case,
there can still be mismatched dirfrags, but nothing prevents the
survivor MDS from trimming the inodes of these mismatched dirfrags. So
there can be unconnected dirfrags when the recovering MDS sends cache
rejoin acks.

The fix is, when a mismatched dirfrag is encountered during recovery,
to check whether the inode of the dirfrag is still replicated to the
sender MDS. If the inode is not replicated, remove the sender MDS from
the replica maps of all child dirfrags.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-03-29 02:08:13 +08:00
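
The fix described in ffcbcdd61f can be pictured with a small sketch. This is not the MDS code: `Inode`, `DirFrag`, `handle_mismatched_expire` and the `recovering` flag are hypothetical names invented for illustration, and replica maps are reduced to plain sets of MDS ranks.

```python
from dataclasses import dataclass, field


@dataclass
class DirFrag:
    # Ranks of the MDS daemons that hold a replica of this dirfrag (assumed shape).
    replicas: set = field(default_factory=set)


@dataclass
class Inode:
    # Ranks of the MDS daemons that hold a replica of this inode (assumed shape).
    replicas: set = field(default_factory=set)
    # Child dirfrags keyed by a hypothetical frag id.
    dirfrags: dict = field(default_factory=dict)


def handle_mismatched_expire(inode, sender, recovering):
    """Handle a cache expire that names a dirfrag we do not have.

    Returns True if the sender was dropped from the child dirfrags.
    """
    if not recovering:
        # Normal operation: the mismatch is simply ignored.
        return False
    if sender in inode.replicas:
        # The inode is still replicated to the sender; nothing to clean up.
        return False
    # The sender no longer replicates the inode, so it cannot hold any of
    # the child dirfrags either: drop it from every replica map.
    for dirfrag in inode.dirfrags.values():
        dirfrag.replicas.discard(sender)
    return True
```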
Yan, Zheng
6963a8f9cb mds: handle interaction between slave rollback and fragmenting
For slave rename and rmdir events, the MDS needs to preserve the non-auth
dirfrag where the renamed inode originally lived until the slave commit
event is encountered. The current method to handle this is to use MDCache::
uncommitted_slave_rename_olddir to track any non-auth dirfrag that
needs to be preserved. This method does not work well if any preserved
dirfrag gets fragmented by a log event (such as ESubtreeMap) between the
slave prepare event and the slave commit event.

The fix is to track the inode of the dirfrag instead of directly tracking
the dirfrag that needs to be preserved.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-03-29 02:08:13 +08:00
Sage Weil
0dcb54f71e Merge pull request #1549 from dachary/wip-doc
doc: fix typos in tiering dev doc
2014-03-28 08:23:46 -07:00
Loic Dachary
72eaa5e885 doc: fix typos in tiering dev doc
Signed-off-by: Loic Dachary <loic@dachary.org>
2014-03-28 14:02:25 +01:00
Samuel Just
7f4be9e9d0 Merge pull request #1547 from ceph/wip-cache-scrub
osd: improve scrub checks on clones; tolerate missing clones on cache pools

Fixes: #7885
Reviewed-by: Samuel Just <sam.just@inktank.com>
2014-03-27 17:14:34 -07:00
Sage Weil
7a1990b66e Merge branch 'wip-7875'
Reviewed-by: Samuel Just <sam.just@inktank.com>
2014-03-27 16:39:36 -07:00
Sage Weil
c64d03d0a8 mon/OSDMonitor: require OSD_CACHEPOOL feature before using tiering features
The OSDs need to support this feature before we allow users to turn it
on.  This is similar to what the erasure pool support does.

Signed-off-by: Sage Weil <sage@inktank.com>
2014-03-27 16:39:01 -07:00
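
The gating idea in c64d03d0a8 amounts to checking a feature bit on every OSD before accepting a command. Here is a minimal sketch of that check, assuming each OSD advertises its features as an integer bitmask. The bit position and the function names are made up for illustration and do not reflect Ceph's actual feature encoding or monitor API.

```python
# Hypothetical bit position; the real feature bit is defined by Ceph itself.
OSD_CACHEPOOL = 1 << 0


def all_osds_support(osd_features, feature):
    """Return True if every OSD's feature mask includes `feature`.

    `osd_features` maps an OSD id to its advertised feature bitmask.
    Note that an empty map trivially passes the check.
    """
    return all((mask & feature) == feature for mask in osd_features.values())


def can_enable_tiering(osd_features):
    """Allow tiering commands only once all OSDs advertise OSD_CACHEPOOL."""
    return all_osds_support(osd_features, OSD_CACHEPOOL)
```

For example, a cluster in which one OSD has not been upgraded yet would be refused until that OSD advertises the bit.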
Sage Weil
69321bf57f mon/OSDMonitor: prevent setting hit_set unless all OSDs support it
We are using OSD_CACHEPOOL as a proxy for the support for the tiering
OSDMap infrastructure.

Signed-off-by: Sage Weil <sage@inktank.com>
2014-03-27 16:38:46 -07:00
Sage Weil
eb71924ea2 osd/ReplicatedPG: tolerate missing clones in cache pools
A few cases:

- As we are working through the list, if we see a clone that is lower than
  the next one we were expecting, we should be able to skip them.
- If we see a head, we can skip all of the rest of the clones.
- If we get to the end and next_clone was set, we can ignore it.

Signed-off-by: Sage Weil <sage@inktank.com>
2014-03-27 15:12:25 -07:00
Sage Weil
6508d5efe3 osd/ReplicatedPG: improve clone vs head checking
- notice when we are missing a clone (that isn't at the end of the list)
- notice when we are missing a clone on the last object in the scrub map
- do not assert when we are missing a clone

There is still more we could do to improve this (like noticing one missing
clone but still checking the others), but we'll leave that aside for just
a moment...

Signed-off-by: Sage Weil <sage@inktank.com>
2014-03-27 15:00:52 -07:00
Sage Weil
9e2cd5feaf osd/ReplicatedPG: do not assert on clone_size mismatch
Signed-off-by: Sage Weil <sage@inktank.com>
2014-03-27 13:48:33 -07:00
Sage Weil
7f026ba608 ceph_test_rados_api_tier: scrub while cache tier is missing clones
Trigger a scrub to verify that we can handle a cache tier that is missing
some clones.  We rely on the test harness to notice the error, and we do
not confirm that the scrub happened.  In practice this is plenty of time,
however.

Signed-off-by: Sage Weil <sage@inktank.com>
2014-03-27 13:28:10 -07:00
Dan Mick
c5682e78e9 Merge pull request #1546 from ceph/wip-fix-pools
fix pool ops test
2014-03-27 13:01:05 -07:00
Sage Weil
7cb1d3a43d qa/workunits/mon/pool_ops.sh: fix test
The pool create command doesn't take k/v pairs any more.

Signed-off-by: Sage Weil <sage@inktank.com>
2014-03-27 12:57:40 -07:00
Sage Weil
233801c622 qa/workunits/mon/pool_ops.sh: use expect_false
Signed-off-by: Sage Weil <sage@inktank.com>
2014-03-27 12:56:44 -07:00
Josh Durgin
ce59760aea Merge pull request #1545 from ceph/wip-7849-b
ceph-conf: do not log

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2014-03-27 12:35:50 -07:00
Sage Weil
72715b235a ceph-conf: no admin_socket
We don't need to worry about pidfile because that is done by the fork
functions, which ceph-conf doesn't call.

Signed-off-by: Sage Weil <sage@inktank.com>
2014-03-27 12:30:39 -07:00
Josh Durgin
e91f5c8cc4 Merge pull request #1522 from themgt/patch-1
document adding dev key for custom Apache/FCGI install

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2014-03-27 12:03:25 -07:00
Sage Weil
fb208237a1 jerasure: fix up .gitignore
Signed-off-by: Sage Weil <sage@inktank.com>
2014-03-27 11:41:57 -07:00
Sage Weil
acc31e75a3 ceph-conf: do not log
If you are querying the conf for an osd and it has a log configured, we
should not generate any log activity.

This isn't super pretty, but it is much less intrusive than wiring a 'do
not log' flag down into CephContext and a zillion other places.

Fixes: #7849
Signed-off-by: Sage Weil <sage@inktank.com>
2014-03-27 11:36:42 -07:00
Josh Durgin
3f1417a850 Merge pull request #1542 from onlyjob/debian
logrotate: do not rotate empty logs (2nd logrotate file)

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2014-03-27 11:33:58 -07:00
Sage Weil
e21561e7f4 Merge pull request #1544 from ceph/wip-7876
rgw: use s->content_length instead of s->length

Reviewed-by: Sage Weil <sage@inktank.com>
2014-03-27 11:15:27 -07:00
Sage Weil
9f313109bc Merge pull request #1534 from dachary/wip-sse-fix
erasure code sse optimized jerasure plugin

Reviewed-by: Sage Weil <sage@inktank.com>
2014-03-27 11:14:30 -07:00
Yehuda Sadeh
ffd69ab3c0 rgw: use s->content_length instead of s->length
Fixes: #7876
Need to use the actual content length, not the pointer to the string.
This was probably working because content_length > 0 correlates with
s->length being non-null.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2014-03-27 10:53:25 -07:00
Sage Weil
0935bb61b7 Merge pull request #1540 from ceph/wip-7860
test: Wait for tier removal before next test starts

Reviewed-by: Sage Weil <sage@inktank.com>
2014-03-27 10:21:10 -07:00
Dmitry Smirnov
501e31d94d logrotate: do not rotate empty logs (2nd logrotate file)
Signed-off-by: Dmitry Smirnov <onlyjob@member.fsf.org>
2014-03-28 03:42:45 +11:00
Sage Weil
2d55316116 Merge pull request #1541 from onlyjob/debian
logrotate improvement: do not rotate empty logs

Reviewed-by: Sage Weil <sage@inktank.com>
2014-03-27 07:02:46 -07:00
Loic Dachary
91176f142c erasure-code: test encode/decode of SSE optimized jerasure plugins
If the machine running make check has the required CPU features
available, load the SSE optimized plugin and check that it can encode /
decode a simple payload. If the CPU features are not available, only
test the generic plugin and display an informative message about the
tests that were skipped.

Signed-off-by: Loic Dachary <loic@dachary.org>
2014-03-27 14:27:24 +01:00
Loic Dachary
b76ad972d1 erasure-code: test jerasure SSE optimized plugins selection
Test the selection of the plugin depending on the CPU features. The
prefix of the plugin is "jerasure" by default (jerasure_generic,
jerasure_sse3, jerasure_sse4) and can be modified with the
"jerasure-name" parameter. A test plugin is created for each
variant (test_jerasure_generic, test_jerasure_sse3, test_jerasure_sse4).
The flags set by ceph_probe are modified by the test to check if the
expected plugin suffix is appended.

Signed-off-by: Loic Dachary <loic@dachary.org>
2014-03-27 14:27:23 +01:00
Loic Dachary
30e714057c osd: increase osd verbosity during functional tests
Signed-off-by: Loic Dachary <loic@dachary.org>
2014-03-27 14:27:23 +01:00
Loic Dachary
10fd6b3153 erasure-code: SSE optimized jerasure plugins
The jerasure plugin is compiled with three sets of flags:

* jerasure_generic with no SSE optimization
* jerasure_sse3 with SSE2, SSE3 and SSSE3 optimizations
* jerasure_sse4 with SSE2, SSE3, SSSE3, SSE41, SSE42 and PCLMUL optimizations

The jerasure plugin loads the appropriate plugin depending on the CPU
features detected at runtime.

http://tracker.ceph.com/issues/7826 fixes #7826

Signed-off-by: Loic Dachary <loic@dachary.org>
2014-03-27 14:27:23 +01:00
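
The runtime selection described in 10fd6b3153 can be sketched as a small function. This is an illustration, not the plugin's code: the feature names are plain strings chosen for the example, and it assumes a variant is picked only when every optimization it was compiled with is available on the CPU.

```python
def select_plugin(cpu_features, prefix="jerasure"):
    """Return the plugin variant name for a set of detected CPU features.

    `cpu_features` is an iterable of hypothetical feature names such as
    "sse2" or "pclmul"; `prefix` defaults to "jerasure".
    """
    sse3_set = {"sse2", "sse3", "ssse3"}
    sse4_set = sse3_set | {"sse41", "sse42", "pclmul"}
    features = set(cpu_features)
    if sse4_set <= features:
        # Everything the most optimized build needs is present.
        return prefix + "_sse4"
    if sse3_set <= features:
        return prefix + "_sse3"
    # Fall back to the build with no SSE optimization.
    return prefix + "_generic"
```

Passing a different `prefix` (for instance `test_jerasure`) yields the corresponding `test_jerasure_*` names, which is how a test could check that the expected suffix is appended.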
Loic Dachary
e9878db230 arch: add SSE3, SSSE3, SSE41 and PCLMUL intel features
And add a note about valgrind forcing a fake cpuid.

Signed-off-by: Loic Dachary <loic@dachary.org>
2014-03-27 14:27:23 +01:00
Loic Dachary
c07aedb6db autotools: intel cpu features detection
Rename SIMD to INTEL for clarity.

Instead of aggregating all flags in INTEL_FLAGS, create individual flags
for each feature (INTEL_SSE2_FLAGS etc.) for finer control in the
makefiles.

Signed-off-by: Loic Dachary <loic@dachary.org>
2014-03-27 14:27:23 +01:00
Loic Dachary
cc0cc15212 erasure-code: gf-complete / jerasure modules updates
To avoid confusion, the jerasure v1 branch that contains commits pending
review upstream is named v2-ceph and the gf-complete v2 branch is named
v2-ceph.

Signed-off-by: Loic Dachary <loic@dachary.org>
2014-03-27 14:27:23 +01:00
Loic Dachary
12d4f382d6 erasure-code: allow loading a plugin from factory()
The Mutex scope is restricted to only protect the load() method and not
the factory() method. This allows a plugin to load another plugin from
within the factory() method.

Reviewed-By: Christophe Courtaut <christophe.courtaut@gmail.com>
Signed-off-by: Loic Dachary <loic@dachary.org>
2014-03-27 14:26:54 +01:00
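
The locking change in 12d4f382d6 can be illustrated with a toy registry. This is only a sketch under assumptions: `PluginRegistry`, `register_loader` and the callable "plugins" are hypothetical stand-ins, and a non-reentrant `threading.Lock` plays the role of the Mutex. The point it shows is that the lock covers only `load()`, so a plugin's factory code may call back into the registry without deadlocking.

```python
import threading


class PluginRegistry:
    def __init__(self):
        self._lock = threading.Lock()  # non-reentrant, like a plain mutex
        self._loaders = {}             # name -> function that builds a plugin
        self._plugins = {}             # name -> loaded plugin

    def register_loader(self, name, loader):
        self._loaders[name] = loader

    def load(self, name):
        # Only the load step is protected by the lock.
        with self._lock:
            if name not in self._plugins:
                self._plugins[name] = self._loaders[name]()
            return self._plugins[name]

    def factory(self, name, params):
        plugin = self.load(name)
        # The lock is released here, so the plugin is free to call
        # registry.factory() again to load and use another plugin.
        return plugin(self, params)


registry = PluginRegistry()
# A leaf plugin that just reports which variant handled the request.
registry.register_loader("generic", lambda: lambda reg, params: ("generic", params))
# A dispatching plugin that loads another plugin from inside its factory.
registry.register_loader(
    "dispatcher", lambda: lambda reg, params: reg.factory("generic", params)
)
result = registry.factory("dispatcher", {"k": 2})
```

Had `factory()` held the same lock for its whole body, the nested `reg.factory("generic", ...)` call would block forever on the non-reentrant lock.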
Sage Weil
d9a2dea755 Merge remote-tracking branch 'gh/firefly'
2014-03-26 21:44:45 -07:00
Dmitry Smirnov
506d2bbaeb logrotate improvement: do not rotate empty logs
Signed-off-by: Dmitry Smirnov <onlyjob@member.fsf.org>
2014-03-27 12:12:19 +11:00
Sage Weil
dc3ce58add osd: do not make pg_pool_t incompat when hit_sets are enabled
If we enable HitSet tracking, the OSD needs to know this, but clients do
not care.  Setting the compat version is too heavyweight, as it locks out
older kernels (currently *any* existing ones) that are unaffected by the
new fields.

Signed-off-by: Sage Weil <sage@inktank.com>
2014-03-26 17:47:06 -07:00
Sage Weil
b5702640cb Merge pull request #1537 from ceph/wip-7871
RadosModel: allow --no-omap to be specified separately from --ec-pool

Reviewed-by: Sage Weil <sage@inktank.com>
2014-03-26 17:16:08 -07:00
Sage Weil
ec40196f4f Merge pull request #1536 from ceph/wip-7870
ReplicatedPG::do_osd_ops: only return ENOTSUP on OMAP write ops

Reviewed-by: Sage Weil <sage@inktank.com>
2014-03-26 17:14:07 -07:00
David Zafman
56974b91a2 test: Wait for tier removal before next test starts
Fixes: #7860

Signed-off-by: David Zafman <david.zafman@inktank.com>
2014-03-26 16:04:40 -07:00
Yehuda Sadeh
98654092fc rgw: configurable chunk size
Fixes: #7589

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2014-03-26 15:49:35 -07:00
Samuel Just
f171c93f18 Merge pull request #1535 from ceph/wip-7823
osd: trim copy-get backend read to object size

Reviewed-by: Samuel Just <sam.just@inktank.com>
2014-03-26 11:48:07 -07:00