31912 Commits

Author SHA1 Message Date
Samuel Just
c99b7e1985 PG,PGLog: replace _merge_old_entry with _merge_object_divergent_entries
The _merge_old_entry structure had trouble distinguishing between the
following cases:

missing: foo, 1,1
merge_old_entry modify 1,1 0,0
merge_old_entry modify 1,2 1,1

and
merge_old_entry modify 1,2 1,1

In the first case, we should end up with foo removed from missing
at the end.  In the second, we need foo added to missing at 1,1.
It's far simpler to present all of the divergent entries for a single
object at once.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-03 16:05:12 -08:00
Samuel Just
86b21e0b78 TestPGLog::merge_old_entry: ne.version cannot be oe.version
Otherwise, it would not be divergent!

Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-03 16:05:11 -08:00
Samuel Just
3dc4f10a9a TestPGLog::merge_old_entry: we no longer use merge_old_entry this way
This needs to be replaced with an equivalent test of
_merge_object_divergent_entries.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-03 16:05:11 -08:00
Samuel Just
ff329ac52b TestPGLog:rewind_divergent_log: set prior_version for delete
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-03 16:05:11 -08:00
Samuel Just
9e43dd6ee3 TestPGLog: ignore merge_old_entry return value
No callers use the merge_old_entry return value.  _merge_divergent_entries
won't have one.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-03 16:05:11 -08:00
Samuel Just
3cc9e2262c TestPGLog: not worth maintaining tests of assert behavior
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-03 16:05:11 -08:00
David Zafman
dda72dee70 Merge pull request #1356 from ceph/wip-7458
osd: stray pg ref on shutdown

Reviewed-by: Samuel Just <sam.just@inktank.com>
2014-03-03 14:47:38 -08:00
Sage Weil
fd9c29b9b0 Merge pull request #1341 from ceph/wip-osd-status
osd: 'status' admin socket command

Reviewed-by: Loic Dachary <loic@dachary.org>
2014-03-03 11:21:11 -08:00
Ilya Dryomov
bd9913ce64 Merge branch 'wip-hint' into firefly
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2014-03-03 20:37:24 +02:00
Ilya Dryomov
371a80cb0f librbd: prefix rbd writes with CEPH_OSD_OP_SETALLOCHINT osd op
In an effort to reduce fragmentation, prefix every rbd write with
a CEPH_OSD_OP_SETALLOCHINT osd op with an expected_write_size value set
to the object size (1 << order).  Backwards compatibility is taken care
of on the osd side.

"The CEPH_OSD_OP_SETALLOCHINT hint is durable, in that it's enough to
do it once.  The reason every rbd write is prefixed is that rbd doesn't
explicitly create objects and relies on writes creating them
implicitly, so there is no place to stick a single hint op into.  To
get around that we decided to prefix every rbd write with a hint (just
like write and setattr ops, hint op will create an object implicitly if
it doesn't exist)."

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
2014-03-03 20:33:44 +02:00
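
The hint value described in the commit above, expected_write_size set to the object size (1 << order), can be illustrated with a small sketch. The helper name object_size_from_order is hypothetical and is not taken from the librbd source. It only shows the arithmetic, assuming order is smaller than 64.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the Ceph source): computes the object size
// that the commit message describes as the expected_write_size hint.
inline uint64_t object_size_from_order(unsigned order) {
  assert(order < 64);           // a shift by 64 or more would be undefined
  return uint64_t{1} << order;  // object size in bytes
}
```

With order 22 this gives 4194304 bytes, which matches the 4m objects mentioned in the SETALLOCHINT commit message further down the log.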
Ilya Dryomov
8e49bc32c8 FileStore: add option to cap alloc hint size
Add a new config option, filestore_max_alloc_hint_size, to cap
SETALLOCHINT hint size.  The unit is a byte, the default value is
1 megabyte.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
2014-03-03 20:33:44 +02:00
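
A minimal sketch of what capping the hint size could look like, based only on the commit message above. The names cap_alloc_hint and kAssumedMaxAllocHintSize are hypothetical, and treating "1 megabyte" as 2^20 bytes is an assumption. The actual FileStore code may differ.

```cpp
#include <algorithm>
#include <cstdint>

// Assumption: "1 megabyte" from the commit message is taken as 2^20 bytes.
const uint64_t kAssumedMaxAllocHintSize = uint64_t{1} << 20;

// Hypothetical helper: limits a requested allocation hint (in bytes) to the
// configured maximum, standing in for filestore_max_alloc_hint_size.
inline uint64_t cap_alloc_hint(uint64_t requested,
                               uint64_t max_hint = kAssumedMaxAllocHintSize) {
  return std::min(requested, max_hint);
}
```

Under these assumptions, a 4 MiB hint from an rbd write would be reduced to 1 MiB, while a smaller hint would pass through unchanged.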
Ilya Dryomov
1f5b796f58 FileStore: introduce XfsFileStoreBackend class
Introduce XfsFileStoreBackend class, currently the only filestore
backend implementing SETALLOCHINT op.  This commit adds a build-time
dependency on libxfs as xfs-specific ioctl (XFS_IOC_FSSETXATTR /
XFS_XFLAG_EXTSIZE) is used to implement the new set_alloc_hint()
method.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
2014-03-03 20:33:44 +02:00
Ilya Dryomov
391257c00e FileStore: refactor FS detection checks a bit
Refactor FS detection checks in FileStore::_detect_fs() so that they
look the same as the ones in FileStore::mkfs().  This is in preparation
for adding XfsFileStoreBackend class.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
2014-03-03 20:33:44 +02:00
Ilya Dryomov
6456802394 osd: add SETALLOCHINT operation
This is primarily for librbd/krbd's benefit and is supposed to combat
fragmentation:

"... knowing that rbd images have a 4m size, librbd can pass a hint
that will let the osd do the xfs allocation size ioctl on new files so
that they are allocated in 1m or 4m chunks.  We've seen cases where
users with rbd workloads have very high levels of fragmentation in xfs
and this would mitigate that and probably have a pretty nice
performance benefit."

SETALLOCHINT is considered advisory, so our backwards compatibility
mechanism here is to set FAILOK flag for all SETALLOCHINT ops.

xfs is hooked up in the subsequent commits.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
2014-03-03 20:33:44 +02:00
Josh Durgin
28c29c1da7 Revert "ObjectCacher: remove unused target/max setters"
This reverts commit e1a49e5386.
2014-03-03 09:07:30 -08:00
Josh Durgin
d00a92724c Revert "librbd: remove limit on number of objects in the cache"
Disabling this limit causes too much memory usage in some
workloads.

This reverts commit 0559d31db2.
2014-03-03 09:04:00 -08:00
Ray Lv
195d53a7fc rgw: off-by-one in rgw_trim_whitespace()
Fixes: #7543
Backport: dumpling

Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
Signed-off-by: Ray Lv <raylv@yahoo-inc.com>
2014-03-03 08:51:30 -08:00
Sage Weil
09099c9e4c osd: 'status' admin socket command
Basic stuff, like what state the OSD is in and what osdmap epoch we
are on.

Signed-off-by: Sage Weil <sage@inktank.com>
2014-03-03 07:03:01 -08:00
Sage Weil
ffdfb846a2 Merge pull request #1327 from dachary/wip-7423
osd: do not attempt to read past the object size

Reviewed-by: Sage Weil <sage@inktank.com>
2014-03-03 06:59:22 -08:00
Loic Dachary
ef25135e1f erasure-code: test rados put and get
Check that rados put immediately followed by rados get retrieves exactly
the same content.

http://tracker.ceph.com/issues/7423 refs #7423

Signed-off-by: Loic Dachary <loic@dachary.org>
2014-03-03 09:28:56 +01:00
Loic Dachary
0b612d1017 mon: prepend current directory to PATH for tests
So that binaries found in the source directory are always preferred to
installed binaries or scripts.

Signed-off-by: Loic Dachary <loic@dachary.org>
2014-03-03 09:28:55 +01:00
Loic Dachary
eb21bc805d osd: helper to create an OSD for functional tests
Signed-off-by: Loic Dachary <loic@dachary.org>
2014-03-03 09:28:55 +01:00
Loic Dachary
cababd926c mon: add mon-test-helpers.sh to EXTRA_DIST
Signed-off-by: Loic Dachary <loic@dachary.org>
2014-03-03 09:28:55 +01:00
Loic Dachary
927153f5c7 osd: do not attempt to read past the object size
When reading from a replicated pool, trying to read more than the object
size results in a short read that does not go beyond the object size. In
erasure coded pools, objects are padded and the read will return more
bytes than the object actually contains.

http://tracker.ceph.com/issues/7423 fixes #7423

Signed-off-by: Loic Dachary <loic@dachary.org>
2014-03-03 09:28:40 +01:00
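
The behaviour described in the commit above, a read that never goes beyond the object size, amounts to clamping the requested length. The following is a sketch under that reading. clamp_read_length is a hypothetical name and does not reflect the actual OSD code path.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper: returns how many bytes a read of `length` bytes
// starting at `offset` may return for an object of `object_size` bytes.
inline uint64_t clamp_read_length(uint64_t offset, uint64_t length,
                                  uint64_t object_size) {
  if (offset >= object_size) {
    return 0;  // nothing to read at or past the end of the object
  }
  // Never return more than the bytes remaining after `offset`.
  return std::min(length, object_size - offset);
}
```

For a 5000-byte object, a request for 8192 bytes at offset 0 would be limited to 5000 bytes, so padding added by an erasure coded pool would not be returned to the caller.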
Sage Weil
10f87fc604 Merge pull request #1344 from ceph/wip-7539
Wip 7539

Reviewed-by: Sage Weil <sage@inktank.com>
2014-03-02 21:06:01 -08:00
Sage Weil
4e4f4cc160 Merge pull request #1322 from ceph/wip-librados-end-iterator
librados: fix ObjectIterator::operator= for the end iterator

Reviewed-by: Sage Weil <sage@inktank.com>
2014-03-02 12:51:30 -08:00
Sage Weil
32a4e90349 Merge pull request #1337 from ceph/wip-fix-coverity-20140228
Fix different issues found by Coverity

Reviewed-by: Sage Weil <sage@inktank.com>
2014-03-01 19:56:45 -08:00
Sage Weil
4bf32c66c8 Merge pull request #1336 from ceph/wip-nfs-export
Wip nfs export

Reviewed-by: Sage Weil <sage@inktank.com>
2014-03-01 19:54:23 -08:00
Samuel Just
62fd382fbf osd_types,PG: trim mod_desc for log entries to min size
In the event that mod_desc.bl contains pointers into a large
message buffer, we'd otherwise end up keeping around the entire
MOSDECSubOpWrite which created each log entry.

Fixes: #7539
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-01 14:54:06 -08:00
Samuel Just
d4118e15a3 MOSDECSubOpWrite: drop transaction, log_entries in clear_buffers
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-01 14:54:06 -08:00
Samuel Just
718cda6e95 TrackedOp: clear_payload as well in unregister_inflight_op
We want to minimize the cost of maintaining the historic ops.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-01 14:54:06 -08:00
Samuel Just
59ff572fd6 OpTracker: clarify that unregister_inflight_op is only called if enabled
The !tracking_enabled branch actually had a leak which was unreachable
since the caller does the check for tracking_enabled.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-01 14:54:06 -08:00
Samuel Just
fc9b8ef06b MOSDOp: drop ops vector in clear_data()
Otherwise, clear_data on MOSDOp will leave essentially
all of the buffers intact.  This is a problem since the
OpTracker mechanism relies on being able to keep the message
around without keeping around the data.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-01 14:53:52 -08:00
Greg Farnum
1ea59f6c42 ReplicatedPG: delete mark_all_unfound_lost transactions after completion
This was a minor memory leak.

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-03-01 14:53:14 -08:00
Loic Dachary
84ba4cf21e Merge pull request #1339 from ceph/wip-7572
mon: fix 'pg dump' JSON output

Reviewed-by: Loic Dachary <loic@dachary.org>
2014-03-01 18:46:50 +01:00
John Spray
e19dffb88d mon: fix 'pg dump' JSON output
This was broken by 40bdcb88.  The 'acting' array had
the up_primary and acting_primary appended.

Fixes: #7572

Signed-off-by: John Spray <john.spray@inktank.com>
2014-03-01 17:05:11 +00:00
Danny Al-Gaaf
1a4657a374 req_state: fix uninitialized bool var
CID 717359 (#1 of 1): Uninitialized scalar field (UNINIT_CTOR)
 uninit_member: Non-static class member "bucket_exists" is not
 initialized in this constructor nor in any functions that it calls.

Set bucket_exists to false in req_state::req_state().

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-03-01 14:26:18 +01:00
Danny Al-Gaaf
605e645026 Objecter::recalc_op_target: fix uninitialized scalar variable
CID 1160848 (#1 of 1): Uninitialized scalar variable (UNINIT)
 uninit_use: Using uninitialized value "best".

Init 'best' with -1 (from the code logic it will be set at least to 0)
to silence coverity.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-03-01 13:45:53 +01:00
Danny Al-Gaaf
754a36897b PGMonitor: fix uninitialized scalar variable
Fix type handling in dump_stuck_pg_stats. If type doesn't match a
known PGMap::STUCK_* type, print out a message and return directly
from the function.

CID 1030132 (#2 of 2): Uninitialized scalar variable (UNINIT)
 uninit_use_in_call: Using uninitialized value "stuck_type" when calling
 "PGMap::dump_stuck(ceph::Formatter *, PGMap::StuckPG, utime_t) const"

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-03-01 13:33:18 +01:00
Danny Al-Gaaf
1747c589e7 MDCache: fix potential null pointer deref
CID 716921 (#1 of 1): Dereference after null check (FORWARD_NULL)
 var_deref_model: Passing null pointer "dir" to function
 "operator <<(std::ostream &, CDir &)", which dereferences it.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-03-01 13:11:48 +01:00
Danny Al-Gaaf
93c09836fe MDCache::handle_discover: fix null pointer deref
CID 716990 (#1 of 1): Dereference null return value (NULL_RETURNS)
 dereference: Dereferencing a pointer that might be null "cur" when calling
 "MDCache::replicate_inode(CInode *, int, ceph::bufferlist &)"

Add assert to check for return value from get_inode() as done in other places.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-03-01 12:10:56 +01:00
Danny Al-Gaaf
249e210792 FileStore: fix resource leak in queue_transactions() blackhole case
CID 1135931 (#1 of 1): Resource leak (RESOURCE_LEAK)
 leaked_storage: Variable "ondisk" going out of scope leaks the storage it
 points to.

CID 1135932 (#1 of 1): Resource leak (RESOURCE_LEAK)
 leaked_storage: Variable "onreadable" going out of scope leaks the storage
 it points to.

CID 1135933 (#1 of 1): Resource leak (RESOURCE_LEAK)
 leaked_storage: Variable "onreadable_sync" going out of scope leaks the
 storage it points to.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-03-01 11:53:09 +01:00
Danny Al-Gaaf
3cd751b0a2 c_read_operations.cc: fix resource leak
CID 1188154 (#2 of 2): Resource leak (RESOURCE_LEAK)
 overwrite_var: Overwriting "op" in "op = rados_create_read_op()" leaks
 the storage that "op" points to.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-03-01 11:44:39 +01:00
Danny Al-Gaaf
ad9b6d2f7a c_write_operations.cc: fix some ioctx resource leaks
CID 1160833 (#3 of 3): Resource leak (RESOURCE_LEAK)
 leaked_storage: Variable "ioctx" going out of scope leaks the storage
 it points to

CID 1160835 (#3 of 3): Resource leak (RESOURCE_LEAK)
 leaked_storage: Variable "ioctx" going out of scope leaks the storage
 it points to.

CID 1188156 (#5 of 5): Resource leak (RESOURCE_LEAK)
 leaked_storage: Variable "ioctx" going out of scope leaks the storage
 it points to.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-03-01 11:36:18 +01:00
Danny Al-Gaaf
e8533ee4c9 ReplicatedBackend: check result of dynamic_cast to fix null pointer deref
CID 1188135 (#1 of 1): Unchecked dynamic_cast (FORWARD_NULL)
 var_deref_model: Passing null pointer "t" to function
 "RPGTransaction::get_transaction()", which dereferences it

CID 1188134 (#1 of 1): Unchecked dynamic_cast (FORWARD_NULL)
 var_deref_op: Dereferencing null pointer "to_append".

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-03-01 11:16:27 +01:00
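
The class of defect reported above, using the result of a dynamic_cast without checking it, can be sketched as follows. All type and function names here are hypothetical stand-ins and not the ReplicatedBackend classes. Returning -1 on a failed cast is an assumption, and the real fix may handle the failure differently, for example with an assert.

```cpp
// Hypothetical stand-ins for a polymorphic transaction hierarchy.
struct Transaction {
  virtual ~Transaction() = default;
};
struct ReplicatedTransaction : Transaction {
  int ops = 3;
};
struct OtherTransaction : Transaction {};

// Returns the op count, or -1 if `t` is null or not a ReplicatedTransaction.
inline int count_ops(Transaction* t) {
  auto* rt = dynamic_cast<ReplicatedTransaction*>(t);
  if (rt == nullptr) {
    return -1;  // cast failed: do not dereference
  }
  return rt->ops;
}
```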
Yan, Zheng
8d6b25a1eb mds: use "lookup-by-ino" helper to handle LOOKUPPARENT request
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-03-01 18:02:18 +08:00
Samuel Just
63e34639d7 Merge pull request #1326 from ceph/wip-7542
Wip 7542

Reviewed-by: Greg Farnum <greg@inktank.com>
2014-02-28 21:08:30 -08:00
Danny Al-Gaaf
0bf5f8668f store_test.cc: fix unchecked return value
CID 1188126 (#1 of 1): Unchecked return value (CHECKED_RETURN)
 2. check_return: Calling function "ObjectStore::stat(coll_t,
    ghobject_t const &, stat *, bool)" without checking return value
    (as is done elsewhere 8 out of 9 times).
 3. unchecked_value: No check of the return value of "this->store->stat(
    coll_t(this->cid), hoid, &buf, false)".

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-03-01 01:24:37 +01:00
Danny Al-Gaaf
7eefe85cf5 histogram.h: fix potential div by zero
CID 1188131 (#1 of 1): Division or modulo by zero (DIVIDE_BY_ZERO)
 divide_by_zero: In expression "lower_sum * 1000000UL / total", division
 by expression "total" which may be zero has undefined behavior

Added check for non zero total.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-03-01 00:19:58 +01:00
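
The check for a non-zero total described above can be sketched like this. The expression lower_sum * 1000000UL / total comes from the Coverity report. The function name scaled_fraction and the choice to return 0 for an empty total are assumptions, not the actual histogram.h code.

```cpp
#include <cstdint>

// Hypothetical helper: computes lower_sum / total scaled by 1,000,000,
// guarding against division by zero.
inline uint64_t scaled_fraction(uint64_t lower_sum, uint64_t total) {
  if (total == 0) {
    return 0;  // assumption: an empty histogram yields 0
  }
  return lower_sum * 1000000UL / total;
}
```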
Danny Al-Gaaf
500206d809 ReplicatedPG.cc: fix resource leak, delete cb
CID 1188145 (#1 of 1): Resource leak (RESOURCE_LEAK)
 leaked_storage: Variable "cb" going out of scope leaks the storage it points to.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-03-01 00:04:05 +01:00