31888 Commits

Author SHA1 Message Date
Joao Eduardo Luis
25a9bd3251 osd: OSD: limit the value of 'size' and 'count' on 'osd bench'
Otherwise, a high enough 'count' value will trigger all sorts of timeouts
on the OSD; a low enough 'size' value will have the same effect for a
high enough value of 'count' (even the default value may have ill effects
on the osd's behaviour).  Limiting these values does not fix how 'osd bench'
should behave, but it keeps someone from inadvertently borking an OSD.

Four options have been added and the user may adjust them if he so
desires to play with the OSD's fate:

 - 'osd_bench_small_size_max_iops' [default: 100] defines the amount of
   expected IOPS for a small block size (i.e., <1MB).
 - 'osd_bench_large_size_max_throughput' [default: 100<<20] defines
   the expected throughput in B/s.  We assume 100MB/s.
 - 'osd_bench_max_block_size' [default: 64 << 20] caps the block size
   allowed.  We have defined 64 MB.
 - 'osd_bench_duration' [default: 30] caps the expected duration.  This
   value is used when calculating the maximum allowed 'count', and is
   not enforced as the maximum duration of the operation.  If other IO
   is in progress, or 'osd bench' is somehow slowed down, 'osd bench' may
   go over this duration.  Adjusting this option does however allow the
   user to specify higher 'count' values for (e.g.) a small block size,
   as the operation is assumed to run over a longer time span.

These options attempt to avoid combinations of dangerous parameters.  For
instance, we limit the block size to 64 MB (by default) so that there is
no temptation to specify a large enough block size, along with a very small
'count', such that the end result is similar to specifying a big count with
a sane block size.
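
The limits described above can be sketched as a small check. This is an illustration only, not the code from the commit: the struct and function names are hypothetical, and it assumes that 'count' is the total number of bytes to write and that the small-block rule is "block size times IOPS times duration".

```cpp
#include <cstdint>

// Hypothetical container for the four options named in the commit message.
// The default values are the ones listed there.
struct BenchLimits {
  int64_t small_size_max_iops = 100;                // osd_bench_small_size_max_iops
  int64_t large_size_max_throughput = 100LL << 20;  // osd_bench_large_size_max_throughput (B/s)
  int64_t max_block_size = 64LL << 20;              // osd_bench_max_block_size (bytes)
  int64_t duration = 30;                            // osd_bench_duration
};

// Largest total byte count allowed for a given block size, or -1 if the
// block size itself is rejected. Assumption: small blocks (<1MB) are bounded
// by expected IOPS, larger ones by expected throughput.
int64_t max_bench_count(const BenchLimits& lim, int64_t bsize) {
  if (bsize <= 0 || bsize > lim.max_block_size)
    return -1;
  if (bsize < (1LL << 20))
    return bsize * lim.small_size_max_iops * lim.duration;
  return lim.large_size_max_throughput * lim.duration;
}

// True if the (count, block size) pair stays within the computed cap.
bool bench_request_allowed(const BenchLimits& lim, int64_t count, int64_t bsize) {
  const int64_t max_count = max_bench_count(lim, bsize);
  return max_count >= 0 && count <= max_count;
}
```

Under these assumptions and the default values, a 4096-byte block size caps 'count' at 4096 * 100 * 30 = 12288000 bytes, so a 1 GB request would be rejected, while the same request with a 4 MB block size stays under the 100 MB/s * 30 cap.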

Fixes: 7248

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
2014-03-03 14:41:13 +00:00
Sage Weil
10f87fc604 Merge pull request #1344 from ceph/wip-7539
Wip 7539

Reviewed-by: Sage Weil <sage@inktank.com>
2014-03-02 21:06:01 -08:00
Sage Weil
4e4f4cc160 Merge pull request #1322 from ceph/wip-librados-end-iterator
librados: fix ObjectIterator::operator= for the end iterator

Reviewed-by: Sage Weil <sage@inktank.com>
2014-03-02 12:51:30 -08:00
Sage Weil
32a4e90349 Merge pull request #1337 from ceph/wip-fix-coverity-20140228
Fix different issues found by Coverity

Reviewed-by: Sage Weil <sage@inktank.com>
2014-03-01 19:56:45 -08:00
Sage Weil
4bf32c66c8 Merge pull request #1336 from ceph/wip-nfs-export
Wip nfs export

Reviewed-by: Sage Weil <sage@inktank.com>
2014-03-01 19:54:23 -08:00
Samuel Just
62fd382fbf osd_types,PG: trim mod_desc for log entries to min size
In the event that mod_desc.bl contains pointers into a large
message buffer, we'd otherwise end up keeping around the entire
MOSDECSubOpWrite which created each log entry.

Fixes: #7539
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-01 14:54:06 -08:00
Samuel Just
d4118e15a3 MOSDECSubOpWrite: drop transaction, log_entries in clear_buffers
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-01 14:54:06 -08:00
Samuel Just
718cda6e95 TrackedOp: clear_payload as well in unregister_inflight_op
We want to minimize the cost of maintaining the historic ops.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-01 14:54:06 -08:00
Samuel Just
59ff572fd6 OpTracker: clarify that unregister_inflight_op is only called if enabled
The !tracking_enabled branch actually had a leak which was unreachable
since the caller does the check for tracking_enabled.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-01 14:54:06 -08:00
Samuel Just
fc9b8ef06b MOSDOp: drop ops vector in clear_data()
Otherwise, clear_data on MOSDOp will leave essentially
all of the buffers intact.  This is a problem since the
OpTracker mechanism relies on being able to keep the message
around without keeping around the data.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-01 14:53:52 -08:00
Greg Farnum
1ea59f6c42 ReplicatedPG: delete mark_all_unfound_lost transactions after completion
This was a minor memory leak.

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-03-01 14:53:14 -08:00
Loic Dachary
84ba4cf21e Merge pull request #1339 from ceph/wip-7572
mon: fix 'pg dump' JSON output

Reviewed-by: Loic Dachary <loic@dachary.org>
2014-03-01 18:46:50 +01:00
John Spray
e19dffb88d mon: fix 'pg dump' JSON output
This was broken by 40bdcb88.  The 'acting' array had
the up_primary and acting_primary appended.

Fixes: #7572

Signed-off-by: John Spray <john.spray@inktank.com>
2014-03-01 17:05:11 +00:00
Danny Al-Gaaf
1a4657a374 req_state: fix uninitialized bool var
CID 717359 (#1 of 1): Uninitialized scalar field (UNINIT_CTOR)
 uninit_member: Non-static class member "bucket_exists" is not
 initialized in this constructor nor in any functions that it calls.

Set bucket_exists to false in req_state::req_state().
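
A minimal sketch of this kind of fix, using a hypothetical stand-in struct rather than the real req_state (only the member name bucket_exists comes from the commit message):

```cpp
#include <string>

// Hypothetical stand-in for a request-state struct.
struct RequestState {
  std::string bucket_name;  // class types default-construct on their own
  bool bucket_exists;       // scalars do not, so the constructor must set them

  // Initialize the scalar member explicitly in the constructor.
  RequestState() : bucket_exists(false) {}
};

// Helper used only to observe the default value.
bool default_bucket_exists() {
  return RequestState().bucket_exists;
}
```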

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-03-01 14:26:18 +01:00
Danny Al-Gaaf
605e645026 Objecter::recalc_op_target: fix uninitialized scalar variable
CID 1160848 (#1 of 1): Uninitialized scalar variable (UNINIT)
 uninit_use: Using uninitialized value "best".

Init 'best' with -1 (from the code logic it will be set at least to 0)
to silence coverity.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-03-01 13:45:53 +01:00
Danny Al-Gaaf
754a36897b PGMonitor: fix uninitialized scalar variable
Fix type handling in dump_stuck_pg_stats. If the type doesn't
match a known PGMap::STUCK_* type, print out a message and return
directly from the function.

CID 1030132 (#2 of 2): Uninitialized scalar variable (UNINIT)
 uninit_use_in_call: Using uninitialized value "stuck_type" when calling
 "PGMap::dump_stuck(ceph::Formatter *, PGMap::StuckPG, utime_t) const"

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-03-01 13:33:18 +01:00
Danny Al-Gaaf
1747c589e7 MDCache: fix potential null pointer deref
CID 716921 (#1 of 1): Dereference after null check (FORWARD_NULL)
 var_deref_model: Passing null pointer "dir" to function
 "operator <<(std::ostream &, CDir &)", which dereferences it.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-03-01 13:11:48 +01:00
Danny Al-Gaaf
93c09836fe MDCache::handle_discover: fix null pointer deref
CID 716990 (#1 of 1): Dereference null return value (NULL_RETURNS)
 dereference: Dereferencing a pointer that might be null "cur" when calling
 "MDCache::replicate_inode(CInode *, int, ceph::bufferlist &)"

Add assert to check for return value from get_inode() as done in other places.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-03-01 12:10:56 +01:00
Danny Al-Gaaf
249e210792 FileStore: fix resource leak in queue_transactions() blackhole case
CID 1135931 (#1 of 1): Resource leak (RESOURCE_LEAK)
 leaked_storage: Variable "ondisk" going out of scope leaks the storage it
 points to.

CID 1135932 (#1 of 1): Resource leak (RESOURCE_LEAK)
 leaked_storage: Variable "onreadable" going out of scope leaks the storage
 it points to.

CID 1135933 (#1 of 1): Resource leak (RESOURCE_LEAK)
 leaked_storage: Variable "onreadable_sync" going out of scope leaks the
 storage it points to.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-03-01 11:53:09 +01:00
Danny Al-Gaaf
3cd751b0a2 c_read_operations.cc: fix resource leak
CID 1188154 (#2 of 2): Resource leak (RESOURCE_LEAK)
 overwrite_var: Overwriting "op" in "op = rados_create_read_op()" leaks
 the storage that "op" points to.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-03-01 11:44:39 +01:00
Danny Al-Gaaf
ad9b6d2f7a c_write_operations.cc: fix some ioctx resource leaks
CID 1160833 (#3 of 3): Resource leak (RESOURCE_LEAK)
 leaked_storage: Variable "ioctx" going out of scope leaks the storage
 it points to

CID 1160835 (#3 of 3): Resource leak (RESOURCE_LEAK)
 leaked_storage: Variable "ioctx" going out of scope leaks the storage
 it points to.

CID 1188156 (#5 of 5): Resource leak (RESOURCE_LEAK)
 leaked_storage: Variable "ioctx" going out of scope leaks the storage
 it points to.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-03-01 11:36:18 +01:00
Danny Al-Gaaf
e8533ee4c9 ReplicatedBackend: check result of dynamic_cast to fix null pointer deref
CID 1188135 (#1 of 1): Unchecked dynamic_cast (FORWARD_NULL)
 var_deref_model: Passing null pointer "t" to function
 "RPGTransaction::get_transaction()", which dereferences it

CID 1188134 (#1 of 1): Unchecked dynamic_cast (FORWARD_NULL)
 var_deref_op: Dereferencing null pointer "to_append".
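
The general pattern behind this class of fix can be sketched as follows. The types here are hypothetical placeholders, not Ceph's classes, and the actual commit may assert on the cast result instead of returning an error value.

```cpp
// Hypothetical class hierarchy standing in for the transaction types.
struct Transaction {
  virtual ~Transaction() = default;
};
struct ReplicatedTransaction : Transaction {
  int ops = 0;
};
struct OtherTransaction : Transaction {};

// Returns the number of ops, or -1 if 't' is null or not a
// ReplicatedTransaction. The dynamic_cast result is checked before use,
// so a failed cast never reaches a dereference.
int count_ops(Transaction* t) {
  auto* rt = dynamic_cast<ReplicatedTransaction*>(t);
  if (rt == nullptr)
    return -1;
  return rt->ops;
}
```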

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-03-01 11:16:27 +01:00
Yan, Zheng
8d6b25a1eb mds: use "lookup-by-ino" helper to handle LOOKUPPARENT request
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-03-01 18:02:18 +08:00
Samuel Just
63e34639d7 Merge pull request #1326 from ceph/wip-7542
Wip 7542

Reviewed-by: Greg Farnum <greg@inktank.com>
2014-02-28 21:08:30 -08:00
Danny Al-Gaaf
0bf5f8668f store_test.cc: fix unchecked return value
CID 1188126 (#1 of 1): Unchecked return value (CHECKED_RETURN)
 2. check_return: Calling function "ObjectStore::stat(coll_t,
    ghobject_t const &, stat *, bool)" without checking return value
    (as is done elsewhere 8 out of 9 times).
 3. unchecked_value: No check of the return value of "this->store->stat(
    coll_t(this->cid), hoid, &buf, false)".

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-03-01 01:24:37 +01:00
Danny Al-Gaaf
7eefe85cf5 histogram.h: fix potential div by zero
CID 1188131 (#1 of 1): Division or modulo by zero (DIVIDE_BY_ZERO)
 divide_by_zero: In expression "lower_sum * 1000000UL / total", division
 by expression "total" which may be zero has undefined behavior

Added check for non zero total.
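
A sketch of the guard, assuming a hypothetical helper around the expression quoted by Coverity. Returning 0 for an empty total is an assumption made for illustration; the commit message does not say what the real code returns in that case.

```cpp
#include <cstdint>

// Hypothetical helper: share of 'lower_sum' in 'total', in parts per million.
uint64_t fraction_ppm(uint64_t lower_sum, uint64_t total) {
  if (total == 0)
    return 0;  // assumption: report 0 rather than divide by zero
  return lower_sum * 1000000ULL / total;
}
```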

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-03-01 00:19:58 +01:00
Danny Al-Gaaf
500206d809 ReplicatedPG.cc: fix resource leak, delete cb
CID 1188145 (#1 of 1): Resource leak (RESOURCE_LEAK)
 leaked_storage: Variable "cb" going out of scope leaks the storage it points to.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-03-01 00:04:05 +01:00
Gregory Farnum
84decc119f Merge pull request #1331 from ceph/wip-cache-pool
mon/OSDMonitor: make default false-positive-probability 5%

Reviewed-by: Greg Farnum <greg@inktank.com>
2014-02-28 14:09:49 -08:00
Samuel Just
fbb1ec88b2 ECBackend: don't leak transactions
Fixes: #7539
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-02-28 11:27:10 -08:00
Samuel Just
b0d426440b OSD::handle_misdirected_op: handle ops to the wrong shard
OSD recomputes op target based on current OSDMap. With an EC pg, we can get
this result:
1) client at map 512 sends an op to osd 3, pg_t 3.9 based on mapping
   [CRUSH_ITEM_NONE, 2, 3]/3
2) OSD 3 at map 513 remaps op to osd 3, spg_t 3.9s0 based on mapping [3, 2, 3]/3
3) PG 3.9s0 dequeues the op at epoch 512 and notices that it isn't
   primary -- misdirected op
4) client resends and this time PG 3.9s0 having caught up to 513 gets it and
   fulfils it

We can't compute the op target based on the sending map epoch due to
splitting.  The simplest thing is to detect such cases in
OSD::handle_misdirected_op and drop them without an error (the client
will resend anyway).

Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-02-28 11:26:32 -08:00
Loic Dachary
07ddfcfa93 Merge pull request #1332 from ceph/wip-pg-msg
mon/OSDMonitor: missing space in string

Reviewed-by: Sage Weil <sage@inktank.com>
Reviewed-by: Loic Dachary <loic@dachary.org>
2014-02-28 18:35:11 +01:00
John Spray
448fc0e91a mon/OSDMonitor: missing space in string
Minor glitch.  Was printing ..."exceeds per-OSD max of32)"

Signed-off-by: John Spray <john.spray@inktank.com>
2014-02-28 17:16:09 +00:00
Dan Mick
799cde0a7b Fix python-requests package dependencies.
python-ceph does not require requests, but ceph-common does (for ceph-brag).

Signed-off-by: Dan Mick <dan.mick@inktank.com>
(cherry picked from commit 9a0ef6a181)
2014-02-28 08:34:43 -08:00
Josh Durgin
bfad17bfa9 librados: fix ObjectIterator::operator= for the end iterator
We can't set a shared_ptr to NULL, we need to reset it instead. Add
another test for various permutations of this.
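
The idea can be illustrated with a simplified iterator. The class and member names below are hypothetical and are not the librados types; the sketch only assumes that an end iterator is represented by an empty shared_ptr.

```cpp
#include <memory>

// Hypothetical iterator state.
struct IterState {
  int position = 0;
};

class Iter {
 public:
  Iter() = default;  // end iterator: holds no state
  explicit Iter(int pos) : state_(std::make_shared<IterState>()) {
    state_->position = pos;
  }
  Iter(const Iter& rhs) { *this = rhs; }

  Iter& operator=(const Iter& rhs) {
    if (this == &rhs)
      return *this;
    if (!rhs.state_) {
      // rhs is the end iterator: drop our state with reset()
      // instead of trying to assign NULL to the shared_ptr.
      state_.reset();
    } else {
      // Copy the state so the two iterators advance independently.
      state_ = std::make_shared<IterState>(*rhs.state_);
    }
    return *this;
  }

  bool is_end() const { return !state_; }

 private:
  std::shared_ptr<IterState> state_;
};
```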

Fixes: #7538
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2014-02-28 08:16:28 -08:00
Sage Weil
f0241c8ac8 mon/OSDMonitor: make default false-positive-probability 5%
This is a more conservative default (as in, less memory consumed) for
newly created cache pools.
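
To see why a higher false-positive probability costs less memory, here is a rough sketch. It assumes the setting applies to a standard bloom filter with an optimal number of hash functions, which the commit message does not state; the function name is hypothetical.

```cpp
#include <cmath>

// Textbook bloom filter sizing: bits needed per inserted element for a
// target false-positive probability 'fpp' (0 < fpp < 1).
double bloom_bits_per_element(double fpp) {
  const double ln2 = std::log(2.0);
  return -std::log(fpp) / (ln2 * ln2);
}
```

Under that assumption, a 5% target needs roughly 6.2 bits per element, and any stricter target (for example 1%) needs more.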

Signed-off-by: Sage Weil <sage@inktank.com>
2014-02-28 08:06:03 -08:00
Yan, Zheng
7ba3200f1e mds: fix nested_anchors update during journal replay
check if the inode is anchored/unanchored before updating the inode

Fixes: #7530
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2014-02-26 07:12:49 -08:00
Gregory Farnum
82aba4b02b Merge pull request #1319 from ceph/wip-primary-temp-fix
osd/OSDMap: respect temp primary without temp acting

Reviewed-by: Greg Farnum <greg@inktank.com>
2014-02-26 07:12:04 -08:00
Samuel Just
1040d1b08a osd/OSDMap: respect temp primary without temp acting
be2748c6d5 ensured that if the temp acting mapping contains only
CRUSH_ITEM_NONE, the acting_primary is left at -1.  However, even if
acting.empty(), we need to respect a temp_primary mapping.
Thus, use _acting_primary unless acting.empty() &&
acting_primary == -1.
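
One possible reading of this rule, written as a standalone function. The names and the Mapping struct are hypothetical, and this is not the OSDMap code itself.

```cpp
#include <utility>
#include <vector>

// Hypothetical result type: the acting set and its primary (-1 = none).
struct Mapping {
  std::vector<int> acting;
  int acting_primary;
};

// Resolve the acting set from the up set and any temp mappings.
Mapping resolve_acting(const std::vector<int>& up, int up_primary,
                       std::vector<int> temp_acting, int temp_primary) {
  Mapping m{std::move(temp_acting), temp_primary};
  if (m.acting.empty()) {
    // No temp acting set: fall back to the up set.
    m.acting = up;
    // Only fall back to the up primary if there is no temp primary either.
    if (m.acting_primary == -1)
      m.acting_primary = up_primary;
  }
  // A non-empty temp set keeps whatever primary came with it, including -1
  // when the set holds only placeholder entries.
  return m;
}
```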

Bug introduced in be2748c6d5.
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-02-25 23:09:57 -08:00
Samuel Just
5a6cb3da20 Merge pull request #1317 from ceph/wip-7537
Wip 7537

Reviewed-by: David Zafman <david.zafman@inktank.com>
2014-02-25 20:42:18 -08:00
Samuel Just
be2748c6d5 OSDMap::_pg_to_up_acting_osds: use _acting_primary unless acting is empty
If the temp set for whatever reason has only CRUSH_ITEM_NONE,
we need primary to be -1.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-02-25 16:47:24 -08:00
Samuel Just
f93bf33b99 Merge pull request #1311 from ceph/wip-dz-scrub-fixes
Wip dz scrub fixes

Reviewed-by: Samuel Just <sam.just@inktank.com>
2014-02-25 15:28:08 -08:00
Samuel Just
dc079eb3c5 OSDMonitor: when thrashing, only generate valid temp pg mappings
Since backfill peers are no longer placed into the acting set,
temp mappings will never exceed the pool size.  Also, for ec
pools, temp mappings will never be less than the pool size.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-02-25 15:27:02 -08:00
David Zafman
9f7f4edad3 Revert "osd/PG: fix assert when deep repair finds no errors"
This reverts commit e3e3328ec8.
2014-02-24 19:56:48 -08:00
David Zafman
728e391112 osd: Don't include primary's shard in repair result message
Signed-off-by: David Zafman <david.zafman@inktank.com>
2014-02-24 19:56:02 -08:00
Gregory Farnum
60c9aafaf0 Merge pull request #1308 from ceph/wip-osdmap-inc
mon/OSDMonitor: fix osdmap encode feature logic

Reviewed-by: Greg Farnum <greg@inktank.com>
2014-02-24 09:17:42 -08:00
Gregory Farnum
1717601537 Merge pull request #1302 from ceph/wip-create-null
client: fix possible null dereference in create

Reviewed-by: Greg Farnum <greg@inktank.com>
2014-02-24 09:08:36 -08:00
Sage Weil
27968a74d2 ceph_test_objectstore: fix i386 build (again)
test/objectstore/store_test.cc: In member function ‘void SyntheticWorkloadState::read()’:
error: test/objectstore/store_test.cc:462:23: no matching function for call to ‘swap(uint64_t&, size_t&)’
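
This error arises because std::swap deduces a single template type from both arguments, and on i386 uint64_t and size_t are different types. The sketch below shows one way to avoid the mismatch; the function names are hypothetical and this is not necessarily the change made in the commit.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>

// Both parameters share one type, so std::swap deduces a single T
// on 32-bit and 64-bit builds alike.
void order_pair(uint64_t& lo, uint64_t& hi) {
  if (lo > hi)
    std::swap(lo, hi);
}

// Widen the size_t value explicitly before ordering, instead of calling
// swap(uint64_t&, size_t&), which fails to compile where the types differ.
uint64_t span_after_ordering(uint64_t offset, std::size_t len_in) {
  uint64_t len = len_in;
  order_pair(offset, len);
  return len - offset;
}
```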

Signed-off-by: Sage Weil <sage@inktank.com>
2014-02-23 19:54:34 -08:00
Sage Weil
5f53cf132b Merge pull request #1307 from ceph/wip-7517
Wip 7517
2014-02-23 19:49:18 -08:00
Sage Weil
14ea8157eb mon/OSDMonitor: fix osdmap encode feature logic
If we are encoding a full map based on an old Incremental that does not
encode the features, fall back to the quorum features or (barring that)
all features.  Do *not* fall back to no features, or else we will end up with
encode_client_old which does not even include the extended info and will
cause the mon to crash when decoding.

This was observed when upgrading a 0.76 cluster to 0.77 (all mons stopped,
upgraded, and then started).

Reported-by: Aaron Ten Clay <aarontc@aarontc.com>
Signed-off-by: Sage Weil <sage@inktank.com>
2014-02-23 18:23:55 -08:00
Samuel Just
7357b6ed4b PG: skip pg_whoami.osd, not pg_whoami.shard in scrub feature check
Caused by typo in 68184d4574.

Fixes: #7517
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-02-23 16:20:00 -08:00