32012 Commits

Author SHA1 Message Date
Sage Weil
772968e60b mon/OSDMonitor: disallow crush buckets of type 0
Prevent creation of buckets of type 0 ('osd', 'device', etc.), as they
will confuse the mapping algorithm.

Signed-off-by: Sage Weil <sage@inktank.com>
2014-03-05 13:15:58 -08:00
Samuel Just
8b3934fc0f PGBackend::rollback_stash: remove the correct shard
Fixes: #7616
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-05 12:53:08 -08:00
Samuel Just
1ddec86e64 FileStore::_collection_move_rename: propagate EEXIST
Previously, an EEXIST would get masked by the subsequent clone
operation.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-05 12:52:49 -08:00
Sage Weil
ca12e0d92e qa/workunits/mon/crush_ops: use expect_false
Signed-off-by: Sage Weil <sage@inktank.com>
2014-03-05 12:52:08 -08:00
Sage Weil
561869d9c1 Merge pull request #1376 from ceph/wip-7608
test: Fix tiering test cases to use --force-nonempty

Reviewed-by: Sage Weil <sage@inktank.com>
2014-03-05 12:35:56 -08:00
David Zafman
e016e83bce test: Fix tiering test cases to use --force-nonempty
Fixes: #7608

Signed-off-by: David Zafman <david.zafman@inktank.com>
2014-03-05 12:31:29 -08:00
Sage Weil
0592368070 mon: warn when pool nears target max objects/bytes
The cache pools will throttle when they reach the target max size, so it
is important to make the administrator aware when they approach that point.
Unfortunately it is not particularly easy to efficiently keep track of
which PGs have hit their limit and use that for reporting.  However, it
is easy to raise a flag when we start to approach the target for the
entire pool, and that sort of early warning is arguably more useful
anyway.

Trigger the warning based on the target full ratio.  Not when we hit the
target, but when we are 2/3 between it and completely full.

Implements: #7442
Signed-off-by: Sage Weil <sage@inktank.com>
2014-03-05 11:59:07 -08:00
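
The threshold arithmetic described in this commit can be sketched as follows. This is a minimal illustration, not the monitor's code: the function and parameter names are hypothetical, and it assumes the warning point is the target full ratio plus 2/3 of the remaining distance to completely full, applied to the pool's target max.

```python
def warn_threshold(target_max, target_full_ratio, warn_fraction=2 / 3):
    """Usage level at which a warning would be raised (hypothetical helper).

    The warning point sits warn_fraction of the way between the target
    full ratio and completely full (1.0).
    """
    warn_ratio = target_full_ratio + (1.0 - target_full_ratio) * warn_fraction
    return target_max * warn_ratio


def should_warn(used, target_max, target_full_ratio):
    """Return True once usage passes the warning point; no target means no warning."""
    if target_max <= 0:
        return False
    return used > warn_threshold(target_max, target_full_ratio)
```

With a target max of 1000 objects and a target full ratio of 0.8, the warning point lands a little above 933 objects, so 900 objects would not warn and 940 would.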
Sage Weil
8106adee4b Merge pull request #1375 from ceph/wip-pgmap-stat
mon/PGMap: return empty stats if pool is not in sum

Reviewed-by: Greg Farnum <greg@inktank.com>
2014-03-05 11:07:03 -08:00
Sage Weil
f6edceefe2 mon/PGMap: return empty stats if pool is not in sum
Greg was right!

When a pool is created, the PGs are not added to the PGMap until the *next*
proposal.  Weaken the assert here and return empty stats for non-existent
(new) pools so that a pool create + tier add sequence does not crash.

Signed-off-by: Sage Weil <sage@inktank.com>
2014-03-05 10:44:41 -08:00
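
The behavior described here, returning empty stats rather than asserting when a pool is not yet in the map, might look like the sketch below. `PoolStats`, `get_pool_stats`, and the dictionary standing in for the per-pool sums are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class PoolStats:
    # Hypothetical stand-in for per-pool statistics.
    num_objects: int = 0
    num_bytes: int = 0


def get_pool_stats(pool_sums, pool_id):
    """Return the stats for pool_id, or empty stats for a pool not yet in the map.

    A freshly created pool has no entry until the next proposal, so a
    missing key is treated as "no data yet" instead of a fatal error.
    """
    return pool_sums.get(pool_id, PoolStats())
```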
Sage Weil
4901347e89 Merge pull request #1373 from ceph/wip-crush-json
crush: revise JSON format for 'item' type

Reviewed-by: Sage Weil <sage@inktank.com>
2014-03-05 08:52:45 -08:00
John Spray
1685c6f75c crush: revise JSON format for 'item' type
Commit a7e9a7b648 changed the JSON format of CRUSH rules
such that the 'item' attribute on a step was sometimes
an integer and sometimes a string.

This commit separates the integer and string representations
so that tools which rely on 'item' consistently being an
integer ID will work.

Signed-off-by: John Spray <john.spray@inktank.com>
2014-03-05 16:28:00 +00:00
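
A sketch of the idea of keeping the two representations in separate fields. The commit message does not give the name of the string field, so `item_name` below is an assumption, as are the function and its arguments.

```python
import json


def dump_step(op, item_id, names):
    """Serialize a rule step with the integer ID and the name in separate fields.

    'item' is always an integer; the human-readable name goes into a
    separate field (called 'item_name' here as an assumption).
    """
    return {"op": op, "item": item_id, "item_name": names.get(item_id, "")}


step = dump_step("take", -1, {-1: "default"})
encoded = json.dumps(step)
```

A consumer can then read `item` as an integer without first checking its type.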
Samuel Just
4cb1cbfbf3 ReplicatedPG::fill_in_copy_get: fix omap loop conditions
cursor.omap_offset indicates the most recently recovered key; we continue
filling in at the smallest key k | k > cursor.omap_offset.  If the loop
as written terminates due to !(left > 0), iter points at the next key to
copy, rather than the last key copied, resulting in the next copy
operation skipping that key.

Now, iter, if valid, must point to the last key copied once the loop has
completed since we check left <= 0 prior to advancing iter.  We can
therefore use it to fill in cursor.omap_offset.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-04 19:29:21 -08:00
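
The loop invariant described above can be illustrated with a small sketch. It is not the ReplicatedPG code: the function name, the dictionary standing in for the omap, and the byte-based budget are assumptions. The point is that the budget is checked before the position advances, so the recorded cursor always names the last key actually copied.

```python
import bisect


def copy_omap_chunk(omap, cursor_offset, budget):
    """Copy keys strictly greater than cursor_offset until the budget is spent.

    Returns (copied_items, new_cursor_offset, complete).
    """
    keys = sorted(omap)
    # Resume at the smallest key k such that k > cursor_offset.
    i = 0 if cursor_offset is None else bisect.bisect_right(keys, cursor_offset)
    copied = []
    last_copied = cursor_offset
    left = budget
    while i < len(keys):
        key = keys[i]
        copied.append((key, omap[key]))
        last_copied = key
        left -= len(key) + len(omap[key])
        if left <= 0:
            break  # stop before advancing: keys[i] is the last key copied
        i += 1
    complete = not keys or (last_copied is not None and last_copied >= keys[-1])
    return copied, last_copied, complete
```

Calling this repeatedly and feeding each returned cursor into the next call visits every key exactly once, with no key skipped at a chunk boundary.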
Samuel Just
11393ab7e5 ReplicatedPG::fill_in_copy_get: remove extraneous if statement
This should leave the behavior unchanged.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-04 19:29:20 -08:00
Samuel Just
8fdfece9fd ReplicatedPG::fill_in_copy_get: fix early return bug
This is not a leak: we are in an else block where cb must
be NULL.  The fix as introduced did not include braces on
the if, causing the method to return unconditionally.

Fixes: #7604
Introduced in: 500206d809
Reviewed-by: David Zafman <david.zafman@inktank.com>
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-04 19:23:00 -08:00
Samuel Just
4bf28df229 Merge remote-tracking branch 'upstream/wip-7447' into firefly
Reviewed-by: Greg Farnum <greg@inktank.com>
2014-03-04 19:22:08 -08:00
Samuel Just
d0b1094ff7 ECBackend,ReplicatedPG: delete temp if we didn't get the transaction
We always send the transaction for operations on temp objects,
but if we didn't get the final transaction on the actual object,
we might end up failing to remove the temp object.  Thus, if
we get a sub op and don't have the transaction, just remove the
named temp objects.

Fixes: #7447
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-04 15:29:20 -08:00
Samuel Just
f2a4eec1d6 PGBackend/ECBackend: handle temp objects correctly
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-04 15:29:20 -08:00
Samuel Just
308ea1bd9e ECMsgTypes: fix constructor temp_added/temp_removed ordering to match users
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-04 15:29:20 -08:00
Samuel Just
3e219961a0 ReplicatedPG::finish_ctx: use correct snapdir prior version in events
Fixes: #7595
Reviewed-by: Greg Farnum <greg@inktank.com>
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-04 15:00:20 -08:00
Loic Dachary
4938212b69 Merge pull request #1360 from enovance/wip-brag
Fixes for ceph-brag

Reviewed-by: Loic Dachary <loic@dachary.org>
2014-03-04 12:41:54 +01:00
Babu Shanmugam
46b9f65506 Merge remote-tracking branch 'brag/master' into firefly
Signed-off-by: Babu Shanmugam <anbu@enovance.com>
2014-03-04 14:16:49 +05:30
Sage Weil
d223d3a7d2 Merge pull request #1352 from dachary/wip-7578
common: -- support for env_to_vec

Reviewed-by: Sage Weil <sage@inktank.com>
2014-03-03 21:43:39 -08:00
Sage Weil
bcea57d61f Merge pull request #1342 from ceph/wip-cache-add
mon: add 'osd tier add-cache ...' command (DNM until after wip-tier-add)

Reviewed-by: Loic Dachary <loic@dachary.org>
2014-03-03 21:37:56 -08:00
Sage Weil
397e844397 Merge pull request #1335 from ceph/wip-tier-add
mon: prevent non-empty pools from being added as tiers

Reviewed-by: Greg Farnum <greg@inktank.com>
2014-03-03 21:36:22 -08:00
Gregory Farnum
48e55d9881 Merge pull request #1358 from ceph/wip-2288
mds: check projected xattr when handling setxattr

Reviewed-by: Greg Farnum <greg@inktank.com>
2014-03-03 21:19:40 -08:00
Sage Weil
49e54aba33 mon/OSDMonitor: fix race in 'osd tier remove ...'
Signed-off-by: Sage Weil <sage@inktank.com>
2014-03-03 21:16:24 -08:00
Sage Weil
241b9e81f1 mon/OSDMonitor: fix some whitespace
Signed-off-by: Sage Weil <sage@inktank.com>
2014-03-03 21:16:24 -08:00
Sage Weil
c029c2fbf1 mon/OSDMonitor: add 'osd tier add-cache <pool> <size>' command
This is a friendlier interface for setting up a cache tier with some
reasonable defaults (defined via config options).  This will simplify
the user experience and documentation.

Signed-off-by: Sage Weil <sage@inktank.com>
2014-03-03 21:16:24 -08:00
Sage Weil
62e0eb7f2e mon/OSDMonitor: handle 'osd tier add ...' race/corner case
If you have two racing requests to add two different pools as a tier, the
committed checks will pass but their proposals will conflict.  Recheck the
pending pools for the same conditions and wait for a commit if they
occur.

Reported-by: Loic Dachary <loic@dachary.org>
Signed-off-by: Sage Weil <sage@inktank.com>
2014-03-03 21:16:24 -08:00
Sage Weil
0e5fd0e322 osd: make default bloom hit set fpp configurable
Signed-off-by: Sage Weil <sage@inktank.com>
2014-03-03 21:16:24 -08:00
Sage Weil
eddf7b68ff osd/ReplicatedPG: fix agent division by zero
If the pool is empty we cannot divide by the object count.

Signed-off-by: Sage Weil <sage@inktank.com>
2014-03-03 21:16:24 -08:00
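
The guard this commit describes amounts to checking the object count before dividing. The sketch below uses an average object size as a stand-in; which quantity the agent actually computes is not stated in the commit message, so the function is purely illustrative.

```python
def average_object_size(num_bytes, num_objects):
    """Average bytes per object, returning 0 for an empty pool (hypothetical helper)."""
    if num_objects == 0:
        # An empty pool has no objects to average over.
        return 0
    return num_bytes // num_objects
```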
Sage Weil
08efb45889 OSDMonitor: do not add non-empty tier pool unless forced
In general, users should not use non-empty pools as new tiers or else
things can behave strangely:

 - if the data sets are unrelated, behavior will be... strange.
 - if the cache pool is not "new" and does not do the OMAP flag, the OSD
   will not know not to flush omap objects to an EC base tier
 - probably other random stuff I'm forgetting

Allow a user to shoot themselves in the foot with --force-nonempty.

Implements: #7457
Signed-off-by: Sage Weil <sage@inktank.com>
2014-03-03 21:11:17 -08:00
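
The check described in this commit can be sketched as a simple guard. The names `TierError` and `check_tier_add` are hypothetical, and the object count is assumed to be available to the caller; only the --force-nonempty override comes from the commit message.

```python
class TierError(Exception):
    """Raised when a pool may not be added as a tier (hypothetical error type)."""


def check_tier_add(tier_num_objects, force_nonempty=False):
    """Refuse to add a non-empty pool as a tier unless the caller forces it."""
    if tier_num_objects > 0 and not force_nonempty:
        raise TierError(
            "tier pool is not empty; pass --force-nonempty to override"
        )
```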
Yan, Zheng
12909bb607 mds: check projected xattr when handling setxattr
Fixes: #2288
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-03-04 10:55:23 +08:00
Samuel Just
198b0aa268 Merge pull request #1354 from ceph/wip-7563
Wip 7563

Reviewed-by: Greg Farnum <greg@inktank.com>
2014-03-03 17:05:53 -08:00
Samuel Just
192a27cac6 Merge pull request #1355 from ceph/wip-osd-verbosity
osd: be a bit more verbose on startup

Reviewed-by: Samuel Just <sam.just@inktank.com>
2014-03-03 16:38:54 -08:00
Samuel Just
20fe162ece TestPGLog: tests for proc_replica_log/merge_log equivalence
We need the merge_log and proc_replica_log paths to result in the
same missing set.  This patch adds some machinery for specifying
a log merge scenario and comparing both paths to the same correct
result.  This machinery also makes it a bit easier to read and add
new tests.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-03 16:05:17 -08:00
Samuel Just
9a64947ca1 TestPGLog::proc_replica_log: adjust wonky test
This test didn't quite make sense since the divergent entry
cannot be from a newer epoch.  It also didn't quite match the
diagram.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-03 16:05:17 -08:00
Samuel Just
6b6065ab9d TestPGLog::proc_replica_log: adjust to corrected proc_replica_log behavior
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-03 16:05:17 -08:00
Samuel Just
97f35960a0 TestPGLog::proc_replica_log: add prior_version to some entries
Otherwise, the test logs are invalid.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-03 16:05:17 -08:00
Samuel Just
200e2964ea PGLog::proc_replica_log: _merge_divergent_entries based on truncated olog
We can't merge using the primary's log since we haven't decided whether
to send them a complete log yet.  Thus, merge based on the truncated olog
rather than the primary's log.  This is a consequence of the division
between trimming divergent entries in peering/unfound search and sending
a complete log to actual members of the actingbackfill set in activate().
_merge_divergent_entries on the truncated log and add_next_event() on the
newer entries result in the same missing/log regardless of the order in
which they are performed.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-03 16:05:16 -08:00
Samuel Just
b0357abcae PG.h:PGLogEntryHandler: remove silly cant_rollback logic
Also, we now call rollback in a reverse order, so there is no
need to reverse the entries again.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-03 16:05:16 -08:00
Samuel Just
c99b7e1985 PG,PGLog: replace _merge_old_entry with _merge_object_divergent_entries
The _merge_old_entry structure had trouble distinguishing between the
following cases:

missing: foo, 1,1
merge_old_entry modify 1,1 0,0
merge_old_entry modify 1,2 1,1

and
merge_old_entry modify 1,2 1,1

In the first case, we should end up with foo removed from missing
at the end.  In the second, we need foo added to missing at 1,1.
It's far simpler to present all of the divergent entries for a single
object at once.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-03 16:05:12 -08:00
Samuel Just
86b21e0b78 TestPGLog::merge_old_entry: ne.version cannot be oe.version
Otherwise, it would not be divergent!

Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-03 16:05:11 -08:00
Samuel Just
3dc4f10a9a TestPGLog::merge_old_entry: we no longer use merge_old_entry this way
This needs to be replaced with an equivalent test of
_merge_object_divergent_entries.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-03 16:05:11 -08:00
Samuel Just
ff329ac52b TestPGLog:rewind_divergent_log: set prior_version for delete
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-03 16:05:11 -08:00
Samuel Just
9e43dd6ee3 TestPGLog: ignore merge_old_entry return value
No callers use the merge_old_entry return value.  _merge_divergent_entries
won't have one.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-03 16:05:11 -08:00
Samuel Just
3cc9e2262c TestPGLog: not worth maintaining tests of assert behavior
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-03 16:05:11 -08:00
David Zafman
dda72dee70 Merge pull request #1356 from ceph/wip-7458
osd: stray pg ref on shutdown

Reviewed-by: Samuel Just <sam.just@inktank.com>
2014-03-03 14:47:38 -08:00
Samuel Just
a234053d42 OSD,config_opts: log osd state changes at level 0 instead
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-03 13:53:54 -08:00
Sage Weil
fd9c29b9b0 Merge pull request #1341 from ceph/wip-osd-status
osd: 'status' admin socket command

Reviewed-by: Loic Dachary <loic@dachary.org>
2014-03-03 11:21:11 -08:00