Commit Graph

31601 Commits

Author SHA1 Message Date
Sage Weil
9ac03ef579 osd/ReplicatedPG: fix finish_flush
Make sure we reallocate a pgbackend transaction at the time when we are
initiating new work.

Signed-off-by: Sage Weil <sage@inktank.com>
2014-02-15 22:09:38 -08:00
Sage Weil
34fcf42c69 osd/HitSet: add HitSetRef
Signed-off-by: Sage Weil <sage@inktank.com>
2014-02-15 22:09:38 -08:00
Sage Weil
6950212315 osd/ReplicatedPG: factor clone check out of evict op code
Move the check for clones into a helper so that we will be able to use it in
other places where we need to evict.

Signed-off-by: Sage Weil <sage@inktank.com>
2014-02-15 22:09:38 -08:00
Sage Weil
fc28a99f55 osd/ReplicatedPG: add on_finish to OpContext
Add a callback hook for whenever an OpContext completes or cancels.  We
are pretty sloppy here about the return values because our initial user
will not care, and it is unclear if future users will.

Signed-off-by: Sage Weil <sage@inktank.com>
2014-02-15 22:09:37 -08:00
Sage Weil
a57052cb7b mon: include dirty stats in 'ceph df detail'
Signed-off-by: Sage Weil <sage@inktank.com>
2014-02-15 22:09:37 -08:00
Sage Weil
bc945248ec osd: rename test/test_osd_types.cc -> test/osd/types.cc
Signed-off-by: Sage Weil <sage@inktank.com>
2014-02-15 22:09:37 -08:00
Sage Weil
e65c280b0e osd: add pg_pool_t::get_pg_num_divisor
A PG is not always an equally sized fraction of the total pool size due to
the use of ceph_stable_mod.  Add a helper to return the fraction
(denominator) of a given pg based on the current pg_num value.
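
A minimal Python sketch of the idea, not the C++ code from this commit. It assumes the usual stable-mod rule (keep `x & mask` if it is below `pg_num`, otherwise fall back to the half-size mask). The names `stable_mod`, `pg_num_mask`, and `pg_num_divisor` are illustrative only.

```python
def stable_mod(x, b, bmask):
    """Map x into [0, b) so that growing b only moves values out of split bins."""
    return x & bmask if (x & bmask) < b else x & (bmask >> 1)


def pg_num_mask(pg_num):
    """Smallest (2^n - 1) mask that covers pg_num - 1."""
    return (1 << (pg_num - 1).bit_length()) - 1


def pg_num_divisor(ps, pg_num):
    """Denominator of the pool fraction that placement seed `ps` receives."""
    mask = pg_num_mask(pg_num)
    if pg_num == mask + 1:
        return pg_num                  # power of two: every PG is equal
    half = mask >> 1
    if (ps & half) < (pg_num & half):
        return mask + 1                # this PG has already been split
    return (mask + 1) >> 1             # not split yet, so twice as large


if __name__ == "__main__":
    pg_num = 12
    print([pg_num_divisor(ps, pg_num) for ps in range(pg_num)])
```

With `pg_num = 12`, PGs 4 through 7 each hold 1/8 of the pool while the others hold 1/16, which is why a per-PG divisor is needed rather than a plain `1/pg_num`.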

Signed-off-by: Sage Weil <sage@inktank.com>
2014-02-15 22:08:12 -08:00
Sage Weil
95f25ce092 mon/OSDMonitor: allow new pool policy fields to be set
Signed-off-by: Sage Weil <sage@inktank.com>
2014-02-15 22:08:12 -08:00
Sage Weil
0988c8438b osd/osd_types: add cache policy fields to pg_pool_t
Signed-off-by: Sage Weil <sage@inktank.com>
2014-02-15 22:08:12 -08:00
Sage Weil
297d54eb95 histogram: add decay
Signed-off-by: Sage Weil <sage@inktank.com>
2014-02-15 22:07:04 -08:00
Sage Weil
fb4152aeab histogram: move to common, add unit tests
Signed-off-by: Sage Weil <sage@inktank.com>
2014-02-15 22:07:04 -08:00
Sage Weil
85a82722cc histogram: rename set -> set_bin
Signed-off-by: Sage Weil <sage@inktank.com>
2014-02-15 22:07:04 -08:00
Sage Weil
8b68ad037f histogram: calculate bin position of a value in the histogram
Generate a lower and upper bound.

Signed-off-by: Sage Weil <sage@inktank.com>
2014-02-15 22:07:04 -08:00
Sage Weil
af848d4a4a Merge pull request #1176 from ceph/wip-primary-affinity
osd: primary affinity

Added primary-affinity thrashing to thrashosd.py.

Reviewed-by: Loic Dachary <loic@dachary.org>
2014-02-15 16:59:35 -08:00
Sage Weil
f0f1cf4f96 Merge pull request #1249 from dachary/wip-qa-erasure-test
qa: do not create erasure pools yet
2014-02-15 16:48:36 -08:00
Loic Dachary
d921d9b383 qa: do not create erasure pools yet
comment out erasure pool related tests when an OSD is involved because
it does not work yet. See http://tracker.ceph.com/issues/7360.

Signed-off-by: Loic Dachary <loic@dachary.org>
2014-02-16 00:53:13 +01:00
Sage Weil
c673f4084d osd/OSDMap: include primary affinity in OSDMap::print
Signed-off-by: Sage Weil <sage@inktank.com>
2014-02-15 10:50:09 -08:00
Sage Weil
87be7c1574 osd/OSDMap: remove bad assert
You can have an erasure pool with all CRUSH_ITEM_NONE and primary == -1.
acting is not empty.

Signed-off-by: Sage Weil <sage@inktank.com>
2014-02-15 10:50:09 -08:00
Sage Weil
ba3eef86d8 mon/OSDMonitor: add 'mon osd allow primary affinity' bool option
By default, disallow adjustment of primary affinity unless the user has
opted in by adjusting their monitor config.  This will avoid some user
pain because inadvertently setting the affinity will prevent older clients
from connecting to and using the cluster.

Signed-off-by: Sage Weil <sage@inktank.com>
2014-02-15 10:50:09 -08:00
Sage Weil
c360c604aa ceph_psim: some futzing to test primary_affinity
- map to acting
- count first position, primary

Signed-off-by: Sage Weil <sage@inktank.com>
2014-02-15 10:50:09 -08:00
Sage Weil
f825624ff0 osd/OSDMap: add primary_affinity feature bit
Indicate that we support it.  Indicate when an OSDMap requires it.

Signed-off-by: Sage Weil <sage@inktank.com>
2014-02-15 10:50:09 -08:00
Sage Weil
8ecec02fc1 osd/OSDMap: apply primary_affinity to mapping
The behavior is a bit different for replicated and indep/erasure mode.
In the first case, we are rearranging the result.  In the second case,
we can just set the primary argument to the right value.
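
The difference between the two modes can be sketched roughly as follows. This is an illustration under stated assumptions, not the OSDMap code: the selection rule (reject an OSD as primary when a per-PG hash is at or above its affinity, falling back to the first rejected OSD), the `0x10000` scale, and the `_hash16` helper are all assumptions made for the example.

```python
import hashlib

MAX_AFFINITY = 0x10000   # assumed scale: this value means "always eligible"
NONE = None              # stands in for a hole (CRUSH_ITEM_NONE) in the mapping


def _hash16(seed, osd):
    """Hypothetical stand-in for a stable per-(PG, OSD) hash in [0, 0xffff]."""
    digest = hashlib.sha256(f"{seed}:{osd}".encode()).digest()
    return int.from_bytes(digest[:2], "big")


def choose_primary(osds, affinity, seed):
    """Return the index of the OSD that should act as primary, or None."""
    fallback = None
    for pos, osd in enumerate(osds):
        if osd is NONE:
            continue
        a = affinity.get(osd, MAX_AFFINITY)
        if a < MAX_AFFINITY and _hash16(seed, osd) >= a:
            # Rejected as primary; remember the first one in case all are.
            if fallback is None:
                fallback = pos
            continue
        return pos
    return fallback


def apply_primary_affinity(osds, affinity, seed, replicated):
    """Return (mapping, primary) after applying the affinity choice."""
    osds = list(osds)
    pos = choose_primary(osds, affinity, seed)
    if pos is None:
        return osds, None
    primary = osds[pos]
    if replicated:
        # Replicated: rearrange so the chosen primary comes first.
        return [primary] + osds[:pos] + osds[pos + 1:], primary
    # indep/erasure: positions are meaningful, so only report the primary.
    return osds, primary
```

For example, `apply_primary_affinity([1, 2, 3], {1: 0}, 7, True)` moves OSD 2 to the front, while the same call with `replicated=False` leaves the order alone and only reports OSD 2 as primary.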

Signed-off-by: Sage Weil <sage@inktank.com>
2014-02-15 10:50:08 -08:00
Sage Weil
a91d0cbc1b Merge pull request #1245 from ceph/wip-brag
ceph-brag

Sebastien Han confirms that this is under the default (LGPL2) license, thus:

Signed-off-by: Sage Weil <sage@inktank.com>
2014-02-15 10:37:26 -08:00
Sage Weil
871a5f04f0 ceph.spec: add ceph-brag
Signed-off-by: Sage Weil <sage@inktank.com>
2014-02-15 10:15:16 -08:00
Sage Weil
4ea0a25aa6 debian: add ceph-brag
Signed-off-by: Sage Weil <sage@inktank.com>
2014-02-15 10:15:15 -08:00
Sage Weil
57d7018371 ceph-brag: add Makefile
Signed-off-by: Sage Weil <sage@inktank.com>
2014-02-15 10:15:15 -08:00
Sage Weil
7e9f03b18e Merge pull request #1181 from dachary/wip-7277
DNM: mon: s/ENOSYS/ENOTSUP/

Reviewed-by: Christophe Courtaut <christophe.courtaut@gmail.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2014-02-15 10:08:45 -08:00
Sage Weil
e485c95f74 Merge branch 'master' of https://github.com/enovance/ceph-brag into wip-brag
2014-02-15 09:17:22 -08:00
Sage Weil
cf4f7027e7 mon/Elector: bootstrap on timeout
Currently if an election times out we call a new
election.  If we have never joined a quorum, bootstrap
instead. This is heavier weight, but captures the case
where, during bootstrap:

 - a and b have learned each others' addresses
 - everybody calls an election
 - a and b form a quorum
 - c loops trying to call an election, but is ignored
   because a and b don't see its address in the monmap

See logs:
  ubuntu@teuthology:/var/lib/teuthworker/archive/sage-2014-02-14_13:50:04-ceph-deploy-wip-7212-sage-b-testing-basic-plana/83194
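
The timeout decision described above could be sketched like this. `MonSketch`, `has_ever_joined`, and the method names are hypothetical and exist only for the illustration; the real monitor tracks this state differently.

```python
from dataclasses import dataclass, field


@dataclass
class MonSketch:
    """Hypothetical monitor state; records which action was taken."""
    has_ever_joined: bool = False
    actions: list = field(default_factory=list)

    def bootstrap(self):
        # Heavier: go back to probing peers and learning addresses.
        self.actions.append("bootstrap")

    def start_election(self):
        self.actions.append("election")


def on_election_timeout(mon):
    """Call a new election only if we have been part of a quorum before."""
    if mon.has_ever_joined:
        mon.start_election()
    else:
        mon.bootstrap()
```

In the scenario from the list, monitor c has never joined a quorum, so a timeout sends it back through bootstrap instead of looping on elections that a and b ignore.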

Signed-off-by: Sage Weil <sage@inktank.com>
2014-02-15 08:59:51 -08:00
Sage Weil
4595c44ba1 mon: tell MonmapMonitor first about winning an election
It is important in the bootstrap case that the very first paxos round
also codify the contents of the monmap itself in order to avoid any manner
of confusing scenarios where subsequent elections are called and people
try to recover and modify paxos without agreeing on who the quorum
participants are.

Signed-off-by: Sage Weil <sage@inktank.com>
2014-02-14 20:27:45 -08:00
Sage Weil
7bd2104acf mon: only learn peer addresses when monmap == 0
It is only safe to dynamically update the address for a peer mon in our
monmap if we are in the midst of the initial quorum formation (i.e.,
monmap.epoch == 0).  If it is a later epoch, we have formed our initial
quorum and any and all monmap changes need to be agreed upon by the quorum
and committed via paxos.
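
As a rough sketch of the guard, assuming a monmap represented as a plain dict with an `epoch` and a name-to-address table (both hypothetical shapes chosen for the example):

```python
def maybe_learn_peer_addr(monmap, name, addr):
    """Update a peer's address in place only during initial quorum formation.

    Returns True if the monmap was changed.
    """
    if monmap["epoch"] != 0:
        # A quorum exists: changes must be agreed upon and committed via paxos.
        return False
    if monmap["mons"].get(name) == addr:
        return False
    monmap["mons"][name] = addr
    return True
```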

Fixes: #7212
Signed-off-by: Sage Weil <sage@inktank.com>
2014-02-14 20:27:44 -08:00
Greg Farnum
3c76b81f2f OSD: use the osdmap_subscribe helper
Signed-off-by: Greg Farnum <greg@inktank.com>
2014-02-14 16:54:43 -08:00
Greg Farnum
6db3ae851d OSD: create a helper for handling OSDMap subscriptions, and clean them up
We've had some trouble with not clearing out subscription requests and
overloading the monitors (though only because of other bugs). Write a
helper for handling subscription requests that we can use to centralize
safety logic. Clear out the subscription whenever we get a map that covers
it; if there are more maps available than we received, we will issue another
subscription request based on "m->newest_map" at the end of handle_osd_map().

Notice that the helper will no longer request old maps which we already have,
and that unless forced it will not dispatch multiple subscribe requests
to a single monitor.
Skipping old maps is safe:
1) we only trim old maps when the monitor tells us to,
2) we do not send messages to our peers until we have updated our maps
from the monitor.
That means only old and broken OSDs will send us messages based on maps
in our past, and we can (and should) ignore any directives from them anyway.
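
A simplified model of the behaviour described above, not the OSD code itself. The class `OsdSketch` and its fields are hypothetical; only `handle_osd_map` and `newest_map` are taken from the commit message.

```python
class OsdSketch:
    """Toy model of an OSD tracking a single OSDMap subscription."""

    def __init__(self, epoch):
        self.epoch = epoch     # newest OSDMap epoch we hold
        self.wanted = None     # start epoch of the outstanding subscription
        self.sent = []         # subscribe requests "sent" to the monitor

    def osdmap_subscribe(self, epoch, force_request=False):
        if epoch <= self.epoch:
            return             # never request old maps we already have
        changed = self.wanted is None or self.wanted < epoch
        if changed:
            self.wanted = epoch
        if changed or force_request:
            self.sent.append(self.wanted)

    def handle_osd_map(self, first, last, newest_map):
        self.epoch = max(self.epoch, last)
        if self.wanted is not None and self.wanted <= last:
            self.wanted = None             # this message covers the request
        if newest_map > last:
            self.osdmap_subscribe(last + 1)  # more maps exist: ask again
```

Repeating `osdmap_subscribe(11)` sends nothing the second time unless `force_request=True`, which mirrors the "unless forced" rule in the message.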

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-02-14 16:54:43 -08:00
Greg Farnum
5b9c187caf monc: new sub_want_increment() function to make handling subscriptions easier
Provide a subscription-modifying function which will not decrement
the start version.
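
One way to picture such a function, using a plain dict as the subscription table. The Python name is modeled on the commit title; the table layout and return value are assumptions for the sketch.

```python
def sub_want_increment(subs, what, start, flags=0):
    """Record a subscription only if it moves the start version forward.

    Returns True if the table changed, so the caller knows to send a request.
    """
    current = subs.get(what)
    if current is None or current["start"] < start:
        subs[what] = {"start": start, "flags": flags}
        return True
    return False
```

A caller asking for an older version than the one already requested leaves the table untouched and gets `False` back.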

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-02-14 16:53:51 -08:00
Sage Weil
7d398c2ae2 doc/release-notes: v0.67.6
Signed-off-by: Sage Weil <sage@inktank.com>
2014-02-14 14:20:51 -08:00
Sage Weil
f47062d8a6 Merge pull request #1237 from dachary/wip-hashpspool
mon: ceph hashpspool false clears the flag

Reviewed-by: Christophe Courtaut <christophe.courtaut@gmail.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2014-02-14 09:27:33 -08:00
Loic Dachary
d8964b2f33 Merge pull request #1235 from ceph/wip-osdmaptool-pool-fix
wip-osdmaptool-pool-fix

Reviewed-by: Loic Dachary <loic@dachary.org>
2014-02-14 13:33:26 +01:00
Ilya Dryomov
0ed6a81b4b osdmaptool: add tests for --pool option
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
2014-02-14 12:24:33 +02:00
Ilya Dryomov
f98435a45f osdmaptool: add --pool option for --test-map-pgs mode to usage()
--test-map-pgs mode allows mapping all pgs from either all pools or just
one pool.  Mention it in usage output.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
2014-02-14 12:24:32 +02:00
Ilya Dryomov
eedbf501e6 osdmaptool: fix --pool option for --test-map-object mode
Commit 7f1b12f2ef ("osdmaptool: add --test-map-pgs mode") broke
--pool for --test-map-object mode.  Fix it, and improve --pool option
handling for both modes while at it (report strict_strtol() errors,
check if specified pool exists).
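
The two checks mentioned above (report parse errors, verify the pool exists) might look like this in outline. This is a Python illustration, not osdmaptool's C++; `parse_pool_option` and its error strings are invented for the example, and Python's `int()` is slightly more lenient than a strict C parser (it accepts surrounding whitespace, for instance).

```python
def parse_pool_option(text, existing_pools):
    """Return (pool_id, None) on success or (None, error_message) on failure."""
    try:
        pool = int(text, 10)
    except ValueError:
        return None, f"error parsing pool id {text!r}"
    if pool not in existing_pools:
        return None, f"pool {pool} does not exist"
    return pool, None
```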

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
2014-02-14 12:24:25 +02:00
Greg Farnum
e44122f094 test: fix signed/unsigned warnings in TestCrushWrapper.cc
Irritatingly, using 0 binds to int and generates warnings
if the thing we're checking is unsigned, so we have to be
explicit.

Signed-off-by: Greg Farnum <greg@inktank.com>
Reviewed-by: Sam Just <sam.just@inktank.com>
2014-02-13 22:01:59 -08:00
Loic Dachary
589e2fa485 mon: ceph hashpspool false clears the flag
instead of toggling it.
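
The distinction is between an XOR, which flips the bit on every call, and an explicit set or clear. A small sketch, with the bit position chosen purely for illustration:

```python
FLAG_HASHPSPOOL = 1 << 0   # bit position is an assumption for this example


def toggle_hashpspool(flags):
    """Old behaviour: flips the flag regardless of the requested value."""
    return flags ^ FLAG_HASHPSPOOL


def set_hashpspool(flags, enabled):
    """Fixed behaviour: the result depends only on the requested value."""
    if enabled:
        return flags | FLAG_HASHPSPOOL
    return flags & ~FLAG_HASHPSPOOL
```

Calling `set_hashpspool(flags, False)` twice leaves the flag cleared, whereas toggling twice would turn it back on.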

Signed-off-by: Loic Dachary <loic@dachary.org>
2014-02-13 19:07:36 +01:00
Loic Dachary
7834535f7b mon: remove format argument from osd crush dump
The --format argument of the ceph cli is used to send the desired format
argument. The format argument is always part of the command sent to the
server. Adding it to the command description in MonCommand is not
necessary.

partially revert cec1893310
revert fce4d68404

Signed-off-by: Loic Dachary <loic@dachary.org>
2014-02-13 16:27:42 +01:00
Sage Weil
0ae5e53e8b Merge pull request #1231 from dachary/wip-mon-create-simple
mon: do not goto reply if a ruleset exists in pending

Reviewed-by: Sage Weil <sage@inktank.com>
2014-02-13 07:00:18 -08:00
Sage Weil
39e313fcfa Merge pull request #1216 from ceph/wip-null-xattr
mds: remove xattr when null value is given to setxattr()

Reviewed-by: Sage Weil <sage@inktank.com>
2014-02-13 06:59:15 -08:00
Sage Weil
9cbbc883e2 Merge branch 'wip-libcephfs-firefly-rb' of https://github.com/linuxbox2/linuxbox-ceph
Reviewed-by: Sage Weil <sage@inktank.com>

This went through the fs suite and passed:

	http://pulpito.ceph.com/sage-2014-02-12_13:38:53-fs-wip-libcephfs-testing-basic-plana
2014-02-13 06:51:37 -08:00
Loic Dachary
020e543e34 mon: do not goto reply if a ruleset exists in pending
If the crush ruleset is found in pending, do not goto reply because it
does not exist yet. Wait for the pending proposal (and the ruleset) to
be accepted and then only return that it exists.
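
In outline, the intended control flow is a three-way decision. The sketch below uses sets for the committed and pending rulesets and string results in place of the monitor's reply and wait mechanisms; all names are hypothetical.

```python
def create_simple(name, committed, pending):
    """Decide how to answer a request to create a crush ruleset."""
    if name in committed:
        return "exists"      # safe to reply: the ruleset is in the OSDMap
    if name in pending:
        return "wait"        # not accepted yet: retry after the proposal
    pending.add(name)
    return "proposed"


def commit_pending(committed, pending):
    """Model the pending proposal being accepted."""
    committed |= pending
    pending.clear()
```

A second request for the same name gets `"wait"` until `commit_pending` runs, and only then `"exists"`.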

revert 4b687ba673

Signed-off-by: Loic Dachary <loic@dachary.org>
2014-02-13 15:50:36 +01:00
Loic Dachary
192ed6151c Merge pull request #1202 from dachary/wip-mon-create-simple
mon: create simple should goto reply when it exists in pending

Reviewed-By: Christophe Courtaut <christophe.courtaut@gmail.com>
2014-02-13 15:06:27 +01:00
Loic Dachary
0c9c1577f6 mon: osd crush rule create-simple functional tests
Basic tests and a test that creates the conditions where an OSDMap
is pending with a ruleset that is not yet in the OSDMap. An attempt to
create a rule by the same name will return success and not create it again.

Signed-off-by: Loic Dachary <loic@dachary.org>
2014-02-13 14:54:59 +01:00
Loic Dachary
c248e7cf6c mon: osd crush rule functional tests
* A set of tests for the simplest operations
* A test covering all cases of osd crush rule

Signed-off-by: Loic Dachary <loic@dachary.org>
2014-02-13 14:54:52 +01:00