RepoMirrors/ceph

mirror of https://github.com/ceph/ceph synced 2025-03-11 02:39:05 +00:00

Author	SHA1	Message	Date
Greg Farnum	f1ccdb418b	Elector: share local command set when deferring We're about to use this at a basic level, to identify when we have "classic" monitors in-quorum, but could also do something more sophisticated like a set intersection on the commands. Signed-off-by: Greg Farnum <greg@inktank.com>	2013-12-09 11:26:04 -08:00
Greg Farnum	ba673be3e6	Monitor: import MonCommands.h from original Dumpling and expose it If the Elector doesn't receive a set of commands from the elected leader, it assumes the monitor is "classic" and uses the Dumpling command set as the leader set. Signed-off-by: Greg Farnum <greg@inktank.com>	2013-12-09 11:26:04 -08:00
Greg Farnum	3cb58f7406	Monitor: validate incoming commands against the leader's set too Then check against our own, and forward if we don't recognize it or for some reason don't match. Signed-off-by: Greg Farnum <greg@inktank.com>	2013-12-09 11:26:04 -08:00
Greg Farnum	cb51b1ed1a	Monitor: disseminate leader's command set instead of our own Signed-off-by: Greg Farnum <greg@inktank.com>	2013-12-09 11:26:04 -08:00
Greg Farnum	d33df28c2b	Elector: transmit local api on election win, accept leader's on loss If we're the leader, just point to our local set. Disseminating these will let peons advertise the full command set supported by the leader. INCOMPLETE: does not yet handle winning Electors who do not send a command set. Signed-off-by: Greg Farnum <greg@inktank.com>	2013-12-09 11:26:04 -08:00
Greg Farnum	8025fb33ad	messages: make room for passing supported monitor commands in MMonElection We're going to use this space to let leader tell everybody what commands it supports. Signed-off-by: Greg Farnum <greg@inktank.com>	2013-12-09 11:26:03 -08:00
Greg Farnum	f932903646	Monitor: pull command mapping out of _allowed_command() We want to be able to validate commands against both the leader and local command sets, so make that functionality generic. Signed-off-by: Greg Farnum <greg@inktank.com>	2013-12-09 11:26:03 -08:00
Sage Weil	7d000e3411	Merge pull request #918 from ceph/port/misc Misc portability patches Reviewed-by: Sage Weil <sage@inktank.com>	2013-12-09 11:16:49 -08:00
Sage Weil	4c5f7ba8ba	Merge pull request #922 from dachary/wip-crush-choose-tries crush: fix map->choose_tries boundary test Reviewed-by: Sage Weil <sage@inktank.com>	2013-12-09 08:28:43 -08:00
Loic Dachary	41152a6317	crush: --show-utilization* implies --show-statistics --show-utilization* outputs only if --show-statistics is set, which is confusing. Instead of failing, set --show-statistics to avoid the confusion. Signed-off-by: Loic Dachary <loic@dachary.org>	2013-12-09 10:57:17 +01:00
Greg Farnum	dcb0a4f3bb	Monitor: add a separate leader_supported_commands This isn't used yet, but will be shortly. Signed-off-by: Greg Farnum <greg@inktank.com>	2013-12-08 22:21:41 -08:00
Greg Farnum	4cd5c3bf3f	Monitor: expose local monitor commands to other compilation units Signed-off-by: Greg Farnum <greg@inktank.com>	2013-12-08 22:21:41 -08:00
Greg Farnum	dca5383f2e	MonCommand: add operator== and operator!= Signed-off-by: Greg Farnum <greg@inktank.com>	2013-12-08 22:21:41 -08:00
Greg Farnum	ac69a0122b	MonCommand: support encode/decode Signed-off-by: Greg Farnum <greg@inktank.com>	2013-12-08 22:21:41 -08:00
Greg Farnum	3dcbf460d1	encoding: fix [encode|decode]_array_nohead We want to actually encode each element and keep it, rather than writing each one at the position after the array end! Signed-off-by: Greg Farnum <greg@inktank.com>	2013-12-08 22:21:41 -08:00
Loic Dachary	7482d62f24	crush: add CrushTester accessors Signed-off-by: Loic Dachary <loic@dachary.org>	2013-12-08 22:17:26 +01:00
Loic Dachary	c928f077f7	crush: output --show-bad-mappings on err Instead of using stdout so that it displays well when used in conjunction with --show-statistics Signed-off-by: Loic Dachary <loic@dachary.org>	2013-12-08 22:17:26 +01:00
Loic Dachary	5e0722fab5	crush: fix map->choose_tries boundary test CrushWrapper::start_choose_profile allocates map->choose_tries with choose_total_tries elements. When crush_choose_firstn sets a value, it tests against map->choose_local_tries which could lead to memory corruption if map->choose_total_tries is smaller than map->choose_local_tries. Another undesirable but non-fatal side effect is that the output of crushtool --show-choose-tries will be truncated to choose_local_tries which is set to a lower value than choose_total_tries by the default tunables. Signed-off-by: Loic Dachary <loic@dachary.org>	2013-12-08 17:00:54 +01:00
Sage Weil	94da2153d1	Merge pull request #869 from ceph/wip-crush crush changes for erasure coding Reviewed-by: Loic Dachary <loic@dachary.org> Reviewed-by: Samuel Just <sam.just@inktank.com>	2013-12-07 20:59:22 -08:00
Noah Watkins	ef4061f0ad	librbd: remove unused private variable Signed-off-by: Noah Watkins <noahwatkins@gmail.com>	2013-12-07 18:07:03 -08:00
Noah Watkins	ad3825c608	TrackedOp: remove unused private variable Signed-off-by: Noah Watkins <noahwatkins@gmail.com>	2013-12-07 18:07:03 -08:00
Noah Watkins	3b39a8a9f1	librbd: rename howmany to avoid conflict A howmany macro exists on some platforms in standard headers, but there really isn't any sort of standard that I've found. We just avoid the conflict entirely this way. Signed-off-by: Noah Watkins <noahwatkins@gmail.com>	2013-12-07 18:07:03 -08:00
Sage Weil	096f9b3268	Merge pull request #917 from ceph/port/compat compat: define replacement TEMP_FAILURE_RETRY Reviewed-by: Sage Weil <sage@inktank.com>	2013-12-07 14:01:14 -08:00
Sage Weil	96068bfad6	Merge pull request #919 from ceph/port/fdatasync wbthrottle: use feature check for fdatasync Reviewed-by: Sage Weil <sage@inktank.com>	2013-12-07 14:00:40 -08:00
Noah Watkins	539fe26109	wbthrottle: use feature check for fdatasync Checking for fdatasync uses the same approach as the qemu configure script. The relevant commit is d1722a27f552a22561104210e0afad4577878e53. Here is a copy of the commit message which explains the check: Under Darwin, a symbol exists for the fdatasync() function, so that our link test succeeds. However _POSIX_SYNCHRONIZED_IO is set to '-1'. According to POSIX:2008, a value of -1 means the feature is not supported. A value of 0 means supported at compilation time, and a value greater than 0 means supported at both compilation and run time. Enable fdatasync() only if _POSIX_SYNCHRONIZED_IO is '>0'. Signed-off-by: Noah Watkins <noahwatkins@gmail.com>	2013-12-07 10:37:00 -08:00
Noah Watkins	663da61c02	rados_sync: fix mismatched tag warning Signed-off-by: Noah Watkins <noahwatkins@gmail.com>	2013-12-07 10:24:46 -08:00
Noah Watkins	60a25093a4	rados_sync: remove unused private variable Signed-off-by: Noah Watkins <noahwatkins@gmail.com>	2013-12-07 10:24:46 -08:00
Noah Watkins	43c1676778	mon: check for sys/vfs.h existence Signed-off-by: Noah Watkins <noahwatkins@gmail.com>	2013-12-07 10:24:20 -08:00
Noah Watkins	c99cf265fd	make: increase maximum template recursion depth With clang on OSX spirit blows up without this. Signed-off-by: Noah Watkins <noahwatkins@gmail.com>	2013-12-07 10:22:54 -08:00
Noah Watkins	e2be099118	compat: define replacement TEMP_FAILURE_RETRY Not all platforms have it. Signed-off-by: Noah Watkins <noahwatkins@gmail.com>	2013-12-07 10:18:51 -08:00
Sage Weil	a52ef1df49	Merge remote-tracking branch 'gh/wip-fix-3x' Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-12-06 16:56:10 -08:00
Sage Weil	0386095ea0	Merge remote-tracking branch 'gh/wip-fix-tunables' Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-12-06 16:55:54 -08:00
Sage Weil	3b3cbf52fb	crush/CrushCompiler: make current set of tunables 'safe' We can reenable this error the next time we add new tunables. Signed-off-by: Sage Weil <sage@inktank.com>	2013-12-06 16:24:16 -08:00
Sage Weil	8535ceda03	crushtool: remove scary tunables messages Signed-off-by: Sage Weil <sage@inktank.com>	2013-12-06 16:24:15 -08:00
Sage Weil	4eb8891d8d	crush/CrushCompiler: start with legacy tunables when compiling Ensure that a crush file is always compiled deterministically, even though the default values for new maps have changed. Signed-off-by: Sage Weil <sage@inktank.com>	2013-12-06 16:24:15 -08:00
Sage Weil	e8fdef217f	crush: add indep data set to cli tests This will help us catch things if we break the mapping. Signed-off-by: Sage Weil <sage@inktank.com>	2013-12-06 16:22:59 -08:00
Sage Weil	564de6ea05	osdmaptool: fix cli tests for 3x Signed-off-by: Sage Weil <sage@inktank.com>	2013-12-06 16:22:26 -08:00
Sage Weil	6704be68d4	osd: default to 3x replication 3x is the recommendation; it should be the default too. Signed-off-by: Sage Weil <sage@inktank.com>	2013-12-06 16:21:37 -08:00
Sage Weil	308e4f9def	Merge pull request #913 from dachary/wip-crush-unittest CrushWrapper::move_bucket unittest and minor fixes Reviewed-by: Sage Weil <sage@inktank.com>	2013-12-06 16:10:00 -08:00
Josh Durgin	8d0180b1b7	objecter: don't take extra throttle budget for resent ops These ops have already taken their budget in the original op_submit(). It will be returned via put_op_budget() when they complete. If there were many localized reads of missing objects from replicas, or cache pool redirects, this would cause the objecter to use up all of its op throttle budget and hang. Signed-off-by: Josh Durgin <josh.durgin@inktank.com>	2013-12-06 16:03:20 -08:00
Sage Weil	38647f7627	Revert "osd: default to 3x replication" This reverts commit `cb26fbde52`. Fix unit tests and do integration tests first; this may have unexpected consequences.	2013-12-06 15:48:39 -08:00
Loic Dachary	cbeb1f4510	crush: detach_bucket must test item >= 0 not > 0 Since detach_bucket is a private helper solely used by move_bucket which contains another ( correct ) safeguard, the code cannot be reached and the problem can never happen. If another function uses detach_bucket, it may happen. Signed-off-by: Loic Dachary <loic@dachary.org>	2013-12-07 00:31:54 +01:00
Loic Dachary	2cd73f9d3e	crush: remove obsolete comments from link_bucket Probably copy/pasted from move_bucket. Signed-off-by: Loic Dachary <loic@dachary.org>	2013-12-07 00:27:09 +01:00
Loic Dachary	e00324b2bc	crush: remove redundant code from move_bucket The following was introduced in 2012 by `a2d0cff1b0` // un-set the device name so we can use add_item later build_rmap(name_map, name_rmap); name_map.erase(id); name_rmap.erase(id_name); when insert_item refused to move a bucket for which a name already exists. It was changed in 2013 by `4e2557a038` and now supports it. The TestCrushWrapper unittest for move_bucket passes. Signed-off-by: Loic Dachary <loic@dachary.org>	2013-12-07 00:21:16 +01:00
Loic Dachary	8ef80a4c67	crush: unittest CrushWrapper::move_bucket Signed-off-by: Loic Dachary <loic@dachary.org>	2013-12-07 00:20:31 +01:00
Sage Weil	865880b5b1	Merge pull request #888 from ceph/wip-crush-tunables default to bobtail-era crush tunables. Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>	2013-12-06 14:45:57 -08:00
Sage Weil	650f896c4d	Merge pull request #903 from ceph/wip-memstore memstore: reference ObjectStore backend Reviewed-by: Samuel Just <sam.just@inktank.com>	2013-12-06 14:38:15 -08:00
Josh Durgin	3caf3effcb	rbd: check write return code during bench-write This allows rbd-bench to detect http://tracker.ceph.com/issues/6938 when combined with rapidly changing the mon osd full ratio. Signed-off-by: Josh Durgin <josh.durgin@inktank.com>	2013-12-06 14:33:41 -08:00
Josh Durgin	e32874fc5a	objecter: resend all writes after osdmap loses the full flag Now that the osd does not respond if it gets a map with the full flag set first, clients need to resend all writes. Clients talking to old osds are still subject to the race condition, so both sides must be upgraded to avoid it. Refs: #6938 Backport: dumpling, emperor Signed-off-by: Josh Durgin <josh.durgin@inktank.com>	2013-12-06 14:33:35 -08:00
Josh Durgin	4111729dda	osd: drop writes when full instead of returning an error There's a race between the client and osd with a newly marked full osdmap. If the client gets the new map first, it blocks writes and everything works as expected, with no errors from the osd. If the osd gets the map first, however, it will respond to any writes with -ENOSPC. Clients will pass this up the stack, and not retry these writes later. -ENOSPC isn't handled well by all clients. RBD, for example, may pass it on to qemu or kernel rbd which will both interpret it as EIO. Filesystems on top of rbd will not behave well when they receive EIOs like this, especially if the cluster oscillates between full and not full, so some writes succeed. To fix this, never return ENOSPC from the osd because of a map marked full, and rely on the client to retry all writes when the map is no longer marked full. Old clients talking to osds with this fix will hang instead of propagating an error, but only if they run into this race condition. ceph-fuse and rbd with caching enabled are not affected, since the ObjectCacher will retry writes that return errors. Refs: #6938 Backport: dumpling, emperor Signed-off-by: Josh Durgin <josh.durgin@inktank.com>	2013-12-06 14:33:26 -08:00
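
The fdatasync feature check described in commit 539fe26109 can be illustrated with a short sketch. This is not the code from that commit: the helper names `sync_data` and `write_durably` are hypothetical, and falling back to `fsync()` is an assumption made here for illustration. The sketch only shows the rule quoted in the commit message, namely that `fdatasync()` is used only when `_POSIX_SYNCHRONIZED_IO` is greater than 0.

```cpp
#include <fcntl.h>
#include <unistd.h>  // defines _POSIX_SYNCHRONIZED_IO where the platform provides it

// Hypothetical helper (not from the Ceph tree): flush file data to stable storage.
// fdatasync() is used only when _POSIX_SYNCHRONIZED_IO is > 0; a value of -1
// (as reported for Darwin in the commit message) means it is not supported.
// Falling back to fsync() is an assumption of this sketch.
static int sync_data(int fd) {
#if defined(_POSIX_SYNCHRONIZED_IO) && (_POSIX_SYNCHRONIZED_IO > 0)
  return ::fdatasync(fd);
#else
  return ::fsync(fd);
#endif
}

// Hypothetical usage: write a buffer to a file and flush it before closing.
// Returns true only if the full buffer was written and the flush succeeded.
static bool write_durably(const char* path, const char* buf, size_t len) {
  int fd = ::open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
  if (fd < 0)
    return false;
  bool ok = ::write(fd, buf, len) == static_cast<ssize_t>(len) &&
            sync_data(fd) == 0;
  ::close(fd);
  return ok;
}
```

Testing the macro's value rather than whether the symbol links avoids the Darwin case from the commit message, where the `fdatasync()` symbol exists but the feature is reported as unsupported.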


30070 Commits