RepoMirrors/ceph

mirror of https://github.com/ceph/ceph synced 2024-12-26 05:25:09 +00:00

Author	SHA1	Message	Date
Samuel Just	a576eb3204	PG: do not serve requests until replicas have activated There are two problems: 1) We choose the min last_update among peers with the max local-les value as an upper bound on requests which could have been reported to the client as committed. We then, for ec pools, roll back to that point to ensure that we don't inadvertently commit to an update which fewer than K replicas actually saw. If the primary sets local-les, accepts an update from a client, and there is a new interval before any of the replicas have been activated, we will end up being forced to use that update which no other replica has seen as the new last_update. This will cause the object to become unfound. We don't have this problem as long as all active replicas agree on last_update before we accept IO. 2) Even for replicated pools, we would then immediately respond to the request which created the primary-only update with a commit since it is in the log and we have no outstanding repops. If we then lose that primary before any of the replicas in the new interval record the new log, we will not only lose the object, but also the log entry recording it, which will result in a lost write. For these reasons, it seems like we need to wait for the replicas to activate before we can process new requests essentially because whatever update we select as last_update is essentially regarded as committed as soon as we accept IO. Fixes: #7649 Signed-off-by: Samuel Just <sam.just@inktank.com>	2014-03-12 10:38:17 -07:00
Samuel Just	83731a75d7	ReplicatedPG::finish_ctx: clear object_info if !obs.exists Otherwise, we see a different object_info_t depending on whether the transaction deleting the object clears before another op recreating it appears. In particular, we use oi.version to set the prior_version on the log entries in finish_ctx. If the oi is allowed to stick around the recreation log event will have a prior version of the deletion event when it should have a prior version of eversion_t(). Fixes: #7655 Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2014-03-09 12:22:12 -07:00
Sage Weil	40dc3f8b2c	Merge pull request #1405 from ceph/wip-7575 osd: Add hit_set_flushing to track current flushes and prevent races Reviewed-by: Sage Weil <sage@inktank.com> Reviewed-by: Samuel Just <sam.just@inktank.com>	2014-03-09 12:21:35 -07:00
Danny Al-Gaaf	a7afa1453b	config.cc: add debug_ prefix to subsys logging levels Add debug_ prefix also for 'ceph --admin-daemon *.asok config show' as already done e.g. by 'ceph-osd --show-config'. Fixes: #7602 Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de> Reviewed-by: Sage Weil <sage@inktank.com>	2014-03-09 10:32:39 -07:00
Sage Weil	2474e5322d	Merge pull request #1408 from ceph/wip-da-fix-doc Fixes and updates for doc Reviewed-by: Sage Weil <sage@inktank.com>	2014-03-09 09:56:18 -07:00
Danny Al-Gaaf	54ffdcc45d	get-involved.rst: update information Added #ceph-devel IRC channel, more mailing lists, wiki and planet.ceph.com. Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>	2014-03-09 02:18:28 +01:00
Danny Al-Gaaf	d1a888e0f2	swift/containerops.rst: fix some typos Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>	2014-03-09 01:02:43 +01:00
Danny Al-Gaaf	93b95a2874	radosgw/troubleshooting.rst: s/ceph-osd/OSD/ Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>	2014-03-09 00:58:57 +01:00
Danny Al-Gaaf	2223a372d6	radosgw/config-ref.rst: fix typo Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>	2014-03-09 00:30:49 +01:00
Danny Al-Gaaf	87618d4508	session_authentication.rst: fix some typos Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>	2014-03-09 00:19:08 +01:00
Danny Al-Gaaf	682c695898	release-process.rst: fix some typos Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>	2014-03-09 00:07:39 +01:00
Danny Al-Gaaf	72ee3389af	doc: s/osd/OSD/ if not part of a command First attempt to unify usage of OSD over rst files. Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>	2014-03-09 00:01:40 +01:00
Danny Al-Gaaf	e666019434	doc/dev/logs.rst: fix some typos Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>	2014-03-08 23:31:11 +01:00
Danny Al-Gaaf	bbd1c4bab5	filestore-filesystem-compat.rst: fix typo Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>	2014-03-08 23:25:53 +01:00
Danny Al-Gaaf	ae123a6dd5	corpus.rst: fix typo Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>	2014-03-08 23:22:38 +01:00
Danny Al-Gaaf	cf9f017d4e	config.rst: fix typo Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>	2014-03-08 23:16:24 +01:00
Danny Al-Gaaf	5aaecc7210	cephx_protocol.rst: fix typo Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>	2014-03-08 23:11:25 +01:00
Danny Al-Gaaf	2cbb0a402b	architecture.rst: fix typos Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>	2014-03-08 11:27:15 +01:00
Danny Al-Gaaf	a4cbb192ab	rados/operations/control.rst: fix typo Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>	2014-03-08 11:13:52 +01:00
Sage Weil	db0c37829c	Merge remote-tracking branch 'gh/wip-7210' into firefly Reviewed-by: Sage Weil <sage@inktank.com>	2014-03-07 15:23:31 -08:00
Sage Weil	1c8c61897d	qa/workunits/cephtool/test.sh: fix 'osd thrash' test - fix the wait check for osds to come back up - make sure they get marked back in, too Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Dan Mick <dan.mick@inktank.com>	2014-03-07 15:21:38 -08:00
Sage Weil	20754779ab	Merge pull request #1403 from ceph/wip-7642 mon: fix check for primary-affinity feature bit, and fix a race in similar checks Reviewed-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>	2014-03-07 15:05:30 -08:00
Sage Weil	b62f9f076a	mon/OSDMonitor: fix feature check bit arithmetic Make sure all features are present (instead of just any of them). Signed-off-by: Sage Weil <sage@inktank.com>	2014-03-07 14:44:42 -08:00
Sage Weil	21c225959d	Merge pull request #1404 from ceph/wip-7652 mon: fix infinite pg create msgs for erasure pools Reviewed-by: Samuel Just <sam.just@inktank.com>	2014-03-07 14:19:58 -08:00
Sage Weil	8d52fb70e1	mon/PGMap: send pg create messages to primary, not acting[0] For erasure pools, these may not match. In the case of #7652, this caused pg_create messages to be sent indefinitely. register_pg() added it to the list for acting_primary, and when we got the (non-creating) pg stat update we removed it from the list for acting[0]. Fixes: #7652 Signed-off-by: Sage Weil <sage@inktank.com>	2014-03-07 14:02:26 -08:00
Sage Weil	c8b34f19b3	mon/PGMonitor: improve debugging on PGMap updates slightly Chasing #7652 Signed-off-by: Sage Weil <sage@inktank.com>	2014-03-07 13:56:31 -08:00
Sage Weil	819cce2d41	mon/OSDMonitor: make osdmap feature checks non-racy The check for OSD features may race with the boot of an OSD that does not have the necessary features. Check the pending info too, and if there is a missing feature, return -EAGAIN. In the callers, wait on -EAGAIN. Signed-off-by: Sage Weil <sage@inktank.com>	2014-03-07 13:29:15 -08:00
Sage Weil	b9bcc1590c	mon/OSDMonitor: prevent set primary-affinity unless all OSDs support it Make sure all running OSDs support the feature before we start using it (even if the config option is on!). Fixes: #7642 Signed-off-by: Sage Weil <sage@inktank.com>	2014-03-07 13:29:15 -08:00
Joao Eduardo Luis	38fd666ac6	qa: workunits/mon/rbd_snaps_ops.sh: ENOTSUP on snap rm from copied pool 'rados cppool' copies the contents but that doesn't make the destination pool an unmanaged snaps pool. Therefore, we must get an ENOTSUP when we try to remove an unmanaged snap from a not-unmanaged pool. Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>	2014-03-07 19:49:56 +00:00
Joao Eduardo Luis	c13e1b7929	mon: OSDMonitor: don't remove unmanaged snaps from not-unmanaged pools Although we should allow creating unmanaged snaps on not-unmanaged pools, as long as those pools don't have any managed snapshots in them, we cannot allow removal -- because the pool will not have any unmanaged snapshots. Fixes: 7210 Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>	2014-03-07 19:49:50 +00:00
David Zafman	135c27ec74	osd: Add hit_set_flushing to track current flushes and prevent races When flushing a HitSet, track it in the hit_set_flushing map so that agent_load_hit_sets() doesn't try to read it too soon. Fixes: #7575 Signed-off-by: David Zafman <david.zafman@inktank.com>	2014-03-07 11:48:21 -08:00
Sage Weil	8221a8ecba	Merge pull request #1394 from ceph/wip-7610 obj_bencher: allocate contentsChars to object_size, not op_size Reviewed-by: Sage Weil <sage@inktank.com>	2014-03-06 21:11:25 -08:00
Sage Weil	23db6782bb	Merge pull request #1397 from ceph/wip-7638 ReplicatedPG::trim_object: use old_snaps for rollback Reviewed-by: Sage Weil <sage@inktank.com>	2014-03-06 20:06:59 -08:00
Sage Weil	4a0c3a6673	Merge pull request #1398 from ceph/wip-7634 ReplicatedPG: use hobject_t for snapset_contexts map Reviewed-by: Sage Weil <sage@inktank.com>	2014-03-06 20:05:19 -08:00
Samuel Just	0037ee4550	Merge pull request #1395 from ceph/wip-7637 osd: fix agent thread shutdown Reviewed-by: Samuel Just <sam.just@inktank.com>	2014-03-06 19:19:12 -08:00
Sage Weil	09668a4958	osd: fix agent thread shutdown We had an old invariant that agent_queue would have at least 1 entry in it to simplify some other code paths, but it turns out that it is simpler not to do that. In particular, this was triggering a failed assertion on shutdown when we assert that the queue is empty. Dump offending items on shutdown if they are there, though, to catch any future bugs. Fixes: #7637 Signed-off-by: Sage Weil <sage@inktank.com>	2014-03-06 16:12:30 -08:00
Samuel Just	06b96ffdc8	Merge pull request #1389 from ceph/wip-firefly-misc fix rest tests; fix COLL_MOVE_RENAME dump Reviewed-by: Samuel Just <sam.just@inktank.com>	2014-03-06 15:51:40 -08:00
Sage Weil	d4b4468c88	Merge pull request #1393 from dachary/wip-7072 logrotate: copy/paste daemon list from ceph-*-all-starter.conf Reviewed-by: Sage Weil <sage@inktank.com>	2014-03-06 15:51:15 -08:00
Loic Dachary	7411c3c6a4	logrotate: copy/paste daemon list from -all-starter.conf Each upstart/-all-starter.conf use the same script to find the list of daemons and their ids. Copy it over to the corresponding logrotate.conf script instead of using a less reliable script based on initctl list output. If logrotate fails to run initctl reload on a daemon, it will keep writing to the rotated log file, even after it is deleted and until it fills the disk. By using the exact same shell snippet as the upstart scripts used to start the daemon, all of them will be sent the HUP signal and reopen the log file that was just rotated. http://tracker.ceph.com/issues/7072 fixes #7072 Signed-off-by: Loic Dachary <loic@dachary.org>	2014-03-07 00:47:58 +01:00
Sage Weil	6f7c8c79f5	Merge pull request #1392 from ceph/wip-7632 ReplicatedPG: consistently use ctx->at_version.version for stashed objec... Reviewed-by: Sage Weil <sage@inktank.com>	2014-03-06 15:34:59 -08:00
Sage Weil	57c7e19819	Merge pull request #1391 from ceph/wip-7393 ReplicatedPG: clean up num_dirty adjustments Reviewed-by: Sage Weil <sage@inktank.com>	2014-03-06 15:30:59 -08:00
Samuel Just	b6872b255c	ReplicatedPG::trim_object: use old_snaps for rollback We need to rollback the old value of snaps, not the new one. Fixes: #7638 Signed-off-by: Samuel Just <sam.just@inktank.com>	2014-03-06 15:01:04 -08:00
Samuel Just	b5b67d19aa	ReplicatedPG: use hobject_t for snapset_contexts map Otherwise, two objects with different namespaces but the same object_t will end up clobbering each other's contexts. Fixes: #7634 Signed-off-by: Samuel Just <sam.just@inktank.com>	2014-03-06 14:40:12 -08:00
Sage Weil	b436930779	qa/workunits/rest/test.py: do not test 'osd thrash' This wreaks havoc on our QA because it marks osds up and down and then immediately after that we try to scrub and some osds are still down. Adjust the CLI test to wait for all OSDs to come back up after thrashing. Signed-off-by: Sage Weil <sage@inktank.com>	2014-03-06 13:46:10 -08:00
Sage Weil	237f0fb455	os/ObjectStore: dump COLL_MOVE_RENAME This got missed way back in `ef7cffc34f` (pre-0.71). Signed-off-by: Sage Weil <sage@inktank.com>	2014-03-06 13:44:39 -08:00
Samuel Just	f888ab41bd	ReplicatedPG: consistently use ctx->at_version.version for stashed object Otherwise, two ops might end up using the same version number. Fixes: #7632 Signed-off-by: Samuel Just <sam.just@inktank.com>	2014-03-06 12:11:31 -08:00
Samuel Just	eca7e633c8	ReplicatedPG: clean up num_dirty adjustments Previously, a _delete_head() followed by a recreation on an object in the same transaction would result in num_dirty being decremented in _delete_head() without the flag being cleared. make_writeable() would then see exists and was_dirty and therefore not increment num_dirty resulting in a mismatch. Rather than trying to maintain the num_dirty number in _delete_head(), rollback_to(), and make_writeable(), it seems simpler to do the adjustment once in make_writeable based on undirty, ctx->obc->obs.oi, and ctx->new_obs->oi. Fixes: 7393 Signed-off-by: Samuel Just <sam.just@inktank.com>	2014-03-06 12:05:10 -08:00
Samuel Just	d171418058	obj_bencher: allocate contentsChars to object_size, not op_size Otherwise, our attempt to sanitize object_size bytes of data.object_contents will be doomed to memory corruption. Fixes: #7610 Signed-off-by: Samuel Just <sam.just@inktank.com>	2014-03-06 11:12:25 -08:00
Sage Weil	7403b23544	Merge pull request #1386 from ceph/wip-7624 ReplicatedPG: ensure clones are readable after find_object_context Reviewed-by: Sage Weil <sage@inktank.com>	2014-03-06 11:01:30 -08:00
Sage Weil	cf2f3adfa6	Merge pull request #1387 from ceph/wip-7618 ReplicatedPG::wait_for_degraded_object: only recover if found Reviewed-by: Sage Weil <sage@inktank.com>	2014-03-06 10:59:47 -08:00
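
Commit b62f9f076a in the log above says the feature check should require all features to be present rather than any of them. The difference can be sketched as follows. This is a minimal illustration in Python, not the Ceph code: the feature names, bit values, and function names are hypothetical.

```python
# Hypothetical feature bits; names and values are illustrative only
# and are not taken from the Ceph source.
FEATURE_A = 1 << 0
FEATURE_B = 1 << 1
REQUIRED = FEATURE_A | FEATURE_B


def has_any(features: int, required: int) -> bool:
    """Weaker check: true if at least one required bit is set."""
    return (features & required) != 0


def has_all(features: int, required: int) -> bool:
    """Stricter check: true only if every required bit is set."""
    return (features & required) == required


# A peer that advertises only one of the two required features
# passes the "any" check but fails the "all" check.
partial = FEATURE_A
print(has_any(partial, REQUIRED))
print(has_all(partial, REQUIRED))
```

When a mask combines several bits, testing the masked value against zero only proves that one of them is set. Comparing it with the mask itself is the usual way to require every bit.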

32027 Commits