Share past intervals when starting up new replicas. This can happen via
an MOSDPGInfo or an MOSDPGLog message.
Fix up get_or_create_pg() so the past_intervals arg is required (and a ref,
like the other args). Fix doxygen comment.
Now the only time generate_past_intervals() should do any work is when
upgrading old clusters, during pg creation, and (possibly) during pg
split (when that is fully implemented).
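As a rough illustration only (the real function takes more arguments and the
Ceph-specific types), the shape of the get_or_create_pg() change is roughly:

    #include <map>
    #include <vector>

    struct pg_info_t { };                                   // simplified stand-in
    struct pg_interval_t { std::vector<int> up, acting; };  // simplified stand-in
    class PG;

    // before: past_intervals was optional and could be omitted by the caller
    // after:  it is a required const reference, like the other inputs
    PG *get_or_create_pg(const pg_info_t &info,
                         const std::map<unsigned, pg_interval_t> &past_intervals,
                         unsigned epoch, int from);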
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
If ceph-osd is way behind, we will advance through past maps before we
mark ourselves up. This avoids the slow recalculation once we are up, and
the ensuing badness.
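A minimal sketch of the idea (not the actual boot path; names are illustrative):
consume the backlog of map epochs before sending the boot message, so peering
does not have to grind through them after we are already up.

    void catch_up_then_boot(unsigned superblock_epoch, unsigned mon_newest_epoch) {
      unsigned e = superblock_epoch;
      while (e < mon_newest_epoch) {
        ++e;
        // fetch/load the incremental for epoch e and apply it here
      }
      // only once we are caught up do we send the boot message and mark up
    }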
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
There is a nice symmetry there with fulfill_log(), but it is a short
function with a single caller that mostly just forces us to copy a bunch
of data structures around unnecessarily. Drop it.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Send past_intervals along with pg_info_t on every notify. The reasoning
here is as follows:
- we already have the state in memory
- if we don't send it, and the primary doesn't have it, it will
recalculate it by reading/decoding many previous maps from disk
- for a highly-tortured cluster, I see past_intervals on the order of
  ~6 KB, times 600 pgs means ~3.5 MB sent for every activate_map(). For
  comparison, the same cluster would need to read and decode ~1 GB of
  maps to recalculate the same info.
- for healthy clusters, the data is small, and costs little.
- for unhealthy clusters, the data is large, but most useful.
In theory we could set a threshold so that we don't send it if it is
large, but allow the primary to query it explicitly. I doubt it's worth
the complexity.
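For illustration, assuming simplified stand-in types, the notify payload
effectively becomes the pair below rather than the bare info:

    #include <map>
    #include <vector>

    struct pg_info_t { };                                                       // simplified
    struct pg_interval_t { unsigned first, last; std::vector<int> acting; };    // simplified

    struct notify_payload {
      pg_info_t info;
      std::map<unsigned, pg_interval_t> past_intervals;  // keyed by first epoch of interval
    };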
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
We can (currently) get into a situation where we don't have the full
history back to last_epoch_clean: non-primaries record past intervals
as they go but don't initially have the full history, so they end up
with only a partial, recent history.
If this happens, only fill in what's missing; no need to rebuild the recent
parts too.
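A sketch of the intended behavior, with illustrative names (the real code works
on the pg's recorded interval map):

    #include <map>

    struct pg_interval_t { unsigned first = 0, last = 0; };

    // Generate only the intervals older than what we already hold, down to
    // last_epoch_clean; the recent intervals we recorded are kept as-is.
    void fill_missing(std::map<unsigned, pg_interval_t> &past_intervals,
                      unsigned last_epoch_clean) {
      if (past_intervals.empty())
        return;  // nothing recorded yet; the full rebuild path handles this
      unsigned have_back_to = past_intervals.begin()->second.first;
      if (have_back_to <= last_epoch_clean)
        return;  // already complete
      // ...compute intervals for [last_epoch_clean, have_back_to) and insert...
    }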
Signed-off-by: Sage Weil <sage@newdream.net>
We may not be able to recalculate all the way back to last_epoch_clean due to
the oldest_map floor. Figure out what we want, and what we could actually
calculate, before deciding whether what we have is insufficient.
Also, print something if we discard and recalculate so it is clear what is
happening and why.
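In other words (illustrative helper, not the real code), the target epoch is
clamped by the oldest map we still have before comparing it against what is
already recorded:

    #include <algorithm>

    // The furthest back we could possibly recalculate is bounded by the
    // oldest map still on disk, so that is the realistic target to compare
    // our existing past_intervals against.
    unsigned realistic_target(unsigned last_epoch_clean, unsigned oldest_map) {
      return std::max(last_epoch_clean, oldest_map);
    }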
Signed-off-by: Sage Weil <sage@newdream.net>
We may send an MOSDMap as a reply to various requests, including
- a failure report
- a boot message
- a pg_temp message
- an up_thru message
In these cases, send a single MOSDMap message, but limit how big it gets.
All recipients here are osds, which are smart enough to request more maps
based on the MOSDMap::newest_map field.
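A rough sketch of the bounding logic (names invented for illustration); the
recipient sees newest_map and asks for more if its range was truncated:

    #include <algorithm>

    struct map_range {
      unsigned first = 0, last = 0;  // inclusive range of epochs included
      unsigned newest_map = 0;       // newest epoch the sender has
    };

    map_range bounded_reply(unsigned peer_has, unsigned our_newest,
                            unsigned max_maps_per_message) {
      map_range r;
      r.newest_map = our_newest;
      r.first = peer_has + 1;
      r.last = std::min(our_newest, peer_has + max_maps_per_message);
      return r;
    }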
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Compare *every* address for a match, or else note that it is (or might be)
different. Previously, we falsely took diff==0 to mean that all addrs
were definitely equal, which was not necessarily the case.
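The corrected rule, sketched with plain types for illustration: only conclude
"same" after every slot has been checked.

    #include <string>
    #include <vector>

    bool addrs_definitely_equal(const std::vector<std::string> &a,
                                const std::vector<std::string> &b) {
      if (a.size() != b.size())
        return false;               // might be different; do not assume equal
      for (size_t i = 0; i < a.size(); ++i)
        if (a[i] != b[i])
          return false;
      return true;                  // every address matched
    }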
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Fixes: #2353. The problem was that (at least) two osd processes were
racing on fs detection, which triggered errors in the btrfs snapshot
create/remove.
Signed-off-by: Yehuda Sadeh <yehuda.sadeh@dreamhost.com>
This allows you to get and set subsystem debug levels via the
normal config access methods. Among other things, this allows librados
users to set debug levels.
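For example, a librados user can now tune a subsystem's debug level through the
usual config calls; this assumes the C++ librados bindings, and the subsystem
and level shown are arbitrary:

    #include <rados/librados.hpp>

    int main() {
      librados::Rados cluster;
      cluster.init(nullptr);                    // default client id
      cluster.conf_read_file(nullptr);          // default ceph.conf search path
      cluster.conf_set("debug_objecter", "20"); // set a subsystem debug level
      return cluster.connect() == 0 ? 0 : 1;
    }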
Fixes: #2350
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
We only deal with the case where the entire map is identical, since the
individual items are too small to make the pointer overhead worthwhile.
Too bad. An in-memory btree-like structure would work better for this.
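The sharing itself is just pointer reuse; a minimal sketch with generic types
(nothing Ceph-specific):

    #include <map>
    #include <memory>
    #include <string>

    using entity_map = std::map<int, std::string>;

    // If the freshly decoded map is identical to the previous epoch's,
    // keep pointing at the old object instead of storing a second copy.
    std::shared_ptr<const entity_map>
    maybe_share(const std::shared_ptr<const entity_map> &prev, entity_map &&fresh) {
      if (prev && *prev == fresh)
        return prev;
      return std::make_shared<const entity_map>(std::move(fresh));
    }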
Signed-off-by: Sage Weil <sage@newdream.net>
This is cruft from the old primary/chain/splay replication code. All
current code says <0 is stray, 0 is primary, and >0 is replica. That is,
the role is the acting vector position, or -1 if not in the vector.
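Stated as code (a sketch of the rule, not the exact helper in the tree):

    #include <vector>

    int calc_role(int whoami, const std::vector<int> &acting) {
      for (size_t i = 0; i < acting.size(); ++i)
        if (acting[i] == whoami)
          return static_cast<int>(i);  // 0 = primary, >0 = replica
      return -1;                       // not in the acting vector: stray
    }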
Signed-off-by: Sage Weil <sage@newdream.net>
Compare two maps. If an addr matches, share the reference. If all
addrs match, share the entire vector.
This leads to roughly a 70% drop in memory utilization for the set of
thrashed maps I'm working with.
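Sketch of the per-entry sharing, using generic shared pointers as stand-ins for
the real refcounted address types:

    #include <memory>
    #include <string>
    #include <vector>

    using addr_ref = std::shared_ptr<const std::string>;
    using addr_vec_ref = std::shared_ptr<const std::vector<addr_ref>>;

    addr_vec_ref dedup_addrs(const addr_vec_ref &old,
                             const std::vector<std::string> &fresh) {
      auto out = std::make_shared<std::vector<addr_ref>>();
      bool all_same = old && old->size() == fresh.size();
      for (size_t i = 0; i < fresh.size(); ++i) {
        if (old && i < old->size() && *(*old)[i] == fresh[i]) {
          out->push_back((*old)[i]);   // address matches: share the reference
        } else {
          out->push_back(std::make_shared<const std::string>(fresh[i]));
          all_same = false;
        }
      }
      if (all_same)
        return old;                    // everything matched: share the vector
      return out;
    }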
Signed-off-by: Sage Weil <sage@newdream.net>
It is possible that the crush map contains device ids that do not exist as
osds. Filter them out of the CRUSH result.
Drop the max devices assert, as that is trivially violated by adding a new
item to the crush map beyond max_osd (via 'ceph osd crush add ...').
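Illustrative filtering step (the "exists as an osd" predicate is whatever the
caller has on hand; here it is just a bitmap):

    #include <algorithm>
    #include <vector>

    // Drop raw CRUSH output ids that don't correspond to an existing osd,
    // including ids at or beyond max_osd.
    void filter_raw_result(std::vector<int> &result, int max_osd,
                           const std::vector<bool> &osd_exists) {
      result.erase(std::remove_if(result.begin(), result.end(),
                                  [&](int id) {
                                    return id < 0 || id >= max_osd || !osd_exists[id];
                                  }),
                   result.end());
    }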
Signed-off-by: Sage Weil <sage@newdream.net>