RepoMirrors/ceph

mirror of https://github.com/ceph/ceph synced 2025-01-11 13:41:02 +00:00

Author	SHA1	Message	Date
Samuel Just	adc9b91f37	os/HashIndex: use set<pair<string, hobject_t>> rather than multimap Multimap does not make any guarantees about ordering of different values with the same key. list_by_hash, however, assumes that the iterator order matches hobject_t order. Thus, we use set<pair<string, hobject_t> > to get the proper ordering. Backport: stable Signed-off-by: Samuel Just <sam.just@inktank.com>	2012-07-20 12:29:03 -07:00
Sage Weil	0b84384fd4	mon: shut up about sessionless MPGStats messages If the mon gets a reset on the client connection, it clears the session on the connection. This is perfectly normal to see. Signed-off-by: Sage Weil <sage@inktank.com>	2012-07-19 22:14:11 -07:00
Sage Weil	6580450fbc	osd: clean up boot method names Prefix subsequent steps with _. Better names. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>	2012-07-19 21:27:40 -07:00
Sage Weil	369fbf6110	osd: defer boot if heartbeatmap indicates we are unhealthy If the OSD is bogged down or unresponsive, we should not try to join the cluster. This was observed on congress (slow/clogged op_tp combined with osdmap thrashing). Fixes: #2502 Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>	2012-07-19 21:27:37 -07:00
Sage Weil	d76df212c8	Merge branch 'next' Conflicts: src/include/ceph_features.h	2012-07-19 20:22:35 -07:00
Sage Weil	dec936923f	osd/mon: subscribe (onetime) to pg creations on connect Ask the monitor for pending pg creations each time we connect. Normally, this is a freebie check. If there are pending creations, though, it ensures that the OSD finds out about them even if the original lame broadcast didn't reach it. Specifically: - osd is hunting for a monitor, but isn't yet connected - new pgs are created - send_pg_creates() sends out create messages, but osd does not get it - osd finally connects to a mon Fixes: #2151 (tho the bug description is bad) Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Samuel Just <sam.just@inktank.com>	2012-07-19 17:13:09 -07:00
Sage Weil	7f58b9beee	mon: track pg creations by osd Track the pending pg creations by osd, and use a helper to send out those messages. Signed-off-by: Sage Weil <sage@inktank.com>	2012-07-19 17:13:09 -07:00
Sage Weil	4c6c927b27	Revert "rbd: fix usage for snap commands" This reverts commit `42de6873f9`. Actually, these are fine! Dan made them all kinds of fancy.	2012-07-19 16:45:07 -07:00
Sage Weil	42de6873f9	rbd: fix usage for snap commands Snap commands take '--snap <snapname> <imagename>'. Signed-off-by: Sage Weil <sage@inktank.com>	2012-07-19 16:48:18 -07:00
Mike Ryan	58cd27fd29	doc: add missing dependencies to README Signed-off-by: Mike Ryan <mike.ryan@inktank.com>	2012-07-19 11:29:40 -07:00
Sage Weil	6f381affdc	add CRUSH_TUNABLES feature bit Signed-off-by: Sage Weil <sage@inktank.com>	2012-07-18 19:49:58 -07:00
Samuel Just	e3349a2a3d	OSD::handle_osd_map: don't lock pgs while advancing maps We no longer do anything with the pgs here. PG map advancing is now handled in OSD::advance_pg asynchronously. Signed-off-by: Samuel Just <sam.just@inktank.com>	2012-07-18 15:37:28 -07:00
Sage Weil	c8ee30160d	osd: add osd_debug_drop_pg_create_{probability,duration} options This will let us exercise more of the pg creation code. Signed-off-by: Sage Weil <sage@inktank.com>	2012-07-18 14:26:16 -07:00
Samuel Just	8f5562ffe6	OSD: write_if_dirty during get_or_create_pg after handle_create In the case that the pg is newly created, we will activate during that call, so the info and log will be dirty. Signed-off-by: Samuel Just <sam.just@inktank.com>	2012-07-18 14:26:16 -07:00
Samuel Just	ca9f713004	OSD: actually send queries during handle_pg_create During the osd threading refactor, we lost the do_queries call in favor of dispatch_context. However, this did not include the queries triggered prior to pg instantiation. Instead, use the rctx to send the queries. Part of #2771. Without the queries being sent, can_create_pg will never become true. Signed-off-by: Samuel Just <sam.just@inktank.com>	2012-07-18 14:26:16 -07:00
Josh Durgin	0d0b468914	Merge branch 'next'	2012-07-18 12:58:47 -07:00
Sage Weil	5dd68b95b1	objecter: always resend linger registrations If a linger op (watch) is sent to the OSD and updates the object, and then the client loses the reply, it will resend the request. The OSD will see that it is a dup, however, and not set up the in-memory session state for the watch. This in turn will break the watch (i.e., notifies won't get delivered). Instead, always resend linger registration ops, so that we always have a unique reqid and do the correct session registration for each session. * track the tid of the registration op for each LingerOp * mark registration ops as should_resend=false; cancel as needed * when we send a new registration op, cancel the old one to ensure we ignore the reply. This is needed because we resend linger ops on any pg change, not just a primary change. * drop the first_send arg to send_linger(), as we can now infer that from register_tid == 0. The bug was easily reproduced with ms inject socket failures = 500 and the test_stress_watch utility. Fixes: #2796 Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-07-18 12:55:35 -07:00
Samuel Just	76efd9772c	OSD: publish_map in init to initialize OSDService map Other areas rely on OSDService::get_map() to function, possibly before activate_map is first called. In particular, with handle_osd_ping, not initializing the map member results in: ceph version 0.48argonaut-413-g90ddc5a (commit:90ddc5ae51627e7656459085d7e15105c8b8316d) 1: /tmp/cephtest/binary/usr/local/bin/ceph-osd() [0x71ba9a] 2: (()+0xfcb0) [0x7fcd8243dcb0] 3: (OSD::handle_osd_ping(MOSDPing)+0x74d) [0x5dbdfd] 4: (OSD::heartbeat_dispatch(Message)+0x22b) [0x5dc70b] 5: (SimpleMessenger::DispatchQueue::entry()+0x92b) [0x7b5b3b] 6: (SimpleMessenger::dispatch_entry()+0x24) [0x7b6914] 7: (SimpleMessenger::DispatchThread::entry()+0xd) [0x7762fd] 8: (()+0x7e9a) [0x7fcd82435e9a] 9: (clone()+0x6d) [0x7fcd809ea4bd] NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. Signed-off-by: Samuel Just <sam.just@inktank.com>	2012-07-18 10:44:36 -07:00
Sage Weil	7586cde9de	qa/workunits/suites/pjd.sh: bash -x This will let us see what test is failing, exactly, and what its inputs were. Hoping to help find #2187. Signed-off-by: Sage Weil <sage@inktank.com>	2012-07-18 10:52:44 -07:00
Josh Durgin	675d630203	ObjectCacher: fix cache_bytes_hit accounting Misses are not hits! Signed-off-by: Josh Durgin <josh.durgin@inktank.com>	2012-07-18 10:25:13 -07:00
John Wilkins	4e1d973e46	doc: Fixed heading text. Signed-off-by: John Wilkins <john.wilkins@inktank.com>	2012-07-18 07:35:35 -07:00
John Wilkins	ebc577361c	doc: favicon.ico should be new Ceph icon. Signed-off-by: John Wilkins <john.wilkins@inktank.com>	2012-07-18 07:35:00 -07:00
John Wilkins	3a377c44e1	doc: Overhauled Swift API documentation. Signed-off-by: John Wilkins <john.wilkins@inktank.com>	2012-07-17 21:28:59 -07:00
Sage Weil	aecf0031c8	Merge branch 'next'	2012-07-17 19:20:06 -07:00
Sage Weil	d78235be1b	client: fix readdir locking Several of the readdir-related methods were not taking client_lock. Fixes: #1737 Backport: argonaut Signed-off-by: Sage Weil <sage@inktank.com>	2012-07-17 19:19:39 -07:00
Sage Weil	82a575c9a5	client: fix leak of client_lock when not initialized Backport: argonaut Signed-off-by: Sage Weil <sage@inktank.com>	2012-07-17 19:18:55 -07:00
Samuel Just	90ddc5ae51	OSD: use service.get_osdmap() in heartbeat(), don't grab map_lock service.get_osdmap() gives us sufficiently consistent access to the map state. Signed-off-by: Samuel Just <sam.just@inktank.com>	2012-07-17 16:58:21 -07:00
Samuel Just	58e81c82e0	OSD: handle_osd_ping: use service->get_osdmap() This way, we avoid grabbing the map_lock. Furthermore, get curmap at the beginning of the method to ensure that we send the message using the same map used to check is_up. This should also fix #2798, which was caused by an osd being marked up between service.get_osdmap() and OSD::osdmap. Signed-off-by: Samuel Just <sam.just@inktank.com>	2012-07-17 16:58:21 -07:00
Samuel Just	32892c1edd	doc/dev/osd_internals: add newlines before numbered lists Signed-off-by: Samuel Just <sam.just@inktank.com>	2012-07-17 16:51:57 -07:00
Sage Weil	fe4c658bd3	librados: simplify locking slightly No reason to hold mylock_all here. Signed-off-by: Sage Weil <sage@inktank.com>	2012-07-17 16:02:18 -07:00
Sage Weil	199397dc96	osd: default 'osd_preserve_trimmed_log = false' This option makes the osd skip zeroing old trimmed regions of the log. The data is never read, since the xattrs indicate which part of the log is valid. We've never actually used this to debug a problem, and it consumes space, so let's disable it. Signed-off-by: Sage Weil <sage@inktank.com>	2012-07-17 12:40:33 -07:00
Samuel Just	24df8b1d82	doc/dev: add osd_internals to toc Signed-off-by: Samuel Just <sam.just@inktank.com>	2012-07-17 09:54:47 -07:00
Samuel Just	5a27f07160	doc/internals/osd_internals: fix indentation errors Signed-off-by: Samuel Just <sam.just@inktank.com>	2012-07-17 09:31:22 -07:00
Sage Weil	6490c84ff9	doc: discuss choice of pg_num Signed-off-by: Sage Weil <sage@inktank.com>	2012-07-17 08:36:54 -07:00
Sage Weil	36d0a3555f	log: simplify log logic a bit Whether an entry is eligible to log/dump is independent of the channel it is sent to. Some channels impose additional restrictions. Signed-off-by: Sage Weil <sage@inktank.com>	2012-07-17 08:36:54 -07:00
Josh Durgin	abe05a3fbb	Merge branch 'next'	2012-07-16 17:36:06 -07:00
Pascal de Bruijn \| Unilogic Networks B.V	96587f39e3	Robustify ceph-rbdnamer and adapt udev rules Below is a patch which makes the ceph-rbdnamer script more robust and fixes a problem with the rbd udev rules. On our setup we encountered a symlink which was linked to the wrong rbd: /dev/rbd/mypool/myrbd -> /dev/rbd1 While that link should have gone to /dev/rbd3 (on which a partition /dev/rbd3p1 was present). Now the old udev rule passes %n to the ceph-rbdnamer script, the problem with %n is that %n results in a value of 3 (for rbd3), but in a value of 1 (for rbd3p1), so it seems it can't be depended upon for rbdnaming. In the patch below the ceph-rbdnamer script is made more robust and it now it can be called in various ways: /usr/bin/ceph-rbdnamer /dev/rbd3 /usr/bin/ceph-rbdnamer /dev/rbd3p1 /usr/bin/ceph-rbdnamer rbd3 /usr/bin/ceph-rbdnamer rbd3p1 /usr/bin/ceph-rbdnamer 3 Even with all these different styles of calling the modified script, it should now return the same rbdname. This change "has" to be combined with calling it from udev with %k though. With that fixed, we hit the second problem. We ended up with: /dev/rbd/mypool/myrbd -> /dev/rbd3p1 So the rbdname was symlinked to the partition on the rbd instead of the rbd itself. So what probably went wrong is udev discovering the disk and running ceph-rbdnamer which resolved it to myrbd so the following symlink was created: /dev/rbd/mypool/myrbd -> /dev/rbd3 However partitions would be discovered next and ceph-rbdnamer would be run with rbd3p1 (%k) as parameter, resulting in the name myrbd too, with the previous correct symlink being overwritten with a faulty one: /dev/rbd/mypool/myrbd -> /dev/rbd3p1 The solution to the problem is in differentiating between disks and partitions in udev and handling them slightly differently. 
So with the patch below partitions now get their own symlinks in the following style (which is fairly consistent with other udev rules): /dev/rbd/mypool/myrbd-part1 -> /dev/rbd3p1 Please let me know any feedback you have on this patch or the approach used. Regards, Pascal de Bruijn Unilogic B.V. Signed-off-by: Pascal de Bruijn <pascal@unilogicnetworks.net> Signed-off-by: Josh Durgin <josh.durgin@inktank.com>	2012-07-16 17:34:22 -07:00
caleb miles	b0465496d2	doc/radosgw/config.rst: mended small typo Signed-off-by: caleb miles <caleb.miles@inktank.com>	2012-07-16 16:30:36 -07:00
Sage Weil	f9c1a6fb0a	Merge branch 'next'	2012-07-16 16:13:55 -07:00
Sage Weil	2a8c4db72f	Merge branch 'wip-mon-mkfs' Reviewed-by: Tommi Virtanen <tv@inktank.com>	2012-07-16 16:15:33 -07:00
Sage Weil	4eec4fc57d	mkcephfs: nicer empty directory check From TV. Signed-off-by: Sage Weil <sage@inktank.com>	2012-07-16 16:14:39 -07:00
Sage Weil	4e66a3b98d	mkcephfs: error out if mon data directory is not empty The ceph-mon --mkfs function no longer wipes out the directory; it is in fact mostly a no-op that just verifies the dir exists. So, ensure that the directory is empty at mkfs time. This could alternatively do an 'rm -r' in that directory (that is in fact what ceph-mon used to do), but this is safer. Signed-off-by: Sage Weil <sage@inktank.com>	2012-07-16 16:14:39 -07:00
Sage Weil	6b1835a92c	vstart.sh: blow away mon directory on creation/start Now that ceph-mon doesn't blow away the mon data content, we need to. Signed-off-by: Sage Weil <sage@inktank.com>	2012-07-16 16:14:39 -07:00
Sage Weil	54be9d0917	mon: stop doing rm -rf on mon mkfs Simply verify that the directory exists, or if it doesn't, create it. Do nothing about its content. Signed-off-by: Sage Weil <sage@inktank.com>	2012-07-16 16:14:39 -07:00
Sage Weil	52f96b9fd1	log: apply log_level to stderr/syslog logic In non-crash situations, we want to make sure the message is both below the syslog/stderr threshold and also below the normal log threshold. Otherwise we get anything we gather on those channels, even when the log level is low. Signed-off-by: Sage Weil <sage@inktank.com>	2012-07-16 16:02:14 -07:00
Sage Weil	de524abdb1	log: dump logging levels in crash dump So you know what you are/are not seeing. Signed-off-by: Sage Weil <sage@inktank.com>	2012-07-16 15:53:59 -07:00
Sage Weil	d3c76f754f	Merge branch 'next'	2012-07-16 15:53:54 -07:00
Samuel Just	3821f6c4bf	PG: grab reference to pg in C_OSD_AppliedRecoveredObject Otherwise, accessing the pg via _applied_recovered_object isn't safe. Using intrusive_ptr clarifies the reference ownership. Signed-off-by: Samuel Just <sam.just@inktank.com>	2012-07-16 15:43:52 -07:00
Sage Weil	64f745008b	log: fix event gather condition We should gather an event if it is below the log or gather threshold. Previously we were only gathering if we were going to print it, which makes the dump no more useful than what was already logged. Signed-off-by: Sage Weil <sage@inktank.com>	2012-07-16 15:36:44 -07:00
Samuel Just	d4410e4ad5	PG::RecoveryState::Stray::react(LogEvt&): set dirty_info/log We adjust the info and the log, so we must set dirty_info and dirty_log to force writes. Signed-off-by: Samuel Just <sam.just@inktank.com>	2012-07-16 14:18:22 -07:00
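
The gather condition described in commit 64f745008b ("log: fix event gather condition") can be sketched as below. This is a minimal illustration, not the Ceph logging code: the `Thresholds` struct and the functions `should_log` and `should_gather` are hypothetical names, and treating "below the threshold" as `<=` is an assumption.

```cpp
// Hypothetical pair of thresholds for one logging subsystem.
struct Thresholds {
  int log_level;     // entries at or below this level are written to the log
  int gather_level;  // entries at or below this level are kept for a crash dump
};

// An entry is printed only if it is at or below the log threshold.
bool should_log(int level, const Thresholds& t) {
  return level <= t.log_level;
}

// An entry is gathered if it is at or below EITHER threshold.
// Gathering only what will be printed would make the crash dump
// no more useful than the log itself, which is the bug the commit describes.
bool should_gather(int level, const Thresholds& t) {
  return level <= t.log_level || level <= t.gather_level;
}
```

With `Thresholds{1, 5}`, a level-5 entry is gathered but not printed, so it would appear only in a dump; a level-10 entry is neither gathered nor printed.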


20541 Commits
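
The ordering issue behind commit adc9b91f37 ("os/HashIndex: use set<pair<string, hobject_t>> rather than multimap") can be illustrated with a small sketch. `FakeObject` is a hypothetical stand-in for `hobject_t` (only its ordering matters here), and the helper function names are assumptions made for the example. Since C++11, `std::multimap` keeps values with equal keys in insertion order, so they are not sorted by value; `std::set<std::pair<...>>` orders by key first and then by value.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for hobject_t: only a strict ordering is needed.
struct FakeObject {
  int id;
  bool operator<(const FakeObject& o) const { return id < o.id; }
};

using Entry = std::pair<std::string, FakeObject>;

// Iteration order of a set of (key, object) pairs:
// sorted by key, then by object within each key.
std::vector<int> ids_in_set_order(const std::vector<Entry>& inserts) {
  std::set<Entry> s(inserts.begin(), inserts.end());
  std::vector<int> out;
  for (const auto& e : s) out.push_back(e.second.id);
  return out;
}

// Iteration order of a multimap: sorted by key only.
// Values sharing a key come back in insertion order, not object order.
std::vector<int> ids_in_multimap_order(const std::vector<Entry>& inserts) {
  std::multimap<std::string, FakeObject> m;
  for (const auto& e : inserts) m.insert(e);
  std::vector<int> out;
  for (const auto& e : m) out.push_back(e.second.id);
  return out;
}
```

Inserting objects 3, 1, 2 under one hash key and object 0 under a later key, the set yields 1, 2, 3, 0, while the multimap yields 3, 1, 2, 0. Code that assumes iteration follows object order within a key is only correct with the set.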