RepoMirrors/ceph

mirror of https://github.com/ceph/ceph synced 2025-01-20 18:21:57 +00:00

Author	SHA1	Message	Date
Sage Weil	e42a0e9f59	crush: move (de)compile into CrushCompiler class Signed-off-by: Sage Weil <sage.weil@dreamhost.com>	2012-02-19 14:44:06 -08:00
Sage Weil	4dd8c3542a	crush: uninline encode/decode Signed-off-by: Sage Weil <sage.weil@dreamhost.com>	2012-02-19 12:08:11 -08:00
Sage Weil	6b5be27634	crush: cleanup: use temp var for curstep Signed-off-by: Sage Weil <sage.weil@dreamhost.com>	2012-02-19 11:59:11 -08:00
Sage Weil	ff5178c86a	mds: use want_state to indicate shutdown State gets DNE when we receive the first map. And want_ makes more sense anyway. Fixes MDS startup. Signed-off-by: Sage Weil <sage.weil@dreamhost.com>	2012-02-19 07:41:47 -08:00
Sage Weil	344c202203	osd: fix up argument to PG::init() Commit `cefa55b288` moved PG initialization into init(), but passed acting for both up and acting args. This led to confusion between primary and replica. Also fix debug print so that the output is useful. Fixes: #2075, #2070 Signed-off-by: Sage Weil <sage.weil@dreamhost.com>	2012-02-18 22:17:35 -08:00
Sage Weil	2500a9b691	SimpleMessenger: drop unused sigint() Signed-off-by: Sage Weil <sage.weil@dreamhost.com>	2012-02-18 22:12:26 -08:00
Sage Weil	1f5e446d8a	msgr: promote SimpleMessenger::Policy to Messenger::Policy This is part of the generic interface, not specific to the implementation. Signed-off-by: Sage Weil <sage.weil@dreamhost.com>	2012-02-18 22:12:26 -08:00
Sage Weil	10016923c9	mds: ignore all msgr callbacks on shutdown, not just dispatch Signed-off-by: Sage Weil <sage.weil@dreamhost.com>	2012-02-18 22:12:26 -08:00
Sage Weil	1f240ca4ff	mon: discard messages while shutting down Add SHUTDOWN state. Ignore any msgr callbacks if set. Fixes crash like 2012-02-18T21:57:58.912 INFO:teuthology.task.ceph:Shutting down mon daemons... 2012-02-18T21:57:58.912 DEBUG:teuthology.task.ceph.mon.a:waiting for process to exit 2012-02-18T21:57:58.913 INFO:teuthology.task.ceph.mon.a.err:2012-02-18 21:57:58.927759 7fe98dfa1700 mon.a@1(peon) e1 * Got Signal Terminated * 2012-02-18T21:57:59.014 INFO:teuthology.task.ceph.mon.a.err:* Caught signal (Segmentation fault) 2012-02-18T21:57:59.014 INFO:teuthology.task.ceph.mon.a.err: in thread 7fe98d7a0700 2012-02-18T21:57:59.014 INFO:teuthology.task.ceph.mon.a.err: ceph version 0.41-382-gc1db900 (commit:c1db9009c2cde9dc7ab8857b0d28a1b6d931e98a) 2012-02-18T21:57:59.015 INFO:teuthology.task.ceph.mon.a.err: 1: /tmp/cephtest/binary/usr/local/bin/ceph-mon() [0x5b0871] 2012-02-18T21:57:59.015 INFO:teuthology.task.ceph.mon.a.err: 2: (()+0xfb40) [0x7fe991a1eb40] 2012-02-18T21:57:59.015 INFO:teuthology.task.ceph.mon.a.err: 3: (PerfCounters::set(int, unsigned long)+0x1a) [0x52008a] 2012-02-18T21:57:59.015 INFO:teuthology.task.ceph.mon.a.err: 4: (PGMonitor::update_logger()+0x96) [0x4d4bf6] 2012-02-18T21:57:59.015 INFO:teuthology.task.ceph.mon.a.err: 5: (PGMonitor::update_from_paxos()+0xa70) [0x4e0980] 2012-02-18T21:57:59.016 INFO:teuthology.task.ceph.mon.a.err: 6: (Monitor::_ms_dispatch(Message)+0x143b) [0x47bd6b] 2012-02-18T21:57:59.016 INFO:teuthology.task.ceph.mon.a.err: 7: (Monitor::ms_dispatch(Message)+0x90) [0x489210] 2012-02-18T21:57:59.016 INFO:teuthology.task.ceph.mon.a.err: 8: (SimpleMessenger::dispatch_entry()+0x89a) [0x53959a] 2012-02-18T21:57:59.016 INFO:teuthology.task.ceph.mon.a.err: 9: (SimpleMessenger::DispatchThread::entry()+0x1c) [0x46358c] 2012-02-18T21:57:59.016 INFO:teuthology.task.ceph.mon.a.err: 10: (()+0x7971) [0x7fe991a16971] 2012-02-18T21:57:59.017 INFO:teuthology.task.ceph.mon.a.err: 11: (clone()+0x6d) [0x7fe9902a592d] which is analogous to #2014. Signed-off-by: Sage Weil <sage.weil@dreamhost.com>	2012-02-18 21:41:13 -08:00
Sage Weil	787dd17097	msgr: fix shutdown vs accept race This is a kludge. The real fix is to rewrite SimpleMessenger as a state machine. Fixes: #2073 Signed-off-by: Sage Weil <sage.weil@dreamhost.com>	2012-02-18 14:28:44 -08:00
Sage Weil	c3a509a0f6	mds: drop all messages during suicide Signed-off-by: Sage Weil <sage.weil@dreamhost.com>	2012-02-18 14:28:42 -08:00
Sage Weil	fe0859aad5	Merge remote branch 'gh/wip-pg-states'	2012-02-18 14:00:50 -08:00
Sage Weil	6e89d9ca06	osd: update_stats() in GetInfo state start This is the first stage of peering. Signed-off-by: Sage Weil <sage.weil@dreamhost.com>	2012-02-17 16:24:10 -08:00
Sage Weil	fb31f63170	osd: don't update_stats() on prec_replica_info Nothing changes here... Signed-off-by: Sage Weil <sage.weil@dreamhost.com>	2012-02-17 16:24:10 -08:00
Sage Weil	9e309c493e	filestore: hold journal_lock during Hold journal_lock during replay so that we don't stomp on variables like op_seq and open_ops that the commit thread cares about. Signed-off-by: Sage Weil <sage.weil@dreamhost.com>	2012-02-17 16:24:10 -08:00
Sage Weil	06a2202b96	osd: only complete/deregister repop once It's now possible to send the ack and deregister the repop before the op_applied() happens. And when that happens, we'll call eval_repop() once more. Don't do anything in that case. Signed-off-by: Sage Weil <sage.weil@dreamhost.com>	2012-02-17 16:24:10 -08:00
Josh Durgin	c1db9009c2	Merge branch 'next'	2012-02-17 14:31:44 -08:00
Josh Durgin	4925e9c6d9	man: regenerate man pages Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>	2012-02-17 14:27:13 -08:00
Josh Durgin	304389ca0e	man: move man page fixes to rst `83cf1b62fd` and `e5f49104ab` updated the nroff output but not the rst source. Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>	2012-02-17 14:27:06 -08:00
Florian Haas	a446f32394	doc: fix snapshot creation/deletion syntax in rbd man page (trivial) Creating a snapshot requires using "rbd snap create", as opposed to just "rbd create". Also for purposes of clarification, add note that removing a snapshot similarly requires "rbd snap rm". Thanks to Josh Durgin for the explanation on IRC. Signed-off-by: Florian Haas <florian@hastexo.com> Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>	2012-02-17 14:27:00 -08:00
Sage Weil	7837c19b56	osd: make op_commit imply op_applied for purposes of repop completion For repop completion, we want waitfor_ack and _commit to be empty. For replicas, a commit reply implies ack, so ack is always a subset of commit. But for the local write, we wait for applied separately, so we can have repops open where we sent the reply to the client but still have it open and consuming memory. And generating 'old request' warnings in the logs (when the filestore is taking a long time to apply to the fs). Signed-off-by: Sage Weil <sage.weil@dreamhost.com>	2012-02-17 13:48:02 -08:00
Sage Weil	d6c767456c	osd: add REMAPPED state Set this bit whenever up != acting. This tells you that the OSDMap is explicitly remapping the PG to different nodes (than what CRUSH specified). Signed-off-by: Sage Weil <sage.weil@dreamhost.com>	2012-02-17 13:46:11 -08:00
Sage Weil	8e6f9ca8ac	osd: refactor recovery completion - rename is_all_update() -> needs_recovery(), reverse logic. - drop up != acting check; that has nothing to do with recovery itself - drop trigger in Active::react(const ActMap&)... it's nonsensical - CompleteRecovery always leads to finish_recovery (or acting set change) Signed-off-by: Sage Weil <sage.weil@dreamhost.com>	2012-02-17 13:19:57 -08:00
Sage Weil	8c0e184c50	osd: introduce RECOVERING pg state Since clean now means not degraded, we need some other indication that recovery has completed and we are "done" (given the current up/down state of the OSDs). Adding a 'recovering' state also makes it clearer to users that work is being done, as opposed to the current situation, where they look for the absence of 'clean'. Signed-off-by: Sage Weil <sage@newdream.net>	2012-02-17 10:56:58 -08:00
Sage Weil	db41bdda7e	paxos: fix is_consistent() check If our last_committed == 1, we don't need a separate stash. This is the logic that slurp() follows, so fix is_consistent() to match. Fixes: #2077 Signed-off-by: Sage Weil <sage@newdream.net>	2012-02-17 10:23:12 -08:00
Tom Callaway	d913e5e670	osd: change nested iterator name Don't shadow the iterator variable. Signed-off-by: Tom Callaway <spot@redhat.com> Signed-off-by: David Nalley <david@gnsa.us>	2012-02-17 09:17:27 -08:00
Tom Callaway	2325da8635	add missing #includes to build on gcc 4.7 Signed-off-by: Tom Callaway <spot@redhat.com> Signed-off-by: David Nalley <david@gnsa.us>	2012-02-17 09:17:22 -08:00
Tom Callaway	d938246c50	mds: comment out unused code in mds dump_pop_map Signed-off-by: Tom Callaway <spot@redhat.com> Signed-off-by: David Nalley <david@gnsa.us>	2012-02-17 09:17:05 -08:00
Sage Weil	07504607e3	Merge branch 'next'	2012-02-16 21:00:49 -08:00
Sage Weil	95633b9b88	osd: fix _activate_committed replica->primary message Normally we take a fresh map reference in PG::lock(). However, _activate_committed needs to make sure the map hasn't changed significantly before acting. In the case of #2068, the OSD map has moved forward and the mapping has changed, but the PG hasn't processed that yet, and thus mis-tags the MOSDPGInfo message. Tag the message with the e epoch, and also pass down the primary's address to send the message to the right location. Fixes: #2068 Signed-off-by: Sage Weil <sage@newdream.net>	2012-02-16 21:00:35 -08:00
Sage Weil	41425f6be9	osd: skip threadpool pause on shutdown when blackholed We can't pause the threadpools if they're blocked on a blackholed filestore. Instead, just call _exit(). Signed-off-by: Sage Weil <sage@newdream.net>	2012-02-16 15:18:58 -08:00
Sage Weil	4b3bb5ab37	osd: fix _activate_committed replica->primary message Normally we take a fresh map reference in PG::lock(). However, _activate_committed needs to make sure the map hasn't changed significantly before acting. In the case of #2068, the OSD map has moved forward and the mapping has changed, but the PG hasn't processed that yet, and thus mis-tags the MOSDPGInfo message. Tag the message with the e epoch, and also pass down the primary's address to send the message to the right location. Fixes: #2068 Signed-off-by: Sage Weil <sage@newdream.net>	2012-02-16 09:12:45 -08:00
Sage Weil	82eceb9a3b	osd: fix do not always clear DEGRADED/set CLEAN on recovery finish Clean means we have exactly the right number of replicas and recovery is complete. Degraded means we do not have enough replicas, either because recovery is in progress, or because acting is too small. A consequence is that if we have a PG with len(up) == 1 but a pg_temp mapping so that len(acting) == 2, it will be active and not clean. Fixes: #2060 Signed-off-by: Sage Weil <sage@newdream.net> Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>	2012-02-15 15:20:35 -08:00
Wido den Hollander	45701f5b68	init: Only check if auto start is disabled when the issued command is "start" This still makes sure daemons don't start on boot. When auto start was disabled it would also prevent logrotate from doing its job. Signed-off-by: Wido den Hollander <wido@widodh.nl> Signed-off-by: Sage Weil <sage@newdream.net>	2012-02-15 09:29:15 -08:00
Holger Macht	543e8b98d0	ceph.spec.in: Move libcls_.so from -devel to base package OSDs (src/osd/ClassHandler.cc) specifically look for libcls_.so in /usr/$libdir/rados-classes, so libcls_rbd.so and libcls_rgw.so need to be shipped along with the base package. Signed-off-by: Holger Macht <hmacht@suse.de> Signed-off-by: Sage Weil <sage@newdream.net>	2012-02-15 09:28:41 -08:00
Sage Weil	1a994bed63	objclass: add debug_objclass knob, default to off Signed-off-by: Sage Weil <sage@newdream.net>	2012-02-15 09:04:22 -08:00
Sage Weil	ba0ef62f86	osd: reduce watch/notify debug noise Signed-off-by: Sage Weil <sage@newdream.net>	2012-02-15 09:03:28 -08:00
Sage Weil	ebbfdefa12	msgr: mark_all_down on shutdown This ensures we destroy all the Pipes and discard their messages. Among other things, this can avoid 2012-02-15 03:16:46.385242 7fe712b9a700 mon.f@5(peon) e1 * Got Signal Terminated * 2012-02-15 03:16:46.470227 7fe712b9a700 mon.f@5(peon) e1 shutdown msg/SimpleMessenger.h: In function 'virtual SimpleMessenger::Pipe::~Pipe()' thread 7fe716a37780 time 2012-02-15 03:16:46.471005 msg/SimpleMessenger.h: 234: FAILED assert(!i->second->is_on_list()) ceph version 0.41-362-g40802ae (commit:40802ae883a94d205a8716065b80ad5d7ff57d12) 1: (SimpleMessenger::Pipe::~Pipe()+0x199) [0x4669d9] 2: (SimpleMessenger::~SimpleMessenger()+0x31) [0x552231] 3: (main()+0x3026) [0x4614a6] 4: (__libc_start_main()+0xfe) [0x7fe714dd6d8e] 5: /tmp/cephtest/binary/usr/local/bin/ceph-mon() [0x45e219] ceph version 0.41-362-g40802ae (commit:40802ae883a94d205a8716065b80ad5d7ff57d12) 1: (SimpleMessenger::Pipe::~Pipe()+0x199) [0x4669d9] 2: (SimpleMessenger::~SimpleMessenger()+0x31) [0x552231] 3: (main()+0x3026) [0x4614a6] 4: (__libc_start_main()+0xfe) [0x7fe714dd6d8e] 5: /tmp/cephtest/binary/usr/local/bin/ceph-mon() [0x45e219] Signed-off-by: Sage Weil <sage@newdream.net>	2012-02-15 08:21:10 -08:00
Sage Weil	c1b6b218d2	osd: do not sync_and_flush if blackholed If we have blackholed this will block forever. In that case don't bother. Signed-off-by: Sage Weil <sage@newdream.net>	2012-02-15 08:21:02 -08:00
Sage Weil	e6ffe31bdf	workqueue: make pause/unpause count We can pause() multiple times, and we need as many unpause()s to actually resume work. This resolves problems where we have two actors interested in pausing a queue, both want to stop work, and they aren't interacting/coordinating. Signed-off-by: Sage Weil <sage@newdream.net>	2012-02-15 08:20:32 -08:00
Sage Weil	40802ae883	osd: exit code 0 on SIGINT/SIGTERM This makes daemon-handler happy... Signed-off-by: Sage Weil <sage.weil@dreamhost.com>	2012-02-14 22:05:36 -08:00
Sage Weil	2aafdeada8	signals: check write(2) return values Signed-off-by: Sage Weil <sage@newdream.net>	2012-02-14 21:04:05 -08:00
Sage Weil	9cd090038f	osd: semi-clean shutdown on signal Make some effort to stop work in progress, remove pid file, and exit with informative error code. Note that this is much simpler than the shutdown() exit path; I'm not sure whether a complete teardown is useful. It's also difficult to maintain and get right with everything else going on, and it's not clear that it's worth the effort right now. Signed-off-by: Sage Weil <sage.weil@dreamhost.com>	2012-02-14 21:03:54 -08:00
Sage Weil	ec066829a7	mds: remove some cruft Signed-off-by: Sage Weil <sage.weil@dreamhost.com>	2012-02-14 21:03:54 -08:00
Sage Weil	395dc659b9	mds: remove pidfile Signed-off-by: Sage Weil <sage.weil@dreamhost.com>	2012-02-14 21:03:53 -08:00
Sage Weil	bbe5cd755f	mon: do a clean shutdown on SIGINT/SIGTERM Signed-off-by: Sage Weil <sage.weil@dreamhost.com>	2012-02-14 21:03:53 -08:00
Sage Weil	eafe832791	mon: install async signal handlers for SIG{HUP,INT,TERM} Signed-off-by: Sage Weil <sage.weil@dreamhost.com>	2012-02-14 21:03:53 -08:00
Sage Weil	e905564bb2	osd: install async signal handlers for SIG{HUP,INT,TERM} Signed-off-by: Sage Weil <sage.weil@dreamhost.com>	2012-02-14 21:03:53 -08:00
Sage Weil	be704fe1d9	mds: install async signal handlers for SIG{HUP,INT,TERM} Signed-off-by: Sage Weil <sage.weil@dreamhost.com>	2012-02-14 21:03:53 -08:00
Sage Weil	afa1f9e392	signal: remove unused/obsolete handle_shutdown_signal Signed-off-by: Sage Weil <sage.weil@dreamhost.com>	2012-02-14 21:03:53 -08:00


18241 Commits