RepoMirrors/ceph

mirror of https://github.com/ceph/ceph synced 2025-01-19 01:21:49 +00:00

Author	SHA1	Message	Date
Sage Weil	2cd2c56dd0	v0.24.3	2011-02-10 09:49:28 -08:00
Colin Patrick McCabe	b60444b5c1	make:add messages/MOSDRepScrub.h to NOINST_HEADERS Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>	2011-02-10 09:49:28 -08:00
Sage Weil	ec9d14c1be	Merge remote branch 'origin/rep_scrub_wq' into stable	2011-02-08 16:22:01 -08:00
Sage Weil	cc525b3a3e	osd: discard scrub reply if pg changed build_scrub_map will bail out if the pg changed. Discard the result in that case since the primary will ignore it anyway. Signed-off-by: Sage Weil <sage.weil@dreamhost.com>	2011-02-08 08:41:52 -08:00
Sage Weil	a948aa1180	osd: avoid map_lock for scrub_map reply Using osd->osdmap->epoch without map_lock is dangerous. We can avoid it entirely by replying on the same connection as the request. Signed-off-by: Sage Weil <sage.weil@dreamhost.com>	2011-02-08 08:41:14 -08:00
Sage Weil	36097c3ac5	osd: never rewrite log after {advance,activate}_map pg->dirty_log is never true, so this is dead code. And nothing in either of those two methods updates the pg log. Signed-off-by: Sage Weil <sage.weil@dreamhost.com>	2011-02-08 08:22:54 -08:00
Sage Weil	3055d09441	osd: always write backlog after creation dirty_log is never set to true, so we would set the log.backlog flag but not write it to disk. If we restarted the OSD, we would think we had the backlog in the log but in reality we would not. clean_up_local() could then erase almost every object in the PG. Signed-off-by: Sage Weil <sage.weil@dreamhost.com>	2011-02-08 08:22:47 -08:00
Sage Weil	19afe11cc5	osd: fix no missing inference Add missing continue in last_update==last_complete (no missing) case. Signed-off-by: Sage Weil <sage.weil@dreamhost.com>	2011-02-08 08:22:45 -08:00
Samuel Just	416292027d	PG: remove sub_op_scrub Signed-off-by: Samuel Just <samuel.just@dreamhost.com>	2011-02-07 20:56:30 -08:00
Samuel Just	212f977f11	PG: switch _request_scrub_map to send MOSDRepScrub Also switches sub_op_scrub_reply to sub_op_scrub_map to handle the OSD_OP_SCRUB_MAP response. Signed-off-by: Samuel Just <samuel.just@dreamhost.com>	2011-02-07 20:56:30 -08:00
Samuel Just	03c7b062d1	OSD: Adds handler for MOSDRepScrub handle_rep_scrub enqueues the message in rep_scrub_wq. Signed-off-by: Samuel Just <samuel.just@dreamhost.com>	2011-02-07 20:56:30 -08:00
Samuel Just	aed279e68f	PG: added replica_scrub Adds handler in PG for MOSDRepScrub messages. replica_scrub will replace sub_op_scrub. Signed-off-by: Samuel Just <samuel.just@dreamhost.com>	2011-02-07 20:56:30 -08:00
Samuel Just	cb4fcfe316	OSD: Add rep_scrub_wq Previously, replica scrubs would be handled in sub_op_scrub in the op queue. Replica scrubs will now be processed by rep_scrub_wq using the disk tp. Signed-off-by: Samuel Just <samuel.just@dreamhost.com>	2011-02-07 20:56:30 -08:00
Samuel Just	4cab2031dc	rados: Adds CEPH_OSD_OP_SCRUB_MAP sub op Previously, maps were requested with a sub_op and sent with a sub_op_reply. As maps will now be requested using a different message, replicas will transmit scrub maps requested via MOSDRepScrub messages by sending a sub_op of type CEPH_OSD_OP_SCRUB_MAP. Signed-off-by: Samuel Just <samuel.just@dreamhost.com>	2011-02-07 20:56:01 -08:00
Samuel Just	7245b6a16e	MOSDRepScrub: Adds a message for initiating a replica scrub Signed-off-by: Samuel Just <samuel.just@dreamhost.com>	2011-02-07 20:56:01 -08:00
Sage Weil	d7af21020e	mon: ignore mds boot messages with zeroed port On 0.24.2 I saw a zeroed port in the cmds log and in the mdsmap. Ignore anything from a cmds with a zeroed port to prevent the insanity from spreading. Signed-off-by: Sage Weil <sage.weil@dreamhost.com>	2011-02-06 20:49:59 -08:00
Sage Weil	5a50d339ed	client: more carefully guard local cache truncate This fixes an assert when len=0 in file_to_extents when we get some weird metadata from the MDS. Fixes: #778 Signed-off-by: Sage Weil <sage.weil@dreamhost.com>	2011-02-06 13:56:25 -08:00
Sage Weil	e49dced7d4	signal: fix redefine warnings Signed-off-by: Sage Weil <sage.weil@dreamhost.com>	2011-02-03 11:54:39 -08:00
Samuel Just	400813cc41	ReplicatedPG: snap_trimmer fix leaked lock Previous patch `7a02070b74` leaks the pg lock. Signed-off-by: Samuel Just <samuel.just@dreamhost.com>	2011-02-03 11:58:56 -08:00
Samuel Just	7a02070b74	ReplicatedPG:snap_trimmer should return if !clean or !active or !primary The PG may become !clean or !active while in the osd snap_trim_wq. Signed-off-by: Samuel Just <samuel.just@dreamhost.com>	2011-02-03 10:31:47 -08:00
Samuel Just	4587f1fe85	mount.ceph: option parsing fix Passing -o secretfile would cause a segfault since searching for = would result in a null pointer. New version checks for that case. Also, *end cannot be a ,. Signed-off-by: Samuel Just <samuel.just@dreamhost.com>	2011-02-02 14:33:54 -08:00
Samuel Just	ece4f61a8d	FileStore: fix double close curr_fd is already closed if cp == cur_seq. This second close occasionally ended up closing another thread's fd. The next open would tend to grab that fd in op_fd or current_fd which would then get closed by the other thread leaving op_fd or current_fd pointing to some random file (or a closed descriptor). Signed-off-by: Samuel Just <samuel.just@dreamhost.com>	2011-02-01 14:39:27 -08:00
Samuel Just	0f3198e8c6	OSD: update_osd_stat take heartbeat_lock Previously update_osd_stat had a race with code modifying heartbeat_from causing the iterator increment to occasionally segfault. Signed-off-by: Samuel Just <samuel.just@dreamhost.com>	2011-01-28 18:24:12 -08:00
Greg Farnum	14c669c3f6	Locker: Drop loner correctly! Our previous check for if we want to drop the loner was incorrect. Now, it's fixed. Resolves a serious bug with inode write access. Reported-by: Jim Schutt <jaschut@sandia.gov> Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>	2011-01-28 16:48:02 -08:00
Sage Weil	9e4325b298	mds: defer sending resolves until mdsmap.failed.empty() There is no point sending resolves while there are still failed nodes, since we can't complete. We also trigger an assert if we try to send to a failed node. Instead just wait until failed.empty() and then start. Signed-off-by: Sage Weil <sage.weil@dreamhost.com>	2011-01-28 12:57:32 -08:00
Sage Weil	35442744f4	osd: fix mutual exclusion for _dispatch We want only one thread dispatching messages (either new or requeued), so that we can preserve ordering. Previously we weren't doing so for all callers of do_waiters (tick() and the first in ms_dispatch()). This fixes osd_sub_op(_reply) ordering problems that trigger the now-famous repop queue assert. Signed-off-by: Sage Weil <sage@newdream.net>	2011-01-28 01:24:49 -08:00
Sage Weil	fbcf66906e	osd: preserve ordering when ops are requeued Requeue ops under osd_lock to preserve ordering wrt incoming messages. Also drain the waiter queue when ms_dispatch takes the lock before calling _dispatch(m). Fixes: #743 Signed-off-by: Sage Weil <sage@newdream.net>	2011-01-26 10:08:30 -08:00
Sage Weil	7d65f6eabe	osd: restart if the osdmap client, heartbeat, OR cluster addrs don't match If we somehow get ourselves into a situation where the OSDMap addresses do not match our actual addresses, restart and try again. This is still possible if multiple MOSDBoot messages end up in flight in the monitor, say due to a monitor disconnect/reconnect, and we race with something that marks us down in the map. Signed-off-by: Sage Weil <sage@newdream.net>	2011-01-26 10:08:30 -08:00
Sage Weil	47dc27a694	osd: avoid extraneous send_boot() calls Only send_boot() on osdmap update if we are restarting. Otherwise we can end up with too many MOSDBoot messages in flight and the monitor may apply an old one instead of a new one. For example: - cosd starts - send_boot with address set A - get an osdmap update - send_boot again with address set A - get an osdmap update. now we're up. - get osdmap update, now we're marked down, - bind to address set B - send_boot with address set B and the monitor may apply the second MOSDBoot (with address set A). This results in an online OSD using a cluster address that differs from that in the OSDMap, which causes problems with peering, among other things. Signed-off-by: Sage Weil <sage@newdream.net>	2011-01-26 10:08:29 -08:00
Samuel Just	ba998f05b7	ReplicatedPG: _rollback_to fix the just cloned condition _rollback_to in the case that head was just cloned and that clone includes snapid does not need to do anything. Previously, snapid would have to match the snap on the clone, but the condition should be that snapid is contained within the clone's snaps set. This bug was introduced in `e189222f06` Signed-off-by: Samuel Just <samuel.just@dreamhost.com>	2011-01-25 14:36:54 -08:00
Sage Weil	f7572de5cb	v0.24.2	2011-01-24 12:53:22 -08:00
Sage Weil	4a49a87db7	msgr: make connection pipe reset atomic Close a small and unlikely race. Signed-off-by: Sage Weil <sage@newdream.net>	2011-01-24 10:59:42 -08:00
Sage Weil	3a30eb75c4	msgr: include con in debug output Signed-off-by: Sage Weil <sage@newdream.net>	2011-01-24 10:59:42 -08:00
Sage Weil	943fd14f79	filestore: don't wait min sync interval on explicit sync() Also, if we do wait longer, wait on the same cond. Signed-off-by: Sage Weil <sage@newdream.net>	2011-01-24 10:59:42 -08:00
Samuel Just	785bf0fcbf	ReplicatedPG: fix snap_trimmer log version bug Previously, ctx->at_version would be the same as ctx->obs->oi.version leading to the log entry having prior_version == version. This bug was introduced in `d1b85e06fb`. Signed-off-by: Samuel Just <samuel.just@dreamhost.com>	2011-01-21 14:30:43 -08:00
Greg Farnum	3e4a82e559	FileJournal: don't overflow the journal size. Previously we were casting it to a uint64_t, but the left shift occurs before the cast, so we were overflowing in some circumstances. Split these up to prevent it. Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>	2011-01-21 14:20:16 -08:00
Colin Patrick McCabe	444e930ab3	mds: respawn must unblock signals before exec Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>	2011-01-21 06:53:24 -08:00
Colin Patrick McCabe	59e8e1652a	common: move signal blocking into signal.cc Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>	2011-01-21 06:53:04 -08:00
Colin Patrick McCabe	ba000d9c27	common: add signal_mask_to_str Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>	2011-01-21 06:47:05 -08:00
Sage Weil	aaed6eb3d0	msgr: always start reaper If we didn't explicitly bind (i.e. are a client), then we don't start the accepter. That's fine. But the reaper thread start was also conditional, when it shouldn't be; otherwise the client can't clean up old Pipes (and their sockets). Fixes: #732 Signed-off-by: Sage Weil <sage@newdream.net>	2011-01-21 10:08:26 -08:00
Sage Weil	027335afe3	monclient: fix locking Hold lock in handle_* methods; assert lock held in all _* methods. Fixes: #731 Signed-off-by: Sage Weil <sage@newdream.net>	2011-01-21 09:35:31 -08:00
Colin Patrick McCabe	ad8951aeeb	signals: signal.cc: trim includes Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>	2011-01-20 03:44:57 -08:00
Colin Patrick McCabe	189cf33f50	common: re-install sighandlers after daemon() Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>	2011-01-20 03:44:57 -08:00
Colin Patrick McCabe	6041302efe	common: move signal handler stuff into signal.cc Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>	2011-01-20 03:44:42 -08:00
Samuel Just	48ebab6d1c	ReplicatedPG.cc: fix snap_trimmer object context error Previously, snap_trimmer would get the clone object information from the object store rather than using find_object_context. This would cause the cached version to not be updated with the new version in the case that the object information got updated. As a result, the need field of the missing object could get a stale version inconsistent with the most recent logged version. Signed-off-by: Samuel Just <samuel.just@dreamhost.com>	2011-01-19 17:44:55 -08:00
Samuel Just	d1b85e06fb	ReplicatedPG.cc: update coi version and prior_version to match log Caused error where oi on clone would not get updated version when snaps was updated. oi.version would lag behind the missing item's need field during recovery. Signed-off-by: Samuel Just <samuel.just@dreamhost.com>	2011-01-19 17:44:55 -08:00
Samuel Just	e6b9731d00	ReplicatedPG.cc: fix use of potentially invalid pointer rollback_to may not be initialized if ret != 0. Signed-off-by: Samuel Just <samuel.just@dreamhost.com>	2011-01-19 17:44:55 -08:00
Samuel Just	4e3a4e2853	ReplicatedPG,PG,OSD: snap_trimmer should run only when the PG is clean Signed-off-by: Samuel Just <samuel.just@dreamhost.com>	2011-01-19 17:44:34 -08:00
Colin Patrick McCabe	35ef7bc98e	signals: handle_fatal_signal: use SA_NODEFER SA_RESETHAND | SA_NODEFER allows the "re-trigger default signal handler" trick to work for signals other than SIGSEGV. Signed-off-by: Colin McCabe <colinm@hq.newdream.net>	2011-01-19 05:14:20 -08:00
Colin Patrick McCabe	3326b753e5	signals: backtrace some more exotic fatal signals We're not likely to see these, but if we do, we want it in the logs! Signed-off-by: Colin McCabe <colinm@hq.newdream.net>	2011-01-19 05:14:14 -08:00
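
Commit 3e4a82e559 in the log above describes an overflow caused by a left shift that runs before the cast to uint64_t. The following is a minimal sketch of that class of bug, not the actual FileJournal code: the function names, the 32-bit input type, and the shift by 20 (megabytes to bytes) are all assumptions chosen for illustration.

```cpp
#include <cstdint>

// Hypothetical helper (not from the Ceph tree): the shift is evaluated in
// 32-bit arithmetic, so the high bits are lost before the cast widens it.
uint64_t shift_then_cast(uint32_t size_mb) {
    return static_cast<uint64_t>(size_mb << 20);
}

// Hypothetical helper: widen first, then shift, so the shift is evaluated
// in 64-bit arithmetic. This mirrors the "split these up" fix in spirit.
uint64_t cast_then_shift(uint32_t size_mb) {
    uint64_t bytes = size_mb;  // widen to 64 bits first
    bytes <<= 20;              // now the shift cannot drop the high bits
    return bytes;
}
```

With an assumed size of 8192 MB, the first version wraps around in 32 bits while the second yields the full byte count. An unsigned type is used here so the wraparound is well defined; with a signed int the same overflow would be undefined behavior.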

12272 Commits