RepoMirrors/ceph

mirror of https://github.com/ceph/ceph synced 2025-01-28 05:53:37 +00:00

Author	SHA1	Message	Date
Greg Farnum	a93b970ab1	C_Gather: Set debug #ifdefs to remove set. This way when we're confident it works right, we can remove the set<Context*> and just rely on ref counting. Further optimizations would include using a spinlock rather than a mutex, or possibly even just switching sub_[created|existing]_count to be atomics. Signed-off-by: Greg Farnum <gregf@hq.newdream.net>	2011-01-14 16:12:32 -08:00
Greg Farnum	55cf6bad2f	C_Gather: Rewrite for thread safety. Previously, C_Gather wasn't thread safe at all, and there was an issue with creating subs while some subs were being finished. These issues are now fixed. Signed-off-by: Greg Farnum <gregf@hq.newdream.net>	2011-01-14 16:11:01 -08:00
Greg Farnum	29825c75e7	mds: call MonClient::shutdown when doing a journal dump. Previously we got a failed assert since nothing was calling this. Signed-off-by: Greg Farnum <gregf@hq.newdream.net>	2011-01-14 15:08:06 -08:00
Colin Patrick McCabe	1bae352ed2	os: don't crash on no-journal case JournalingObjectStore::commit_start should handle the case where journal is null. This will occur if the user doesn't configure a journal. Signed-off-by: Colin McCabe <colinm@hq.newdream.net>	2011-01-14 10:08:10 -08:00
Sage Weil	6d0dc4bf64	mds: tolerate (with warning) replayed op with bad prealloc_inos This comes up when an ESession close is followed by an EMetaBlob that uses a prealloc_ino. That isn't supposed to happen (it's probably a corner case with session timeout vs a request waiting on locks that didn't get killed/canceled?). But tolerate it during replay just the same. Works around #708. Signed-off-by: Sage Weil <sage@newdream.net>	2011-01-13 22:08:40 -08:00
Sage Weil	86337127c0	mds: improve debug output on ESession journal replay Signed-off-by: Sage Weil <sage@newdream.net>	2011-01-13 21:51:05 -08:00
Samuel Just	b60ef3a7ad	OSD,ReplicatedPG: Do not run snap_trimmer while the pg is degraded snap_trimmer causes replica crashes if the replica is missing objects. Signed-off-by: Samuel Just <samuelj@hq.newdream.net>	2011-01-13 16:20:46 -08:00
Sage Weil	e060d7a115	filejournal: rewrite completion handling, fix ordering on full->notfull Rewriting the completion handling to be simpler, clearer, so that it is easier to maintain a strict completion ordering invariant. This also fixes an ordering bug: When restarting journal, we defer initially until we get a committed_thru from the previous commit and then do all those completions. That same logic needs to also apply to new items submitted during that commit interval. This was broken before, but the simpler structure fixes it. Fixes #666. Tested-by: Jim Schutt <jaschut@sandia.gov> Signed-off-by: Sage Weil <sage@newdream.net>	2011-01-13 13:14:40 -08:00
Samuel Just	f2755a5337	PG: activate should not enqueue snap_trimmer on a replica Previously, activate would queue_snap_trim() for replicas if snap_trimq ended up non-empty, guaranteeing a crash for any replica starting up while purged_snaps lagged behind pool->cached_removed_snaps. This should fix #702. Signed-off-by: Samuel Just <samuelj@hq.newdream.net>	2011-01-13 13:16:40 -08:00
Samuel Just	1cdb01b47b	ReplicatedPG: Fix oi.size bug in _rollback_to _rollback_to calls _delete_head before cloning the clone into place. _delete_head sets the object info size to 0. _rollback_to now resets the size to match the rolled back object. Previously, this bug manifested as a failed assert in scrub when checking the object sizes. Signed-off-by: Samuel Just <samuelj@hq.newdream.net>	2011-01-12 15:13:16 -08:00
Samuel Just	9c80239b6a	ReplicatedPG: register_object_context and register_snapset_context cleanup Previously, get_object_context and get_snapset_context did not register the resulting objects. In some cases, these objects would not get registered and multiple copies would end up created. This caused a bug in find_object_context where get_snapset_context could return an object distinct from the one referenced by the object returned from get_object_context. Signed-off-by: Samuel Just <samuelj@hq.newdream.net>	2011-01-12 13:51:55 -08:00
Samuel Just	8f327d11ca	ReplicatedPG: snap_trimmer work around Currently, an OSD bug is causing snap_trimq to contain some snaps already in purged_snaps. This work around should let kvmtest come back up. A real fix is still needed. Signed-off-by: Samuel Just <samuelj@hq.newdream.net>	2011-01-12 12:07:44 -08:00
Colin Patrick McCabe	61bd155f4a	osd: OSD::queue_pg_for_deletion: avoid double del Signed-off-by: Colin McCabe <colinm@hq.newdream.net>	2011-01-11 10:29:10 -08:00
Sage Weil	6e6c22ea23	mds: avoid double-pinning stray inodes We make multiple iterations through populate_mydir(). Only pin each stray once. Fixes #689 and crashes like mds/CInode.h: In function 'virtual void CInode::bad_get(int)': mds/CInode.h:1088: FAILED assert(ref_set.count(by) == 0) ceph version 0.24 (`180a417603`) 1: (CInode::bad_put(int)+0) [0x827b090] 2: (MDSCacheObject::get(int)+0x153) [0x813e463] 3: (MDCache::populate_mydir()+0x8a) [0x81a7e5a] 4: (MDCache::_create_system_file_finish(Mutation, CDentry, Context)+0x181) [0x819f501] 5: (C_MDC_CreateSystemFile::finish(int)+0x29) [0x81d6c29] 6: (finish_contexts(std::list<Context, std::allocator<Context> >&, int)+0x6b) [0x81d663b] 7: (Journaler::_finish_flush(int, long long, utime_t, bool)+0x983) [0x82f2f53] 8: (Journaler::C_Flush::finish(int)+0x3f) [0x82fb24f] 9: (Objecter::handle_osd_op_reply(MOSDOpReply)+0x801) [0x82d8e31] 10: (MDS::_dispatch(Message)+0x2ae5) [0x80eaa15] 11: (MDS::ms_dispatch(Message)+0x62) [0x80eb142] 12: (SimpleMessenger::dispatch_entry()+0x899) [0x80b8649] 13: (SimpleMessenger::DispatchThread::entry()+0x22) [0x80b30f2] Signed-off-by: Sage Weil <sage@newdream.net>	2011-01-11 09:50:20 -08:00
Samuel Just	e189222f06	ReplicatedPG: Fix bug in rollback Previously, _rollback_to assumed that the rollback was a noop if ctx->clone_obc was set and its prior version matches head's version. However, this broke in sequences like: Write "snap1 contents" to oid "blah" create snapshot "snap1" Write "snap2 contents" to oid "blah" create snapshot "snap2" rollback oid "blah" to snapshot "snap1" In this case, make_writeable would have just cloned head to the snap2 clone, but the relevant clone is actually "snap1". _rollback_to now verifies that the most recent clone is the correct one before assuming that head is already correct. Signed-off-by: Samuel Just <samuelj@hq.newdream.net>	2011-01-10 15:41:09 -08:00
Sage Weil	630565f3ac	v0.24.1	2011-01-07 16:50:15 -08:00
Samuel Just	a64ddbb686	ReplicatedPG: get_object_context ssc refcount leak If obc->obs.ssc is non-null, the second get_snapset_context ends up leaking a snapset reference. Signed-off-by: Samuel Just <samuelj@hq.newdream.net>	2011-01-07 14:25:03 -08:00
Samuel Just	8665370030	ReplicatedPG: clone_overlap should contain one entry per clone Previously, writefull and _delete_head would remove the last entry from snapset.clone_overlap. Now, the last entry becomes an empty interval_set. clone_overlap should contain one entry per clone. The missing entries previously caused a bug in _rollback_to where iter would be clone_overlap.end(). Signed-off-by: Samuel Just <samuelj@hq.newdream.net>	2011-01-06 15:59:15 -08:00
Samuel Just	fab61391b7	PG: Fixes bug in _scrub with checking clones I introduced this bug in `4a4a1e53c7`. curclone++ not curclone--. Signed-off-by: Samuel Just <samuelj@hq.newdream.net>	2011-01-04 14:38:53 -08:00
Samuel Just	4a4a1e53c7	PG: Fix bug in scrub when checking clone sizes Previously, _scrub checked: assert(p->second.size == snapset.clone_size[curclone]) curclone was, however, an index into snapset.clones rather than a snapid_t. For clarity, curclone is now an iterator. Signed-off-by: Samuel Just <samuelj@hq.newdream.net>	2011-01-04 10:27:59 -08:00
Sage Weil	6c73da0a99	mds: assert no submit_entry during replay state We should never submit items to the journal during replay. Signed-off-by: Sage Weil <sage@newdream.net>	2011-01-03 21:24:49 -08:00
Sage Weil	88c445b15f	mds: start new log segment resolve start, not replay finish Signed-off-by: Sage Weil <sage@newdream.net>	2011-01-03 21:24:49 -08:00
Sage Weil	462cb8410d	osd: clean up backlog generation checks a bit Signed-off-by: Sage Weil <sage@newdream.net>	2011-01-03 21:24:49 -08:00
Sage Weil	ff035ab31c	osd: generate backlog if needed to get last_complete >= log.tail || backlog If primary or a replica has a mistrimmed pg log, we need to generate the backlog during peering. This sucks, because the PG won't go active for a long time, but it's what happens when there's a bug in the code that mis-trims the PG log! Signed-off-by: Sage Weil <sage@newdream.net>	2011-01-03 21:24:49 -08:00
Sage Weil	78f35a6450	osd: send sufficient log to compensate for replicas with last_complete < log.tail If a replica has last_complete < log.tail and no backlog, send enough log for them to get back into a consistent state. Signed-off-by: Sage Weil <sage@newdream.net>	2011-01-03 21:24:49 -08:00
Sage Weil	b40e7dc0f7	mds: load root inode on replay if auth If we are auth for the root inode, load its initial value off of disk. We may not see it in the log if it has not been modified. If it has, this is useless but fast/harmless. This only occurs for brand-new filesystems where the mds is immediately restarted. Fixes #671. Signed-off-by: Sage Weil <sage@newdream.net>	2011-01-03 14:33:03 -08:00
Greg Farnum	20593b0d38	msgr: Unlock dispatch_queue.lock when short-circuiting queue_received. Previously we left the mutex locked, which is obviously bad bad bad! I believe this was the cause of #673. Signed-off-by: Greg Farnum <gregf@hq.newdream.net>	2011-01-03 14:15:24 -08:00
Sage Weil	4efa300601	filestore: assert on out of order journal pipeline submissions Signed-off-by: Sage Weil <sage@newdream.net>	2011-01-03 13:14:49 -08:00
Sage Weil	259c509a89	filestore: fix wake condition when journal submission blocks We only want to wake up if we are at the front of the line, in order to preserve journal submission pipeline ordering. This fixes, among other things, messages in the log like 2010-12-21 10:38:42.515974 7f0861486700 journal op_submit_finish 5364 expected 5370, OUT OF ORDER and bug #666. Signed-off-by: Sage Weil <sage@newdream.net>	2011-01-03 13:14:13 -08:00
Sage Weil	15dcc65199	mds: fix purge_stray for directories, zeroed layouts - We don't want to purge file content on directories - Don't fall over if a file has a zero period Reported-by: Paul Komkoff <i@stingr.net> Signed-off-by: Sage Weil <sage@newdream.net>	2011-01-03 11:50:53 -08:00
Colin Patrick McCabe	6cdfa30455	osd: PG::Info::History: init last_epoch_clean It seems that we have not been zeroing PG::Info::History:last_epoch_clean when the History structure is created. This led to some very interesting log output (and bugs!) Signed-off-by: Colin McCabe <colinm@hq.newdream.net>	2011-01-03 10:30:56 -08:00
Samuel Just	9ad05cf7ff	SimpleMessenger.cc: Fixes a dispatch_throttler leak in queue_received when the pipe has been halted. Signed-off-by: Samuel Just <samuelj@hq.newdream.net>	2011-01-03 10:14:52 -08:00
Sage Weil	180a417603	v0.24	2010-12-20 15:58:09 -08:00
Sage Weil	69940e2717	osd: compensate for replicas with tail > last_complete Normally we shouldn't ever have a last_complete < log.tail (&& !backlog). But maybe we do (old bugs, whatever; see #590). In that case, the primary can compensate by sending more log info to the replica. Signed-off-by: Sage Weil <sage@newdream.net>	2010-12-20 13:22:49 -08:00
Sage Weil	b04b6f4823	mds: make nested scatterlock state change check more robust The predirty_journal_parents() calls wrlock_start() with nowait=true because it has a journal entry open and we don't want to trigger a nested scatterlock change that needs to journal something again (either via scatter_writebehind or scatter_start). (MDLog can only handle a single log entry open at once because building multiple at once would require very very very careful ordering of predirty() calls and versions.) We were already checking for the simple_lock() case (which may call writebehind); fix up the check to also cover the scatter_mix() (which may call scatter_start) case. Fixes this crash: mds/MDLog.h: In function 'void MDLog::start_entry(LogEvent)': mds/MDLog.h:191: FAILED assert(cur_event == __null) ceph version 0.24~rc (commit:fe10300317383ec29948d7dbe3cb31b3aa277e3c) 1: (CInode::finish_scatter_update(ScatterLock, CDir, unsigned long, unsigned long)+0x804) [0x606e14] 2: (CInode::start_scatter(ScatterLock)+0xaa) [0x60dc1a] 3: (Locker::scatter_mix(ScatterLock, bool)+0x1ca) [0x589a9a] 4: (Locker::wrlock_start(SimpleLock, MDRequest, bool)+0x165) [0x597d65] 5: (MDCache::predirty_journal_parents(Mutation, EMetaBlob, CInode, CDir, int, int, snapid_t)+0x153e) [0x55a70e] 6: (Locker::scatter_writebehind(ScatterLock)+0x42d) [0x58553d] 7: (Locker::simple_lock(SimpleLock, bool)+0x7ab) [0x58beeb] 8: (Locker::scatter_nudge(ScatterLock, Context, bool)+0x3ad) [0x58c49d] 9: (Locker::scatter_tick()+0x28a) [0x58c98a] 10: (MDS::tick()+0x4e4) [0x4b26a4] 11: (SafeTimer::timer_thread()+0x22c) [0x6d164c] 12: (SafeTimerThread::entry()+0xd) [0x6d34bd] 13: (Thread::_entry_func(void)+0xa) [0x4943da] 14: /lib/libpthread.so.0 [0x7fc87810b73a] 15: (clone()+0x6d) [0x7fc876dad69d] Signed-off-by: Sage Weil <sage@newdream.net>	2010-12-17 21:02:58 -08:00
Sage Weil	3a235b0f21	filestore: make OpSequencer::flush() work for writeahead journaling items It was only waiting for items in the op_queue to complete. The goal is to wait for anything we've called queue_transactions(&osr,...) on. If we do writeahead journaling, though, there might be new ops that are still journaling but not yet submitted to the fs that are missed. This adds a journal queue to the OpSequencer, and uses it in the writeahead case only. Signed-off-by: Sage Weil <sage@newdream.net>	2010-12-17 15:30:39 -08:00
Colin Patrick McCabe	285f351b72	mon: build_initial_monmap: fix mismatched alloc Signed-off-by: Colin McCabe <colinm@hq.newdream.net>	2010-12-17 15:31:41 -08:00
Colin Patrick McCabe	caa4609387	common: cleanups common_init: avoid (mismatched) heap allocation ConfFile::_parse: avoid memory leak on error path ConfFile: NULL filename if not set, rather than leaving it undefined Signed-off-by: Colin McCabe <colinm@hq.newdream.net>	2010-12-17 15:26:37 -08:00
Colin Patrick McCabe	28bcf0bc98	osd: PG::choose_acting: fix major iterator mistake Signed-off-by: Colin McCabe <colinm@hq.newdream.net>	2010-12-17 15:14:53 -08:00
Colin Patrick McCabe	f7dc1a9239	rgw: fix fd leak on error path Signed-off-by: Colin McCabe <colinm@hq.newdream.net>	2010-12-17 15:14:53 -08:00
Colin Patrick McCabe	795811d66a	hadoop: fix a bunch of mismatched allocations Using array new means you need array delete. Signed-off-by: Colin McCabe <colinm@hq.newdream.net>	2010-12-17 15:14:53 -08:00
Colin Patrick McCabe	2f916086a6	auth: avoid mismatched allocation Can't pair strdup and free. Signed-off-by: Colin McCabe <colinm@hq.newdream.net>	2010-12-17 15:14:53 -08:00
Sage Weil	3c7d30f1ac	osd: flush pg writes to disk before starting scrub scan This avoids two races: - we just completed recovery by pushing objects to the replica, and the replica starts scanning before those writes reach the fs. - we just trimmed to something after last_update_applied. Signed-off-by: Sage Weil <sage@newdream.net>	2010-12-17 14:15:35 -08:00
Sage Weil	5184db4424	filestore: add per-sequencer flush operation Signed-off-by: Sage Weil <sage@newdream.net>	2010-12-17 14:15:35 -08:00
Sage Weil	2fb60daf68	osd: debug scan_list and scrub a bit better Signed-off-by: Sage Weil <sage@newdream.net>	2010-12-17 12:51:03 -08:00
Sage Weil	1cfad2ea77	osd: clear INCONSISTENT if scrub detects no errors Signed-off-by: Sage Weil <sage@newdream.net>	2010-12-17 10:59:45 -08:00
Sage Weil	b190875548	osd: add assert that we're replica ar Fred saw a crash where we got into merge_log as a stray, which really shouldn't ever happen! See #590. Signed-off-by: Sage Weil <sage@newdream.net>	2010-12-17 10:36:34 -08:00
Laszlo Boszormenyi	1e291fc9ef	debian: don't strip rados classes Signed-off-by: Laszlo Boszormenyi <gcs@debian.hu> Signed-off-by: Sage Weil <sage@newdream.net>	2010-12-17 08:31:00 -08:00
Laszlo Boszormenyi	9c173bb400	debian: rename ceph.lintian -> ceph.lintian-overrides Signed-off-by: Laszlo Boszormenyi <gcs@debian.hu> Signed-off-by: Sage Weil <sage@newdream.net>	2010-12-17 08:30:43 -08:00
Samuel Just	73669d87e6	PG.cc: sub_op_scrub must set finalizing_scrub on the replica before waiting for last_update_applied to catch up to info.last_update. Signed-off-by: Samuel Just <samuelj@hq.newdream.net>	2010-12-16 13:06:43 -08:00

12207 Commits