Commit Graph

11497 Commits

Author SHA1 Message Date
Sage Weil
f0c89bab41 mds: always mark parent scatterlock when marking dirty rstat
Note that this will let the parent nestlock 'dirty' state get out of
sync with the lock state, as the whole point of the dirty rstat lists is
that it can happen any time.  It does, however, queue us up.
2010-09-24 11:44:22 -07:00
Sage Weil
416470da27 mds: mark dirty rstat inodes during recovery 2010-09-24 11:44:22 -07:00
Sage Weil
c503d362b8 mds: error to log when inode/dirfrag rbytes get out of sync
Signed-off-by: Sage Weil <sage@newdream.net>
2010-09-24 11:44:22 -07:00
Sage Weil
143438aa5c mds: stubs for printing projected fragstat/rstat
Disabled for now, since it is so freaking verbose.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-09-24 11:44:22 -07:00
Sage Weil
28a4c34095 mds: assimilate dirty rstat inodes during scatter_writeback
We put some of the predirty_journal_parents() code that calls the
project_rstat_inode_to_frag() into a common helper and use that.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-09-24 11:44:22 -07:00
Sage Weil
c0d7e8f3c8 mds: maintain dirty_rstat list
Add on fetch or import of dirty_rstat; clear on export of dirty_rstat.
2010-09-24 11:44:22 -07:00
Sage Weil
29b1e84843 mds: add dirty_rstat CInode elist, state, pins
We need to track inodes with unpropagated rstat data on a per-dirfrag
basis so that we can propagate it when the nestlock becomes writeable.
2010-09-24 11:44:22 -07:00
Yehuda Sadeh
810ff49974 osd: remove assertion 2010-09-24 10:46:38 -07:00
Yehuda Sadeh
628e28e2b2 qa: improved rgw tests 2010-09-24 10:16:14 -07:00
Sage Weil
f7f32b24b1 makefile: drop quotes on tcmalloc CXXFLAGS 2010-09-23 21:20:31 -07:00
Sage Weil
043c9c8bed mds: scatter pin frozen tree on importer too
The importer also needs to scatter pin.  This avoids scatterlock gather
races like so:

A: start exporting to B
A: freeze, scatter pin tree
C: initiate gather
A: delay reply to gather
B: reply to gather, do not include (non-auth) dirfrag
A,B: finish migration
A: reply to gather, do not include (now non-auth) dirfrag
C: gets no info about the dirfrag!

By pinning on the importer, we ensure that at least one MDS will respond
to the gather with auth dirfrag info.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-09-23 16:44:47 -07:00
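
Illustration for 043c9c8bed (not part of the commit): the race in the message above can be modelled as a toy timeline. This is a minimal sketch under two assumptions taken from the commit text: a reply carries dirfrag info only if the replier is auth at the moment it replies, and a node holding a scatter pin delays its reply until migration has finished. The struct and function names are hypothetical and are not Ceph APIs.

```cpp
#include <cassert>

// Hypothetical model of the gather race; names are illustrative, not Ceph APIs.
// A reply carries dirfrag info only if the replier is auth for the dirfrag
// at the moment it replies.
struct Reply {
  bool has_auth_dirfrag_info;
};

// A exports the dirfrag to B. Before migration finishes A is auth; afterwards
// B is. A node holding a scatter pin delays its reply until migration is done.
constexpr Reply reply_from_exporter() {
  // A is always pinned, so it replies after migration, when it is non-auth.
  return Reply{false};
}

constexpr Reply reply_from_importer(bool importer_pins) {
  // Pinned: B replies after migration, when it has become auth.
  // Unpinned: B replies immediately, while it is still non-auth.
  return Reply{importer_pins};
}

// True if the node that initiated the gather (C) learns about the dirfrag.
constexpr bool gather_sees_dirfrag(bool importer_pins) {
  return reply_from_exporter().has_auth_dirfrag_info ||
         reply_from_importer(importer_pins).has_auth_dirfrag_info;
}
```

In this model gather_sees_dirfrag(false) reproduces the lost-dirfrag case from the message, and gather_sees_dirfrag(true) is the behavior after the fix.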
Sage Weil
c82bc1ccf6 mds: drop dead Renamer code 2010-09-23 16:44:47 -07:00
Sage Weil
2fbd843f3e mds: clarify inode dirstat/rstat locking
The accounted_rstat must always remain consistent with the parent dirfrag,
which in turn means it is governed by the parent's nestlock.

The rstat is protected by _this_ inode's nestlock, and is updated by
scatter_writebehind() or predirty_journal_parents().

Signed-off-by: Sage Weil <sage@newdream.net>
2010-09-23 16:44:47 -07:00
Sage Weil
b108b6a713 mds: fix bounding frag rstat/fragstat update during import
Be careful about when we update bounding dirfrag info during an import.  If
the lock is in a MIX state, we do NOT want to update, since the inode
auth doesn't know jack (unless they are also dirfrag auth, in which case
we'll find out when we unscatter anyway).

Fixes fix 9d81f9d6.
2010-09-23 16:44:47 -07:00
Sage Weil
1c09263467 mds: do not scatter_writebehind on nudge if replicated
This can cause the inode rstat etc to become out of sync with dirfrag
accounted_rstat when the scatterlock is not in a gathered state: the
local values will get updated but those on other nodes will not, and the
inode will drift out of sync with the dirfrags.

Other callers to scatter_writebehind() are all in contexts where we have
_just_ gathered dirfrag state, or there is no remote dirfrag state to
gather.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-09-23 16:44:47 -07:00
Sage Weil
d715338110 mds: use scatter pins for migration instead of rd/wrlocks
This is simpler (for the migrator), and wrlocks allow scatter_writebehind,
which is a no-no for a frozen tree.  By pinning the frozen dir's parent
inode, we prevent any scatter or unscatter operations from implicitly
updating metadata within the frozen root dirfrag.
2010-09-23 16:44:47 -07:00
Sage Weil
961e186d47 mds: add scatterpins 2010-09-23 16:44:47 -07:00
Greg Farnum
690607cb1f backtrace: include ceph version 2010-09-23 10:57:22 -07:00
Sage Weil
113a9bcd95 mds: always pass pick_inode_snap the head
This fixes a possible infinite loop in handle_client_caps().  We need to
_always_ pass the head inode in.
2010-09-23 07:46:37 -07:00
Yehuda Sadeh
1eaec17943 qa: add simple rgw test 2010-09-22 22:40:28 -07:00
Greg Farnum
56ae11645e mds: remove unused CompatSet mds_features.
All the MDS features are stored in the MDSMap::mdsmap_compat
2010-09-22 16:36:05 -07:00
Greg Farnum
0277823597 mds: add policylock to the inodes.
This will be used to cover per-directory default file distribution
policies, and maybe other things that come up.
2010-09-22 14:48:39 -07:00
Sage Weil
2e5fa67c6e mds: fix eval_gather() for non-auth inodes
For non-auth nodes, we want a can_* policy that's < AUTH, not <= AUTH.
Adjust macro accordingly.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-09-22 14:02:08 -07:00
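
Illustration for 2e5fa67c6e (not part of the commit): the difference between "< AUTH" and "<= AUTH" can be sketched with a small predicate. The enum values below are assumptions chosen for the example (an ordered scale where ANY sorts below AUTH), and allowed_locally() is a hypothetical helper, not the macro the commit changes.

```cpp
#include <cassert>

// Assumed policy values, for illustration only: who a can_* rule applies to.
enum Policy { NONE = 0, ANY = 1, AUTH = 2, XCL = 3 };

// True when the can_* policy value p covers this node.
// The auth node is covered by ANY and AUTH (p <= AUTH).
// A non-auth node is covered only by ANY (p < AUTH); using <= here would
// wrongly treat an AUTH-only policy as applying to replicas.
constexpr bool allowed_locally(Policy p, bool is_auth) {
  return p != NONE && (is_auth ? (p <= AUTH) : (p < AUTH));
}
```

With these assumed values, allowed_locally(AUTH, false) is false, which is the case the "<= AUTH" comparison got wrong for non-auth nodes.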
Sage Weil
36fe2ab2d3 Merge branch 'testing' into unstable 2010-09-22 13:45:13 -07:00
Sage Weil
79b6f2f9e9 mon: return errors (not 0) from MonitorStore::get_bl_ss()
Checked callers, should be fine.
2010-09-22 13:32:11 -07:00
Sage Weil
a783f409e5 mon: move election start reset to starting_election() helper
An election can start either because we call it, or because someone else
calls it.  Either way, we need to reset our state, so move that code into
the election_starting() callback, which is called by the elector's
start()/call_election() anyway.

This hopefully fixes a case where we see a timeout expire on the monitor
and fail the assertion

mon/Paxos.cc: In function 'void Paxos::lease_timeout()':
mon/Paxos.cc:684: FAILED assert(mon->is_peon())
 1: (SafeTimer::EventWrapper::finish(int)+0x259) [0x52da29]
 2: (Timer::timer_entry()+0x8e3) [0x52f523]
 3: (Timer::TimerThread::entry()+0xd) [0x46d45d]
 4: (Thread::_entry_func(void*)+0xa) [0x458aca]
 5: (()+0x6a3a) [0x7fe0bd6a4a3a]
 6: (clone()+0x6d) [0x7fe0bc8c277d]

The Paxos::election_starting() hook resets the timer, and will at least
close this possible cause.

Reported-by: Henry C Chang <henry_c_chang@tcloudcomputing.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2010-09-22 12:09:09 -07:00
Greg Farnum
79166a2821 mds: distribute flocklock properly!
Previously we weren't handling it in a lot of our distributed system
areas, which would have broken stuff if it were being used.
2010-09-22 11:43:16 -07:00
Greg Farnum
6efd1e8acd mds: distribute flocklock properly!
Previously we weren't handling it in a lot of our distributed system
areas, which would have broken stuff if it were being used.
2010-09-22 11:40:23 -07:00
Greg Farnum
96c08e4fc3 mds: Make SimpleLock wait shift bits unique like they should be.
This wasn't actually breaking stuff before, but it did mean
we woke up stuff we didn't need to.
2010-09-22 11:16:58 -07:00
Greg Farnum
84a09bae46 mds: Make SimpleLock wait shift bits unique like they should be.
This wasn't actually breaking stuff before, but it did mean
we woke up stuff we didn't need to.
2010-09-22 11:14:08 -07:00
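
Illustration for 96c08e4fc3 / 84a09bae46 (not part of the commits): a minimal sketch of why per-lock wait shifts need to be unique. It assumes each lock owns a small group of wait-flag bits inside one shared wait mask, positioned by a per-lock shift; the constants and function names are hypothetical, not Ceph's real values.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout for illustration; not Ceph's real constants.
constexpr int WAIT_BITS = 4;  // number of wait flags per lock (assumed)
constexpr std::uint64_t WAIT_ALL = (1ull << WAIT_BITS) - 1;

// Bits in the shared wait mask that belong to a lock with the given shift.
constexpr std::uint64_t lock_wait_mask(int shift) {
  return WAIT_ALL << shift;
}

// Two locks wake each other's waiters if their bit ranges intersect.
constexpr bool spurious_wakeups(int shift_a, int shift_b) {
  return (lock_wait_mask(shift_a) & lock_wait_mask(shift_b)) != 0;
}
```

Under this layout, two locks sharing a shift (or using shifts closer together than WAIT_BITS) overlap and wake waiters unnecessarily, which matches the commit's description of a harmless but wasteful bug; shifts spaced WAIT_BITS apart do not overlap.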
Greg Farnum
2c5a3d99aa mon: Fix infinite looping, if failed_notes is empty.
Reported-by: Henry C Chang <henry_c_chang@tcloudcomputing.com>
2010-09-22 10:25:28 -07:00
Sage Weil
2e71037250 mon: add debug output 2010-09-22 09:26:55 -07:00
Sage Weil
01b58f38cc msgr: do not open connection when policy indicates we are lossy server
We should not initiate a connection if we are a lossy server; just drop
the message.
2010-09-22 09:26:35 -07:00
Yehuda Sadeh
4b4bdb494b rgw: url_decode url prefix 2010-09-21 15:10:48 -07:00
Yehuda Sadeh
8fc9adfa8f rgw: url_decode delimiter 2010-09-21 15:10:44 -07:00
Greg Farnum
c33685764a Makefile: move tcmalloc checks outside of FUSE checks. Whoops. 2010-09-21 15:04:38 -07:00
Greg Farnum
04de6b8ee7 Merge branch 'profiling' into unstable
Conflicts:
	src/Makefile.am
2010-09-21 15:03:35 -07:00
Greg Farnum
ca2f2d55ea mds: enable tcmalloc profiling on MDSes. Add commands to start/dump/stop. 2010-09-21 14:40:57 -07:00
Greg Farnum
a850708a73 osd: enable tcmalloc profiling on OSDs. Add commands to start/dump/stop. 2010-09-21 14:40:48 -07:00
Greg Farnum
0ef684dc6d config: build infrastructure for handling tcmalloc's profiling. 2010-09-21 14:40:10 -07:00
Sage Weil
381447d9f5 qa: add snaptest-git-ceph.sh 2010-09-21 13:55:12 -07:00
Sage Weil
6cb6aa1446 mds: correctly set straydn->first for rename target
Make sure the straydn->first matches the rename target (destdnl->inode).
Unfortunately the cow happens _after_ the destdn->first is set, so instead
of trivially copying it, we dup the MAX calculation.  Add some temp
variables to clean up similar code in this method.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-09-21 13:55:06 -07:00
Sage Weil
136aa978e2 Merge branch 'testing' into unstable
Conflicts:
	src/mds/MDCache.cc
2010-09-21 13:54:57 -07:00
Sage Weil
b7c4185793 mds: do full pre_dirty()/mark_dirty() on cowed dentries
The dir commit/fetch and LogSegment::try_to_expire() rely on any new or
items in the directory getting new versions that correspond to a bump in
the dirfrag version.  This must include dentries/inodes that are created
by the cow process, or else we have problems during dir commit/fetch or
segment expire.

Change the dirty list in the Mutation to include the pv so that we can
properly mark them dirty later.

Leave the inode one alone.  We could theoretically do the same for the
dirty inodes, but this way we avoid projecting them and copying stuff
around.  Any dirty cowed inode will also have a dirty dentry, so it will
still get saved regardless.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-09-21 13:54:13 -07:00
Sage Weil
3aa948f9aa mds: only return pdnvec for full path_traverse
We should only return the pdnvec for a full traverse.  i.e., either a
success, or a failure in which we instantiate a null dn for the trailing
entry.  This makes pdnvec well defined, and allows callers like
rdlock_path_pin_ref() to reply with a null lease when appropriate.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-09-21 13:54:00 -07:00
Sage Weil
fa277aef7b mds: don't instantiate null dentries for snapped namespace
The dentry needs a [first,last] range and we don't know what first is when
we miss a lookup.  And part of the point of instantiating null dentries is
to issue leases against them, which we don't do.  The client will cache
the null result.
2010-09-21 13:53:29 -07:00
Sage Weil
f080bb9675 rgw: url_decode delimiter 2010-09-21 13:52:42 -07:00
Greg Farnum
23b1b52b80 makefile: build cfuse with tcmalloc 2010-09-21 11:20:57 -07:00
Sage Weil
ba1af748dc Merge remote branch 'origin/objecter_ratelimit' into unstable 2010-09-21 11:00:23 -07:00
Greg Farnum
f4be4b936f librados: throttle messages via the objecter 2010-09-20 11:02:51 -07:00