Commit Graph

14144 Commits

Author SHA1 Message Date
Sage Weil
e942a2a000 mds: make trim_non_auth paths complete filepaths (not dnames)
Signed-off-by: Sage Weil <sage@newdream.net>
2011-04-28 13:34:34 -07:00
Sage Weil
21cc059eab mds: fix steal_dentry dir_auth_pins adjustment
Pass down the correct value for dir_auth_pins (dh->auth_pins plus the
inode's auth_pins, but nothing nested beneath the inode).  The CDentry
doesn't track dir auth pins independently, and doesn't really need to.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-04-28 13:22:30 -07:00
Sage Weil
81041de1e1 mon: use tcmalloc
Signed-off-by: Sage Weil <sage@newdream.net>
2011-04-28 13:08:34 -07:00
Sage Weil
8e4eed58bd mds: fix export_prep trace format
The prep message includes a spanning tree in the interior of the subtree
that includes all parent inodes of bounding dirfrags.  That used to look
like
	df dentry inode (dir dentry inode)*

The code to generate those traces was stopping if the df->ino had already
been included.  The problem was that we may have done that inode on a
different dirfrag.

Change this to be

	df ('-' | ('f' dir | 'd') dentry inode (dir dentry inode)*)

so that we can start with a dentry (already had the dirfrag, same check
as before) or a dirfrag (already had the inode, the new case), or a '-'
(nothing at all).  A single byte is used to indicate which it is and how
to start decoding.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-04-28 13:00:44 -07:00
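The new trace layout in the 8e4eed58bd message above, where a single start byte says how decoding begins, can be sketched roughly as follows. This is an illustration only and not the MDS code: the real trace is a binary encoding, whereas this sketch assumes a hypothetical list of `(kind, value)` tokens, and the names `decode_trace` and `Token` are made up for the example.

```python
from typing import List, Optional, Tuple

# Hypothetical token form: (kind, value), where kind is one of
# "df", "start", "dir", "dentry", "inode". The real trace is binary.
Token = Tuple[str, str]


def decode_trace(tokens: List[Token]) -> dict:
    """Decode one trace laid out as
        df ('-' | ('f' dir | 'd') dentry inode (dir dentry inode)*)
    """
    pos = 0

    def take(kind: str) -> str:
        # Consume the next token, insisting on the expected kind.
        nonlocal pos
        if pos >= len(tokens) or tokens[pos][0] != kind:
            raise ValueError(f"expected {kind} at position {pos}")
        value = tokens[pos][1]
        pos += 1
        return value

    df = take("df")
    start = take("start")
    steps: List[Tuple[Optional[str], str, str]] = []

    if start == "-":
        # Nothing at all follows the dirfrag.
        if pos != len(tokens):
            raise ValueError("unexpected data after '-'")
        return {"df": df, "start": start, "steps": steps}

    if start == "f":
        # Start with a dirfrag (the inode was already sent).
        first_dir: Optional[str] = take("dir")
    elif start == "d":
        # Start with a dentry (the dirfrag was already sent).
        first_dir = None
    else:
        raise ValueError(f"unknown start byte {start!r}")

    steps.append((first_dir, take("dentry"), take("inode")))
    # Remaining groups always carry all three parts.
    while pos < len(tokens):
        steps.append((take("dir"), take("dentry"), take("inode")))
    return {"df": df, "start": start, "steps": steps}
```

For example, `decode_trace([("df", "1.0"), ("start", "d"), ("dentry", "a"), ("inode", "2")])` yields one step whose dirfrag slot is `None`, because the dirfrag had already been included.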
Sage Weil
5d6718e676 libceph: no _t types
Signed-off-by: Sage Weil <sage@newdream.net>
2011-04-28 12:34:11 -07:00
Yehuda Sadeh
4b9c093c70 lfn: short fn length is constant and accurate
also, disabling real hashing for now
2011-04-28 11:24:50 -07:00
Yehuda Sadeh
c8859f0471 osd: bump up max object name size
2011-04-28 11:16:17 -07:00
Yehuda Sadeh
7dd592aaf1 crypto: add support for SHA256
2011-04-28 11:15:50 -07:00
Sage Weil
1fd2784d74 libceph: typedef struct mystruct *mystruct_t
Needed to drop the ceph_ prefix on the internal ceph_dir_result_t type
to prevent the ceph_dir_result_t typedef from colliding.

ceph_mount_info to avoid colliding with int ceph_mount().

Signed-off-by: Sage Weil <sage@newdream.net>
2011-04-28 11:11:14 -07:00
Yehuda Sadeh
aad7006ff3 Merge commit 'origin/master' into lfn
2011-04-28 11:04:40 -07:00
Sage Weil
deb27efb9f libceph: include 'struct' in declarations for C compilation
Signed-off-by: Sage Weil <sage@newdream.net>
2011-04-28 10:37:18 -07:00
Sage Weil
8aab0eca3f mds: fix auth_pin check
The inode only gets an auth_pin if the dirfrag is not a subtree root.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-04-28 09:30:29 -07:00
Sage Weil
a61b519512 Merge branch 'master' of ceph.newdream.net:git/ceph
2011-04-27 17:08:49 -07:00
Sage Weil
a490d1e45d mds: handle freeze completion delayed by frozen inode
We can't complete a freeze_tree if we are not a subtree and the parent
inode is frozen.  If that's the case, we were just doing nothing on the
auth_unpin, but that means the freeze_tree would never complete.

Instead, retake an auth_pin (on behalf of the parent) and release it when
the parent inode unfreezes.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-04-27 17:07:54 -07:00
Yehuda Sadeh
f998bf64ec lfn: replace hash function
for some reason crashes when using libnss
2011-04-27 16:35:35 -07:00
Sage Weil
0a80865f5c mds: add 'mds debug auth pins' option
This counts dirfrag auth_pins and ensures the inode's nested_auth_pins
count is correct.  Helped catch the bug fixed in the previous commit.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-04-27 16:34:16 -07:00
Sage Weil
d994e8b4b8 mds: fix nested_auth_pin accounting on refragment
The diri gets an auth_pin on the first frag pin when it is not a subtree
root.  When we are moving dentries between frags during refragment, make
sure we use the adjust_nested_auth_pins method to have one such pin per
fragment.

Carry an auth_pin on the old fragment for the duration to ensure that the
pinning/unpinning has no side-effects.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-04-27 16:33:21 -07:00
Sage Weil
27135c9e58 mds: maintain dn pinning invariants during freezing for refragmenting
fragment_mark_and_complete aims to complete the in-cache directory,
mark+pin every dentry, then drop a final auth_pin so that the whole thing
freezes.  The problem is we may not be holding the final auth_pin, and
other dentries may get added (or removed?) between the mark and freeze
stages.

Use the DNPINNEDFRAG dir state bit to maintain the invariant that that
bit is set IFF all dentries are similarly pinned and marked.  Update the
add_*_dentry and remove_dentry methods to do that.

Fix the success path to assert this was true and to clean up(!).  Also
fix the unwind/failure path to assert.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-04-27 15:52:26 -07:00
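The invariant from the 27135c9e58 message above (the state bit is set if and only if every dentry is pinned and marked, with the add and remove paths keeping it true) can be illustrated with a small sketch. This is not the CDir implementation: the class names, the `dn_pinned_frag` flag standing in for DNPINNEDFRAG, and the single `frag_pinned` boolean per dentry are all assumptions made for the example.

```python
class Dentry:
    def __init__(self, name: str) -> None:
        self.name = name
        self.frag_pinned = False  # pinned and marked for refragmenting


class Dir:
    def __init__(self) -> None:
        self.dentries = {}
        self.dn_pinned_frag = False  # stand-in for the DNPINNEDFRAG bit

    def check_invariant(self) -> None:
        # Bit set: every dentry is pinned. Bit clear: none is.
        for dn in self.dentries.values():
            assert dn.frag_pinned == self.dn_pinned_frag

    def add_dentry(self, name: str) -> Dentry:
        dn = Dentry(name)
        # A dentry added after the mark stage must be pinned too.
        dn.frag_pinned = self.dn_pinned_frag
        self.dentries[name] = dn
        self.check_invariant()
        return dn

    def remove_dentry(self, name: str) -> Dentry:
        dn = self.dentries.pop(name)
        # Drop the pin so a removed dentry does not keep holding it.
        dn.frag_pinned = False
        self.check_invariant()
        return dn

    def mark_and_pin_all(self) -> None:
        for dn in self.dentries.values():
            dn.frag_pinned = True
        self.dn_pinned_frag = True
        self.check_invariant()

    def unmark_all(self) -> None:
        # Used on both the success and the unwind path in this sketch.
        self.check_invariant()
        for dn in self.dentries.values():
            dn.frag_pinned = False
        self.dn_pinned_frag = False
```

Because `add_dentry` and `remove_dentry` consult the flag, dentries that appear or disappear between the mark and freeze stages cannot break the invariant, and the cleanup path can assert it before unwinding.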
Sage Weil
d6917cd731 mds: freeze fragments during split/merge
Freeze the target fragment(s) before unfreezing the old fragment(s) to
avoid any weird events going off when the unfreeze unauth_pins the dir
inode (in certain cases).  This makes the whole process cleaner and more
symmetrical.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-04-27 15:09:35 -07:00
Yehuda Sadeh
af6ed09848 lfn: some more fixes
2011-04-27 13:03:28 -07:00
Tommi Virtanen
c9825f0874 automake: Make debug targets known but not built by default in non-debug builds.
With this, "./configure --without-debug && make -C src testceph" will work.
Before this, it would use make builtin rules, and fail to compile in a
confusing manner.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
2011-04-27 12:16:52 -07:00
Greg Farnum
63b0cfa369 mds: remove erroneous fixme.
This is for the client map journaling, but that's handled
elsewhere within this function...no idea why it ever had
a fixme there!

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-04-27 11:24:43 -07:00
Sage Weil
d1116818f0 mds: handle discovers that race with refragmenting
Consider:

 - send discover on frag X
 - X refragments
   - we take the waiter and rediscover on frag Y
 - we get the reply for the X discover

The auth mds will correctly delay sending the reply until the refragment
completes and it unfreezes, but the reply was getting the original frag_t,
not the new one.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-04-27 10:59:40 -07:00
Greg Farnum
a76d583192 mds: Replay new client sessions on slave-rename importing.
We've been logging the sessions for ages but never
actually opened them.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-04-27 10:52:09 -07:00
Sage Weil
33d1ea0d4f mds: pay attention to *stat staleness during split
Leave only the first frag stale, since we are already doing that with the
accounted_ differential.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-04-27 10:14:10 -07:00
Sage Weil
7aef5444b8 mds: merge accounted_* stats
Signed-off-by: Sage Weil <sage@newdream.net>
2011-04-27 10:00:30 -07:00
Colin Patrick McCabe
512ab30759 obsync: use lxml to parse XML ACL
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
2011-04-26 23:57:00 -07:00
Sage Weil
698b1eadbe libceph: move header file to include/ceph/libceph.h
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-04-26 20:45:36 -07:00
Yehuda Sadeh
6f074241bf lfn: some fixes
2011-04-26 17:24:53 -07:00
Yehuda Sadeh
12542c8ab3 lfn: amend long file name hashing
2011-04-26 16:49:19 -07:00
Sage Weil
a68340e434 mds: ignore resolve messages received prior to resolve stage
Signed-off-by: Sage Weil <sage@newdream.net>
2011-04-26 16:46:57 -07:00
Sage Weil
68c2b57864 mds: handle aborted export during pre-export sync
Signed-off-by: Sage Weil <sage@newdream.net>
2011-04-26 16:39:18 -07:00
Yehuda Sadeh
2f16916b70 lfn: push cid/oid translation down
compiles now, not tested, probably doesn't work
2011-04-26 16:33:30 -07:00
Sage Weil
f6d1ccb694 mds: drop messages to down mdss
...instead of asserting in MDSMap::get_inst.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-04-26 16:28:03 -07:00
Sage Weil
dd183ca081 mds: do not send heartbeat when degraded
Signed-off-by: Sage Weil <sage@newdream.net>
2011-04-26 16:18:21 -07:00
Sage Weil
e37878e04f mds: fix discover tid assignment
Hmm!

Signed-off-by: Sage Weil <sage@newdream.net>
2011-04-26 16:09:43 -07:00
Sage Weil
35efa2ba38 vstart.sh: remove cruft
Signed-off-by: Sage Weil <sage@newdream.net>
2011-04-26 15:51:11 -07:00
Sage Weil
1f5b620240 mon: fix standby-replay assignment (again)
Only assign a random node to standby-replay if they are marked as
STANDBY_ANY.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-04-26 15:44:56 -07:00
Jim Schutt
319c20f20e auth: Avoid const mismatch in nss_aes_operation
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
Signed-off-by: Jim Schutt <jaschut@sandia.gov>
2011-04-26 15:39:19 -07:00
Jim Schutt
9854e27f7e configure.ac: check for supported compiler flags
Ancient versions of gcc, such as the gcc 4.1.2 in RHEL 5.5, don't
support some -W flags that newer versions do.  Fix up configure.ac
and Makefile.am to use them if you have them.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
2011-04-26 15:37:44 -07:00
Sage Weil
00a25206f1 vstart.sh: set up pairs for each rank when -s is on
Signed-off-by: Sage Weil <sage@newdream.net>
2011-04-26 15:31:53 -07:00
Sage Weil
28edbe5f31 mon: rework assignment of standby-replay, expansion nodes
Signed-off-by: Sage Weil <sage@newdream.net>
2011-04-26 15:26:57 -07:00
Sage Weil
53a8e7d6de mon: fix standby-replay assignment logic
Assign a standby-replay at any time based on rank, name, or no preference.
Previously this could only happen when the MDS first started, and we would
fail if the target MDS wasn't followable at that point in time.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-04-26 14:43:24 -07:00
Sage Weil
01f3526b62 Merge branch 'osd_trans'
2011-04-26 13:40:34 -07:00
Sage Weil
7c771108ac Merge remote branch 'origin/stable'
2011-04-26 13:40:21 -07:00
Sage Weil
6025dee1b4 osd: move watch/notify effects out of do_osd_ops
Apply watch/notify side effects in do_osd_op_effects() only if the
transaction will succeed.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-04-26 13:39:58 -07:00
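The idea in the 6025dee1b4 message above, collecting side effects while the ops are processed and applying them only once the transaction is known to succeed, might look like the following in miniature. This is a sketch under stated assumptions and not the OSD code: the op tuples, the `do_ops` and `execute` functions, and the `watchers` and `notifications` containers are hypothetical.

```python
def do_ops(ops, state):
    """Stage writes and collect side effects without applying either.

    Returns (staged_state, effects), or (None, []) if any op is invalid.
    """
    staged = dict(state)
    effects = []
    for op in ops:
        kind = op[0]
        if kind == "write":
            _, key, value = op
            staged[key] = value
        elif kind in ("watch", "notify"):
            # Only record the effect here; do not act on it yet.
            effects.append((kind, op[1]))
        else:
            return None, []
    return staged, effects


def execute(ops, state, watchers, notifications):
    """Run ops as one transaction; apply effects only on success."""
    staged, effects = do_ops(ops, state)
    if staged is None:
        # The transaction fails: no writes and no side effects.
        return False
    state.clear()
    state.update(staged)
    for kind, arg in effects:
        if kind == "watch":
            watchers.add(arg)
        else:
            notifications.append(arg)
    return True
```

With this split, a request such as `[("watch", "client1"), ("bogus",)]` leaves `watchers` untouched, because the failure is detected before any effect is applied.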
Colin Patrick McCabe
0aeab99c83 obsync: implement RadosStore
Implement RadosStore, a storage backend which accesses librados
directly, without going through RGW (Rados GateWay).

This version is still very preliminary because ACLs aren't supported.
We need ACLs even to do things like properly create buckets.
Instead, this version has ACL_HACK, which is just for testing purposes.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
2011-04-26 12:36:15 -07:00
Sage Weil
ccf11fbe33 osd: mention invalid snapc in log
Signed-off-by: Sage Weil <sage@newdream.net>
2011-04-26 12:34:21 -07:00
Sage Weil
896de0ac94 osd: include (some) osd op flags in MOSDOp print method
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-04-26 12:10:33 -07:00
Sage Weil
b08ee2c634 osd: add RWORDERED osd op flag
Order this op wrt reads the same way a read-modify-write would be.
(Otherwise we may get a fast/stale read result on a not-yet-complete
write.)

This fixes a problem where the Filer was marking a probe stat as a write
to get this same effect, but the OSD would EINVAL if it was a snapped
object (which happens in certain cases where the MDS is recovering the
file size of a snapped file).

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-04-26 12:10:12 -07:00