13918 Commits

Author SHA1 Message Date
Sage Weil
36f0068563 cauthtool: -C not -c in man page
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-04-19 15:33:16 -07:00
Sage Weil
f9056d0d9f osd: better debug output on replay completion
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-04-19 14:32:56 -07:00
Sage Weil
634dfc90c8 mkcephfs: allow a prebuilt osdmap to be specified
Otherwise we'll create one with osdmaptool --createsimple with the default
generic settings.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-04-19 14:13:02 -07:00
Sage Weil
4428d1ec3b Revert "Revert "autoconf: Complain if tcmalloc is not found.""
This reverts commit 05c281bfa9.

This should be okay now.
2011-04-19 12:05:36 -07:00
Tommi Virtanen
f6179fc375 debian: Handle missing tcmalloc on Debian lenny.
lenny doesn't have a suitable libgoogle-perftools-dev, and
release.sh edits it out of build-deps. Detect that and tell
configure that not having tcmalloc is ok.

This should make 05c281bfa9
unnecessary.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
2011-04-19 12:05:32 -07:00
Tommi Virtanen
0d98a62ce2 debian: Build without tcmalloc on non-i386/amd64.
This is not strictly needed as of 05c281bfa9,
but that revert is hopefully only temporary.

Without this, with 05c281 undone, non-mainstream architectures
would fail to build.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
2011-04-19 12:05:29 -07:00
Sage Weil
d55399ffec mds: remove MDSlaveUpdate from list on deletion
These are added to the LogSegment list on the slaves, but also need to be
removed from that list when we replay a COMMIT|ROLLBACK or when the op's
fate is determined during the resolve stage.

This fixes a crash like

./include/elist.h: In function 'elist<T>::item::~item() [with T =
MDSlaveUpdate*]', in thread '0x7fb2004d5700'
./include/elist.h: 39: FAILED assert(!is_on_list())
 ceph version 0.26 (commit:9981ff90968398da43c63106694d661f5e3d07d5)
 1: (MDSlaveUpdate::~MDSlaveUpdate()+0x59) [0x4d9fe9]
 2: (ESlaveUpdate::replay(MDS*)+0x422) [0x4d2772]
 3: (MDLog::_replay_thread()+0xb90) [0x67f850]
 4: (MDLog::ReplayThread::entry()+0xd) [0x4b89ed]
 5: (()+0x7971) [0x7fb20564a971]
 6: (clone()+0x6d) [0x7fb2042e692d]

Fixes: #1019
Signed-off-by: Sage Weil <sage@newdream.net>
2011-04-19 09:25:30 -07:00
Sage Weil
e4e2b742fe Merge commit '8038c491ba90a8cbcd569e84d4cafc8bbdff81d5' into next
2011-04-18 16:26:06 -07:00
Sage Weil
c93c6619ff Merge remote branch 'origin/stable' into next
2011-04-18 16:23:03 -07:00
Sage Weil
68863bb453 osd: make ZERO on non-existent object a no-op
Fixes bug where oi.size gets out of sync with the object size because we
actually write zeros.  (This explains #933.)

Signed-off-by: Sage Weil <sage@newdream.net>
2011-04-18 13:55:16 -07:00
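
The behavior described in 68863bb453 can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyObjectStore`, its `data` and `size` fields, and the choice to clamp zeroing to the existing object length are all hypothetical and are not Ceph's OSD code. It shows why treating ZERO on a missing object as a no-op keeps the recorded size (standing in for oi.size) in step with the stored bytes.

```python
class ToyObjectStore:
    """Toy in-memory store (hypothetical; not Ceph's ObjectStore API)."""

    def __init__(self):
        self.data = {}  # object name -> bytearray of stored bytes
        self.size = {}  # object name -> recorded size (stands in for oi.size)

    def write(self, name, offset, payload):
        # Create the object on first write and grow it if needed.
        buf = self.data.setdefault(name, bytearray())
        end = offset + len(payload)
        if len(buf) < end:
            buf.extend(b"\0" * (end - len(buf)))
        buf[offset:end] = payload
        self.size[name] = len(buf)

    def zero(self, name, offset, length):
        # ZERO on a non-existent object is a no-op: nothing is created,
        # so the recorded size cannot drift from the stored bytes.
        if name not in self.data:
            return
        buf = self.data[name]
        # Assumption for this sketch: only zero bytes that already exist.
        end = min(offset + length, len(buf))
        if offset < end:
            buf[offset:end] = b"\0" * (end - offset)


store = ToyObjectStore()
store.zero("missing", 0, 4096)       # leaves the store untouched
store.write("obj", 0, b"abcdef")
store.zero("obj", 2, 2)              # zeroes bytes 2 and 3 only
```

In this model the invariant is that `size[name]` equals `len(data[name])` for every object that exists, and that a ZERO never creates an object.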
Colin Patrick McCabe
8038c491ba clitests: fix radosgw_admin test
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
2011-04-18 11:44:37 -07:00
Colin Patrick McCabe
3f275bcf3c clitests: eliminate use of old-style section name
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
2011-04-18 11:44:37 -07:00
Greg Farnum
6058a36c4e MDS: move slave rename xlock handling before finish_export_inode.
finish_export_inode changes states! That's not good for our checks,
so just handle unpinning and stuff before we finish_export_inode.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-04-18 10:41:29 -07:00
Greg Farnum
14dd299489 improve debug printing
2011-04-18 10:41:08 -07:00
Greg Farnum
d857983301 mds: Unify migration-handling code in _commit_slave_rename.
We need to handle locks and pins on exported inodes but we
were using a separate if block with its own (non-matching!) check
for no good reason.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-04-18 10:41:02 -07:00
Greg Farnum
6bd20815e2 mds: _commit_slave_rename needs to drop auth_pins for exported xlocks.
Otherwise these pins are never dropped from the inode since we
don't go through our normal xlock teardown code. Now we do!

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-04-18 10:40:43 -07:00
Greg Farnum
1a6f43763f MDS: Make _rename_apply inode import auth_pinning more intelligent.
We don't want auth_pins on the locallocks (they're never auth_pinned)
and we only want new auth_pins that are for locks on the inode that we
imported -- not for each xlock that the mdr has everywhere (like,
say, on the srcdn)!

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-04-18 10:40:21 -07:00
Greg Farnum
478c617311 mds: If we're a slave, clean up xlocks when we export an inode.
Because we can do an inode import during a rename that skips the usual
channels, we were getting into an odd state with the xlocks (which we
did as a slave for an inode that we exported away). Clean up the
record of these xlocks for inodes before we get into the request
cleanup (at which point we are labeled as no-longer-auth, and the
standard cleanup routines will break).

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-04-18 10:40:10 -07:00
Greg Farnum
5299aabe1c mds: properly drop imported xlocks.
Because we can do an inode import during a rename that skips the usual
channels, we were getting into an odd state with the xlocks (which
were formerly remote and are now local). Clean up the record of
those remote xlocks.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-04-18 10:40:04 -07:00
Greg Farnum
97e357c430 MDS: Server takes auth_pins for xlocks on imported inodes.
Should fix #934.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-04-18 10:36:22 -07:00
Sage Weil
216fd77610 objecter: resub ops on full->nonfull transition
This was broken a while ago during the last refactor.  Whoops!  Clean it
up to be smarter (and work at all).

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-04-18 10:15:07 -07:00
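
The full->nonfull transition in 216fd77610 can be sketched as a small state machine. Everything here is an assumption made for illustration: `ToyObjecter`, `submit_write`, and `handle_map` are hypothetical names, and the idea that writes are held while the cluster is flagged full is a simplification, not the actual Objecter implementation.

```python
class ToyObjecter:
    """Toy client-side op tracker (hypothetical; not Ceph's Objecter)."""

    def __init__(self):
        self.full = False
        self.pending = []  # writes held back while the cluster is full
        self.sent = []     # writes handed to the (imaginary) OSDs

    def submit_write(self, op):
        # Assumption: while full, writes are queued instead of sent.
        if self.full:
            self.pending.append(op)
        else:
            self.sent.append(op)

    def handle_map(self, full):
        # Resubmit held ops only on the full -> nonfull edge.
        was_full = self.full
        self.full = full
        if was_full and not full:
            held, self.pending = self.pending, []
            self.sent.extend(held)


objecter = ToyObjecter()
objecter.handle_map(full=True)
objecter.submit_write("w1")
objecter.submit_write("w2")
held_while_full = list(objecter.pending)  # snapshot before the transition
objecter.handle_map(full=False)           # transition triggers the resend
```

Keying the resend on the edge (was full, now not full) rather than on the current flag avoids resubmitting on every map update.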
Sage Weil
c966410fab osd: show "full" or "nearfull" in osdmap summary line
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-04-18 09:57:55 -07:00
Sage Weil
879adb6190 Merge remote branch 'origin/stable'
Conflicts:
	src/osdc/Journaler.cc
2011-04-18 09:58:15 -07:00
Yehuda Sadeh
fa7061d2da Merge branch 'rgw_uid'
2011-04-18 09:56:08 -07:00
Yehuda Sadeh
796528c3db rgw: remove get_user_info() and clean up
rename all the get_uid_by_* to get_user_info_by_*, remove get_user_info()
and call the appropriate function instead (either the by_uid or by_access_key).
2011-04-18 08:56:52 -07:00
Yehuda Sadeh
d8fe208d06 rgw: store user info on all indexes in the same format
This breaks backward compatibility; we'll have to deal with that
later.
2011-04-18 08:32:09 -07:00
Yehuda Sadeh
11f1e2ef52 rgw_admin: can look up user by access key
2011-04-18 08:15:11 -07:00
Sage Weil
d778921888 mount.ceph: behave when CONFIG_KEYS is not compiled in
In that case we get ENOSYS.  This also implies an old version of the client
and that we should fall back.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-04-17 21:58:27 -07:00
Wido den Hollander
d21bdd6e29 radosgw_admin: Update manpage to new syntax
Signed-off-by: Wido den Hollander <wido@widodh.nl>
Signed-off-by: Colin McCabe <cmccabe@alumni.cmu.edu>
2011-04-17 17:42:04 -07:00
Greg Farnum
1eccc019ed MDS: Fix Locker::handle_reqrdlock for xlocked locks.
We previously dropped the request but that was inappropriate for that
one case because the replica has no way to trigger a resend.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-04-16 21:04:52 -07:00
Sage Weil
79cac5ee3a mds: Always _open_parents when opening a new snaprealm
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-04-16 21:03:48 -07:00
Greg Farnum
a028c8954c mds: don't run all of try_subtree_merge on a rename across MDSes.
Previously we'd try and do the whole thing, which meant that
the replica got a lock twiddle before it had finished the export.
That broke things spectacularly, since we weren't respecting our
invariants about who gets remote locking messages.
Now we pass through a flag and respect our invariants.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-04-16 21:03:37 -07:00
Greg Farnum
6250e82c00 mds: adjust LocalLock can_xlock_local().
I don't remember why we needed can_xlock_local() to begin with, but
I can tell that adding this get_xlock_by() check won't stop anything
working that was ever working to begin with (really it's still not
strong enough a check).

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-04-16 21:02:56 -07:00
Greg Farnum
5a65a04a9c mds: Extend use of find_ino_peers.
Missed a few places that need it.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-04-16 21:02:48 -07:00
Greg Farnum
bea966af2f mds: Make use of find_ino_peers
Previously we just had to give up on ESTALE. Now
we can attempt to recover!

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-04-16 21:02:45 -07:00
Greg Farnum
22e8519d03 random commenting
2011-04-16 21:01:46 -07:00
Greg Farnum
ace54db0c0 MDS: Remove inappropriate assert from _logged_slave_rename.
The slave also can hold some auth pins from locks which the
master has asked it to grab. It's possible we can intelligently
determine how many, but for now just drop the assert.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-04-16 21:01:38 -07:00
Greg Farnum
ac045dc3ec MDS: Server::handle_slave_rename_prep now accounts for dir snaplock.
Previously it ignored the auth pin required to hold snap xlock, which
is currently always held for a rename on a dir. This would lead to
a permanent hang on the request. Now we account for it!

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-04-16 21:01:15 -07:00
Greg Farnum
597e30edeb MDS: Don't move inode to snaprealms if not primary inode.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-04-16 21:00:05 -07:00
Greg Farnum
08bd2ef111 MDCache: update assert to account for being a slave.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-04-16 20:59:57 -07:00
Greg Farnum
569cce39f4 Server: push_projected_linkage in _link_remote
_link_remote_finish will pop the linkage if inc==true, so we'd
better push it to match!

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-04-16 20:59:49 -07:00
Greg Farnum
5b825c3a43 Server: ensure slave mdses have full dest tree
We were already taking rdlocks on the source tree, to make
sure that each slave MDS could traverse to the source dentry. Now,
if there are slave MDSes, we take rdlocks on each destination
ancestor to make sure the slaves can also traverse there.
This fixes an fsstress bug.

Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-04-16 20:59:45 -07:00
Yehuda Sadeh
544ce94ab9 rgw: basic support for separate uid and access key
2011-04-15 17:20:44 -07:00
Sage Weil
24f35e79db mds: fix null deref in debug
The *dir isn't always non-null (namely, during DISCOVERING state).

Signed-off-by: Sage Weil <sage@newdream.net>
2011-04-15 16:32:55 -07:00
Sage Weil
f85515141a mds: keep import/export subtree_map state in sync with journal
We were being sloppy before with the ESubtreeMap vs import/export events.
Fix that by doing a few things:

 - add an ambig flag to the subtree map items, and set it for in-progress
   imports.  That means an ESubtreeMap followed by EImportFinish will do
   the right thing now.
 - adjust the dir_auth on EExport journaling (handle_export_dir_ack) so
   that our journaled subtree_map state is always in sync with what we
   see during replay.

Also document clearly what the dir_auth variations actually mean.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-04-15 16:32:55 -07:00
Sage Weil
d94c69e580 mds: fix export cancel during IMPORT_PREPPING
If we are in PREPPING, we need to drop the stickydirs() on the inodes, and
not the pins on the dirfrags.  Do this in the helper so we can keep the
call chains simple.

Also deal with the case where we get a cancel in PREPPED state.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-04-15 16:32:55 -07:00
Sage Weil
07098fa5a9 mds: clean up trim_non_auth_subtree output
Signed-off-by: Sage Weil <sage@newdream.net>
2011-04-15 16:32:55 -07:00
Sage Weil
e15d9ca1d1 mds: cancel exports in PREPPING state on any failure
The prepping nodes may need to discover bounds from the failed node and
may hang indefinitely.  Meanwhile, we won't send out mds_resolve messages
until in-progress migrations complete.  Deadlock.

In certain cases the importing node can manufacture the replica.  If it
doesn't realize that right off, though, it will get hung up trying to
discover from the wrong node, get referred to the failed node, and block
waiting for recovery.  The replica forging is a bit suspect anyway, so
let's avoid the whole thing if we can!

Signed-off-by: Sage Weil <sage@newdream.net>
2011-04-15 16:32:55 -07:00
Sage Weil
c7385c1d0e mds: use helpers for import_reverse
Use helpers for common code shared between handle_export_cancel and
handle_mds_failure_or_stop.

Also include handling for IMPORT_PREPPING state, even though we don't use
it yet.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-04-15 16:32:55 -07:00
Sage Weil
777bcba0a1 mds: don't skip inodes in journal that may be trimmed during replay
During replay we trim non-auth inodes on EExport or EImportFinish abort.
Subtree trimming may be delayed, too.

Skip parents if the diri is in the same blob, or if it is journaled in the
current segment *and* it is in a subtree that is unambiguously auth.  We can't
easily be more precise than that because the actual event we care about on
replay is EExport, but the migrator doesn't twiddle auth bits to false until
later.

Also, reset last_journaled on import.

This fixes replay bugs like

2011-04-13 18:15:18.064029 7f65588ef710 mds1.journal EImportStart.replay 10000000015 bounds []
2011-04-13 18:15:18.064034 7f65588ef710 mds1.journal EMetaBlob.replay 2 dirlumps by unknown0
2011-04-13 18:15:18.064040 7f65588ef710 mds1.journal EMetaBlob.replay dir 10000000010
2011-04-13 18:15:18.064046 7f65588ef710 mds1.journal EMetaBlob.replay missing dir ino  10000000010
mds/journal.cc: In function 'void EMetaBlob::replay(MDS*, LogSegment*)', in thread '0x7f65588ef710'
mds/journal.cc: 407: FAILED assert(0)
 ceph version 0.25-683-g653580a (commit:653580ae84c471c34872f14a0308c78af71f7243)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x53) [0xa53d26]
 2: (EMetaBlob::replay(MDS*, LogSegment*)+0x7eb) [0x7a737d]

Fixes: #994
Signed-off-by: Sage Weil <sage@newdream.net>
2011-04-15 16:32:55 -07:00