Commit Graph

11814 Commits

Author SHA1 Message Date
Sage Weil
5eb8ef7f11 filejournal: fix throttle vs FULL behavior
We don't want to add to the throttler if we aren't going to queue the
write, or else we'll never take it off again.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-30 08:55:29 -08:00
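
The ordering issue described in 5eb8ef7f11 can be illustrated with a minimal sketch. This is not the FileJournal code: `Throttle`, `Journal`, `submit`, and `complete_one` are hypothetical names, and Python is used only for illustration. The sketch assumes the throttle is released only when a queued write completes, so budget must be taken only after deciding that the write will actually be queued.

```python
class Throttle:
    """Hypothetical byte-count throttle: take() on queue, put() on completion."""

    def __init__(self):
        self.current = 0

    def take(self, n):
        self.current += n

    def put(self, n):
        self.current -= n


class Journal:
    """Hypothetical journal with a fixed capacity in bytes."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0
        self.queue = []
        self.throttle = Throttle()

    def is_full(self, size):
        return self.used + size > self.capacity

    def submit(self, size):
        # Check for FULL first: a rejected write is never queued, so nothing
        # would ever release throttle budget taken on its behalf.
        if self.is_full(size):
            return False
        self.throttle.take(size)
        self.used += size
        self.queue.append(size)
        return True

    def complete_one(self):
        # Completion of a queued write is the only place budget is returned.
        size = self.queue.pop(0)
        self.throttle.put(size)
        return size
```

If `take()` were called before the `is_full()` check, every rejected write would leave its bytes counted in the throttle permanently.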
Greg Farnum
d8652de616 mdcache: in trim_non_auth, only print out path if it has a parent dentry.
This should only occur with the root inode, but caused a segfault for
anybody running more than one MDS who restarted.

Signed-off-by: Greg Farnum <gregf@hq.newdream.net>
2010-11-23 14:40:54 -08:00
Herb Shiu
8768b52dc4 mds: Reply checking_lock while reading filelock
Use checking_lock to replace lock_state in the extra buffer list so that the client gets the correct file lock reply.
2010-11-23 14:04:03 -08:00
Sage Weil
868665d5f2 v0.23.1 2010-11-22 23:02:09 -08:00
Greg Farnum
f7170f95f0 client: only encode_cap_releases once per request.
Accomplish this by making a list of cap releases in the (permanent)
MetaRequest, and then copying that into the (potentially-temporary)
MClientRequest.
2010-11-22 09:09:01 -08:00
Greg Farnum
c43455cee4 client: Remove the I_COMPLETE flag from the parent directory in relink_inode.
This papers over issues arising from the client's lack of proper support
for hard links, and lets it pass the snaptest-upchildrealms test.
2010-11-17 09:58:38 -08:00
Sage Weil
f18609e88a Merge remote branch 'origin/msgr' into testing 2010-11-12 20:43:30 -08:00
Sage Weil
2be4215a6b debug: don't print thread id twice
Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-12 16:00:12 -08:00
Sage Weil
b61af6a742 msgr: cleanup: make queue_received non-inline; some helpful debug
Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-12 15:59:50 -08:00
Sage Weil
f99c84e6b2 msgr: do not clear halt_delivery
We need to keep the halt_delivery plug set on failure/shutdown in order to
prevent a racing reader from queuing new messages.  The only time we clear
it is when we discard messages due to a session reset.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-12 15:56:54 -08:00
Sage Weil
d4746ab5ac msgr: close enqueue/discard race
We need to re-check halt_delivery after dropping and retaking pipe_lock.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-12 14:55:12 -08:00
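
The pattern named in d4746ab5ac (re-check a flag after dropping and retaking a lock) can be sketched as follows. This is an illustration under stated assumptions, not the messenger code: the `Pipe` class, its `enqueue` and `discard` methods, and the `while_unlocked` hook are hypothetical, and only the names `pipe_lock` and `halt_delivery` come from the commit message.

```python
import threading


class Pipe:
    """Hypothetical pipe with a delivery queue guarded by pipe_lock."""

    def __init__(self):
        self.pipe_lock = threading.Lock()
        self.halt_delivery = False
        self.queued = []

    def enqueue(self, msg, while_unlocked=lambda: None):
        """Caller must hold pipe_lock. Returns True if msg was queued."""
        if self.halt_delivery:
            return False
        # Drop the lock for work that must not hold it. Another thread may
        # run discard() during this window.
        self.pipe_lock.release()
        try:
            while_unlocked()
        finally:
            self.pipe_lock.acquire()
        # Re-check: the flag may have been set while the lock was dropped.
        if self.halt_delivery:
            return False
        self.queued.append(msg)
        return True

    def discard(self):
        """Stop delivery and drop anything already queued."""
        with self.pipe_lock:
            self.halt_delivery = True
            self.queued.clear()
```

Without the second check, a message could be appended after `discard()` had already emptied the queue.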
Sage Weil
1071a9abde msgr: protect pipe queue_item map with pipe_lock AND dispatch_queue lock
Close a few different races here.

Also, assert that queue_items are not queued in ~Pipe().

Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-12 14:55:12 -08:00
Sage Weil
70fe062ffa msgr: add 'ms inject socket failures = foo'
Where we fail roughly every foo'th socket operation.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-12 14:55:11 -08:00
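
One way to fail "roughly every foo'th" operation, as 70fe062ffa describes, is to fail each operation with probability 1/foo. The sketch below assumes that approach; the commit message does not say how the messenger implements it, and `FailureInjector` and `should_fail` are hypothetical names.

```python
import random


class FailureInjector:
    """Hypothetical helper: fail about one in `every` operations."""

    def __init__(self, every, seed=None):
        self.every = every
        self.rng = random.Random(seed)

    def should_fail(self):
        # A non-positive setting disables injection entirely.
        if self.every <= 0:
            return False
        # randrange(every) is 0 with probability 1/every.
        return self.rng.randrange(self.every) == 0
```

A caller would consult `should_fail()` before each socket operation and simulate an error when it returns True.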
Sage Weil
cbf154e134 msgr: only close socket on reconnect or shutdown
We can't modify 'sd' or (more importantly) close sd while any other thread
might be using it, or else we might race with an open and they might end
up using someone else's fd.

Take care to _only_ close(sd) in connect(), when the reader thread is
stopped, or when reaping the connection.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-12 14:55:11 -08:00
Sage Weil
20937e885f msgr: protect pipe queuing with _both_ pipe and dispatch_queue locks
We want to make sure the pipe's queue item doesn't go away.

Also, make queue_received() require pipe_lock to be held.  This avoids some
useless unlocking/locking, since (in the case where the pipe is already
queued) we then don't need to drop the pipe_lock at all.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-12 14:55:11 -08:00
Sage Weil
c5b2d28bc7 uclient: insert lssnap results under snapdir, not live dir
Put the readdir results (list of snapshots) in the right place in the
hierarchy; we were putting them in the parent dir (as if they were real
directories).

This bug manifested itself as a snaptest-2.sh failure.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-12 07:55:41 -08:00
Wido den Hollander
7ccdae8cd4 msg: fix buffer size for IPv6 address parsing
Signed-off-by: Wido den Hollander <wido@widodh.nl>
2010-11-12 07:36:00 -08:00
Sage Weil
5d1d8d0c46 v0.23 2010-11-10 21:18:37 -08:00
Sage Weil
3d10b34074 mds: fix null_snapflush with multiple intervening snaps
The client is allowed to not send a snapflush if there is no dirty metadata
to write for a given snap.  However, the mds can only look up inodes by
the last snapid in the interval.  So, when doing a null_snapflush (filling
in for snapflushes the client didn't send), we have to walk forward through
intervening snaps until we find the right inode.

Note that this means we will call _do_snap_update multiple times on the
same inode, but with different snapids.

Add unit test to check this.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-10 20:58:49 -08:00
Sage Weil
82aa79f863 mds: fix inode->frag rstat projected with snaps
The snapid 'first' value needs to be >= inode->first; move that into
the helper.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-10 09:43:56 -08:00
Sage Weil
5deef24396 osdmap: break up asserts for easier debugging
If we fail one of these it's helpful to know which one.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-10 09:04:31 -08:00
Sage Weil
586c9e7a80 objecter: throttle before looking at lock protected state
The take_op_budget() may drop our lock if we are in keep_balanced_budget
mode, so we need to do that _before_ we take references to internal state
that may change out from under us during that time.

This fixes a crash like

./osd/OSDMap.h: In function 'entity_inst_t OSDMap::get_inst(int)':
./osd/OSDMap.h:460: FAILED assert(exists(osd) && is_up(osd))
ceph version 0.22.1 (commit:c6f403a6f441184956e00659ce713eaee7014279)
1: (Objecter::op_submit(Objecter::Op*)+0x6c2) [0x38658854c2]
2: /usr/lib64/librados.so.1() [0x3865855dc9]
3: (RadosClient::aio_write(RadosClient::PoolCtx&, object_t, long,
ceph::buffer::list const&, unsigned long,
RadosClient::AioCompletion*)+0x24b) [0x386585724b]
4: (rados_aio_write()+0x9a) [0x386585741a]
5: /usr/bin/qemu-kvm() [0x45a305]
6: /usr/bin/qemu-kvm() [0x45a430]
7: /usr/bin/qemu-kvm() [0x43bb73]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is
needed to interpret this.
terminate called after throwing an instance of 'ceph::FailedAssertion'
*** Caught signal (ABRT) ***
ceph version 0.22.1 (commit:c6f403a6f441184956e00659ce713eaee7014279)
1: (sigabrt_handler(int)+0x91) [0x3865922b91]
2: /lib64/libc.so.6() [0x3c0c032a30]
3: (gsignal()+0x35) [0x3c0c0329b5]
4: (abort()+0x175) [0x3c0c034195]
5: (__gnu_cxx::__verbose_terminate_handler()+0x12d) [0x3c110beaad]

Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-10 09:03:37 -08:00
Sage Weil
57513739f5 mon: drop unnecessary state checks
We want to ignore all beacons from the mds regardless of what state they
are in.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-10 08:50:25 -08:00
Sage Weil
84840ed76b debian: don't explicitly depend on libgoogle-perftools0
dpkg-buildpackage will autodetect the dependency.  Except on lenny, where
it doesn't exist and we don't use it!

Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-10 08:45:36 -08:00
Greg Farnum
ca3693d8ff mds: Enable --journal_check mode.
This replaces the old --shadow option, which didn't work.
It starts up the MDS daemon, then replays the journal for
another MDS, and then shuts down.

Also minimally modifies the MDSMonitor to enable this
behavior, since it requires shared state.
2010-11-10 08:14:04 -08:00
Greg Farnum
214b726959 osdc: Fix bad assert in ~ObjectCacher.
The objects data member is never empty on shutdown since it now consists
of a vector of pools. Instead, check each pool map for emptiness.
2010-11-10 08:13:28 -08:00
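
The changed shutdown check in 214b726959 can be sketched as follows, assuming the container is a list of per-pool maps. The function name `all_pools_empty` is hypothetical and this is not the ObjectCacher destructor.

```python
def all_pools_empty(objects):
    """Return True if every per-pool map in `objects` is empty.

    `objects` is assumed to be a list of dicts, one per pool. The list
    itself is non-empty once pools exist, so testing the list for
    emptiness is the wrong check.
    """
    return all(len(pool) == 0 for pool in objects)
```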
Sage Weil
5035c82279 uclient: only update inode if version increased
This realigns the code with the kernel version, fixing a number of
problems when you have multiple MDSs returning info on the same inode.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-10 07:42:29 -08:00
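
The rule in 5035c82279 can be sketched as a version guard. This is a minimal illustration, not the uclient code: the dict layout and the name `apply_update` are assumptions.

```python
def apply_update(inode, incoming):
    """Apply `incoming` to `inode` only if its version is strictly newer.

    Both arguments are assumed to be dicts with a "version" key.
    Returns True if the update was applied.
    """
    if incoming["version"] > inode["version"]:
        inode.update(incoming)
        return True
    # Same or older version: keep what we have, for example when a second
    # MDS reports stale information about the same inode.
    return False
```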
Sage Weil
6bc31511c2 gui: add missing #include
Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-09 15:04:10 -08:00
Kacper Kowalik
1c7d8f1ac2 Makefile: use openssl module check
This allows ceph to build with --as-needed.

Signed-off-by: Kacper Kowalik <xarthisius@gentoo.org>
2010-11-09 13:30:15 -08:00
Sage Weil
954ad98230 osd: shut down if we do not exist
Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-09 13:17:25 -08:00
Sage Weil
ea56dfdc66 osd: handle osds that no longer exist in prior_set_affected
Consider no-longer-existent OSDs lost.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-09 13:08:56 -08:00
Sage Weil
e15c9569ba mds: fix inode freeze auth pin allowance
When we're renaming across nodes, we need to freeze the inode.  This
requires that we allow for the auth_pins that _we_ hold, which include
one because of the linklock xlock, and one by the MDRequest.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-09 10:04:02 -08:00
Sage Weil
ae13fc865b osd: handle osds that no longer exist in build_prior
Fix build_prior to handle OSDs that no longer exist in the current map.
Consider them lost.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-09 10:04:02 -08:00
Christian Brunner
aad3f7f275 ceph.spec.in: don't strip rados classes
Signed-off-by: Christian Brunner <christian@brunner-muc.de>
2010-11-08 22:03:15 -08:00
Sage Weil
64f95ad95c mds: add missing Dumper.[h,cc] 2010-11-08 13:22:08 -08:00
Sage Weil
be9328ac7d mds: tolerate/fix negative dir size counts
Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-08 13:18:31 -08:00
Sage Weil
0b19092098 Merge remote branch 'origin/testing' into unstable 2010-11-07 09:42:51 -08:00
Sage Weil
a4674af5db mds: eval: put scatter in MIX if replicated, otherwise LOCK
Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-07 07:49:59 -08:00
Sage Weil
33c6e230a2 mds: do not scatter_writebehind in MIX state
Replicas might come in while we're flushing and get a MIX state with
the old state.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-07 07:45:52 -08:00
Sage Weil
1bf8e73299 Merge branch 'unstable' into mix_stale 2010-11-06 21:05:11 -07:00
Sage Weil
bdc2fa5b34 mds: remove MIX_STALE
Yay, we don't need it!

If we can't update the frag on scatter, fine.  The staleness of the frag
is implicit in the frag's scatter stat version not matching the inode's.
If/when we do want to update it, the frag will clearly be writable, and
we can bring it back in sync then.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-06 21:01:20 -07:00
Sage Weil
c1ee560e42 mds: don't fuss with versions when taking frag/rstat from frag; it's never stale here
Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-06 21:01:20 -07:00
Sage Weil
1eb94da2f6 mds: introduce/use helpers to resync stale fragstat/rstat; update version
Simplifies code.

Also, update the version when we resync!

Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-06 21:01:20 -07:00
Sage Weil
c203482954 mds: ignore done_locking on slave requests' acquire_locks()
Slave requests ask for each xlock one at a time.  Don't bail out based on
the done_locking flag.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-06 21:00:55 -07:00
Sage Weil
51b6a863a8 mds: don't use helper for rename srcdn
The rdlock_path_xlock_dentry helper works for _auth_ dentries that we
create locally in an auth dirfrag.  For the srcdn, we need to discover an
_existing_ dentry that is not necessarily auth.

Call path_traverse ourselves, but be careful to take the appropriate locks
on the resulting dn, dir, and ancestors.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-06 21:00:51 -07:00
Sage Weil
eb0a60d024 mds: never complete a gather on a flushing lock
The scatter_writebehind() takes a wrlock, but that may still allow the lock
to complete a gather to LOCK and even move to say MIX before the data is
committed.  Bad news!

Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-06 21:00:46 -07:00
Sage Weil
bdf3bc5ef2 mds: update version when bringing stale rstat back up to date
Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-06 09:38:15 -07:00
Sage Weil
a74054d11d mds: simplify stale semantics a bit
is_stale() => next MIX is MIX_STALE. Stale flag is then cleared.  Then we
special case the import to preserve stale-ness.

TODO: add_replica_inode likely has this same problem.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-06 07:58:32 -07:00
Sage Weil
e27f111f31 mds: preserve stale state on import; some cleanup
Our new invariant is that MIX_STALE always implies is_stale().  And on
import, if is_stale(), MIX becomes MIX_STALE.  This ensures that a replica
that we put into MIX_STALE doesn't turn back into MIX if we import it
and take the auth's state in CInode::decode_import().

Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-05 21:52:28 -07:00
Sage Weil
a582345c57 Merge branch 'mix_stale' into unstable 2010-11-05 17:08:10 -07:00