Commit Graph

11497 Commits

Author SHA1 Message Date
Sage Weil
1b2e99275b debian: update scripts to do packaging fixes 2010-10-18 10:19:28 -07:00
Sage Weil
d44267c2d6 Revert "messenger: introduce a "halt_delivery" flag, checked by queue_delivery."
This reverts commit 69be0df61d.
2010-10-17 20:15:02 -07:00
Sage Weil
69b764a8d3 mon: add 'mds rm <gid>' and 'mds rmfailed <id>' commands
For cleaning up the mds map when things get weird.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-10-17 20:04:32 -07:00
Sage Weil
ce09cbdd94 Merge remote branch 'origin/testing' into testing 2010-10-17 20:00:03 -07:00
Sage Weil
8a7c95f60a v0.22 2010-10-15 15:34:44 -07:00
Sage Weil
2bc159e626 debian: no libgoogle-perftools-dev on lenny 2010-10-15 15:34:44 -07:00
Sage Weil
180f4412a9 mds: cleanup: clarify issue_seq in cap release debug output
Signed-off-by: Sage Weil <sage@newdream.net>
2010-10-15 13:41:03 -07:00
Sage Weil
b8ab009aa4 mds: cleanup: print waiter masks in hex
Signed-off-by: Sage Weil <sage@newdream.net>
2010-10-15 13:41:03 -07:00
Sage Weil
0e472d4a5a mds: use correct helper when pinning past snaprealm parent
The helper also updates SnapRealm::open_past_parents, which is needed
for the have_past_parents_open() check.

That is used when, among other things, we import caps; not updating it
prevented the cap import from sending the client cap message, which makes
the mds<->client cap relationship get out of sync.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-10-15 13:41:03 -07:00
Sage Weil
d8ee92a642 mds: take nestlock wrlock when projecting rstat into dirfrag
We were already checking that we _can_ wrlock before doing the rstat
projection (if we can't, we mark_dirty_rstat() on the inode), but we
weren't actually taking the wrlock to prevent lock state changes while
that happened.

This bug eventually manifested itself as a failed assertion at the
now familiar
mds/CInode.cc: In function 'virtual void CInode::decode_lock_state(int, ceph::bufferlist&)':
mds/CInode.cc:1364: FAILED assert(pf->rstat == rstat)

Signed-off-by: Sage Weil <sage@newdream.net>
2010-10-15 13:41:03 -07:00
Greg Farnum
8528ebb0c6 messenger: introduce timeouts on pipes.
This will return read errors on a pipe if it gets no data
for the given period of time (default 15 minutes). In a stateful
session the Connection will hang around and the next write will
initiate standard reconnect, so things keep working but we don't
rack up hundreds of useless threads!
2010-10-15 11:21:53 -07:00
Yehuda Sadeh
6e1eeac3b3 rgw: small cleanup 2010-10-15 10:41:58 -07:00
Wido den Hollander
b378cb4899 Add RGW_PRINT_CONTINUE to control whether we print the 100-continue header
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
2010-10-15 10:41:46 -07:00
Yehuda Sadeh
32e790cf03 conf: only set sig handler if wasn't set already
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
2010-10-15 10:17:41 -07:00
Henry C Chang
dfc46f5ef3 mon: do not assert if paxosv < monmap->epoch
Signed-off-by: Sage Weil <sage@newdream.net>
2010-10-14 20:06:50 -07:00
Henry C Chang
406648e160 mon: do not delete mon->monmap which is not created by new
Signed-off-by: Sage Weil <sage@newdream.net>
2010-10-14 20:06:48 -07:00
Sage Weil
04189f8408 mds: fix can_scatter_pin() to be only SYNC and MIX
Those are the only states where the replica can effectively prevent the
lock from cycling in a way that would force a frozen dirfrag beneath
the scatterpinned inode to update/journal something
(accounted_fragstat/rstat).

Signed-off-by: Sage Weil <sage@newdream.net>
2010-10-14 15:07:16 -07:00
Colin Patrick McCabe
ad12d5d5be Fix bug #487: osd: fix hang during mkfs
If the user has turned on journalling, but left osd_journal_size at 0,
normally we would use the existing size of the journal without
modifications. If the journal doesn't exist (i.e., we are running
mkjournal()), we have to check for this condition and return an error.
We can't create a journal if we don't know what size that journal needs
to be.

This fixes a bug where an extremely small journal file was being
created, leading to an infinite loop in FileJournal::wrap_read_bl().

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
2010-10-14 12:09:27 -07:00
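
The size check described in ad12d5d5be can be pictured roughly as below. This is a minimal sketch and not the actual FileJournal code: the helper name `resolve_journal_size`, its parameters, and the choice of -EINVAL as the error are all hypothetical, and the rule that an explicit non-zero setting takes precedence is an assumption.

```cpp
#include <cerrno>
#include <cstdint>

// Hypothetical helper: decide what size a journal should have.
// configured_size: value of osd_journal_size (0 means "not set").
// existing_size:   size of the journal already on disk, or 0 if none exists.
// Returns the size to use, or a negative errno value on error.
int64_t resolve_journal_size(int64_t configured_size, int64_t existing_size) {
    if (configured_size > 0)
        return configured_size;  // assumed: an explicit setting is used as-is
    if (existing_size > 0)
        return existing_size;    // size left at 0: reuse the existing journal
    return -EINVAL;              // no journal and no size: refuse to create one
}
```

The point of the last branch is that creating a journal of unknown size is reported as an error instead of producing a tiny file.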
Colin Patrick McCabe
17de417fad FileJournal.h: add attribute __packed where needed
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
2010-10-14 12:09:24 -07:00
Greg Farnum
69be0df61d messenger: introduce a "halt_delivery" flag, checked by queue_delivery.
Defaults to false, is set to true by destroy_queue.
2010-10-14 11:05:51 -07:00
Sage Weil
60bfc670c9 osd: fix MOSDBoot versioning
1 is what it was before; make it 2.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-10-13 12:11:14 -07:00
Sage Weil
7f493a11cb qa: add ffsb 2010-10-13 10:09:43 -07:00
Sage Weil
e6d28ce380 prefix git sha1 with commit:
This just makes it into a link when pasted directly into redmine.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-10-13 08:50:55 -07:00
Sage Weil
dc295a371c mds: don't assert on mismatched rbytes 2010-10-12 15:26:15 -07:00
Sage Weil
53decffc7e Merge branch 'testing' into rc 2010-10-12 15:15:05 -07:00
Sage Weil
f35bdc2860 add rc to release.sh 2010-10-12 15:15:03 -07:00
Sage Weil
219b4764fa mds: fix const-ness of is_dirty()
This was fixed before, got lost somehow.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-10-12 13:59:44 -07:00
Greg Farnum
df265a22c5 mon: don't include endl on clock drift warning 2010-10-12 12:42:40 -07:00
Sage Weil
dead368d97 Makefile: add cdebugpack.in to EXTRA_DIST 2010-10-12 11:17:45 -07:00
Greg Farnum
b438b3d65b mds: Fix projection in rename code paths.
We aren't actually projecting the inode unless destdn->is_auth(),
so check for that before projecting the snaprealm (which requires
a projected inode)!
Then on rename_apply, open the snaprealm on non-auth MDSes.
2010-10-12 07:49:56 -07:00
Greg Farnum
4ba060ccfa mds: CInode doesn't always call assimilate_dirty_rstate_inodes_finish
This was causing a mis-match in the projection code, since
assimilate_...finish() calls pop_and_dirty_projected_inode(), but
the first half is only called on CEPH_LOCK_INEST locks. So make them match!
2010-10-12 07:49:56 -07:00
Greg Farnum
c56ab53fe7 mds: Locker::local_wrlock_finish now calls finish_waiters!
Fixes a bug that could cause requests to hang since they were
put to sleep and never woken up.
2010-10-12 07:49:56 -07:00
Greg Farnum
53fe418d39 mds: MDCache should adjust_nested_anchors once the op's been logged.
Fixes crashes from assert(nested_anchors >= 0) failures
when updating at the wrong point.
2010-10-12 07:49:56 -07:00
Sage Weil
fc609846d4 mds: avoid EXCL if mds_caps_wanted in _do_cap_update
The file_excl() trigger asserts mds_caps_wanted is empty.  The caller
shouldn't call it if that's not the case.  If mds_caps_wanted is non-empty,
just go to LOCK instead.
All we're doing is picking a state to move to that will allow us to
update max_size.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-10-11 21:25:17 -07:00
Sage Weil
fa2c371f6e mds: bump dirstat.version during link/unlink/mtime update
Signed-off-by: Sage Weil <sage@newdream.net>
2010-10-11 21:13:00 -07:00
Sage Weil
9e5a203da8 mds: fix get_xlock() assert on slave xlock
If we do a slave request xlock, the state is LOCK, not XLOCK.  Weaken
the SimpleLock::get_xlock() assert accordingly.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-10-11 20:57:46 -07:00
Sage Weil
f9b102e0d5 mds: bump rstat version in predirty_journal_parents
When we propagate the rstat to inode in predirty_journal_parents (because
we hold the nestlock), bump the rstat version as well.  This avoids
confusing any replicas, who expect the rstat to have a new version or to
remain unchanged when the lock scatters.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-10-11 20:32:31 -07:00
Sage Weil
d2175ee830 filestore: don't start commit if nothing new is _applied_
We were starting a commit if we had started a new op, but that left a
window in which the op could be being journaled, and nothing new has been
applied to disk.  With this fix we only commit if committing/committed
will increase.  Now the check matches the

 committing_seq = applied_seq;

a few lines down, and all is well.

The actual crash this fixes was:

2010-10-07 16:20:36.245301 7f07e66d3710 filestore(/mnt/osd3) taking snap 'snap_23230'
2010-10-07 16:20:36.245428 7f07e66d3710 filestore(/mnt/osd3) snap create 'snap_23230' got -1 File exists
os/FileStore.cc: In function 'void FileStore::sync_entry()':
os/FileStore.cc:1738: FAILED assert(r == 0)
 ceph version 0.22~rc (1d77c14bc310aed31d6cfeb2c87e87187d3527ea)
 1: (FileStore::sync_entry()+0x6ee) [0x793148]
 2: (FileStore::SyncThread::entry()+0x19) [0x761d43]
 3: (Thread::_entry_func(void*)+0x20) [0x667822]
 4: (()+0x68ba) [0x7f07eac248ba]
 5: (clone()+0x6d) [0x7f07e9bd802d]

Signed-off-by: Sage Weil <sage@newdream.net>
2010-10-08 17:23:19 -07:00
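
The change in d2175ee830 amounts to comparing a different sequence number before starting a commit. The sketch below contrasts the two conditions under stated assumptions: the struct `SeqState`, its field `started_seq`, and both function names are hypothetical; only `applied_seq` and the committing/committed notion come from the commit message.

```cpp
#include <cstdint>

// Hypothetical snapshot of the sequence counters involved.
struct SeqState {
    uint64_t started_seq;    // last op that has been started (maybe only journaled)
    uint64_t applied_seq;    // last op actually applied to disk
    uint64_t committed_seq;  // last op covered by a completed commit
};

// Old behaviour as described: commit whenever a new op has been started.
bool should_commit_old(const SeqState& s) {
    return s.started_seq > s.committed_seq;
}

// New behaviour as described: commit only if the commit point would advance,
// which matches the later assignment committing_seq = applied_seq.
bool should_commit_new(const SeqState& s) {
    return s.applied_seq > s.committed_seq;
}
```

With an op started but not yet applied, the old condition fires while the new one does not, which is the window the commit message describes.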
Yehuda Sadeh
55370d3acd cdebugpack: update Makefile.am, add missing line 2010-10-08 13:55:13 -07:00
Yehuda Sadeh
0b26f3153f mon: class library encodes/decodes activated class
This fixes bug #470
2010-10-07 23:21:15 -07:00
Sage Weil
873095beef osd: fix merge_log cut point
Look at the eversion.version field (not the whole eversion) when deciding
what is divergent.  That way if we have

our log: 100'10 (0'0) m 10000004d3a.00000000/head by client4225.1:18529
new log: 122'10 (0'0) m 10000004d3a.00000000/head by client4225.1:18529

The 100'10 is divergent and the 122'10 wins and we don't get a dup
reqid in the log.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-10-07 16:17:09 -07:00
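
The difference between comparing whole eversions and comparing only the version field (873095beef) can be illustrated with the two log entries quoted in that message. This is an assumed simplification: the commit message does not show the real cut-point logic, and `EVersion` and both predicates are hypothetical names for illustration.

```cpp
#include <cstdint>
#include <tuple>

// Simplified eversion, written epoch'version in the logs (e.g. 100'10).
struct EVersion {
    uint32_t epoch;
    uint64_t version;
};

// Comparing the whole eversion: epoch first, then version.
bool divergent_by_eversion(const EVersion& ours, const EVersion& incoming) {
    return std::tie(ours.epoch, ours.version) >=
           std::tie(incoming.epoch, incoming.version);
}

// Comparing only the version field, as the commit describes.
bool divergent_by_version(const EVersion& ours, const EVersion& incoming) {
    return ours.version >= incoming.version;
}
```

For ours = 100'10 and incoming = 122'10, the whole-eversion test treats our entry as older and keeps it, while the version-only test marks it divergent so that 122'10 replaces it.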
Sage Weil
6bcda253e5 osd: loosen caller_ops asserts
The problem is that merge_log adds new items to the log before it unindexes
divergent items, and that behavior is needed by the current implementation
of merge_old_entry().  Since the divergent items may be the same requests
(and frequently are), these asserts need to be loosened up.

Now, the most recent addition "wins," and we only remove the entry in
unindex() if it points to us.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-10-07 16:17:09 -07:00
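
The "most recent addition wins" rule from 6bcda253e5 can be sketched as a small index keyed by request id. This is an illustration under assumptions, not the OSD's log code: `Entry`, `CallerIndex`, and the use of a string request id are hypothetical; only the names caller_ops and unindex() come from the commit message.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical log entry, identified by the request id of its caller.
struct Entry {
    std::string reqid;
};

struct CallerIndex {
    std::unordered_map<std::string, const Entry*> caller_ops;

    // Indexing never asserts on an existing entry: the newest one wins.
    void index(const Entry& e) { caller_ops[e.reqid] = &e; }

    // Only drop the mapping if it still points at the entry being removed.
    void unindex(const Entry& e) {
        auto it = caller_ops.find(e.reqid);
        if (it != caller_ops.end() && it->second == &e)
            caller_ops.erase(it);
    }
};
```

If a new entry with the same request id is indexed before the divergent one is unindexed, the mapping survives and keeps pointing at the new entry.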
Sage Weil
6679c27459 osd: move to boot state if down OR wrong address in map
Saw an OSD that was up in the map, but the address didn't match.  Caused
all kinds of strange behavior.  I'm not sure what I had in mind when the
original test only checked for down AND same address before moving to boot
state, since having the wrong address is clearly bad news.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-10-07 16:17:09 -07:00
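
The condition change in 6679c27459 can be written out as two predicates. This is a minimal sketch, assuming a simplified view of the map entry: `MapEntry`, its fields, and both function names are hypothetical, and addresses are plain strings here.

```cpp
#include <string>

// Hypothetical, simplified view of what the map says about this OSD.
struct MapEntry {
    bool up;
    std::string addr;
};

// Old test as described: only "down AND same address" led to the boot state.
bool should_boot_old(const MapEntry& e, const std::string& my_addr) {
    return !e.up && e.addr == my_addr;
}

// New test as described: "down OR wrong address" leads to the boot state.
bool should_boot_new(const MapEntry& e, const std::string& my_addr) {
    return !e.up || e.addr != my_addr;
}
```

An OSD that is marked up but listed under a different address is ignored by the old predicate and caught by the new one.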
Sage Weil
6545f3ca1c cdebugpack: behave when /bin/sh is dash
Signed-off-by: Sage Weil <sage@newdream.net>
2010-10-07 09:47:34 -07:00
Sage Weil
af749e62cb cdebugpack: man page
Signed-off-by: Sage Weil <sage@newdream.net>
2010-10-07 09:38:37 -07:00
Sage Weil
9805eb5b6b cdebugpack: include cdebugpack.XXXX dir in tarball
Signed-off-by: Sage Weil <sage@newdream.net>
2010-10-07 09:31:31 -07:00
Sage Weil
2c49ac4d46 cdebugpack: include .tar.gz in usage filename 2010-10-07 09:31:13 -07:00
Sage Weil
3b1b8f89ff cdebugpack: include in deb, rpm 2010-10-07 09:25:26 -07:00
Sage Weil
f10906b3fd mds: respawn (instead of suicide) on being marked down
This way, temporarily laggy daemons will restart and rejoin the cluster
in standby mode.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-10-07 07:52:50 -07:00
Sage Weil
a2bcb419c4 debug: always append to log
We were truncating if we were in log_per_instance mode.  But normally those
logs don't exist.  And if they do, we probably don't want to truncate
them.  This is particularly true if we respawn ourselves (e.g. after being
marked down) and restart with the same pid.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-10-07 07:52:02 -07:00