Commit Graph

10500 Commits

Author SHA1 Message Date
Sage Weil
013f9e52e3 Makefile: include cclsinfo; lose the old .cc 2010-06-29 16:23:11 -07:00
Sage Weil
0812b8f4d9 Makefile: include cclass.in in dist 2010-06-29 16:08:33 -07:00
Yehuda Sadeh
f9dc4b9091 buffer: write_fd should skip empty buffers 2010-06-29 16:00:05 -07:00
Yehuda Sadeh
96b1db80d0 rbdtool: init rbd block id, later will be used for rename 2010-06-29 16:00:05 -07:00
Yehuda Sadeh
c9930900e6 cclass.in: LIBDIR=.libs in certain cases 2010-06-29 16:00:05 -07:00
Yehuda Sadeh
0f4ddbac3b cls: cls_read, cls_cxx_read return the number of bytes read 2010-06-29 16:00:05 -07:00
Sage Weil
2ec729d1fe config: use <<20 for MB 2010-06-29 14:40:30 -07:00
Sage Weil
2df8b9fd49 script/plot.pl: don't pause
Run like so:

 $ script/plot.pl path/to/log osd c_wrb [smooth bezier] | gnuplot -persist
2010-06-29 14:40:23 -07:00
Sage Weil
fcc39c8113 msgr: use dedicated reaper thread
We were calling the reaper from the wait() loop.  The problem is that
the OSD has two messengers, and only the first was in wait(); the second
wait() was only called after the first terminated (i.e., when the OSD was
shutting down).

Instead, launch a separate reaper thread when we bind, and close it out
on shutdown right after the accepter.
2010-06-29 14:40:23 -07:00
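
The reaper arrangement described in the msgr commit above can be sketched roughly as follows. This is a minimal illustration, not the actual messenger code: ToyMessenger, queue_reap() and reaped() are hypothetical names, and connections are reduced to integer ids. Each messenger starts its own reaper when it binds, so a second messenger no longer depends on the first one's wait() returning.

```cpp
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a messenger; not Ceph's real class.
class ToyMessenger {
 public:
  ~ToyMessenger() { shutdown(); }

  // Called when the messenger binds: launch the dedicated reaper thread.
  void bind() {
    reaper_ = std::thread([this] { reaper_entry(); });
  }

  // Hand a closed connection (here just an id) over to the reaper.
  void queue_reap(int conn_id) {
    {
      std::lock_guard<std::mutex> l(lock_);
      dead_.push_back(conn_id);
    }
    cond_.notify_one();
  }

  // Ask the reaper to stop and wait for it to exit. Safe to call twice.
  void shutdown() {
    {
      std::lock_guard<std::mutex> l(lock_);
      stop_ = true;
    }
    cond_.notify_one();
    if (reaper_.joinable()) reaper_.join();
  }

  // Number of connections reaped so far.
  std::size_t reaped() {
    std::lock_guard<std::mutex> l(lock_);
    return reaped_;
  }

 private:
  void reaper_entry() {
    std::unique_lock<std::mutex> l(lock_);
    for (;;) {
      cond_.wait(l, [this] { return stop_ || !dead_.empty(); });
      // Drain everything queued so far, then honor a stop request.
      while (!dead_.empty()) {
        dead_.pop_front();
        ++reaped_;
      }
      if (stop_) return;
    }
  }

  std::mutex lock_;
  std::condition_variable cond_;
  std::deque<int> dead_;
  std::size_t reaped_ = 0;
  bool stop_ = false;
  std::thread reaper_;
};
```

Because the reaper drains its queue before checking the stop flag, anything queued before shutdown() is still reaped.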
Sage Weil
3e334024f4 osd: removed unused RepGather::indata
Signed-off-by: Sage Weil <sage@newdream.net>
2010-06-29 14:38:39 -07:00
Sage Weil
852ab94a22 osd: change write osd logging timing 2010-06-29 14:38:28 -07:00
Sage Weil
419bd914dc osd: always use original Connection when replying
...even when the op came from another OSD.  Not that that should happen
anyway, since we don't forward messages currently.  (And can't, since the
OSD doesn't initiate connections to the client!)
2010-06-29 14:32:28 -07:00
Sage Weil
def4b40e96 osd: always include osd op result, result code in the first reply 2010-06-29 14:31:12 -07:00
Sage Weil
e85d98ba62 osd: track open repops in logger
Signed-off-by: Sage Weil <sage@newdream.net>
2010-06-29 14:21:06 -07:00
Sage Weil
ff8df7dcca osd: add 'logger reset' command 2010-06-29 14:21:06 -07:00
Sage Weil
fc1d1665d9 journal: set max journal write to 10MB
If we take too big a bite of data to write in a single writev(2), we can
end up making performance worse, because everyone waits for the full write
to complete.  Bigger writes mean better throughput but higher latency.
So, balance the two by placing some upper limit.
2010-06-29 14:21:05 -07:00
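
One way to picture the cap described in the journal commit above is the sketch below. It assumes queued entries are represented only by their sizes in bytes; entries_for_next_write and kMaxWriteBytes are hypothetical names, and the rule that an oversized entry is still written on its own is an assumption of this sketch, not something the commit states.

```cpp
#include <cstddef>
#include <vector>

// Upper limit for a single write; the commit message gives 10MB.
constexpr std::size_t kMaxWriteBytes = 10u << 20;

// Return how many queued entries to submit in the next write.
// Stops before the entry that would push the total past max_bytes, but
// always takes at least one entry so an oversized entry makes progress
// (assumption of this sketch).
std::size_t entries_for_next_write(const std::vector<std::size_t>& sizes,
                                   std::size_t max_bytes = kMaxWriteBytes) {
  std::size_t total = 0;
  std::size_t n = 0;
  for (std::size_t s : sizes) {
    if (n > 0 && total + s > max_bytes) break;
    total += s;
    ++n;
  }
  return n;
}
```

With three queued 4MB entries, the first write takes two of them and the third waits for the next write, which bounds how long any single write can hold everyone up.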
Yehuda Sadeh
1d517986a8 conf: fix parsing when there's no eol at the end of file 2010-06-29 09:59:26 -07:00
Sage Weil
d9dc7cb581 msg: fix entity_addr_t::parse() to return false on failure 2010-06-29 08:34:53 -07:00
CC Lien
0cb7a71c3b mkcephfs: Fix wrong maxosd when OSD ids are random ordered in ceph.conf
Hi

I ran into a problem where mkcephfs computes the wrong "maxosd" when
ceph.conf lists the OSD ids in random order, like:

[osd2]
...
[osd0]
...
[osd1]
...

In this case, you will get "2" for "maxosd" instead of 3.
After adding a sort, the problem seems to be solved.

Cheers,
CC Lien

Signed-off-by: CC Lien <cc_lien@tcloudcomputing.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2010-06-29 08:22:13 -07:00
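
The ordering problem in the mkcephfs commit above can be illustrated with a small sketch. mkcephfs itself is a shell script and the actual fix added a sort; the C++ function below is a hypothetical equivalent that takes the maximum id directly, assuming sections are named "osd<N>" and that "maxosd" means the highest id plus one.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: compute "maxosd" from config section names.
// Sections not of the form "osd<digits>" are ignored.
int max_osd(const std::vector<std::string>& sections) {
  int highest = -1;
  for (const std::string& name : sections) {
    if (name.size() <= 3 || name.compare(0, 3, "osd") != 0) continue;
    const std::string digits = name.substr(3);
    const bool numeric =
        std::all_of(digits.begin(), digits.end(),
                    [](unsigned char c) { return std::isdigit(c) != 0; });
    if (!numeric) continue;
    // Track the largest id seen, regardless of the order of the sections.
    highest = std::max(highest, std::stoi(digits));
  }
  return highest + 1;
}
```

Taking the id of whichever section happens to come last gives 2 for the example above; tracking the maximum gives 3 in any order.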
Sage Weil
50f254d0ce filejournal: fix journal write_pos advance
This was broken by bd4188a02a.  @pos needs to
be advanced (it is passed by reference) or else we just overwrite the same
bytes at the journal start over and over again.
2010-06-28 11:34:29 -07:00
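
The bug class described in the filejournal commit above is easy to show in isolation. The sketch below is not the FileJournal code: write_at is a hypothetical helper, the journal is a plain in-memory buffer, and wraparound is ignored. The point is only that the position is taken by reference and advanced after each write.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: copy len bytes into journal at pos, then advance pos.
// pos is a reference, so the caller's write position moves forward.
// Returns false (and writes nothing) if the data does not fit.
bool write_at(std::vector<char>& journal, std::size_t& pos,
              const char* data, std::size_t len) {
  if (pos > journal.size() || len > journal.size() - pos) return false;
  std::memcpy(journal.data() + pos, data, len);
  pos += len;  // without this, every call would overwrite the same bytes
  return true;
}
```

Two consecutive calls with the same pos variable then land back to back in the buffer instead of on top of each other.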
Sage Weil
d9554d5db2 mount.ceph: update mount options
Signed-off-by: Thomas Mueller <thomas@chaschperli.ch>
Signed-off-by: Sage Weil <sage@newdream.net>
2010-06-28 09:57:25 -07:00
Sage Weil
6bae200b6b msgr: fix throttle deadlock
Do msgr throttle after peer policy throttle.  The msgr (dispatch) throttle
is short-lived and won't deadlock (unless dispatch blocks), so it's safe to
take last.  In contrast, the policy throttle carries over the lifetime of
the message, and may block until replication completes or whatever else.
2010-06-26 10:29:11 -07:00
Sage Weil
8f2731bc02 crushwrapper: gracefully handle crush error
crush_do_rule can return <0 in certain error cases (e.g., forcefed device
does not exist in crush map).  We should take that to mean an empty []
result instead of crashing.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-06-25 21:47:00 -07:00
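
The error handling described in the crushwrapper commit above follows a common pattern for wrapping a C-style function that returns a count or a negative error. The sketch below uses a hypothetical toy_do_rule as a stand-in; it is not crush_do_rule and does not share its signature.

```cpp
#include <vector>

// Hypothetical stand-in for a C-style rule function: fills out[] and returns
// the number of results, or a negative value on error.
int toy_do_rule(bool fail, int* out, int max_out) {
  if (fail) return -1;
  const int n = max_out < 2 ? max_out : 2;
  for (int i = 0; i < n; ++i) out[i] = i;
  return n;
}

// Wrapper: a negative return is treated as an empty result, not a crash.
std::vector<int> run_rule(bool fail, int max_out) {
  std::vector<int> out(max_out > 0 ? max_out : 0);
  int n = toy_do_rule(fail, out.data(), static_cast<int>(out.size()));
  if (n < 0) n = 0;  // error case: report an empty [] result
  out.resize(static_cast<std::vector<int>::size_type>(n));
  return out;
}
```

Clamping before the resize matters: passing a negative count straight to resize() would convert it to a huge unsigned size.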
Sage Weil
928512ffc6 crushtool: add simple test function 2010-06-25 21:47:00 -07:00
Sage Weil
59b114a095 crush: fix "in" threshold to be 1.0, not 0.1
We were effectively counting any item with weight > .1 to be weight 1.0.
2010-06-25 21:47:00 -07:00
Greg Farnum
19b4a5ff80 cfuse: remove some NPEs with ESTALE from MDSes.
Under certain circumstances this continues to let you get ESTALE
and not handle it, but those are still being analyzed for a fix.
2010-06-25 16:10:35 -07:00
Sage Weil
bef062942a mds: keep cap follows above in->first in FLUSHSNAP
The client has a follows of 0 initially, which is correct (it does follow
0, and there are no prior snaps).  But the inode has ->first of 2, which
is also fine.  The follows here needs to be at least higher than the
inode first, though, or the caps cloning gets off...
2010-06-24 16:49:52 -07:00
Sage Weil
eeede270a8 qa: read recently written/deleted data back from snap 2010-06-24 16:49:52 -07:00
Sage Weil
9795fabb63 mds: fix client cap condition
In 551a12f52e we fixed a bug with cow_inode() where the
cap->client_follows didn't match last precisely.  Instead, we compare
to first.  But the == is too strict: a cap follows that is equal to _or_older_
than the clone's first should be copied to the clone inode.

This fixes the simple test case
 $ echo asdf > bar ; mkdir .snap/bar ; rm bar ; cat .snap/bar/bar
 asdf
(Previously we would get nothing unless we waited for the cap to flush on
its own.)
2010-06-24 16:49:52 -07:00
Greg Farnum
478fe72362 ceph_fs: add CEPH_LOCK_IFLOCK so its inclusion elsewhere continues to build 2010-06-24 11:51:59 -07:00
Greg Farnum
62827156b2 mds: add IFLOCK to wait bits to prevent collisions with lock branch 2010-06-24 11:37:29 -07:00
Sage Weil
7ce0338683 crush: fix recursion through intervening types
This fixes pretty core behavior when doing recursion down the tree.  I
suspect it was broken when changing the retry behavior.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-06-24 10:42:20 -07:00
Sage Weil
241d011f05 crush: make CHOOSE_LEAF behave when leaf type is encountered
We may not want to recursively call crush_choose() if we start out with a
leaf.  If that happens, we need to fill out the out2[] vector with
our result immediately.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-06-24 10:40:14 -07:00
Sage Weil
8f7df4ed4f client: resync with kernel's ioctl.h
This got munged up by the #ifndef/#define guard cleanup.
2010-06-24 10:39:36 -07:00
Sage Weil
07cfbad803 mds: fix SimpleLock wait_shift()
DVERSION was missing, others were overlapping...
2010-06-24 10:39:20 -07:00
Greg Farnum
5634ce8d05 ceph_fs: add CEPH_FEATURE_FLOCK to ceph_fs so its bit doesn't get covered again 2010-06-23 16:35:18 -07:00
Sage Weil
58f4dceb96 osdmap: negative osd ids do not exist
Signed-off-by: Sage Weil <sage@newdream.net>
2010-06-23 14:12:05 -07:00
Sage Weil
b7ad8aa91b crush: behave when chooseleaf is given leaf type
Fill in the out2 choose_leaf vector if it's defined.  This is necessary
because we may not recursively call choose on out2 if the item we're on is
not a bucket (e.g., when chooseleaf is given the leaf type 0).

Signed-off-by: Sage Weil <sage@newdream.net>
2010-06-23 14:08:39 -07:00
Yehuda Sadeh
364521acb0 mds: handle_export_caps, copy cap_import map (instead of swap content)
This fixes #200. The client_map is later on swapped again in the new ESession(),
which left finish->client_map empty.
2010-06-22 14:11:53 -07:00
Greg Farnum
43a7ec4c88 client: properly handle ESTALE by redirecting to auth MDS 2010-06-21 16:27:59 -07:00
Greg Farnum
c89afb4c8b client: don't send cap snap message to MDS if not dirty or writing.
From Sage.
2010-06-21 16:27:58 -07:00
Sage Weil
2e73f737d3 mds: only acquire_locks once in handle_client_open
Subsequent calls will just return with 'already locked'
2010-06-21 11:35:55 -07:00
Sage Weil
2cd04f47c8 mds: rename handle_client_opent 2010-06-21 11:35:55 -07:00
Yehuda Sadeh
027b6c3e6f rados: more descriptive ls usage failure, stdout set implicitly 2010-06-21 11:43:30 -07:00
Yehuda Sadeh
4d86180e53 rbdtool: fix --list 2010-06-21 11:27:41 -07:00
Greg Farnum
90511120a7 osd: fix incorrect logic check on fsid comparison 2010-06-21 10:36:13 -07:00
Greg Farnum
9bbeec4745 osd: Warn and shutdown on a mismatched fsid, instead of failing an assert 2010-06-21 09:46:22 -07:00
Thomas Mueller
c9af6def0c add helptext for option "snapdirname" to manpage of mount.ceph
Inspired by the addition to
http://ceph.newdream.net/wiki/Snapshots about the snapdirname
option, I've created a patch for the mount.ceph manpage.

- Thomas

Signed-off-by: Sage Weil <sage@newdream.net>
2010-06-21 08:33:31 -07:00
Sage Weil
bf3d52a4b7 journal: initialize applied_seq during journal replay
This should avoid

#0  0x00007f41b1a18a75 in raise () from /lib/libc.so.6
#1  0x00007f41b1a1c5c0 in abort () from /lib/libc.so.6
#2  0x00007f41b22cd8e5 in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/libstdc++.so.6
#3  0x00007f41b22cbd16 in ?? () from /usr/lib/libstdc++.so.6
#4  0x00007f41b22cbd43 in std::terminate() () from /usr/lib/libstdc++.so.6
#5  0x00007f41b22cbe3e in __cxa_throw () from /usr/lib/libstdc++.so.6
#6  0x00000000005b39f8 in ceph::__ceph_assert_fail (assertion=0x5ec3b2 "seq >= last_committed_seq", file=<value optimized out>, line=711, func=<value optimized out>) at common/assert.cc:30
#7  0x00000000005649e1 in FileJournal::committed_thru (this=0x1116310, seq=0) at os/FileJournal.cc:711
#8  0x000000000055d265 in JournalingObjectStore::commit_finish (this=0x1125740) at os/JournalingObjectStore.cc:186
#9  0x00000000005543f3 in FileStore::sync_entry (this=0x1125740) at os/FileStore.cc:1714
#10 0x00000000004ef93d in FileStore::SyncThread::entry() ()
#11 0x0000000000469a4a in Thread::_entry_func (arg=0x6315) at ./common/Thread.h:39
#12 0x00007f41b28ab9ca in start_thread () from /lib/libpthread.so.0
#13 0x00007f41b1acb6cd in clone () from /lib/libc.so.6
#14 0x0000000000000000 in ?? ()

Signed-off-by: Sage Weil <sage@newdream.net>
2010-06-20 14:41:19 -07:00
Sage Weil
2540ea480a mkcephfs: push conf to remote machines
Signed-off-by: Fred Ar <ar.fred@yahoo.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2010-06-20 09:45:05 -07:00