Sage Weil
73705f661b
monmaptool: fix clitests
...
Initial map is epoch 0. Modifications still bump epoch by one.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-11-11 20:52:28 -08:00
Sage Weil
36241da4b1
paxos: discard waiting_for_active events on reset
...
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 20:49:00 -08:00
Sage Weil
80ab65682a
monclient: use blank fsid (instead of epoch==0) for monmap checks
...
We can safely mkfs with an epoch=0 monmap as long as the fsid is set. And
that is what commit f31825cee5 changed.
Instead, use a zeroed fsid to tell if the monmap is valid/usable.
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 20:48:59 -08:00
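The check described in this commit (a zeroed fsid, rather than epoch == 0, marks a monmap as unusable) could be sketched roughly as below. This is a minimal Python illustration, not the Ceph C++ code: the `MonMap` class and the `is_usable` helper are hypothetical names introduced only for this example.

```python
import uuid
from dataclasses import dataclass

# All-zero UUID, standing in for an fsid that was never set (assumption).
BLANK_FSID = uuid.UUID(int=0)


@dataclass
class MonMap:
    # Hypothetical stand-in for a monitor map; only the fields needed here.
    epoch: int = 0
    fsid: uuid.UUID = BLANK_FSID


def is_usable(monmap: MonMap) -> bool:
    # The map is considered usable once its fsid is set, regardless of epoch.
    return monmap.fsid != BLANK_FSID
```

With this shape, a freshly generated map with epoch 0 and a real fsid passes the check, while a default-constructed map does not.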
Sage Weil
2253c0168d
use libuuid for fsid
...
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 20:48:59 -08:00
Sage Weil
cf0a53e1ed
mon: fix seed monmap removal
...
Remove if we previously had no latest, not based on which map we now have.
It's possible we join when monmap epoch is something much larger than 1!
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 14:54:41 -08:00
Sage Weil
6d370f3bd6
mon: allow monitor to automagically join cluster
...
If a monitor starts up with the correct fsid and auth keys, it will now
add itself to the monmap (and subsequently try to join the quorum) if it
is not already in the monmap.
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 14:52:44 -08:00
Sage Weil
d56485a8af
osd: pass monclient::init errors up the stack
...
Fixes crash like
ceph version 0.38-149-gbf254de (commit:bf254de5cf8a17ce9467d166d87f3ab93170ae13)
1: (ceph::BackTrace::BackTrace(int)+0x2d) [0x91d97b]
2: ./ceph-osd() [0xa05baa]
3: (()+0xef60) [0x7fb54c87ef60]
4: (std::_Rb_tree<unsigned int, unsigned int, std::_Identity<unsigned int>, std::less<unsigned int>, std::allocator<unsigned int> >::size() const+0xc) [0x8a4bc6]
5: (std::set<unsigned int, std::less<unsigned int>, std::allocator<unsigned int> >::size() const+0x18) [0x8a1d32]
6: (void encode<unsigned int>(std::set<unsigned int, std::less<unsigned int>, std::allocator<unsigned int> > const&, ceph::buffer::list&)+0x1c) [0x8a0311]
7: (MonClient::_reopen_session()+0x2c5) [0x89a425]
8: (MonClient::authenticate(double)+0x24f) [0x898da7]
9: (OSD::init()+0x112b) [0x807ca1]
10: (main()+0x2c09) [0x73e406]
11: (__libc_start_main()+0xfd) [0x7fb54b04ec4d]
12: ./ceph-osd() [0x73b499]
due to auth_supported being NULL.
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 12:52:24 -08:00
Sage Weil
bf254de5cf
mon: verify fsid during probe and election
...
This will keep mismatched fsids out of the same quorum.
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 12:37:07 -08:00
Sage Weil
f1a98fb8af
mon: tolerate won election while active
...
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 12:22:37 -08:00
Sage Weil
cd736b9da0
mon: clean up logic a bit
...
More explicit.
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 12:22:22 -08:00
Sage Weil
2633d71dbd
mon: only re-bootstrap if monmap actually changes
...
If we go thru here just to update latest, that's fine; no need to restart
the bootstrap process.
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 12:22:09 -08:00
Sage Weil
622fbadd66
paxos: fix off-by-one in share_state
...
We hit this on adding a new monitor to an existing cluster.
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 12:15:16 -08:00
Sage Weil
6c663d855e
mon: fix monmap update
...
It's on the stack; update in place.
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 12:05:01 -08:00
Sage Weil
1134fdfe78
mon: properly process monmaps even when i have the latest
...
We may get the latest monmap when we are doing our probing, but we still
need to process it in update_from_paxos(). Consider get_latest_version()
in addition to the active map.
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 12:02:52 -08:00
Sage Weil
c097e63401
mon: fix up update_from_paxos() methods
...
Make sure they behave when the initial state is learned from paxos.
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 11:55:34 -08:00
Sage Weil
f31825cee5
monmaptool: new maps get epoch 0
...
Just for consistency's sake.
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 11:41:42 -08:00
Sage Weil
65f797ea47
mon: clean up mkfs seed data
...
And make sure the monmap/latest gets written properly.
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 11:41:42 -08:00
Sage Weil
e545af2db5
mon: remove empty monstore dirs
...
This is sloppy, but it works well enough since we mkdir dirs as needed
too.
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 11:41:42 -08:00
Sage Weil
aea7563f03
mon: create initial states after quorum is formed
...
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 11:41:42 -08:00
Sage Weil
1533f1c0ed
mon: stage mkfs seed info in mkfs/ dir
...
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 10:45:27 -08:00
Sage Weil
9e941c4359
mon: eliminate PaxosService::init()
...
update_from_paxos() is sufficient
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 10:34:42 -08:00
Sage Weil
0a926ef576
mon: include monmap dump in mon_status and quorum_status
...
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 10:19:33 -08:00
Sage Weil
8c3d872ef0
mon: pull initial monmap from monmap/latest OR mkfs/monmap
...
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 10:15:23 -08:00
Sage Weil
0ecae9969e
mon: take explicit initial monmap -or- generate one via MonClient
...
This will simplify bootstrapping a cluster via e.g. mon_host.
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 10:05:36 -08:00
Sage Weil
dae6c95654
test_filestore_idempotent: detect commit cycles due to non-idempotent ops
...
If we do a non-idempotent op and it does a commit itself, we never see
fs->is_committed() return true. Also count full commit cycles, and kill
ourselves after several of those have gone by.
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-10 21:12:38 -08:00
Sage Weil
add04d15f4
filejournal: fix replay of non-idempotent ops
...
- start sync thread prior to replay, so that we can commit as we replay
operations
- keep applied_seq accurate
- pass seq (not old op_seq) to do_transactions
- carry open_ops ref so that commit blocks until we have finished applying
the full transaction
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-10 21:12:38 -08:00
Sage Weil
9f1673c15a
test_filestore_idempotent: transactions are individually idempotent
...
Make individual transactions idempotent, but their interactions
non-idempotent. I.e. A A A A is okay, but A B A is not.
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-10 21:12:38 -08:00
Sage Weil
8df0cd3867
filestore: make trigger_commit() wake up sync; adjust locking
...
We need to wake up the sync thread (duh).
Also, we need to obey the FileJournal::lock -> journal_lock locking
order.
Also, lockdep is broken. :(
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-10 21:12:38 -08:00
Sage Weil
0981112097
filestore: document the btrfs_* fields
...
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-10 21:12:38 -08:00
Sage Weil
69cd362542
filestore: sync after non-idempotent operations
...
This is a big hammer to fix journal replay on non-btrfs fs backends (extN,
xfs, whatever). The problem is that it is not safe to replay some journal
operations more than once, notably things like CLONE whose source data
may be changed by subsequent operations.
The simple fix is to initiate a full commit after any non-idempotent
operations prior to any subsequent operation within the same Sequencer.
This is done by calling trigger_commit() in _do_transactions(), which means
any potentially dependent operation that follows will get blocked because
a commit is about to start.
I made trigger_commit() a bit more robust for callers who are not holding
an open_ops ref: it now also succeeds if the given op_seq is already
committing. For the current caller, that can't happen.
There are probably better performing solutions, but this one is at least
correct.
Fixes: #213
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-10 21:12:38 -08:00
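The replay hazard described in this commit can be illustrated with a toy model. The sketch below is Python, not the FileStore code: the dict-based store, the `apply` and `replay` helpers, and the tuple encoding of operations are all assumptions made for the example. It shows a clone followed by a write to the clone source, and what happens when the whole journal is replayed on top of state that already contains both effects.

```python
def apply(store, op):
    # Apply one journal operation to a dict-based object store.
    kind = op[0]
    if kind == "write":
        _, name, data = op
        store[name] = data
    elif kind == "clone":
        _, src, dst = op
        store[dst] = store[src]
    else:
        raise ValueError(f"unknown op: {kind}")


def replay(initial, ops):
    # Replay a list of operations on a copy of the initial state.
    store = dict(initial)
    for op in ops:
        apply(store, op)
    return store


journal = [("clone", "a", "b"), ("write", "a", "new")]

# Normal run: "b" captures the old contents of "a".
live = replay({"a": "old"}, journal)

# Crash before any commit: the full journal is replayed on top of the
# already-modified state, so the clone now copies the new contents of "a".
replayed = replay(live, journal)

# Committing right after the clone means only the later write is replayed.
after_clone = replay({"a": "old"}, journal[:1])
safe = replay(after_clone, journal[1:])
```

In this model `replayed` ends up with "b" holding the new data, while `safe` matches the live result, which is the effect the commit-after-clone approach is after.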
Sage Weil
fa5047b377
Merge remote branch 'gh/stable'
2011-11-10 20:50:31 -08:00
Josh Durgin
5407fa70f8
workunits: add workunit for running rgw and rados python tests
...
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-11-10 17:03:12 -08:00
Yehuda Sadeh
2fb702979c
rgw: remove warning
2011-11-10 17:10:41 -08:00
Josh Durgin
71bfe8974e
test/pybind: add test_rgw
...
Forgot to add this in the previous commit.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-11-10 16:52:01 -08:00
Josh Durgin
ea42e02ca2
test/pybind: convert python rados and rgw tests to be runnable by nose
...
These tests can now be run automatically more easily.
Fixes: #1653
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-11-10 16:46:35 -08:00
Josh Durgin
25cde7f98a
rados.py: fix Snap.get_timestamp
...
This now uses datetime, imports the right things, and calls the right function.
Fixes: #1577
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-11-10 16:33:46 -08:00
Sage Weil
b600ec2ac7
v0.38
2011-11-10 15:07:05 -08:00
Samuel Just
2a7fbe0c90
common: return null if mc.init() unsuccessful
...
Prevents ceph.cc from segfaulting on missing keyring.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-11-10 15:05:52 -08:00
Josh Durgin
a177a702eb
rbd.py: fix list when there are no images
...
It should return [], not [''].
Reported-by: Eric Chen <Eric_YH_Chen@wistron.com>
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-11-10 15:05:38 -08:00
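The `['']` result is what Python returns when an empty buffer is split on a separator. The sketch below shows that behavior and one way to avoid it. It assumes the image names arrive as a NUL-separated byte buffer; `split_names` is a hypothetical helper, not the actual rbd.py code.

```python
def split_names(buf: bytes) -> list:
    # Split a NUL-separated name buffer, dropping empty entries so that
    # an empty buffer yields [] rather than [''].
    return [name.decode("utf-8") for name in buf.split(b"\0") if name]


# Splitting an empty buffer directly yields one empty entry.
naive = b"".split(b"\0")

# The filtered version returns an empty list for the same input.
fixed = split_names(b"")
```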
Sage Weil
27bb48c5bf
mon: overwrite in put_bl
...
This fixes a situation where we accept a large value, there is some failure
and recovery, and then we commit a smaller value with the same version.
E.g.,
INFO:teuthology.task.ceph.mon.b.err:terminate called after throwing an instance of 'ceph::buffer::end_of_buffer'
INFO:teuthology.task.ceph.mon.b.err: what(): buffer::end_of_buffer
INFO:teuthology.task.ceph.mon.b.err:*** Caught signal (Aborted) **
INFO:teuthology.task.ceph.mon.b.err: in thread 7f0a6037c700
INFO:teuthology.task.ceph.mon.b.err: ceph version 0.37-365-g5b20830 (commit:5b208302e1ad134f56933dfdbccb074e03c88be3)
INFO:teuthology.task.ceph.mon.b.err: 1: (ceph::BackTrace::BackTrace(int)+0x2d) [0x6f4d1b]
INFO:teuthology.task.ceph.mon.b.err: 2: /tmp/cephtest/binary/usr/local/bin/ceph-mon() [0x7e9492]
INFO:teuthology.task.ceph.mon.b.err: 3: (()+0xfb40) [0x7f0a63bf4b40]
INFO:teuthology.task.ceph.mon.b.err: 4: (gsignal()+0x35) [0x7f0a625cdba5]
INFO:teuthology.task.ceph.mon.b.err: 5: (abort()+0x180) [0x7f0a625d16b0]
INFO:teuthology.task.ceph.mon.b.err: 6: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f0a62e716bd]
INFO:teuthology.task.ceph.mon.b.err: 7: (()+0xb9906) [0x7f0a62e6f906]
INFO:teuthology.task.ceph.mon.b.err: 8: (()+0xb9933) [0x7f0a62e6f933]
INFO:teuthology.task.ceph.mon.b.err: 9: (()+0xb9a3e) [0x7f0a62e6fa3e]
INFO:teuthology.task.ceph.mon.b.err: 10: (ceph::buffer::list::iterator::copy(unsigned int, std::string&)+0xcb) [0x7d73a7]
INFO:teuthology.task.ceph.mon.b.err: 11: (decode(std::string&, ceph::buffer::list::iterator&)+0x44) [0x5fa2e8]
INFO:teuthology.task.ceph.mon.b.err: 12: (LogEntry::decode(ceph::buffer::list::iterator&)+0xa8) [0x6ceee8]
INFO:teuthology.task.ceph.mon.b.err: 13: (LogMonitor::update_from_paxos()+0x346) [0x6cce9a]
INFO:teuthology.task.ceph.mon.b.err: 14: (PaxosService::_active()+0x13b) [0x647ab5]
INFO:teuthology.task.ceph.mon.b.err: 15: (PaxosService::C_Active::finish(int)+0x25) [0x647cb9]
INFO:teuthology.task.ceph.mon.b.err: 16: (Context::complete(int)+0x2b) [0x61a5a9]
INFO:teuthology.task.ceph.mon.b.err: 17: (finish_contexts(CephContext*, std::list<Context*, std::allocator<Context*> >&, int)+0x20b) [0x61a7ef]
INFO:teuthology.task.ceph.mon.b.err: 18: (Paxos::handle_last(MMonPaxos*)+0xea7) [0x63d081]
INFO:teuthology.task.ceph.mon.b.err: 19: (Paxos::dispatch(PaxosServiceMessage*)+0x29c) [0x642046]
INFO:teuthology.task.ceph.mon.b.err: 20: (Monitor::_ms_dispatch(Message*)+0xd78) [0x61636e]
INFO:teuthology.task.ceph.mon.b.err: 21: (Monitor::ms_dispatch(Message*)+0x3a) [0x61de84]
INFO:teuthology.task.ceph.mon.b.err: 22: (Messenger::ms_deliver_dispatch(Message*)+0x63) [0x7c690f]
INFO:teuthology.task.ceph.mon.b.err: 23: (SimpleMessenger::dispatch_entry()+0x7c2) [0x7b0156]
INFO:teuthology.task.ceph.mon.b.err: 24: (SimpleMessenger::DispatchThread::entry()+0x2c) [0x5fd6ac]
INFO:teuthology.task.ceph.mon.b.err: 25: (Thread::_entry_func(void*)+0x23) [0x6e9261]
INFO:teuthology.task.ceph.mon.b.err: 26: (()+0x7971) [0x7f0a63bec971]
INFO:teuthology.task.ceph.mon.b.err: 27: (clone()+0x6d) [0x7f0a6268092d]
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-10 15:05:26 -08:00
Samuel Just
2f97a2223f
PG: mark scrubmap entry as not absent when we see an update
...
Previously, there would be an assert failure in _scan_list if we see an
object deleted and then recreated.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-11-10 15:05:05 -08:00
Yehuda Sadeh
87941128b6
rgw: implement swift copy, fix copy auth
2011-11-10 14:58:19 -08:00
Samuel Just
704644bca7
PG: gen_prefix: use osdmap_ref rather than osd->osdmap
...
Otherwise, the debug output might not match the map used by
the pg logic.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-11-10 14:09:35 -08:00
Samuel Just
7fb182a17b
OSD: sync_and_flush after mkfs to create first snap
...
Previously, if we kill the OSD process before the filestore
does its first sync, we end up replaying the journal on top
of current and potentially hitting -EEXIST.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-11-10 14:09:34 -08:00
Samuel Just
a3dd5bd67b
PG: update info.history even if lastmap is absent
...
Previously, we did not update same_interval_since etc. if
we did not have the previous map.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-11-09 17:17:34 -08:00
Sage Weil
023ff5903a
Makefile: add MMonProbe.h
...
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-09 16:36:48 -08:00
Sage Weil
fd5fb993e3
osd: remove useless proc_replica_log() side-effect
...
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-11-09 16:33:56 -08:00
Greg Farnum
78ad144abe
hadoop: update patch and Readme.
...
Patch generated by Noah Watkins <noahwatkins@gmail.com>
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-11-09 15:38:57 -08:00
Yehuda Sadeh
386c0db372
rgw: swift guesses mime type if not specified
2011-11-09 15:30:14 -08:00
Sage Weil
78ccb2a980
osd: comment PG::lock*(), whitespace
...
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-09 14:50:09 -08:00