17167 Commits

Author SHA1 Message Date
Sage Weil
73705f661b monmaptool: fix clitests
Initial map is epoch 0.  Modifications still bump epoch by one.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-11-11 20:52:28 -08:00
Sage Weil
36241da4b1 paxos: discard waiting_for_active events on reset
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 20:49:00 -08:00
Sage Weil
80ab65682a monclient: use blank fsid (instead of epoch==0) for monmap checks
We can safely mkfs with an epoch=0 monmap as long as the fsid is set.  And
that is what commit f31825cee5 changed.

Instead, use a zeroed fsid to tell if the monmap is valid/usable.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 20:48:59 -08:00
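
The validity check described in this commit can be sketched roughly as follows. This is an illustrative Python model, not the Ceph C++ code: `BLANK_FSID` and `monmap_is_usable` are hypothetical names, and the dictionaries only stand in for a monmap.

```python
import uuid

# Assumption: an all-zero UUID stands for "fsid not set".
BLANK_FSID = uuid.UUID(int=0)

def monmap_is_usable(fsid: uuid.UUID) -> bool:
    """Treat a monmap as usable when its fsid is set, whatever its epoch."""
    return fsid != BLANK_FSID

# A freshly created map has epoch 0 but a real fsid, so it is still usable.
fresh = {"epoch": 0, "fsid": uuid.uuid4()}
unset = {"epoch": 0, "fsid": BLANK_FSID}
```

Both maps have epoch 0, so an `epoch == 0` test could not tell them apart; only the fsid comparison does.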
Sage Weil
2253c0168d use libuuid for fsid
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 20:48:59 -08:00
Sage Weil
cf0a53e1ed mon: fix seed monmap removal
Remove if we previously had no latest, not based on which map we now have.
It's possible we join when monmap epoch is something much larger than 1!

Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 14:54:41 -08:00
Sage Weil
6d370f3bd6 mon: allow monitor to automagically join cluster
If a monitor starts up with the correct fsid and auth keys, it will now
add itself to the monmap (and subsequently try to join the quorum) if it
is not already in the monmap.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 14:52:44 -08:00
Sage Weil
d56485a8af osd: pass monclient::init errors up the stack
Fixes crash like

 ceph version 0.38-149-gbf254de (commit:bf254de5cf8a17ce9467d166d87f3ab93170ae13)
 1: (ceph::BackTrace::BackTrace(int)+0x2d) [0x91d97b]
 2: ./ceph-osd() [0xa05baa]
 3: (()+0xef60) [0x7fb54c87ef60]
 4: (std::_Rb_tree<unsigned int, unsigned int, std::_Identity<unsigned int>, std::less<unsigned int>, std::allocator<unsigned int> >::size() const+0xc) [0x8a4bc6]
 5: (std::set<unsigned int, std::less<unsigned int>, std::allocator<unsigned int> >::size() const+0x18) [0x8a1d32]
 6: (void encode<unsigned int>(std::set<unsigned int, std::less<unsigned int>, std::allocator<unsigned int> > const&, ceph::buffer::list&)+0x1c) [0x8a0311]
 7: (MonClient::_reopen_session()+0x2c5) [0x89a425]
 8: (MonClient::authenticate(double)+0x24f) [0x898da7]
 9: (OSD::init()+0x112b) [0x807ca1]
 10: (main()+0x2c09) [0x73e406]
 11: (__libc_start_main()+0xfd) [0x7fb54b04ec4d]
 12: ./ceph-osd() [0x73b499]

due to auth_supported being NULL.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 12:52:24 -08:00
Sage Weil
bf254de5cf mon: verify fsid during probe and election
This will keep mismatched fsids out of the same quorum.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 12:37:07 -08:00
Sage Weil
f1a98fb8af mon: tolerate won election while active
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 12:22:37 -08:00
Sage Weil
cd736b9da0 mon: clean up logic a bit
More explicit.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 12:22:22 -08:00
Sage Weil
2633d71dbd mon: only re bootstrap if monmap actually changes
If we go thru here just to update latest, that's fine; no need to restart
the bootstrap process.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 12:22:09 -08:00
Sage Weil
622fbadd66 paxos: fix off-by-one in share_state
We hit this on adding a new monitor to an existing cluster.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 12:15:16 -08:00
Sage Weil
6c663d855e mon: fix monmap update
It's on the stack; update in place.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 12:05:01 -08:00
Sage Weil
1134fdfe78 mon: properly process monmaps even when i have the latest
We may get the latest monmap when we are doing our probing, but we still
need to process it in update_from_paxos().  Consider get_latest_version()
in addition to the active map.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 12:02:52 -08:00
Sage Weil
c097e63401 mon: fix up update_from_paxos() methods
Make sure they behave when the initial state is learned from paxos.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 11:55:34 -08:00
Sage Weil
f31825cee5 monmaptool: new maps get epoch 0
Just for consistency's sake.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 11:41:42 -08:00
Sage Weil
65f797ea47 mon: clean up mkfs seed data
And make sure the monmap/latest gets written properly.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 11:41:42 -08:00
Sage Weil
e545af2db5 mon: remove empty monstore dirs
This is sloppy, but it works well enough since we mkdir dirs as needed
too.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 11:41:42 -08:00
Sage Weil
aea7563f03 mon: create initial states after quorum is formed
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 11:41:42 -08:00
Sage Weil
1533f1c0ed mon: stage mkfs seed info in mkfs/ dir
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 10:45:27 -08:00
Sage Weil
9e941c4359 mon: eliminate PaxosService::init()
update_from_paxos() is sufficient

Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 10:34:42 -08:00
Sage Weil
0a926ef576 mon: include monmap dump in mon_status and quorum_status
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 10:19:33 -08:00
Sage Weil
8c3d872ef0 mon: pull initial monmap from monmap/latest OR mkfs/monmap
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 10:15:23 -08:00
Sage Weil
0ecae9969e mon: take explicit initial monmap -or- generate one via MonClient
This will simplify bootstrapping a cluster via e.g. mon_host.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-11 10:05:36 -08:00
Sage Weil
dae6c95654 test_filestore_idempotent: detect commit cycles due to non-idempotent ops
If we do a non-idempotent op and it does a commit itself, we never see
fs->is_committed() return true.  Also count full commit cycles, and kill
ourselves after several of those have gone by.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-10 21:12:38 -08:00
Sage Weil
add04d15f4 filejournal: fix replay of non-idempotent ops
- start sync thread prior to replay, so that we can commit as we replay
  operations
- keep applied_seq accurate
- pass seq (not old op_seq) to do_transactions
- carry open_ops ref so that commit blocks until we have finished applying
  the full transaction

Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-10 21:12:38 -08:00
Sage Weil
9f1673c15a test_filestore_idempotent: transactions are individually idempotent
Make individual transactions idempotent, but their interactions
non-idempotent.  I.e. A A A A is okay, but A B A is not.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-10 21:12:38 -08:00
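
The "A A A A is okay, but A B A is not" distinction can be illustrated with a toy model. This is a sketch under assumptions, not the test's actual code: the store is a plain dictionary, and `clone`, `write`, and `apply_ops` are hypothetical helpers. A is taken to be a clone and B a write to the clone's source.

```python
def clone(store, src, dst):
    """Copy the current contents of src to dst."""
    store[dst] = store[src]

def write(store, key, value):
    """Overwrite key with value."""
    store[key] = value

def apply_ops(initial, ops):
    """Apply a sequence of operations to a copy of the initial state."""
    store = dict(initial)
    for op in ops:
        op(store)
    return store

A = lambda s: clone(s, "x", "y")    # idempotent on its own
B = lambda s: write(s, "x", "new")  # changes the source of A

initial = {"x": "old"}
once = apply_ops(initial, [A])
repeated = apply_ops(initial, [A, A, A, A])   # same state as `once`
intended = apply_ops(initial, [A, B])
replayed = apply_ops(initial, [A, B, A])      # A replayed after B: y differs
```

Replaying A after B copies the new contents of `x` into `y`, so the replayed state no longer matches the intended one.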
Sage Weil
8df0cd3867 filestore: make trigger_commit() wake up sync; adjust locking
We need to wake up the sync thread (duh).

Also, we need to obey the FileJournal::lock -> journal_lock locking
order.

Also, lockdep is broken. :(

Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-10 21:12:38 -08:00
Sage Weil
0981112097 filestore: document the btrfs_* fields
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-10 21:12:38 -08:00
Sage Weil
69cd362542 filestore: sync after non-idempotent operations
This is a big hammer to fix journal replay on non-btrfs fs backends (extN,
xfs, whatever).  The problem is that it is not safe to replay some journal
operations more than once, notably things like CLONE whose source data
may be changed by subsequent operations.

The simple fix is to initiate a full commit after any non-idempotent
operations prior to any subsequent operation within the same Sequencer.
This is done by calling trigger_commit() in _do_transactions(), which means
any potentially dependent operation that follows will get blocked because
a commit is about to start.

I also made trigger_commit() a bit more robust for callers that are not
holding an open_ops ref: it now also succeeds if the given op_seq is
already committing.  For the current caller, that can't happen.

There are probably better performing solutions, but this one is at least
correct.

Fixes: #213
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-10 21:12:38 -08:00
Sage Weil
fa5047b377 Merge remote branch 'gh/stable'
2011-11-10 20:50:31 -08:00
Josh Durgin
5407fa70f8 workunits: add workunit for running rgw and rados python tests
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-11-10 17:03:12 -08:00
Yehuda Sadeh
2fb702979c rgw: remove warning
2011-11-10 17:10:41 -08:00
Josh Durgin
71bfe8974e test/pybind: add test_rgw
Forgot to add this in the previous commit.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-11-10 16:52:01 -08:00
Josh Durgin
ea42e02ca2 test/pybind: convert python rados and rgw tests to be runnable by nose
These tests can now be run automatically more easily.

Fixes: #1653
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-11-10 16:46:35 -08:00
Josh Durgin
25cde7f98a rados.py: fix Snap.get_timestamp
This now uses datetime, imports the right things, and calls the right function.

Fixes #1577
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-11-10 16:33:46 -08:00
Sage Weil
b600ec2ac7 v0.38
2011-11-10 15:07:05 -08:00
Samuel Just
2a7fbe0c90 common: return null if mc.init() unsuccessful
Prevents ceph.cc from segfaulting on missing keyring.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-11-10 15:05:52 -08:00
Josh Durgin
a177a702eb rbd.py: fix list when there are no images
It should return [], not [''].

Reported-by: Eric Chen <Eric_YH_Chen@wistron.com>
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-11-10 15:05:38 -08:00
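
The `[]` versus `['']` difference is typical of splitting an empty string. The sketch below assumes image names arrive as a single NUL-separated string; `split_names` is a hypothetical helper, not the function in rbd.py.

```python
def split_names(buf: str) -> list:
    """Split a NUL-separated string of names, dropping empty entries."""
    return [name for name in buf.split("\0") if name]

naive_empty = "".split("\0")          # a plain split keeps one empty string
fixed_empty = split_names("")         # no images: empty list
fixed_two = split_names("img1\0img2\0")
```

A plain `split` on an empty buffer yields one empty name, which a caller would see as a phantom image; filtering out empty entries avoids that.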
Sage Weil
27bb48c5bf mon: overwrite in put_bl
This fixes a situation where we accept a large value, there is some failure
and recovery, and then we commit a smaller value with the same version.

E.g.,

INFO:teuthology.task.ceph.mon.b.err:terminate called after throwing an instance of 'ceph::buffer::end_of_buffer'
INFO:teuthology.task.ceph.mon.b.err:  what():  buffer::end_of_buffer
INFO:teuthology.task.ceph.mon.b.err:*** Caught signal (Aborted) **
INFO:teuthology.task.ceph.mon.b.err: in thread 7f0a6037c700
INFO:teuthology.task.ceph.mon.b.err: ceph version 0.37-365-g5b20830 (commit:5b208302e1ad134f56933dfdbccb074e03c88be3)
INFO:teuthology.task.ceph.mon.b.err: 1: (ceph::BackTrace::BackTrace(int)+0x2d) [0x6f4d1b]
INFO:teuthology.task.ceph.mon.b.err: 2: /tmp/cephtest/binary/usr/local/bin/ceph-mon() [0x7e9492]
INFO:teuthology.task.ceph.mon.b.err: 3: (()+0xfb40) [0x7f0a63bf4b40]
INFO:teuthology.task.ceph.mon.b.err: 4: (gsignal()+0x35) [0x7f0a625cdba5]
INFO:teuthology.task.ceph.mon.b.err: 5: (abort()+0x180) [0x7f0a625d16b0]
INFO:teuthology.task.ceph.mon.b.err: 6: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f0a62e716bd]
INFO:teuthology.task.ceph.mon.b.err: 7: (()+0xb9906) [0x7f0a62e6f906]
INFO:teuthology.task.ceph.mon.b.err: 8: (()+0xb9933) [0x7f0a62e6f933]
INFO:teuthology.task.ceph.mon.b.err: 9: (()+0xb9a3e) [0x7f0a62e6fa3e]
INFO:teuthology.task.ceph.mon.b.err: 10: (ceph::buffer::list::iterator::copy(unsigned int, std::string&)+0xcb) [0x7d73a7]
INFO:teuthology.task.ceph.mon.b.err: 11: (decode(std::string&, ceph::buffer::list::iterator&)+0x44) [0x5fa2e8]
INFO:teuthology.task.ceph.mon.b.err: 12: (LogEntry::decode(ceph::buffer::list::iterator&)+0xa8) [0x6ceee8]
INFO:teuthology.task.ceph.mon.b.err: 13: (LogMonitor::update_from_paxos()+0x346) [0x6cce9a]
INFO:teuthology.task.ceph.mon.b.err: 14: (PaxosService::_active()+0x13b) [0x647ab5]
INFO:teuthology.task.ceph.mon.b.err: 15: (PaxosService::C_Active::finish(int)+0x25) [0x647cb9]
INFO:teuthology.task.ceph.mon.b.err: 16: (Context::complete(int)+0x2b) [0x61a5a9]
INFO:teuthology.task.ceph.mon.b.err: 17: (finish_contexts(CephContext*, std::list<Context*, std::allocator<Context*> >&, int)+0x20b) [0x61a7ef]
INFO:teuthology.task.ceph.mon.b.err: 18: (Paxos::handle_last(MMonPaxos*)+0xea7) [0x63d081]
INFO:teuthology.task.ceph.mon.b.err: 19: (Paxos::dispatch(PaxosServiceMessage*)+0x29c) [0x642046]
INFO:teuthology.task.ceph.mon.b.err: 20: (Monitor::_ms_dispatch(Message*)+0xd78) [0x61636e]
INFO:teuthology.task.ceph.mon.b.err: 21: (Monitor::ms_dispatch(Message*)+0x3a) [0x61de84]
INFO:teuthology.task.ceph.mon.b.err: 22: (Messenger::ms_deliver_dispatch(Message*)+0x63) [0x7c690f]
INFO:teuthology.task.ceph.mon.b.err: 23: (SimpleMessenger::dispatch_entry()+0x7c2) [0x7b0156]
INFO:teuthology.task.ceph.mon.b.err: 24: (SimpleMessenger::DispatchThread::entry()+0x2c) [0x5fd6ac]
INFO:teuthology.task.ceph.mon.b.err: 25: (Thread::_entry_func(void*)+0x23) [0x6e9261]
INFO:teuthology.task.ceph.mon.b.err: 26: (()+0x7971) [0x7f0a63bec971]
INFO:teuthology.task.ceph.mon.b.err: 27: (clone()+0x6d) [0x7f0a6268092d]

Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-10 15:05:26 -08:00
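
One way a smaller value can end up undecodable after a larger one is a write that does not truncate the existing file. The sketch below assumes values are stored one per file; `put_no_truncate` and `put_overwrite` are hypothetical helpers and do not reproduce the monitor store's put_bl.

```python
import os
import tempfile

def _put(path, data, flags):
    """Write data at offset 0 of path using the given open flags."""
    fd = os.open(path, flags, 0o644)
    try:
        os.write(fd, data)
    finally:
        os.close(fd)

def put_no_truncate(path, data):
    """Write without truncating: bytes past len(data) survive."""
    _put(path, data, os.O_WRONLY | os.O_CREAT)

def put_overwrite(path, data):
    """Truncate first, so the file holds exactly data."""
    _put(path, data, os.O_WRONLY | os.O_CREAT | os.O_TRUNC)

def read(path):
    with open(path, "rb") as f:
        return f.read()

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "value")
    put_no_truncate(path, b"large-value")
    put_no_truncate(path, b"small")   # tail of the old value remains
    stale = read(path)
    put_overwrite(path, b"small")     # old contents fully replaced
    clean = read(path)
```

With the non-truncating write, the file mixes the new value with the tail of the old one, which is the kind of content a decoder could run off the end of.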
Samuel Just
2f97a2223f PG: mark scrubmap entry as not absent when we see an update
Previously, there would be an assert failure in _scan_list if we see an
object deleted and then recreated.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-11-10 15:05:05 -08:00
Yehuda Sadeh
87941128b6 rgw: implement swift copy, fix copy auth
2011-11-10 14:58:19 -08:00
Samuel Just
704644bca7 PG: gen_prefix: use osdmap_ref rather than osd->osdmap
Otherwise, the debug output might not match the map used by
the pg logic.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-11-10 14:09:35 -08:00
Samuel Just
7fb182a17b OSD: sync_and_flush after mkfs to create first snap
Previously, if we kill the OSD process before the filestore
does its first sync, we end up replaying the journal on top
of current and potentially hitting -EEXIST.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-11-10 14:09:34 -08:00
Samuel Just
a3dd5bd67b PG: update info.history even if lastmap is absent
Previously, we did not update same_interval_since etc if
we do not have the previous map.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-11-09 17:17:34 -08:00
Sage Weil
023ff5903a Makefile: add MMonProbe.h
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-09 16:36:48 -08:00
Sage Weil
fd5fb993e3 osd: remove useless proc_replica_log() side-effect
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-11-09 16:33:56 -08:00
Greg Farnum
78ad144abe hadoop: update patch and Readme.
Patch generated by Noah Watkins <noahwatkins@gmail.com>

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-11-09 15:38:57 -08:00
Yehuda Sadeh
386c0db372 rgw: swift guesses mime type if not specified
2011-11-09 15:30:14 -08:00
Sage Weil
78ccb2a980 osd: comment PG::lock*(), whitespace
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-09 14:50:09 -08:00