Commit Graph

16957 Commits

Sondra.Menthers
ed839f5a41 fixed graphic reference and headings 2011-10-27 14:04:56 -07:00
Sondra.Menthers
2c4eb07536 fixed image reference 2011-10-27 14:00:57 -07:00
Sondra.Menthers
b42443ec71 fixed architecture document 2011-10-27 13:54:31 -07:00
Sondra.Menthers
c57ed06cf3 add images for documentation 2011-10-27 13:43:05 -07:00
Sondra.Menthers
7a02202977 rgw: handle swift PUT with incorrect etag 2011-10-27 12:51:57 -07:00
Sondra.Menthers
cae7d5a056 rgw: handle swift PUT with incorrect etag 2011-10-27 12:44:37 -07:00
Sondra.Menthers
697bba394d rgw: handle swift PUT with incorrect etag 2011-10-27 12:44:09 -07:00
Sondra.Menthers
a817a38eff rgw: handle swift PUT with incorrect etag 2011-10-27 11:20:41 -07:00
Sondra.Menthers
d9dfd14761 rgw: handle swift PUT with incorrect etag 2011-10-27 11:16:51 -07:00
Sondra.Menthers
87224c08ae rgw: handle swift PUT with incorrect etag 2011-10-27 11:02:23 -07:00
Yehuda Sadeh
0c78f0dc80 rgw: handle swift PUT with incorrect etag 2011-10-26 17:20:51 -07:00
Yehuda Sadeh
e8e101580e rgw: rgw-admin --skip-zero-entries 2011-10-26 16:07:30 -07:00
Sage Weil
180c744bf3 perfcounters: fix accessor name
FreakingCamelCaps

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-10-26 16:00:45 -07:00
Sage Weil
1a0a732e9b objecter: instrument with perfcounter
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-10-26 16:00:36 -07:00
Yehuda Sadeh
e747456c9f rgw: rgw-admin generate-key/access-key=false fix 2011-10-26 15:34:52 -07:00
Yehuda Sadeh
9386a7b5e5 rgw: rgw-admin can show log summation 2011-10-26 15:34:18 -07:00
Yehuda Sadeh
6752babdfd rgw: fix bucket suspension 2011-10-26 14:30:50 -07:00
Sage Weil
f197e84517 rgw: fix uninitialized variable warnings
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-10-25 21:34:07 -07:00
Yehuda Sadeh
71fd830220 Merge branch 'master' of ssh://github.com/NewDreamNetwork/ceph
Conflicts:
	src/rgw/rgw_rados.cc
2011-10-25 16:29:40 -07:00
Greg Farnum
952be11aaa hadoop: bring back Java changes.
These convert the Hadoop stuff to work on the branch-0.20 API.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-10-25 16:39:01 -07:00
Yehuda Sadeh
d9f73605b9 rgw: fix attr cache 2011-10-25 16:23:08 -07:00
Sage Weil
ef48183a6f fix osdmaptool clitests 2011-10-25 14:15:13 -07:00
Sage Weil
8ae02dab74 Merge branch 'wip-pools' 2011-10-25 14:02:42 -07:00
Sage Weil
6287ccf624 mon: reencode routed messages
The message encoding may depend on the target features.  Clear the
payload so that the Message gets reencoded appropriately.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-10-25 10:52:06 -07:00
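
A minimal sketch of the pattern this commit describes: a message caches its encoded payload, so routing it to a target with different features requires clearing that cache first. The Message fields and method names below are illustrative assumptions, not the actual Ceph API.

    #include <cstdint>
    #include <iostream>
    #include <string>

    struct Message {
        std::string payload;            // cached wire encoding, if any
        void encode_payload(uint64_t target_features) {
            if (!payload.empty())
                return;                 // reuses the cache -- stale if features differ
            payload = (target_features & 0x1) ? "new-format" : "old-format";
        }
        void clear_payload() { payload.clear(); }
    };

    int main() {
        Message m;
        m.encode_payload(0x1);          // first encoded for a feature-rich peer
        // Routing the same Message to an old client: without clearing the
        // cache we would forward the new-format bytes unchanged.
        m.clear_payload();              // the fix: drop the cached encoding...
        m.encode_payload(0x0);          // ...so it reencodes for the target
        std::cout << m.payload << "\n"; // prints "old-format"
    }
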
Sage Weil
72e0ca0261 MOSDMap: reencode full map embedded in Incremental, as needed
The Incremental may have a bufferlist containing a full map; reencode
that too if we are reencoding for old clients.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-10-25 10:51:21 -07:00
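
The same idea one level deeper, sketched with hypothetical stand-ins (FullMap, encode_map, decode_map): the Incremental carries a pre-encoded full map as an opaque blob, so reencoding the outer message alone is not enough; the embedded blob must be decoded and reencoded for the target's features too.

    #include <cstdint>
    #include <string>

    struct FullMap { int epoch = 0; };

    std::string encode_map(const FullMap& m, uint64_t features) {
        return (features & 0x1) ? "new:" + std::to_string(m.epoch)
                                : "old:" + std::to_string(m.epoch);
    }

    FullMap decode_map(const std::string&) { return FullMap{42}; }

    struct Incremental {
        std::string fullmap_bl;         // embedded, pre-encoded full map
    };

    void reencode_embedded(Incremental& inc, uint64_t target_features) {
        if (inc.fullmap_bl.empty())
            return;                     // nothing embedded, nothing to do
        FullMap m = decode_map(inc.fullmap_bl);          // decode the blob...
        inc.fullmap_bl = encode_map(m, target_features); // ...reencode for target
    }
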
Sage Weil
cd6d70090d Merge remote-tracking branch 'gh/wip-rbd-tool' 2011-10-25 10:13:44 -07:00
Sage Weil
90f0429f8b mon: fix rare races with pool updates
Signed-off-by: Sage Weil <sage@newdream.net>
2011-10-25 09:53:18 -07:00
Sage Weil
6ca99060a3 mon: parse 0 values properly
Signed-off-by: Sage Weil <sage@newdream.net>
2011-10-25 09:53:18 -07:00
Sage Weil
43aa33a20c Merge remote branch 'gh/wip-osd-queue' 2011-10-24 22:51:15 -07:00
Sage Weil
03ad5a28ee osd: fix last_complete adjustment after recovering an object
After we recover each object, we try to raise the last_complete value
(and matching complete_to iterator).  If our log was purely a backlog, this
won't necessarily end up bringing last_complete all the way up to the
last_update value, and we'll fail an assert later.

If complete_to does reach the end of the log, then we fast-forward
last_complete to last_update.

The crash we were hitting was in finish_recovery(), and looked something
like

osd/PG.cc: In function 'void PG::finish_recovery(ObjectStore::Transaction&, std::list<Context*, std::allocator<Context*> >&)', in thread '0x7f4573df7700'
osd/PG.cc: 1800: FAILED assert(info.last_complete == info.last_update)
 ceph version 0.36-251-g6e29c28 (commit:6e29c2826066a7723ed05b60b8ac0433a04c3c13)
 1: (PG::finish_recovery(ObjectStore::Transaction&, std::list<Context*, std::allocator<Context*> >&)+0x8d) [0x6ff0ed]
 2: (PG::RecoveryState::Active::react(PG::RecoveryState::ActMap const&)+0x316) [0x729196]
 3: (boost::statechart::simple_state<PG::RecoveryState::Active, PG::RecoveryState::Primary, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0x21b) [0x759c0b]
 4: (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>::process_event(boost::statechart::event_base const&)+0x8d) [0x7423dd]
 5: (PG::RecoveryState::handle_activate_map(PG::RecoveryCtx*)+0x183) [0x711f43]
 6: (OSD::activate_map(ObjectStore::Transaction&, std::list<Context*, std::allocator<Context*> >&)+0x674) [0x579884]
 7: (OSD::handle_osd_map(MOSDMap*)+0x2270) [0x57bd50]
 8: (OSD::_dispatch(Message*)+0x4d0) [0x596bb0]
 9: (OSD::ms_dispatch(Message*)+0x17b) [0x59803b]
 10: (SimpleMessenger::dispatch_entry()+0x9c2) [0x617562]
 11: (SimpleMessenger::DispatchThread::entry()+0x2c) [0x4a3dec]
 12: (Thread::_entry_func(void*)+0x12) [0x611a92]
 13: (()+0x7971) [0x7f457f87b971]
 14: (clone()+0x6d) [0x7f457e10b92d]

Fixes: #1609
Signed-off-by: Sage Weil <sage@newdream.net>
2011-10-24 22:50:43 -07:00
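
A rough sketch of the fast-forward logic from this commit, with eversion_t reduced to an int and the log to a vector; this is not the real PG code.

    #include <cstddef>
    #include <vector>

    struct PGLogSketch {
        std::vector<int> log;        // entry versions, oldest first
        int last_update = 0;         // version of the newest update anywhere
        int last_complete = 0;
        std::size_t complete_to = 0; // index of first not-yet-complete entry

        void got_object(int recovered_version) {
            // Raise last_complete past log entries we now have.
            while (complete_to < log.size() && log[complete_to] <= recovered_version) {
                last_complete = log[complete_to];
                ++complete_to;
            }
            // A pure backlog need not mention every version up to
            // last_update, so the walk above can stall short of it. If the
            // whole log has been consumed, recovery is done: fast-forward.
            if (complete_to == log.size())
                last_complete = last_update;
        }
    };
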
Sage Weil
12b3b2d5af osd: fix generate_past_intervals maybe_went_rw on oldest interval
We stop working backwards when we hit last_epoch_clean, which means for the
oldest interval first_epoch may not be the _real_ first_epoch.  (We can't
continue working backward because we may have thrown out those maps
entirely.)

However, if the last_epoch_clean epoch is contained within that interval,
we know that the OSD did in fact go rw because it had to have completed
recovery (and thus peering) to set last_epoch_clean in the first place.

This fixes cases where two different nodes have slightly different
past intervals, generate different prior probe sets as a result, and
flip/flop on the acting set choice.  (It may have eventually resolved when
the wrongly excluded node's notify won the race and arrived in time to be
considered, but that's still clearly no good.)

This does leave the start epoch for that oldest interval incorrect.  That
doesn't currently matter except that it's confusing, but I'm not sure how
to mark it properly, or if it's worth the effort.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-10-24 22:50:43 -07:00
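
A sketch of the inference this commit adds, using simplified stand-in types rather than the real generate_past_intervals code: if last_epoch_clean falls inside the clamped oldest interval, that interval must have gone rw.

    struct Interval {
        int first = 0, last = 0;    // epoch range [first, last]; 'first' may be clamped
        bool maybe_went_rw = false;
    };

    void mark_oldest_interval(Interval& oldest, int last_epoch_clean) {
        // Completing recovery requires peering, and peering happens within
        // an interval that went rw -- so containing last_epoch_clean is proof,
        // even though 'first' may not be the interval's real start.
        if (oldest.first <= last_epoch_clean && last_epoch_clean <= oldest.last)
            oldest.maybe_went_rw = true;
    }
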
Sage Weil
c30ab1e25a osd: MOSDPGNotify: print prettier
Signed-off-by: Sage Weil <sage@newdream.net>
2011-10-24 22:50:43 -07:00
Sage Weil
7de2f7a94e osd: print useful debug info from choose_acting
Signed-off-by: Sage Weil <sage@newdream.net>
2011-10-24 22:50:43 -07:00
Sage Weil
e2f3c20b04 osd: make proc_replica_log missing dump include useful information
I needed to see the have/need versions to debug a weird unfound issue
turned up by thrashing.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-10-24 22:50:42 -07:00
Sage Weil
f8e9289669 osd: fix/simplify op discard checks
Use a helper to determine when we should discard an op due to the client
being disconnected.  Use this when the op is first received, (re)queued,
and dequeued.

Fix the check to keep ops that are replayed ACKs, as we should make every
effort to reapply those even when the client goes away.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-10-24 22:21:43 -07:00
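
A minimal sketch of the helper pattern, with invented op fields for illustration: one predicate, consulted at receive, (re)queue, and dequeue time.

    struct OpSketch {
        bool client_connected;       // is the originating session still open?
        bool is_replayed_ack;        // replay of a request we already acked?
    };

    // Consulted when an op is first received, (re)queued, and dequeued.
    bool should_discard(const OpSketch& op) {
        if (op.is_replayed_ack)
            return false;            // always make the effort to reapply these
        return !op.client_connected; // otherwise drop ops whose client is gone
    }
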
Sage Weil
fa722de670 osd: move queue checks into enqueue_op, kill _handle_ helpers
This simplifies things, and renames the checks to make it clear that we are
doing validation checks only, with no side-effects allowed.

Also move some checks into the parent handle_op() to further simplify the
(re)queue checks.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-10-24 22:13:59 -07:00
Sage Weil
3a2dc65665 osd: move op cap check into helper
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-10-24 21:59:49 -07:00
Sage Weil
662414d76e osd: drop useless PG hooks
These no longer need to be exposed to the generic OSD code.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-10-24 21:54:10 -07:00
Sage Weil
b1de913188 osd: drop ability to disable op queue entirely
This is pretty useless, and broken wrt requeueing anyway.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-10-24 21:54:10 -07:00
Sage Weil
b17c9ca595 osd: handle missing/degraded in op thread
The _handle_op() method (and friends) are called when an op is initially
queued and when it is requeued.  In the requeue case we have to be more
careful because the caller may be in the middle of doing all sorts of
random stuff.  That means we need to limit ourselves to queueing or
discarding the op, and refrain from doing anything else with dangerous
side effects.

This fixes a crash like

osd/ReplicatedPG.cc: In function 'void ReplicatedPG::recover_primary_got(hobject_t, eversion_t)', in thread '7f21d0189700'
osd/ReplicatedPG.cc: 4109: FAILED assert(missing.num_missing() == 0)
 ceph version 0.37-105-gc2069eb (commit:c2069eb1e562ba7d753c9b5ce5c904f4f5ef6abe)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x76) [0x8ab95a]
 2: (ReplicatedPG::recover_primary_got(hobject_t, eversion_t)+0x62e) [0x767eea]
 3: (ReplicatedPG::sub_op_push(MOSDSubOp*)+0x2b79) [0x76abeb]
 4: (ReplicatedPG::do_sub_op(MOSDSubOp*)+0x1ab) [0x74761b]
 5: (OSD::dequeue_op(PG*)+0x47d) [0x820ac3]
 6: (OSD::OpWQ::_process(PG*)+0x27) [0x82cc8b]

due to an object being pushed to a replica before it is activated.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-10-24 21:54:10 -07:00
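
A sketch of the queue-or-discard discipline this commit enforces; the names are illustrative, not the real OSD interfaces.

    enum class Verdict { Enqueue, Discard };

    struct Op { bool discardable = false; };

    // Called at queue and requeue time: validation only, no side effects,
    // because on requeue the caller may be mid-operation.
    Verdict check_op(const Op& op) {
        return op.discardable ? Verdict::Discard : Verdict::Enqueue;
    }

    // Side-effectful work -- including missing/degraded handling -- runs
    // only here, when the op thread dequeues the op in a safe context.
    void dequeue_op(const Op& /*op*/) {
        // ... handle missing/degraded objects, then process the op ...
    }
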
Sage Weil
7aa0d89bb9 osd: set reqid on push/pull ops
Not strictly necessary, but makes logs easier to follow.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-10-24 20:54:26 -07:00
Sage Weil
e2766bd87c mon: remove compatset cruft
The CompatSet is built on demand; it's no longer static.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-10-24 20:42:12 -07:00
Samuel Just
024bcc4b24 FileStore: ignore EEXIST on clones and collection creation !btrfs_snap
We also need to ignore EEXIST on btrfs when m_filestore_btrfs_snap is
disabled.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-10-24 16:54:13 -07:00
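
A sketch of the EEXIST-tolerance pattern, assuming a plain POSIX mkdir as the stand-in operation: when transactions can be partially reapplied (no btrfs snapshot to roll back to), a create may find its target already present, and that is success, not failure.

    #include <cerrno>
    #include <sys/stat.h>

    int create_collection(const char* path) {
        int r = ::mkdir(path, 0755);
        if (r < 0 && errno == EEXIST)
            r = 0;  // already created by a previously applied transaction: fine
        return r;
    }
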
Samuel Just
6f1b65c6ff ReplicatedPG: fix snapshot directory handling in snap_trimmer
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-10-24 16:54:13 -07:00
Yehuda Sadeh
4d884040f1 rgw: fix rgw_obj compare function 2011-10-24 16:43:14 -07:00
Greg Farnum
df2967a602 rgw: use a uint64_t instead of a size_t for storing the size
librados uses uint64_t so that 32-bit architectures aren't hobbled.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-10-24 15:34:42 -07:00
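
A self-contained illustration of why the type matters, independent of the rgw code: on a 32-bit build, size_t is 32 bits and silently truncates sizes of 4 GiB or more.

    #include <cstdint>
    #include <cstdio>

    int main() {
        uint64_t size = 5ULL * 1024 * 1024 * 1024;   // a 5 GiB object
        // On a 32-bit target, sizeof(size_t) == 4, and storing this size
        // in a size_t would silently wrap to 1 GiB.
        std::printf("uint64_t holds %llu bytes regardless of word size\n",
                    (unsigned long long)size);
    }
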
Josh Durgin
e161ce1593 workunits: test rbd python bindings
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-10-24 15:32:47 -07:00
Josh Durgin
b7aa57ff73 rbd.py: update python bindings for new copy interface
It was changed to return 0 on success in d7f7a21354.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-10-24 15:32:47 -07:00
Josh Durgin
2af32a411b librados: use stored snap context for all operations
Using an empty snap context led to the failure of
test_rbd.TestImage.test_rollback_with_resize, since clones weren't
created when deleting objects. This test now passes.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-10-24 15:32:47 -07:00
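
A generic sketch of the bug pattern, using simplified stand-ins for the librados internals: writes and deletes issued with an empty snap context give the OSD no reason to clone, so snapshots lose the data they should preserve.

    #include <cstdint>
    #include <vector>

    struct SnapContext {
        uint64_t seq = 0;
        std::vector<uint64_t> snaps;    // existing snapshot ids, newest first
    };

    struct ObjectOp {
        SnapContext snapc;  // what the OSD consults to decide whether to clone
    };

    ObjectOp make_delete_op(const SnapContext& stored_snapc) {
        ObjectOp op;
        // The fix, conceptually: carry the stored context instead of a
        // default-constructed (empty) one, so the OSD clones before removing.
        op.snapc = stored_snapc;
        return op;
    }
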
Josh Durgin
ae91911c75 librbd: resize if necessary before rolling back
This is a partial fix for test_rbd.TestImage.test_rollback_with_resize.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-10-24 15:32:47 -07:00
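
A sketch of the precondition this commit adds; the function names are illustrative, not the librbd API: if the image's current size differs from the snapshot's size, resize first, then roll back the data.

    #include <cstdint>

    struct Image { uint64_t size = 0; };
    struct Snap  { uint64_t size = 0; };

    void resize(Image& img, uint64_t new_size) { img.size = new_size; }
    void rollback_data(Image&, const Snap&) { /* copy snapshot contents back */ }

    void rollback_to_snap(Image& img, const Snap& snap) {
        if (img.size != snap.size)
            resize(img, snap.size);  // resize if necessary before rolling back
        rollback_data(img, snap);
    }
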