Commit Graph

21785 Commits

Author SHA1 Message Date
Yehuda Sadeh
84299e16f3 rgw: fix multipart overwrite
Fixes: #3400
Removed a few lines of code that prematurely created the head
part of the final object (before creating the manifest).

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-11-06 10:17:14 -08:00
Yehuda Sadeh
be6d563653 rgw: don't reset multipart parts when updating their metadata
Fixes: #3401
The problem was that put_obj_meta() assumed the object was going
to be reset, so it reset the object unconditionally. This is not
true when dealing with the immutable multipart upload parts.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-11-06 10:16:59 -08:00
Yehuda Sadeh
488b019adf rgw: break out of read loop if we got zero bytes
If the part that we're reading is corrupted and we end up
reading zero bytes, we need to exit, otherwise we'd just
loop forever.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-11-06 10:16:17 -08:00
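
The zero-byte guard described in 488b019adf can be illustrated with a minimal sketch. This is not the rgw code; `read_exact`, its parameters, and the in-memory stream are hypothetical names chosen for illustration only.

```python
import io


def read_exact(stream, length, chunk_size=4096):
    """Read up to `length` bytes, stopping early if a read returns no data.

    Hypothetical helper: it only illustrates the loop-exit condition.
    """
    buf = bytearray()
    while len(buf) < length:
        chunk = stream.read(min(chunk_size, length - len(buf)))
        if not chunk:
            # Zero bytes came back: the source is short or corrupted.
            # Without this break the loop would never make progress.
            break
        buf.extend(chunk)
    return bytes(buf)


# Asking for 10 bytes from a 5-byte source terminates with a short result.
short = read_exact(io.BytesIO(b"hello"), 10)
```

The caller can compare the returned length with the requested length to detect the short read.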
Dan Mick
241569c595 rbd: allow removal of image even if rbd_children deletion fails
Users have been seeing failures where rbd rm is half-done; could be
because of outstanding watches on the rbd_header object.  The state
is that rbd_children no longer contains the child, but other pieces
remain; remove considers this a failure.

Fix: test for ENOENT from remove_child, and treat that as an ignorable
error and drive on.  Simulate this in copy.sh by removing the
rbd_children object altogether, which also results in ENOENT return
from remove_child.

Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2012-11-05 21:41:34 -08:00
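
The error handling described in 241569c595 might look roughly like the sketch below. The function and callback names are hypothetical and do not correspond to the librbd API; the only point shown is that ENOENT from the child-removal step is tolerated while other errors still propagate.

```python
import errno


def remove_image(remove_child, remove_rest):
    """Remove an image, tolerating a child entry that is already gone.

    `remove_child` and `remove_rest` are hypothetical callables standing in
    for the individual removal steps.
    """
    try:
        remove_child()
    except OSError as e:
        if e.errno != errno.ENOENT:
            raise  # any other failure is still fatal
        # ENOENT: the child record is already missing, for example after a
        # half-finished earlier removal, so drive on.
    remove_rest()
    return True
```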
Samuel Just
342c2c7077 PG::merge_old_entry: fix case for divergent prior_version
Previously, we asserted that a log entry with a divergent
prior_version must be a clone.  Consider the following
case:

6'11(6'2)  m foo
7'12(6'3) m bar
7'13(7'12) m bar

If this is merged with:

6'11(6'2)  m foo
8'12(6'4) m baz

we will hit the assert.  The correct behavior is simply to remove
the object as in the clone case.

Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2012-11-04 06:16:36 -08:00
Samuel Just
7e264678a9 PG: use remove_object_with_snap_hardlinks for divergent objects
Otherwise, we end up leaving snap hardlinks in the snapshot
index directories.  This eventually results in an EEXIST error
when we attempt to re-link the clone into place during
recovery.

Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2012-11-04 06:16:18 -08:00
Sage Weil
c435d314ca ceph-disk-activate: avoid duplicating mounts if already activated
If the given device is already mounted at the target location, do not
mount --move it again and create a bunch of dup entries in the /etc/mtab
and kernel mount table.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-10-31 17:14:31 -07:00
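
A minimal sketch of the "already mounted at the target" check described in c435d314ca, assuming a mount table in the whitespace-separated `/proc/mounts` layout (device first, mount point second). The helper name and the sample paths are hypothetical; the actual ceph-disk-activate logic is not shown in this log.

```python
def is_mounted_at(mounts_text, device, target):
    """Return True if `device` is already mounted on `target`.

    `mounts_text` is expected to look like the contents of /proc/mounts.
    """
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == device and fields[1] == target:
            return True
    return False


# Hypothetical sample table for illustration.
sample = (
    "/dev/sda1 / ext4 rw 0 0\n"
    "/dev/sdb1 /var/lib/ceph/osd/ceph-0 xfs rw,noatime 0 0\n"
)
already = is_mounted_at(sample, "/dev/sdb1", "/var/lib/ceph/osd/ceph-0")
```

If the check returns True, the caller would skip the `mount --move` step.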
Mike Ryan
3f08e96cc0 PG: requeue snap_trimmer after scrub finishes
Previously the snap_trimmer would continuously requeue itself until the
end of scrub. This degrades performance and fills up logs for No Good
Reason.

Signed-off-by: Mike Ryan <mike.ryan@inktank.com>
2012-10-31 15:42:03 -07:00
Sage Weil
402e1f5319 ceph-disk-prepare: poke kernel into refreshing partition tables
Prod the kernel to refresh the partition table after we create one.  The
partprobe program is packaged with parted, which we already use, so this
introduces no new dependency.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-10-30 10:40:58 -07:00
Sage Weil
2e32a0ee2d ceph-disk-prepare: fix journal partition creation
The end value needs to have + to indicate it is relative to wherever the
start is.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-10-30 10:40:58 -07:00
Sage Weil
8921fc7c7b ceph-disk-prepare: assume parted failure means no partition table
If the disk has no valid label we get an error like

  Error: /dev/sdi: unrecognised disk label

Assume any error we get is that and go with an id label of 1.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-10-30 10:40:58 -07:00
Sage Weil
b9eccdf8ba osd: make pool_snap_info_t encoding backward compatible
Way back in fc869dee1e (v0.42) when we redid
the osd type encoding we forgot to make this conditionally encode the old
format for old clients.  In particular, this means that kernel clients
will fail to decode the osdmap if there is a rados pool with a pool-level
snapshot defined.

Fixes: #3290
Signed-off-by: Sage Weil <sage@inktank.com>
2012-10-29 11:05:59 -07:00
Sage Weil
2f09d47d21 mon: fix leading error string from 'ceph report'
Signed-off-by: Sage Weil <sage@inktank.com>
2012-10-26 14:55:43 -07:00
Sage Weil
c0df832877 osd: fix populate_obc_watchers() assert
There is one case where populate_obc_watchers gets called when the object
is missing: during a revert.  And in that case we *should* do the populate,
since all that is getting reverted is the object version.

Fixes: #3405
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Sam Just <sam.just@inktank.com>
2012-10-25 11:59:33 -07:00
Sage Weil
2248822b2c osd: drop conditional check in populate_obc_watchers
Turn these into asserts.  The only two callers are create_object_context()
and get_object_context(), and they only get called when the object is no
longer missing.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
2012-10-22 10:45:36 -07:00
Sage Weil
4156b984a5 osd: populate obc watchers even when degraded
Bug #3142 appears to be caused by the following sequence:

 - object X missing on primary and replica
 - [assert-ver,watch], notify, unwatch requests come in, get deferred
 - object is recovered on primary, !missing, create_object_context
   - populate_obc_watchers() does nothing, since still degraded
 - notify happens now (odd but ok?)
 - replica recovered, !degraded
 - watch skips bc of bad assert
 - unwatch trips up on an assert because populate_obc_watchers never
   ran

Fix this by populating the obc watcher when !missing, not when
!degraded.  This conditional dates back to Sam's original watch/notify
cleanup in October 2011.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
2012-10-22 10:45:20 -07:00
Sam Lang
233b0bdf0b test/libcephfs: Fix telldir/seekdir test
Signed-off-by: Sam Lang <sam.lang@inktank.com>
2012-10-19 09:22:29 -07:00
Sage Weil
1c382869ba addr_parsing: make , and ; and ' ' all delimiters
Instead of just ,.  Currently "foo.com, bar.com" will fail because of the
space after the comma.  This patch fixes that, and makes all delim
chars interchangeable.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-10-17 23:00:10 -07:00
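
The delimiter handling described in 1c382869ba can be sketched as follows. This is an illustration, not the addr_parsing code; `split_addrs` is a hypothetical name, and collapsing runs of delimiters into one is an assumption made here so that "foo.com, bar.com" yields two entries.

```python
import re


def split_addrs(s):
    """Split an address list on ',', ';' or ' ', treated interchangeably."""
    # Runs of delimiters (such as ", ") count as a single separator, and
    # empty pieces are dropped.
    return [part for part in re.split(r"[,; ]+", s) if part]


hosts = split_addrs("foo.com, bar.com")
```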
Sage Weil
6f74e6b36a radosgw: fix compile warning
Signed-off-by: Sage Weil <sage@inktank.com>
2012-10-16 17:27:28 -07:00
Gary Lowell
d78ba6af94 Merge branch 'next' 2012-10-16 23:27:21 +00:00
John Wilkins
ab4d8b75f3 doc: Updated the cephx section of the toc for cluster ops.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2012-10-16 15:24:11 -07:00
John Wilkins
256c665eab doc: Did a little clean-up work in the cephx guide.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2012-10-16 15:23:34 -07:00
John Wilkins
0818e1e95a doc: Added a new intro for cephx authentication.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2012-10-16 15:22:25 -07:00
Yehuda Sadeh
d2afddd457 rgw: multiple coverity fixes
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-10-16 14:13:38 -07:00
Sage Weil
db976663a5 mds: explicitly queue messages for unconnected clients
Previously, the messenger would queue messages for a destination that
didn't exist when you were a server; that changed a while back with the
wip-msgr merge (circa v0.52).  The result is that when we force open
client sessions and queue messages, they are dropped on the floor and the
client--when it does connect--gets confusing stuff from the MDS.

Instead, explicitly queue and send these messages.  Also, *always* send
via the Connection* instead of the inst.

Fixes: #2681
Signed-off-by: Sage Weil <sage@inktank.com>
2012-10-16 13:04:43 -07:00
Gary Lowell
2528b5ee10 v0.53 2012-10-16 17:42:36 +00:00
Sage Weil
318bd19275 Merge remote-tracking branch 'gh/wip-fedora18' into next 2012-10-16 09:05:55 -07:00
Dan Mick
96e365be85 radosgw-admin manpage: Fix broken quotes
Signed-off-by: Dan Mick <dan.mick@inktank.com>
2012-10-15 18:18:49 -07:00
Sage Weil
412efc1681 admin_socket: fix '0' protocol version
Broken by 895e24d198.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-10-15 16:37:24 -07:00
Sage Weil
18a3cee29e client: avoid possible null deref
CID 716910 (#1 of 1): Explicit null dereferenced (FORWARD_NULL)
At (6): Dereferencing null pointer "mds_session".

Signed-off-by: Sage Weil <sage@inktank.com>
2012-10-15 14:22:31 -07:00
Sage Weil
0095a13824 client: fix shadowing in inode ctor
CID 728080 (#1 of 1): Incorrect sizeof expression (BAD_SIZEOF)
Taking the size of pointer parameter "layout" is suspicious.

At (2): Non-static class member field "layout.fl_stripe_unit" is not initialized in this constructor nor in any functions that it calls.
At (4): Non-static class member field "layout.fl_stripe_count" is not initialized in this constructor nor in any functions that it calls.
At (6): Non-static class member field "layout.fl_object_size" is not initialized in this constructor nor in any functions that it calls.
At (8): Non-static class member field "layout.fl_cas_hash" is not initialized in this constructor nor in any functions that it calls.
At (10): Non-static class member field "layout.fl_object_stripe_unit" is not initialized in this constructor nor in any functions that it calls.
At (12): Non-static class member field "layout.fl_unused" is not initialized in this constructor nor in any functions that it calls.
CID 717206 (#1 of 1): Uninitialized scalar field (UNINIT_CTOR)
At (14): Non-static class member field "layout.fl_pg_pool" is not initialized in this constructor nor in any functions that it calls.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-10-15 14:20:51 -07:00
Sage Weil
d8bb685d84 client: init readdir fields
At (2): Non-static class member "readdir_offset" is not initialized in this constructor nor in any functions that it calls.
At (4): Non-static class member "readdir_end" is not initialized in this constructor nor in any functions that it calls.
At (6): Non-static class member "readdir_num" is not initialized in this constructor nor in any functions that it calls.
CID 717207 (#1 of 1): Uninitialized scalar field (UNINIT_CTOR)
At (8): Non-static class member "tid" is not initialized in this constructor nor in any functions that it calls.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-10-15 14:19:10 -07:00
Gary Lowell
55115642f1 Makefile: Add CRYPTO_C(XX)FLAGS to librbd 2012-10-15 14:14:35 -07:00
Gary Lowell
4c134a42d9 Makefiles: Add ar-lib to .gitignore. 2012-10-15 14:14:35 -07:00
Gary Lowell
f525534b5f autogen.sh: On some platforms, the m4 is created earlier. 2012-10-15 14:14:35 -07:00
Gary Lowell
d28ba52b5f autogen.sh: Create m4 directory for leveldb submodule. 2012-10-15 14:14:35 -07:00
Gary Lowell
0cc828ba22 Makefiles: Update submodule reference to latest for leveldb. 2012-10-15 14:14:35 -07:00
Gary Lowell
3ecd289139 Makefile: update submodule reference for leveldb. 2012-10-15 14:14:35 -07:00
Gary Lowell
0219b66e04 leveldb: fix-up submodule entry. 2012-10-15 14:14:35 -07:00
Gary Lowell
071fdc217a Makefile: Improve test for boost system library. 2012-10-15 14:14:35 -07:00
Gary Lowell
7ea734c472 Makefiles: Missing boost library should not be fatal. 2012-10-15 14:14:34 -07:00
Gary Lowell
151d9403c5 Makefiles: ignore the m4 macro directory 2012-10-15 14:14:34 -07:00
Gary Lowell
3658157b60 Makefile: Updates to eliminate warnings, add test for boost system lib. 2012-10-15 14:14:34 -07:00
Sage Weil
a1d8267c57 cls_rgw: init var in ctor
CID 727992 (#1 of 1): Uninitialized scalar field (UNINIT_CTOR)
At (2): Non-static class member "tag_timeout" is not initialized in this constructor nor in any functions that it calls.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-10-15 14:14:28 -07:00
Yehuda Sadeh
8d7c8e3b86 rgw: don't add port to url if already has one
Fixes: #3296
Specifically, if the host name string already has ':', then
don't try to append the port (swift auth).

backport: argonaut
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-10-15 09:59:14 -07:00
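
The rule in 8d7c8e3b86 amounts to a one-line check, sketched below. `with_default_port` is a hypothetical name, not an rgw function, and the sketch follows the commit text literally: any ':' in the host string is taken to mean a port is already present, so bare IPv6 literals are not handled.

```python
def with_default_port(host, port):
    """Append `port` to `host` unless the host string already contains ':'."""
    if ":" in host:
        return host
    return f"{host}:{port}"


url_host = with_default_port("swift.example.com", 8080)
```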
Tommi Virtanen
662c69e525 ceph-disk-prepare, debian/control: Support external journals.
Previously, ceph-disk-* would only let you use a journal that was a
file inside the OSD data directory. With this, you can do:

  ceph-disk-prepare /dev/sdb /dev/sdb

to put the journal as a second partition on the same disk as the OSD
data (might save some file system overhead), or, more interestingly:

  ceph-disk-prepare /dev/sdb /dev/sdc

which makes it create a new partition on /dev/sdc to use as the
journal. Size of the partition is decided by $osd_journal_size.
/dev/sdc must be a GPT-format disk. Multiple OSDs may share the same
journal disk (using separate partitions); this way, a single fast SSD
can serve as journal for multiple spinning disks.

The second use case currently requires parted, so a Recommends: for
parted has been added to Debian packaging.

Closes: #3078
Closes: #3079
Signed-off-by: Tommi Virtanen <tv@inktank.com>
2012-10-15 09:34:36 -07:00
Chris Dunlop
4db12511f7 logrotate: fix bash syntax
Introduced by 32a6394be0.

Signed-off-by: Chris Dunlop <chris@onthe.net.au>
2012-10-13 22:34:29 -07:00
Sage Weil
251649cdfa doc: remove cephfs warning
Signed-off-by: Sage Weil <sage@inktank.com>
2012-10-13 20:13:35 -07:00
Sage Weil
168bd10e4d doc: fix file system recs
- drop xattr warning; this is not an issue with the leveldb stuff.
- the ext3 vs xattr discussion was somewhat inaccurate.  also, no longer
  relevant.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-10-13 20:12:48 -07:00
Yehuda Sadeh
389fac7a82 rgw: replace bucket creation with explicit pool creation
Following a recent cleanup, usage should create a pool and
not a bucket.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-10-12 14:03:02 -07:00