Commit Graph

25360 Commits

Author SHA1 Message Date
Sage Weil
39689fea49 librbd: fix diff_iterate arithmetic for non-standard striping
This code is confusing because we are moving back and forth between
image offsets, "buffer" offsets (image offsets relative to off), and
object offsets.  Fix the math.
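
The three offset spaces named above can be sketched as follows. This is an illustration only, not the librbd code: the struct and function names are hypothetical, and the mapping assumes the usual round-robin striping scheme in which stripe units are dealt across `stripe_count` objects and `object_size` is a multiple of `stripe_unit`.

```cpp
#include <cstdint>

// Illustrative layout parameters; the names are assumptions, not librbd's.
struct StripeLayout {
  uint64_t stripe_unit;   // bytes per stripe unit
  uint64_t stripe_count;  // number of objects a stripe is spread across
  uint64_t object_size;   // bytes per object, a multiple of stripe_unit
};

struct ObjectPos {
  uint64_t object_no;   // which object holds the byte
  uint64_t object_off;  // offset of the byte inside that object
};

// "Buffer" offset: an image offset expressed relative to the start (off)
// of the range being iterated.
uint64_t image_to_buffer(uint64_t image_off, uint64_t off) {
  return image_off - off;
}

// Map an image offset to (object number, object offset) under the assumed
// round-robin striping scheme.
ObjectPos image_to_object(const StripeLayout& l, uint64_t image_off) {
  uint64_t units_per_object = l.object_size / l.stripe_unit;
  uint64_t block_no = image_off / l.stripe_unit;     // stripe unit index
  uint64_t stripe_no = block_no / l.stripe_count;    // row of stripe units
  uint64_t stripe_pos = block_no % l.stripe_count;   // column within the row
  uint64_t object_set = stripe_no / units_per_object;
  ObjectPos p;
  p.object_no = object_set * l.stripe_count + stripe_pos;
  p.object_off = (stripe_no % units_per_object) * l.stripe_unit
                 + image_off % l.stripe_unit;
  return p;
}
```

With `stripe_unit` equal to `object_size` and `stripe_count` of 1, the mapping reduces to a plain division and remainder, which is why arithmetic that is correct for the default layout can still be wrong for non-standard striping.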

Signed-off-by: Sage Weil <sage@inktank.com>
2013-03-31 23:32:41 -07:00
Sage Weil
f2b0880a89 qa: rbd/diff_continuous.sh: base test off a clone
Get a bit of coverage on clones by starting with a clone.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-03-31 23:32:41 -07:00
Sage Weil
fc3f4fda76 rbd: implement simple 'diff' command
Report extents allocated/changed, and whether they contain data or zeros.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-03-31 23:32:41 -07:00
Sage Weil
4d02e17f36 librbd: handle diff from clone
If we have a parent image, and the reference is from snap 0 (beginning of
time) we need to look at the diff on the parent from the beginning of time
and report that when we get an ENOENT.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-03-31 23:32:41 -07:00
Sage Weil
186ddda58c rbd: send import debug noise to dout, not stdout
Signed-off-by: Sage Weil <sage@inktank.com>
2013-03-31 23:32:41 -07:00
Sage Weil
58c2dedded qa: add rbd/diff_continuous.sh stress test
Stress test that does io on an image while we are mirroring a diff from
earlier snaps to a second copy.  At the end, verify that all snaps have
matching content.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-03-31 23:32:40 -07:00
Sage Weil
e7167433ae rbd: implement 'export-diff' and 'import-diff' commands
Export a diff of an image from a previous snapshot to a file (or stdout).

Import a diff and apply it to an image, and then create the ending
snapshot.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-03-31 23:32:40 -07:00
Sage Weil
cf7d13a7e9 rbd: add --io-pattern <seq|rand> option to bench-write
Write to random offsets instead of sequentially.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-03-31 23:32:40 -07:00
Sage Weil
0296c7cdae librbd: implement diff_iterate
Implement a diff_iterate() method that will iterate over an image and
report which extents vary between two snapshots (or a snapshot and the
head).  The callback gets an extent and a flag indicating whether it is
full of data or is known to be zero in the ending snapshot.
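
A callback of the kind described might look like the sketch below. The parameter list (offset, length, an "exists" flag, and an opaque pointer) is an assumption modeled on the description above, and `DiffTotals` and `tally_extent` are hypothetical names; the commit itself does not show the signature.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical accumulator handed to the callback through the opaque pointer.
struct DiffTotals {
  uint64_t data_bytes = 0;  // bytes in extents reported as containing data
  uint64_t zero_bytes = 0;  // bytes in extents reported as known zero
};

// Assumed callback shape: one call per changed extent. A non-zero 'exists'
// means the extent holds data in the ending snapshot; zero means it is
// known to be zero there. Returning 0 lets the iteration continue.
int tally_extent(uint64_t offset, size_t length, int exists, void* arg) {
  (void)offset;  // the position is not needed for a simple byte tally
  DiffTotals* totals = static_cast<DiffTotals*>(arg);
  if (exists)
    totals->data_bytes += length;
  else
    totals->zero_bytes += length;
  return 0;
}
```

A caller would pass `tally_extent` and a pointer to a `DiffTotals` instance to the iteration routine, then read the two counters once it returns.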

Signed-off-by: Sage Weil <sage@inktank.com>
2013-03-31 23:32:40 -07:00
Sage Weil
6db5109127 librados: expose snapset seq via list_snaps
Signed-off-by: Sage Weil <sage@inktank.com>
2013-03-31 23:32:40 -07:00
Sage Weil
ebed000014 osdc/Objecter: prval optional for listsnaps
Signed-off-by: Sage Weil <sage@inktank.com>
2013-03-31 23:32:40 -07:00
Sage Weil
4ae977435c osd: fix error codes for list-snaps
Signed-off-by: Sage Weil <sage@inktank.com>
2013-03-31 23:32:40 -07:00
Sage Weil
941cfc26a0 osd: fix clone snap list for list-snaps
We need to return the list of snaps that each clone is defined for, not
the list of snaps we know may or may not exist globally over a similar
interval.  This requires looking at the clone's obc, unfortunately.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-03-31 23:32:40 -07:00
Sage Weil
ceee218fd5 osd: wait for all clones on SNAPDIR requests
Wait for all clones to be present.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-03-31 23:32:40 -07:00
Sage Weil
556d33442f osd: direct reads on SNAPDIR to either head or snapdir
The list_snaps operation needs to look at the SnapSet, and is logically
querying all revisions of the object.  Make requests to SNAPDIR be
read-only, and grab the head or snapdir obc transparently (whichever one
exists).  This allows us to list snaps when, say, the head does not
exist, but there are in fact snaps.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-03-31 23:32:40 -07:00
Sage Weil
b64bb5f860 osd: do not include snaps with head on list_snaps()
If there is a sequence of snaps 1, 2, 3, 4, 5, and we have a clone
2 with [1,2], and the head reflects content at snap times [3,4,5], then
the snap_list should return

 clone 2 snaps [1,2]
 head snaps
 seq 2

because it never saw a write after snap 2, and therefore has the same
content currently as it did in snaps 3,4,5.  If the SnapSet on the
object lists snaps 3,4,5, and the head exists, it actually means the
object was deleted between 2 and 3, and was recreated after 5:

 clone 2 snaps [1,2]
 head snaps []
 seq 5

The key to telling the two situations apart is the seq number on the
SnapSet (now included in the list_snaps reply) that tells us when the
last update was.
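
The two situations can be told apart with a comparison along these lines. This is a toy model of the two examples above only, not the OSD or librados logic: the struct, enum, and function names are hypothetical, and the rule "seq newer than the newest clone's snaps means the head was recreated" is an assumption drawn from those examples.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified view of a list_snaps reply.
struct SnapListSketch {
  std::vector<uint64_t> newest_clone_snaps;  // e.g. {1, 2} for "clone 2 snaps [1,2]"
  bool head_exists;
  uint64_t seq;  // SnapSet seq: the snap seq of the last update to the object
};

enum HeadState {
  HEAD_MISSING,
  HEAD_UNCHANGED_SINCE_CLONE,  // no update after the newest clone's snaps
  HEAD_RECREATED               // an update happened after later snaps
};

HeadState classify_head(const SnapListSketch& r) {
  if (!r.head_exists)
    return HEAD_MISSING;
  uint64_t newest = 0;
  for (uint64_t s : r.newest_clone_snaps)
    if (s > newest)
      newest = s;
  // seq == 2 with clone snaps [1,2]: the head still reflects snaps 3,4,5.
  // seq == 5 with clone snaps [1,2]: deleted after 2, recreated after 5.
  return r.seq > newest ? HEAD_RECREATED : HEAD_UNCHANGED_SINCE_CLONE;
}
```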

Signed-off-by: Sage Weil <sage@inktank.com>
2013-03-31 23:32:40 -07:00
Sage Weil
01b74209fb osd: clean up some whitespace
Signed-off-by: Sage Weil <sage@inktank.com>
2013-03-31 23:32:40 -07:00
Sage Weil
0074228911 osd: include SnapSet seq in the list snaps response
It is important to know the latest seq that the object has seen in order
to tell if a response like

 clone 2 snaps=[1,2]
 clone head snaps=[]

was untouched before a hypothetical snap 3, or deleted prior to snap 3,
and then recreated+modified after.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-03-31 23:32:40 -07:00
Sage Weil
fa5206ce9b osd: make LIST_WATCHERS and LIST_SNAPS print nicely for OSDOp
Signed-off-by: Sage Weil <sage@inktank.com>
2013-03-31 23:32:40 -07:00
Sage Weil
849a45c67f strings: add 'list-watchers' to MOSDOp strings
Signed-off-by: Sage Weil <sage@inktank.com>
2013-03-31 23:32:40 -07:00
Sage Weil
945ead5a81 Merge remote-tracking branch 'gh/wip-cors-rebased'
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
2013-03-31 23:23:47 -07:00
Sage Weil
a2956f6f8e rgw: fix warning
On a 64-bit arch, we still want to make sure it's a 32-bit value.  Gcc is
too smart for us to just cast; it will still warn on 32-bit arch that the
comparison is always true.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-03-31 21:51:59 -07:00
Yehuda Sadeh
01779df17d rgw: add missing include file
Add missing limits.h, needed for ULONG_MAX.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2013-03-31 21:51:59 -07:00
Yehuda Sadeh
3c52b8bbf5 Makefile.am: change some cors rules
The cors unittest should be a standalone test (not part of the make
unittests) as it requires having a running gateway and needs input params
to run correctly.
Also update missing header files.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2013-03-31 21:51:59 -07:00
Yehuda Sadeh
e1a78f9827 rgw: fix a few warnings
Adjust data types

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2013-03-31 21:51:59 -07:00
Babu Shanmugam
d4b22f3e17 rgw: more cors fixes
Remove the check for read_cors_config in rgw_main.cc, and change the type of 'a' from long to unsigned, as max_age cannot be a negative integer

Modified the type of 'a' to unsigned long and used ULONG_MAX and strtol in rgw_cors_swift.h

Signed-off-by: Babu Shanmugam <anbu@enovance.com>
2013-03-31 21:51:59 -07:00
Yehuda Sadeh
e9e86ad14a rgw: cors, style fixes, other fixes
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2013-03-31 21:51:59 -07:00
Babu Shanmugam
f165049cba rgw: with CORS support
With CORS test cases

1. Added license headers to the cors files
2. Swift POST metadata for cors will replace the old cors configuration
3. Fixed a bug in rgw_cors_swift.h

With Yehuda's review comments along with some fixes;
1. If the origin is allowed only for https, we should not approve the same host for http requests
2. Accounted for hostname situations like www.www.org, or www.wowwww.com or www.*
3. Replaced atoi with strtol
4. Have a centralized place for parsing host names, hence avoiding duplicates
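
The point of replacing atoi with strtol is that strtol lets the caller detect bad input. A minimal sketch of such a check, assuming a max-age value that must be a non-negative decimal integer; `parse_max_age` is a hypothetical name and this is not the code in rgw_cors_swift.h:

```cpp
#include <cerrno>
#include <cstdlib>

// Hypothetical helper: parse a non-negative decimal max-age value.
// Returns false on empty input, trailing garbage, overflow, or a
// negative number; atoi cannot report any of these.
bool parse_max_age(const char* s, unsigned long* out) {
  errno = 0;
  char* end = nullptr;
  long v = std::strtol(s, &end, 10);
  if (end == s || *end != '\0')
    return false;  // no digits consumed, or extra characters after them
  if (errno == ERANGE || v < 0)
    return false;  // out of range for long, or negative
  *out = static_cast<unsigned long>(v);
  return true;
}
```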

Checked certain scenarios with amazon S3 and made changes accordingly

With some fixes in rgw_cors.cc and str_list.cc

Removing the whitespace auto-append to the delimiters in get_str_list(), added white spaces delimiters in is_string_in_set()
2013-03-31 21:51:48 -07:00
Sage Weil
c01e2e42f3 client: do sync read when 'client oc = false'
Signed-off-by: Sage Weil <sage@inktank.com>
2013-03-31 21:44:00 -07:00
Sage Weil
74c708367b client: fix use-after-free on session close and cond signals
Move the signal into the closed method, before we deallocate the
MetaSession, so that other callers catch it too.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-03-31 21:43:57 -07:00
Yan, Zheng
4ad35b2a83 mds: mark connection down when MDS fails
So if the MDS restarts and uses the same address, it does not get
old messages.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-03-31 16:57:14 +08:00
Yan, Zheng
fbcc64dffd mds: fix MDCache::adjust_bounded_subtree_auth()
There are cases that need to both create a new bound and swallow an
intervening subtree. For example: an MDS exports subtree A with bound B and
imports subtree B with bound C at the same time. The MDS crashes; exporting
subtree A fails, but importing subtree B succeeds. During recovery, the
MDS may create new bound C and swallow subtree B.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-03-31 16:57:14 +08:00
Yan, Zheng
573a4ae1a2 mds: process finished contexts in batch
If there are several unstable locks in an inode, the current Locker::eval(CInode*,)
processes each lock's finished contexts separately. This may cause a very deep
call stack if finished contexts also call Locker::eval() on the same inode.
An extreme example is:

Locker::eval() wakes an open request. Server::handle_client_open() starts
a log entry, then calls Locker::issue_new_caps(). Locker::issue_new_caps()
calls Locker::eval() and wakes another request. The latter request also tries
starting a log entry.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-03-31 16:57:14 +08:00
Yan, Zheng
5cbaae6648 mds: preserve subtree bounds until slave commit
When replaying an operation that renames a directory inode to a non-auth subtree,
if the inode has subtree bounds, we should prevent them from being trimmed
until slave commit.

This patch also fixes a bug in ESlaveUpdate::replay(). EMetaBlob::replay()
should be called before MDCache::finish_uncommitted_slave_update().

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-03-31 16:57:14 +08:00
Sage Weil
ce8793ce3b Merge pull request #175 from dachary/wip-4594
fix null character in object name triggering segfault

Reviewed-by: Sage Weil <sage@inktank.com>
2013-03-30 18:22:01 -07:00
Loic Dachary
c344ff170d fix null character in object name triggering segfault
Parsing \n in lfn_parse_object_name is implemented with

  out->append('\0');

which segfaults when using libstdc++ and g++ version 4.6.3 on Debian
GNU/Linux. It is replaced with

  (*out) += '\0';

to avoid the buggy implicit conversion. There is no append(charT)
method in C++98 or C++11, which means it relies on an implicit
conversion that is buggy. It would be better to rely on the
basic_string& operator+=(charT c); method as defined in ISO 14882-1998
(page 385) through ISO 14882-2012 (page 640)
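
A small sketch of the well-defined ways to append an embedded NUL to a std::string. The first function uses the operator+= form the commit adopts; the second, using the (count, char) overload of append, is added here for comparison and is not part of the commit. Both function names are hypothetical. One plausible reading of the crash is that append('\0') selected the const char* overload with a null pointer, but the commit message does not state this.

```cpp
#include <string>

// Append a single NUL character using operator+=(char).
void append_nul(std::string* out) {
  (*out) += '\0';
}

// Same effect with append(size_type count, char ch): one copy of '\0'.
void append_nul_counted(std::string* out) {
  out->append(1, '\0');
}
```

Because std::string stores its length explicitly, the embedded NUL counts toward size() and does not terminate the string.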

A set of tests is added to generate and parse object names. They need
access to the private function lfn_parse_object_name because there is
no convenient protected method to exercise it. The tests contain a
LFNIndex derived class, TestWrapLFNIndex which is made a friend of
LFNIndex to gain access to the private methods.

http://tracker.ceph.com/issues/4594 refs #4594

Signed-off-by: Loic Dachary <loic@dachary.org>
2013-03-30 14:28:34 +01:00
Sage Weil
2b8eb31b85 Merge branch 'wip-4490'
2013-03-29 18:02:15 -07:00
Sage Weil
e611937f3e mon: OSDMonitor: add 'osd pool set-quota' command
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-03-29 17:59:35 -07:00
John Wilkins
95328089b8 doc: Added entries for Pool, PG, & CRUSH. Moved heartbeat link.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-03-29 17:38:48 -07:00
John Wilkins
bcc5c65305 doc: Added heartbeat configuration settings.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-03-29 17:38:02 -07:00
John Wilkins
6157d68369 doc: Moved PG info to separate page. Moved heartbeat to mon-osd doc.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-03-29 17:36:23 -07:00
John Wilkins
ca77aabbf1 doc: Rewrote monitor configuration section.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-03-29 17:34:45 -07:00
John Wilkins
ea3c833d0f doc: Moved to separate section for parallelism.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-03-29 17:32:47 -07:00
John Wilkins
ba73b8301a doc: Cleanup.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-03-29 17:32:00 -07:00
Sage Weil
e9b3f2e6e9 ceph-disk list: say 'unknown cluster $UUID' when cluster is unknown
This makes it clearer that an old osd is in fact old.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-03-29 17:30:28 -07:00
Greg Farnum
9e7ddf677f config_opts: fix rgw_port comments to be plaintext
Signed-off-by: Greg Farnum <greg@inktank.com>
2013-03-29 17:05:41 -07:00
Samuel Just
3da3129e07 ReplicatedPG: check for full if delta_stats.num_bytes > 0
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-03-29 16:47:29 -07:00
Joao Eduardo Luis
9b09073259 mon: Monitor: check if 'pss' arg is !NULL on parse_pos_long()
We already do it all throughout the function, but this one place didn't.

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-03-29 16:47:29 -07:00
Joao Eduardo Luis
e2a936d2ae common: util: add 'unit_to_bytesize()' function
Converts a numerical value that may or may not contain a unit
modifier ('1024', '1K', '2M', ..., '1E') and returns the parsed size
in bytes.
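
A conversion of this kind could be sketched as below. This is not the Ceph implementation: `parse_size_sketch` is a hypothetical name, and the sketch assumes binary multipliers (K = 2^10 up to E = 2^60), at most one suffix character, and rejection of values that overflow 64 bits.

```cpp
#include <cctype>
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <limits>
#include <string>

// Hypothetical helper: parse "1024", "1K", "2M", ..., "1E" into bytes.
// Returns false on malformed input or on overflow.
bool parse_size_sketch(const std::string& s, uint64_t* out) {
  if (s.empty() || !std::isdigit(static_cast<unsigned char>(s[0])))
    return false;  // must start with a digit (no sign, no whitespace)
  errno = 0;
  char* end = nullptr;
  unsigned long long value = std::strtoull(s.c_str(), &end, 10);
  if (errno == ERANGE)
    return false;
  std::string suffix(end);
  if (suffix.size() > 1)
    return false;  // at most one unit character is accepted
  int shift = 0;
  if (suffix.size() == 1) {
    switch (std::toupper(static_cast<unsigned char>(suffix[0]))) {
      case 'K': shift = 10; break;
      case 'M': shift = 20; break;
      case 'G': shift = 30; break;
      case 'T': shift = 40; break;
      case 'P': shift = 50; break;
      case 'E': shift = 60; break;
      default: return false;
    }
  }
  // Reject values that would not fit in 64 bits after applying the unit.
  if (value > (std::numeric_limits<uint64_t>::max() >> shift))
    return false;
  *out = static_cast<uint64_t>(value) << shift;
  return true;
}
```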

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-03-29 16:47:28 -07:00
Joao Eduardo Luis
23c2fa7fc2 osd: osd_types: add pool quota related fields
2013-03-29 16:03:21 -07:00