Commit Graph

37940 Commits

Author SHA1 Message Date
Yehuda Sadeh
30d0a49c84 rgw: generalize container type for concurrent IO base class
Turned the ConcurrentIO class into a template, so that we can use the
different kinds of containers that are needed for the different operations.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:23 -08:00
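
The idea in the commit above can be sketched roughly as follows. This is a minimal illustration, not the actual rgw code: ConcurrentIOBase, ListShards, StatShards and handle() are hypothetical names, and the real class issues asynchronous requests rather than looping synchronously.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical base class: the container holding the per-shard work items
// is a template parameter, so each operation can pick the container it needs.
template <typename Container>
class ConcurrentIOBase {
public:
  explicit ConcurrentIOBase(Container items) : items_(std::move(items)) {}
  virtual ~ConcurrentIOBase() = default;

  // Walk the container and hand each element to the derived class.
  // Returns the number of elements the derived class accepted.
  std::size_t run() {
    std::size_t handled = 0;
    for (const auto& item : items_) {
      if (handle(item)) ++handled;
    }
    return handled;
  }

protected:
  virtual bool handle(const typename Container::value_type& item) = 0;

private:
  Container items_;
};

// One operation only needs a flat list of shard object names (assumption).
struct ListShards : ConcurrentIOBase<std::vector<std::string>> {
  using Base = ConcurrentIOBase<std::vector<std::string>>;
  using Base::Base;
  bool handle(const std::string& oid) override { return !oid.empty(); }
};

// Another operation wants shard id -> object name (assumption).
struct StatShards : ConcurrentIOBase<std::map<int, std::string>> {
  using Base = ConcurrentIOBase<std::map<int, std::string>>;
  using Base::Base;
  bool handle(const std::pair<const int, std::string>& kv) override {
    return !kv.second.empty();
  }
};
```

The shared iteration logic lives once in the base class; only the element handling and the container type differ per operation.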
Yehuda Sadeh
04441f2878 cls_rgw, rgw: create base class for common bucket shard operations
Instead of copy pasting the same code all over again, create a base
class for the needed concurrent IO operations.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-13 19:21:23 -08:00
Guang Yang
9c5acd67c4 Adjust bucket stats/index checking/index rebuild/tag timeout implementation to work with multiple shards.
Signed-off-by: Guang Yang (yguang@yahoo-inc.com)
2015-01-13 19:21:23 -08:00
Guang Yang
56feee792e Adjust bucket listing to work with multiple shards.
Signed-off-by: Guang Yang (yguang@yahoo-inc.com)
2015-01-13 19:21:23 -08:00
Guang Yang
751fd07bec Adjust rgw bucket prepare/complete OP to work with multiple bucket index shards.
Signed-off-by: Guang Yang (yguang@yahoo-inc.com)
2015-01-13 19:21:22 -08:00
Guang Yang
5d004d3eac Implement sharding for bucket creation.
Signed-off-by: Guang Yang (yguang@yahoo-inc.com)
2015-01-13 19:21:22 -08:00
Guang Yang
90a3920c44 Add a new field to bucket info indicating the number of shards of this bucket and make it configurable.
Signed-off-by: Guang Yang (yguang@yahoo-inc.com)
2015-01-13 19:21:22 -08:00
Josh Durgin
d784bc47c4 Merge pull request #3316 from ceph/wip-10471
rgw: index swift keys appropriately

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2015-01-12 16:20:28 -08:00
Sage Weil
58b5e02890 Merge pull request #3327 from ceph/wip-peeringqueue
osd: fix peering queue bug

Reviewed-by: Samuel Just <sjust@redhat.com>
2015-01-09 21:43:04 -08:00
Yehuda Sadeh
d375532436 rgw: return InvalidAccessKeyId instead of AccessDenied
Fixes: #10334

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
(cherry picked from commit 56af795b10)
2015-01-09 15:13:51 -08:00
Yehuda Sadeh
dd57af2f0a rgw: return SignatureDoesNotMatch instead of AccessDenied
Fixes: #10329

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
(cherry picked from commit ef75d720f289ce2e18c0047380a16b7688864560)
2015-01-09 15:13:43 -08:00
Josh Durgin
3fb080ffa5 Merge pull request #3250 from ceph/wip-10372
osdc/Objecter: improve pool deletion detection

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2015-01-09 14:12:22 -08:00
Yehuda Sadeh
f90c48f271 Revert "rgw: switch to new watch/notify API"
This reverts commit dc67cd69604ec4e4df846b818ec739dc7b09a537.

Conflicts:
	src/rgw/rgw_rados.cc
2015-01-09 14:18:32 -08:00
Sage Weil
20be188d5f osd: assert there is a peering event
This became conditional way back in 12e22b3d44eba51a70d8babebc2684f0c46575a7
for unclear reasons.  It probably predates the in_use checks.  In any case,
at this point, we should only arrive here if the PG was queued, implying
that there will always be an event to process.

Signed-off-by: Sage Weil <sage@redhat.com>
2015-01-08 16:14:56 -08:00
Sage Weil
492ccc900c osd: requeue PG when we skip handling a peering event
If we don't handle the event, we need to put the PG back into the peering
queue or else the event won't get processed until the next event is
queued, at which point we'll be processing events with a delay.

The queue_null is not necessary (and is a waste of effort) because the
event is still in pg->peering_queue and the PG is queued.

Note that this only triggers when we exceed osd_map_max_advance, usually
when there is a lot of peering and recovery activity going on.  A
workaround is to increase that value, but if you exceed osd_map_cache_size
you expose yourself to cache thrashing by the peering work queue, which
can cause serious problems with heavily degraded clusters and bit lots of
people on dumpling.

Backport: giant, firefly
Fixes: #10431
Signed-off-by: Sage Weil <sage@redhat.com>
2015-01-08 16:14:56 -08:00
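
A rough sketch of the requeue behaviour described in the commit above. The PG and PeeringQueue types here are hypothetical stand-ins, not the OSD's real classes, and can_advance stands in for whatever condition makes the OSD skip the event.

```cpp
#include <deque>

// Hypothetical stand-in for a placement group with pending peering events.
struct PG {
  int id;
  std::deque<int> peering_queue;  // pending events for this PG
};

// Hypothetical stand-in for the peering work queue.
struct PeeringQueue {
  std::deque<PG*> queued;

  // Process one PG. If the event cannot be handled right now
  // (can_advance == false), put the PG back on the queue so the event is
  // picked up on a later pass instead of waiting for a new event to arrive.
  // Returns true if an event was consumed.
  bool process_one(bool can_advance) {
    if (queued.empty()) return false;
    PG* pg = queued.front();
    queued.pop_front();
    if (!can_advance) {
      // The event is still in pg->peering_queue; only the PG is requeued.
      queued.push_back(pg);
      return false;
    }
    if (pg->peering_queue.empty()) return false;
    pg->peering_queue.pop_front();
    return true;
  }
};
```

Without the push_back in the skip branch, the PG would leave the queue with an unprocessed event and stay idle until something else queued it again.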
Yehuda Sadeh
478629bd2f rgw: index swift keys appropriately
Fixes: #10471
Backport: firefly, giant

We need to index the swift keys by the full uid:subuser when decoding
the json representation, to keep it in line with how we store it when
creating it through other mechanisms.

Reported-by: hemant burman <hemant.burman@gmail.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2015-01-07 14:20:45 -08:00
Josh Durgin
d5e2ca1620 Merge pull request #3277 from ceph/wip-watch-leak
librados: fix a memory leak in watch

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2015-01-05 12:54:22 -08:00
Sage Weil
3a2e2df79c Merge pull request #3247 from ceph/wip-10422
mon: provide encoded canonical full OSDMap from primary

Reviewed-by: Joao Eduardo Luis <joao@redhat.com>
2015-01-05 11:16:49 -08:00
Sage Weil
5cf84e6393 librados: fix leak of WatchContext on unwatch
The lifecycle matches that of the watch linger_op.  Note that it is NULL
for notify.

Signed-off-by: Sage Weil <sage@redhat.com>
2015-01-05 09:36:43 -08:00
Sage Weil
7d75f0cf76 Makefile: include radosgw-admin in base
Signed-off-by: Sage Weil <sage@redhat.com>
2014-12-29 15:55:24 -08:00
Sage Weil
38350a09bd client: fix quota signed/unsigned warning
client/Client.cc: In member function 'bool Client::is_quota_bytes_exceeded(Inode*, uint64_t)':
client/Client.cc:10393:66: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
     if (quota->max_bytes && (rstat->rbytes + new_bytes) > quota->max_bytes)

Signed-off-by: Sage Weil <sage@redhat.com>
2014-12-29 15:47:28 -08:00
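
One common way to silence a -Wsign-compare warning like the one quoted above is to convert the signed operands explicitly before comparing. The sketch below is an illustration only: the free function and its parameter types (signed limit and usage, unsigned byte count) are assumptions, and it is not necessarily the change this commit made.

```cpp
#include <cstdint>

// Assumed types: signed 64-bit limit and current usage, unsigned new bytes.
bool is_quota_bytes_exceeded(int64_t rbytes, uint64_t new_bytes,
                             int64_t max_bytes) {
  // In this sketch a limit of zero (or less) means "no quota set".
  if (max_bytes <= 0) return false;

  // Convert once, explicitly, so both sides of the comparison are unsigned.
  const uint64_t limit = static_cast<uint64_t>(max_bytes);
  const uint64_t used = rbytes > 0 ? static_cast<uint64_t>(rbytes) : 0;

  // Note: used + new_bytes can wrap for extreme values; ignored here.
  return used + new_bytes > limit;
}
```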
Sage Weil
aa56ee40c0 mon: provide encoded canonical full OSDMap from primary
Currently we make each monitor apply the incremental and encode the full
map locally.  The original motivation was to save bandwidth, but the
savings are minimal to modest and the complexity associated with doing this
is huge.

This strategy also causes problems now that we have OSDMap crc's and old
mons/clusters may have diverging full OSDMaps due to mixed version
clusters.  See #10422

Instead, include the encoded full map in the paxos transaction.  We will
still apply the incremental and check the crc, but if it fails and we have
the correct version, reload it from disk and move on.  If we don't, we
will continue as we have before--the primary mon doesn't have support for
crc's yet.  When it does we will start verifying and/or get our
full map back into sync.

Fixes: #10422
Signed-off-by: Sage Weil <sage@redhat.com>
2014-12-23 17:01:53 -08:00
Sage Weil
d7fd6fccc9 osdc/Objecter: improve pool deletion detection
Currently we can have a race like so:

 - send op in epoch X
 - osd replies
 - pool deleted in epoch X+1
 - client gets X+1, sends map epoch check
 - client gets reply
   -> fails assert that there is no map check in flight

Avoid this situation by inferring that the pool is deleted when we see
that we previously sent the request but the pool is no longer present.
Since pool ids are not reused there is no point in doing a synchronous
map check at all.

Backport: giant
Fixes: #10372
Signed-off-by: Sage Weil <sage@redhat.com>
2014-12-23 15:59:19 -08:00
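
The inference described in the commit above can be sketched as a small decision function. All names here (TargetResult, classify_target, previously_sent) are hypothetical, and the handling of an op that was never sent is an assumption of this sketch rather than something the commit message states.

```cpp
#include <cstdint>
#include <set>

enum class TargetResult { Ok, PoolDeleted, NeedMapCheck };

// pools_in_map:    pool ids present in the client's current map
// previously_sent: the op was already sent once, so the pool existed then
TargetResult classify_target(const std::set<int64_t>& pools_in_map,
                             int64_t pool, bool previously_sent) {
  if (pools_in_map.count(pool) > 0) return TargetResult::Ok;
  // Pool ids are not reused, so a pool we already sent to that is now
  // missing from the map must have been deleted. No map check is needed.
  if (previously_sent) return TargetResult::PoolDeleted;
  // Assumption: for an op that was never sent, our map may simply be too
  // old to contain the pool, so a map check is still the safe answer.
  return TargetResult::NeedMapCheck;
}
```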
Haomai Wang
a540ac3385 librados: only call watch_flush if necessary
Fix bug #10424
Signed-off-by: Haomai Wang <haomaiwang@gmail.com>

(cherry picked from commit 926a1b7468)
2014-12-23 14:24:27 -08:00
Loic Dachary
6b030aa8ba mds: add default ctor for quota_info_t
http://tracker.ceph.com/issues/10400 Fixes: #10400

Signed-off-by: Loic Dachary <ldachary@redhat.com>
(cherry picked from commit 7f1e510165)
2014-12-23 12:52:46 -08:00
Sage Weil
b95c73eed6 librados: warn about rados_watch_flush() prior to ioctx shutdown
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 1fbe9b6f3a)
2014-12-22 17:41:37 -08:00
Sage Weil
93825bf05a librados: watch_flush() on shutdown
Users can easily forget this. It makes shutdown potentially block, but if
they have racing callbacks they get what they ask for.

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 30678f6daf)
2014-12-22 17:41:32 -08:00
Sage Weil
7de1b4d4ff librados: add rados_watch_flush() call
Add a call so that callers can make sure all queued callbacks have
completed before shutting down the ioctx.  This avoids a segv triggered
by the LibRadosWatchNotifyPPTests/LibRadosWatchNotifyPP.WatchNotify2Timeout/1
test due to the ioctx being destroyed when the in-progress callback
does a notify_ack.

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 4ebd4b4280)
2014-12-22 17:41:28 -08:00
Sage Weil
5cf4483cc4 osdc/Objecter: do notify completion callback in fast-dispatch context
The notify completion has exactly one user, the librados caller which
does nothing but take a local (inner) lock and signal a Cond.  Do this
in the fast-dispatch context for simplicity.

Notably, this makes the notify completion (and timeout) trigger a
notify2() return (with ETIMEDOUT) even when the finisher queue that
normally delivers notify is busy, for example with a notify that is
being very slow.  In our case, the unit test does a sleep(3) to
test timeouts, which also prevented the ETIMEDOUT notification from
being delivered to the caller.  This patch resolves that.

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 9b78dafd4a)
2014-12-22 17:41:21 -08:00
Samuel Just
f76e0e4598 Merge pull request #3229 from ceph/wip-osd-ctor
osd/ReplicatedPG: initialize new_backfill in ctor

Reviewed-by: Samuel Just <sjust@redhat.com>
2014-12-22 11:39:16 -08:00
Sage Weil
4047f2ddef Merge pull request #3230 from ceph/wip-pg-stat
mon/PGMap: restructure 'pg stat' formatted output

Reviewed-by: John Spray <jspray@redhat.com>
2014-12-22 06:46:21 -08:00
Sage Weil
7f9c03d1bf mon/PGMap: restructure 'pg stat' formatted output
The + character, which appears in state names, is not a valid XML token.

Signed-off-by: Sage Weil <sage@redhat.com>
2014-12-22 06:41:25 -08:00
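
To illustrate why a state name containing '+' cannot be used as an XML element name, here is a simplified, ASCII-only name check. The function name is hypothetical and the check covers only the ASCII subset of the XML Name production; it is not part of Ceph.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Simplified, ASCII-only test of whether s could serve as an XML element
// name: it must start with a letter, '_' or ':', and continue with those
// characters, digits, '-' or '.'.
bool is_ascii_xml_name(const std::string& s) {
  if (s.empty()) return false;
  auto is_start = [](unsigned char c) {
    return std::isalpha(c) != 0 || c == '_' || c == ':';
  };
  auto is_rest = [&](unsigned char c) {
    return is_start(c) || std::isdigit(c) != 0 || c == '-' || c == '.';
  };
  if (!is_start(static_cast<unsigned char>(s[0]))) return false;
  for (std::size_t i = 1; i < s.size(); ++i) {
    if (!is_rest(static_cast<unsigned char>(s[i]))) return false;
  }
  return true;
}
```

A combined state name such as "active+clean" fails this check, so a formatter has to emit it as a value (element text or attribute) rather than as a tag name.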
Sage Weil
9c8827a5bd osd/ReplicatedPG: initialize new_backfill in ctor
*** CID 1260213:  Uninitialized scalar field  (UNINIT_CTOR)
/osd/ReplicatedPG.cc: 1242 in ReplicatedPG::ReplicatedPG(OSDService *,
std::tr1::shared_ptr<const OSDMap>, const PGPool &, spg_t)()
1236       snap_trimmer_machine(this)
1237     {
1238       missing_loc.set_backend_predicates(
1239         pgbackend->get_is_readable_predicate(),
1240         pgbackend->get_is_recoverable_predicate());
1241       snap_trimmer_machine.initiate();
>>>     CID 1260213:  Uninitialized scalar field  (UNINIT_CTOR)
>>>     Non-static class member "new_backfill" is not initialized in this
constructor nor in any functions that it calls.
1242     }
1243
1244     void ReplicatedPG::get_src_oloc(const object_t& oid, const
object_locator_t& oloc, object_locator_t& src_oloc)
1245     {
1246       src_oloc = oloc;
1247       if (oloc.key.empty())

Signed-off-by: Sage Weil <sage@redhat.com>
2014-12-21 07:27:22 -08:00
Gregory Farnum
243b9e4350 Merge pull request #3121 from ceph/wip-10277
Reviewed-by: Greg Farnum <gfarnum@redhat.com>
2014-12-19 14:45:44 -08:00
Sage Weil
6067462102 Merge remote-tracking branch 'gh/next'
2014-12-19 11:55:58 -08:00
John Spray
61126a2d71 Merge pull request #3199 from ceph/wip-qa-empty-xattr
qa: test zero size xattr

Reviewed-by: John Spray <john.spray@redhat.com>
2014-12-19 17:03:25 +00:00
Sage Weil
0a25bee539 Merge remote-tracking branch 'gh/wip-fs-quota'
Conflicts:
	src/client/Client.cc
2014-12-19 07:45:02 -08:00
Sage Weil
676ce2a58e Merge pull request #3218 from dachary/wip-10383-disable-unittest-msgr
tests: temporarily disable unittest_msgr
2014-12-19 07:42:48 -08:00
Loic Dachary
ecbdbb1629 tests: temporarily disable unittest_msgr
http://tracker.ceph.com/issues/10383 Refs: #10383

Signed-off-by: Loic Dachary <ldachary@redhat.com>
2014-12-19 16:21:50 +01:00
Sage Weil
84ae9fd127 Merge pull request #2976 from ceph/wip-pgmeta
osd: move PG metadata to a per-PG object

Passed rados and upgrade tests.

Reviewed-by: Samuel Just <sjust@redhat.com>
2014-12-19 07:19:05 -08:00
Sage Weil
531ed8a38c Merge pull request #3212 from ceph/wip-10255-wusui
Remove sepia dependency (use fqdn)

Backport: giant
Reviewed-by: Sage Weil <sage@redhat.com>
2014-12-19 07:09:33 -08:00
Sage Weil
c2951131bd Merge pull request #3214 from xinxinsh/wip-cleanup
cleanup : remove sync_epoch

Reviewed-by: Sage Weil <sage@redhat.com>
2014-12-19 07:02:39 -08:00
Jenkins
08bd1e1eee 0.90
2014-12-19 06:56:22 -08:00
Wido den Hollander
49c2322160 doc: Instead of using admin socket, use 'ceph daemon' command.
2014-12-19 15:51:49 +01:00
Loic Dachary
11fa1dfbcb Merge pull request #3193 from nilamdyuti/wip-doc-ceph-disk
Changes format style in ceph-disk to improve readability as html

Reviewed-by: Loic Dachary <ldachary@redhat.com>
2014-12-19 15:36:39 +01:00
Loic Dachary
71df64519d Merge pull request #3210 from ceph/wip-osdmap
osd: only verify OSDMap crc if it is known

Reviewed-by: Loic Dachary <ldachary@redhat.com>
2014-12-19 15:07:56 +01:00
Loic Dachary
8c2c48d8d4 Merge pull request #3216 from cstavr/master
ceph-disk: Fix wrong string formatting

Reviewed-by: Loic Dachary <ldachary@redhat.com>
2014-12-19 14:57:11 +01:00
Christos Stavrakakis
a302c44e9c ceph-disk: Fix wrong string formatting
Signed-off-by: Christos Stavrakakis <stavr.chris@gmail.com>
2014-12-19 13:46:44 +02:00
xinxin shu
2f63e54f0f cleanup : remove sync_epoch
Signed-off-by: xinxin shu <xinxin.shu@intel.com>
2014-12-19 09:19:26 +08:00
Warren Usui
19dafe1648 Remove sepia dependency (use fqdn)
Fixes: #10255
Signed-off-by: Warren Usui <warren.usui@inktank.com>
2014-12-18 17:16:24 -08:00