25617 Commits

Author SHA1 Message Date
Sage Weil
741f468523 mon: fix Monitor::pick_random_mon()
The other arg isn't used, so remove the (broken) handling for that case.
If we re-add it later, model after the MonClient's version.

Fixes: #4821
Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-26 11:12:53 -07:00
Dan Mick
89692e099f debian/rules: use multiline search to look for Build-Depends
When Build-Depends was split into multiple lines (in commit
8f5c665744), the grep for
libgoogle-perftools-dev broke.  Replace grep with perl for multiline
matching.

Fixes: #4818
Signed-off-by: Dan Mick <dan.mick@inktank.com>
2013-04-26 00:05:48 -07:00
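As an illustration of the 89692e099f fix above, here is a minimal Python sketch of why a per-line search stops finding a package once Build-Depends spans several lines. The control-file text, the function names, and the regular expressions are assumptions made for this example; the log does not show the actual grep or perl expressions used in debian/rules.

```python
import re

# Hypothetical control-file excerpt with Build-Depends split over several lines.
CONTROL = """\
Source: ceph
Build-Depends: debhelper (>= 6.0.7~),
 libboost-dev,
 libgoogle-perftools-dev,
 uuid-dev
Standards-Version: 3.9.3
"""


def line_based_match(text, package):
    """Mimic a per-line grep: the field name and the package must share a line."""
    pattern = re.compile(r"^Build-Depends:.*" + re.escape(package))
    return any(pattern.search(line) for line in text.splitlines())


def multiline_match(text, package):
    """Match the whole field, including its indented continuation lines."""
    field = re.search(r"^Build-Depends:.*(?:\n[ \t].*)*", text, re.MULTILINE)
    return field is not None and package in field.group(0)
```

With this sample text, line_based_match() misses libgoogle-perftools-dev because it sits on a continuation line, while multiline_match() finds it.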
Sage Weil
407ce132ba PendingReleaseNotes: these are now in the release-notes.rst
Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-25 11:17:41 -07:00
Sage Weil
c979d65bd2 Merge remote-tracking branch 'gh/next'
2013-04-25 11:17:15 -07:00
Sage Weil
4af93dccae doc/release-notes: add note about sysvinit script change
See cd7e52cc76.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-25 11:17:08 -07:00
Sage Weil
cd7e52cc76 init-ceph: use remote config when starting daemons on remote nodes (-a)
If you use -a to start a remote daemon, assume the remote config is present
instead of pushing the local config.  This makes more sense and simplifies
things.

Note that this means that -a in concert with -c foo means that foo must
also be present on the remote node in the same path.  That, however, is a
use case that I don't particularly care about right now.  :)

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
2013-04-25 11:13:33 -07:00
Sage Weil
ea54e6603f Merge branch 'wip-4748-b' into next
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-04-25 10:21:11 -07:00
David Zafman
f4804849b7 Merge branch 'wip-4778' into next
Reviewed-by: Samuel Just <sam.just@inktank.com>
2013-04-24 17:33:00 -07:00
David Zafman
ac3dda214d scrub clears inconsistent flag set by deep scrub
Add new num_deep_scrub_errors and num_shallow_scrub_errors to object_stat_sum_t
Show deep-scrub error count when outputting regular scrub errors
Set invalid size in case of a stat error which sets read_error
For now do deep-scrub after repair (see #4783)

fixes: #4778
Signed-off-by: David Zafman <david.zafman@inktank.com>
2013-04-24 17:32:39 -07:00
Sage Weil
ba527c1ea2 doc/release-notes: enospc note
Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-24 16:46:02 -07:00
Sage Weil
2075ec601e doc/release-notes: 0.61 cuttlefish notes
Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-24 16:42:07 -07:00
Josh Durgin
4322522028 Merge pull request #242 from ceph/wip-objectcacher-enoent
Reviewed-by: Sage Weil <sage.weil@inktank.com>
2013-04-24 16:20:59 -07:00
Josh Durgin
82d5cd601e ObjectCacher: remove all buffers from a non-existent object
Once we're sure an object doesn't exist, we retry all the waiters in
order, and they return -ENOENT immediately. If there were a bunch of
BufferHeads waiting for data (rx state), they would be left behind
while the reads that triggered them were complete from the cache
user's perspective. These extra rx BufferHeads would pin the object in
the lru, so they wouldn't be removed by release_set(). This meant that
the assert during shutdown of the cache would be triggered.

To fix this, remove any BufferHeads in this state immediately when we
find out the object doesn't exist. Use the same condition as readx for
determining whether this is safe - if we got -ENOENT and all
BufferHeads for the object are clean or rx.

Fixes: #3664
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2013-04-24 15:54:07 -07:00
Samuel Just
14f2392263 FileStore::_split_collection: src or dest may be removed on replay
If the collection is subsequently removed, the _split_collection
might get replayed and find either src or dest removed.

Fixes: #4806
Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-04-24 15:05:41 -07:00
Sage Weil
3604c98232 librados: fix calc_snap_set_diff interval calculation
When calculating the [a,b] interval over which a given clone is valid, do
not assume that b == the clone id; that is *not* true when the original
end snap has been deleted/trimmed.

While we are here, make the code a bit cleaner to read.

Fixes: #4785
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
2013-04-24 14:34:40 -07:00
Samuel Just
5668e5b5a4 Merge remote-tracking branch 'upstream/wip_2476' into next
Fixes: #2476
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-04-24 14:04:07 -07:00
Samuel Just
81a6165c13 PG: call check_recovery_sources in remove_down_peer_info
If we transition out of peering due to affected
prior set, we won't trigger start_peering_interval
and check_recovery_sources won't get called.  This
will leave an entry in missing_loc_sources without
a matching missing set.  We always want to
check_recovery_sources with remove_down_peer_info.

Fixes: #4805
Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-04-24 13:20:44 -07:00
Sage Weil
a9791dae1b mon: send clients away while synchronizing
When we are out of quorum, we waitlist client messages or (eventually)
send them elsewhere.  If we are synchronizing, do the same.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-24 12:26:37 -07:00
Sage Weil
12bc9a7aa9 mkcephfs: give mon. key 'allow *' mon caps
This will ease the transition from mkcephfs to ceph-deploy by allowing
ceph-create-keys to use the mon. keyring file in $mon_data and get the
caps it needs.

Fixes: #4756
Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-24 11:23:25 -07:00
Josh Durgin
cce1c91ae8 PendingReleaseNotes: note about rbd resize --allow-shrink
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2013-04-24 10:16:03 -07:00
Sage Weil
14777ec1b5 Merge remote-tracking branch 'gh/next'
Conflicts:
	ceph.spec.in
2013-04-24 08:51:25 -07:00
leseb
31399d1776 Fix typo of the keystone service-create command
Signed-off-by: leseb <sebastien.han@enovance.com>
2013-04-24 08:49:58 -07:00
Sage Weil
c7a0477bad rbd: fix cli-integration tests for striping change
We don't set the striping feature when we are using backward-compatible
(default) striping now; fix the test accordingly.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-24 08:36:06 -07:00
Gary Lowell
446641aa34 95-ceph-osd-alt.rules: Fix missing parent parameter
Signed-off-by: Gary Lowell  <gary.lowell@inktank.com>
2013-04-24 08:22:04 -07:00
Samuel Just
1f7ff412ab ReplicatedPG: timeout watches based on last_became_active
This way a notify on an object with a single defunct watcher
won't necessarily have to wait the full timeout if the pg
has been active for a while.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-04-23 20:54:57 -07:00
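A minimal sketch of the idea in 1f7ff412ab, assuming the watch expiry is measured from the time the PG last became active rather than from the moment the notify arrives. The function name, its parameters, and the use of seconds are hypothetical and are not taken from ReplicatedPG.

```python
def remaining_watch_wait(now, last_became_active, timeout):
    """Seconds a notify still has to wait for a defunct watcher.

    The deadline is anchored at last_became_active, so a PG that has
    been active for longer than the timeout does not wait at all.
    """
    expires_at = last_became_active + timeout
    return max(0.0, expires_at - now)
```

For example, with a 30 second timeout, a PG that became active 10 seconds ago still waits 20 seconds, while one that became active 100 seconds ago does not wait.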
Samuel Just
a40772bedc osd_types: add last_became_active to pg_stats
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-04-23 20:54:57 -07:00
Samuel Just
d44cfc524f Merge branch 'wip_4552' into next
Fixes: #4552
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-04-23 20:51:37 -07:00
Samuel Just
d196b5badf OSD: don't report peers down if hbclient_messenger is backed up
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-04-23 18:27:28 -07:00
Samuel Just
49eeaeba3f Messenger: add interface to get oldest queued message arrival time
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-04-23 18:27:28 -07:00
Samuel Just
297c6714b3 DispatchQueue: track queued message arrival times and expose oldest
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-04-23 18:27:28 -07:00
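The three commits above (297c6714b3, 49eeaeba3f, d196b5badf) have no message bodies, so the following is only a rough Python sketch of how they might fit together: a queue records when each message arrived, exposes the oldest arrival time, and the caller skips failure reports while the queue is backed up. The class, the function, and the max_queue_age threshold are all assumptions for illustration, not the Ceph interfaces.

```python
import time


class DispatchQueueSketch:
    """Toy queue that remembers the arrival time of each queued message."""

    def __init__(self):
        self._arrivals = {}  # message id -> arrival timestamp

    def enqueue(self, msg_id, arrival=None):
        self._arrivals[msg_id] = time.monotonic() if arrival is None else arrival

    def dequeue(self, msg_id):
        self._arrivals.pop(msg_id, None)

    def oldest_arrival(self):
        """Arrival time of the oldest queued message, or None if empty."""
        return min(self._arrivals.values(), default=None)


def should_report_peers_down(queue, now, max_queue_age):
    """Assumed policy: hold back reports while old messages are still queued."""
    oldest = queue.oldest_arrival()
    return oldest is None or now - oldest <= max_queue_age
```

The reasoning behind such a check is that a missing heartbeat reply may simply be waiting in a congested local queue, in which case the peer is not necessarily down.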
Sage Weil
0cd86dfb64 Merge pull request #237 from ceph/wip-4794
init-ceph: fix (and simplify) pushing ceph.conf to remote unique name
2013-04-23 17:23:32 -07:00
Sage Weil
e09efda7c8 Merge pull request #241 from ceph/wip-4798
#4798

Reviewed-by: Greg Farnum <greg@inktank.com>
2013-04-23 17:17:02 -07:00
Sage Weil
48631c114a mon: revert part of PaxosService::is_readable() change
In 98e23980f4 is_readable() was changed to
call is_active(), but that has a check for is_bootstrapping(), so there is
a semantic change.

As a result, we may fail PaxosService::is_readable() (due to bootstrapping)
and then try to call Paxos::wait_for_readable().  That will assert that
Paxos::is_readable() is false, but it will be true and we will crash.

Revert that part of the change, since the semantic change was not
intentional.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-04-23 17:16:31 -07:00
Sage Weil
0093d704e6 librbd: fix i386 build
Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-23 16:18:53 -07:00
Josh Durgin
5349ee3056 Merge pull request #240 from ceph/wip-4665
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-04-23 16:11:44 -07:00
Sage Weil
857c88e017 librbd: add read_iterate2 call with fixed argument type
The existing read_iterate takes a size_t for the length, which is only 4GB
on 32-bit machines.  Instead, take a uint64_t length for the new
read_iterate2().

Return 0 instead of the number of bytes read; this makes the user-facing
API a bit simpler.

Fixes: #4665
Signed-off-by: Sage Weil <sage@inktank.com>

keep bytes return from internal method
2013-04-23 15:57:26 -07:00
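To make the size_t limitation behind 857c88e017 concrete, this small Python sketch models what happens to a length when it is narrowed to a 32-bit unsigned type versus a 64-bit one. The helper names are hypothetical; the sketch only models integer widths and does not call librbd.

```python
def as_size_t_32(length):
    """Value a length takes when stored in a 32-bit unsigned size_t."""
    return length & ((1 << 32) - 1)


def as_uint64(length):
    """Value a length takes when stored in a uint64_t."""
    return length & ((1 << 64) - 1)


GIB = 1024 ** 3
five_gib = 5 * GIB
```

Here as_size_t_32(five_gib) wraps around to 1 GiB, whereas as_uint64(five_gib) keeps the full 5 GiB, which is why the new call takes a uint64_t length.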
Sage Weil
6c798ed940 librbd: implement read not in terms of read_iterate
The read() method returns the bytes read, trimmed to the end of the image;
use the other read() variant to do this (which uses aio_read()) instead of
read_iterate().

Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-23 15:45:19 -07:00
Sage Weil
95ed73a73b mon: drop forwarded requests after an election
On each election, we resend routed requests to the new leader (or
requeue for ourselves).  Therefore, if we receive a forwarded request,
we should drop it on the floor if there is a new election.  Add a field
in the PaxosServiceMessage struct to track which election epoch we
received the request in, and drop it in PaxosService::dispatch() if
that is in the past.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-23 14:06:41 -07:00
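The check described in 95ed73a73b can be sketched roughly as follows. This is a simplified Python model, not the monitor code: the class names and the rx_election_epoch field name are assumptions, chosen only to show a request being stamped with the election epoch on receipt and dropped at dispatch time if an election has happened since.

```python
from dataclasses import dataclass


@dataclass
class ForwardedRequest:
    payload: str
    rx_election_epoch: int  # hypothetical name: epoch when the request arrived


class ServiceSketch:
    def __init__(self):
        self.election_epoch = 1
        self.handled = []

    def receive(self, payload):
        """Stamp an incoming forwarded request with the current epoch."""
        return ForwardedRequest(payload, self.election_epoch)

    def start_election(self):
        self.election_epoch += 1

    def dispatch(self, request):
        """Handle the request unless it predates the current election."""
        if request.rx_election_epoch < self.election_epoch:
            return False  # stale: the sender resends after the election
        self.handled.append(request.payload)
        return True
```

Dropping the stale copy is safe under the premise stated in the commit message: routed requests are resent to the new leader after each election, so handling the old copy as well would process the request twice.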
Sage Weil
ab25707092 mon: requeue routed_requests for self if elected leader
If we have requests that we have forwarded, and are elected leader,
requeue those requests for ourselves so they are handled normally, and
clear out the routed_requests map.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-23 13:45:59 -07:00
Sage Weil
4b07d6928c mon: track original Connection* for forwarded requests
Keep a reference to the source Connection* for forwarded requests.  This
makes the reply path slightly cleaner, and will help us later.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-23 13:40:27 -07:00
Gregory Farnum
426e3be64e Merge pull request #222 from ceph/wip-3495
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-04-23 12:44:05 -07:00
Samuel Just
8402107c65 test_filejournal: adjust corrupt entry tests to force header write
The journal no longer assumes corruption if it finds a valid entry
after an invalid entry.  Instead, these tests will exercise the
corruption detection via the header committed_up_to member.

Fixes: #4792
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-04-23 12:28:36 -07:00
Sage Weil
ccbc4dbc6e init-ceph: fix (and simplify) pushing ceph.conf to remote unique name
The old code would only do the push once per remote node (due to the
list in $pushed_to) but would reset $unique on each attempt.  This would
break if a remote host was processed twice.

Fix by just skipping the $pushed_to optimization entirely.

Fixes: #4794
Reported-by: Andreas Friedrich <andreas.friedrich@ts.fujitsu.com>
Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-23 10:00:38 -07:00
Gary Lowell
7ad63d23d7 ceph-disk: OSD hotplug fixes for Centos
Two fixes for Centos 6.3 and other systems with udev versions
prior to 172.  The persistent disk name using the GPT UUID does
not exist, so use the by_path persistent name instead for the
journal symlink.

The gpt label fields are not available for use in udev rules. Add
ceph-disk-udev wrapper script that extracts the partition
type guid from the label and calls ceph-disk-activate if it is
a ceph guid type. (Bug #4632)

Signed-off-by: Gary Lowell  <gary.lowell@inktank.com>
2013-04-22 22:30:39 -07:00
John Wilkins
3dd9574bbf doc: Usage requires --num_osds.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-04-22 21:03:15 -07:00
John Wilkins
b71ec9c25a doc: Added some detail. Calculating PGs, maps; reorganized a bit.
fixes: #2968

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-04-22 21:02:45 -07:00
Joao Eduardo Luis
b73ef010bf mon: [MDS]Monitor: remove 'stop_cluster' and 'do_stop()'
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-04-23 00:18:28 +01:00
Joao Eduardo Luis
f42fc0e462 mon: MDSMonitor: tighter leash on cross-proposals to the osdmon
Fixes: #3495

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-04-23 00:18:28 +01:00
Gregory Farnum
2501980350 Merge pull request #234 from ceph/wip-4758
Fixes #4758.

Reviewed-by: Greg Farnum <greg@inktank.com>
2013-04-22 15:22:04 -07:00
Joao Eduardo Luis
fa77e1e732 mon: PaxosService: add request_proposal() to perform cross-proposals
Instead of allowing services to directly use 'propose_pending()' on
other services, we instead add two new functions:

  - request_proposal() to request 'this' service to propose its
    pending value; and
  - request_proposal(PaxosService *other) so that 'this' service
    can request a proposal to 'other'

These functions should allow us to enforce a greater set of
constraints at time of a cross-proposal, either by making sure a
service will (e.g.) hold off its own proposals until said proposal
is performed, or even that the other service will enforce a tighter
set of constraints that wouldn't otherwise be enforced by using
'propose_pending()' directly.

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-04-22 23:20:26 +01:00