Commit Graph

25691 Commits

Author SHA1 Message Date
Alexandre Marangone
56619ab917 Fix journal partition creation
When an OSD shares one device for data and journal, the previous code
created the journal partition from the end of the device. sgdisk uses
a uint32_t to get the last sector, and with large disks a uint32_t
is too small.
The journal partition is then created backwards from a sector in
the middle of the disk, leaving space before and after it. The data
partition uses whichever of these spaces is greater; the remainder
is not used.

This patch creates the journal partition from the start as a workaround.

Signed-off-by: Alexandre Marangone <alexandre.marangone@inktank.com>
2013-04-19 15:11:09 -07:00
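
The overflow described in the journal partition commit above can be checked with a little arithmetic. The following is only an illustrative sketch, not code from sgdisk or Ceph: it assumes 512-byte sectors, a hypothetical 3 TB drive, and that the oversized sector number wraps modulo 2**32 (the commit says only that uint32_t is too small). The helper names are made up for the example.

```python
SECTOR_SIZE = 512          # bytes; assumed logical sector size
UINT32_MAX = 2**32 - 1     # largest value a uint32_t can hold


def last_sector(disk_bytes: int, sector_size: int = SECTOR_SIZE) -> int:
    """Return the index of the last sector on a disk of the given size."""
    return disk_bytes // sector_size - 1


def as_uint32(value: int) -> int:
    """Emulate storing a value in a uint32_t (assumed to wrap modulo 2**32)."""
    return value & UINT32_MAX


disk_bytes = 3 * 10**12                 # hypothetical 3 TB drive
true_last = last_sector(disk_bytes)     # does not fit in 32 bits
wrapped_last = as_uint32(true_last)     # lands partway through the disk

print(f"true last sector:    {true_last}")
print(f"as uint32_t:         {wrapped_last}")
print(f"fraction of disk:    {wrapped_last / true_last:.2f}")
```

Under these assumptions, any device larger than 2 TiB has a last sector that no longer fits, which is consistent with the journal ending up in the middle of the disk.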
Sage Weil
fe9d326099 rbd: fix qa tests to use --allow-shrink
Fixes: #4763
Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-19 14:37:21 -07:00
Gregory Farnum
f114fdc40a Merge pull request #227 from ceph/wip-4574
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-04-19 14:28:18 -07:00
Sage Weil
d395aa521e init-ceph: do not stop start on first failure
When starting we often loop over many daemon instances.  Currently we stop
on the first error and do not try to start other daemons.

Instead, try them all, but return a failure if anything did not start.

Fixes: #2545
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Gary Lowell <gary.lowell@inktank.com>
2013-04-19 13:05:43 -07:00
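
The behavior described in the init-ceph commit above (try every daemon, then report failure if any did not start) can be sketched as follows. This is not the init-ceph script itself, which is a shell script; `start_all` and the `start` callback are hypothetical names used only for illustration.

```python
from typing import Callable, Iterable


def start_all(daemons: Iterable[str], start: Callable[[str], bool]) -> int:
    """Try to start every daemon; return 0 if all started, 1 otherwise."""
    failed = []
    for name in daemons:
        # Do not bail out on the first error: remember it and keep going.
        if not start(name):
            failed.append(name)
    if failed:
        print("failed to start: " + ", ".join(failed))
        return 1
    return 0
```

The point of the design is that one broken daemon no longer prevents the remaining daemons on the host from starting, while the caller still sees a non-zero status.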
Joao Eduardo Luis
9a7d1f5197 mon: Monitor: fix timechecks get_health clobbering overall status
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-04-19 12:26:13 -07:00
Sage Weil
aa0d5f39d6 mon: fix health monitor calls
- unconditionally call get_health, regardless of formatter *
- return a meaningful health status code

Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-19 12:16:11 -07:00
Sage Weil
be4807f5b8 global: call observers (and start logging) in global_init
Call observers so that the logging infrastructure gets initialized and we
start logging.  Otherwise, unless a default log setting has been modified,
we won't start logging until we daemonize, and we won't get the nice
version banner in the log file.

Unlike the previous attempt to fix this (a3091774), we do this after all
of the lockdep initialization has completed.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-19 12:03:43 -07:00
David Zafman
76505c28de osd: Create new static function PG::_write_info() for use by PG import
Signed-off-by: David Zafman <david.zafman@inktank.com>
2013-04-19 11:29:18 -07:00
David Zafman
52d8240adf osd: Add OSD::make_infos_oid() as common function to create oid
Signed-off-by: David Zafman <david.zafman@inktank.com>
2013-04-19 11:29:18 -07:00
David Zafman
5ffb3ef4c2 filestore, osd: Fixes to conform to programming guidelines
Signed-off-by: David Zafman <david.zafman@inktank.com>
2013-04-19 11:29:17 -07:00
Joao Eduardo Luis
fa89cfd2e4 mon: QuorumService: return health status on get_health()
This allows us to return the appropriate overall health status on
Monitor::get_health().

Fixes: #4574

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-04-19 19:26:51 +01:00
Samuel Just
78c9db88cd OpRequest: don't maintain history if the OSD is shutting down
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-04-19 11:06:04 -07:00
Samuel Just
1493e7dbfb osd/: optionally track every pg ref
This involves three pieces:

For intrusive_ptr type references, we use TrackedIntPtr instead.  This
uses get_with_id and put_with_id to associate an id and backtrace with
each particular ref instance.

For refs taken via direct calls to get() and put(), get and put now
require a tag string.  The PG tracks individual ref counts for each tag
as well as the total.

Finally, PGs register/unregister themselves on construction/destruction
with OSDService.

As a result, on shutdown, we can check for live pgs and determine where
the references are held.

This behavior is compiled out by default, but can be included with the
--enable-pgrefdebugging flag.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-04-19 11:05:58 -07:00
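
The second piece of the PG ref tracking commit above, tag strings on get() and put() with a per-tag count kept alongside the total, can be sketched as below. This is a simplified illustration, not the Ceph implementation: the class name `TaggedRefCounted` and the `live_refs` method are hypothetical, and the id/backtrace tracking done by TrackedIntPtr and the registration with OSDService are not modeled.

```python
from collections import Counter


class TaggedRefCounted:
    """Reference count that also records which tag holds each reference."""

    def __init__(self) -> None:
        self.total = 0
        self._by_tag = Counter()

    def get(self, tag: str) -> None:
        """Take a reference on behalf of the given tag."""
        self.total += 1
        self._by_tag[tag] += 1

    def put(self, tag: str) -> bool:
        """Drop a reference held by tag; return True when none remain."""
        if self._by_tag[tag] <= 0:
            raise ValueError(f"put() without matching get() for tag {tag!r}")
        self._by_tag[tag] -= 1
        self.total -= 1
        return self.total == 0

    def live_refs(self) -> dict:
        """Return the tags that still hold references, with their counts."""
        return {tag: n for tag, n in self._by_tag.items() if n > 0}


pg = TaggedRefCounted()
pg.get("scrub")
pg.get("scrub")
pg.get("recovery")
pg.put("scrub")
print(pg.live_refs())
```

At shutdown, a non-empty `live_refs()` result would point at the code paths that still hold the object, which is the diagnostic the commit message describes.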
Samuel Just
66c007fb3b common/: add tracked_int_ptr.hpp
TrackedIntPtr acts like intrusive_ptr, but is able to
track a ref id.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-04-19 11:00:22 -07:00
Samuel Just
ec6f71bd02 ReplicatedPG: use the ReplicatedPGRef typedef
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-04-19 11:00:22 -07:00
Samuel Just
4090eff8a6 ReplicatedPG: use ReplicatedPGRef for C_PG_MarkUnfoundLost
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-04-19 11:00:22 -07:00
Samuel Just
f03ba5a298 ReplicatedPG: use ReplicatedPGRef for C_OSD_OpCommit
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-04-19 11:00:22 -07:00
Samuel Just
8fe1b9d5a3 ReplicatedPG: use ReplicatedPGRef for C_OSD_OpApplied
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-04-19 11:00:22 -07:00
Samuel Just
c04c3e59ec OSD: use PGRef in handle_pg_remove
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-04-19 11:00:21 -07:00
Samuel Just
1c2b66cf02 OSD: use PGRef in handle_pg_stats_ack
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-04-19 11:00:21 -07:00
Samuel Just
c2127a1126 PG: use PGRef in QueuePeeringEvt
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-04-19 11:00:21 -07:00
Samuel Just
0b7795acda OSD: use PGRef in consume_map
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-04-19 11:00:21 -07:00
Samuel Just
f45a541365 PG: use PGRef for FlushState
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-04-19 11:00:21 -07:00
Samuel Just
2f9a35ac3d PG: use PGRef for C_PG_FinishRecovery
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-04-19 11:00:21 -07:00
Samuel Just
8bd89e12f8 PG: use PGRef in C_PG_ActivateCommitted
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-04-19 11:00:21 -07:00
Samuel Just
ce64775367 PG: do not put() in scrub() if pg is deleting
scrub() no longer handles the put; this call
must have been missed.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-04-19 11:00:21 -07:00
Samuel Just
b021036bde PG,ReplicatedPG: move intrusive_ptr declarations to top
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-04-19 11:00:21 -07:00
Samuel Just
220c65127d ReplicatedPG: add ReplicatedPGRef
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-04-19 11:00:21 -07:00
Samuel Just
016e975ab9 FileStore::_do_copy_range: read(2) might return EINTR
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-04-19 11:00:20 -07:00
Samuel Just
07a80ee35e FileStore::_do_clone_range: _do_copy_range encodes error in return, not errno
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-04-19 11:00:20 -07:00
Sage Weil
af5a9b37f2 Merge pull request #224 from ceph/wip-mon-crush
Wip mon crush

Reviewed-by: Dan Mick <dan.mick@inktank.com>
2013-04-19 10:20:18 -07:00
Sage Weil
5e4b8bc442 config: clarify 'mon osd down out subtree limit'
Clarify the description; this is the subtree type that we won't mark out
if it is all down, but anything less than it will be.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-19 09:30:16 -07:00
John Wilkins
cd2cabecf1 doc: Trimmed toc depth for nicer visual appearance.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-04-18 14:23:47 -07:00
John Wilkins
44aa696b44 doc: Added new PG troubleshooting use case.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-04-18 14:08:43 -07:00
John Wilkins
2e3579edd4 doc: Updated title.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-04-18 14:08:10 -07:00
John Wilkins
304a2343a7 doc: Added PG troubleshooting to toctree.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-04-18 14:07:56 -07:00
John Wilkins
d5139ba1ff doc: Bifurcating OSD and PG Troubleshooting. Updated hyperlink.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-04-18 13:30:50 -07:00
John Wilkins
3b8057ac93 doc: Bifurcating OSD and PG Troubleshooting. Added PG troubleshooting doc.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-04-18 13:30:05 -07:00
John Wilkins
3c4bf83cf8 doc: Bifurcating OSD and PG Troubleshooting. Removed PG section.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-04-18 13:29:16 -07:00
Sage Weil
b0c1001a5e mon: ensure 'osd crush rule ...' commands are idempotent
Ensure that we return 0 for these cases.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-18 11:20:20 -07:00
Sage Weil
0d46dc4646 mon: make 'osd crush link ...' idempotent
We fixed move in f5ba0fbbe7 but missed this
one.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-04-18 11:20:20 -07:00
caleb miles
5f1898d9c9 rgw_bucket: Fix dump_index_check.
Signed-off-by: caleb miles <caleb.miles@inktank.com>
2013-04-18 14:09:17 -04:00
Greg Farnum
efbe2e8b55 Merge branch 'wip-max_size-3637' into next
Reviewed-by: Sage Weil <sage@inktank.com>
2013-04-18 10:39:03 -07:00
Kuan Kai Chiu
87634d882f mds: journal the projected root xattrs in add_root()
In EMetaBlob::add_root(), we should log the projected root xattrs
instead of original ones to reflect xattr changes.

Signed-off-by: Kuan Kai Chiu <big.chiu@bigtera.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-04-18 10:38:21 -07:00
Kuan Kai Chiu
f379ce37bf mds: fix setting/removing xattrs on root
MDS crashes while journaling dirty root inode in handle_client_setxattr
and handle_client_removexattr. We should use journal_dirty_inode to
safely log root inode here.

Signed-off-by: Kuan Kai Chiu <big.chiu@bigtera.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-04-18 10:38:05 -07:00
Gary Lowell
7e4f80b12e debian/control: Fix typo in libboost version number
Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
2013-04-18 10:41:38 -07:00
Gary Lowell
f4bc760776 build: Add new package dependencies
Add libboost-system-dev (bug #4725).

Add hdparm to rpm installation requirements.  The hdparm
command is used to determine if write-caching is enabled on
the journal device.

Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
2013-04-18 10:41:20 -07:00
Joao Eduardo Luis
4b34b0e52b mon: PaxosService: fix trim criteria so as to avoid constantly trimming
Say a service establishes it will only keep 500 versions once a given
condition X is true.  Now say that said condition X only becomes true
after said service has committed some 800 versions.

Once we decide to trim, this service would trim all 300 surplus versions
in one go.  After that, each committed version would also trim the
previous version.

Trimming an unbounded number of versions is not a good practice
as it will generate bigger transactions (thus a greater workload on
leveldb) and therefore bigger messages too.

Constantly trimming versions implies more frequent accesses to leveldb,
and keeping around a couple more versions won't hurt us in any significant
way, so let us put off trimming unless we go over a predefined minimum.

This patch adds two new options:

 paxos service trim min - minimum number of versions to trigger a trim
                          (default: 30, 0 disables it)
 paxos service trim max - maximum number of versions to trim during a
                          single proposal
                          (default: 50, 0 disables it)

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-04-18 16:45:07 +01:00
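
The trim criteria in the PaxosService commit above can be sketched as a small decision function. This is an illustration under stated assumptions, not the PaxosService code: the function name and parameters are hypothetical, and the exact comparisons used in Ceph may differ. Only the two option names, their defaults, and the "0 disables it" rule come from the commit message.

```python
def versions_to_trim(first_committed: int, trim_to: int,
                     trim_min: int = 30, trim_max: int = 50) -> int:
    """Return how many versions to trim in this proposal.

    first_committed: oldest version currently kept (hypothetical parameter)
    trim_to:         first version the service wants to keep (hypothetical)
    trim_min:        'paxos service trim min'; 0 disables the lower bound
    trim_max:        'paxos service trim max'; 0 disables the upper bound
    """
    surplus = trim_to - first_committed
    if surplus <= 0:
        return 0                      # nothing to trim
    if trim_min > 0 and surplus < trim_min:
        return 0                      # too few versions: put off trimming
    if trim_max > 0 and surplus > trim_max:
        return trim_max               # cap the work done in one proposal
    return surplus


# The scenario from the commit message: 800 versions committed, keep 500.
print(versions_to_trim(first_committed=1, trim_to=301))
```

With the defaults, the 300 surplus versions in that scenario would be removed in several bounded batches rather than in one large transaction, and a handful of surplus versions would not trigger a trim at all.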
Joao Eduardo Luis
5a5fdfc66d mon: Paxos: increase debug levels for proposal listing
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-04-18 09:58:55 +01:00
John Wilkins
a0e457ae18 doc: Removed legacy man page index. Generates warning otherwise.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-04-17 18:34:54 -07:00