29757 Commits

Author SHA1 Message Date
Joao Eduardo Luis
e02740ac5d mon: OSDMonitor: only allow an osd to boot iff it has the fsid on record
Fixes: #6605

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-10-29 20:35:25 +00:00
Joao Eduardo Luis
42c4137cbf mon: OSDMonitor: fix some annoying whitespace
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-10-29 20:30:37 +00:00
John Wilkins
60264f9f02 doc: Fixed formatting. Fixed hyperlink.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-10-29 12:24:37 -07:00
Loic Dachary
7dd387b0af Merge pull request #779 from ceph/wip-crush-hook
upstart,sysvinit: allow 'osd crush location hook' script to determine osd crush position

Reviewed-by: Loic Dachary <loic@dachary.org>
2013-10-29 12:24:05 -07:00
John Wilkins
46d897a377 doc: fix formatting.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-10-29 12:01:42 -07:00
Sage Weil
111a37efb1 upstart, sysvinit: use ceph-crush-location hook
Instead of hard-coding a check in ceph.conf and some reasonable
defaults, defer this work to ceph-crush-location, and allow users to
specify their own hook with alternative logic.

This can be helpful in a number of cases, like:

 - rack (or other) information included in hostname and easily parsed
   out by a hook
 - multiple types of devices in each host, resulting in 'parallel'
   crush trees (e.g., one for hdd, one for ssd)

Signed-off-by: Sage Weil <sage@inktank.com>
2013-10-29 11:10:32 -07:00
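
The first use case above (rack information embedded in the hostname) can be
illustrated with a minimal sketch. This is not the ceph-crush-location script
itself: the '<rack>-<host>' naming scheme, the function name, and the exact
key order are assumptions made for illustration. It only shows the general
idea of turning a hostname into a list of key=value pairs.

```python
import re


def crush_location_from_hostname(hostname, root="default"):
    """Build a CRUSH-style key=value list from a hostname.

    Assumes a hypothetical naming scheme '<rack>-<host>', for example
    'rack12-node03'. Hosts that do not match get no rack entry.
    """
    short = hostname.split(".")[0]          # drop any domain suffix
    pairs = [("root", root)]
    match = re.match(r"^(rack\d+)-", short)
    if match:
        pairs.append(("rack", match.group(1)))
    pairs.append(("host", short))
    return " ".join(f"{key}={value}" for key, value in pairs)


if __name__ == "__main__":
    import socket

    # Print the location for the local machine, one line of key=value pairs.
    print(crush_location_from_hostname(socket.gethostname()))
```

A site with a different naming convention would only need to change the
regular expression; the output shape stays the same.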
Sage Weil
fc49065d85 ceph-crush-location: new crush location hook
This generalizes the bit of code that builds a key=value pair list to
update an entity's CRUSH location.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-10-29 11:09:52 -07:00
Sage Weil
13d1b9c99b Merge pull request #786 from ceph/wip-6673
mon/PGMonitor: always send pg creations after mapping

Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-10-29 10:16:52 -07:00
Sage Weil
df229e5eff mon/PGMonitor: always send pg creations after mapping
At some point in the dumpling cycle I separated the map stage from the
send stage.  We can send the creates any time we have a non-zero osdmap
epoch, and are in good shape as long as we do the map step after the
osdmap is loaded (hence the post_paxos_update).

Some background:

We originally introduced the map-but-don't-send in a2fe0137, at which
point all was well because we only called it on ceph-mon startup.

Later, this turned into post_paxos_update in e635c478, at which point
it was called by a running monitor, but we didn't add in the
send_pg_creates().  This is where this bug stems from.

This particular path is responsible for the stalled test referenced in
bug #6673.

Backport: dumpling
Signed-off-by: Sage Weil <sage@inktank.com>
2013-10-29 10:10:21 -07:00
Sage Weil
2181b4c946 mon/OSDMonitor: fix signedness warning on poolid
Signed-off-by: Sage Weil <sage@inktank.com>
2013-10-29 08:59:06 -07:00
Samuel Just
705f4c1f6e Merge remote-tracking branch 'upstream/next'
2013-10-29 08:27:54 -07:00
Samuel Just
7a06a71e0f ReplicatedPG::recover_backfill: update last_backfill to max() when backfill is complete
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-10-29 08:26:57 -07:00
Samuel Just
f8fa30953e ReplicatedPG: src_obcs can now be empty
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-10-28 23:35:00 -07:00
Samuel Just
00ddea37e4 Merge remote-tracking branch 'upstream/next'
2013-10-28 22:51:04 -07:00
Sage Weil
ce8457680d Merge pull request #773 from dachary/wip-6614
common: rebuild_page_aligned sometimes rebuilds unaligned

Reviewed-by: Sage Weil <sage@inktank.com>
2013-10-28 21:15:32 -07:00
athanatos
ad5655beb2 Merge pull request #780 from ceph/wip-6585
Wip 6585

Reviewed-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
2013-10-28 21:11:27 -07:00
Sage Weil
8b98166d09 Merge pull request #775 from ceph/wip-readdirend
mds: fix readdir end check

Reviewed-by: Sage Weil <sage@inktank.com>
2013-10-28 21:01:27 -07:00
Yan, Zheng
3b99cd0ad8 mds: fix readdir end check
If the last item in the directory is a remote link and the corresponding
inode is not in cache, the readdir reply will not contain the last item.
In this case, however, iterator 'it' is equal to dir->end(), which causes
the 'end' flag of the readdir reply to be set to true.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2013-10-29 09:48:25 +08:00
John Wilkins
6eded8a129 doc: Fixes to normalize header hierarchy. Tweaked IA slightly.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-10-28 18:03:28 -07:00
John Wilkins
bd507ef3fd doc: Updated with a verified installation procedure and latest usage.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2013-10-28 17:46:06 -07:00
Josh Durgin
702234d7a2 Merge pull request #777 from ceph/wip-scripts
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-10-28 17:12:26 -07:00
Sage Weil
4e48dd56a4 osd/ReplicatedPG: use MIN for backfill_pos
Signed-off-by: Sage Weil <sage@inktank.com>
2013-10-28 16:39:09 -07:00
Loic Dachary
e6d983beda Merge pull request #772 from ceph/wip-5612
init-ceph, upstart: make crush update on osd start time out

Reviewed-by: Loic Dachary <loic@dachary.org>
2013-10-28 16:13:34 -07:00
Samuel Just
4139e75d63 ReplicatedPG: recover_backfill: don't prematurely adjust last_backfill
We can't adjust last_backfill to object x until x has been fully
backfilled.  pending_backfill_updates contains all those backfills
started, but which have not yet been reflected in pinfo.last_update.
backfills_in_flight contains those backfills which have not yet
completed.  Thus, we can adjust last_update to the largest entry
in pending_backfill_updates not in backfills_in_flight.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-10-28 16:10:16 -07:00
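
The rule described above can be sketched as follows. This is one
interpretation, not the ReplicatedPG code: objects are modeled as plain
sortable values, the function name is hypothetical, and the sketch takes the
conservative reading that last_backfill may only advance past pending entries
that sort before the earliest backfill still in flight.

```python
def advance_last_backfill(last_backfill, pending_updates, in_flight):
    """Return (new_last_backfill, remaining_pending).

    pending_updates: objects whose backfill was started but is not yet
                     reflected in last_backfill.
    in_flight:       objects whose backfill has not completed.
    Assumption: an entry may be consumed only if it sorts before every
    object that is still in flight.
    """
    earliest_in_flight = min(in_flight) if in_flight else None
    remaining = sorted(pending_updates)
    while remaining and (
        earliest_in_flight is None or remaining[0] < earliest_in_flight
    ):
        # This object is fully backfilled, so the boundary can move to it.
        last_backfill = remaining.pop(0)
    return last_backfill, remaining
```

For example, with pending entries 3, 5 and 8 and backfills for 5 and 8 still
in flight, the boundary moves to 3 and the other two entries stay pending.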
Samuel Just
ecddd12b01 ReplicatedPG: add empty stat when we remove an object in recover_backfill
Subsequent updates to that object need to have their stats added
to the backfill info stats atomically with the last_backfill
update.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-10-28 16:10:09 -07:00
Samuel Just
9ec35d5ccf ReplicatedPG: replace backfill_pos with last_backfill_started
last_backfill_started reflects what pinfo.last_backfill will be
once all currently outstanding backfills complete.  backfill_pos
was tricky since we couldn't correctly initialize it without
doing the first backfill scan pair.

In recover_backfill, we rescan from last_backfill_started rather
than from backfill_pos.  This ensures that we capture all clones
created between last_backfill_started and what previously had been
backfill_pos without special handling in make_writeable.  The main
downside is that we will tend to "rescan" last_backfill_started.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-10-28 16:03:59 -07:00
Samuel Just
8774f03d39 PG::BackfillInfo: introduce trim_to
We'll use this to trim off last_backfill_started since it'll
often be included in rescans.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-10-28 16:03:58 -07:00
Samuel Just
46dfd91975 PG::BackfillInterval: use trim() in pop_front()
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-10-28 16:03:58 -07:00
Samuel Just
0a9a2d7b9c ReplicatedPG::prepare_transaction: info.last_backfill is inclusive
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-10-28 16:03:58 -07:00
Sage Weil
5939eaceb0 upstart: fail osd start if crush update fails
If the update for the CRUSH position fails for some reason, do not
start the OSD.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-10-28 15:58:29 -07:00
Sage Weil
177e2ab1ca init-ceph: make crush update on osd start time out
If the monitor is not currently available, this crush update would block
forever, preventing the OSD and (potentially) the rest of the system
from starting up.  Instead, make it time out after 10 seconds and then
abort startup.  This prevents startup of an OSD if we failed to update
the CRUSH position for some reason.

In fact, do not start up the OSD if the CRUSH update fails for any
reason--not just a timeout!

Works-around: #5612
Signed-off-by: Sage Weil <sage@inktank.com>
2013-10-28 15:58:29 -07:00
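
The behavior described above (give the update a bounded amount of time, and
refuse to start if it fails for any reason) can be sketched in Python. The
actual change lives in the init-ceph shell script; the function names and the
placeholder commands below are assumptions for illustration only.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s=10):
    """Return True only if cmd exits with status 0 within timeout_s seconds."""
    try:
        result = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return False        # the update blocked for too long
    except OSError:
        return False        # the command could not be launched at all
    return result.returncode == 0


def start_osd(update_cmd, start_cmd, timeout_s=10):
    """Run start_cmd only if update_cmd succeeded in time."""
    if not run_with_timeout(update_cmd, timeout_s):
        return False        # any failure, not just a timeout, aborts startup
    subprocess.run(start_cmd, check=True)
    return True


if __name__ == "__main__":
    # Placeholder commands standing in for the real update and daemon start.
    update = [sys.executable, "-c", "pass"]
    start = [sys.executable, "-c", "pass"]
    print("started" if start_osd(update, start) else "aborted")
```

Treating a timeout and a non-zero exit status the same way keeps the caller
simple: there is a single "update did not succeed" path.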
Sage Weil
78b5a2cf8b Merge pull request #771 from ceph/wip-ceph-context
ceph_context: replace semaphore

Reviewed-by: Sage Weil <sage@inktank.com>
2013-10-28 14:48:24 -07:00
Noah Watkins
b28b64a0b6 pybind: use find_library for libcephfs and librbd
Use find_library to avoid assumptions about platform shared library
naming conventions.

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
2013-10-28 14:37:07 -07:00
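
A minimal sketch of the approach, assuming only the standard ctypes module:
ctypes.util.find_library resolves a bare library name to whatever the
platform calls the shared object, so no 'lib...so.N' string has to be
hard-coded. The helper name is hypothetical and this is not the pybind code
itself.

```python
from ctypes import CDLL
from ctypes.util import find_library


def load_shared_library(name):
    """Load a shared library by its bare name, or return None if unavailable."""
    path = find_library(name)   # e.g. 'cephfs', without prefix or suffix
    if path is None:
        return None             # not installed, or not found by the platform
    try:
        return CDLL(path)
    except OSError:
        return None             # located but could not be loaded


if __name__ == "__main__":
    for lib in ("cephfs", "rbd"):
        handle = load_shared_library(lib)
        print(lib, "loaded" if handle is not None else "not available")
```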
Sage Weil
ac8dcdbeed Merge pull request #778 from ceph/wip-6621
radosgw-admin: accept negative values for quota params

Reviewed-by: Sage Weil <sage@inktank.com>
2013-10-28 14:28:25 -07:00
Yehuda Sadeh
d5d36d0baa radosgw-admin: accept negative values for quota params
and document that in the usage output.

Fixes: #6621

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2013-10-28 14:15:43 -07:00
athanatos
7cbfdbf38d Merge pull request #760 from ceph/wip-6585
Wip 6585

Reviewed-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-10-28 13:50:34 -07:00
Samuel Just
9d136a440c ReplicatedPG: no need to clear repop->*obc
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-10-28 13:50:06 -07:00
Samuel Just
ae1263ad64 Merge remote-tracking branch 'upstream/next' into wip-obc
2013-10-28 13:47:39 -07:00
Sage Weil
f58396a685 doc/release-notes: emperor blurb
Signed-off-by: Sage Weil <sage@inktank.com>
2013-10-28 13:53:11 -07:00
Samuel Just
8db03ed027 ReplicatedBackend: don't hold ObjectContexts in pull completion callback
Flushing the sequencer must ensure that all Contexts which hold
ObjectContextRefs have been run or deleted.
C_ReplicatedBackend_OnPullComplete, however, gets queued in a second
work queue in order to avoid performing expensive push related reads
in the FileStore finisher.

Rather than keep the object contexts around, we instead put off
removing the object from the pulling map until the callback
fires and read the object context out of the pulling map.  This
way the ObjectContextRef will be cleaned up along with the rest
of the pulling map in on_change.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-10-28 13:35:17 -07:00
Samuel Just
5a416dab6e ReplicatedPG: put repops even in TrimObjects
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-10-28 13:35:17 -07:00
Samuel Just
420182a1e8 ReplicatedPG: improved on_flushed error output
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-10-28 13:35:17 -07:00
Samuel Just
ce33892271 PG: call on_flushed on FlushEvt
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-10-28 13:35:10 -07:00
Samuel Just
6f975e35a1 PG,ReplicatedPG: remove the waiting_for_backfill_peer mechanism
See previous patch.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-10-28 13:34:17 -07:00
Samuel Just
3d0d69fed0 ReplicatedPG: have make_writeable adjust backfill_pos
If we are writing to backfill_pos and create a clone, we end
up failing to send the transaction creating the clone to the
backfill peer.  This is fine as long as we end up backfilling
the clone.  To that end, we simply add the clone to
backfill_info and adjust backfill_pos accordingly.  This is less
brittle than the waiting_for_backfill_pos mechanism since it
works even if we wait between that check and issuing the repop,
which can happen for copy_from.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-10-28 13:34:16 -07:00
Samuel Just
3de32bd368 ReplicatedBackend: fix failed push error output
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-10-28 13:34:16 -07:00
Samuel Just
807dde4814 ReplicatedPG,osd_types: move rw tracking from its own map to ObjectContext
We also modify recovering to hold a reference to the recovering obc
in order to ensure that our backfill_read_lock doesn't outlive the
obc.

ReplicatedPG::op_applied no longer clears repop->obc since we need
it to live until the op is finally cleaned up.  This is fine since
repop->obc is now an ObjectContextRef and can clean itself up.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-10-28 13:32:56 -07:00
Samuel Just
2cadc231ae osd_types,OpRequest: move osd_req_id into OpRequest
This way I can have OpRequest included from osd_types.h.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-10-28 13:31:08 -07:00
Samuel Just
9b003b327e OpRequest: move method implementations into cc
I need to remove the osd_types.h include.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-10-28 13:31:08 -07:00
Samuel Just
c4442d70ed ReplicatedPG: reset new_obs and new_snapset in execute_ctx
This way, if execute_ctx is rerun on the same OpContext, we
won't erroneously reuse a stale snapset/object_info.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-10-28 13:30:42 -07:00