Someday we need to do something smarter so that a single unfound object
doesn't hold up replication of other objects. For now, this is the
simplest thing to do.
Signed-off-by: Sage Weil <sage@newdream.net>
This avoids crashing later in do_osd_ops() with something like
osd/ReplicatedPG.cc: In function 'int ReplicatedPG::do_osd_ops(ReplicatedPG::OpContext*, std::vector<OSDOp, std::allocator<OSDOp> >&, ceph::bufferlist&)', in thread '7f27e2d7e700'
osd/ReplicatedPG.cc: 1386: FAILED assert(src_obc)
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
These weren't comparing key.
While we're at it, clean this up by using generic macros for writing
these operators, so we don't get it wrong half the time.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
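
A minimal sketch of the "generic macros" idea from the message above, not
the actual Ceph code. The ObjectId struct, its fields other than key, and
the DEFINE_CMP_OPERATORS macro are hypothetical names chosen for
illustration. The point is that every operator is derived from one field
list, so a field such as key cannot be left out of only some of them.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <tuple>

// Hypothetical stand-in for an object identifier; not the real Ceph type.
struct ObjectId {
  std::string key;   // the kind of field a hand-written operator can forget
  std::string name;
  uint64_t snap;

  // The single place that lists every field taking part in comparisons.
  auto tie() const { return std::tie(key, name, snap); }
};

// Hypothetical macro: derive all six comparison operators from tie(),
// so they always agree with each other.
#define DEFINE_CMP_OPERATORS(T)                                                 \
  inline bool operator==(const T& l, const T& r) { return l.tie() == r.tie(); } \
  inline bool operator!=(const T& l, const T& r) { return l.tie() != r.tie(); } \
  inline bool operator<(const T& l, const T& r)  { return l.tie() <  r.tie(); } \
  inline bool operator<=(const T& l, const T& r) { return l.tie() <= r.tie(); } \
  inline bool operator>(const T& l, const T& r)  { return l.tie() >  r.tie(); } \
  inline bool operator>=(const T& l, const T& r) { return l.tie() >= r.tie(); }

DEFINE_CMP_OPERATORS(ObjectId)
```

Two ObjectId values that differ only in key now compare as unequal and
are ordered by key first.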
Put OCF resource agents in a separate subpackage,
to be enabled with a separate build conditional
(--with ocf).
Make the subpackage depend on the resource-agents
package, which provides the ocf-shellfuncs library
that the Ceph RAs use.
Signed-off-by: Florian Haas <florian@hastexo.com>
Add a wrapper around the ceph init script that makes
MDS, OSD and MON configurable as Open Cluster Framework
(OCF) compliant cluster resources. Allows Ceph
daemons to tie in with cluster resource managers that
support OCF, such as Pacemaker (http://www.clusterlabs.org).
Disabled by default; configure with --with-ocf to enable.
Signed-off-by: Florian Haas <florian@hastexo.com>
We can't propose_pending() from any context; do this in the tick() thread,
with the proper locking. Among other things, this fixes the crash on
startup that is now triggered due to eba235f2.
Signed-off-by: Sage Weil <sage@newdream.net>
An exit code of 1 on status is defined in LSB as
"program is dead, but pid file exists". Check for existence
of this pid file, and only set the exit status 1 if it's still there.
Set it to 3 ("program is not running") otherwise.
Reference: http://refspecs.linuxbase.org/LSB_3.1.0/LSB-Core-generic/LSB-Core-generic/iniscrptact.html
Signed-off-by: Florian Haas <florian@hastexo.com>
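
The exit-code decision described above can be sketched as follows. This
is an illustration only, written in C++ rather than in the shell of the
init script, and the enum and function names are hypothetical. The
numeric values 0, 1 and 3 are the ones the LSB defines for the status
action.

```cpp
#include <cassert>

// Exit codes for the "status" action as defined by the LSB.
enum LsbStatus {
  LSB_RUNNING = 0,       // program is running
  LSB_DEAD_PIDFILE = 1,  // program is dead, but pid file exists
  LSB_NOT_RUNNING = 3    // program is not running
};

// Hypothetical helper: pick the status code from two observations.
inline int lsb_status_code(bool process_alive, bool pidfile_exists) {
  if (process_alive)
    return LSB_RUNNING;
  // Only report 1 if the stale pid file is actually still there.
  return pidfile_exists ? LSB_DEAD_PIDFILE : LSB_NOT_RUNNING;
}
```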
backfill_pos is the leading edge; last_backfill is the trailing edge.
Anything in between is either pushed, doesn't exist, or is in
backfills_in_flight.
For operations on non-degraded (in-progress) objects in that window, book
the stats update in pending_backfill_updates so that it will get applied
when last_backfill is advanced.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
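
A toy model of the bookkeeping described above, assuming objects are
ordered integers and the only stat is a byte count. BackfillTracker,
note_write and advance_last_backfill are hypothetical names, and the
exact boundary comparisons are assumptions; only last_backfill,
backfill_pos and pending_backfill_updates come from the message.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical, simplified tracker; not the real ReplicatedPG code.
struct BackfillTracker {
  int64_t last_backfill = -1;  // trailing edge
  int64_t backfill_pos = -1;   // leading edge
  int64_t target_bytes = 0;    // stats as accounted for the backfill target

  // Stat deltas for objects between the two edges, keyed by object.
  std::map<int64_t, int64_t> pending_backfill_updates;

  // Record the stat effect of an operation on obj.
  void note_write(int64_t obj, int64_t delta_bytes) {
    if (obj <= last_backfill) {
      target_bytes += delta_bytes;                   // already covered
    } else if (obj < backfill_pos) {
      pending_backfill_updates[obj] += delta_bytes;  // inside the window
    }
    // Otherwise backfill has not reached obj yet; nothing to book here.
  }

  // Move the trailing edge forward and apply the deltas it passes.
  void advance_last_backfill(int64_t new_last) {
    auto it = pending_backfill_updates.begin();
    while (it != pending_backfill_updates.end() && it->first <= new_last) {
      target_bytes += it->second;
      it = pending_backfill_updates.erase(it);
    }
    last_backfill = new_last;
  }
};
```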
We may not have a valid OSDMap in all of these cases (notably, during
boot). Always take the fsid from the monmap, which will be valid after
we've authenticated.
This fixes messages like
2011-12-29 08:53:44.530830 7ff3595e2700 mon.a@0(leader).pg v5 handle_statfs on fsid 00000000-0000-0000-0000-000000000000 != f8a6383d-5fbe-4f65-907e-f8d09e1d540d
on the monitor from MPGStats messages with a bad fsid right after osd boot.
Signed-off-by: Sage Weil <sage@newdream.net>
Tell the monitor which monmap version we have in our initial auth message.
Make the monitor send the latest monmap if it has something newer. This
ensures that once authentication completes the monclient has the latest
monmap and a valid fsid.
Fixes: #1848
Signed-off-by: Sage Weil <sage@newdream.net>
Only check backfill if we pushed to the backfill target, and avoid the hash
lookup in the general case.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Actually, I don't think this was fully implemented to begin with, so it's
not a 'fix' per se. This will let you use injectargs to adjust the
filestore config options during runtime.
Signed-off-by: Sage Weil <sage@newdream.net>
We can scan starting from last_backfill to avoid rescanning portions
of the collection recovered by normal recovery. collection_list_partial
now includes begin if present. next will be <= the next object in the
collection. This way we can scan starting at last_backfill without
skipping last_backfill.
Signed-off-by: Samuel Just <rexludorum@gmail.com>
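
The inclusive-begin listing described above can be sketched on an
ordered std::set. This is a toy version: list_partial and ListResult are
hypothetical names, and the real collection_list_partial has a different
signature. In this sketch next is exactly the next object, which
satisfies the "<= the next object" condition.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical result type for the toy listing below.
struct ListResult {
  std::vector<std::string> objects;  // up to max entries, in sorted order
  std::string next;                  // where a follow-up scan should begin
  bool at_end;                       // true if the collection is exhausted
};

// List up to max objects starting at begin, including begin if present.
inline ListResult list_partial(const std::set<std::string>& coll,
                               const std::string& begin, std::size_t max) {
  ListResult r;
  // lower_bound yields the first element >= begin, so begin is not skipped.
  auto it = coll.lower_bound(begin);
  while (it != coll.end() && r.objects.size() < max) {
    r.objects.push_back(*it);
    ++it;
  }
  r.at_end = (it == coll.end());
  if (!r.at_end)
    r.next = *it;
  return r;
}
```

Passing last_backfill as begin returns last_backfill itself as the first
entry when it exists, so a scan can resume there without skipping it.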