Use 'ceph health' to get the usual summary and 'ceph health detail' to
additionally get a comprehensive list of the problems found.
Eventually we can format this as yaml, json, or whatever, too.
Signed-off-by: Sage Weil <sage@newdream.net>
If we are blacklisted by the OSD cluster, it's because we were too slow
and were replaced by another ceph-mds. Respawn and re-register as a
standby.
If we get some other write error, shut down.
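The decision described above can be sketched roughly as follows. This is a minimal illustration, not the actual ceph-mds code: the EBLACKLISTED value and the handle_write_error, respawn and shutdown names are assumptions made for the example.

```python
# Hypothetical error code standing in for "blacklisted by the OSD cluster".
EBLACKLISTED = 108


def handle_write_error(err, respawn, shutdown):
    """Pick a reaction to a failed write; err is a negative errno-style value."""
    if err == -EBLACKLISTED:
        # We were too slow and another ceph-mds replaced us:
        # restart and re-register as a standby.
        respawn()
        return "respawn"
    # Any other write error is treated as fatal.
    shutdown()
    return "shutdown"
```

Passing the two actions in as callables keeps the sketch testable without a running daemon.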
Fixes: #1796
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Specify a generic callback for any write error the journaler encounters.
This is more helpful than passing up write errors to specific callers,
because:
- there are several of them
- journaler initiates writes on its own (like the head)
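A toy model of the idea, assuming a hypothetical Journaler class with a set_write_error_handler method (neither the class shape nor the method name is taken from the real code). One handler sees every failed write, whether a caller asked for it or the journaler started it itself:

```python
class Journaler:
    """Toy journal writer; only the error-reporting path is modelled."""

    def __init__(self, write_fn):
        # Hypothetical backend: returns 0 on success or a negative errno.
        self._write = write_fn
        self._on_write_error = None

    def set_write_error_handler(self, handler):
        # One generic callback for all write errors.
        self._on_write_error = handler

    def _do_write(self, data):
        r = self._write(data)
        if r < 0 and self._on_write_error is not None:
            self._on_write_error(r)
        return r

    def append_entry(self, data):
        # A write requested by one of several callers.
        return self._do_write(data)

    def write_head(self):
        # A write the journaler initiates on its own; there is no
        # caller to hand the error back to.
        return self._do_write(b"head")
```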
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
If we create a new filestore, apply one transaction, and then crash, we
want to make sure we roll back to a consistent reference point--empty. The
simplest solution is to create that snap_0 during mkfs. This avoids
strangeness like
2012-02-27 00:42:00.336703 7fb1381ef780 filestore(/ceph/osd.0) mkfs in /ceph/osd.0
2012-02-27 00:42:00.341399 7fb1381ef780 journal _open /ceph/osd.0.journal fd 10: 1048576000 bytes, block size 4096 bytes, directio = 1, aio = 0
2012-02-27 00:42:00.349705 7fb1381ef780 filestore(/ceph/osd.0) mkjournal created journal on /ceph/osd.0.journal
2012-02-27 00:42:00.349728 7fb1381ef780 filestore(/ceph/osd.0) mkfs done in /ceph/osd.0
2012-02-27 00:42:00.349787 7fb1381ef780 filestore(/ceph/osd.0) mount FIEMAP ioctl is NOT supported
2012-02-27 00:42:00.349800 7fb1381ef780 filestore(/ceph/osd.0) mount detected btrfs
2012-02-27 00:42:00.349813 7fb1381ef780 filestore(/ceph/osd.0) mount btrfs CLONE_RANGE ioctl is supported
2012-02-27 00:42:00.357023 7fb1381ef780 filestore(/ceph/osd.0) mount btrfs SNAP_CREATE is supported
2012-02-27 00:42:00.405174 7fb1381ef780 filestore(/ceph/osd.0) mount btrfs SNAP_DESTROY is supported
2012-02-27 00:42:00.405214 7fb1381ef780 filestore(/ceph/osd.0) mount btrfs START_SYNC got (25) Inappropriate ioctl for device
2012-02-27 00:42:00.405228 7fb1381ef780 filestore(/ceph/osd.0) mount btrfs START_SYNC is NOT supported: (25) Inappropriate ioctl for device
2012-02-27 00:42:00.405235 7fb1381ef780 filestore(/ceph/osd.0) mount WARNING: btrfs snaps enabled, but no SNAP_CREATE_V2 ioctl (from kernel 2.6.37+)
2012-02-27 00:42:00.405561 7fb1381ef780 filestore(/ceph/osd.0) mount found snaps <>
2012-02-27 00:42:00.405576 7fb1381ef780 filestore(/ceph/osd.0) mount WARNING: no consistent snaps found, store may be in inconsistent state
and subsequent badness if we fail before a proper commit is made.
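To illustrate why an initial snapshot helps, here is a small in-memory model. ToyStore and its methods are hypothetical and stand in for the filestore only loosely: snapshots are plain dict copies, and mount() rolls back to the newest one.

```python
class ToyStore:
    """In-memory stand-in for a snapshotting object store."""

    def __init__(self):
        self.current = {}
        self.snaps = {}  # snapshot sequence number -> copy of current

    def mkfs(self, create_initial_snap=True):
        self.current = {}
        self.snaps = {}
        if create_initial_snap:
            # snap_0: the empty, consistent starting point.
            self.snaps[0] = {}

    def apply(self, key, value):
        # Applied to current, but not yet committed to a snapshot.
        self.current[key] = value

    def commit(self, seq):
        self.snaps[seq] = dict(self.current)

    def mount(self):
        """Roll back to the newest snapshot; return False if there is none."""
        if not self.snaps:
            return False
        self.current = dict(self.snaps[max(self.snaps)])
        return True
```

With the initial snapshot, a crash after one uncommitted apply() mounts back to the empty state; without it, mount() has no consistent point to return to.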
Fixes: #2105
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
If we get new (non-replayed) ops during replay, those need to wait until
after the replayed ops are ordered and applied. Otherwise we break the op
ordering completely, particularly with something like
- pg not active
- get op 1, put on waiting_for_active
- pg enters replay
- get op 2, apply immediately
- finish replay, requeue op 1
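The sequence above can be replayed against a toy model. ToyPG is a hypothetical simplification: it keeps a single waiting queue and skips the replayed ops themselves, which is enough to show that op 2 must queue behind op 1 rather than be applied at once.

```python
from collections import deque


class ToyPG:
    """Minimal placement-group model that tracks only op ordering."""

    def __init__(self):
        self.state = "inactive"
        self.waiting_for_active = deque()
        self.applied = []

    def handle_op(self, op):
        if self.state != "active":
            # Inactive or replaying: new ops wait behind earlier ones.
            self.waiting_for_active.append(op)
        else:
            self.applied.append(op)

    def enter_replay(self):
        self.state = "replay"

    def finish_replay(self):
        # Requeue waiting ops in arrival order once replay is done.
        self.state = "active"
        while self.waiting_for_active:
            self.applied.append(self.waiting_for_active.popleft())
```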
Fixes: #2082
Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
The pusher doesn't know what clone_overlap we'll see, so it has no idea
if we are data_complete from our perspective, making this check useless.
In particular, we screw up if we race with a recalculation of
clone_overlap.
Fixes: #2133
Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
We shouldn't get to this point. If we do, recover_primary didn't do what
it needed to. Dump the remaining missing set and hope we can debug.
Signed-off-by: Sage Weil <sage@newdream.net>
Add a resource agent for mapping, unmapping and monitoring RBD devices.
Maps an RBD on start, unmaps it on stop. Checks "rbd showmapped"
output to monitor whether the device is mapped, and thus does not
rely on the ceph-rbdnamer udev magic being enabled.
This RA is cloneable and essentially allows people to use RBD devices
as a drop-in replacement for
- iSCSI devices,
- host-based mirrored devices using md RAID-1,
- DRBD devices
in Pacemaker clusters.
This is the start of making the SimpleMessenger interface legible
to users. In addition to moving the configuration and accessor
functions to the top of the file, it adds virtual to the functions
which are part of the defined Messenger interface.
You can tell from some of the comments that work remains.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
This was pretty pointless since each Messenger has a well-defined
exit point and shutdown process.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
With that, remove the set_nonce function and the gratuitous passing
of nonce around through layers of functions.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Instead, have a settable nonce value that you can fill in any time
after construction and that the messenger uses during regular start().
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
This function has been vestigial for a long time. Remove it and move
its remaining functionality into the constructor.
Update users to the new interface (this is remarkably easy and
simplifies the code).
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
There is a window in the old check between when current/commit_op_seq is
written and the snapshot is taken. If ceph-osd crashes, we'll be unable to
start because we'll believe current/ was in use without proper checkpoints.
Instead, make the snapped/not snapped state of current/ explicit.
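One way to make that state explicit is a marker file inside current/. The sketch below is an assumption about the approach, not the actual filestore code: the marker name and both helper functions are hypothetical.

```python
import os

# Hypothetical marker file name, chosen for this sketch.
NOSNAP_MARKER = "nosnap"


def mark_nosnap(current_dir, nosnap):
    """Record explicitly whether current/ is being used without snapshots."""
    path = os.path.join(current_dir, NOSNAP_MARKER)
    if nosnap:
        with open(path, "w") as f:
            f.write("1\n")
    elif os.path.exists(path):
        os.remove(path)


def can_start_with_snaps(current_dir):
    """Refuse only if current/ was explicitly marked as not snapped."""
    return not os.path.exists(os.path.join(current_dir, NOSNAP_MARKER))
```

Because startup consults the marker rather than inferring state from commit_op_seq, a crash between writing commit_op_seq and taking the snapshot no longer looks like unsnapshotted use.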
Fixes: #2118
Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
Reviewed-by: Yehuda Sadeh <yehuda.sadeh@dreamhost.com>