When removing the last instance of ceph, also remove the files
created by ceph during operation. These consist of the files
under /var/lib/ceph, /etc/ceph, and /var/log/ceph. Bug #4415.
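
A minimal sketch of the cleanup described above, written in Python for
illustration only. The real change presumably lives in the packaging
scripts; the function name, the `root` parameter, and the
`is_last_instance` flag are assumptions made for this example.

```python
import os
import shutil

# The three directories named in this commit message, relative to a root.
CEPH_RUNTIME_DIRS = ("var/lib/ceph", "etc/ceph", "var/log/ceph")


def purge_ceph_files(root, is_last_instance=True):
    """Remove ceph runtime directories under root; return the paths removed."""
    removed = []
    if not is_last_instance:
        # Another instance is still installed, so leave its files alone.
        return removed
    for rel in CEPH_RUNTIME_DIRS:
        path = os.path.join(root, rel)
        if os.path.isdir(path):
            shutil.rmtree(path)
            removed.append(path)
    return removed
```

Passing a scratch directory as `root` lets the sketch be exercised without
touching a live system.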
Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
When the checksum or footer is invalid, we will now try to
look at the next entry. If we find a valid entry, it is likely
that the journal is corrupt.
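
The look-ahead above can be sketched roughly as follows. This is a toy
model, not the journal code: entries are reduced to booleans, and the
function name and the "end_of_journal" label for the other outcome are
assumptions made for this example.

```python
def classify_read_failure(entries, bad_index):
    """Classify a failed read by peeking at the entry after it.

    entries: list of bools, True where an entry's checksum and footer
    validate. bad_index: index of the entry that failed validation.
    """
    nxt = bad_index + 1
    if nxt < len(entries) and entries[nxt]:
        # A valid entry after an invalid one suggests corruption.
        return "corrupt"
    # Nothing valid follows, so this looks like the end of the journal.
    return "end_of_journal"
```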
Signed-off-by: Samuel Just <sam.just@inktank.com>
header_t::committed_up_to provides a lower bound for safely committed
journal entries. If read_entry fails prior to committed_up_to, we
know we have a corrupt journal entry. Furthermore, if
journal_write_header_frequency is not 0, we will write out the
journal header once every journal_write_header_frequency
journal writes.
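
Both rules can be sketched as small predicates. The helper names are
hypothetical, and treating "prior to" as a strict comparison on sequence
numbers is an assumption, not something this message states.

```python
def failure_is_corruption(failed_seq, committed_up_to):
    """True if a read failure falls before the committed lower bound.

    Assumption: "prior to" is read as a strict less-than on sequence numbers.
    """
    return failed_seq < committed_up_to


def should_write_header(writes_since_header, frequency):
    """True if the journal header is due to be written out again.

    frequency mirrors journal_write_header_frequency; 0 disables the
    periodic header write.
    """
    return frequency != 0 and writes_since_header >= frequency
```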
Signed-off-by: Samuel Just <sam.just@inktank.com>
If queue_pos == header.max_size when we create the entry
header magic, the entry will be rejected at get_top() on
replay.
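
One plausible reading of the problem, as a sketch: a position equal to
header.max_size refers to the same slot as get_top(), so the position has
to be wrapped before it is mixed into the magic. The helper names and the
XOR-based magic below are hypothetical and only illustrate the mismatch.

```python
def entry_pos_for_magic(queue_pos, top, max_size):
    """Wrap a position sitting exactly at the end of the ring back to top."""
    if queue_pos == max_size:
        return top
    return queue_pos


def make_magic(pos, fsid_word):
    """Hypothetical magic: mixes the entry position with an fsid word."""
    return pos ^ fsid_word
```

Without the wrap, the writer would compute the magic from max_size while
replay checks it against get_top(), and the two values differ.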
Fixes: #4436
Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Otherwise:
1) expand_pg_num removes a splitting pg entry
2) peering thread grabs pg lock and starts split
3) OSD::consume_map grabs pg lock and starts removal
At step 2), we run afoul of the assert(is_splitting)
check in split_pgs. This way, the would-be splitting
pg is marked as removed prior to the splitting state
being updated.
Backport: bobtail
Fixes: #4449
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
1) Replica sends notify
2) Prior to processing notify, primary queues query to replica
3) Primary processes notify and activates, sending MOSDPGLog
to replica.
4) Primary does do_notifies at end of process_peering_events
and sends the Query.
5) Replica sees MOSDPGLog and activates
6) Replica sees Query and asserts.
In the above case, the Replica should simply ignore the old
Query.
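
A toy model of the intended behavior, assuming a replica that tracks only
whether it has activated. The class, its methods, and the "notify" reply
are illustrative and do not correspond to the real peering state machine.

```python
class Replica:
    """Toy replica: answers queries until activated, then ignores them."""

    def __init__(self):
        self.active = False
        self.ignored = []

    def on_pg_log(self):
        # Step 5: seeing MOSDPGLog activates the replica.
        self.active = True

    def on_query(self, query):
        if self.active:
            # Step 6: a Query queued before activation arrives late.
            # Ignore it instead of asserting.
            self.ignored.append(query)
            return None
        return "notify"
```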
Fixes: #4050
Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
I broke this in 4637752db6 when I
restructured this function. Only try to increase the max if we are
the leader.
Signed-off-by: Sage Weil <sage@inktank.com>
Determine what cluster the disk belongs to by checking the fsid defined
in /etc/ceph/*.conf. Previously we hard-coded 'ceph'.
Note that this has the nice side-effect that if we have a disk with a
bad/different fsid, we now fail to activate it. Previously, we would
mount and start ceph-osd, but the daemon would fail to authenticate
because it was part of the wrong cluster.
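
The lookup can be sketched as below, assuming the fsid is set in the
[global] section and the cluster name is the conf file's base name. The
function name and the `conf_dir` parameter are made up for this example.

```python
import configparser
import glob
import os


def find_cluster_by_fsid(fsid, conf_dir="/etc/ceph"):
    """Return the cluster whose <name>.conf declares this fsid, else None."""
    for path in sorted(glob.glob(os.path.join(conf_dir, "*.conf"))):
        parser = configparser.ConfigParser(strict=False, interpolation=None)
        try:
            parser.read(path)
        except configparser.Error:
            # Skip files that do not parse as INI-style configuration.
            continue
        if parser.has_option("global", "fsid"):
            value = parser.get("global", "fsid").strip()
            if value.lower() == fsid.lower():
                # "ceph.conf" -> cluster "ceph"
                return os.path.splitext(os.path.basename(path))[0]
    return None
```

A caller would refuse to activate the disk when this returns None, which
gives the fail-early behavior noted above.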
Fixes: #3253
Signed-off-by: Sage Weil <sage@inktank.com>
The ceph-mds.conf file moved from the ceph package to the
ceph-mds package. Add replaces/breaks statements to the
control file to handle this on upgrade.
Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
If the target position is already a mount point, fail to move our mount
over to it. This usually indicates that a different osd.N from a
different cluster instance is in that position.
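
A minimal sketch of the check using os.path.ismount. The function name and
the error message are illustrative, not taken from the actual change.

```python
import os


def ensure_not_mounted(target):
    """Raise if target is already a mount point; otherwise return it."""
    if os.path.ismount(target):
        raise RuntimeError(
            "%s is already a mount point; refusing to move mount over it"
            % target
        )
    return target
```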
Signed-off-by: Sage Weil <sage@inktank.com>
This ensures that when we then start individual mds instances, we can
stop ceph-mds-all and they will get stopped. We do the same already for
ceph-all.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 41897fcba1)
This reverts commit 813e9fe2b4.
We run --mkfs with the osd disk mounted in a temporary location, so it is
necessary to explicitly pass in these paths.
If we want to support journals in a different location, we need to make
ceph-disk-prepare update the journal symlink accordingly, rather than control
it via the config option.
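
As a sketch of what passing the paths explicitly looks like, the helper
below builds an argument list for the temporary mount. It assumes "these
paths" means the osd data directory and a journal inside it; the helper
name and the example path are hypothetical, and a real invocation needs
further arguments.

```python
import os


def mkfs_command(osd_id, mount_path):
    """Build a ceph-osd --mkfs argv that points at the temporary mount."""
    return [
        "ceph-osd",
        "--mkfs",
        "-i", str(osd_id),
        # Explicit paths: the config defaults would point at the final
        # location, not at the temporary mount.
        "--osd-data", mount_path,
        "--osd-journal", os.path.join(mount_path, "journal"),
    ]
```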
Signed-off-by: Sage Weil <sage@inktank.com>