This will be useful in the general case where the cluster is created with
an empty map and a useful crush hierarchy.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
This is simpler and exercises the monitor's ability to start with a generic
osdmap and build it out as new osds are added to the cluster.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
If an initial osdmap is not provided, we generate an empty one. The user
can add osds on their own after that.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
This will set the seed monmap's fsid. This is useful if the monmap is
dynamically generated (e.g., based on ceph.conf or --mon-host list).
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
osd/ReplicatedPG.cc: In member function 'virtual void ReplicatedPG::remove_watchers_and_notifies()':
osd/ReplicatedPG.cc:1167: warning: suggest a space before ';' or explicit braces around empty body in 'for' statement
osd/ReplicatedPG.cc:1176: warning: suggest a space before ';' or explicit braces around empty body in 'for' statement
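GCC emits this warning for a 'for' loop whose empty body is a ';' written
directly after the closing parenthesis. A minimal sketch of the pattern and
the explicit-braces fix, using a hypothetical helper (first_nonzero) rather
than the actual ReplicatedPG code:

```cpp
#include <list>

// Hypothetical helper, not the ReplicatedPG code: advance an iterator to the
// first non-zero entry (or to end()). All the work happens in the loop header.
inline std::list<int>::const_iterator
first_nonzero(const std::list<int>& l)
{
  std::list<int>::const_iterator p = l.begin();
  // Writing "for (...; ++p);" here is what triggers the warning. Explicit
  // braces make the empty body obvious to both the reader and the compiler.
  for (; p != l.end() && *p == 0; ++p) {}
  return p;
}
```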
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
We can safely mkfs with an epoch=0 monmap as long as the fsid is set. And
that is what commit f31825cee5 changed.
Instead, use a zeroed fsid to tell if the monmap is valid/usable.
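A rough sketch of the check described here, assuming a 16-byte fsid. The
SeedMonMap struct and monmap_is_usable() are illustrative stand-ins, not the
real MonMap interface:

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

// Stand-in for a 16-byte cluster fsid (an assumption; not Ceph's real type).
using Fsid = std::array<std::uint8_t, 16>;

// Hypothetical, simplified monmap: only the two fields discussed here.
struct SeedMonMap {
  std::uint32_t epoch = 0;
  Fsid fsid{};  // value-initialized to all zeros
};

// A seed map counts as usable once its fsid is set, even if epoch is still 0.
inline bool monmap_is_usable(const SeedMonMap& m)
{
  return std::any_of(m.fsid.begin(), m.fsid.end(),
                     [](std::uint8_t b) { return b != 0; });
}
```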
Signed-off-by: Sage Weil <sage@newdream.net>
These are now in the generated crush maps, so it seems appropriate to
recompile them :).
Reported-by: Martin Mailand <martin@tuxadero.com>
Signed-off-by: Sage Weil <sage@newdream.net>
Remove if we previously had no latest, not based on which map we now have.
It's possible we join when monmap epoch is something much larger than 1!
Signed-off-by: Sage Weil <sage@newdream.net>
If a monitor starts up with the correct fsid and auth keys, it will now
add itself to the monmap (and subsequently try to join the quorum) if it
is not already in the monmap.
Signed-off-by: Sage Weil <sage@newdream.net>
We may get the latest monmap when we are doing our probing, but we still
need to process it in update_from_paxos(). Consider get_latest_version()
in addition to the active map.
Signed-off-by: Sage Weil <sage@newdream.net>
If we do a non-idempotent op and it does a commit itself, we never see
fs->is_committed() return true. Also count full commit cycles, and kill
ourselves after several of those have gone by.
Signed-off-by: Sage Weil <sage@newdream.net>
- start sync thread prior to replay, so that we can commit as we replay
operations
- keep applied_seq accurate
- pass seq (not old op_seq) to do_transactions
- carry open_ops ref so that commit blocks until we have finished applying
the full transaction
Signed-off-by: Sage Weil <sage@newdream.net>
Make individual transactions idempotent, but their interactions
non-idempotent. I.e. A A A A is okay, but A B A is not.
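To illustrate the distinction with a toy model (an assumption for
illustration, not the actual test code): let A be "clone src to dst" and B be
"overwrite src". Repeating A is harmless, but replaying A after B changes the
result.

```cpp
#include <map>
#include <string>

// Toy object store: object name -> contents. Purely illustrative.
using Store = std::map<std::string, std::string>;

// Op A: clone src into dst. Repeating it back-to-back changes nothing.
inline void op_a_clone(Store& s) { s["dst"] = s["src"]; }

// Op B: overwrite src. Also idempotent on its own.
inline void op_b_write(Store& s) { s["src"] = "v2"; }

inline Store initial_store() { return Store{{"src", "v1"}}; }

// A A A A: dst ends up with the original contents every time.
inline std::string dst_after_a_a_a_a()
{
  Store s = initial_store();
  for (int i = 0; i < 4; ++i)
    op_a_clone(s);
  return s["dst"];
}

// A B: the intended result, dst holds the contents src had before B.
inline std::string dst_after_a_b()
{
  Store s = initial_store();
  op_a_clone(s);
  op_b_write(s);
  return s["dst"];
}

// A B A: replaying A after B clones the modified src, so dst is wrong.
inline std::string dst_after_a_b_a()
{
  Store s = initial_store();
  op_a_clone(s);
  op_b_write(s);
  op_a_clone(s);
  return s["dst"];
}
```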
Signed-off-by: Sage Weil <sage@newdream.net>
We need to wake up the sync thread (duh).
Also, we need to obey the FileJournal::lock -> journal_lock locking
order.
Also, lockdep is broken. :(
Signed-off-by: Sage Weil <sage@newdream.net>
This is a big hammer to fix journal replay on non-btrfs fs backends (extN,
xfs, whatever). The problem is that it is not safe to replay some journal
operations more than once, notably things like CLONE whose source data
may be changed by subsequent operations.
The simple fix is to initiate a full commit after any non-idempotent
operations prior to any subsequent operation within the same Sequencer.
This is done by calling trigger_commit() in _do_transactions(), which means
any potentially dependent operation that follows will get blocked because
a commit is about to start.
I made trigger_commit() a bit more robust for callers who are not holding
an open_ops ref: it now also succeeds if the given op_seq is already
committing. For the current caller, that can't happen.
There are probably better performing solutions, but this one is at least
correct.
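A much-simplified sketch of the idea, not the FileStore code: every name
below (ToyStore, Op, dst_after_crash_and_replay) is hypothetical, and the
commit is modeled as completing synchronously instead of blocking later ops
while a real commit runs. The point is only that committing right after a
non-idempotent op moves the replay start past it.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <vector>

using Store = std::map<std::string, std::string>;

// One journaled operation plus a flag saying whether replaying it is safe.
struct Op {
  std::function<void(Store&)> apply;
  bool idempotent;
};

struct ToyStore {
  Store disk;                       // toy model: every applied op reaches disk
  std::uint64_t committed_seq = 0;  // replay resumes after this seq
  bool commit_after_non_idempotent = false;

  // Apply the journal in order; seq numbers start at 1.
  void do_transactions(const std::vector<Op>& journal)
  {
    for (std::size_t i = 0; i < journal.size(); ++i) {
      journal[i].apply(disk);
      if (commit_after_non_idempotent && !journal[i].idempotent)
        committed_seq = i + 1;  // stand-in for a triggered full commit
    }
  }

  // After a crash, re-apply everything newer than the last commit.
  void replay(const std::vector<Op>& journal)
  {
    for (std::uint64_t seq = committed_seq + 1; seq <= journal.size(); ++seq)
      journal[seq - 1].apply(disk);
  }
};

// Journal: CLONE src->dst (non-idempotent), then overwrite src. Crash, replay.
inline std::string dst_after_crash_and_replay(bool with_fix)
{
  std::vector<Op> journal = {
    {[](Store& s) { s["dst"] = s["src"]; }, false},
    {[](Store& s) { s["src"] = "v2"; }, true},
  };
  ToyStore fs;
  fs.disk["src"] = "v1";
  fs.commit_after_non_idempotent = with_fix;
  fs.do_transactions(journal);
  fs.replay(journal);  // simulated restart
  return fs.disk["dst"];
}
```

In this model, replay without the extra commit re-runs the clone against the
already-overwritten source; with it, replay starts at the write and dst keeps
the original contents.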
Fixes: #213
Signed-off-by: Sage Weil <sage@newdream.net>