If we have the base dirfrag, do not request it. Otherwise we can get a
reply that contains only that (partial progress), and we will then fail
to wake up our dentry waiter.
This was broken with the rewrite in b58b8d098e.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
To ensure consistency, always set the snap context when the header is
updated. If snapid is set, we update librados' snapid when refreshing
the header as well. Also use CEPH_NOSNAP instead of 0 as the default
snapid to prevent confusion. These changes fix snapshot creation
and removal, and prevent writing to a snapshot.
Rollback is fixed by using selfmanaged_snapshot_rollback.
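
The default-snapid change can be illustrated with a minimal sketch. It is
not the librbd code: ImageCtxSketch, NOSNAP_SENTINEL, is_snapshot() and
write_sketch() are hypothetical names, and the sentinel value is an
assumption chosen for illustration. It only shows why a dedicated "no
snapshot" sentinel, rather than 0, makes the "refuse writes to a snapshot"
check unambiguous.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <string>

// Hypothetical stand-in for CEPH_NOSNAP: "the live head, not a snapshot".
// The value is an assumption made for this sketch.
constexpr std::uint64_t NOSNAP_SENTINEL = static_cast<std::uint64_t>(-2);

// Hypothetical, heavily simplified image context.
struct ImageCtxSketch {
  std::uint64_t snapid = NOSNAP_SENTINEL;  // default is the head, never 0
  std::string data;
};

// True when the image is opened at a snapshot rather than at the head.
inline bool is_snapshot(const ImageCtxSketch& ictx) {
  return ictx.snapid != NOSNAP_SENTINEL;
}

// Appends to the image; refuses with -EROFS when opened at a snapshot.
inline int write_sketch(ImageCtxSketch& ictx, const std::string& buf) {
  if (is_snapshot(ictx))
    return -EROFS;
  ictx.data += buf;
  return 0;
}
```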
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
This was removed in 2cb86f713d, but is
required for selfmanaged snaps because their snapids aren't in the
pool's snap list, which is how regular rollback finds them.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
The tmp directory is removed after each daemon. Previously, this would
break if two daemons were on the same node. Now, the files will be
copied for each daemon.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
is_readable() may need to adjust the write_pos backward, but will return
false. If we are at the end we still need to wake up any waiters so they
know about it.
Signed-off-by: Sage Weil <sage@newdream.net>
This was consistently breaking stuff for some people, as the acks were
high priority but the commits weren't. They should match.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
This lets you manually inject a monmap into a down monitor. This is useful
in cases where you need to change the monmap but aren't able to get a
quorum with the old map.
Signed-off-by: Sage Weil <sage@newdream.net>
If the dentry isn't marked dirty, _commit_partial won't save it. This is
caught later by check_rstats() (or by anyone actually trying to use the
/.ceph directory).
Fixes: #938
Signed-off-by: Sage Weil <sage@newdream.net>
We want to remove the client session from the map as long as it is not
attached to an actual messenger Connection. This key point got lost
somewhere the last time the session states were restructured. It is now
explicit.
This fixes the symptom where a recovering MDS has to wait for its reconnect
phase to time out on clients that cleanly closed their sessions.
Also, fix a use-after-free when (uselessly) printing the session state.
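
The session removal rule above can be sketched as follows. This is a
simplified illustration, not the MDS code: ConnectionSketch, SessionSketch,
SessionMapSketch and remove_if_detached() are hypothetical names, and
session state is reduced to whether a Connection is attached.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a messenger Connection.
struct ConnectionSketch {};

// Hypothetical session record: connection is null when nothing is attached.
struct SessionSketch {
  std::shared_ptr<ConnectionSketch> connection;
};

using SessionMapSketch = std::map<std::string, SessionSketch>;

// Removes the client's session only when no Connection is attached.
// Returns true if the session was removed from the map.
inline bool remove_if_detached(SessionMapSketch& sessions,
                               const std::string& client) {
  auto it = sessions.find(client);
  if (it == sessions.end())
    return false;                 // unknown client, nothing to do
  if (it->second.connection)
    return false;                 // still attached, keep the session
  sessions.erase(it);             // detached: drop it from the map
  return true;
}
```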
Signed-off-by: Sage Weil <sage@newdream.net>
Add a flag, CINIT_FLAG_NO_DEFAULT_CONFIG_FILE, that specifies that the
program should not read a config file by default.
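
A minimal sketch of how such a flag might be consulted. The flag value, the
default path, and pick_config_file() are assumptions made for illustration;
the sketch also assumes that an explicitly requested config file is still
honored when the flag is set.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical bit value; the real CINIT_FLAG_NO_DEFAULT_CONFIG_FILE value
// is not stated here.
constexpr std::uint32_t SKETCH_FLAG_NO_DEFAULT_CONFIG_FILE = 0x2;

// Assumed default location, used only for this sketch.
const std::string kSketchDefaultConf = "/etc/ceph/ceph.conf";

// Returns the config file to read, or an empty string for "read none".
inline std::string pick_config_file(std::uint32_t flags,
                                    const std::string& explicit_path) {
  if (!explicit_path.empty())
    return explicit_path;         // assumption: an explicit path always wins
  if (flags & SKETCH_FLAG_NO_DEFAULT_CONFIG_FILE)
    return std::string();         // flag set: skip the default file
  return kSketchDefaultConf;
}
```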
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
If this is set to true (which it defaults to), then the mon
will force MDSes configured as mds_standby_replay to become active.
For #893.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
The mds_log_unsafe mode would wait for ack for some journal writes, and
safe for others. Now that we can reply to client requests without waiting
for the journal to flush (as of ~2 years ago), this distinction is no
longer useful. It is also more error-prone, as it complicates the code
and vastly expands the possible combinations of MDS failures and replay
scenarios we need to verify.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Allow users to create bucket names that don't meet the S3
recommendations, but which do meet the spec.
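
The difference between the two rule sets can be sketched as below. The
exact checks rgw performs are not stated here; both functions are
hypothetical and encode simplified assumptions: a relaxed rule (3 to 255
characters from letters, digits, '.', '-', '_') and a stricter DNS-style
rule (3 to 63 characters, lowercase letters, digits, '.', '-', starting and
ending with a letter or digit).

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Relaxed rule (assumed): 3-255 chars, letters, digits, '.', '-', '_'.
inline bool valid_relaxed(const std::string& name) {
  if (name.size() < 3 || name.size() > 255)
    return false;
  for (unsigned char c : name) {
    if (!std::isalnum(c) && c != '.' && c != '-' && c != '_')
      return false;
  }
  return true;
}

// Stricter DNS-style rule (assumed): 3-63 chars, lowercase letters, digits,
// '.', '-', and the first and last characters must be alphanumeric.
inline bool valid_dns_style(const std::string& name) {
  if (name.size() < 3 || name.size() > 63)
    return false;
  for (unsigned char c : name) {
    if (!std::islower(c) && !std::isdigit(c) && c != '.' && c != '-')
      return false;
  }
  return std::isalnum(static_cast<unsigned char>(name.front())) &&
         std::isalnum(static_cast<unsigned char>(name.back()));
}
```

Under these assumptions, a name such as My_Bucket passes the relaxed check
but fails the DNS-style one.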
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
The goal is for the MDS to stop processing requests when it hasn't heard
from the monitors, to avoid a situation where a rogue process goes off
doing its own thing. Yes, if we fail it over, the cmds can't write to the
object store, but it can still reply to clients when it is not appropriate
to do so.
The old logic was fragile and wonky, with messages getting deferred, and
then re-deferred. This implementation is much cleaner and should be much
more efficient and less fragile. There are still improvements to be made
as far as which messages we do/do not process when we think we're laggy.
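
One simple way to avoid re-deferral is a single FIFO that is filled while
laggy and drained once when the laggy state clears. The sketch below shows
that idea only; it is not the MDS implementation, LaggyQueueSketch and its
members are hypothetical, and it defers every message, whereas the real
code may still process some message types while laggy.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Hypothetical dispatcher: defers messages while laggy, replays them in
// arrival order once the laggy state clears.
class LaggyQueueSketch {
 public:
  // Called when our view of monitor liveness changes.
  void set_laggy(bool laggy) {
    laggy_ = laggy;
    if (!laggy_)
      flush();
  }

  // Processes a message now, or defers it if we think we are laggy.
  void handle(const std::string& msg) {
    if (laggy_) {
      deferred_.push_back(msg);
      return;
    }
    processed_.push_back(msg);
  }

  const std::vector<std::string>& processed() const { return processed_; }
  std::size_t deferred_count() const { return deferred_.size(); }

 private:
  // Drains the deferred queue exactly once, preserving order.
  void flush() {
    assert(!laggy_);
    while (!deferred_.empty()) {
      processed_.push_back(deferred_.front());
      deferred_.pop_front();
    }
  }

  bool laggy_ = false;
  std::deque<std::string> deferred_;
  std::vector<std::string> processed_;
};
```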
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Back in olden times, when we would wait for acks for some journal
writes, we did an extra wait_for_safe() before discarding a journal segment
to make sure anything being discarded was safely committed in newer
segments. These days mds_log_unsafe is always false (and
journaler_safe is true), so we can skip this check.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>