This makes read operations from standby mdses distinguishable from those
of the normal mds by changing the node name in the messenger.
Previously, the replay node had the same name as the node it was
following.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
If we encounter a partial tail entry, we drop it by moving the write_pos
(end of journal) back to read_pos. We also need to reset the read
state (read_buf, requested/received_pos) so that subsequent replay attempts
won't be horribly confused.
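
A minimal sketch of the reset described above, written as a small Python
model rather than the actual journaler code. The names JournalState and
drop_partial_tail are hypothetical; only write_pos, read_pos, read_buf,
requested_pos and received_pos come from the description.

```python
class JournalState:
    """Toy model of the journal positions named in the commit message."""

    def __init__(self, read_pos, write_pos):
        self.read_pos = read_pos          # next entry to replay
        self.write_pos = write_pos        # end of journal
        self.requested_pos = read_pos     # how far reads have been issued
        self.received_pos = read_pos      # how far read data has arrived
        self.read_buf = bytearray()       # bytes received but not yet consumed


def drop_partial_tail(j):
    """Discard a partial tail entry and reset the read state."""
    j.write_pos = j.read_pos              # journal now ends at the last whole entry
    j.read_buf.clear()                    # forget the partial bytes
    j.requested_pos = j.read_pos          # nothing is in flight past read_pos
    j.received_pos = j.read_pos


j = JournalState(read_pos=100, write_pos=130)
j.requested_pos = j.received_pos = 130    # we read up to the old end
j.read_buf += b"x" * 30                   # but those 30 bytes are an incomplete entry
drop_partial_tail(j)
```

After the call, all four positions agree, so a later replay attempt starts
from a consistent state.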
Signed-off-by: Sage Weil <sage@newdream.net>
The add_ambiguous_import() call was clobbering the bounds field for
EImportStart::replay(), screwing up the subtree auth adjustment. Make the
argument const.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
We need to rejoin dirfrag replicas explicitly. We were doing only dentries,
but that won't necessarily include every dirfrag.
Signed-off-by: Sage Weil <sage@newdream.net>
This ensures that try_trim_non_auth_subtree() doesn't throw out a subtree
we're in the midst of importing (during journal replay).
Signed-off-by: Sage Weil <sage@newdream.net>
This helper captures the logic of keeping subtrees when necessary but
dropping them when possible, and cleaning up as appropriate.
Signed-off-by: Sage Weil <sage@newdream.net>
We have to keep export bounds open for auth subtrees. After we export a
subtree, though, there are two opportunities to drop empty dirfrags from
our cache:
- The children of the exported subtree may now be trimmable, if they are
  also non-auth and empty.
- The exported subtree may be trimmable if it is empty and the parent is
  also non-auth. This may be true for ancestors further up the hierarchy
  as well.
This helps ensure that when we get to rejoin, the only non-auth subtrees we
have are there because they are non-empty or because they are bounds on our
own subtrees.
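
The two trimming opportunities above can be sketched on a toy tree. This is
an illustrative Python model, not the MDS cache code: Dirfrag, drop and
trim_after_export are hypothetical names, and "empty" is assumed here to
mean "no cached children".

```python
class Dirfrag:
    """Toy cached dirfrag; 'empty' here means no cached children."""

    def __init__(self, name, auth, parent=None):
        self.name = name
        self.auth = auth
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def is_empty(self):
        return not self.children


def drop(frag):
    """Remove a dirfrag from its parent's cached children."""
    frag.parent.children.remove(frag)


def trim_after_export(exported):
    """Apply both trimming opportunities once `exported` is non-auth."""
    exported.auth = False
    # 1. Children of the exported subtree: drop those that are non-auth
    #    and empty. Auth children are bounds of our own subtrees and stay.
    for child in list(exported.children):
        if not child.auth and child.is_empty():
            drop(child)
    # 2. The exported subtree, then its ancestors: drop while the dirfrag
    #    is empty and non-auth and its parent is also non-auth.
    frag = exported
    while (frag.parent is not None and frag.is_empty()
           and not frag.auth and not frag.parent.auth):
        parent = frag.parent
        drop(frag)
        frag = parent


root = Dirfrag("root", auth=False)
a = Dirfrag("a", auth=False, parent=root)
s = Dirfrag("s", auth=True, parent=a)
b = Dirfrag("b", auth=False, parent=s)
trim_after_export(s)   # b, then s, then a are all dropped
```

If the parent of the exported subtree is auth, the loop stops immediately
and the exported dirfrag stays in cache as a bound.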
Signed-off-by: Sage Weil <sage@newdream.net>
Previously, an mds could go into standby replay before the mds it was
replaying had finished creating.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
We want to advance requested_pos *only* as far as we actually want to read,
i.e., the previously-probed end of the journal.
Advancing it any further will, among other things, screw us up later when
we reprobe and try to read more, because requested_pos is already past
read_pos.
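
A rough sketch of the clamping, as a standalone Python function rather than
the journaler code. The name issue_read and the parameter want are
hypothetical, and write_pos is assumed here to stand for the
previously-probed end of the journal.

```python
def issue_read(requested_pos, write_pos, want):
    """Return (new_requested_pos, length), never reading past write_pos."""
    # Clamp the read so requested_pos stops at the probed end of the journal.
    length = max(0, min(want, write_pos - requested_pos))
    return requested_pos + length, length


new_pos, length = issue_read(requested_pos=100, write_pos=130, want=64)
```

Only 30 bytes remain before the probed end, so the 64-byte request is cut
down and requested_pos lands exactly on write_pos.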
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
If we have the base dirfrag, do not request it. Otherwise we can get a
reply that contains only that (partial progress), and we will then fail
to wake up our dentry waiter.
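
As an illustration only, the rule can be modeled as a flag on the request.
The function build_discover and the dictionary keys are hypothetical and do
not reflect the real message layout.

```python
def build_discover(have_base_dirfrag, want_dentry):
    """Build a toy discover request for `want_dentry` under a base dirfrag."""
    return {
        # Ask for the base dirfrag only when it is not cached yet. If we
        # already hold it, a reply carrying just the base dirfrag would make
        # no progress toward the dentry and would wake no dentry waiter.
        "want_base_dir": not have_base_dirfrag,
        "want_dentry": want_dentry,
    }


req = build_discover(have_base_dirfrag=True, want_dentry="foo")
```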
This was broken with the rewrite in b58b8d098e.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
get_bucket throws an exception when the bucket doesn't exist, whereas
lookup just returns None. Sometimes we want the former behavior.
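
The difference can be shown with a stand-in connection object. FakeConnection
and BucketNotFound are hypothetical names used only for this sketch; the real
client's class and exception type are not reproduced here.

```python
class BucketNotFound(Exception):
    """Hypothetical error type standing in for the library's exception."""


class FakeConnection:
    """Stand-in connection; not the real client class."""

    def __init__(self, buckets):
        self._buckets = dict(buckets)

    def get_bucket(self, name):
        # Raises when the bucket does not exist.
        try:
            return self._buckets[name]
        except KeyError:
            raise BucketNotFound(name) from None

    def lookup(self, name):
        # Returns None when the bucket does not exist.
        try:
            return self.get_bucket(name)
        except BucketNotFound:
            return None


conn = FakeConnection({"logs": object()})
missing = conn.lookup("nope")             # None: the caller must check it
try:
    conn.get_bucket("nope")
    raised = False
except BucketNotFound:
    raised = True                         # the failure cannot be silently ignored
```

The raising variant is the better fit when a missing bucket is a real error
and should stop the caller right away.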
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
When initializing the config_options array, complain if the size of the
option field we're trying to initialize doesn't match the size of our
type. This will prevent careless type annotations from overwriting
neighboring option fields.
Also create a header called "static assert" which implements a
compile-time assert.
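
Python has no compile-time assert, so the following is only a load-time
analogue of the size check, sketched with ctypes. The Config struct, its two
fields and the option helper are hypothetical and are not the real
config_options layout.

```python
import ctypes


class Config(ctypes.Structure):
    """Hypothetical config struct with two option fields."""
    _fields_ = [("debug", ctypes.c_int32), ("max_bytes", ctypes.c_int64)]


def option(name, declared_type):
    """Build a table entry, refusing a type whose size differs from the field's."""
    field = getattr(Config, name)         # ctypes field descriptor
    declared_size = ctypes.sizeof(declared_type)
    if field.size != declared_size:
        raise TypeError("option %s: field is %d bytes, declared type is %d"
                        % (name, field.size, declared_size))
    return (name, field.offset, declared_type)


# A mismatched annotation fails here, while the table is being built,
# instead of writing past the field later.
config_options = [
    option("debug", ctypes.c_int32),
    option("max_bytes", ctypes.c_int64),
]
```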
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
Previously we could only path_traverse and retry a request or message.
This just allows an explicit context to be used as well. It's the caller's
job to clean it up if we return <= 0.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>