Also catches a corner case found by Zheng where an unjournaled directory
causes export pinning to fail because the directory cannot be made a subtree
until its parent is stable.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
With Zheng's help, now that the code has captured all the paths where an inode
should be checked for export pins, we don't need to look at parents anymore.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Zheng's suggestion: "maybe_export_pin is called by MDCache::add_inode() and
Migrator. For the first case, inode has no dirfrag (I think it's better to call
this function in CInode::get_or_open_dirfrag)"
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
The idea here is that a pinned inode should not be exported when its parent is.
Setting the pinned inode's dirfrags to aux subtrees prevents them from being
merged with a parent subtree.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
i.e. don't keep walking the parents after adding the ancestor (or the node
itself) to the export_pin_queue.
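A minimal sketch of the early-stop walk, assuming a node is queued when it
carries an export pin. Node, export_pin, and queue_nearest_pinned are
hypothetical names for illustration; this is not the MDCache/CInode code.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an inode with a parent link.
struct Node {
  Node* parent = nullptr;
  int export_pin = -1;  // assumption: -1 means "no pin"
};

// Walk from `n` toward the root. Queue the first pinned node found
// (possibly `n` itself) and stop; do not keep walking the parents.
inline bool queue_nearest_pinned(Node* n, std::vector<Node*>& queue) {
  for (Node* cur = n; cur != nullptr; cur = cur->parent) {
    if (cur->export_pin >= 0) {
      queue.push_back(cur);
      return true;  // early exit after the first match
    }
  }
  return false;  // nothing pinned between `n` and the root
}
```

With a pinned root and a pinned intermediate directory, a walk starting at an
unpinned leaf queues only the intermediate directory.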
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
This adds a chain of linked lists to CInode which can be followed to CInodes
that are export pinned to this rank.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
This allows the client/admin to pin a directory tree to a particular rank,
preventing its export by the dynamic balancer.
Fixes: http://tracker.ceph.com/issues/17834
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
This commit moves the MDSMap export_targets updates/handling to MDSRank. It is
necessary to wait for export_targets to be updated prior to doing any exports
as clients must have sessions open with targets of exports before any export
can be performed. Before this commit, this handling was only done for
migrations initiated by the balancer and not for manual migrations done by the
admin socket.
This fix and refactoring does the following:
o MDSRank now manages export_targets via a map of ranks with DecayCounters.
  MDSRank::hit_export_target enables the Migrator/MDBalancer to hit a rank to
  indicate migration is either desired or in progress. Importantly, updating
  export_targets is no longer tied to the previous MDBalancer cycle.
o mds_bal_target_removal_min and mds_bal_target_removal_max are removed in
  favor of a DecayRate, via mds_bal_target_decay, which is independent of the
  tick rate.
o Certain balancing state has been pulled out of the MDBalancer into a stack
  variable type (balance_state_t). This is to avoid unnecessary persistence
  of the my_targets, imported, and exported maps, which made the code
  confusing.
o try_rebalance is no longer called on MDSMap updates. This was done before
  export target checking was part of the balancer, in 3e36d3202.
o The Migrator now hits a rank in export_targets via MDSRank::hit_export_target
  proportional to how much is being exported and periodically during the
  course of the export. In my testing, one "default" hit (-1) will at least
  keep the target in the export map for about 2 minutes.
o The Migrator will wait until a target is in the export_targets before
  it actually does the export, or abort the export if the target is not
  added in a timely manner.
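A rough sketch of the "map of ranks with decaying counters" idea from the list
above. Everything here is hypothetical (ExportTargetsSketch, the half-life and
threshold parameters, time passed in as seconds); it does not reproduce
MDSRank::hit_export_target or the mds_bal_target_decay semantics, and it does
not model the Migrator waiting for or aborting an export.

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <set>

using rank_t = int;  // hypothetical stand-in for an MDS rank id

// Hypothetical per-rank counter: a value plus the time it was last updated.
struct TargetCounter {
  double val;
  double last;
};

class ExportTargetsSketch {
public:
  ExportTargetsSketch(double half_life, double threshold)
    : half_life_(half_life), threshold_(threshold) {}

  // Record that migration to `r` is desired or in progress.
  void hit(rank_t r, double now, double amount) {
    auto it = targets_.find(r);
    if (it == targets_.end())
      it = targets_.emplace(r, TargetCounter{0.0, now}).first;
    it->second.val = decayed(it->second, now) + amount;
    it->second.last = now;
  }

  // Drop ranks whose counter decayed below the threshold and
  // return the ranks that remain export targets.
  std::set<rank_t> current(double now) {
    std::set<rank_t> out;
    for (auto it = targets_.begin(); it != targets_.end();) {
      if (decayed(it->second, now) < threshold_) {
        it = targets_.erase(it);
      } else {
        out.insert(it->first);
        ++it;
      }
    }
    return out;
  }

  bool is_target(rank_t r, double now) { return current(now).count(r) > 0; }

private:
  // Exponential decay with the configured half-life.
  double decayed(const TargetCounter& c, double now) const {
    return c.val * std::exp2(-(now - c.last) / half_life_);
  }

  std::map<rank_t, TargetCounter> targets_;
  double half_life_;
  double threshold_;
};
```

Because the decay depends only on elapsed time, how long a rank stays a target
does not depend on how often current() is called, which mirrors the
"independent of the tick rate" point. Repeated hits during a long export keep
the rank above the threshold.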
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Funnel dispatch through MDCache::dispatch_request so we have only one call site
for Migrator::dispatch_export_dir.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
An anonymous inode may not have a stable parent so immediate migration would
cause a segfault when checking for strays.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
This resolves a bug where reading the DecayCounter value before one second has
elapsed always returns 0. This is because DecayCounter::hit adds to the delta,
not the value. The delta is folded into the value only in the decay function,
which processes changes only once at least 1 second has elapsed since the last
decay.
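The failure mode can be illustrated with a simplified counter. This is a
sketch only: SketchDecayCounter, get_stale, and get_with_delta are hypothetical
names, time is passed in as plain seconds, and returning val + delta is just
one plausible way to make pending hits visible, not necessarily the change
made here.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical, simplified decaying counter; not Ceph's DecayCounter.
struct SketchDecayCounter {
  double val = 0.0;         // decayed value
  double delta = 0.0;       // hits accumulated since the last decay
  double last_decay = 0.0;  // time of the last decay, in seconds
  double half_life;

  explicit SketchDecayCounter(double hl) : half_life(hl) {}

  // Hits land in delta, not in val.
  void hit(double now, double v = 1.0) {
    decay(now);
    delta += v;
  }

  // delta is folded into val only once at least 1 second has elapsed.
  void decay(double now) {
    double el = now - last_decay;
    if (el >= 1.0) {
      val = (val + delta) * std::exp2(-el / half_life);
      delta = 0.0;
      last_decay = now;
    }
  }

  // Reads val only: pending hits in delta are invisible.
  double get_stale(double now) {
    decay(now);
    return val;
  }

  // Includes pending hits, so a read inside the first second is nonzero.
  double get_with_delta(double now) {
    decay(now);
    return val + delta;
  }
};
```

A hit at t=0.1 followed by a read at t=0.2 yields 0 from get_stale and 1 from
get_with_delta; after a second has passed, both reads see the decayed hit.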
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>