We can now permanently mark objects as lost by setting the lost bit in
their object_info_t. Rev the object_info_t struct.
get_object_context: re-arrange this so that we're always setting the
lost bit. Also avoid some unnecessary steps.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
In activate_map, we now mark objects that we know are unfindable as
lost. This relies on the might_have_unfound set introduced earlier.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
Install SIGSEGV / SIGABRT handlers with sigaction using SA_RESETHAND.
This will ensure that if the signal handler itself encounters another
fault, the default signal handler (usually dump core) will be what is
used. Also, flush the log before dumping core.
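
A minimal sketch of the installation described above, assuming a POSIX
system. The names fatal_signal_handler and install_fatal_handler are
hypothetical and are not taken from the actual change; a real handler
would flush the log where the comment indicates.

```cpp
#include <csignal>
#include <cstring>
#include <unistd.h>

// Hypothetical handler; the name and body are illustrative only.
extern "C" void fatal_signal_handler(int signum) {
  // Only async-signal-safe calls are allowed here. A real handler would
  // flush the log at this point before letting the process die.
  static const char msg[] = "*** caught fatal signal ***\n";
  ssize_t r = write(STDERR_FILENO, msg, sizeof(msg) - 1);
  (void)r;
  // SA_RESETHAND already restored the default disposition when this
  // handler was entered, so the re-raised signal takes the default
  // action (usually a core dump) once it is delivered.
  raise(signum);
}

// Install the handler for one signal. Returns true on success.
bool install_fatal_handler(int signum) {
  struct sigaction sa;
  std::memset(&sa, 0, sizeof(sa));
  sa.sa_handler = fatal_signal_handler;
  sigemptyset(&sa.sa_mask);
  // SA_RESETHAND: reset the disposition to SIG_DFL on handler entry, so a
  // second fault inside the handler is not routed back into it.
  sa.sa_flags = SA_RESETHAND;
  return sigaction(signum, &sa, nullptr) == 0;
}
```

A caller would invoke install_fatal_handler(SIGSEGV) and
install_fatal_handler(SIGABRT) once during startup.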
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
Don't request information from an OSD unless it is up and part of the
might_have_unfound set. Add more logging.
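
The filter above can be sketched as a set intersection. This is an
illustration only: osds_to_query and the use of plain int OSD ids are
assumptions, not the code in this change.

```cpp
#include <algorithm>
#include <iterator>
#include <set>

// Hypothetical helper: return the OSDs worth querying, i.e. those that
// are both in might_have_unfound and currently up.
std::set<int> osds_to_query(const std::set<int>& might_have_unfound,
                            const std::set<int>& up_osds) {
  std::set<int> out;
  std::set_intersection(might_have_unfound.begin(), might_have_unfound.end(),
                        up_osds.begin(), up_osds.end(),
                        std::inserter(out, out.begin()));
  return out;
}
```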
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
Rather than vectors of pointers, use vectors of NodeInfo structures.
This avoids the problem of freeing the NodeInfo structures.
GuiMonitor::gen_node_info_from_icons: initialize status.
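
A rough sketch of the value-based layout, assuming a NodeInfo with just
two fields. The fields and the helper gen_node_info are hypothetical;
the real structure is not shown in this message.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for NodeInfo; the real fields may differ.
struct NodeInfo {
  int id;
  std::string status;
};

// Build the vector by value. Elements are destroyed together with the
// vector, so no manual delete loop is needed.
std::vector<NodeInfo> gen_node_info(int count) {
  std::vector<NodeInfo> nodes;
  nodes.reserve(count > 0 ? count : 0);
  for (int i = 0; i < count; ++i) {
    // status is always initialized, never left indeterminate.
    nodes.push_back(NodeInfo{i, "unknown"});
  }
  return nodes;
}
```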
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
Avoid confusing other code (e.g. kick_flushing_caps) by staying on the mds
flushing_caps list when we don't even have an auth_cap with them anymore.
We'll need to re-flush to a new MDS later.
Signed-off-by: Sage Weil <sage@newdream.net>
This should only return true when recovery is done, i.e., no more missing
objects. Nothing to do with unfound.
Signed-off-by: Sage Weil <sage@newdream.net>
In PG::is_all_uptodate, don't try to look for peer_missing[osd->whoami].
The primary keeps that in PG::missing!
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
Formerly, we had a special case in read_log for dealing with objects
whose data was present on the disk, but whose attributes were not. This
conflicts with our plans to mark objects as lost by putting a bit in the
object attributes, since without those attributes, we'll never know if
the objects were formerly marked as lost.
This should almost never happen, and if it does, we just handle the
objects as missing in the normal way.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
The might_have_unfound set is used by the primary OSD during recovery.
This set tracks the OSDs which might have unfound objects that the
primary OSD needs. As we receive Missing from each OSD in
might_have_unfound, we will remove the OSD from the set.
When might_have_unfound is empty, we will mark objects as LOST if the
latest version of the object resided on an OSD marked as lost.
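
The bookkeeping described above can be sketched as follows. This is a
simplified model under stated assumptions: UnfoundTracker, got_missing
and should_mark_lost are hypothetical names, and OSDs are represented
by plain int ids.

```cpp
#include <set>

// Hypothetical model of the primary's recovery bookkeeping.
struct UnfoundTracker {
  // OSDs from which we still expect a Missing message.
  std::set<int> might_have_unfound;

  // Record that `osd` has sent its Missing.
  // Returns true once every OSD in the set has reported.
  bool got_missing(int osd) {
    might_have_unfound.erase(osd);
    return might_have_unfound.empty();
  }
};

// An object is marked LOST only after every candidate OSD has reported
// and the OSD that held its latest version is itself marked as lost.
bool should_mark_lost(const UnfoundTracker& t, int osd_with_latest,
                      const std::set<int>& lost_osds) {
  return t.might_have_unfound.empty() &&
         lost_osds.count(osd_with_latest) > 0;
}
```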
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
No need to specify destination in send_reply, as we always have the request
for reference.
Simplify MRoute constructors (keep the ones we use) for tid and bcast
best-effort case.
Do NOT do a best-effort forward of a reply with a tid specified if the tid
is not in the routed-request map.
Signed-off-by: Sage Weil <sage@newdream.net>
When activating an inactive replica, assert that we are doing so based
on a message from the primary.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
We want to remove replicas that we don't ack, but those don't appear in
the strong_inode map; they're appended to the base_inode bufferlist. Make
a (temporary) set to track who those are so that we know who to get rid of.
Signed-off-by: Sage Weil <sage@newdream.net>
This removes a compiler warning that appeared in a gcc upgrade and
is apparently erroneous, about its usage violating strict-aliasing rules
when the + operator is used.
This actually is initialized before all uses, but compilers tend to
have trouble with assignment in if-else branches, and -1 is considered
invalid so there's no danger of refactoring breaking anything.
switching to a new journal segment.
MDSCache:
The stray member has been replaced with strays, an array of inodes
representing the set of available stray directories, as well as
stray_index indicating the index of the current stray directory.
get_stray() now returns a pointer to the current stray directory
inode.
advance_stray() advances stray_index to the next stray directory.
migrate_stray no longer takes a source argument; the source mds
is inferred from the parent of the dir entry.
stray dir entries are now stray<index> rather than stray.
scan_stray_dir now scans all stray directories.
MDSLog:
start_new_segment now calls advance_stray() on MDSCache to force a new
stray directory.
mdstypes:
NUM_STRAY indicates the number of stray directories to use per MDS
MDS_INO_STRAY now takes an index argument as well as the mds number
MDS_INO_STRAY_OWNER(i) returns the mds owner of the stray directory i
MDS_INO_STRAY_INDEX(i) returns the index of the stray directory i
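
One way the inode numbering and index rotation above could fit
together is sketched below. The constants kNumStray and kStrayOffset,
the helper names, and the wrap-around in advance_stray() are
assumptions made for illustration; they are not the values or macros
defined in mdstypes.

```cpp
#include <cstdint>

constexpr unsigned kNumStray = 10;        // assumed stray dirs per MDS
constexpr uint64_t kStrayOffset = 0x600;  // assumed base inode number

// Map (mds, index) to a stray directory inode number.
constexpr uint64_t stray_ino(unsigned mds, unsigned index) {
  return kStrayOffset + static_cast<uint64_t>(mds) * kNumStray + index;
}

// Recover the owning mds from a stray inode number.
constexpr unsigned stray_ino_owner(uint64_t ino) {
  return static_cast<unsigned>((ino - kStrayOffset) / kNumStray);
}

// Recover the stray directory index from a stray inode number.
constexpr unsigned stray_ino_index(uint64_t ino) {
  return static_cast<unsigned>((ino - kStrayOffset) % kNumStray);
}

// The mapping round-trips for a sample (mds, index) pair.
static_assert(stray_ino_owner(stray_ino(3, 7)) == 3, "owner round-trip");
static_assert(stray_ino_index(stray_ino(3, 7)) == 7, "index round-trip");

// Hypothetical model of the cache's current stray directory index.
struct StrayRing {
  unsigned stray_index = 0;
  // Move to the next stray directory, wrapping after the last one
  // (the wrap-around is an assumption).
  void advance_stray() { stray_index = (stray_index + 1) % kNumStray; }
};
```

In this model, starting a new journal segment would call
advance_stray() once, so consecutive segments use different stray
directories.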
Signed-off-by: Samuel Just <samuelj@hq.newdream.net>
PG::generate_past_intervals needs to generate all the intervals back to
history.last_epoch_clean, rather than just to
history.last_epoch_started. This is required by
PG::build_might_have_unfound, which needs to examine these intervals
when building the might_have_unfound set.
Move the check for whether past_intervals is up-to-date into
generate_past_intervals itself. Fix the check.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>