We can blacklist either a specific instance (1.2.3.4:1234/5678) or an
entire IP, in which case the table has something like "1.2.3.4:0/0" (a port
and nonce of 0).
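
As a rough illustration of the lookup this implies, here is a minimal sketch. The Addr struct and is_blacklisted() are hypothetical stand-ins, not the actual types or function names in the tree: an address matches if it is listed exactly, or if its IP is listed with a port and nonce of 0.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <string>
#include <tuple>

// Hypothetical stand-in for an entity address of the form ip:port/nonce.
struct Addr {
  std::string ip;
  uint16_t port;
  uint32_t nonce;
  bool operator<(const Addr& o) const {
    return std::tie(ip, port, nonce) < std::tie(o.ip, o.port, o.nonce);
  }
};

// Returns true if `a` is listed exactly, or if its whole IP is listed
// as "ip:0/0" (a port and nonce of 0).
inline bool is_blacklisted(const std::set<Addr>& table, const Addr& a) {
  if (table.count(a) > 0)
    return true;                              // specific instance
  return table.count(Addr{a.ip, 0, 0}) > 0;   // entire IP
}
```

With "1.2.3.4:1234/5678" and "5.6.7.8:0/0" in the table, only that one instance on 1.2.3.4 matches, while every instance on 5.6.7.8 matches.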
We would send an incremental for anything >1, or the latest map, but not
osdmap e1 itself. Fix the condition, and make send_incremental() smart
about starting with the full map at 1 as needed.
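
The intended behavior could look roughly like the sketch below. It assumes that epoch 1 can only be sent as a full map and that every later epoch has an incremental; plan_send() and MapToSend are illustrative names, not the real send_incremental() signature.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of one map message to send.
struct MapToSend {
  uint32_t epoch;
  bool full;  // true = full map, false = incremental
};

// `have` is the newest epoch the peer already has (0 = none);
// `latest` is our newest epoch. Returns the maps to send, in order.
inline std::vector<MapToSend> plan_send(uint32_t have, uint32_t latest) {
  std::vector<MapToSend> out;
  uint32_t e = have + 1;
  if (e == 1 && latest >= 1) {
    out.push_back({1, true});  // assumed: e1 is only available as a full map
    ++e;
  }
  for (; e <= latest; ++e)
    out.push_back({e, false});  // incrementals for everything after that
  return out;
}
```

A peer with no map at all gets the full map at e1 followed by incrementals, while a peer already at e2 or later gets incrementals only.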
If the client reconnects, the journal 'close' replay doesn't remove the
session, which leaves the session state intact. The replay needs to reset
that state in this case, or else we get problems if the session is reopened
and the state doesn't match up.
Reported-by: Nat N <phenisha@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>
Reimplement core readdir (readdir_r_cb), using kclient as a template.
Reimplement all other readdir variants in terms of readdir_r_cb.
The main change is support for frag chunking, plus (hopefully) fixes for
lots of subtle bugs that were already fixed in the kclient code we're
based on.
If we don't mark down the hb link immediately, we'll forget about it
because it won't be in the from or to set anymore, and if it does go down
later we'll end up with garbage in the logs.
Instead, always mark it down. Since we want to share our map with old
peers that are still up, do that via the cluster link instead, which is
reliably marked down if/when the peer goes down.
Signed-off-by: Sage Weil <sage@newdream.net>
The forcefed mapping relies on a parent map. However, the current
implementation assumes that the parent mapping is unique for all rules. If
that is not the case (i.e., some osd exists in multiple hierarchies) then
we cannot assert that the TAKE matches the calculated force_context.
For now, we can just fail the mapping in that case (we don't use forcefed
mappings yet). The real solution is probably to define parent maps for
all possible hierarchies (i.e., starting at each unique TAKE starting
point).
Signed-off-by: Sage Weil <sage@newdream.net>
We can't trust the inode rstat size without holding the locks. We can
look at our auth frags, though, without fear of a false positive
ENOTEMPTY.
Rename the function, introduce a helper for the locked check, update
comments, etc.
This fixes a race when reading and deleting objects, as evidenced by
  cp bigfile a
  mkdir .snap/foo
  rmdir a
  diff bigfile .snap/foo/a   <-- reads cloned object before it hits disk
Reproduced by snaptest-snap-rm-cmp.sh.
Signed-off-by: Sage Weil <sage@newdream.net>