This mismatch about whether pool IDs are signed or unsigned is
a persistent annoyance. I'm now casting the unsigned down to signed space
because apparently the OSD is using negative IDs for temporary object
namespaces.
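A minimal sketch of the cast described above, assuming pool IDs are 64 bits wide. The width and the helper name `to_signed64` are illustrative assumptions, not taken from the patch:

```python
def to_signed64(value):
    """Reinterpret an unsigned 64-bit integer as a signed 64-bit integer."""
    value &= (1 << 64) - 1          # keep only the low 64 bits
    if value >= (1 << 63):          # top bit set: this is a negative ID
        return value - (1 << 64)
    return value


# An ordinary pool ID is unchanged; an ID with the top bit set maps to
# the negative space said to be used for temporary object namespaces.
ordinary = to_signed64(7)
temporary = to_signed64((1 << 64) - 2)
```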
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
If a directory is complete, we *really* want to keep the exclusive cap
so that we don't end up needing to do MDS lookup requests on every cache
miss.
Fixes: #11226
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
The pg_log.add() call already dirties the log such that the later
write_log() call will write it. There is no need to encode it separately
here and then explicitly omap_setkeys() it.
Signed-off-by: Sage Weil <sage@redhat.com>
Previously, we did not actually set it when we got a pg creation message from
the mon. It would actually get set on the first start_peering_interval after
that point. If we don't get that far, but do send a stat update to the mon, we
can end up with bug 11197. Instead, let's just set it and clear it upon entry
into and exit from the Primary state.
Fixes: 11197
Signed-off-by: Samuel Just <sjust@redhat.com>
A filelock in the LOCK_XSYN state does not allow the Fs cap, so the client
can't mark the directory as complete when handling the readdir reply.
Signed-off-by: Yan, Zheng <zyan@redhat.com>
Handle the case where the kernel does not support fcntl.F_OFD_SETLK.
Also fix the code that checks whether fcntl fails with errno == EINTR.
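The two behaviors above can be sketched as follows. This is an illustration under stated assumptions, not the patched code: the helper name `set_lock`, the injectable `do_fcntl` and `ofd_cmd` parameters, and the choice of EINVAL as the "command not supported" signal are all assumptions made for the example.

```python
import errno
import fcntl

# Not every Python build or platform exposes F_OFD_SETLK; None means "unavailable".
OFD_SETLK = getattr(fcntl, "F_OFD_SETLK", None)


def set_lock(fd, lockdata, do_fcntl=fcntl.fcntl, ofd_cmd=OFD_SETLK):
    """Try an OFD lock first, fall back to F_SETLK, and retry on EINTR.

    Returns the fcntl command that finally succeeded.
    """
    cmds = [c for c in (ofd_cmd, fcntl.F_SETLK) if c is not None]
    for i, cmd in enumerate(cmds):
        while True:
            try:
                do_fcntl(fd, cmd, lockdata)
                return cmd
            except OSError as e:
                if e.errno == errno.EINTR:
                    continue        # interrupted by a signal: retry the same command
                if e.errno == errno.EINVAL and i + 1 < len(cmds):
                    break           # assumed "unsupported command": try the fallback
                raise               # any other failure is a real error
```

Passing the fcntl callable in as a parameter keeps the retry and fallback logic testable without depending on what the running kernel supports.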
Fixes: 11205
Signed-off-by: Yan, Zheng <zyan@redhat.com>
(cherry picked from commit 4ececa3dc4)
Otherwise, we fail to trim the peer's last_backfill_started and get bug 11199.
1) osd 4 backfills up to 31bccdb2/mira01213209-286/head (henceforth: foo)
2) Interval change happens
3) osd 0 now finds itself backfilling to osd.4 (lb=foo) and osd.5
(lb=b6670ba2/mira01213209-160/snapdir//1, henceforth: bar)
4) recover_backfill causes both 4 and 5 to scan forward, so 4 has an interval
starting at foo, 5 has an interval starting at bar.
5) Once those have come back, recover_backfill attempts to trim off the
last_backfill_started, but 4's interval starts after that, so foo remains in
osd 4's interval (this is the bug)
7) We serve a copyfrom on foo (sent to 4 as well).
8) We eventually get to foo in the backfilling. Normally, they would have the
same version, but of course we don't update osd.4's interval from the log since
it should not have received writes in that interval. Thus, we end up trying to
recover foo on osd.4 anyway.
9) But, an interval change happens between removing foo from osd.4 and
completing the recovery, leaving osd.4 without foo, but with lb >= foo
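The trimming problem in steps 3 to 5 can be illustrated with a toy model. Everything here is an assumption made for the example: objects are plain integers in sort order (bar = 10, foo = 20), `trim_to` is a hypothetical helper, and the fix is modeled as trimming each peer's interval to the later of last_backfill_started and that peer's own last_backfill, which may not match the patch exactly.

```python
def trim_to(interval, bound):
    """Drop every object at or before `bound` from a sorted scan interval."""
    return [obj for obj in interval if obj > bound]


BAR, FOO = 10, 20                       # toy stand-ins for the two objects
last_backfill_started = BAR             # the primary has only started up to bar
peer_last_backfill = {4: FOO, 5: BAR}   # lb per backfill target
scanned = {4: [20, 25], 5: [10, 20, 25]}  # each scan starts at the peer's lb

# Trimming only to last_backfill_started leaves foo in osd.4's interval.
buggy = {osd: trim_to(iv, last_backfill_started)
         for osd, iv in scanned.items()}

# Trimming to the later of the two bounds removes foo from osd.4's interval.
fixed = {osd: trim_to(iv, max(last_backfill_started, peer_last_backfill[osd]))
         for osd, iv in scanned.items()}
```

In this model osd.4 already holds foo (its lb is foo), so foo should not appear in the set of objects still to be examined for that peer.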
Fixes: #11199
Backport: firefly
Signed-off-by: Samuel Just <sjust@redhat.com>
tests: TestFlatIndex.cc races with TestLFNIndex.cc
Both use the same PATH and when run in parallel they sometimes conflict.
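One common way to avoid this kind of collision is to give each test its own scratch directory. The sketch below shows that general technique only; the helper names and prefixes are hypothetical, and it is not a description of how this commit resolves the conflict.

```python
import shutil
import tempfile


def make_test_dir(prefix):
    """Create a fresh, uniquely named scratch directory for one test."""
    return tempfile.mkdtemp(prefix=prefix)


def remove_test_dir(path):
    """Delete a scratch directory and everything under it."""
    shutil.rmtree(path, ignore_errors=True)


# Two tests running in parallel each get a distinct directory.
flat_dir = make_test_dir("flat-index-")
lfn_dir = make_test_dir("lfn-index-")
```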
Fixes: #11217
Signed-off-by: Xinze Chi <xmdxcxz@gmail.com>