e.g., 0x432da000~1000 instead of 0x432da000~0x1000
I think it's sufficiently clear that the value after the ~ uses the same
base as the offset before it, and it's easier to read. And less text.
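For illustration, a minimal sketch of a printer following this
convention (the extent type and operator are simplified stand-ins, not
the actual BlueStore code):

  #include <cstdint>
  #include <iostream>

  // Simplified extent type, for illustration only.
  struct extent_t {
    uint64_t offset;
    uint64_t length;
  };

  // Print as 0x<offset>~<length>: both values in hex, but the 0x
  // prefix appears only on the offset; the base of the length is
  // implied by the offset before it.
  std::ostream& operator<<(std::ostream& out, const extent_t& e) {
    return out << "0x" << std::hex << e.offset << '~' << e.length
               << std::dec;
  }

  int main() {
    std::cout << extent_t{0x432da000, 0x1000} << "\n";  // 0x432da000~1000
  }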
Signed-off-by: Sage Weil <sage@redhat.com>
Class-wide Compressor, compression mode, and options. For now these are
global, although later we'll do them per-Collection so they can be pool-
specific.
Signed-off-by: Sage Weil <sage@redhat.com>
Snappy fails to decompress if there are extra zeros in the input buffer.
So, store the length explicitly in the header to avoid feeding them into
the decompressor.
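A minimal sketch of the approach, assuming a 32-bit length prefix (the
header layout and helper names are illustrative, not the actual
encoding):

  #include <snappy.h>
  #include <cstdint>
  #include <cstring>
  #include <string>

  // Prepend the exact compressed length so that trailing zero padding
  // in the stored buffer is never handed to the decompressor.
  std::string compress_with_header(const std::string& in) {
    std::string compressed;
    snappy::Compress(in.data(), in.size(), &compressed);
    uint32_t len = compressed.size();
    std::string out(sizeof(len), '\0');
    std::memcpy(&out[0], &len, sizeof(len));  // sketch: not endian-safe
    out += compressed;
    return out;
  }

  // 'buf' may be longer than the payload (e.g. zero padded); only the
  // recorded number of bytes reaches snappy.
  bool decompress_with_header(const std::string& buf, std::string* out) {
    uint32_t len;
    if (buf.size() < sizeof(len))
      return false;
    std::memcpy(&len, buf.data(), sizeof(len));
    if (buf.size() < sizeof(len) + len)
      return false;
    return snappy::Uncompress(buf.data() + sizeof(len), len, out);
  }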
Signed-off-by: Sage Weil <sage@redhat.com>
- we weren't reading from 'clean' buffers
- restructured the loop a bit while chasing another bug (but it ended
up being in the caller)
Signed-off-by: Sage Weil <sage@redhat.com>
We weren't handling the case of
  read block   0~300
  cache block  100~100
where the result is read(head) + cached + read(tail). Restructure the
loop to handle this.
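The restructured loop effectively performs the split below; a minimal
standalone sketch (types and names are illustrative):

  #include <algorithm>
  #include <cstdint>
  #include <vector>

  struct interval_t {
    uint64_t off, len;
    uint64_t end() const { return off + len; }
  };

  // Split a requested read around one cached interval, yielding the
  // pieces that must still be read from disk.
  // e.g. read 0~300 with cache 100~100 -> disk reads 0~100 and 200~100.
  std::vector<interval_t> uncached_pieces(interval_t want,
                                          interval_t cached) {
    std::vector<interval_t> out;
    uint64_t begin = std::max(want.off, cached.off);
    uint64_t end = std::min(want.end(), cached.end());
    if (begin >= end) {             // no overlap: read everything
      out.push_back(want);
      return out;
    }
    if (want.off < begin)           // head, before the cached extent
      out.push_back({want.off, begin - want.off});
    if (end < want.end())           // tail, after the cached extent
      out.push_back({end, want.end() - end});
    return out;
  }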
Signed-off-by: Sage Weil <sage@redhat.com>
BitmapFreelistManager doesn't like overlapping allocated+released sets
when the debug option is enabled, because it does a read to verify the
op is valid, and the overlapping op may not have been applied to the kv
store yet.
This makes bluestore ObjectStore/StoreTest.SimpleCloneTest/2 pass with
bluestore_clone_cow = false and bluestore_freelist_type = bitmap.
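A toy model of the failure mode (illustrative only, not the actual
BitmapFreelistManager code):

  #include <cassert>
  #include <cstdint>
  #include <map>
  #include <utility>
  #include <vector>

  // The debug checks read *committed* kv state, so overlapping ops
  // queued in the same transaction are not visible to them yet.
  struct ToyFreelist {
    std::map<uint64_t, bool> kv;                     // offset -> allocated?
    std::vector<std::pair<uint64_t, bool>> pending;  // queued txn ops

    void release(uint64_t off) {
      assert(kv[off]);        // debug: must currently be allocated
      pending.push_back({off, false});
    }
    void allocate(uint64_t off) {
      assert(!kv[off]);       // debug: must currently be free -- fires
                              // if a queued release of 'off' has not
                              // been applied to the kv store yet
      pending.push_back({off, true});
    }
    void commit() {
      for (auto& [off, alloc] : pending)
        kv[off] = alloc;
      pending.clear();
    }
  };

Releasing a range and then allocating it again within one transaction
aborts in allocate(), even though the combined result is consistent.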
Signed-off-by: Sage Weil <sage@redhat.com>
Use the blob put_ref helper so that we can deallocate blobs partially
(instead of always waiting until they are completely unused).
Signed-off-by: Sage Weil <sage@redhat.com>
We're only worried about direct writes and wal overwrites; the other write
paths are to freshly allocated blobs.
Signed-off-by: Sage Weil <sage@redhat.com>
We reference count which parts of the blob are used (by lextents), but
currently we only release our space back to the system when all references
go away. That is a problem if the blob is large (say, 4MB) and we
truncate off most (but not all) of it.
Unfortunately, we can't simply deallocate anything that doesn't have a
reference, because the logical refs are on byte boundaries, and allocation
happens in larger units (min_alloc_size). A one byte logical punch_hole
might be responsible for the release of a larger block of storage.
To resolve this, we keep track of which portions of the blob have been
released by poisoning the offset in the extents vector. We expect that
this vector will almost always be short, so we do not bother with an
indexed structure; iterating over the extents to determine whether a
given blob offset is still allocated is likely faster.
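A minimal self-contained sketch of the scheme (INVALID_OFFSET and the
extent walk mirror the description above but are simplified, not the
actual BlueStore code):

  #include <cstdint>
  #include <vector>

  // Sentinel marking an extent whose space has been released.
  constexpr uint64_t INVALID_OFFSET = ~0ull;

  // A physical extent backing part of a blob.
  struct pextent_t {
    uint64_t offset;  // disk offset, or INVALID_OFFSET once released
    uint64_t length;
  };

  struct blob_t {
    std::vector<pextent_t> extents;

    // Walk the (normally short) extents vector to decide whether the
    // byte at blob-relative offset x is still backed by storage.
    bool is_allocated(uint64_t x) const {
      uint64_t pos = 0;
      for (const auto& e : extents) {
        if (x < pos + e.length)
          return e.offset != INVALID_OFFSET;
        pos += e.length;
      }
      return false;  // past the end of the blob
    }

    // Release the extent covering blob-relative offset x by poisoning
    // its disk offset; later extents keep their blob-relative
    // positions because the vector layout is untouched.
    void poison(uint64_t x) {
      uint64_t pos = 0;
      for (auto& e : extents) {
        if (x < pos + e.length) {
          e.offset = INVALID_OFFSET;
          return;
        }
        pos += e.length;
      }
    }
  };

(A real implementation would also need to split extents at
allocation-unit boundaries; this sketch poisons whole extents only.)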
Signed-off-by: Sage Weil <sage@redhat.com>
This is a "magic" offset that we can use to indicate an invalid extent
(vs, say, an extent at offset 0 that might clobber real data if it were
used).
Signed-off-by: Sage Weil <sage@redhat.com>
We can only do a direct write into an already-allocated blob once, and
only if that range hasn't yet been used. Once it has been used, it is
much too complex to keep track of when all references to it have
committed to disk before reusing it, so we don't try to handle that
case at all.
Since the range has never been used, we can assert that there are no
references to it.
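Sketched as a predicate (hypothetical names; a stand-in for the check
in the write path):

  #include <cstdint>
  #include <map>

  // Ranges of the blob that have never been written: offset -> length.
  using unused_map_t = std::map<uint64_t, uint64_t>;

  // A direct overwrite of an existing blob is safe only if the target
  // range has never been used; since it has never been used, no
  // reference to it can exist (the caller may assert as much).
  bool can_direct_write(const unused_map_t& unused,
                        uint64_t off, uint64_t len) {
    for (const auto& [uoff, ulen] : unused) {
      if (uoff <= off && off + len <= uoff + ulen)
        return true;  // wholly inside a never-used range
    }
    return false;     // used (or partly used) before: take another path
  }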
Signed-off-by: Sage Weil <sage@redhat.com>
- writing into unreferenced blob space
- wal blob writes

Both need to update the blob's used map. The full-blob write path
generates blobs that are always fully used, so no change is needed
there. New partial blob creations need to indicate which parts aren't
yet used.
Signed-off-by: Sage Weil <sage@redhat.com>
Keep track of which ranges of this blob have *never* been used. We do
this as a negative so that the common case of a fully-written blob is an
empty set.
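A minimal sketch of that representation (a simplified stand-in, not the
actual data structure):

  #include <cstdint>
  #include <map>

  // Ranges that have *never* been written, keyed offset -> length.
  // Tracking the negative means a fully written blob is an empty map.
  struct blob_unused_t {
    std::map<uint64_t, uint64_t> unused;

    // A partially filled new blob records its untouched remainder.
    void add_unused(uint64_t off, uint64_t len) {
      unused[off] = len;  // sketch: assumes non-overlapping inserts
    }

    // Writing into the blob trims or removes overlapping unused
    // ranges (called from the direct-write and wal-write paths).
    void mark_used(uint64_t off, uint64_t len) {
      uint64_t end = off + len;
      auto it = unused.begin();
      while (it != unused.end()) {
        uint64_t uoff = it->first, uend = it->first + it->second;
        if (uend <= off || end <= uoff) { ++it; continue; }
        it = unused.erase(it);
        if (uoff < off) unused[uoff] = off - uoff;  // keep the head
        if (end < uend) unused[end] = uend - end;   // keep the tail
      }
    }

    bool fully_used() const { return unused.empty(); }
  };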
Signed-off-by: Sage Weil <sage@redhat.com>