Unprotect examines all pools, so use blanket x before 0.54. After
that, use class-read restricted by object_prefix to rbd_children.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Clone needs to actually re-read the header to make sure the image is
still protected before returning. Additionally, it needs to consider
the image protected *only* if the protection status is protected -
unprotecting does not count. I thought I'd already fixed this, but
can't find the commit.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Since 58890cfad5, regular {rbd_}open()
would fail with -EPERM if the user did not have write access to the
pool, because a watch on the header was requested.
For many uses of read-only access, establishing a watch is not
necessary, since changes to the header do not matter. For example,
getting metadata about an image via 'rbd info' does not care if a new
snapshot is created while it is in progress.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
20496b8d2b forgot to do this. Without
this change, all class methods required regular read permission in
addition to class-read or class-write.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
This prevented a read-only user from being able to unprotect a
snapshot without write permission on all pools. This was masked before
by the CLS_METHOD_PUBLIC flag.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Remove the special-case check, which does not inform the peer what
protocol features are missing. It also enforces this requirement even
when we negotiate auth none.
Reported as part of bug #3657.
Signed-off-by: Sage Weil <sage@inktank.com>
In a mixed cluster where some OSDs support the recovery reservations and
some don't, the replica may be running new code in RepNotRecovering and
will complete a backfill. In that case, we want to just stay in
RepNotRecovering.
It may also be possible to make it infer what the primary is doing even
though it is not sending recovery reservation messages, but this is much
more complicated and doesn't accomplish much.
Fixes: #3689
Signed-off-by: Sage Weil <sage@inktank.com>
Most users don't need this, and having it on will just fill their clusters
with objects that will need to be cleaned up later.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
The local state isn't propagated into the backtick shell, resulting in
'unknown' for all remote daemons. Avoid backticks altogether.
Signed-off-by: Sage Weil <sage@inktank.com>
Having this too large means that queues get too deep on the OSDs during
backfill and latency is very high. In my tests, it also meant we generated
a lot of slow recovery messages just from the recovery ops themselves (no
client io).
Keeping this at the old default means we are no worse in this respect than
argonaut, which is a safe position to start from.
Signed-off-by: Sage Weil <sage@inktank.com>
Keep the journal queue size smaller than the filestore queue size.
Keeping this small also means that we can lower the latency for new
high priority ops that come into the op queue.
Signed-off-by: Sage Weil <sage@inktank.com>
If we had a pending failure report, and send a cancellation, take it
out of our pending list so that we don't keep resending cancellations.
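A minimal sketch of the dedup logic, assuming a simple pending set and a hypothetical send callback (not the actual OSD failure-report code):

```python
def send_failure_cancellation(pending, target, send):
    """Cancel a previously reported failure exactly once: send the
    cancellation and drop the target from the pending set so later
    ticks don't keep resending it."""
    if target in pending:
        send(("cancel", target))
        pending.discard(target)
```

Calling it again for the same target after the first cancellation is a no-op, which is the point of the fix.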
Signed-off-by: Sage Weil <sage@inktank.com>
Previously, using the state on active worked, but now we might
go back through WaitRemoteRecoveryReserved without resetting
Active.
Signed-off-by: Samuel Just <sam.just@inktank.com>
We don't want to change missing sets during a chunky
scrub since it would cause !is_clean() and derail
the rest of the scrub. Instead, move the missing,
inconsistent, and authoritative sets into the scrubber
and add to them during scrub_compare_maps(). Then,
handle repairing objects all at once in scrub_finish().
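The accumulate-then-repair shape can be sketched as follows; the class and method names mirror the description above but are illustrative, not the real PG scrubber:

```python
class Scrubber:
    """Accumulate per-chunk scrub findings and repair once at the end,
    instead of mutating the missing set mid-scrub (which would make
    !is_clean() and derail the rest of the scrub)."""
    def __init__(self):
        self.missing = set()
        self.inconsistent = set()
        self.authoritative = {}

    def compare_maps(self, chunk_missing, chunk_inconsistent, chunk_auth):
        # Called once per chunk: just record, don't repair yet.
        self.missing |= chunk_missing
        self.inconsistent |= chunk_inconsistent
        self.authoritative.update(chunk_auth)

    def finish(self, repair):
        # Repair everything in one pass after the last chunk.
        for obj in self.missing | self.inconsistent:
            repair(obj, self.authoritative.get(obj))
```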
Signed-off-by: Samuel Just <sam.just@inktank.com>
Add tests for:
- sparse import makes expected sparse images
- sparse export makes expected sparse files
- sparse import from stdin also creates sparse images
- import from partially-sparse file leads to partially-sparse image
- import from stdin with zeros leads to sparse
- export from zeros-image to file leads to sparse file
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Try to accumulate image-sized blocks when importing from stdin, even if
each read is shorter than requested; if we get a full block, and it's
all zeroes, we can seek and make a sparse output file.
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
We can get a pattern like so:
- new mon session
- after say 120 seconds, we decide to send a stats msg
- outstanding_pg_stats is finally true, we immediately time out (30 second
grace), and reconnect to a new mon
-> repeat
The problem is that we don't reset the last_sent timestamp when we send.
Or that we do this check after sending instead of before. Fix both.
This should resolve issue #3661, where osds that don't have pgs
updating are not sending stats messages to the mon to check in, and are
eventually getting marked down as a result.
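The ordering fix can be sketched as follows: check the grace period before sending, and reset last_sent whenever we do send. All names here are illustrative, not the actual OSD code:

```python
import time

class StatsSender:
    """Time out the mon session only when a *sent* stats report has
    gone unacked past the grace period; sending resets the clock."""
    def __init__(self, grace=30.0, now=time.monotonic):
        self.grace = grace
        self.now = now
        self.last_sent = None
        self.outstanding = False

    def maybe_reconnect(self):
        # Checked before (not after) deciding to send anything new.
        return (self.outstanding and self.last_sent is not None
                and self.now() - self.last_sent > self.grace)

    def send_stats(self):
        self.last_sent = self.now()   # reset on every send
        self.outstanding = True

    def ack(self):
        self.outstanding = False
```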
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
This avoids the situation where a librados or other user with the default
of 'cephx,none' and no keyring is authenticating against a cluster where
the required auth is 'none', and an annoying warning is generated every
time. Now we only print a helpful message if we actually failed.
Signed-off-by: Sage Weil <sage@inktank.com>
This means we can drop the scrub repair state_clear() call. We probably
can drop others, but let's leave that for another day.
Signed-off-by: Sage Weil <sage@inktank.com>
If both cephx and none are accepted auth methods and the cephx
keyring cannot be found, resort to using none instead of failing.
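The fallback logic amounts to something like the sketch below; the function and its arguments are hypothetical, not the real librados negotiation code:

```python
def pick_auth_method(supported, keyring_present):
    """Walk the accepted auth methods in order; skip cephx when no
    keyring is available, falling back to 'none' if it is accepted."""
    for method in supported:            # e.g. ["cephx", "none"]
        if method == "cephx" and not keyring_present:
            continue                    # can't use cephx without a key
        return method
    raise RuntimeError("no usable auth method")
```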
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
If we do a scrub repair, we need to go from clean to recovery again to
copy objects around.
This fixes a simple repair of a missing object, either on the primary or
replica.
Signed-off-by: Sage Weil <sage@inktank.com>
We set SCRUBBING when we queue a pg for scrub. If we dequeue and
call scrub() but abort for some reason (!active, degraded, etc.), clear
that state bit.
Bug is easily reproduced with 'ceph osd scrub N' during cluster startup
when PGs are peering; some PGs can get left in the scrubbing state.
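The early-abort cleanup can be sketched as follows, with an illustrative flag bit and dict-based pg state rather than the real PG code:

```python
SCRUBBING = 1 << 0  # illustrative flag bit, not the real PG state value

def scrub(pg):
    """If scrub() bails out early (pg not active, degraded, ...),
    clear the SCRUBBING bit that was set at queue time so the pg
    isn't left stuck reporting 'scrubbing'."""
    if not pg["active"] or pg["degraded"]:
        pg["state"] &= ~SCRUBBING   # abort: drop the flag
        return False
    # ... scrub work would go here ...
    pg["state"] &= ~SCRUBBING
    return True
```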
Signed-off-by: Sage Weil <sage@inktank.com>