Add check whether to allow write ops based on failsafe full percentage
Check for the failsafe nearfull warning or full error condition on every heartbeat
Use the clock to limit messages to one every 30 secs (osd_op_complaint_time)
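
The behaviour described above can be sketched roughly as follows. This is
an illustration only, not the OSD code: FailsafeChecker, its fields, and
the ratios in the usage note are hypothetical names and values chosen for
the example.

```cpp
#include <cassert>

// Hypothetical sketch; names and thresholds are illustrative, not Ceph's.
enum class FullState { OK, NEARFULL, FULL };

struct FailsafeChecker {
  double nearfull_ratio;      // usage ratio that triggers the warning
  double full_ratio;          // usage ratio at which writes are refused
  double complaint_interval;  // minimum seconds between messages
  double last_complaint;      // time of the last message; < 0 means never
  FullState state;

  FailsafeChecker(double nearfull, double full, double interval)
    : nearfull_ratio(nearfull), full_ratio(full),
      complaint_interval(interval), last_complaint(-1.0),
      state(FullState::OK) {}

  // Called once per heartbeat with the current usage ratio and time.
  // Returns true if a warning or error message should be emitted now.
  bool on_heartbeat(double used_ratio, double now) {
    assert(used_ratio >= 0.0);
    if (used_ratio >= full_ratio)
      state = FullState::FULL;
    else if (used_ratio >= nearfull_ratio)
      state = FullState::NEARFULL;
    else
      state = FullState::OK;

    if (state == FullState::OK)
      return false;
    // Rate limit: stay quiet until the interval has elapsed.
    if (last_complaint >= 0 && now - last_complaint < complaint_interval)
      return false;
    last_complaint = now;
    return true;
  }

  // Write ops are refused only once the failsafe full ratio is reached.
  bool allow_write() const { return state != FullState::FULL; }
};
```

With an assumed FailsafeChecker(0.90, 0.97, 30.0), a heartbeat at 95% usage
emits one warning and keeps accepting writes; later heartbeats stay quiet
until 30 seconds have passed.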
Feature: #4197
Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
CEPH_MDS_OP_CREATE has the CEPH_MDS_OP_WRITE bit set, so it is already checked
Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
Also, sub_op_modify transactions currently carry the operations
for creating snap links in the shipped transaction. To handle
ops shipped by unenlightened osds, transactions can now be
tagged with a tolerate_collection_add_enoent flag.
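
One way to picture the flag is the toy model below. Store, Op and this
Transaction are hypothetical stand-ins, and the assumption that the
tolerated failure is collection_add returning -ENOENT for a missing
collection is inferred from the flag name rather than taken from the code.

```cpp
#include <cassert>
#include <cerrno>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; not the ObjectStore types.
struct Op {
  std::string collection;
  std::string object;
};

struct Transaction {
  std::vector<Op> collection_adds;
  bool tolerate_collection_add_enoent = false;
};

struct Store {
  std::set<std::string> collections;
  std::set<std::pair<std::string, std::string> > contents;

  // Fails with -ENOENT when the target collection does not exist.
  int collection_add(const Op& op) {
    if (!collections.count(op.collection))
      return -ENOENT;
    contents.insert(std::make_pair(op.collection, op.object));
    return 0;
  }

  // Apply every op; skip ENOENT failures only if the transaction is tagged.
  int apply(const Transaction& t) {
    for (const Op& op : t.collection_adds) {
      int r = collection_add(op);
      if (r == -ENOENT && t.tolerate_collection_add_enoent)
        continue;  // op shipped by an older peer; ignore it
      if (r < 0)
        return r;
    }
    return 0;
  }
};
```

An untagged transaction that adds to a missing snap collection fails as
before; the same transaction with the flag set applies its remaining ops.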
Signed-off-by: Samuel Just <sam.just@inktank.com>
- SnapTrimmer now uses SnapMapper to get the next object to trim
- Entries for a snap are implicitly removed from SnapMapper when
the last object is trimmed, so no need for the adjust_local_snaps
logic.
- Scrub now compares the object_info snaps set on the object attr
with the version stored in the SnapMapper.
Signed-off-by: Samuel Just <sam.just@inktank.com>
Create helpers in hobject for generating prefixes for a
pg as well as matching hobjects against a pgid/numpgs
combo. Use these in HashIndex.cc.
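
A rough sketch of what such helpers might look like. The function names,
the low-nibble-first prefix order, and the power-of-two pg count are
assumptions made for the example; the real hobject helpers may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Illustrative helpers; names and conventions are assumptions.

// True if the low `bits` bits of `hash` equal those of `match`, i.e. the
// object hashes into the pg whose seed is `match` (power-of-two pg count).
inline bool match_hash(uint32_t hash, unsigned bits, uint32_t match) {
  assert(bits <= 32);
  uint32_t mask = bits >= 32 ? 0xffffffffu : ((1u << bits) - 1u);
  return (hash & mask) == (match & mask);
}

// Smallest number of bits such that 2^bits >= numpgs.
inline unsigned bits_for(uint32_t numpgs) {
  unsigned bits = 0;
  while (bits < 32 && (1ull << bits) < numpgs)
    ++bits;
  return bits;
}

// Hex prefix built from the low nibbles of the hash, lowest nibble first.
inline std::string hash_prefix(uint32_t hash, unsigned nibbles) {
  static const char digits[] = "0123456789ABCDEF";
  std::string out;
  for (unsigned i = 0; i < nibbles && i < 8; ++i) {
    out.push_back(digits[hash & 0xf]);
    hash >>= 4;
  }
  return out;
}
```

For instance, with 8 pgs bits_for gives 3, so an object with hash 0x1f
matches pg seed 7, and hash_prefix(0x12345678, 3) yields "876".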
Signed-off-by: Samuel Just <sam.just@inktank.com>
ObjectStore now appends passed contexts in queue_transaction
to the Transaction contexts and uses that to pass into
the virtual queue_transactions.
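
The shape of the change is roughly the following. This is a simplified
model: Context is reduced to std::function, ToyStore is a hypothetical
backend, and the signatures are not the real ObjectStore ones.

```cpp
#include <cassert>
#include <functional>
#include <list>
#include <utility>
#include <vector>

// Simplified model; Context and the signatures are assumptions.
typedef std::function<void()> Context;

struct Transaction {
  std::vector<Context> on_applied;
  std::vector<Context> on_commit;
  void register_on_applied(Context c) {
    if (c) on_applied.push_back(std::move(c));
  }
  void register_on_commit(Context c) {
    if (c) on_commit.push_back(std::move(c));
  }
};

class ObjectStore {
public:
  virtual ~ObjectStore() {}

  // Non-virtual wrapper: fold the passed contexts into the transaction,
  // then hand off to the backend, which sees a single callback path.
  int queue_transaction(Transaction* t,
                        Context onreadable = nullptr,
                        Context ondisk = nullptr) {
    t->register_on_applied(std::move(onreadable));
    t->register_on_commit(std::move(ondisk));
    std::list<Transaction*> tls;
    tls.push_back(t);
    return queue_transactions(tls);
  }

  virtual int queue_transactions(std::list<Transaction*>& tls) = 0;
};

// Hypothetical backend that applies and commits immediately.
class ToyStore : public ObjectStore {
public:
  int queue_transactions(std::list<Transaction*>& tls) override {
    for (Transaction* t : tls) {
      for (Context& c : t->on_applied) c();
      for (Context& c : t->on_commit) c();
    }
    return 0;
  }
};
```

Backends then only need to run the contexts attached to each transaction,
whether they were registered directly or passed to queue_transaction.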
Signed-off-by: Samuel Just <sam.just@inktank.com>
missing_loc/missing_loc_sources also must be cleaned up
if a peer goes down during peering:
1) pg is in GetInfo, acting is [3,1]
2) we find object A on osd [0] in GetInfo
3) 0 goes down, no new peering interval since it is neither up nor
acting, but peer_missing[0] is removed.
4) pg goes active and tries to pull A from 0 since missing_loc did not get
cleaned up.
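
The cleanup can be pictured with the toy model below. PeeringState and
its members are simplified, hypothetical stand-ins for the PG fields
named above, not the actual PG code.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Simplified stand-in for the relevant PG state; illustrative only.
struct PeeringState {
  std::map<std::string, std::set<int> > missing_loc;  // object -> osds holding it
  std::set<int> missing_loc_sources;                  // osds referenced in missing_loc
  std::set<int> peer_missing;                         // osds we hold missing info for

  void add_source(const std::string& oid, int osd) {
    missing_loc[oid].insert(osd);
    missing_loc_sources.insert(osd);
    peer_missing.insert(osd);
  }

  // A peer went down: drop it from peer_missing AND from missing_loc and
  // missing_loc_sources, so that no later pull is aimed at it.
  void remove_down_peer(int osd) {
    peer_missing.erase(osd);
    missing_loc_sources.erase(osd);
    for (auto it = missing_loc.begin(); it != missing_loc.end();) {
      it->second.erase(osd);
      if (it->second.empty())
        it = missing_loc.erase(it);
      else
        ++it;
    }
  }

  bool can_pull_from(const std::string& oid, int osd) const {
    auto it = missing_loc.find(oid);
    return it != missing_loc.end() && it->second.count(osd) > 0;
  }
};
```

In the scenario above, calling remove_down_peer(0) at step 3 leaves no
location for A, so the pg no longer tries to pull A from osd 0 at step 4.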
Backport: bobtail
Fixes: #4371
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
No further refs to the object can remain at this point.
Furthermore, the callbacks might lock mutexes of their
own.
Backport: bobtail
Fixes: #4378
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
If there is a fault while delivering the message, close the con. This will
clean up the Session state from memory. If the client doesn't get the
CLOSED message, they will reconnect (from their perspective, it is still
a lossless connection) and get a remote_reset event telling them that the
session is gone. The client code already handles this case properly.
Note that way back in 4ac45200f1 we removed
this because the client would reuse the same connection when it reopened
the session. Now the client never does that; it will mark_down the con
as soon as it is closed and open a new one for a new session... which means
the MDS will get a remote_reset and close out the old session.
Signed-off-by: Sage Weil <sage@inktank.com>