Previously, submit_error_log was chained to the future returned by
failure_func.
Now submit_error_log is called from within do_osd_ops_execute.
Fixes: https://tracker.ceph.com/issues/61651
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
```
submit_error_log records the result of an IO into the pg log so that we can return
the same error code if the client resends the request.
This should only be relevant for logical errors resulting from the target object state
-- for example, EEXIST returned on an exclusive create -- because there is application
logic built to rely on them.
In classic, the only such site is if the return value from do_osd_ops is negative
(or the transaction is empty) -- see PrimaryLogPG::prepare_transaction,
specifically where we set update_log_only to true.
We do not want to record space usage errors or errors specific to conditions on the primary
OSD such as IO errors -- submit_error_log isn't a catch-all error path.
```
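The replay behaviour described above can be sketched roughly as follows. This is a Python illustration only, not Crimson or classic OSD code: `LOCAL_ERRORS`, `should_log_error`, `error_log`, and `handle` are hypothetical names, and the choice of ENOSPC and EIO as the "not logged" set is an assumption drawn from the examples in the quoted text.

```python
import errno

# Assumption: errors tied to primary-local conditions (space, IO) are not logged.
LOCAL_ERRORS = {errno.ENOSPC, errno.EIO}

# Hypothetical stand-in for the error entries kept in the pg log, keyed by request id.
error_log = {}


def should_log_error(result):
    """Log only negative results that are not primary-local errors."""
    return result < 0 and -result not in LOCAL_ERRORS


def handle(reqid, execute):
    """Run a request, replaying a previously logged error on resend."""
    if reqid in error_log:
        return error_log[reqid]
    result = execute()
    if should_log_error(result):
        error_log[reqid] = result
    return result


first = handle("req-1", lambda: -errno.EEXIST)   # exclusive create fails
resend = handle("req-1", lambda: 0)              # resend sees the same error
io_err = handle("req-2", lambda: -errno.EIO)     # local error, not recorded
```

In this sketch a resent `req-1` gets the logged EEXIST even though re-executing it would now succeed, while the EIO result for `req-2` is returned but never stored.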
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
This change is crucial for the next commits:
submit_error_log and failure_func should share the same
rep_tid.
to be shared later with error_log call
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
`submit_error_log()` was returning `version` to be used later in
`failure_func` call to `complete_write()`.
Maintain the version returned from `submit_error_log()` in a dedicated map
to avoid handling the lifetime of 'version'.
Note: This change is crucial to the following change that will
return 'error_fut' separately.
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
* Assert '!log_entries.empty()' instead of using an if-case;
an entry is inserted into log_entries right before.
* Assert 'version != eversion_t()' instead of using an if-case;
since op_info.may_write() is true, we should have a non-empty version.
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
crimson/common/interruptible_future: deal with exceptions thrown from seastar::future::get() and seastar::thread::yield()
Reviewed-by: Samuel Just <sjust@redhat.com>
Add the term "Quorum" to the glossary and link to the part of
architecture.rst concerning Monitors. The sticky header at the top of
the docs.ceph.com website gets in the way of the location linked to in
this commit, but fatigue and disgust prevent me from spending time today
trial-and-erroring my way through the hostile and ill-documented
wilderness of scroll-margin so that the link goes where it should.
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
Edit the text in the "Initial Troubleshooting" section of
doc/rados/troubleshooting/troubleshooting-mon.rst.
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
It's strongly recommended for objects that have references to
external resources (e.g., files) to explicitly release them.
Python doesn't guarantee garbage collection of objects and hence
doesn't guarantee freeing of external resources that occur on
garbage collection.
The __del__() methods in the python mgr modules may not even be
called, since garbage collection of objects is not guaranteed in python.
Also, the cleanup that some of the __del__() methods attempt seems redundant:
- In volumes/module.py, vc.shutdown() is called in Module.shutdown().
No need to call it again in Module.__del__()
- In telegraf/basesocket.py, BaseSocket.close() is called in
BaseSocket.__exit__(). No need to call it again in
BaseSocket.__del__().
- In mgr_module.py, MgrModuleLoggingMixin._unconfigure_logging() is
called in MgrModule.__init__() and MgrStandbyModule.__init__(). No
need to call it in MgrModule.__del__() and
MgrStandbyModule.__del__().
- In dashboard/services/cephfs.py, the libcephfs mount is not
shut down explicitly by the mgr module. However, the cython libcephfs
bindings have a LibCephFS.__dealloc__() finalizer method that calls
LibCephFS.shutdown(). This should unmount and clean up the ceph mount
handle.
Remove the __del__() of the python mgr modules.
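As a minimal sketch of the explicit-release pattern argued for above: the `Mount` class below is hypothetical and is not one of the mgr module classes. It only shows a resource being released by an explicit shutdown() through a context manager rather than by __del__().

```python
class Mount:
    """Hypothetical resource holder; not a real mgr module class."""

    def __init__(self):
        self.is_open = True

    def shutdown(self):
        # Explicit, idempotent release; safe to call more than once.
        self.is_open = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Release deterministically on scope exit instead of relying on __del__().
        self.shutdown()
        return False  # do not swallow exceptions


with Mount() as m:
    was_open_inside = m.is_open
is_open_after = m.is_open
```

The release happens when the `with` block exits, regardless of when (or whether) the object is garbage collected.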
Fixes: https://tracker.ceph.com/issues/63421
Signed-off-by: Ramana Raja <rraja@redhat.com>
Because the loop's returned future is ignored,
we should cover the scenario where the pg is removed and the
snap_trimq iteration has not completed yet.
Fixes: https://tracker.ceph.com/issues/61653
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Format the steps in the "Initial Troubleshooting" section of
doc/rados/troubleshooting/troubleshooting-mon.rst. A near-future PR (not
this one) will add context to this section and explain that the steps
described here are the first steps that you should undertake when you
determine that you have an unresponsive or down Monitor. This PR is
merely for formatting.
Signed-off-by: Zac Dover <zac.dover@proton.me>
The operation's id and the future returned when starting SnapTrimObjSubEvent
are emplaced into subop_blocker.
Later on, we await the completion of all the started operations' futures.
Before this patch, we only stored the op id in the subop_blocker vector
which allowed `op` to go out of scope and lose all its references
(and get deleted) before exiting.
Storing the operation as a reference instead of the id
will maintain the SnapTrimObjSubEvent operation lifetime.
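The lifetime issue can be illustrated with a small Python analogy. This is not the Crimson code: `SubOp`, `registry`, `track_by_id`, and `track_by_ref` are hypothetical names, and a weak-value registry stands in for any non-owning lookup table. The sketch only shows that storing an id lets the object be freed, while storing the object keeps it alive.

```python
import gc
import weakref


class SubOp:
    """Hypothetical stand-in for a started sub-operation."""

    def __init__(self, op_id):
        self.op_id = op_id


# Non-owning registry: it can find an op by id but does not keep it alive.
registry = weakref.WeakValueDictionary()


def start_op(op_id):
    op = SubOp(op_id)
    registry[op_id] = op
    return op


def track_by_id(blocker, op_id):
    op = start_op(op_id)
    blocker.append(op.op_id)  # only the id is stored; `op` is dropped on return


def track_by_ref(blocker, op_id):
    op = start_op(op_id)
    blocker.append(op)  # the blocker now holds a strong reference


ids_only = []
track_by_id(ids_only, 1)
gc.collect()
id_tracked_alive = 1 in registry

refs = []
track_by_ref(refs, 2)
gc.collect()
ref_tracked_alive = 2 in registry
```

Here the op tracked only by id is gone from the registry once its last reference is dropped, whereas the op held by reference stays reachable for as long as the blocker list holds it.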
Fixes: https://tracker.ceph.com/issues/63299
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Give parallel structure to the questions in the Q&A section of the "The
Cluster Has Quorum But At Least One Monitor Is Down" subsection of the
"Most Common Monitor Issues" section of
doc/rados/troubleshooting/troubleshooting-mon.rst.
Signed-off-by: Zac Dover <zac.dover@proton.me>
Edit the first section of doc/rados/configuration/ceph-conf.rst.
Initially I just wanted to change "series" to "set", but once I got my
hands dirty I ended up simplifying some sentences.
Signed-off-by: Zac Dover <zac.dover@proton.me>
Unlike the other types, Ceph and CephExporter share the underlying
method. There was no other use of get_container_mounts on the class,
so it could be converted to customize_container_mounts.
Because there's an extra arg that is passed from the top-level
get_container_mounts function to Ceph.get_ceph_mounts, that function was not
changed.
Signed-off-by: John Mulligan <jmulligan@redhat.com>