If a Kubernetes pod spec specifies a limit of X, then the pod gets both
the limits.memory and requests.memory resource fields set, and Rook passes
those as the POD_MEMORY_LIMIT and POD_MEMORY_REQUEST environment variables.
This is a problem if only the limit is set, because we will end up
setting our osd_memory_target (and, in the future, other *_memory_targets)
to the hard limit, and the daemon will inevitably reach that threshold
and get killed.
Fix this by also looking at the POD_MEMORY_LIMIT value, applying the
ratio (default: 0.8) to it, and setting our actual target to the min of
that and POD_MEMORY_REQUEST.
Also, set the "default" target to ratio*limit, so that it will apply in
general when no request is specified.
When both request and limit are 10M, we then see
"osd_memory_target": {
"default": "800000000000",
"env": "800000000000",
"final": "800000000000"
},
In a more "normal" situation where limit is 10M and request is 5M, we get
"osd_memory_target": {
"default": "800000000000",
"env": "500000000000",
"final": "500000000000"
},
If only limit is specified (to 10M), we get
"osd_memory_target": {
"default": "800000000000",
"final": "800000000000"
},
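Roughly, the selection logic behaves like this sketch in Python (the
function and parameter names are illustrative, not the actual config code):

    # minimal sketch of the intended behavior, assuming the default ratio of 0.8
    def effective_memory_target(pod_memory_limit=None, pod_memory_request=None,
                                ratio=0.8):
        candidates = []
        if pod_memory_limit is not None:
            # the "default" target becomes ratio * limit
            candidates.append(int(pod_memory_limit * ratio))
        if pod_memory_request is not None:
            candidates.append(int(pod_memory_request))
        # the final target is the smaller of ratio*limit and the request
        return min(candidates) if candidates else None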
Fixes: https://tracker.ceph.com/issues/41037
Signed-off-by: Sage Weil <sage@redhat.com>
This helps to avoid the case where new tasks were not being scheduled
when an image name was re-used after a task had been created under the
same name.
Fixes: https://tracker.ceph.com/issues/41032
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Move to ondisk format v3. This means that per-pool omap keys may exist,
but does not imply that *all* objects use the new form until the
per_pool_omap=1 super key is also set.
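A minimal sketch of how the two signals combine, in Python (names are
assumptions for illustration, not the actual BlueStore code):

    def omap_key_layout(ondisk_format, per_pool_omap_super_key_set):
        # format < 3: only legacy (non-per-pool) omap keys exist
        if ondisk_format < 3:
            return "legacy"
        # format >= 3 but super key unset: mixed; each object records its own form
        if not per_pool_omap_super_key_set:
            return "mixed"
        # per_pool_omap=1: every object uses the new per-pool form
        return "per-pool"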
Signed-off-by: Sage Weil <sage@redhat.com>
The get_user_bytes() helper is a bit weird because it uses the
raw_used_rate (replication/EC factor) so that it can work *backwards*
from raw usage to normalized user usage. However, the legacy case that
works from PG stats does not use this factor... and the stored_raw value
(in the JSON output only) was incorrectly passing in a factor of 1.0,
which meant that for legacy mode it was a bogus value.
Fix by calculating stored_raw as stored_normalized * raw_used_rate.
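Illustrative arithmetic for the legacy path (a sketch, not the exact C++):

    stored_normalized = 100 * 2**20    # e.g. 100 MiB of user data from PG stats
    raw_used_rate = 3.0                # e.g. 3x replication, or (k+m)/k for EC
    # before: a factor of 1.0 was passed in, so stored_raw stayed at ~100 MiB
    # after:  scale back up by the replication/EC factor
    stored_raw = int(stored_normalized * raw_used_rate)   # ~300 MiB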
Signed-off-by: Sage Weil <sage@redhat.com>
This is a minimal change: we aren't separately reporting data vs omap
usage (like we do in 'osd df' output for individual osds).
Signed-off-by: Sage Weil <sage@redhat.com>
Set a per-onode flag to indicate whether the object has per-pool keys or
not. This will allow us to incrementally transition objects later.
Put the new keys under a different prefix.
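Conceptually, key placement follows the per-onode flag; a sketch in Python
(the prefix values and names here are hypothetical, not the real BlueStore
identifiers):

    LEGACY_OMAP_PREFIX = "omap"        # hypothetical prefix for old-style keys
    PER_POOL_OMAP_PREFIX = "perpool"   # hypothetical prefix for the new keys

    def omap_prefix_for(onode_has_per_pool_omap):
        # objects written (or converted) with the flag set use the new prefix;
        # everything else keeps the old prefix until it is transitioned
        return PER_POOL_OMAP_PREFIX if onode_has_per_pool_omap else LEGACY_OMAP_PREFIX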
Signed-off-by: Sage Weil <sage@redhat.com>
Additionally, introduce a `task status` field in manager report
messages to forward the status of tasks executing in daemons (e.g.,
the status of executing scrubs in Ceph metadata servers).
`task status` makes its way up to the service map, which is then used
to display the relevant information in ceph status.
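A sketch of the shape of the forwarded data, in Python (the field names are
illustrative, not the exact wire format):

    # daemon -> mgr (manager report) -> service map -> `ceph status`
    manager_report = {
        "daemon": "mds.a",
        "task_status": {
            "scrub status": "active [paths: /]",   # example task entry
        },
    }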
Signed-off-by: Venky Shankar <vshankar@redhat.com>
otherwise, when we destruct `journal::JournalRecorder::m_object_locks`,
they might still be waited on by some condition variable.
Signed-off-by: Kefu Chai <kchai@redhat.com>
* Added daemons to thrashers
* Join the mds thrasher, as the other thrashers did
Fixes: http://tracker.ceph.com/issues/10369
Signed-off-by: Jos Collin <jcollin@redhat.com>
* Start DaemonWatchdog when ceph starts
* Drop the DaemonWatchdog starting in mds_thrash.py
* Bring the thrashers in mds_thrash.py into the context
Fixes: http://tracker.ceph.com/issues/10369
Signed-off-by: Jos Collin <jcollin@redhat.com>
* make watch and bark handle more daemons
* drop the manager parameter, as it won't be available when DaemonWatchdog starts
* get the cluster from the config
Fixes: http://tracker.ceph.com/issues/10369
Signed-off-by: Jos Collin <jcollin@redhat.com>
This ensures that heartbeat_reset() gets called and that we clean up the
ref loop between the Connections and Sessions.
Signed-off-by: Sage Weil <sage@redhat.com>
get_mnow isn't clearly at home in OSDMapService, and the other methods
are needed from PeeringState, so let's consolidate on ShardServices
for now. We probably ought to move OSDMapService state out of OSD into
its own module at some point.
Signed-off-by: Samuel Just <sjust@redhat.com>