The new config option "osd_mclock_force_run_benchmark_on_init" is
introduced to allow a user to force run the OSD benchmark test on every
OSD boot-up even if the historical data about the OSD's iops capacity is
available on the MON config store. The 'force_run_benchmark' flag is set
to the value indicated by the new config option.
By default this new config option is set to false.
The utility of this option is to help refresh the OSD iops capacity
when the underlying device's performance characteristics have changed
significantly. In such cases, the OSD can be restarted with this option
enabled temporarily. Once the new iops capacity is updated to the MON
store, this option can be removed from the OSD's start-up config.
Fixes: https://tracker.ceph.com/issues/51464
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
Use "mon_cmd_set_config()" to store the OSD's max iops capacity to
the MON store during the first bring-up. Don't run the OSD benchmark
test on subsequent boot-ups if a previously persisted iops capacity is
available on the MON store and is different from the default iops
capacity.
Add the 'force_run_benchmark' flag to force a run of the benchmark
in case the default iops capacity cannot be determined.
Fixes: https://tracker.ceph.com/issues/51464
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
Add wrapper method "get_val_default()" to the ConfigProxy class that takes
the config option key to search. This method in-turn calls another method
with the same name added to md_config_t class that does the actual work of
searching for the config option. If the option is valid, _get_val_default()
is used to get the default value. Otherwise, the wrapper method returns
std::nullopt.
Fixes: https://tracker.ceph.com/issues/51464
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
Add method mon_cmd_set_config() to save config option key and
value to the MON store. The ConfigMonitor command, 'config set' is
used to achieve this.
A corresponding get method is unnecessary since any config option
found on the MON store is loaded during OSD boot-up and set using
the md_config_t::set_mon_vals() method. Therefore, the existing
versions of ConfigProxy::get_val() method are sufficient to get
the latest value for the config option.
Fixes: https://tracker.ceph.com/issues/51464
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
structured binding does not define variables, so we cannot capture them
without defining variables in capture list.
in this change, instead of using a map<> for defining labels, just
create labels on the fly.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Only keep the basic metrics to minimize the total number of metrics.
Derived metrics can be numerous according to different needs and can be
confusing with labels.
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
mgr/dashboard: backend unit tests: decouple from build dir
Reviewed-by: Waad Alkhoury <walkhour@redhat.com>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
In order to cross-check the writes at segment manager level, and
evaluate the write amplification from each sub-component.
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
no need to clone the whole repo, just clone the tip of the specified
tag. this saves the bandwidth, disk IO and precious time.
Signed-off-by: Kefu Chai <kchai@redhat.com>
in snappy, the commit of 26102a0c66175bc39edbf484c994a21902e986dc
fixes the SNAPPY_VERSION generation. and this commit was included by
v1.1.8 and v1.1.9.
also, in v1.1.9, a change was introduced, where the function signature
was changed, and more importantly, this change is not backward
compatible:
< bool GetUncompressedLength(Source* source, uint32_t* result);
---
> bool GetUncompressedLength(Source* source, uint32* result);
see also, https://tracker.ceph.com/issues/50934
so we check SNAPPY_VERSION to tell if we should use `uint32_t` or
`uint32`.
in this change, snappy version used to build win32 client is bumped
to the latest stable version, v1.1.9, to include the fix of
SNAPPY_VERSION. this paves the road to fix of https://tracker.ceph.com/issues/50934
Signed-off-by: Kefu Chai <kchai@redhat.com>
The snappy project made the following change in snappy.h between version 1.1.8
and 1.1.9:
< bool GetUncompressedLength(Source* source, uint32_t* result);
---
> bool GetUncompressedLength(Source* source, uint32* result);
This causes Ceph to FTBFS with snappy 1.1.9.
Thanks to Chris Denice for bringing this to our attention via Redmine.
Fixes: https://tracker.ceph.com/issues/50934
Signed-off-by: Nathan Cutler <ncutler@suse.com>
The clean_cgroup method assumes that the ctx.fsid is set while this is
true for the bootstrap command, it isn't set for adopt or deploy commands
(and maybe others).
This ends up to the adopt command to fails:
Traceback (most recent call last):
File "/sbin/cephadm", line 8301, in <module>
main()
File "/sbin/cephadm", line 8289, in main
r = ctx.func(ctx)
File "/sbin/cephadm", line 1764, in _default_image
return func(ctx)
File "/sbin/cephadm", line 5091, in command_adopt
command_adopt_ceph(ctx, daemon_type, daemon_id, fsid)
File "/sbin/cephadm", line 5299, in command_adopt_ceph
osd_fsid=osd_fsid)
File "/sbin/cephadm", line 2884, in deploy_daemon_units
clean_cgroup(ctx, unit_name)
File "/sbin/cephadm", line 2724, in clean_cgroup
if not ctx.fsid:
File "/sbin/cephadm", line 155, in __getattr__
return super().__getattribute__(name)
AttributeError: 'CephadmContext' object has no attribute 'fsid'
Since we already have the fsid value in deploy_daemon_units (which calls
clean_cgroup) then we can pass the fsid value directly.
This fixes a regression introduced by 1fee255
Fixes: https://tracker.ceph.com/issues/51902
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
* refs/pull/42349/head:
mon/MDSMonitor: propose if FSMap struct_v is too old
mon/MDSMonitor: give a proper error message if FSMap struct_v is too old
mds/FSMap: use DECODE_OLDEST to gate FSMap version
qa: add tests for fs dump of epoch and trimming
qa: add file system support for dumping epoch
mon/MDSMonitor: return mon_mds_force_trim_to even if equal to current epoch
mon: add debugging for trimming methods
mon: fix debug spacing
qa: add nofs upgrade suite
Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Ramana Raja <rraja@redhat.com>
* refs/pull/41025/head:
qa: wait pgs to be clean before using the pools
qa: ignore PG_RECOVERY_FULL and PG_DEGRADED for mds-full
qa: wait more time since there have many more pgs than before
qa: do not multiple the full ratio twice
qa: do not raise for kclient for _fsync test
qa: use the pg autoscale mode to calcuate the pg_num
qa: set the object_size to 1M
qa: move the is_full() to parent class
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
If we are in this block, then p == channel_fds.end() and p->first is not
valid.
Also, no need to populate channel_fds with an fd of -1.
Fixes: https://tracker.ceph.com/issues/51816
Signed-off-by: Sage Weil <sage@newdream.net>
There was strange bolding and bullet point placement due to a missing new line in the perf description.
Signed-off-by: Laura Flores <lflores@redhat.com>
Sometimes, it can happen that the osds being destroyed in those tests
are not yet marked as 'down' for some reason. Let's add some retries on
those tasks to avoid CI failures.
Fixes: https://tracker.ceph.com/issues/51903
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>