mgr/dashboard: fix: get SMART data from single-daemon device
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
crimson/os/seastore/journal: fast submit if RecordSubmitter is IDLE and no pending
Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
hostname -f and hostname generated from gwcli_create being different
gave rise to error:
The first gateway defined must be the local machine
Fixes: https://tracker.ceph.com/issues/53830
Signed-off-by: Deepika Upadhyay <dupadhya@redhat.com>
Return SMART data even when a device is only associated with a single daemon.
Fixes: https://tracker.ceph.com/issues/53858
Signed-off-by: Alfonso Martínez <almartin@redhat.com>
Fix issues with PromQL expressions and vector matching with the
`ceph_disk_occupation` metric.
As it turns out, `ceph_disk_occupation` cannot simply be used as
expected, as there seem to be some edge cases for users that have
several OSDs on a single disk. This leads to issues which cannot be
approached by PromQL alone (many-to-many PromQL erros). The data we
have expected is simply different in some rare cases.
I have not found a sole PromQL solution to this issue. What we basically
need is the following.
1. Match on labels `host` and `instance` to get one or more OSD names
from a metadata metric (`ceph_disk_occupation`) to let a user know
about which OSDs belong to which disk.
2. Match on labels `ceph_daemon` of the `ceph_disk_occupation` metric,
in which case the value of `ceph_daemon` must not refer to more than
a single OSD. The exact opposite to requirement 1.
As both operations are currently performed on a single metric, and there
is no way to satisfy both requirements on a single metric, the intention
of this commit is to extend the metric by providing a similar metric
that satisfies one of the requirements. This enables the queries to
differentiate between a vector matching operation to show a string to
the user (where `ceph_daemon` could possibly be `osd.1` or
`osd.1+osd.2`) and to match a vector by having a single `ceph_daemon` in
the condition for the matching.
Although the `ceph_daemon` label is used on a variety of daemons, only
OSDs seem to be affected by this issue (only if more than one OSD is run
on a single disk). This means that only the `ceph_disk_occupation`
metadata metric seems to need to be extended and provided as two
metrics.
`ceph_disk_occupation` is supposed to be used for matching the
`ceph_daemon` label value.
foo * on(ceph_daemon) group_left ceph_disk_occupation
`ceph_disk_occupation_human` is supposed to be used for anything where
the resulting data is displayed to be consumed by humans (graphs, alert
messages, etc).
foo * on(device,instance)
group_left(ceph_daemon) ceph_disk_occupation_human
Fixes: https://tracker.ceph.com/issues/52974
Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
This is the first commit in a series of commits that aims at adding a primary balancer to Ceph and improving the current upmap balancer functionality. This first commit focuses on simplifying (refactoring) the code of `calc_pg_upmaps` so it is easier to change in the future. This PR keeps the existing functionality as-is and does not change anything but the code structure.
As part of the work is major refactoring of OSDMap::calc_pg_upmaps, the first thing is adding an --upmap-seed param to osdmaptool so test results can be compared without the random factor.
Other changes made:
- Divided sections of `OSDMap::calc_pg_upmaps` into their own separate functions
- Renamed tmp to tmp_osd_map
- Changed all the occurances of 'first' and 'second' in the function to more meaningful names.
Signed-off-by: Josh Salomon <josh.salomon@gmail.com>
(1) adding arrow/parquet to make(install is missing)
(2) s3select-operation contains 2 flows CSV and Parquet
(3) upon parquet-flow s3select processing engine is calling (via callback) to get-size and range-request, the range-requests are a-sync, thus the caller is waiting until notification.
(4) flow : execute --> s3select --(arrow layer)--> range-request --> GetObj::execute --> send_response_data --> notify-range-request --> (back-to) --> s3select
(5) on parquet flow the s3select is handling the response (using call-backs) because of aws-response-limitation (16mb)
add unique pointer (rgw_api); verify magic number for parquet objects; s3select module update
fix buffer-over-flow (copy range request)
change the range-request flow. now,it needs to use the callback parametrs (ofs & len) and not to use the element length
refactoring. seperate the CSV flow from the parquet flow, a phase before adding conditional build(depend on arrow package installation)
adding arrow/parquet installation to debian/control
align s3select repo with RGW (missing API"s, such as get_error_description)
undefined reference to arrow symbol
fix comment: using optional_yield by value
fix comments; remove future/promise
s3select: a leak fix
s3select: fixing result production
s3select,s3tests : parquet alignments
typo: git-remote --> git_remote
s3select: remove redundant comma(end of projections); bug fix in parquet flow upon aggregation queries
adding arrow/parquet
editorial. remove blank lines
s3select: merged with master(output serialization,presto alignments)
merging(not rebase) master functionlities into parquet branch
(*) a dedicated source-files for s3select operation.
(*) s3select-engine: fix leaks on parquet flows, enabling allocate csv_object and parquet_object on stack
(*) the csv_object and parquet object allocated on stack (no heap allocation)
move data-members from heap to stack allocation, refactoring, separate flows for CSV and parquet. s3select: bug fix
conditional build: upon arrow package is installed the parquet flow become visable, thus enables to process parquet object. in case the package is not installed only CSV is usable
remove redundant try/catch, s3select: fix compile warning
arrow-devel version should be higher than 4.0.0, where arrow::io::AsyncContext become depecrated
missing sudo; wrong url;move the rm -f arrow.list
replace codename with $(lsb_release -sc)
arrow version should be >= 4.0.0; iocontext not exists in namespace on lower versions
RGW points to s3select/master
s3select submodule
sudo --> $SUDO
Signed-off-by: gal salomon <gal.salomon@gmail.com>
crimson/os/seastore/../segment_manager: improve logs and validations
Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Xuehan Xu <xuxuehan@360.cn>
Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
If wget fails (e.g. due to a certificate issue), it still creates
an empty file. Then this file is marked executable, ./"${SCRIPT}"
immediately returns 0 and run_xfstests_qemu.sh exits successfully
without running a single xfstest.
This started on Sep 30, 2021 with the expiration of Let's Encrypt
root certificate -- all qemu jobs with "test: qa/run_xfstests_qemu.sh"
just booted the VM for a couple of seconds and reported success.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>