Commit Graph

129136 Commits

Author SHA1 Message Date
Soumya Koduri
7b5527ba2b rgw/dbstore: Fixing s3 test 'test_bucket_delete_nonempty'
if delete_children not set to 'true', delete bucket should
fail with ENOTEMPTY

Signed-off-by: Soumya Koduri <skoduri@redhat.com>
2022-01-13 17:22:11 +05:30
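The behavior described in 7b5527ba2b can be sketched roughly as follows. This is a minimal Python illustration, not the dbstore code: `delete_bucket` and the in-memory `buckets` mapping are hypothetical names, and only the `delete_children` flag and the `ENOTEMPTY` error come from the commit message.

```python
import errno


def delete_bucket(buckets, name, delete_children=False):
    """Delete `name` from `buckets` (hypothetical dict: bucket name -> list of object keys)."""
    objects = buckets[name]
    if objects and not delete_children:
        # Non-empty bucket and no request to remove its contents: refuse.
        raise OSError(errno.ENOTEMPTY, "bucket not empty", name)
    del buckets[name]


buckets = {"photos": ["a.jpg"], "empty": []}
delete_bucket(buckets, "empty")  # empty bucket: always allowed
try:
    delete_bucket(buckets, "photos")  # non-empty, flag not set: rejected
except OSError as err:
    assert err.errno == errno.ENOTEMPTY
delete_bucket(buckets, "photos", delete_children=True)  # flag set: allowed
```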
Yuval Lifshitz
b709091d81
Merge pull request #43995 from TRYTOBE8TME/wip-rgw-kafka-teuth-cleanup
qa/tasks: Checking for kafka cleanup
2022-01-13 11:57:03 +02:00
Patrick Seidensal
154d3525b1 mgr/prometheus: Refactoring: Introduce type aliases
Fixes: https://tracker.ceph.com/issues/52974

Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
2022-01-13 10:34:12 +01:00
James McClune
ed20f98df1 mgr/cephadm: fixes minor grammar nit in Dry-Runs message
Signed-off-by: James McClune <jmcclune@mcclunetechnologies.net>
2022-01-12 22:46:42 -05:00
Josh Salomon
86d6d110b4 osd, tools: refactor OSDMap::calc_pg_upmaps (simplify the code)
This is the first commit in a series of commits that aims at adding a primary balancer to Ceph and improving the current upmap balancer functionality. This first commit focuses on simplifying (refactoring) the code of `calc_pg_upmaps` so it is easier to change in the future. This PR keeps the existing functionality as-is and does not change anything but the code structure.

Because this work is a major refactoring of OSDMap::calc_pg_upmaps, the first step is adding an --upmap-seed param to osdmaptool so that test results can be compared without the random factor.

Other changes made:
    - Divided sections of `OSDMap::calc_pg_upmaps` into their own separate functions
    - Renamed tmp to tmp_osd_map
    - Changed all the occurrences of 'first' and 'second' in the function to more meaningful names.

Signed-off-by: Josh Salomon <josh.salomon@gmail.com>
2022-01-13 02:25:14 +00:00
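To illustrate why a seed parameter such as the `--upmap-seed` option mentioned in 86d6d110b4 helps when comparing test results, here is a minimal Python sketch. It does not model `calc_pg_upmaps`; `pick_candidates` is a hypothetical stand-in for any routine that makes random choices, and the only point shown is that a fixed seed makes two runs produce identical output.

```python
import random


def pick_candidates(osds, count, seed=None):
    """Return `count` OSD ids chosen at random; a fixed seed makes the choice repeatable."""
    rng = random.Random(seed)  # private generator, global random state is untouched
    return rng.sample(osds, count)


osds = list(range(10))
run_a = pick_candidates(osds, 3, seed=42)
run_b = pick_candidates(osds, 3, seed=42)
assert run_a == run_b  # same seed, same selection, so results are comparable
```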
Yuri Weinstein
10be79e6c4
Merge pull request #43299 from markhpc/wip-age-binning-rebase-20210923
common/PriorityCache: Updated Implementation of Cache Age Binning

Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>
2022-01-12 16:54:23 -08:00
gal salomon
e3254b6306 parquet implementation:
(1) adding arrow/parquet to make(install is missing)
(2) s3select-operation contains 2 flows CSV and Parquet
(3) in the parquet flow, the s3select processing engine calls (via callback) get-size and range-request; the range-requests are async, so the caller waits until notification.
(4) flow : execute --> s3select --(arrow layer)--> range-request --> GetObj::execute --> send_response_data --> notify-range-request --> (back-to) --> s3select
(5) in the parquet flow, s3select handles the response (using callbacks) because of the AWS response limitation (16 MB)

add unique pointer (rgw_api); verify magic number for parquet objects; s3select module update
fix buffer-over-flow (copy range request)
change the range-request flow. now it needs to use the callback parameters (ofs & len) and not the element length
refactoring. separate the CSV flow from the parquet flow, a phase before adding conditional build (depends on arrow package installation)
adding arrow/parquet installation to debian/control
align s3select repo with RGW (missing APIs, such as get_error_description)
undefined reference to arrow symbol
fix comment: using optional_yield by value
fix comments; remove future/promise
s3select: a leak fix
s3select: fixing result production
s3select,s3tests : parquet alignments
typo: git-remote --> git_remote
s3select: remove redundant comma(end of projections); bug fix in parquet flow upon aggregation queries
adding arrow/parquet
editorial. remove blank lines
s3select: merged with master(output serialization,presto alignments)
merging (not rebasing) master functionalities into parquet branch

(*) a dedicated source-files for s3select operation.
(*) s3select-engine: fix leaks on parquet flows, enabling allocate csv_object and parquet_object on stack
(*) the csv_object and parquet object allocated on stack (no heap allocation)

move data-members from heap to stack allocation, refactoring, separate flows for CSV and parquet. s3select: bug fix

conditional build: when the arrow package is installed, the parquet flow becomes visible, which enables processing of parquet objects. if the package is not installed, only CSV is usable

remove redundant try/catch, s3select: fix compile warning

arrow-devel version should be higher than 4.0.0, where arrow::io::AsyncContext became deprecated

missing sudo; wrong url;move the rm -f arrow.list

replace codename with $(lsb_release -sc)

arrow version should be >= 4.0.0; iocontext does not exist in the namespace on lower versions

RGW points to s3select/master

s3select submodule

sudo --> $SUDO

Signed-off-by: gal salomon <gal.salomon@gmail.com>
2022-01-12 23:15:21 +02:00
Casey Bodley
95544e802b qa/rgw: add PG_DEGRADED cluster warnings to log-ignorelist
and cover rgw/singleton suite

Fixes: https://tracker.ceph.com/issues/51727

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2022-01-12 15:56:38 -05:00
Ilya Dryomov
651f0fbc08
Merge pull request #43494 from majianpeng/enable-test-librbd-BlockGuard
test/librbd: re-enable BlockGuard test

Reviewed-by: Mykola Golub <mgolub@suse.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
2022-01-12 21:50:00 +01:00
Radoslaw Zarzynski
62650c2720 test/objectstore: verify the huge page-backed reading of BlueStore.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2022-01-12 20:35:50 +00:00
Radoslaw Zarzynski
936e578bf9 common: introduce instrumented_raw to buffer_instrumentation
Its initial user will be a unit test for BlueStore's huge
page-backed reading.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2022-01-12 20:35:50 +00:00
Radoslaw Zarzynski
a0777ce5ac common, test: move instrumented_bptr to a dedicated header.
We're going to reuse it outside `test/bufferlist.cc`.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2022-01-12 20:35:50 +00:00
Radoslaw Zarzynski
1362134171 blk: don't cache the huge page-based buffers of KernelDevice.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2022-01-12 20:35:50 +00:00
Radoslaw Zarzynski
9ad03651c6 blk: introduce multi-size huge page pools to KernelDevice.
When testing, keep `bluestore_max_blob_size` in mind: it is
only 64 KB by default, while the entire huge page-based pool
machinery targets far bigger scenarios (initially 4 MB!).

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2022-01-12 20:35:50 +00:00
Radoslaw Zarzynski
a3c8090ea5 blk: move the buffer size of ExplicitHugePagePool to run-time.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2022-01-12 20:35:50 +00:00
Radoslaw Zarzynski
64aae2b955 blk: bring MAP_HUGETLB-based buffer pool to KernelDevice.
The idea here is to bring a pool of `mmap`-allocated,
fixed-size buffers which would take precedence
over the 2 MB-aligned, THP-based mechanism. On the first
attempt to acquire a 4 MB buffer, KernelDevice mmaps
`bdev_read_preallocated_huge_buffer_num` (default 128)
memory regions using the MAP_HUGETLB option. If this
fails, the entire process is aborted. Once their
lifetimes end, buffers are recycled through a lock-free
queue shared across the entire process.

Remember to allocate the appropriate number of
huge pages in the system! For instance:

```
echo 256 | sudo tee /proc/sys/vm/nr_hugepages
```

This commit is based on / cherry-picks with changes
897a4932bee5cba3641c18619cccd0ee945bfcf8.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2022-01-12 20:35:50 +00:00
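The pooling scheme described in 64aae2b955 (preallocate on first acquire, then recycle buffers through a shared queue) can be sketched in Python as below. This is an illustration under several assumptions, not the KernelDevice implementation: the class and method names are hypothetical, plain anonymous `mmap` regions stand in for `MAP_HUGETLB` mappings, a `collections.deque` stands in for the lock-free queue, and returning `None` when the pool is exhausted (so a caller could fall back to another allocator) is an assumed policy.

```python
import mmap
from collections import deque


class PreallocatedBufferPool:
    """Fixed-size buffer pool, populated lazily on the first acquire (illustrative only)."""

    def __init__(self, buffer_size, buffer_count):
        self.buffer_size = buffer_size
        self.buffer_count = buffer_count
        self._free = None  # created on the first acquire

    def _populate(self):
        # Anonymous mappings stand in for MAP_HUGETLB regions in this sketch.
        # An allocation failure simply propagates as an exception here.
        self._free = deque(
            mmap.mmap(-1, self.buffer_size) for _ in range(self.buffer_count)
        )

    def acquire(self):
        if self._free is None:
            self._populate()
        try:
            return self._free.popleft()
        except IndexError:
            return None  # pool exhausted; assumed policy: caller falls back

    def release(self, buf):
        self._free.append(buf)  # recycle the mapping instead of unmapping it


pool = PreallocatedBufferPool(buffer_size=4096, buffer_count=2)
buf = pool.acquire()
buf[:5] = b"hello"  # the buffer is ordinary writable memory
pool.release(buf)
```

The sizes in the usage lines are deliberately tiny; the commit itself speaks of 4 MB buffers and a default count of 128.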
Radoslaw Zarzynski
67ce52f5f9 blk: make the buffer alignment configurable in KernelDevice.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2022-01-12 20:35:50 +00:00
Radoslaw Zarzynski
9768120e9a blk, os/bluestore: introduce a cache bypassing to IOContext and BlueStore.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2022-01-12 20:35:50 +00:00
Samuel Just
39e0e7b8a3
Merge pull request #44478 from cyx1231st/wip-crimson-improve-log-3
crimson/os/seastore/../segment_manager: improve logs and validations

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Xuehan Xu <xuxuehan@360.cn>
Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
2022-01-12 12:27:14 -08:00
Ilya Dryomov
b47965b577 qa/tasks/qemu: get the new Let's Encrypt root certificate
Fixes: https://tracker.ceph.com/issues/53841
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2022-01-12 20:53:45 +01:00
Ilya Dryomov
387be94794 qa/run_xfstests_qemu.sh: harden against wget failures
If wget fails (e.g. due to a certificate issue), it still creates
an empty file.  Then this file is marked executable, ./"${SCRIPT}"
immediately returns 0 and run_xfstests_qemu.sh exits successfully
without running a single xfstest.

This started on Sep 30, 2021 with the expiration of Let's Encrypt
root certificate -- all qemu jobs with "test: qa/run_xfstests_qemu.sh"
just booted the VM for a couple of seconds and reported success.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2022-01-12 20:53:45 +01:00
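The failure mode described in 387be94794 (a failed download leaves an empty file, which then "runs" successfully) suggests a guard of the following shape. This Python sketch is not the change made to `run_xfstests_qemu.sh`; `make_runnable` is a hypothetical helper that only shows the idea of refusing to mark an empty or missing download as executable.

```python
import os
import stat


def make_runnable(path):
    """Mark a downloaded script executable only if the download produced real content."""
    if not os.path.isfile(path) or os.path.getsize(path) == 0:
        # An empty file would execute as a no-op and exit 0, hiding the failure.
        raise RuntimeError(f"download failed or produced an empty file: {path}")
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR)
```

Failing loudly at this point turns a silent "success" into a visible error before any test is skipped.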
Casey Bodley
d01ee1122f
Merge pull request #44536 from yuvalif/wip-yuval-dynamic-reshard
rgw: fix dynamic reshard happening during user stats sync

Reviewed-by: Casey Bodley <cbodley@redhat.com>
2022-01-12 14:14:41 -05:00
Daniel Gryniewicz
661cda66d7 RGW Zipper - don't load stats for every bucket load
This was a side-effect of consolidating the Zipper API, and resulted in
a large performance hit.  Stats are only needed if they are requested,
so don't load them every time.

Signed-off-by: Daniel Gryniewicz <dang@redhat.com>
2022-01-12 12:47:40 -05:00
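The idea in 661cda66d7, loading stats only when they are requested, can be illustrated with a small Python sketch. The `Bucket` class, its `load` method, and the `fetch_stats` callable are hypothetical and do not reflect the Zipper API; they only show the pattern of deferring an expensive read.

```python
class Bucket:
    """Hypothetical bucket handle that fetches stats only on request."""

    def __init__(self, name, fetch_stats):
        self.name = name
        self._fetch_stats = fetch_stats  # callable standing in for the expensive stats read
        self.stats = None

    def load(self, want_stats=False):
        # Loading the bucket itself stays cheap; stats are read only when asked for,
        # and at most once per handle.
        if want_stats and self.stats is None:
            self.stats = self._fetch_stats(self.name)
        return self


calls = []
bucket = Bucket("logs", lambda name: calls.append(name) or {"num_objects": 0})
bucket.load()                 # no stats read
bucket.load(want_stats=True)  # one stats read
assert calls == ["logs"]
```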
Yuri Weinstein
a8bb49d4d9
Merge pull request #39440 from pdvian/wip-warn-filestore-osds
mon/OSDMonitor, osd: Add warning on filestore deprecation and force use of wpq scheduler for filestore OSDs

Reviewed-by: Neha Ojha <nojha@redhat.com>
2022-01-12 08:49:02 -08:00
Gabriel BenHanokh
a39b1f3cf7 tools/ceph-bluestore-tool: Fix bluefs-bdev-expand command
Update the allocation file when we expand the device.
Add the expanded space to the allocator and then force an update to the allocation file.

There is also a new standalone test case for expand

Fixes: https://tracker.ceph.com/issues/53699
Signed-off-by: Gabriel Benhanokh <gbenhano@redhat.com>
2022-01-12 18:07:59 +02:00
Daniel Gryniewicz
cca74fa8dd
Merge pull request #41778 from felixhuettner/fix_subdir_name
rgw: RGWSwiftWebsiteHandler::is_web_dir checks empty subdir_name
2022-01-12 09:22:04 -05:00
Daniel Gryniewicz
be54d5f681
Merge pull request #38234 from inspur-wyq/wip-copy-obj-check-size
rgw : check the object size when copy obj
2022-01-12 09:21:48 -05:00
Daniel Gryniewicz
2645ff5ab7
Merge pull request #38532 from Rjerk/wip-empty-tagset
rgw: an empty tagset is allowed by S3
2022-01-12 09:21:35 -05:00
Daniel Gryniewicz
72e1208321
Merge pull request #40573 from Huber-ming/rgw_admin
rgw: delete abbreviation for option "--new-uid"
2022-01-12 09:21:15 -05:00
Daniel Gryniewicz
8d2b3d578f
Merge pull request #40575 from Huber-ming/rgw_admin-f
radosgw-admin: delete the abbreviation of option "--infile"
2022-01-12 09:20:49 -05:00
Ilya Dryomov
966830f651
Merge pull request #44500 from idryomov/wip-rbd-test-group-leak
test/librbd: fix group_info.name leaks in TestGroup.add_image

Reviewed-by: Mykola Golub <mgolub@suse.com>
2022-01-12 12:56:09 +01:00
Aishwarya Mathuria
91885f1a87 qa/standalone: add test to check if objects_scrubbed is equal to number of objects in a PG once a scrub finishes
Signed-off-by: Aishwarya Mathuria <amathuri@redhat.com>
2022-01-12 14:57:40 +05:30
Aishwarya Mathuria
fbee00afa6 osd/scrub: Add stats to PG dump for number of objects scrubbed
Addition of a new column in PG dump, OBJECTS_SCRUBBED, which keeps track of the number of objects scrubbed.

Signed-off-by: Aishwarya Mathuria <amathuri@redhat.com>
2022-01-12 14:57:27 +05:30
Liu-Chunmei
5fe65bc92b
Merge pull request #44490 from liu-chunmei/crimson-fix-aligned
crimson: fix assert_aligned(size) in trim_data_reservation

reviewed by: Samuel Just <sjust@redhat.com> , Yingxin <yingxin.cheng@intel.com>
2022-01-11 22:05:22 -08:00
Yingxin Cheng
bd58665f33 crimson/os/seastore/journal: fast submit if RecordSubmitter is IDLE and no pending
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2022-01-12 13:47:47 +08:00
Yingxin Cheng
8aaaeea814 crimson/os/seastore/../segment_manager: add more validations
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2022-01-12 13:43:52 +08:00
Yingxin Cheng
90ce0f046d crimson/os/seastore/../segment_manager: consolidate logs with structured level and format
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2022-01-12 13:43:52 +08:00
Yingxin Cheng
ecad0f8d68 crimson/os/seastore/../segment_manager: cleanup device_id usage
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2022-01-12 13:43:52 +08:00
Yingxin Cheng
d5b0cd1392 crimson/os/seastore/../segment_manager: pretty print data structures
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2022-01-12 13:43:43 +08:00
Josh Durgin
52995842a9 doc/rbd/rbd-config-ref: group QoS options by throttle type
This makes it clearer that there are distinct throttles with the
same groups of settings.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2022-01-11 21:46:37 -05:00
Josh Durgin
146b40ecd6 doc/releases: mark nautilus eol
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2022-01-11 21:13:29 -05:00
Josh Durgin
3bc27e21e5 doc/releases: remove obsolete info
We haven't done dev releases for years, and versions prior to luminous
are no longer relevant.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2022-01-11 21:04:55 -05:00
Yingxin Cheng
549036edd8 crimson/os/seastore/../segment_manager: convert to seastore logging
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2022-01-12 09:29:08 +08:00
Yingxin Cheng
3405661fec crimson/os/seastore/../segment_manager: suppress compile warning about unused logger
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2022-01-12 09:29:08 +08:00
Yingxin
f77aae9731
Merge pull request #44532 from rzarzynski/wip-crimson-fix-test-runner
test/crimson: fix a race condition in SeastarRunner

Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
2022-01-12 09:23:53 +08:00
Neha Ojha
c365b8da55
Merge pull request #43593 from ljflores/wip-rocksdb
mgr: expose rocksdb version number for use in telemetry

Reviewed-by: Kefu Chai <tchaikov@gmail.com>
2022-01-11 16:31:41 -08:00
chunmei-liu
7594b61826 crimson: fix assert_aligned(size) in trim_data_reservation
Signed-off-by: chunmei-liu <chunmei.liu@intel.com>
signed-off-by: Samuel Just <sjust@redhat.com>
2022-01-11 15:16:16 -08:00
Laura Flores
394cbf98e6 mgr/telemetry: add the rocksdb version number to telemetry
Capturing the RocksDB version number in Telemetry would allow us to check that users are using the appropriate RocksDB version for their Ceph cluster. For instance, if a user is working in a Pacific cluster, but their RocksDB version is meant for Nautilus, that might be a problem.

It is structured as "rocksdb_stats" --> "version" in anticipation of more stats that will be added under "rocksdb_stats".

Signed-off-by: Laura Flores <lflores@redhat.com>
2022-01-11 23:04:01 +00:00
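The nesting described in 394cbf98e6 can be pictured with a short Python sketch. Only the keys "rocksdb_stats" and "version" come from the commit message; `build_report` is a hypothetical helper and the version string is a placeholder, not a real value.

```python
import json


def build_report(rocksdb_version):
    """Nest the version under "rocksdb_stats" so further stats can be added beside it."""
    return {"rocksdb_stats": {"version": rocksdb_version}}


report = build_report("X.Y.Z")  # "X.Y.Z" is a placeholder version string
print(json.dumps(report, indent=2))
```

Keeping the version one level down means later additions become sibling keys under "rocksdb_stats" without changing the shape of the report.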
Laura Flores
7a4e747536 mgr: expose rocksdb version number in the mgr module
It is only necessary here to link the rocksdb include directory
since the mgr simply needs access to the rocksdb version numbers.

Signed-off-by: Laura Flores <lflores@redhat.com>
Co-authored-by: Kefu Chai <tchaikov@gmail.com>
Co-authored-by: Adam Kupczyk <akupczyk@redhat.com>
2022-01-11 23:03:54 +00:00
Neha Ojha
e76e994bb2
Merge pull request #43794 from aclamk/wip-bluefs-fine-grain-locking-4
os/bluestore: BlueFS fine grain locking

Reviewed-by: Igor Fedotov <igor.fedotov@croit.io>
2022-01-11 13:59:16 -08:00