Commit Graph

132063 Commits

Author SHA1 Message Date
Ernesto Puerta
9a527b3f98
Merge pull request #46370 from rhcs-dashboard/stop-poll-page-inactive
mgr/dashboard: stop polling when page is not visible

Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
2022-06-14 13:23:59 +02:00
Ernesto Puerta
a84541439c
Merge pull request #46475 from rhcs-dashboard/01-hosts-failure
mgr/dashboard: fix drain e2e failure 

Reviewed-by: Sarthak Gupta <sarthak.dev.0702@gmail.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
2022-06-14 13:21:26 +02:00
Venky Shankar
80b2776c62
Merge pull request #44567 from lxbsz/client_reply
client: do nothing when get a stale reply

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Ramana Raja <rraja@redhat.com>
2022-06-14 15:16:27 +05:30
Yuval Lifshitz
e2aa88eddb
Merge pull request #46313 from zenomri/wip-omri-tracing-lua
rgw: add SetAttribute and AddEvent functions for TraceMetaTable in Lua

Reviewed-by: yuvalif
2022-06-14 12:22:21 +03:00
Ernesto Puerta
8bba174615
Merge pull request #46407 from melissa-kun-li/disable-create-image
mgr/dashboard: add rbd status endpoint and error page

Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: sunilangadi2 <NOT@FOUND>
2022-06-14 11:00:21 +02:00
Nikhilkumar Shelke
dc4b0ee405 qa: subvolume ls command crashes if groupname as '_nogroup'
If --group_name=_nogroup is provided in the command, throw a
permission denied error, as _nogroup is an internal group of CephFS.

Fixes: https://tracker.ceph.com/issues/55759
Signed-off-by: Nikhilkumar Shelke <nshelke@redhat.com>
2022-06-14 12:34:51 +05:30
Nikhilkumar Shelke
acf1337334 mgr/volumes: subvolume ls command crashes if groupname as '_nogroup'
If --group_name=_nogroup is provided in the command, throw a
permission denied error, as _nogroup is an internal group of CephFS.

Fixes: https://tracker.ceph.com/issues/55759
Signed-off-by: Nikhilkumar Shelke <nshelke@redhat.com>
2022-06-14 12:17:23 +05:30
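The check described in the two commits above can be sketched in shell as follows. This is only an illustration of rejecting the reserved name up front: the function name `check_group_name`, the error text, and the return code are assumptions made for this sketch and are not taken from the mgr/volumes code.

```shell
#!/bin/sh
# Hypothetical validation helper; the name, message and return code are
# illustrative assumptions, not the mgr/volumes implementation.
check_group_name() {
    if [ "$1" = "_nogroup" ]; then
        # Reject the internal group name before doing any listing work.
        echo "permission denied: '_nogroup' is an internal group" >&2
        return 1
    fi
    return 0
}

if check_group_name "_nogroup" 2>/dev/null; then
    echo "accepted"
else
    echo "rejected"
fi
```

Validating the argument before the listing runs turns a crash into an ordinary error return.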
zdover23
6944600e6a
Merge pull request #46651 from zdover23/wip-doc-2022-06-13-dev-guide-essentials-master-to-main
doc/dev: s/master/main/ essentials.rst dev guide

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
2022-06-14 16:27:14 +10:00
Xiubo Li
72194627c1 qa: wait rank 0 to become up:active state before mounting fuse client
When the EC pool is set in the layout, the filesystem may not be
ready yet, so mounting a fuse client can fail. To fix this, wait
until at least rank 0 is in the up:active state.

Fixes: https://tracker.ceph.com/issues/55824
Signed-off-by: Xiubo Li <xiubli@redhat.com>
2022-06-14 09:20:45 +08:00
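The wait described in the commit above amounts to polling a state until it matches, with a bounded number of attempts. A minimal shell sketch, assuming a hypothetical probe command that prints the current state of rank 0: `wait_for_state` and `fake_rank0_state` are names invented here, and the stand-in probe does not query a real cluster.

```shell
#!/bin/sh
# wait_for_state PROBE WANT TRIES DELAY
# Runs PROBE up to TRIES times, sleeping DELAY seconds between attempts,
# and succeeds as soon as PROBE prints WANT.
wait_for_state() {
    probe=$1 want=$2 tries=$3 delay=$4
    i=0
    while [ "$i" -lt "$tries" ]; do
        state=$($probe)
        if [ "$state" = "$want" ]; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Stand-in probe for illustration only; a real test would query the MDS map.
fake_rank0_state() { echo "up:active"; }

if wait_for_state fake_rank0_state "up:active" 5 0; then
    echo "rank 0 is active, safe to mount"
else
    echo "timed out waiting for rank 0"
fi
```

Bounding the number of attempts keeps a stuck filesystem from hanging the test indefinitely.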
Anthony D'Atri
bb5f95a15a
Merge pull request #46659 from anthonyeleven/anthonyeleven-46637-followup
doc/start: Polish network section of hardware-recommendations.rst
2022-06-13 16:58:08 -07:00
Anthony D'Atri
2eb173fef9 doc/start: Polish network section of hardware-recommendations.rst
Harmonize network throughput notation, minor tweaks to wording.
Followup to #46637

Signed-off-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
2022-06-13 16:13:53 -07:00
Zac Dover
728b8f2674 doc/dev: s/master/main/ essentials.rst dev guide
This PR changes all references to the "master" branch
to references to the "main" branch (because we renamed
"master" to "main", and the docs now need to reflect that).

Signed-off-by: Zac Dover <zac.dover@gmail.com>
2022-06-14 07:48:46 +10:00
zdover23
0eca78faef
Merge pull request #46637 from zdover23/wip-doc-2022-06-12-start-intro-networks-rewrite
doc/start: rewrite hardware-recs networks section

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
2022-06-14 07:11:12 +10:00
Casey Bodley
6f765e25ab
Merge pull request #43597 from pritha-srivastava/wip-rgw-sts-role-multisite
rgw multisite: replicate metadata for iam roles

Reviewed-by: Casey Bodley <cbodley@redhat.com>
2022-06-13 12:04:12 -04:00
Yuval Lifshitz
9e290f4012
Merge pull request #46568 from yuvalif/wip-yuval-lua-counters
rgw/lua: add counters to background table

Reviewed-by: mattbenjamin
Reviewed-by: Matan-b
2022-06-13 18:18:56 +03:00
Nizamudeen A
dcf0445153 mgr/dashboard: fix drain e2e failure
Cypress sometimes fails to register the click, which causes the
deselect/select to not happen properly. Deselecting the row immediately
after performing the action makes the test pass in Cypress.

Fixes: https://tracker.ceph.com/issues/55741
Signed-off-by: Nizamudeen A <nia@redhat.com>
2022-06-13 14:00:53 +05:30
Xiubo Li
1bbe6bbc29 client: do nothing when get a stale reply
In theory, getting a stale reply from an incorrect session means
there is a bug in the MDS. Either way, we should discard it without
doing anything.

Signed-off-by: Xiubo Li <xiubli@redhat.com>
2022-06-13 16:19:48 +08:00
Venky Shankar
2667314621
Merge pull request #44655 from lxbsz/wip-53741
mds: clear MDCache::rejoin_*_q queues before recovering file inodes 

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Ramana Raja <rraja@redhat.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Reviewed-by: Jos Collin <jcollin@redhat.com>
Reviewed-by: Nikhilkumar Shelke <nshelke@redhat.com>
2022-06-13 13:40:04 +05:30
Venky Shankar
27f4729256
Merge pull request #45556 from mchangir/qa-add-subvolume-option-flavors
qa: add subvolume option flavors

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Ramana Raja <rraja@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
2022-06-13 12:29:43 +05:30
Venky Shankar
f7bc95c2f7
Merge pull request #44347 from kotreshhr/subvolumegroup-quotas
mgr/volumes: subvolumegroup quotas

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Ramana Raja <rraja@redhat.com>
2022-06-13 12:26:59 +05:30
Venky Shankar
67371c1ab4
Merge pull request #46332 from lxbsz/qa-snap
qa: enlarge the tag number and test more for the snapshot

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2022-06-13 12:25:44 +05:30
Omri Zeneva
6e43859106 doc: add explanation about the new two functions and example
Signed-off-by: Omri Zeneva <ozeneva@redhat.com>
2022-06-13 02:05:11 -04:00
Omri Zeneva
61054347f3 test: add unit tests
- added trace initialization
- opentelemetry linking when needed
- conditional ASSERT on SetBadAttribute: when we don't have opentelemetry (tracing sdk), we expect a different result from the execute function.

Signed-off-by: Omri Zeneva <ozeneva@redhat.com>
2022-06-13 02:05:11 -04:00
Omri Zeneva
923f4542cf rgw: add functionality of SetAttribute and AddEvent method in postRequest context
opentelemetry mainly supports string, int64, double and boolean values for a trace's Attributes,
so we should validate those types and static cast to the proper type, which differs from the Lua types.

The SetAttribute closure is returned to Lua only if the request's trace is real, and not a noop span or even null, as happens in the preRequest context.

The AddEvent method gives us the ability to record an event. The event can be a single string that represents the event, or an event name and key-value pairs.

Signed-off-by: Omri Zeneva <ozeneva@redhat.com>
2022-06-13 02:05:11 -04:00
dparmar18
269567d005 qa/cephfs: fix read_debug_file() return value and a pep8 violation
Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
2022-06-13 11:33:43 +05:30
dparmar18
0aca27d2fc qa/cephfs: fallback to older way of get_op_read_count
Fixes: https://tracker.ceph.com/issues/55538

Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
2022-06-13 11:33:43 +05:30
Zac Dover
778d3c0b59 doc/start: rewrite hardware-recs networks section
This rewrites the first two-thirds of the "Networks"
section of the Hardware Recommendations page in the
Intro to Ceph document. I have tried to divide the
technical content in this section into subsections
that foreground the various subjects covered.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
2022-06-13 14:34:36 +10:00
Anthony D'Atri
6c46e1d842
Merge pull request #46634 from Huber-ming/pr_target_main
SubmittingPatches.rst: PRs should target "main"
2022-06-12 20:47:48 -07:00
Anthony D'Atri
052877aaef
Merge pull request #46583 from zdover23/wip-doc-2022-06-08-intro-hardware-recs-osd-acro-fix
doc/start: make OSD and MDS structures parallel
2022-06-12 20:32:05 -07:00
Anthony D'Atri
d4ef619a8d
Merge pull request #46633 from zdover23/wip-doc-2022-06-12-start-intro-crush-para-fix
doc/start: rewrite CRUSH para
2022-06-12 20:26:02 -07:00
Huber-ming
a892525c71 SubmittingPatches.rst: PRs should target "main"
Signed-off-by: Huber-ming <zhangsm01@inspur.com>
2022-06-13 09:38:40 +08:00
Zac Dover
4f6edb92b9 doc/start: make OSD and MDS structures parallel
This PR makes the "Ceph OSDs" and "MDSs" bullet points
parallel by naming "object storage daemon" before referring
to the (admittedly more common and colloquial, but surely
unknown to people who genuinely require a document called
'Intro') acronym "OSD".

Signed-off-by: Zac Dover <zac.dover@gmail.com>
2022-06-13 10:12:57 +10:00
Zac Dover
ba1a85b292 doc/start: rewrite CRUSH para
This PR supersedes https://github.com/ceph/ceph/pull/46584
and makes changes suggested by Anthony D'Atri that improve
the coherence and consistency of the paragraph that explains
the basics of the CRUSH algorithm.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
2022-06-13 09:41:28 +10:00
Nikhilkumar Shelke
9957a036df qa: remove incorrect 'size' from output of 'snapshot info'
The 'size' shown in the output of the snapshot info command relies on
rstats, which do not give the correct snapshot size: they track the
size of the subvolume the snapshot was taken from instead of the
snapshot itself. Hence having the 'size' field in the output of
'snapshot info' doesn't make sense until rstats is fixed.

Fixes: https://tracker.ceph.com/issues/55822
Signed-off-by: Nikhilkumar Shelke <nshelke@redhat.com>
2022-06-12 17:17:03 +05:30
Nikhilkumar Shelke
86f4cd3dca docs: remove incorrect 'size' from output of 'snapshot info'
The 'size' shown in the output of the snapshot info command relies on
rstats, which do not give the correct snapshot size: they track the
size of the subvolume the snapshot was taken from instead of the
snapshot itself. Hence having the 'size' field in the output of
'snapshot info' doesn't make sense until rstats is fixed.

Fixes: https://tracker.ceph.com/issues/55822
Signed-off-by: Nikhilkumar Shelke <nshelke@redhat.com>
2022-06-12 17:16:49 +05:30
Nikhilkumar Shelke
9f514a0bd1 mgr/volumes: remove incorrect 'size' from output of 'snapshot info'
The 'size' shown in the output of the snapshot info command relies on
rstats, which do not give the correct snapshot size: they track the
size of the subvolume the snapshot was taken from instead of the
snapshot itself. Hence having the 'size' field in the output of
'snapshot info' doesn't make sense until rstats is fixed.

Fixes: https://tracker.ceph.com/issues/55822
Signed-off-by: Nikhilkumar Shelke <nshelke@redhat.com>
2022-06-12 17:16:29 +05:30
Nikhilkumar Shelke
2b0ffbb36d docs: add details about all options used in 'ceph fs new' command
Fixes: https://tracker.ceph.com/issues/54111
Signed-off-by: Nikhilkumar Shelke <nshelke@redhat.com>
2022-06-12 16:25:40 +05:30
Nikhilkumar Shelke
0f4add67eb qa: verify command status if data or metadata pool already in use
Fixes: https://tracker.ceph.com/issues/54111
Signed-off-by: Nikhilkumar Shelke <nshelke@redhat.com>
2022-06-12 16:25:34 +05:30
Nikhilkumar Shelke
55b7e285cb mon: verify pool is already not in use by any other app or fs
A pool should not be shared between multiple file systems.
Hence add a check to verify that the pool is not already in use.

Fixes: https://tracker.ceph.com/issues/54111
Signed-off-by: Nikhilkumar Shelke <nshelke@redhat.com>
2022-06-12 16:24:09 +05:30
Anthony D'Atri
b3cc593774 src/ceph-volume/ceph_volume/activate: Improve usage message text
Signed-off-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
2022-06-11 18:23:28 -07:00
Radosław Zarzyński
a2190f901a tools: ceph-objectstore-tool is able to trim pg log dups' entries.
The main assumption is that trimming just dups doesn't need any update
to the corresponding pg_info_t.

Testing:

1. cluster without the autoscaler
```
rzarz@ubulap:~/dev/ceph/build$ MON=1 MGR=1 OSD=3 MGR=1 MDS=0 ../src/vstart.sh -l -b -n -o "osd_pg_log_dups_tracked=3000000" -o "osd_pool_default_pg_autoscale_mode=off"
```

2. 8 PGs in the testing pool.
```
rzarz@ubulap:~/dev/ceph/build$ bin/ceph osd pool create test-pool 8 8
```

3. Provisioning dups with rados bench
```
bin/rados bench -p test-pool 300 write -b 4096  --no-cleanup
...
Total time run:         300.034
Total writes made:      103413
Write size:             4096
Object size:            4096
Bandwidth (MB/sec):     1.34637
Stddev Bandwidth:       0.589071
Max bandwidth (MB/sec): 2.4375
Min bandwidth (MB/sec): 0.902344
Average IOPS:           344
Stddev IOPS:            150.802
Max IOPS:               624
Min IOPS:               231
Average Latency(s):     0.0464151
Stddev Latency(s):      0.0183627
Max latency(s):         0.0928424
Min latency(s):         0.0131932
```

4. Killing osd.0
```
rzarz@ubulap:~/dev/ceph/build$ kill 2572129 # pid of osd.0
```

5. Listing PGs on osd.0 and calculating number of pg log's entries and
dups:

```
rzarz@ubulap:~/dev/ceph/build$ bin/ceph-objectstore-tool --data-path dev/osd0 --op list-pgs --pgid 2.c > osd0_pgs.txt
rzarz@ubulap:~/dev/ceph/build$ for pgid in `cat osd0_pgs.txt`; do echo $pgid; bin/ceph-objectstore-tool --data-path dev/osd0 --op log --pgid $pgid | jq '(.pg_log_t.log|length),(.pg_log_t.dups|length)'; done
2.7
10020
3100
2.6
10100
3000
2.3
10012
2800
2.1
10049
2900
2.2
10057
2700
2.0
10027
2900
2.5
10077
2700
2.4
10072
2900
1.0
97
0
```

6. Trimming dups
```
rzarz@ubulap:~/dev/ceph/build$ CEPH_ARGS="--osd_pg_log_dups_tracked 2500 --osd_pg_log_trim_max=100" bin/ceph-objectstore-tool --data-path dev/osd0 --op trim-pg-log-dups --pgid 2.7
max_dup_entries=2500 max_chunk_size=100
Removing keys dup_0000000020.00000000000000000001 - dup_0000000020.00000000000000000100
Removing keys dup_0000000020.00000000000000000101 - dup_0000000020.00000000000000000200
Removing keys dup_0000000020.00000000000000000201 - dup_0000000020.00000000000000000300
Removing keys dup_0000000020.00000000000000000301 - dup_0000000020.00000000000000000400
Removing keys dup_0000000020.00000000000000000401 - dup_0000000020.00000000000000000500
Removing keys dup_0000000020.00000000000000000501 - dup_0000000020.00000000000000000600
Finished trimming, now compacting...
Finished trimming pg log dups
```

7. Checking number of pg log's entries and dups
```
rzarz@ubulap:~/dev/ceph/build$ for pgid in `cat osd0_pgs.txt`; do echo $pgid; bin/ceph-objectstore-tool --data-path dev/osd0 --op log --pgid $pgid | jq '(.pg_log_t.log|length),(.pg_log_t.dups|length)'; done
2.7
10020
2500
2.6
10100
3000
2.3
10012
2800
2.1
10049
2900
2.2
10057
2700
2.0
10027
2900
2.5
10077
2700
2.4
10072
2900
1.0
97
0
```

Fixes: https://tracker.ceph.com/issues/53729
Signed-off-by: Radosław Zarzyński <rzarzyns@redhat.com>
2022-06-12 00:44:29 +02:00
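The chunking visible in step 6 of the commit above (3100 dups trimmed to 2500 in chunks of 100, giving six "Removing keys" lines) can be reproduced with simple arithmetic. The sketch below is an illustration of that arithmetic only, not the tool's implementation; `trim_chunks` is a hypothetical helper, and its ranges are relative entry indices rather than real key names.

```shell
#!/bin/sh
# trim_chunks TOTAL KEEP CHUNK
# Prints the relative index ranges that would be removed when TOTAL dups
# are trimmed down to KEEP entries, at most CHUNK entries per step.
# Hypothetical helper for illustration; not part of ceph-objectstore-tool.
trim_chunks() {
    total=$1 keep=$2 chunk=$3
    to_remove=$((total - keep))
    start=1
    while [ "$to_remove" -gt 0 ]; do
        n=$chunk
        if [ "$to_remove" -lt "$n" ]; then
            n=$to_remove    # last chunk may be smaller than CHUNK
        fi
        end=$((start + n - 1))
        echo "chunk $start - $end"
        start=$((end + 1))
        to_remove=$((to_remove - n))
    done
}

# Values from the test run above: PG 2.7 had 3100 dups, the limit was
# 2500 and the chunk size was 100.
trim_chunks 3100 2500 100
```

With those values the helper prints six ranges, from `chunk 1 - 100` to `chunk 501 - 600`, matching the six removal lines in the transcript. When the total does not exceed the limit, nothing is printed.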
Radosław Zarzyński
e312733598 Revert "tools/ceph_objectstore_took: Add duplicate entry trimming"
This reverts commit 9fb7ec61ba.

Although the chunking in off-line `dups` trimming (via COT) seems
fine, the `ceph-objectstore-tool` is a client of `trim()` of
`PGLog::IndexedLog`, which means that a partial revert is not
possible without extensive changes. Moreover, trimming pg log
is not enough without modifying pg_info_t accordingly which
the reverted patch lacks.

Fixes: https://tracker.ceph.com/issues/53729
Signed-off-by: Radosław Zarzyński <rzarzyns@redhat.com>
2022-06-12 00:44:29 +02:00
Radosław Zarzyński
9bf0053bc9 Revert "osd/PGLog.cc: Trim duplicates by number of entries"
This reverts commit 0d253bcc09,
which is the in-OSD part of the fix for accumulation of `dup`
entries in a PG Log. Discussing it has raised questions
about the OSD's behaviour during an upgrade if there are tons of
dups in the log. What must be double-checked before bringing
it back is that we chunk the deletions properly so as not to
cause OOMs / stalls in, for example, RocksDB.

Fixes: https://tracker.ceph.com/issues/53729
Signed-off-by: Radosław Zarzyński <rzarzyns@redhat.com>
2022-06-12 00:44:29 +02:00
Anthony D'Atri
3e24921adc doc/man/8: Tweak formatting and wording in ceph.rst
Signed-off-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
2022-06-11 14:21:50 -07:00
Anthony D'Atri
784f5bb9bf
Merge pull request #46200 from elacunza/doc-man-ceph-add-enable_stretch_mode
doc/man/8: Add enable_stretch_mode docs
2022-06-11 14:02:32 -07:00
Anthony D'Atri
de85ee65ed
Merge pull request #46462 from Thingee/update-foundation-mems-202205
doc: Updating Ceph Foundation members for May
2022-06-11 13:43:48 -07:00
Anthony D'Atri
5af1f5f3cc
Merge pull request #46195 from snosratiershad/fix-docs-double-dash-convertion-to-em-dash
doc: Disable double dashes "--" smartquotes conversion to en-dashes
2022-06-11 13:32:15 -07:00
Salar Nosrati-Ershad
26d44bcae8 doc: Disable double dashes "--" smartquotes conversion to en-dashes
2022-06-11 23:34:06 +04:30
Laura Flores
4c9fd0c272
Merge pull request #46604 from ljflores/wip-librados-test-fix
test/librados: modify LibRadosMiscConnectFailure.ConnectFailure to comply with new seconds unit
2022-06-10 11:56:19 -05:00
Yuri Weinstein
986146b0a6
Merge pull request #46606 from rzarzynski/wip-55982
osd: log the number of 'dups' entries in a PG Log

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Laura Flores <lflores@redhat.com>
2022-06-10 09:39:24 -07:00