338 Commits

Author SHA1 Message Date
Sage Weil
81fc73b24e Merge PR #32788 into master
* refs/pull/32788/head:
	qa/tasks/mgr/dashboard: set pg_num to 32
	mgr/pg_autoscaler: default to pg_num[_min] = 32

Reviewed-by: Sage Weil <sage@redhat.com>
2020-01-24 08:32:13 -06:00
Neha
2451b93d10 doc/rados/configuration: document osd_max_pg_log_entries
Signed-off-by: Neha Ojha <nojha@redhat.com>
2020-01-22 19:21:59 +00:00
Neha
75b4707ca9 doc/rados: update osd_min_pg_log_entries
update docs to match 0db140c15c

Signed-off-by: Neha Ojha <nojha@redhat.com>
2020-01-22 19:15:11 +00:00
Neha
0c7991c0e8 mgr/pg_autoscaler: default to pg_num[_min] = 32
78bf924480 increased the default to 16.
Increasing it further to 32 will provide enough parallelism to improve
out of the box performance for new users.

Fixes: https://tracker.ceph.com/issues/43757
Signed-off-by: Neha Ojha <nojha@redhat.com>
2020-01-22 15:21:21 +00:00
Jos Collin
1033d5b373
doc: mounting CephFS subdirectory and Persistent Mounts cleanup
Fixes: https://tracker.ceph.com/issues/37746
Signed-off-by: Jos Collin <jcollin@redhat.com>
2020-01-06 18:33:26 +05:30
Neha Ojha
0852827258
Merge pull request #32226 from neha-ojha/wip-four-percent
doc/rados: Better block.db size recommendations for bluestore

Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Igor Fedotov <ifedotov@suse.com>
2019-12-13 10:23:34 -08:00
Neha
36cdafbcca doc/rados: Better block.db size recommendations for bluestore
Signed-off-by: Neha Ojha <nojha@redhat.com>
2019-12-13 17:24:58 +00:00
Ramana Raja
aeaef1b4c5 mds: obsoleting 'mds_cache_size'
Remove last bits of support for 'mds_cache_size'.
'mds_cache_memory_limit' is preferred.

Fixes: https://tracker.ceph.com/issues/41951
Signed-off-by: Ramana Raja <rraja@redhat.com>
2019-12-02 14:51:25 +05:30
Sage Weil
ba01e1e951 Merge PR #31636 into master
* refs/pull/31636/head:
	mgr/pg_autoscaler: default to pg_num[_min] = 16

Reviewed-by: Mark Nelson <mnelson@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
2019-11-23 08:48:47 -06:00
Kefu Chai
1d0579566e
Merge pull request #31541 from BlackLotus/master
added a remark to always use powers of two for pg_num

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2019-11-19 09:24:08 +08:00
Sage Weil
78bf924480 mgr/pg_autoscaler: default to pg_num[_min] = 16
4 or 8 PGs don't provide much parallelism at baseline.  Start with 16
and set the floor there; that's a more reasonable number of OSDs that
will be put to work on a single pool.

Note that there is no magic number here.  At some point someone has to
tell Ceph if an empty pool should get lots of PGs across lots of devices
to get the full throughput of the cluster.  But this will be a bit less
painful/surprising for users.

Fixes: https://tracker.ceph.com/issues/42509
Signed-off-by: Sage Weil <sage@redhat.com>
2019-11-14 13:37:44 -06:00
Thomas Schneider
f84ea48b80 doc/rados, sample.ceph.conf: pg_num should always be a power of two
Signed-off-by: Thomas Schneider <thomas@brainfuck.space>
2019-11-11 22:56:35 +01:00
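
The power-of-two rule from commit f84ea48b80 can be checked with a few lines of Python. This is only an illustrative sketch: the helper names `is_power_of_two` and `nearest_power_of_two` are hypothetical and not part of Ceph, and rounding ties upward is an arbitrary choice made here.

```python
def is_power_of_two(n: int) -> bool:
    """Return True if n is a positive power of two (1, 2, 4, 8, ...)."""
    # A power of two has exactly one bit set, so n & (n - 1) clears it to 0.
    return n > 0 and (n & (n - 1)) == 0


def nearest_power_of_two(n: int) -> int:
    """Round a candidate pg_num to the nearest power of two (ties round up)."""
    if n < 1:
        raise ValueError("pg_num must be at least 1")
    lower = 1 << (n.bit_length() - 1)  # largest power of two <= n
    upper = lower << 1                 # smallest power of two > n
    return lower if (n - lower) < (upper - n) else upper


if __name__ == "__main__":
    for candidate in (16, 32, 100, 200):
        print(candidate, is_power_of_two(candidate), nearest_power_of_two(candidate))
```

A value such as 100 would be flagged by `is_power_of_two` and could be replaced by the result of `nearest_power_of_two` before it is applied to a pool.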
Sridhar Seshasayee
33c647e811 osd/OSDMap: Show health warning if a pool is configured with size 1
Introduce a config option called 'mon_warn_on_pool_no_redundancy' that is
used to show a health warning if any pool in the ceph cluster is
configured with a size of 1. The user can mute/unmute the warning using
'ceph health mute/unmute POOL_NO_REDUNDANCY'.

Add standalone test to verify warning on setting pool size=1. Set the
associated warning to 'false' in ceph.conf.template under qa/tasks so
that existing tests do not break.

Fixes: https://tracker.ceph.com/issues/41666
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
2019-11-11 10:36:35 +05:30
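
The check described in commit 33c647e811 can be sketched roughly as follows. This is a simplified model, not the OSDMap code: the function name `pools_without_redundancy` and the dictionary of pool sizes are assumptions made for illustration. Only the option name 'mon_warn_on_pool_no_redundancy' and the health code POOL_NO_REDUNDANCY come from the commit message.

```python
def pools_without_redundancy(pool_sizes, warn_enabled=True):
    """Return the names of pools that would trigger POOL_NO_REDUNDANCY.

    pool_sizes maps pool name -> replica count (the pool's 'size').
    warn_enabled models the 'mon_warn_on_pool_no_redundancy' option.
    """
    if not warn_enabled:
        # With the option set to false, no pool is reported.
        return []
    return sorted(name for name, size in pool_sizes.items() if size == 1)


if __name__ == "__main__":
    pools = {"rbd": 3, "scratch": 1, "cephfs_data": 2}
    print(pools_without_redundancy(pools))
    print(pools_without_redundancy(pools, warn_enabled=False))
```

Setting `warn_enabled=False` mirrors what the commit does in ceph.conf.template under qa/tasks so that existing tests are not affected.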
zhangdaolong
e5d91c9872 doc, qa:remove invalid option mon_pg_warn_max_per_osd
The older mon_pg_warn_max_per_osd option has been removed in v12.2.1 Luminous
https://ceph.com/releases/v12-2-1-luminous-released/

Fixes: https://tracker.ceph.com/issues/42221

Signed-off-by: zhang daolong <zhangdaolong@fiberhome.com>
2019-10-09 16:09:18 +08:00
Kefu Chai
c2adfb62c6
Merge pull request #30583 from mika/mika/typos
doc: fix typos

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2019-10-04 16:06:20 +08:00
xie xingguo
5f22e36b44 Revert "osd: give recovery ops initialized by client op a higher priority"
This reverts commit c0f87e0f91.

The 'osd_op_queue_cut_off' config option determines which level of
high priority ops should use strict priority ordering and may change
from time to time. Since the main strategy of 'osd_kick_recovery_op_priority'
is to simply follow up 'osd_op_queue_cut_off', we can instead make a direct
use of 'osd_op_queue_cut_off' to achieve the same thing explicitly.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2019-09-29 13:08:29 +08:00
Michael Prokop
d98b3e36a9 doc: fix typos
s/amount of times/number of times/
s/assosciated/associated/
s/availabe/available/
s/Commiting/Committing/
s/Containter/Container/
s/dependant/dependent/
s/developement/development/
s/filesytem/filesystem/
s/guarenteed/guaranteed/
s/hiearchy/hierarchy/
s/intance/instance/
s/Interger/Integer/
s/mutiple/multiple/
s/nubmer/number/
s/occured/occurred/
s/overriden/overridden/
s/reseted/reset/
s/sytem/system/
s/unkown/unknown/

Signed-off-by: Michael Prokop <mika@grml.org>
2019-09-26 09:17:07 +02:00
Anthony D'Atri
be4582c26a Change osd op queue cut off default to high
Discussion: https://www.mail-archive.com/ceph-users@ceph.io/msg00166.html

Signed-off-by: Anthony D'Atri <anthony.datri@gmail.com>
2019-09-18 06:50:27 -07:00
Patrick Donnelly
e7a7cf429e
doc: filesystem to file system
"Filesystem" is not a word (although fairly common in use).

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2019-09-10 08:43:28 -07:00
David Zafman
336b6b66ca
Merge pull request #28755 from dzafman/wip-network
feature: Health warnings on long network ping times, add "dump_osd_network" to get a report

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2019-09-05 07:54:43 -07:00
David Zafman
5f83a6158b osd doc mon mgr: To milliseconds for config value, user input and threshold out
Signed-off-by: David Zafman <dzafman@redhat.com>
2019-09-04 17:13:32 +00:00
James McClune
8d8cbabff4 doc: updated ceph monitor config options
Executed ceph-conf --dump-all on a freshly installed v14.2.2 (nautilus)
cluster. Compared the global defaults to the keys/values specified in
mon-config-ref.rst. Checked options.cc to make sure the obsolete keys
are no longer used.

Fixes: https://tracker.ceph.com/issues/41516
Signed-off-by: James McClune <jmcclune@mcclunetechnologies.net>
2019-08-29 23:54:10 -04:00
David Zafman
048f809626 osd mgr: Add osd_mon_heartbeat_stat_stale option to time out ping info
after 1 hour

Signed-off-by: David Zafman <dzafman@redhat.com>
2019-08-26 15:25:34 +00:00
David Zafman
f4a0be2e87 doc: Add documentation and release notes
Signed-off-by: David Zafman <dzafman@redhat.com>
2019-08-26 15:25:34 +00:00
David Zafman
66d44e7f91 osd mon: Track heartbeat ping times and report health warning
Fixes: http://tracker.ceph.com/issues/40640

Signed-off-by: David Zafman <dzafman@redhat.com>
2019-08-26 15:25:32 +00:00
James McClune
820ada1e58 doc: default values for mon_health_to_clog_* were flipped
On a freshly installed nautilus cluster (i.e. 14.2.2), the default values are:

mon_health_to_clog_interval = 3600
mon_health_to_clog_tick_interval = 60.000000

Fixes: https://tracker.ceph.com/issues/41403
Signed-off-by: James McClune <jmcclune@mcclunetechnologies.net>
2019-08-25 11:35:41 -04:00
Anthony D'Atri
51fb48b0f7
doc: operations: correct 'comma-delimited'
CIDR blocks are comma-separated, not comma-delimited.

Signed-off-by: Anthony D'Atri <anthony.datri@gmail.com>
2019-08-13 12:50:39 -07:00
Sridhar Seshasayee
3b96417e18 mon/OSDMonitor: Use generic priority cache tuner for mon caches
Use priority cache manager to tune inc, full and rocksdb caches.

Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
2019-08-06 20:22:16 +05:30
Jan Fajerski
f0746bbbec
Merge pull request #27859 from jan--f/update-bluestore-cache-doc
doc: update bluestore cache settings and clarify data fraction
2019-08-06 13:32:58 +02:00
Jan Fajerski
9d8336a7f4 doc: update bluestore cache settings and clarify data fraction
Fixes: http://tracker.ceph.com/issues/39522

Signed-off-by: Jan Fajerski <jfajerski@suse.com>
2019-07-29 13:58:32 +02:00
Kefu Chai
e6aee61076
Merge pull request #28772 from neha-ojha/wip-40528-2
osd: add hdd, ssd and hybrid variants for osd_snap_trim_sleep

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: David Zafman <dzafman@redhat.com>
2019-07-24 09:24:14 +08:00
Kefu Chai
00a67b07b1
Merge pull request #28753 from tchaikov/wip-doc-conf
doc/rados/configuration: update to be in sync with ConfUtils changes

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Brad Hubbard <bhubbard@redhat.com>
2019-07-04 13:23:24 +08:00
Lan Liu
2d71227e43 doc/rados/configuration: fix typos in osd-config-ref.rst
Signed-off-by: Lan Liu <liulan@umcloud.com>
2019-07-02 10:22:03 +08:00
Kefu Chai
eceed56b95 doc/rados/configuration: update to be in sync with ConfUtils changes
Signed-off-by: Kefu Chai <kchai@redhat.com>
2019-07-02 10:13:02 +08:00
Neha Ojha
733df09fe5 common/options.cc, doc: osd_snap_trim_sleep overrides other variants
A value > 0 for osd_snap_trim_sleep will override the backend-specific
variants of osd_snap_trim_sleep.

Signed-off-by: Neha Ojha <nojha@redhat.com>
2019-06-28 09:53:01 -07:00
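
The override rule from commit 733df09fe5 can be sketched as below. This is a minimal illustration under stated assumptions: the function `effective_snap_trim_sleep` is hypothetical, the variant keys are assumed to follow an `_hdd`/`_ssd`/`_hybrid` suffix pattern, and the numeric values are placeholders rather than Ceph defaults.

```python
def effective_snap_trim_sleep(conf, backend):
    """Pick the snap trim sleep (in seconds) for a given backend type.

    conf is a plain dict of option name -> value.
    backend is one of "hdd", "ssd" or "hybrid" (assumed variant suffixes).
    """
    generic = conf.get("osd_snap_trim_sleep", 0.0)
    if generic > 0:
        # A positive generic value overrides every backend-specific variant.
        return generic
    return conf[f"osd_snap_trim_sleep_{backend}"]


if __name__ == "__main__":
    # Placeholder values chosen for the example only.
    conf = {
        "osd_snap_trim_sleep": 0.0,
        "osd_snap_trim_sleep_hdd": 5.0,
        "osd_snap_trim_sleep_ssd": 0.0,
        "osd_snap_trim_sleep_hybrid": 2.0,
    }
    print(effective_snap_trim_sleep(conf, "hdd"))
    conf["osd_snap_trim_sleep"] = 1.5
    print(effective_snap_trim_sleep(conf, "hdd"))
```

Leaving the generic option at 0 lets each backend use its own variant, while any positive value applies uniformly to all of them.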
Neha Ojha
981babc8fa doc/rados/configuration/osd-config-ref.rst: document osd_delete_sleep
Signed-off-by: Neha Ojha <nojha@redhat.com>
2019-06-27 17:21:58 -07:00
Neha Ojha
accf95e9de doc/rados/configuration/osd-config-ref.rst: document snap trim sleep
Signed-off-by: Neha Ojha <nojha@redhat.com>
2019-06-27 12:46:41 -07:00
Sage Weil
35c0d75888 osd: add hdd and ssd variants for osd_recovery_max_active
Semi-arbitrarily set the SSD max to 10 (instead of 3).  This should be
tuned based on some real data.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-06-20 16:24:51 -05:00
xie xingguo
c0f87e0f91 osd: give recovery ops initialized by client op a higher priority
to use strict priority ordering.

The new "mclock_opclass/mclock_client" queue basically prioritizes
operations based on the class they belong to. The priority property
of an operation, if lower than a specific value (64, by default),
will get ignored and hence all operations from the same class will
be treated fairly in a FIFO fashion (but still limited by the total
IOPS or bandwidth available for the corresponding class).

To reduce the performance impact, a more general strategy would be
enforcing some limitations on the IOPS or bandwidth for the background
recovery (or backfill) operation class. However, this way we'll end up
blocking client operations too if they are currently blocked by some
degraded objects which need to be recovered first.

We hereby grant recovery operations of this kind a higher priority
to force them to use strict priority ordering, which should still
be of significance once we switch to the new "mclock_opclass/mclock_client"
queue.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2019-06-11 15:46:57 +08:00
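
The cut-off behaviour described in commit c0f87e0f91 can be modelled with a toy queue. This is a deliberately simplified sketch, not the mclock scheduler: the class `CutoffQueue` is hypothetical, it uses a single FIFO instead of per-class queues, and it ignores IOPS and bandwidth limits. Only the cut-off value of 64 is taken from the commit message.

```python
import heapq
from collections import deque
from itertools import count


class CutoffQueue:
    """Toy op queue: strict priority at or above the cut-off, FIFO below it."""

    def __init__(self, cutoff=64):
        self.cutoff = cutoff
        self._strict = []     # heap of (-priority, sequence, op)
        self._fifo = deque()  # ops whose priority is ignored
        self._seq = count()   # tie-breaker that preserves arrival order

    def enqueue(self, priority, op):
        if priority >= self.cutoff:
            # Negate the priority so the highest value is popped first.
            heapq.heappush(self._strict, (-priority, next(self._seq), op))
        else:
            # Below the cut-off the priority is dropped: plain FIFO order.
            self._fifo.append(op)

    def dequeue(self):
        if self._strict:
            return heapq.heappop(self._strict)[2]
        return self._fifo.popleft()


if __name__ == "__main__":
    q = CutoffQueue()
    q.enqueue(10, "client-read")
    q.enqueue(3, "background-recovery")
    q.enqueue(200, "recovery-blocking-client")
    print([q.dequeue() for _ in range(3)])
```

In this model, raising the priority of a recovery op above the cut-off is what moves it out of the FIFO path and ahead of everything queued there.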
Vangelis Tasoulas
24131fc59a
doc: Update documentation for the MANY_OBJECTS_PER_PG warning
The current documentation for the MANY_OBJECTS_PER_PG warning
states that the threshold can be raised to silence the health
warning by adjusting the mon_pg_warn_max_object_skew config
option on the monitors. This appears to have been untrue since
at least Luminous; the option should be adjusted on the managers.

I encountered this problem and spent quite some time injecting
mon_pg_warn_max_object_skew into the monitors, adding the option to
ceph.conf, and restarting the monitors several times, but the warning
did not go away. I had to download the code to see what was
happening, and I found this:

$ git grep -A 3 mon_pg_warn_max_object_skew src/common/options.cc
src/common/options.cc:1480:    Option("mon_pg_warn_max_object_skew", Option::TYPE_FLOAT, Option::LEVEL_ADVANCED)
src/common/options.cc-1481-    .set_default(10.0)
src/common/options.cc-1482-    .set_description("max skew few average in objects per pg")
src/common/options.cc-1483-    .add_service("mgr"),

After I restarted the ceph-mgr service, the warning went away.

Signed-off-by: Vangelis Tasoulas <vangelis@tasoulas.net>
2019-04-05 19:53:35 +02:00
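
For context on commit 24131fc59a, the skew comparison behind MANY_OBJECTS_PER_PG can be sketched as follows. This is a simplified model that assumes the warning fires when a pool's objects per PG exceed the cluster-wide average multiplied by mon_pg_warn_max_object_skew; the function `pools_over_skew` is hypothetical, and the real check in ceph-mgr may apply further conditions. The default of 10.0 is the one shown in the options.cc excerpt in the commit message.

```python
def pools_over_skew(pools, max_skew=10.0):
    """Return pools whose objects-per-PG exceed max_skew times the cluster average.

    pools maps pool name -> (object_count, pg_num).
    max_skew models 'mon_pg_warn_max_object_skew'.
    """
    total_objects = sum(objs for objs, _ in pools.values())
    total_pgs = sum(pgs for _, pgs in pools.values())
    if total_objects == 0 or total_pgs == 0:
        return []
    cluster_avg = total_objects / total_pgs
    return sorted(
        name
        for name, (objs, pgs) in pools.items()
        if pgs > 0 and objs / pgs > max_skew * cluster_avg
    )


if __name__ == "__main__":
    pools = {"images": (2000, 10), "metadata": (100, 100)}
    print(pools_over_skew(pools))
    print(pools_over_skew(pools, max_skew=20.0))
```

Raising `max_skew` in the second call mirrors raising the option to silence the warning, which, per the commit, takes effect on the managers rather than the monitors.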
Vanush "Misha" Paturyan
3d935c3c53 doc/rados/configuration/mon-lookup-dns: fix typo
Signed-off-by: Vanush "Misha" Paturyan <ektich@gmail.com>
2019-04-04 12:37:54 +01:00
Jason Dillaman
41d3fdc554
Merge pull request #27074 from LenzGr/master-documentation
doc: Updated dashboard iSCSI configuration, added labels

Reviewed-by: Ricardo Marques <rimarques@suse.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
2019-04-01 09:20:24 -04:00
Lenz Grimmer
71e46887d4 doc: Updated dashboard iSCSI configuration, added labels
Added note about the requirement for the latest ceph-iscsi version
3 to the dashboard documentation. Added some doc references
and replaced some URLs in the iSCSI docs with reST labels instead.

Signed-off-by: Lenz Grimmer <lgrimmer@suse.com>
2019-03-31 13:32:15 -05:00
Casey Bodley
9e949fcd5c
Merge pull request #27243 from theanalyst/doc-scheduler
config-ref: add a note on current scheduler settings.

Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>
2019-03-28 14:23:11 -04:00
Abhishek Lekshmanan
909b8ef4bc docs: rgw: add a x-ref to rados dmclock docs
Signed-off-by: Abhishek Lekshmanan <abhishek@suse.com>
2019-03-28 17:04:31 +01:00
David Zafman
769cdc8294 doc: Document new pg state and changes to auto repair behavior
Fixes: http://tracker.ceph.com/issues/38616

Signed-off-by: David Zafman <dzafman@redhat.com>
2019-03-25 16:03:36 -07:00
David Zafman
9fd4b062f1 doc: Fix the pg states and auto repair config options
Fixes: http://tracker.ceph.com/issues/38896

Signed-off-by: David Zafman <dzafman@redhat.com>
2019-03-22 19:58:00 -07:00
Kefu Chai
32df73f9f2
Merge pull request #26940 from xiexingguo/wip-monc-add-con
mon/MonClient: weight-based mon selection

Reviewed-by: Kefu Chai <kchai@redhat.com>
2019-03-22 15:43:04 +08:00
xie xingguo
1ba6b267db doc/mon-lookup-dns: update "mon weight" related changes
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2019-03-20 12:17:00 +08:00
Sage Weil
dac96a4c0e doc/releases/nautilus: add reference to msgr2 config update section
Signed-off-by: Sage Weil <sage@redhat.com>
2019-03-18 03:14:24 -05:00