1073 Commits

Author SHA1 Message Date
Sage Weil
35c0d75888 osd: add hdd and ssd variants for osd_recovery_max_active
Semi-arbitrarily set the SSD max to 10 (instead of 3).  This should be
tuned based on some real data.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-06-20 16:24:51 -05:00
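
A minimal sketch of how per-device-class variants like these might be resolved at runtime. The fallback rule (a generic value of 0 defers to the variant), the function name `effective_recovery_max_active`, and the `OPTIONS` table are illustrative assumptions, not taken from this commit; only the numbers 3 and 10 appear in the message.

```python
# Hypothetical option table; only the numbers 3 and 10 come from the commit message.
OPTIONS = {
    "osd_recovery_max_active": 0,       # assumed: 0 means "use the per-device variant"
    "osd_recovery_max_active_hdd": 3,   # assumed to keep the earlier value of 3
    "osd_recovery_max_active_ssd": 10,  # semi-arbitrary SSD value from the commit
}


def effective_recovery_max_active(options, is_rotational):
    """Return the recovery limit for an OSD backed by an HDD or an SSD."""
    generic = options["osd_recovery_max_active"]
    if generic > 0:
        # Assumption: an explicit generic setting overrides both variants.
        return generic
    suffix = "hdd" if is_rotational else "ssd"
    return options[f"osd_recovery_max_active_{suffix}"]


hdd_limit = effective_recovery_max_active(OPTIONS, is_rotational=True)
ssd_limit = effective_recovery_max_active(OPTIONS, is_rotational=False)
```

Keeping one generic option alongside the two variants would let an operator pin a single value for every OSD while still getting device-appropriate defaults when nothing is set.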
Kefu Chai
ce26c99bed
Merge pull request #28418 from xiexingguo/wip-kick-recovery-priority
osd: give recovery ops initialized by client op a higher priority

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2019-06-18 23:29:42 +08:00
Hannes von Haugwitz
65c6425489 doc: update mondb recovery script
- be specific about stopped OSDs
- add missing '--no-mon-config' option
- fix indent of here script delimiting identifier
- use $host variable in for loop

Signed-off-by: Hannes von Haugwitz <hannes@vonhaugwitz.com>
2019-06-12 17:16:25 +02:00
xie xingguo
c0f87e0f91 osd: give recovery ops initialized by client op a higher priority
to use strict priority ordering.

The new "mclock_opclass/mclock_client" queue basically prioritizes
operations based on the class they belong to. The priority property
of an operation, if lower than a specific value (64, by default),
will get ignored and hence all operations from the same class will
be treated fairly in a FIFO fashion (but still limited by the total
IOPS or bandwidth available for the corresponding class).

To reduce the performance impact, a more general strategy would be
enforcing some limitations on the IOPS or bandwidth for the background
recovery (or backfill) operation class. However, this way we'll end up
blocking client operations too if they are currently blocked by some
degraded objects which need to be recovered first.

We hereby grant recovery operations of this kind a higher priority
to force them to use strict priority ordering, which should still
be of significance once we switch to the new "mclock_opclass/mclock_client"
queue.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2019-06-11 15:46:57 +08:00
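
The cutoff behaviour described in this message can be sketched as a toy model. This is not the Ceph scheduler: the class `CutoffQueue`, its method names, and the way it picks between classes are hypothetical. Only the default cutoff of 64 and the idea that lower priorities are ignored within a class come from the message.

```python
import heapq
from collections import deque
from itertools import count

CUTOFF = 64  # default cutoff mentioned in the commit message


class CutoffQueue:
    """Toy queue: strict ordering at or above CUTOFF, per-class FIFO below it."""

    def __init__(self):
        self._strict = []    # heap of (-priority, seq, op)
        self._by_class = {}  # op class -> FIFO of ops
        self._seq = count()  # tie-breaker that preserves arrival order

    def enqueue(self, op, op_class, priority):
        if priority >= CUTOFF:
            # High-priority ops bypass the class queues entirely.
            heapq.heappush(self._strict, (-priority, next(self._seq), op))
        else:
            # Priority is ignored here; ops of one class are served FIFO.
            self._by_class.setdefault(op_class, deque()).append(op)

    def dequeue(self):
        if self._strict:
            return heapq.heappop(self._strict)[2]
        # A real scheduler would choose a class by its IOPS or bandwidth
        # share; this sketch simply takes the first non-empty class.
        for fifo in self._by_class.values():
            if fifo:
                return fifo.popleft()
        return None


q = CutoffQueue()
q.enqueue("backfill-1", "recovery", 5)
q.enqueue("client-1", "client", 63)
q.enqueue("kicked-recovery", "recovery", 64)  # recovery unblocking a client op
order = [q.dequeue() for _ in range(4)]
```

In this model the recovery op raised to the cutoff is served before anything queued by class, which is the effect the commit aims for when a client op is waiting on a degraded object.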
Xie Xingguo
302d7bcdd8
Merge pull request #27735 from xiexingguo/wip-device-class-noout
osd: revamp {noup,nodown,noin,noout} related commands

Reviewed-by: Sage Weil <sage@redhat.com>
2019-06-05 14:17:06 +08:00
Jason Dillaman
d23bdb7931
Merge pull request #28296 from mcv21/doc-profile-rbd
doc: note explicitly that "profile rbd" allows blacklisting

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
2019-05-31 08:29:21 -04:00
Kefu Chai
f6b022bdbe
Merge pull request #27806 from ashitakasam/add-osd-alarm
osd: Better error message when OSD count is less than osd_pool_default_size

Reviewed-by: Neha Ojha <nojha@redhat.com>
2019-05-30 21:28:54 +08:00
xie xingguo
a3b0dc29b9 doc: refresh {noup,nodown,noin,noout} changes
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2019-05-30 10:52:38 +08:00
Matthew Vernon
6812582222 doc: note explicitly that "profile rbd" allows blacklisting
The Luminous release notes tell users to ensure that rbd clients have
the ability to blacklist other client users; this is provided by
"profile rbd", which this change now documents explicitly in the user
management documentation.

Signed-off-by: Matthew Vernon <mv3@sanger.ac.uk>
2019-05-29 14:36:48 +01:00
zjh
94237d3693 osd: Better error message when OSD count is less than osd_pool_default_size
Fixes: http://tracker.ceph.com/issues/38617

Signed-off-by: zjh <jhzeng93@foxmail.com>
2019-04-28 20:09:13 +08:00
David Zafman
39cc14bdc1
Merge pull request #27503 from dzafman/wip-39099
osd: Give recovery for inactive PGs a higher priority

Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
2019-04-25 15:06:56 -07:00
David Zafman
444aa9f9fe osd, mon: New pool recovery priority range -10 to 10
Use OSD_POOL_PRIORITY_MAX and OSD_POOL_PRIORITY_MIN constants
Scale legacy priorities if they exceed the maximum

Signed-off-by: David Zafman <dzafman@redhat.com>
2019-04-25 13:53:27 -07:00
Sage Weil
61d6d051de Merge PR #27472 into master
* refs/pull/27472/head:
	doc/rados/operations/devices: document device failure prediction

Reviewed-by: Rick Chen <rick.chen@prophetstor.com>
Reviewed-by: Brad Hubbard <bhubbard@redhat.com>
2019-04-24 08:37:49 -05:00
Sage Weil
a3a4af3454 Merge PR #27656 into master
* refs/pull/27656/head:
	doc/dev/erasure-coded-pool: update
	doc/rados/operations/erasure-code*: update default ec profile references
	common/options: change default erasure-code-profile to k=2 m=2

Reviewed-by: Neha Ojha <nojha@redhat.com>
2019-04-24 08:14:55 -05:00
Sage Weil
67fadc711a doc/rados/operations/devices: document device failure prediction
Signed-off-by: Sage Weil <sage@redhat.com>
2019-04-23 07:10:53 -05:00
Sage Weil
69c7a4d24e doc/rados/operations/erasure-code*: update default ec profile references
Signed-off-by: Sage Weil <sage@redhat.com>
2019-04-22 11:20:55 -05:00
Anthony D'Atri
8c2b2fdd27
doc: operations: reweight-by-utilization typo
Add a missing backquote delimiter.

Signed-off-by: Anthony D'Atri <anthony.datri@gmail.com>
2019-04-19 15:15:12 -07:00
Anthony D'Atri
963bfab07d
doc: operations: improve reweight-by-utilization
Add the missing `max_change`, `max_osds`, and `--no-increasing` parameters to `reweight-by-utilization` and `test-reweight-by-utilization`.  Minor adjustments to wording.

Signed-off-by: Anthony D'Atri <anthony.datri@gmail.com>
2019-04-17 14:48:33 -07:00
Sage Weil
c2190c1ff8 Merge PR #27519 into master
* refs/pull/27519/head:
	doc/rados/operations/health-checks: document new bluestore warnings
	os/bluestore: alert on fm/bdev size mismatch
	os/bluestore: introduce legacy statfs alert

Reviewed-by: Sage Weil <sage@redhat.com>
2019-04-16 14:31:49 -05:00
Sage Weil
872590fe83 Merge PR #27563 into master
* refs/pull/27563/head:
	mon/OSDMonitor: respect crush node flags for can_mark_*()
	osd/OSDMap: add get_crush_node_flags(int osd)
	mon/OSDMonitor: make 'osd {add,rm}-{noin,noout,...}' support crush nodes
	osd/OSDMap: raise OSD_FLAGS health alert for crush node flags, too
	osd/OSDMap: add flags for crush nodes

Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
2019-04-16 14:30:41 -05:00
Sage Weil
b29495954b doc/rados/operations/health-checks: document new bluestore warnings
Signed-off-by: Sage Weil <sage@redhat.com>
2019-04-15 17:42:48 +03:00
Sage Weil
9b979a5c5d doc/release/nautilus: prescribe minimum hammer tunables and straw2 on upgrade
Signed-off-by: Sage Weil <sage@redhat.com>
2019-04-12 17:26:56 -05:00
Sage Weil
9aa9893b8f osd/OSDMap: raise OSD_FLAGS health alert for crush node flags, too
Signed-off-by: Sage Weil <sage@redhat.com>
2019-04-12 11:10:35 -05:00
Changcheng Liu
c0df98fc7e doc: fix parameter to set pg autoscale mode
osd_pool_default_pg_autoscale_mode is the right parameter to
set placement-group autoscale mode.

Signed-off-by: Changcheng Liu <changcheng.liu@intel.com>
2019-04-08 10:40:47 +08:00
Vangelis Tasoulas
24131fc59a
doc: Update documentation for the MANY_OBJECTS_PER_PG warning
The current documentation for the MANY_OBJECTS_PER_PG warning
states that the threshold can be raised to silence the health
warning by adjusting the mon_pg_warn_max_object_skew config
option on the monitors. This appears not to have been true since
at least Luminous; the option should be adjusted on the managers
instead.

I ran into this problem and spent quite some time injecting
mon_pg_warn_max_object_skew into the monitors, adding the option to
ceph.conf, and restarting the monitors several times, but the warning
did not go away. I had to download the code to see what was
happening, and found this:

$ git grep -A 3 mon_pg_warn_max_object_skew src/common/options.cc
src/common/options.cc:1480:    Option("mon_pg_warn_max_object_skew", Option::TYPE_FLOAT, Option::LEVEL_ADVANCED)
src/common/options.cc-1481-    .set_default(10.0)
src/common/options.cc-1482-    .set_description("max skew few average in objects per pg")
src/common/options.cc-1483-    .add_service("mgr"),

After I restarted the ceph-mgr service, the warning went away.

Signed-off-by: Vangelis Tasoulas <vangelis@tasoulas.net>
2019-04-05 19:53:35 +02:00
Vanush "Misha" Paturyan
3d935c3c53 doc/rados/configuration/mon-lookup-dns: fix typo
Signed-off-by: Vanush "Misha" Paturyan <ektich@gmail.com>
2019-04-04 12:37:54 +01:00
Sage Weil
242ef7824d doc/rados/operations: document BLUEFS_SPILLOVER
Signed-off-by: Sage Weil <sage@redhat.com>
2019-04-02 11:13:31 -05:00
Jason Dillaman
41d3fdc554
Merge pull request #27074 from LenzGr/master-documentation
doc: Updated dashboard iSCSI configuration, added labels

Reviewed-by: Ricardo Marques <rimarques@suse.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
2019-04-01 09:20:24 -04:00
Lenz Grimmer
71e46887d4 doc: Updated dashboard iSCSI configuration, added labels
Added note about the requirement for the latest ceph-iscsi version
3 to the dashboard documentation. Added some doc references
and replaced some URLs in the iSCSI docs with reST labels instead.

Signed-off-by: Lenz Grimmer <lgrimmer@suse.com>
2019-03-31 13:32:15 -05:00
Casey Bodley
9e949fcd5c
Merge pull request #27243 from theanalyst/doc-scheduler
config-ref: add a note on current scheduler settings.

Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>
2019-03-28 14:23:11 -04:00
Abhishek Lekshmanan
909b8ef4bc docs: rgw: add a x-ref to rados dmclock docs
Signed-off-by: Abhishek Lekshmanan <abhishek@suse.com>
2019-03-28 17:04:31 +01:00
David Zafman
769cdc8294 doc: Document new pg state and changes to auto repair behavior
Fixes: http://tracker.ceph.com/issues/38616

Signed-off-by: David Zafman <dzafman@redhat.com>
2019-03-25 16:03:36 -07:00
David Zafman
9fd4b062f1 doc: Fix the pg states and auto repair config options
Fixes: http://tracker.ceph.com/issues/38896

Signed-off-by: David Zafman <dzafman@redhat.com>
2019-03-22 19:58:00 -07:00
Kefu Chai
32df73f9f2
Merge pull request #26940 from xiexingguo/wip-monc-add-con
mon/MonClient: weight-based mon selection

Reviewed-by: Kefu Chai <kchai@redhat.com>
2019-03-22 15:43:04 +08:00
Danny Al-Gaaf
c9441c2916 doc: fix LRC documentation
Recovery from a failure in jerasure needs only k reads,
not k+m-1.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2019-03-22 01:37:56 +01:00
Kefu Chai
ee5eab81e1
Merge pull request #26934 from sebastian-philipp/doc-rados-mon_command
doc/rados/api/python: Add documentation for mon_command

Reviewed-by: Kefu Chai <kchai@redhat.com>
2019-03-20 18:27:47 +08:00
xie xingguo
1ba6b267db doc/mon-lookup-dns: update "mon weight" related changes
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2019-03-20 12:17:00 +08:00
Sebastian Wagner
315bc1a9ee doc/rados/api/python: Add documentation for mon_command
Co-authored-by: Nathan Cutler <ncutler@suse.com>
Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
2019-03-18 11:55:56 +01:00
Sage Weil
dac96a4c0e doc/releases/nautilus: add reference to msgr2 config update section
Signed-off-by: Sage Weil <sage@redhat.com>
2019-03-18 03:14:24 -05:00
Sage Weil
c529925e2f doc/releases/nautilus: final upgrade note updates
Signed-off-by: Sage Weil <sage@redhat.com>
2019-03-17 05:29:27 -05:00
Kefu Chai
9f2f403553 doc/rados/operations: add clay to erasure-code-profile
so it's more visible.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2019-03-12 13:19:05 +08:00
Sage Weil
937f28e6a6 doc/releases/nautilus: add msgr2 refs
Signed-off-by: Sage Weil <sage@redhat.com>
2019-03-11 10:31:49 -05:00
Sage Weil
60fab64f4c doc/rados/configuration/msgr2: some documentation about msgr2
This doesn't integrate very well into network-config.rst, mostly because
that document is horribly out of date and I don't know where to start.
:(

Signed-off-by: Sage Weil <sage@redhat.com>
2019-03-09 19:10:21 -06:00
Xie Xingguo
46189eaa64
Merge pull request #26705 from dzafman/wip-23999
Improve docs osd_recovery_priority, osd_recovery_op_priority and related

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
2019-03-02 09:43:33 +08:00
David Zafman
f7bec341da common, doc: Improve docs osd_recovery_priority, osd_recovery_op_priority and related
Add option descriptions for osd_recovery_priority and osd_recovery_op_priority

Fixes: https://tracker.ceph.com/issues/23999

Signed-off-by: David Zafman <dzafman@redhat.com>
2019-03-01 13:55:35 -08:00
David Zafman
b1efd43096 doc: Update some of the priority item descriptions
Signed-off-by: David Zafman <dzafman@redhat.com>
2019-02-28 12:34:53 -08:00
David Zafman
992c690772 doc: Minor fixes to monitoring-osd-pg.rst
Signed-off-by: David Zafman <dzafman@redhat.com>
2019-02-28 12:34:53 -08:00
David Zafman
dee162039c doc: Remove osd disk thread items that no longer exist
Caused by: 35a4b5072f

Signed-off-by: David Zafman <dzafman@redhat.com>
2019-02-28 12:34:53 -08:00
Changcheng Liu
0da1f3540c doc: change ruleset to id in crush map file
ruleset is no longer used after the patch below was merged:
   commit f9a095deb1
       crush: s/ruleset/id/ in decompiled output
       Moving away from the 'ruleset' terminology.

Signed-off-by: Changcheng Liu <changcheng.liu@intel.com>
2019-02-27 11:47:44 +08:00
David Zafman
ce975581a6
Merge pull request #26522 from ashishkumsingh/wip-doc-38310
doc: Fix incorrect mention of 'osd_deep_mon_scrub_interval'

Reviewed-by: David Zafman <dzafman@redhat.com>
2019-02-25 08:20:49 -08:00