Include test case
Configurable by setting mon_osd_warn_num_repaired (default 10)
Ignore new health warning with random eio injection test
Fixes: https://tracker.ceph.com/issues/41564
Signed-off-by: David Zafman <dzafman@redhat.com>
I think someday the docs for how health alerts work (here) and the
enumeration of all actual alerts should be restructured. For now this
si the simplest placde to fit this!
Signed-off-by: Sage Weil <sage@redhat.com>t
* refs/pull/29292/head:
os/bluestore: warn on no per-pool omap
os/bluestore: fsck: warning (not error) by default on no per-pool omap
os/bluestore: fsck: int64_t for error count
os/bluestore: default size of 1 TB for testing
os/bluestore: behave if we *do* set PGMETA and PERPOOL flags
os/bluestore: do not set both PGMETA_OMAP and PERPOOL_OMAP
os/bluestore: fsck: only generate 1 error per omap_head
os/bluestore: make fsck repair convert to per-pool omap
os/bluestore: teach fsck to tolerate per-pool omap
os/bluestore: ondisk format change to 3 for per-pool omap
mon/PGMap: add data/omap breakouts for 'df detail' view
osd/osd_types: separate get_{user,allocated}_bytes() into data and omap variants
mon/PGMap: fix stored_raw calculation
mon/PGMap: add in actual omap usage into per-pool stats
osd: report per-pool omap support via store_statfs_t
os/bluestore: set per_pool_omap key on mkfs
osd/osd_types: count per-pool omap capable OSDs
os/bluestore: report omap_allocated per-pool
os/bluestore: add pool prefix to omap keys
kv/KeyValueDB: take key_prefix for estimate_prefix_size()
os/bluestore: fix manual omap key manipulation to use Onode::get_omap_key()
os/bluestore: make omap key helpers Onode methods
os/bluestore: add Onode::get_omap_prefix() helper
os/bluestore: change _do_omap_clear() args
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Treat backfull_toofull as a warning condition because it can resolve itself.
Includes test case for PG_BACKFILL_FULL
Includes test case for recovery_toofull / PG_RECOVERY_FULL
Fixes: https://tracker.ceph.com/issues/39555
Signed-off-by: David Zafman <dzafman@redhat.com>
The current documentation for the MANY_OBJECTS_PER_PG warning
states that The threshold can be raised to silence the health
warning by adjusting the mon_pg_warn_max_object_skew config
option on the monitors. It seems that this is not true (at least)
since the luminous times, and this option should be adjusted on
the managers.
I encountered this problem and I spend quite sometime injecting
the mon_pg_warn_max_object_skew to the monitors, added the option
ceph.conf and restarted the monitors several times but the warning
was not going away. I had to download the code to see what's
happening and I found out this:
$ git grep -A 3 mon_pg_warn_max_object_skew src/common/options.cc
src/common/options.cc:1480: Option("mon_pg_warn_max_object_skew", Option::TYPE_FLOAT, Option::LEVEL_ADVANCED)
src/common/options.cc-1481- .set_default(10.0)
src/common/options.cc-1482- .set_description("max skew few average in objects per pg")
src/common/options.cc-1483- .add_service("mgr"),
After I restarted the ceph-mgr service, the warning went away.
Signed-off-by: Vangelis Tasoulas <vangelis@tasoulas.net>
Fixed the incorrect mention of 'osd_deep_mon_scrub_interval' in health-checks.rst.
Changed it to 'osd_deep_scrub_interval'.
Fixes: https://tracker.ceph.com/issues/38310
Signed-off-by: Ashish Singh <assingh@redhat.com>
Make this mon_warn code clearer since it involves 2 values
Code used mon scrub interval instead of pg scrub interval
Rename config values to include _pg_ and ratio to make it more clear
Fix scrub warniing handling use per-pool intervals when specified
Fixes: http://tracker.ceph.com/issues/37264
Signed-off-by: David Zafman <dzafman@redhat.com>
* refs/pull/25849/head:
qa/suites/rados/upgrade: one mon per node, and enable-msgr2 at end
qa/rados/thrash-old-clients: avoid msgr2
mon: make bootstrap rank check more robust
mon: clean up probe debug output a bit
msg/async: use v1 for v1 <-> [v2,v1] peers
msg/async/AsyncMessenger: drop single-use _send_to
mon/HealthMonitor: raise MON_MSGR2_NOT_ENABLED if mons not bound to msgr2
doc/rados/operations/health-checks: document MON_* health warnings
mon/MonMapMonitor: add 'mon enable-msgr2' command
mon: respawn if rank addr changes
mon/MonMap: calc_addr_mons() after setting rank addrvec
Reviewed-by: Ricardo Dias <rdias@suse.com>
If the ms_bind_msgr2 option is enabled, and all mons are nautilus,
raise a health alert if any mons aren't bound to msgr2 addresses.
Whitelist tests that mon_bind_addrvec=false or mon_bind_msgr2=false.
Signed-off-by: Sage Weil <sage@redhat.com>
First paragraph: explain what the error means.
Second or later paragraph: describe steps to fix or mitigate.
Signed-off-by: Sage Weil <sage@redhat.com>
- radosgw/s3/bucketops.rst: fix Malformed table.
- operations/health-checks.rst: Title underline too short
- rbd/rados-rbd-cmds.rst: Title underline too short
- rados/operations/index.rst: include health-checks in toc
Signed-off-by: Kefu Chai <kchai@redhat.com>