osd: scrub error on big objects; make bluestore refuse to start on big objects
Reviewed-by: David Zafman <dzafman@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
Make sure mute and unmute work. Make sure stick is sticky. Mkae sure
counts can go down bupt if they go upt hte mute clears.
Signed-off-by: Sage Weil <sage@redhat.com>
1. always take osd_scrub_sleep for manually initiated
scrubs
2. when scrub_time_permit() return true for scheduled
ones, the existing osd_scrub_sleep is used
3. when scrub_time_permit() return false for scheduled
ones, there may be 2 scenarios
3.1 if osd_scrub_extended_sleep <= osd_scrub_sleep,
let's take osd_scrub_sleep
3.2 otherwise, let's take osd_scrub_extended_sleep
Fixes: http://tracker.ceph.com/issues/40955
Signed-off-by: Jeegn Chen <jeegnchen@tencent.com>
Set datefmt parameter to track the log information
%F Equivalent to %Y-%m-%d
%T Equivalent to "%H:%M:%S"
Signed-off-by: Robert Church <robert.church@windriver.com>
Reviewed-by: Changcheng Liu <changcheng.liu@aliyun.com>
osd_repair_during_recovery=true allow explicitly requested reqair
to be scheduled on OSDs with active recovering.
Fixes: http://tracker.ceph.com/issues/40620
Signed-off-by: Jeegn Chen <jeegnchen@tencent.com>
We no longer have a snaps field with real values, so dumping this as a
"snap_context" is silly. Instead, just dump the seq.
Adjust qa/standalone/scrub/osd-scrub-repair.sh accordingly.
Signed-off-by: Sage Weil <sage@redhat.com>
mon: Improve health status for backfill_toofull and recovery_toofull
Reviewed-by: Joao Eduardo Luis <joao@suse.de>
Reviewed-by: Neha Ojha <nojha@redhat.com>
Treat backfull_toofull as a warning condition because it can resolve itself.
Includes test case for PG_BACKFILL_FULL
Includes test case for recovery_toofull / PG_RECOVERY_FULL
Fixes: https://tracker.ceph.com/issues/39555
Signed-off-by: David Zafman <dzafman@redhat.com>
Our test admin has been asking for this for the past few years:-)
Besides, this is also useful for operating on large Ceph clusters with
mutliple storage pools possibly spanning over all osds.
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
address the regression introduced by e62cfceb
in e62cfceb, we wanted to test the newly introduced TOO_FEW_OSDS
warning, so we increased the number of OSD to the size of pool, so if
the number of OSD is less than pool size, monitor will send a warning
message.
but we need to bring all OSDs back if we are expecting a healthy
cluster. in this change, all OSDs are resurrect before
`wait_for_health_ok`.
Signed-off-by: Kefu Chai <kchai@redhat.com>
The changes to the way EC/ReplicatedBackend communicate read
t showerrors had a side effect of making first eio on the object in
TEST_rados_get_subread_eio_shard_[01] repair itself depending
on the timing of the killed osd recovering. The test should
be improved to actually test that behavior at some point.
Signed-off-by: Samuel Just <sjust@redhat.com>
Use OSD_POOL_PRIORITY_MAX and OSD_POOL_PRIORITY_MIN constants
Scale legacy priorities if exceeds maximum
Signed-off-by: David Zafman <dzafman@redhat.com>
Case 1: A more recent update exists
Case 2: The first entry in the divergent sequence is a create
Case 3 NOT TESTED - Ohject currently missing
Case 4: We can rollback all of the entries
Case 5: We cannot rollback at least 1 of the entries
Support starting OSDs even when "noup" is set (don't wait for up).
Move create_ec_pool() to ceph-helpers.sh
Fixes: https://tracker.ceph.com/issues/39162
Signed-off-by: David Zafman <dzafman@redhat.com>
stop command can be used to force stopping a specified osd daemon, e.g.,
you don't have to pre-figure out where it located.
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
If user specifies dump-import it will still work, but isn't
in the usage that way.
Fixes: http://tracker.ceph.com/issues/39284
Signed-off-by: David Zafman <dzafman@redhat.com>
The ceph cli tool checks for the presence of the variable, not its value.
Fixes: http://tracker.ceph.com/issues/38359
Signed-off-by: Sage Weil <sage@redhat.com>
Change run_osd() to default objectstore bluestore
Use run_osd_filestore() to use the non-default objectstore
Fix inject_eio to handle any objectstore if config prefixed with type
Remaining tests using filestore:
osd-pool-create.sh TEST_pool_create_rep_expected_num_objects
Test filestore directory creation
qa/standalone/osd/osd-dup.sh TEST_filestore_to_bluestore
Obvious
qa/standalone/osd/osd-rep-recov-eio.sh TEST_rep_read_unfound
Requires data digest in object info
qa/standalone/scrub/osd-scrub-repair.sh multiple tests
Erasure code pools append mode for filestore is tested
qa/standalone/special/ceph_objectstore_tool.py
Test code verifies COT by directly examining filestore contents
Fixes: https://tracker.ceph.com/issues/39162
Signed-off-by: David Zafman <dzafman@redhat.com>
Helpers to decide when it is safe to stop a mon, add a mon that is
not started, or remove a mon. (Adding and start a mon would always
be safe, but it takes time to sync, so it's not really possible to do
quickly.)
Signed-off-by: Sage Weil <sage@redhat.com>
* refs/pull/27169/head:
common/config: parse --default-$option as a default value
Reviewed-by: Sébastien Han <seb@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
Sometimes it is useful to specify an alternative default value for an
option via the command line such that it has a lower priority than the
mon config database, config file, the rest of the command line, or the
environment.
Signed-off-by: Sage Weil <sage@redhat.com>
Leave repair pg state on until recovery finishes or a new scrub starts
Fixes: http://tracker.ceph.com/issues/38616
Signed-off-by: David Zafman <dzafman@redhat.com>
Allow auto_repair for replicated bluestore pools
Regular scrub within auto repair parameters will trigger deep scrub
New state failed_repair if PG repair attempt could not fix everything
Set failed_repair if not possible to repair anything
Fixes: http://tracker.ceph.com/issues/38616
Signed-off-by: David Zafman <dzafman@redhat.com>
Fix for argument handling of create_ec_pool()
Always pass a value for allow_overwrites for consistency
Caused by: 3ca750d41dfe33c6efea4abc96d2bd426a9742b9
Signed-off-by: David Zafman <dzafman@redhat.com>
Verify we have the expected behavior for creates and moves that
maintain bucket summation, both with and without the
osd_crush_update_weight_set option enabled.
Signed-off-by: Sage Weil <sage@redhat.com>
- Make the initial weight-set actually consistent (summing)
- Fix the intermediate state so that it reflects a correctly
maintained summation.
Signed-off-by: Sage Weil <sage@redhat.com>
* refs/pull/26898/head:
osd/PG: invalidate PG if merging with unexpected version
osd,mon: include more pg merge metadata in pg_pool_t
qa/standalone/osd/pg-split-merge.sh: reproduce pg merge problem with empty pgs
osd: add osd_debug_no_{acting_change,purge_strays}
Reviewed-by: Neha Ojha <nojha@redhat.com>
* refs/pull/26894/head:
qa/standalone/erasure-code/test-erasure-code: adjust test to avoid m=0
erasure-code: ensure m >= 1
mon/OSDMonitor: set ec min_size to k + min(1, m - 1)
Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
_DD is k=2 m=0, which we don't allow. Switch it to cDD.
I confess I don't fully understand why this was _DD to begin with, but
I'm pretty sure mapping is there to control the order of results so that
it can be mapped to the CRUSH rule output sanely, and the coding portion
is not relevant to the test.
Signed-off-by: Sage Weil <sage@redhat.com>
If the source or target PG version is 0'0, we may silently take the max
of the source and target and still leave the PG complete. This
specifically can happen with an empty PG, as seen with bug 38655. In
theory we could encounter one of the PGs with some other last_update
that doesn't match what we expect. If that ever happens, make sure the
result is incomplete so that backfill can clean up.
Additionally check that the pool metadata for the last merge matches the
PGs at all. This could mismatch if we have an osdmap gap and are forced
to do some merge without merge info at all... in which case we should
definitely invalidate: there should be newer copies of the PG(s), and we
have no idea whether the PGs we are merging are what we want. If this is
some disaster recovery situation, an operator is always free to use
ceph-objectstore-tool to re-mark a PG complete (at their own peril!).
Fixes: http://tracker.ceph.com/issues/38655
Signed-off-by: Sage Weil <sage@redhat.com>
Bluestore caused grep crash with "grep: memory exhausted" due to
size of "block" storage.
Fixes: http://tracker.ceph.com/issues/38678
Signed-off-by: David Zafman <dzafman@redhat.com>
This also reverts commit 10b9626ea7b09e7c124067a2ce08a76eea073c9c.
Fixes: http://tracker.ceph.com/issues/38631
Signed-off-by: David Zafman <dzafman@redhat.com>
It's now "1/2 unfound":
1/2 objects unfound (50.000%)
..presumably due to the rbd pool init creating the rbd_directory.
Signed-off-by: Sage Weil <sage@redhat.com>
- no need for the default pool size
- no initial osds or it will collide with setup_osds later
- no need for rbd pool at all
Signed-off-by: Sage Weil <sage@redhat.com>
* refs/pull/26794/head:
mon/MgrMonitor: only try to update always_on_modules if >= NAUTILUS
qa/standalone/mon/msgr-v2-transition: add some tests for enabling msgr v2
mon/MonmapMonitor: add 'ceph mon set-addrs <name> <addrvec>' command
Revert "mon/MonClient: disable ms_bind_msgr2 if NAUTILUS feature not set"
mon/OSDMonitor: use legacy_equals to compare osd addrs
msg/msg_types: make legacy_equals() symmetrical
mon/MDSMonitor: stop using get_orig_source_inst()
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
'rbd pool init' now does IO. Drop the pool, or change the pool size to 1.
Fixes: http://tracker.ceph.com/issues/38585
Signed-off-by: Sage Weil <sage@redhat.com>
This test introduces a map gap. What *should* happen is that when there is
such a gap, we cannot import. Previously, the test didn't reliably produce
a map gap at all, and didn't check that import failed--it verified that it
passed.
Fix the test so that it reliably produces a gap *and* reports
min_last_epoch_clean to the mon so we can trim. Then verify we fail to
import, but can with --force. But remove the pg again, because if we
force an import with a map gap the osd will refuse to start.
Fixes: http://tracker.ceph.com/issues/38525
Signed-off-by: Sage Weil <sage@redhat.com>
The markdown test is based on marking down a specific number of times, but
the duplicate commands from the CLI may not get absorbed/batched by the
mon, breaking the test. Override the default qa/tasks/workunit.py
behavior of sending dups.
Fixes: http://tracker.ceph.com/issues/38359
Signed-off-by: Sage Weil <sage@redhat.com>
Stopping the osd daemon won't reliably get you HEALTH_WARN or ERR; you have
to make sure it is also marked down.
Signed-off-by: Sage Weil <sage@redhat.com>
Add trigger_deep_scrub osd command for testing
Publish stats when trigger_scrub/trigger_deep_scrub is used for testing
Add optional argument to trigger_scrub/trigger_deep_scrub
for amount of extra time to change last scrub stamps
Signed-off-by: David Zafman <dzafman@redhat.com>
osd: Deny reservation if expected backfill size would put us over bac…
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
There may be a situation where data digest in object info is
inconsistent with that computed from object data, then deep-scrub
will fail even though all three repicas have the same object data.
Fixes: https://tracker.ceph.com/issues/37935
Signed-off-by: Li Yichao <liyichao.good@gmail.com>
* refs/pull/25009/head:
librbd: stringify locker name with get_legacy_str()
osdc/Objecter: fix list_watchers addr rendering to match legacy
test/crimson: disable unittest_seastar_messenger test
msg/msg_types: encode entity_addr_t TYPE_ANY as TYPE_LEGACY for pre-nautilus
client: make blacklist detection handle TYPE_ANY entries
mon/OSDMonitor: maintain compat output for 'blacklist ls'
client: maintain compat for {inst,addr}_str in status dump
qa/tasks/ceph_manager: compare osd flush seq #'s as ints
qa/suites/fs: make use of simple.yaml where appropriate
qa/msgr: move msgr factet into generic re-usable dir
crimson: fix monmap build for seastar
doc/start/ceph.conf: trim the sample ceph.conf file
doc/rados/operations: only describe --public-{addr,network} method for adding mons
PendingReleaseNotes: deprecate 'mon addr'
doc: fix some 'mon addr' references
doc/rados/configuration: fix some 'mon addr' references
doc/rados/configuration/network-config-ref: revise network docs somewhat
doc/rados/configuration/network-config-ref: remove totally obsolete section
qa/suites/rados: replace mon_seesaw.py task with a small bash script
qa/suites/fs/upgrade: don't bind to v2 addrs
qa/tasks/mon_thrash: avoid 'mon addr' in mon section
mon/MonClient: disable ms_bind_msgr2 if NAUTILUS feature not set
osd/OSDMap: maintain compat addr fields
msg/msg_types: add get_legacy_str()
mds/MDSMap.h: maintain compat addr field
mon/MgrMap: maintain compat active_addr field
mon/MonClient: reconnect to mon if it's addrvec appears to have changed
qa/tasks/ceph.conf.template: increase mon_mgr_mkfs_grace
msg/async/ProtocolV2: fill in IP for all peer_addrs
msg/async: print all addrs on debug lines
mon/MonMap: no noname- mon name prefix when for_mkfs
ceph-monstore-tool: print initial monmap
msg/async/ProtocolV2: advertise ourselves as a v2 addr when using v2 protocol
msg/async: assert existing protocol matches current protocol
msg/async: add missing modelines
mon/MonMap: add missing modeline
vstart.sh: put mon addrs in mon_host, not 'mon addr'
msg/async: better debug around conn map lookups and updates
mon/MonClient: dump initial monmap at debug level 10
qa/standalone/osd/osd-fast-mark-down: use v1 addr w/ simplemessenger
qa/tasks/ceph: set initial monmap features with using addrvec addrs
monmaptool: add --enable-all-features option
qa/tasks/ceph: only use monmaptool --addv if addr has [,:v]
qa/tasks/ceph_manager: make get_mon_status use mon addr
qa/tasks/ceph: keep mon addrs in ctx namespace
mon/OSDMonitor: log all osd addrs on boot
msg/simple: behave when v2 and v1 addrs are present at target
mon/MonClient: warn if global_id changes
msg/Connection: add warning/note on get_peer_global_id
mds/MDSDaemon: clean up handle_mds_map debug output a bit
qa/suites/rados/upgrade: debug mds
mds/MDSRank: improve is_stale_message to handle addrvecs
msg/async: make loopback detect when sending to one of our many addrs
qa/suites/rados/upgrade: no aggressive pg num changes
mon/OSDMonitor: require nautilus mons for require_osd_release=nautilus
mon/OSDMonitor: require mimic mons for require_osd_release=mimic
qa/suites/rados/thrash-old-clients: use legacy addr syntax in ceph.conf
msg/async: preserve peer features when replacing a connection
qa/tasks/ceph.py: move methods from teuthology.git into ceph.py directly; support mon bind * options
mon/MonMap: adjust build_initial behavior for mkfs vs probe
mon/MonMap: improve ambiguous addr behavior
qa/suites/rados/upgrade: spread mons a bit
qa/rados/thrash-old-clients: keep mons on separate hosts
qa/standalone/mon/misc.sh: tweak test to be more robust
qa/tasks/mon_seesaw: expect v1/v2 prefix in addr
osd/OSDMap: fix is_blacklisted() check to assume type ANY
mon/OSDMonitor: use ANY addr type for blacklisting
mon/msg_types: TYPE_V1ORV2 -> TYPE_ANY
qa/workunits/cephtool: fix blacklist test
qa/suites/upgrade: install old version with only v1 addrs
common/options: by default, bind to both msgr v1 and v2 addresses
vstart.sh: add --msgr1, --msgr2, --msgr21 options
msg/async/ProtocolV2: be flexible with server identity check
msg/msg_types: fix entity_addrvec_t::parse() with null end arg
qa/suites/rados/basic/msgr: no msgr2 addrs in initial monmaps
qa/tasks/ceph: add 'mon_bind_addrvec' and 'mon_bind_msgr2' options
monmaptool: add --addv argument to pass in addrvec directly
qa/suites/rados/basic/msgr: do not use msgr2 with simplemessenger
qa/suites/rados/basic/msgr: async is not experimental
messages/MOSDBoot: fix compat with pre-nautilus
mon/MonMap: allow v1 or v2 to be explicitly specified along with part
msg/msg_types: allow parsing of IPs without assuming v1 vs v2
msg/msg_types: default parse to v2 addrs
msg: standarize on v1: and v2: prefixes for *all* entity_addr_t's
vstart.sh: use msgr2 by default
mon/MonMap: remove get_addr() methods
ceph-mon: adjust startup/bind/join sequence to use addrs
mon: use MonMap::get_addrs() (instead of get_addr())
mon/MonClient: change pending_cons to addrvec-based map
mon/MonMap: fix set_addr() caller, kill wrapper
mon/MonMap: remove addr-based add()
monmaptool: fix --add to do either legacy or msgr2+legacy
monmaptool: clean up iterator use a bit
mon/MonMap: handle ambiguous mon addrs by trying both legacy and msgr
mon/MonMap: take addrvec for set_initial_members
mon/MonMap: use addrvecs for test instances
mon: pass addrvec via MMonJoin
mon/MonmapMonitor: fix 'mon add' to populate addrvec
mon/MonMap: addr -> addrvec
msg/async/ProtocolV2: only update socket_addr if we learned our addr
osd: go active even if mon only accepted our v1 addr
test/msgr: add test for msgr2 protocol
msg/async/ProtocolV2: share socket_addr and all addrs during handshake
msg/async: print socket_addr for the connection
msg/async: msgr2 protocol placeholder
msg/async: move ProtocolV1 class to its own source file
msg/async: keep listen addr in ServerSocket, pass to new connections
msg/async/AsyncMessenger: fix set_addr_unknowns
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
The teuthology test did not like the change to remove 'mon addr' from
ceph.conf. The standalone script is easier to test.
Note that it avoids mon names 'a', 'b', 'c' since the MonMap::build_initial
uses those.
Signed-off-by: Sage Weil <sage@redhat.com>
Scrub testing requires an orderly control of scrubbing. Most but not
all the time, the duplicate scrub request is ignored because the first
request hasn't finished. Teuthology enables this environment variable
in the workunit handling.
Fixes: https://tracker.ceph.com/issues/36525
Signed-off-by: David Zafman <dzafman@redhat.com>
Use --rmtype snapmap with new obj16 to remove snapmap only, check for repair message
Use --rmtype nosnapmap to remove obj5 while leaving snapmap behind
Signed-off-by: David Zafman <dzafman@redhat.com>
* refs/pull/24828/head:
qa/osd-bluefs-volume-ops: use ceph-bluestore-tool for fsck
qa/osd-bluefs-volume-ops: reduce space usage for the test case
Reviewed-by: David Zafman <dzafman@redhat.com>
* refs/pull/24809/head:
os/bluestore: omit redundant '/' in OSD path for ceph-bluestore-tool if
os/bluestore: improve error handling for migrate ops in
qa/standtalone/osd-bluefs-volume-ops: remove redundant code.
Reviewed-by: Sage Weil <sage@redhat.com>
* refs/pull/24787/head:
Merge PR #24796 into nautilus
osd: fix heartbeat_reset unlock
Merge PR #24780 into nautilus
Merge PR #24761 into nautilus
Merge PR #24651 into nautilus
osd: fix race between op_wq and context_queue
test: Make sure kill_daemons failure will be easy to find
test: Add flush_pg_stats to make test more deterministic
This reverts a27fd9d25cb2819e25cc48b790c40afac0250464 and
b863883ca783487401fde4f4480ed1d9b093363e.
Quote form Sébastien Han:
> IIRC at some point, we were able to create a device class from the CLI.
Now it seems that the device class gets created when at least one OSD
of a particular class starts.
In ceph-ansible, we create pools after the initial monitors are up and
we want to assign a device crush class on some of them.
That's not possible at the moment since there no device class available yet.
Also, someone might want to create its own device class.
Something as crazy as running Filestore with a tmpfs osd store and
might want to isolate them.
I know it's a very limited use case, but still, it could be desired.
See also https://www.spinics.net/lists/ceph-devel/msg41152.html
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
* refs/pull/23985/head:
ceph-objectstore-tool: add back pool dne check
qa/suites/rados/singleton/reg11184: remove old test
ceph-objectstore-tool: import pg at original epoch
osd: handle null pg slot on startup
ceph-objectstore-tool: drop support for ancient export files
osd: avoid dropping osd_lock when pg osdmaps are not laggy
qa/standalone/osd/pg-merge.sh: add merge vs pg import test
Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
- In the jewel era, we fast-forwarded the PG to the OSD's latest epoch
and cleared past_intervals.
- In mimic, as of 2347ecb9614b0cd4cd9eae1d67b03119cc7ad18e, we brought the
PG up to date while updating past_intervals. (At the same time we removed
the OSD's parallel past_intervals regeneration.)
The problem is that the tool then has to reimplement the past_intervals
update logic, and *also* has to cope with splits and merges. Splits are
somewhat easier (until now we enable partial import of a PG into a split
child), but merges are not so easy.
This patch changes it so we import the PG and leave the pg_epoch matching
the import file. The OSD is then responsible for bringing it up to date
with the latest map, and dealing with any intervening splits or merges.
We also adjust the safety check to ensure that we don't collide with
any existing PG, either a child we eventually split into, or a parent
we eventually merge into.
Fixes: http://tracker.ceph.com/issues/35955
Signed-off-by: Sage Weil <sage@redhat.com>
- You can't import the source half a PG that's since merged. Sorry! We
could implement this later.
- You can import the target half, but the result will then be incomplete,
and you rely on backfill to clean it up.
- Map gaps don't affect this behavior.
Signed-off-by: Sage Weil <sage@redhat.com>
* refs/pull/23845/head:
osd/OSDMap: include age in up and in counts for ceph status
mon/OSDMonitor: set new_last_{up,in}_change
osd/OSDMap: store last_up_change and last_in_change
mgr/MgrMap: include mgr age in map printer
mon/MgrMap: track active_changed timestamp
mon: include mon quorum age in status
include/utime: add utimespan_str helper
Reviewed-by: John Spray <john.spray@redhat.com>
Grep from the primary's log, not every osd's log.
For the backfill_remapped task in particular, after the pg_temp change it
just so happens that the primary changes across the pool size change and
thus two different primaries do (some) backfill. Fix that test to pass
the correct primary.
Other tests are unaffected as they do not (happen to) trigger a primary
change and already satisfied the (removed) check that only one OSD does
backfill.
Signed-off-by: Sage Weil <sage@redhat.com>
awk uses some tests that the native FreeBSD awk does not support:
like: BEGIN{print 0 < 90}
And TESTDIR is not set when calling ceph-helpers from smoke.sh
So fix with keeping the archive in /tmp
Signed-off-by: Willem Jan Withagen <wjw@digiware.nl>
Also:
- Do not print **offset** until specified
- Count missing objects correctly (used to be primary's local missing)
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
* refs/pull/23540/head:
include/ceph_fs: rename old auid field
PendingReleaseNotes: note about auid support removal
radosgw-admin: remove -a --auth-uid arg
rgw: remove auid member from RGWUserInfo
auth: remove auid member from EntityAuth
osd: remove auid session member
mon: remove auid session member
doc/dev/cephx_protocol: drop auid reference
auth: remove auid args from handle_request and verify_authorizer
mon/OSDMonitor: remove 'osd pool {get,set} <name> auid ...'
mon/OSDMonitor: remove auid arg for 'osd lspools' and deprecate
osd/OSDCap: remove auid from grammar
osd/OSDCap: remove auid from is_capable() etc args
auth: clean up cap parse error messages
mon/AuthMonitor: raise health warning on invalid caps
mon/AuthMonitor: drop ancient auth inc encoding compat
messages/MPoolOp: drop auid member
osdc/Objecter: drop change_pool_auid
pybind/rados: drop auid arg to pool_create
pybind/rados: drop change_auid
rados: drop mkpool, rmpool commands
rados: remove 'chown' command
librados: deprecate calls that take auid
librados: mark all auid calls deprecated
mon/OSDMonitor: drop variable pool auid for prepare_new_pool
mon/OSDMonitor: remove pool auid change support
osdc/Objecter: do not pass auid to create_pool
ceph-authtool: remove auid options
qa/workunits/cephtool: remove auid tests
Reviewed-by: Gregory Farnum <gfarnum@redhat.com>
So if there are a lot fo missing objects on primary, we can
make use of auth_log_shard to restore client I/O quickly.
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
callers of get_python_path were not passing in a $1 parameter, so
ceph_lib was an empty string resulting in an invalid path to the built
cython modules. assume this is called from the `lib` parent directory.
pass path to the manager modules when starting ceph-mgr.
Signed-off-by: Noah Watkins <nwatkins@redhat.com>
The log trimming case wasn't quite right. Before HEAD^ we were
rolling forward too aggressively and miscalculating the can_rollforward_to,
which affected the trim_to calculation.
Signed-off-by: Sage Weil <sage@redhat.com>
Performance evaluations of medium to large size Ceph clusters have
demonstrated negligible performance impact from unnecessarily deep
directory hierarchies but significant performance impact from filestore
split and merge activity. Disable merges by default.
Fixes: http://tracker.ceph.com/issues/24686
Signed-off-by: Douglas Fuller <dfuller@redhat.com>
If ulimit is set to a 1024 value, ceph-osd will segfault with the
following error :
filestore(td/smoke/0) error (24) Too many open files not handled on operation 0x55565d1fd004 (2182.1.0, or op 0, counting from 0)
This patch is about to insure that before setting up ceph daemons in tests, a valid ulimit value is setup.
Signed-off-by: Erwan Velu <erwan@redhat.com>
get_timeout_delays() is a generic function to compute delays for a long
period of time without saturating the CPU is busy loops.
It works pretty fine when the delay is short like having the following
series when requesting a 20seconds timeout : "0.1 0.2 0.4 0.8 1.6 3.2 6.4 7.3 ".
Here the maximum between two loops is 7.3 which is perfectly fine.
When the timeout reaches 300sec, the same code produces the following
series : "0.1 0.2 0.4 0.8 1.6 3.2 6.4 12.8 25.6 51.2 102.4 95.3 "
In such example there is delays which are nearly 2 minutes !
That is not efficient as the expected event, between two loops, could
arrive just after this long sleep occurs making a minute+ sleep for
nothing. On a local system that could be ok while on a CI, if all jobs
run like CI the overall is pretty unefficient by generating useless CPU
waits.
This patch is about adding a maximum acceptable delay time between two
loops while keeping the same rampup behavior.
On the same 300 seconds delay example, with MAX_TIMEOUT set to 10, we
now have the following series: "0.1 0.2 0.4 0.8 1.6 3.2 6.4 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 7.3"
We can see that the long 12/25/51/102/95 values vanished and being
replaced by a series of 10 seconds. It's up to every test defining the
probability of having a soonish event to complete.
The MAX_TIMEOUT is set to 15seconds.
Signed-off-by: Erwan Velu <erwan@redhat.com>