It used the wrong indentation level and did not add a newline after
closing a section. dump_stream() did not indent at all.
Simplify a little and remove the parameter from print_spaces(). If we just
remove the element from m_sections before calling print_spaces() in
close_section(), the number of elements in m_sections is always the
indentation level.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
The previous 'osd scrub min interval' was mostly meaningless and useless.
Meanwhile, the 'osd scrub max interval' would only trigger a scrub if the
load was sufficiently low; if it was high, the PG might *never* scrub.
Instead, make the 'min' what the max used to be. If it has been more than
this many seconds, and the load is low, scrub. And add an additional
condition that if it has been more than the max threshold, scrub the PG
no matter what--regardless of the load.
Note that this does not change the default scrub interval for less-loaded
clusters, but it *does* change the meaning of existing config options.
Fixes: #3786
Signed-off-by: Sage Weil <sage@inktank.com>
map_cache.cached_lb() provides us with a lower bound across
all pgs for in-use osdmaps. We cannot trim past this since
those maps are still in use.
backport: bobtail
Fixes: #3770
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
This setting was intended to prevent recovery from overwhelming peering traffic
by delaying the recovery_wq until osd_recovery_delay_start seconds after pgs
stop being added to it. This should be less necessary now that recovery
messages are sent with strictly lower priority then peering messages.
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Gregory Farnum <greg@inktank.com>
Exercise some rare races by injecting delays before taking locks
via the 'ms inject internal delays' option.
Signed-off-by: Sage Weil <sage@inktank.com>
Most users don't need this, and having it on will just fill their clusters
with objects that will need to be cleaned up later.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
Having this too large means that queues get too deep on the OSDs during
backfill and latency is very high. In my tests, it also meant we generated
a lot of slow recovery messages just from the recovery ops themselves (no
client io).
Keeping this at the old default means we are no worse in this respect than
argonaut, which is a safe position to start from.
Signed-off-by: Sage Weil <sage@inktank.com>
Keep the journal queue size smaller than the filestore queue size.
Keeping this small also means that we can lower the latency for new
high priority ops that come into the op queue.
Signed-off-by: Sage Weil <sage@inktank.com>
Fixes: #3638
rgw exit timeout secs : number of seconds to wait for process
to exit cleanly before forcing exit. If set to 0, it'l wait
indefinitely.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Fixes: #3649
No need to have an extra configurable to use keystone. Use keystone
whenever keystone url has been specified. Also, fix a bad error
handling that turned a failure to authenticate into successfully
authenticating a bad user.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Instead of having a hardcoded default, use a configurable one. It is
limited to 65536 until future testing guarantees there is no side-effects
of increasing it past this value, but by being adjustable the user still
has the freedom to specify whatever maximum value he wants.
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Skipping the top 4 (it starts at 0) calls in the
backtrace actually skips the call that does the lock.
Skip 3 instead.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Testing the tick delay with a fork/suspend is causing
corruption in the lockdep code. This approach uses
a config option to sleep the tick thread for a number
of seconds, avoiding the entire fork/suspend mess.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Fixes: #3535
New object attributes are now configurable. A list
can be specified via the 'rgw extended http attrs'
config param.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Fixes: #3529
Added a new option: rgw_s3_success_create_obj_status.
Expected values are 0, 200, 201, 204. A value of 0
will skip the special handling altogether. Any value
other than the specified will default to 200.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Do not generate errors each time we fail to open a config file; only
generate one at the end if a search path was specified and none were
usable, right before we (already) exit. This avoids spamming stderr
about each path we tried in the search list before we found a good one.
Signed-off-by: Sage Weil <sage@inktank.com>
osd max backfills: 5 was too low for a default, 10
seems to work better in testing. The message
priority system should minimize disruption of
push and pull operations anyway.
osd recovery max chunk: 1MB was too small for a
default. 8MB is reasonable for a single push
and will allow us to recover an rbd block in
one push rather then 4 reducing client io
latency during log-based recovery.
osd recovery op priority: 10 rather than 30 will
further reduce the client io latency impact of
push and pull operations.
Signed-off-by: Samuel Just <sam.just@inktank.com>
The delay for releasing an inode's capability is
hardcoded to 5 seconds. This patch takes the timeout
value from a config parameter, which defaults presently
to 5 seconds.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Several pieces of HashIndex involve multi-step operations
which are sensitive to OSD crashes. This patch introduces
failure injection to force retries from various points in
the LFNIndex helper methods to be used with store_test.cc.
Signed-off-by: Samuel Just <sam.just@inktank.com>
CID 745933 (#1 of 1): Uninitialized scalar field (UNINIT_CTOR)
At (2): Non-static class member "data_size" is not initialized in this constructor nor in any functions that it calls.
Signed-off-by: Sage Weil <sage@inktank.com>
When ms_inject_delay_type matches that of the incoming Connection,
the Pipe sets up a delay queue that it shuttles all Messages through.
This lets us check cleanup and some notification code but doesn't
actually generate any delays.
Signed-off-by: Greg Farnum <greg@inktank.com>
If the mds revokes our cache cap, and we follow
the _read_sync() path, on a zero-byte file the
osd returns ENOENT. We need to replace ENOENT
with a return of 0 in this case.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
If the call to syncfs() fails, don't try to call syncfs again via
syscall(). If HAVE_SYS_SYNCFS is defined, don't fall through to try
syscall() with SYS_syncfs or __NR_syncfs.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
(Almost) all current float users are actually time values, so switch to
a utime_t-based interface and internally using nsec in a u64. This avoids
using floating point in librbd, which is problematic for windows VMs that
leave the FPU in an unfriendly state.
There are two non-time users in the mds and osd that log the CPU load.
Just multiply those values by 100 and report as ints instead.
Fixes: #3521
Signed-off-by: Sage Weil <sage@inktank.com>
Also make the filestore startup tell us *all* variants that are
supported, not just the first one.
Tested-by: Stefan Priebe <s.priebe@profihost.ag>
Signed-off-by: Sage Weil <sage@inktank.com>