Commit Graph

1414 Commits

Author SHA1 Message Date
Josh Durgin
333cc0d511 Merge branch 'wip-rbd-formatted-output'
Reviewed-by: Dan Mick <dan.mick@inktank.com>

Conflicts:
	src/rbd.cc
	src/test/cli/rbd/help.t
2013-01-16 13:29:22 -08:00
Josh Durgin
4e5a07bceb XMLFormatter: fix pretty printing
It used the wrong indentation level and did not add a newline after
closing a section. dump_stream() did not indent at all.

Simplify a little and remove the parameter from print_spaces(). If we just
remove the element from m_sections before calling print_spaces() in
close_section(), the number of elements in m_sections is always the
indentation level.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2013-01-16 13:14:49 -08:00
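The indentation rule described in 4e5a07bceb can be illustrated with a minimal sketch. This is not Ceph's XMLFormatter: the class name, method names, and two-space indent are assumptions made for illustration, and no XML escaping is done. It shows only the idea from the message: pop the section before printing spaces in close_section(), so the depth of the section stack is always the indentation level.

```python
class SketchXMLFormatter:
    """Hypothetical pretty-printer; illustrates the stack-depth indentation rule."""

    def __init__(self, indent="  "):
        self.sections = []  # open section names (stands in for m_sections)
        self.out = []
        self.indent = indent

    def _print_spaces(self):
        # No parameter needed: the indentation level is the stack depth.
        self.out.append(self.indent * len(self.sections))

    def open_section(self, name):
        self._print_spaces()
        self.out.append(f"<{name}>\n")
        self.sections.append(name)

    def dump_string(self, name, value):
        self._print_spaces()
        self.out.append(f"<{name}>{value}</{name}>\n")

    def close_section(self):
        # Pop first so the closing tag lines up with its opening tag,
        # then end the line so the next element starts on its own line.
        name = self.sections.pop()
        self._print_spaces()
        self.out.append(f"</{name}>\n")

    def getvalue(self):
        return "".join(self.out)
```

Popping after printing would indent each closing tag one level too deep, which is the kind of mismatch the commit message describes.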
Sage Weil
299548024a osd: change scrub min/max thresholds
The previous 'osd scrub min interval' was mostly meaningless and useless.
Meanwhile, the 'osd scrub max interval' would only trigger a scrub if the
load was sufficiently low; if it was high, the PG might *never* scrub.

Instead, make the 'min' what the max used to be.  If it has been more than
this many seconds, and the load is low, scrub.  And add an additional
condition that if it has been more than the max threshold, scrub the PG
no matter what--regardless of the load.

Note that this does not change the default scrub interval for less-loaded
clusters, but it *does* change the meaning of existing config options.

Fixes: #3786
Signed-off-by: Sage Weil <sage@inktank.com>
2013-01-14 18:24:40 -08:00
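The scrub decision described in 299548024a can be sketched as a small predicate. The function and parameter names are hypothetical, and how "load is low" is determined is left to the caller; this is only a reading of the commit message, not the OSD code.

```python
def should_scrub(seconds_since_scrub, load_is_low, min_interval, max_interval):
    """Decide whether a PG should be scrubbed now (illustrative only).

    min_interval and max_interval stand in for 'osd scrub min interval'
    and 'osd scrub max interval', both in seconds.
    """
    # Past the max threshold: scrub no matter what the load is.
    if seconds_since_scrub > max_interval:
        return True
    # Past the min threshold: scrub only if the load is low.
    return seconds_since_scrub > min_interval and load_is_low
```

With a one-day minimum and a one-week maximum (example values only), a PG last scrubbed two days ago is scrubbed only when the load is low, while a PG last scrubbed eight days ago is scrubbed even under high load.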
Samuel Just
66eb93b836 OSD: only trim up to the oldest map still in use by a pg
map_cache.cached_lb() provides us with a lower bound across
all pgs for in-use osdmaps.  We cannot trim past this since
those maps are still in use.

backport: bobtail
Fixes: #3770
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-01-11 12:17:10 -08:00
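The trimming bound described in 66eb93b836 amounts to clamping the trim target. The sketch below uses hypothetical names and assumes that epochs strictly below the bound may be removed; the real OSD code may differ in both respects.

```python
def epochs_safe_to_trim(oldest_stored, requested_trim_to, oldest_in_use):
    """Return the osdmap epochs that may be removed (illustrative only).

    oldest_in_use stands in for the lower bound that
    map_cache.cached_lb() is described as providing.
    """
    # Never trim past the oldest map that some PG still references.
    trim_to = min(requested_trim_to, oldest_in_use)
    return list(range(oldest_stored, trim_to))
```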
Sage Weil
310112f702 Merge remote-tracking branch 'gh/wip-3633'
Reviewed-by: Sage Weil <sage@inktank.com>
2013-01-10 18:05:27 -08:00
Joao Eduardo Luis
684d4ba242 mon: Monitor: add timecheck infrastructure to detect clock skews
Fixes: #3633
Fixes: #3695

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-01-11 00:44:21 +00:00
Samuel Just
44625d4460 config_opts.h: default osd_recovery_delay_start to 0
This setting was intended to prevent recovery from overwhelming peering traffic
by delaying the recovery_wq until osd_recovery_delay_start seconds after pgs
stop being added to it.  This should be less necessary now that recovery
messages are sent with strictly lower priority than peering messages.

Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Gregory Farnum <greg@inktank.com>
2013-01-10 11:10:04 -08:00
Sage Weil
a5d692a7b9 msgr: inject delays at inconvenient times
Exercise some rare races by injecting delays before taking locks
via the 'ms inject internal delays' option.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-12-28 17:21:01 -08:00
Sage Weil
f6ce5dda43 rgw: disable ops and usage logging by default
Most users don't need this, and having it on will just fill their clusters
with objects that will need to be cleaned up later.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
2012-12-27 22:02:38 -08:00
Sage Weil
82c71716f7 osd: drop 'osd recovery max active' back to previous default (5)
Having this too large means that queues get too deep on the OSDs during
backfill and latency is very high.  In my tests, it also meant we generated
a lot of slow recovery messages just from the recovery ops themselves (no
client io).

Keeping this at the old default means we are no worse in this respect than
argonaut, which is a safe position to start from.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-12-27 11:12:33 -08:00
Sage Weil
6f1f03c7d3 journal: reduce journal max queue size
Keep the journal queue size smaller than the filestore queue size.

Keeping this small also means that we can lower the latency for new
high priority ops that come into the op queue.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-12-27 11:11:08 -08:00
Dan Mick
4a558048cf librbd: move buf_is_zero() to new common/util.cc and include/util.h
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2012-12-21 17:03:38 -08:00
Yehuda Sadeh
a803159b02 rgw: configurable exit timeout
Fixes: #3638

rgw exit timeout secs : number of seconds to wait for process
to exit cleanly before forcing exit. If set to 0, it'll wait
indefinitely.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-12-20 10:13:16 -08:00
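The timeout semantics described in a803159b02 (0 means wait indefinitely) can be sketched with a standard-library event. The function name and the use of threading.Event are assumptions for illustration; rgw itself is not implemented this way.

```python
import threading


def wait_for_clean_exit(done_event, timeout_secs):
    """Return True if shutdown finished cleanly, False if exit should be forced.

    timeout_secs stands in for 'rgw exit timeout secs'.
    """
    if timeout_secs == 0:
        # 0 means no deadline: block until shutdown completes.
        done_event.wait()
        return True
    # Otherwise wait at most timeout_secs; False signals a forced exit.
    return done_event.wait(timeout_secs)
```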
Yehuda Sadeh
799c59ae89 rgw: remove useless configurable, fix swift auth error handling
Fixes: #3649
No need to have an extra configurable to use keystone. Use keystone
whenever keystone url has been specified. Also, fix bad error
handling that turned a failure to authenticate into successfully
authenticating a bad user.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-12-19 22:03:56 -08:00
Joao Eduardo Luis
bdc998ef4c mon: OSDMonitor: add option 'mon_max_pool_pg_num' and limit 'pg_num' accordingly
Instead of having a hardcoded default, use a configurable one. It is
limited to 65536 until future testing guarantees there are no side effects
of increasing it past this value, but by being adjustable the user still
has the freedom to specify whatever maximum value he wants.

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
2012-12-17 14:41:48 -08:00
Sam Lang
a7de975d93 lockdep: Decrease lockdep backtrace skip by 1
Skipping the top 4 (it starts at 0) calls in the
backtrace actually skips the call that does the lock.
Skip 3 instead.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2012-12-14 14:17:40 -08:00
Sam Lang
f16e571757 client: Add config option to inject sleep for tick
Testing the tick delay with a fork/suspend is causing
corruption in the lockdep code.  This approach uses
a config option to sleep the tick thread for a number
of seconds, avoiding the entire fork/suspend mess.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2012-12-13 17:49:43 -10:00
Yehuda Sadeh
3a95d97648 rgw: configurable list of object attributes
Fixes: #3535
New object attributes are now configurable. A list
can be specified via the 'rgw extended http attrs'
config param.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-12-12 13:45:21 -08:00
Yehuda Sadeh
8708724557 rgw: option to provide alternative s3 put obj success code
Fixes: #3529
Added a new option: rgw_s3_success_create_obj_status.
Expected values are 0, 200, 201, 204. A value of 0
will skip the special handling altogether. Any value
other than the specified will default to 200.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-12-12 13:45:21 -08:00
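The value handling described in 8708724557 for rgw_s3_success_create_obj_status can be sketched as follows. The function name and the use of None to mean "skip the special handling" are assumptions for illustration.

```python
def put_obj_success_status(configured):
    """Map the configured value to the HTTP status to send (illustrative only).

    Returns None when the special handling is skipped altogether.
    """
    if configured == 0:
        return None  # 0 disables the special handling
    if configured in (200, 201, 204):
        return configured
    return 200  # any other value defaults to 200
```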
Sage Weil
6007088c53 Merge remote-tracking branch 'gh/wip-conf' into next
Reviewed-by: Greg Farnum <greg@inktank.com>
2012-12-11 15:07:38 -08:00
Samuel Just
bcf1461c7e Merge remote-tracking branch 'upstream/wip_split2' into next
Reviewed-by: Greg Farnum <greg@inktank.com>
2012-12-10 22:00:36 -08:00
Samuel Just
12673c24f2 PrioritizedQueue: increment ret when removing items from list
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-12-10 17:31:44 -08:00
Samuel Just
80cca214b9 PrioritizedQueue: move if check out of loop in filter_list_pairs
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-12-10 17:30:59 -08:00
Sage Weil
a50c7d3b2f config: do not always print config file missing errors
Do not generate errors each time we fail to open a config file; only
generate one at the end if a search path was specified and none were
usable, right before we (already) exit.  This avoids spamming stderr
about each path we tried in the search list before we found a good one.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-12-10 16:41:19 -08:00
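The reporting behaviour described in a50c7d3b2f can be sketched as below: failures are collected quietly while the search path is walked, and a single error is raised only if a search path was given and nothing in it was usable. The function name and return shape are hypothetical.

```python
def first_usable_config(search_path):
    """Return (path, contents) for the first readable file in search_path.

    Returns (None, None) when no search path was given. Raises one OSError
    summarising every attempt if a path was given but none was usable.
    """
    failures = []
    for path in search_path:
        try:
            with open(path) as f:
                return path, f.read()
        except OSError as e:
            # Stay quiet for now; a later path may still succeed.
            failures.append(f"{path}: {e.strerror}")
    if search_path:
        raise OSError("no usable config file; tried " + "; ".join(failures))
    return None, None
```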
Samuel Just
788992bbf5 config_opts.h: adjust recovery defaults
osd max backfills: 5 was too low for a default, 10
 seems to work better in testing.  The message
 priority system should minimize disruption of
 push and pull operations anyway.

osd recovery max chunk: 1MB was too small for a
 default.  8MB is reasonable for a single push
 and will allow us to recover an rbd block in
 one push rather than 4, reducing client io
 latency during log-based recovery.

osd recovery op priority: 10 rather than 30 will
 further reduce the client io latency impact of
 push and pull operations.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-12-10 13:53:10 -08:00
Sage Weil
42d21937fb Merge branch 'testing' into next
2012-12-08 09:12:21 -08:00
Yehuda Sadeh
81fdea135c auth: set default auth_client_required
Fixes: #3578
Set auth_client_required to default to "cephx, none".

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-12-07 22:33:31 -08:00
Sam Lang
214c7a1705 client: Allow cap release timeout to be configured
The delay for releasing an inode's capability is
hardcoded to 5 seconds.  This patch takes the timeout
value from a config parameter, which defaults presently
to 5 seconds.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2012-12-06 05:30:17 -08:00
Samuel Just
36c0fd220e PrioritizedQueue: allow caller to get items removed by removed_by_filter
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-12-05 11:34:18 -08:00
Samuel Just
fec47cad52 OSD: don't wait for superblock writes in handle_osd_map
Instead, pass the pinned maps into a Context and clear the
cache after the transaction is applied.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-12-05 11:34:18 -08:00
Samuel Just
a48dee547c os/: Add CollectionIndex failure injection
Several pieces of HashIndex involve multi-step operations
which are sensitive to OSD crashes.  This patch introduces
failure injection to force retries from various points in
the LFNIndex helper methods to be used with store_test.cc.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-12-05 11:34:18 -08:00
Sage Weil
f3bd3564fa Merge branch 'wip-msgr-delay-queue' into next
2012-12-04 14:52:22 -08:00
Sage Weil
5bea57bfd0 config: we still want osd_thread_recovery_timeout
Signed-off-by: Sage Weil <sage@inktank.com>
2012-12-03 03:56:15 -08:00
Sam Lang
e686cb14e9 config: Remove unused options
Signed-off-by: Sam Lang <sam.lang@inktank.com>
2012-12-03 03:44:36 -08:00
Sage Weil
880a185625 OutputDataSocket: fix uninit var
CID 745933 (#1 of 1): Uninitialized scalar field (UNINIT_CTOR)
At (2): Non-static class member "data_size" is not initialized in this constructor nor in any functions that it calls.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-12-01 16:57:00 -08:00
Greg Farnum
90f66980bf messenger: add the shell of a system to delay incoming Message delivery
When ms_inject_delay_type matches that of the incoming Connection,
the Pipe sets up a delay queue that it shuttles all Messages through.
This lets us check cleanup and some notification code but doesn't
actually generate any delays.

Signed-off-by: Greg Farnum <greg@inktank.com>
2012-11-29 16:09:44 -08:00
Sage Weil
c26dc1885d Merge branch 'next'
Conflicts:
	src/rgw/rgw_admin.cc
2012-11-29 15:48:54 -08:00
Sam Lang
7d27e2e95c client: Fix for #3490 and config option to test
If the mds revokes our cache cap, and we follow
the _read_sync() path, on a zero-byte file the
osd returns ENOENT.  We need to replace ENOENT
with a return of 0 in this case.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2012-11-29 15:04:06 -08:00
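The return-value fix described in 7d27e2e95c can be sketched as a small normalisation step. The function name is hypothetical, and the convention that errors are negative errno values is an assumption about the calling code.

```python
import errno


def normalize_sync_read_result(ret):
    """Translate a synchronous read result (illustrative only).

    ret is the number of bytes read (>= 0) or a negative errno value.
    """
    # A zero-byte file may come back as ENOENT; treat that as an
    # empty read rather than an error.
    if ret == -errno.ENOENT:
        return 0
    return ret
```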
Sage Weil
78286b1403 log: 10,000 recent log entries
This is what we were (wrongly) doing before, so there are no memory
utilization surprises.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-28 13:00:36 -08:00
Sage Weil
4de7748b72 log: fix log_max_recent config
<facepalm>

Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-28 12:59:43 -08:00
Danny Al-Gaaf
54da979d20 common/pipe.c: remove twice included unistd.h
Fix includes: remove twice included unistd.h

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2012-11-28 08:25:42 -08:00
Sage Weil
cf2a045402 config: make $pid a metavariable
Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-27 17:29:11 -08:00
Sage Weil
15b4ac58b2 Merge remote-tracking branch 'gh/wip-perf' into next
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
2012-11-27 09:29:03 -08:00
Danny Al-Gaaf
d4bc3729fd fix syncfs handling in error case
If the call to syncfs() fails, don't try to call syncfs again via
syscall(). If HAVE_SYS_SYNCFS is defined, don't fall through to try
syscall() with SYS_syncfs or __NR_syncfs.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2012-11-27 08:52:52 -08:00
Sage Weil
94423ac90f perfcounters: fl -> time, use u64 nsec instead of double
(Almost) all current float users are actually time values, so switch to
a utime_t-based interface and internally use nsec in a u64.  This avoids
using floating point in librbd, which is problematic for windows VMs that
leave the FPU in an unfriendly state.

There are two non-time users in the mds and osd that log the CPU load.
Just multiply those values by 100 and report as ints instead.

Fixes: #3521
Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-26 15:30:25 -08:00
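The representation change described in 94423ac90f can be sketched as follows: time values are accumulated as integer nanoseconds rather than floating-point seconds, and load values are scaled by 100 and reported as integers. All names are hypothetical, and Python integers are unbounded, so the u64 width is not modelled here.

```python
NSEC_PER_SEC = 1_000_000_000


class TimeCounter:
    """Accumulates elapsed time as integer nanoseconds (illustrative only)."""

    def __init__(self):
        self.total_nsec = 0

    def add(self, sec, nsec):
        # Takes a (seconds, nanoseconds) pair, in the spirit of a
        # utime_t-style value, so no floating point is involved.
        self.total_nsec += sec * NSEC_PER_SEC + nsec

    def as_sec_nsec(self):
        return divmod(self.total_nsec, NSEC_PER_SEC)


def load_times_100(load_avg):
    # Non-time users: multiply the CPU load by 100 and report an int.
    return int(load_avg * 100)
```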
Sage Weil
3a0ee8e49d perfcounters: add 'perf' option to disable perf counters
Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-26 15:30:25 -08:00
Sage Weil
bc32fc42d2 syncfs: check for __NR_syncfs too
Also make the filestore startup tell us *all* variants that are
supported, not just the first one.

Tested-by: Stefan Priebe <s.priebe@profihost.ag>
Signed-off-by: Sage Weil <sage@inktank.com>
2012-11-25 13:29:52 -08:00
Yehuda Sadeh
3110e5ca42 Merge remote-tracking branch 'origin/next' into next
2012-11-22 12:57:33 -08:00
Yehuda Sadeh
a0e8452a09 Merge branch 'wip-opslog-socket2' into next
Conflicts:
	src/rgw/rgw_main.cc
2012-11-22 12:55:35 -08:00
Dan Mick
b706945ae9 Try using syscall() for syncfs if not supported directly by glibc
Signed-off-by: Dan Mick <dan.mick@inktank.com>
2012-11-22 08:50:44 -08:00