38569 Commits

Author SHA1 Message Date
Haomai Wang
50771dd7e6 AsyncConnection: Enhance replace process
Make handle_connect_msg follow the lock rule: release any held lock before
acquiring the messenger's lock. Otherwise, a deadlock can occur.

Strengthen the lock condition check, because the connection's state may change
between unlocking the connection and locking it again.

Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
2015-01-16 03:07:12 +08:00
Haomai Wang
a1753902dc AsyncConnection: set state_offset=0 in case of reuse this connection
Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
2015-01-16 03:07:12 +08:00
Haomai Wang
2f9238361c Event: Fix incorrect memset
Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
2015-01-16 03:07:12 +08:00
Haomai Wang
4b900a6f82 test_msgr: Add SyntheticWorkload to do message measurement
Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
2015-01-16 03:07:12 +08:00
Haomai Wang
e823af41df AsyncConnection: Don't alloc buffer when reenter "READ_FRONT" state
Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
2015-01-16 03:07:12 +08:00
Haomai Wang
9fc24d4eb9 test_msgr: Add test for a message with large payload
Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
2015-01-16 03:07:11 +08:00
Haomai Wang
34cbd4c76c AsyncConnection: Avoid calling callback after deleting AsyncMessenger
Calling mark_down/mark_down_all now dispatches a reset event. If
Messenger::shutdown/wait is called afterwards, the reset event may be
invoked after the Messenger has been deallocated.

Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
2015-01-16 03:07:11 +08:00
Haomai Wang
9a84a905fd test_msgr: Add random usleep to Dispatcher impl
Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
2015-01-16 03:07:11 +08:00
Haomai Wang
e7db911489 AsyncMessenger: wait for dispatch event done
To avoid a deadlock such as:
1. mark_down_all is called while holding the lock
2. ms_dispatch_reset is dispatched
3. get_connection tries to take the same lock
4. deadlock

We signal a worker-pool barrier and wait for all queued events to finish.

Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
2015-01-16 03:07:11 +08:00
Haomai Wang
e84d1344fe AsyncConnection: Add omissive STATE_WAIT state
Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
2015-01-16 03:07:11 +08:00
Haomai Wang
cb3e1bf40b AsyncConnection: Adjust backoff wakeup granularity
Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
2015-01-16 03:07:10 +08:00
Haomai Wang
44a01894d9 AsyncConnection: using send_keepalive instead of _send_keepalive_or_ack
Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
2015-01-16 03:07:10 +08:00
Haomai Wang
a98b9e2f70 AsyncConnection: Fix mark_down race condition
Previously, if the caller of mark_down was an event thread callback, it
blocked waiting for a wakeup. Meanwhile, the event thread expected to signal
the blocked thread might itself want to mark_down a connection owned by the
already blocked thread, resulting in a deadlock.

As a tradeoff, introduce a lock on file_events that serializes creating and
deleting file_event callbacks, so we no longer need to wait for the callback.

Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
2015-01-16 03:07:10 +08:00
Haomai Wang
24fd12f48d MessengerTest: Add markdown with caller lock tests
Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
2015-01-16 03:07:10 +08:00
Haomai Wang
abb4e68200 AsyncMessenger: Retry binding on addresses if binding fails
Adapted from commit 2d4dca757e for
SimpleMessenger:

If binding to an IP address fails, delay and retry.

This happens mainly on IPv6 deployments. Due to DAD (Duplicate Address Detection)
or SLAAC, the IPv6 address may not yet be available when the daemons start.

Monitor daemons try to bind to a static IPv6 address; if that address is not yet
available, the monitor fails to start.

Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
2015-01-16 03:07:10 +08:00
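
The delay-and-retry behaviour described in abb4e68200 can be sketched as follows. This is a minimal illustration in Python, not the AsyncMessenger code: the function name `bind_with_retry`, its parameters, and the injected `bind` and `sleep` callables are all hypothetical, chosen so the sketch runs without a real socket.

```python
import time


def bind_with_retry(bind, address, attempts=3, delay=5.0, sleep=time.sleep):
    """Call bind(address); on OSError, wait `delay` seconds and try again.

    Returns the number of attempts used. Re-raises the last error if every
    attempt fails. All names here are illustrative, not Ceph's API.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            bind(address)
            return attempt
        except OSError as err:
            # For example, EADDRNOTAVAIL while IPv6 DAD is still running.
            last_error = err
            if attempt < attempts:
                sleep(delay)
    raise last_error
```

With a real socket, `bind` would be the `bind` method of a `socket.socket` object; the retry count and delay would come from configuration.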
Haomai Wang
0a7c331c49 AsyncMessenger: allow RESETSESSION whenever we forget an endpoint
Adapted from SimpleMessenger (8cd1fdd7a7)

Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
2015-01-16 03:07:10 +08:00
Haomai Wang
d93bdade3e AsyncConnection: Using buffer read to avoid small read overhead
Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
2015-01-16 03:07:09 +08:00
Haomai Wang
8d2af2faee AsyncMessenger: Using EventCenter instead of poll for bind
AsyncMessenger no longer needs an extra thread. The bind socket is treated
as a normal socket, and its accept events are dispatched to a randomly chosen
Worker thread.

Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
2015-01-16 03:07:09 +08:00
Haomai Wang
f4fcff16b6 AsyncMessenger: Bind async thread to special cpu core
Currently, 2-4 async op threads can fully meet an OSD's network demand with an
SSD backend. We can therefore bind this limited number of threads to specific
cores, which improves async event loop performance because most structures and
methods are then processed within a single thread.

For example,

ms_async_op_threads = 2
ms_async_affinity_cores = 0,3

Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
2015-01-16 03:07:09 +08:00
xinxin shu
9db596974c fix command 'ceph pg dump_stuck degraded'
undersized not valid:  undersized not in inactive|unclean|stale
undersized not valid:  undersized doesn't represent an int
Invalid command:  unused arguments: ['undersized']
pg dump_stuck {inactive|unclean|stale [inactive|unclean|stale...]} {<int>} :  show information about stuck pgs

Signed-off-by: xinxin shu <xinxin.shu@intel.com>
2015-01-16 01:33:07 +08:00
Joao Eduardo Luis
34081562a8 mon: Monitor: drop StoreConverter code
We no longer convert stores on upgrade.  Users coming from bobtail or
before should go through an interim version such as cuttlefish, dumpling,
firefly or giant.

Signed-off-by: Joao Eduardo Luis <joao@redhat.com>
2015-01-15 16:06:21 +00:00
Joao Eduardo Luis
1d814b76b8 ceph_mon: no longer attempt store conversion on start
People upgrading from bobtail or previous clusters should first go
through an interim version (quite a few to pick from: cuttlefish,
dumpling, firefly, giant).

Signed-off-by: Joao Eduardo Luis <joao@redhat.com>
2015-01-15 16:02:28 +00:00
Gregory Farnum
d4a64474e5 Merge pull request #3376 from dachary/wip-10547-formatter
common: restore format fallback semantic

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
2015-01-15 07:11:17 -08:00
Joao Eduardo Luis
447d46991c mon: Monitor: health to clog writes every X seconds on the second
3600 will mean every hour, on the hour; 60 will mean every minute, on
the minute.  This will allow the monitors to spit out the info at
regular intervals, regardless of the time at which they formed quorum or
which monitor is now the leader.

Signed-off-by: Joao Eduardo Luis <joao@redhat.com>
2015-01-15 14:58:36 +00:00
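
The "every X seconds on the second" alignment in 447d46991c comes down to simple modular arithmetic. The sketch below is one possible way to compute the wait until the next boundary; the function name and the use of integer epoch seconds are assumptions for illustration, not the Monitor implementation.

```python
def seconds_until_next_boundary(now, interval):
    """Return how long to wait so the next write lands on a multiple of
    `interval` seconds since the epoch.

    If `now` is already exactly on a boundary, a full interval is returned.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    return interval - (now % interval)


# 3725 seconds is 01:02:05 after the epoch.
print(seconds_until_next_boundary(3725, 3600))  # 3475, i.e. the next full hour
print(seconds_until_next_boundary(3725, 60))    # 55, i.e. the next full minute
```

Because the delay depends only on the clock, every monitor computes the same boundaries regardless of when it formed quorum or became leader.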
Joao Eduardo Luis
ae1032e2f0 mon: Monitor: cache 'summary' string to avoid dups on clog
By caching the summary string we can avoid writing dups on clog.

We will still write dups every 'mon_health_to_clog_interval', to make
sure that we still output health status every now and then, but we
increased the interval from 120 seconds to 3600 seconds -- once every
hour unless the health status changes.

Signed-off-by: Joao Eduardo Luis <joao@redhat.com>
2015-01-15 14:58:35 +00:00
Joao Eduardo Luis
fcd7aa00f5 mon: Monitor: reset health status cache on _reset()
Signed-off-by: Joao Eduardo Luis <joao@redhat.com>
2015-01-15 14:58:35 +00:00
Joao Eduardo Luis
81a2faf359 mon: Monitor: write health status to clog every X seconds
Instead of writing the health status only when a user action calls
get_health(), have the monitor writing it every X seconds.

Adds a new config option 'mon_health_to_clog_tick_interval' (default:
60 [seconds]), and changes the default value of
'mon_health_to_clog_interval' from 60 (seconds) to 120 (seconds).

If 'mon_health_to_clog' is 'true' and 'mon_health_to_clog_tick_interval'
is greater than 0.0, the monitor will now start a tick event when it
wins an election (meaning, only the leader will write this info to
clog).

This tick will, by default, run every 60 seconds.  It will call
Monitor::get_health() to obtain current health summary and overall
status.  If overall status is the same as the cached status, then it
will attempt to ignore it.  The status will not be ignored if the last
write to clog happened more than 'mon_health_to_clog_interval' seconds
ago (default: 120).

Signed-off-by: Joao Eduardo Luis <joao@redhat.com>
2015-01-15 14:58:35 +00:00
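
The tick logic described in 81a2faf359 can be sketched as below: write when the overall status differs from the cached one, or when the last write is older than the interval. The class and method names are hypothetical, and the in-memory `written` list stands in for the cluster log; only the decision rule is taken from the commit message.

```python
class HealthClogTicker:
    """Illustrative model of the leader's periodic health-to-clog tick."""

    def __init__(self, clog_interval=120.0):
        self.clog_interval = clog_interval  # mon_health_to_clog_interval
        self.cached_status = None           # last overall status written
        self.last_write = None              # time of the last write
        self.written = []                   # stand-in for the cluster log

    def tick(self, now, status, summary):
        """Return True if this tick wrote to the (stand-in) cluster log."""
        changed = status != self.cached_status
        stale = (self.last_write is None
                 or now - self.last_write > self.clog_interval)
        if not changed and not stale:
            return False  # same status, written recently: ignore
        self.cached_status = status
        self.last_write = now
        self.written.append((now, status, summary))
        return True
```

Calling `tick` every 60 seconds with an unchanged status then produces a write only once more than 120 seconds have passed since the previous one, while a status change is written on the next tick.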
Joao Eduardo Luis
e2d66ae3cf mon: Monitor: 'get_health()' returns overall health status
Signed-off-by: Joao Eduardo Luis <joao@redhat.com>
2015-01-15 14:58:35 +00:00
Joao Eduardo Luis
7ce770d9c2 mon: Monitor: health summary to clog on get_health()
Output health summary to clog on Monitor::get_health() (called during,
e.g., 'ceph -s', 'ceph health' and the like) if 'mon_health_to_clog' is
true (default: false) and if last update is at least
'mon_health_to_clog_interval' old (default: 60.0 (seconds)).

This patch is far from optimal for several reasons though:

1. health summary is still generated on-the-fly by the monitor each time
Monitor::get_health() is called.

2. health summary will only be outputted to clog IF and WHEN
Monitor::get_health() is called.

3. patch does not account for duplicate summaries.  We may have the same
string outputted every time Monitor::get_health() is called (as long as
enough time passed since we last wrote to clog)

4. each monitor will output to clog independently from the other
monitors.  This means that running a 'ceph -s' 3 times in a row, on a
cluster with at least 3 monitors, may result in writing the same string
3 times.

5. We reduce the amount of writes to clog by caching the last overall
health status.  We only write to clog if the overall status is different
from the cached value OR enough time has passed since we last wrote to
clog.  This may result in ignoring new contributing factors to overall
cluster health that by themselves do not change the overall status; and
even though we will pick on them once enough time has passed, we may end
up losing intermediate states (which may be good if they're transient,
but not as awesome if they reflect some kind of instability).

Fixes: #9440 (even if in a poor manner)

Signed-off-by: Joao Eduardo Luis <joao@redhat.com>
2015-01-15 14:58:35 +00:00
John Spray
889969e21d mon/MDSMonitor: make 'mds fail' idempotent for IDs
Was returning ENOENT, should succeed for 'fail' on
a non-existent name, as the fail operation makes
it cease to exist.

Signed-off-by: John Spray <john.spray@redhat.com>
2015-01-15 14:23:26 +00:00
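
The behavioural change in 889969e21d can be illustrated with a toy model. Both functions and the `daemons` dictionary are hypothetical stand-ins, not MDSMonitor code; they only contrast returning ENOENT with treating "already gone" as success.

```python
import errno


def fail_mds_strict(daemons, name):
    """Old behaviour as described: an unknown name yields -ENOENT."""
    if name not in daemons:
        return -errno.ENOENT
    del daemons[name]
    return 0


def fail_mds_idempotent(daemons, name):
    """New behaviour as described: failing a non-existent name succeeds,
    because the end state (the name does not exist) is already reached."""
    daemons.pop(name, None)
    return 0
```

With the idempotent form, repeating the same 'fail' request gives the same result each time, which is easier for scripts and tests to rely on.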
Loic Dachary
b957fa8ecf tests: adapt to new json-pretty format
The json-pretty format was modified for readability and now includes
additional newlines / spaces. Either switch to json to avoid dealing
with space changes or modify the expected output to include them.

http://tracker.ceph.com/issues/10547 Fixes: #10547

Signed-off-by: Loic Dachary <ldachary@redhat.com>
2015-01-15 13:29:32 +01:00
Loic Dachary
97609a3309 test: rename test_activate_osd
It was incorrectly shadowing test_run_osd.

Signed-off-by: Loic Dachary <ldachary@redhat.com>
2015-01-15 13:27:01 +01:00
Loic Dachary
8d8ce96b58 common: restore format fallback semantic
When Formatter::create replaced new_formatter, the handling of an
invalid format was also incorrectly changed. When an invalid format (for
instance "plain") was specified, new_formatter returned a NULL pointer,
which was sometimes handled by creating a json-pretty formatter and
sometimes handled differently.

A new Formatter::create prototype with a fallback argument is added. The
fallback is used when it is not the empty string and the requested format
is not known. This prototype is used where new_formatter returning NULL was
replaced by a json-pretty formatter.

http://tracker.ceph.com/issues/10547 Fixes: #10547

Signed-off-by: Loic Dachary <ldachary@redhat.com>
2015-01-15 13:26:26 +01:00
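
The fallback semantic described in 8d8ce96b58 can be modelled as follows. This is a Python sketch of the decision rule only: the function name, the set of known formats, and returning a format name instead of a formatter object are all simplifying assumptions, not the C++ Formatter::create signature.

```python
# Illustrative set of formats; the real list lives in the Ceph source.
KNOWN_FORMATS = {"json", "json-pretty", "xml", "xml-pretty"}


def create_formatter(fmt, fallback=""):
    """Return the name of the format to use, or None if there is none.

    The fallback applies only when `fmt` is unknown and `fallback` is not
    the empty string.
    """
    if fmt in KNOWN_FORMATS:
        return fmt
    if fallback != "" and fallback in KNOWN_FORMATS:
        return fallback
    return None  # mirrors new_formatter returning NULL


print(create_formatter("plain", "json-pretty"))  # json-pretty
print(create_formatter("plain"))                 # None
```

Callers that previously substituted json-pretty for a NULL result would pass "json-pretty" as the fallback; callers that handled NULL differently would leave it empty.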
Yan, Zheng
a6cb74702d Merge pull request #3370 from ceph/wip-10382
mds: handle heartbeat_reset during shutdown
2015-01-15 19:54:55 +08:00
Loic Dachary
e9aeaf813e mailmap: Loic Dachary name normalization
Signed-off-by: Loic Dachary <ldachary@redhat.com>
2015-01-15 11:14:06 +01:00
Loic Dachary
d80ded9dc6 mailmap: David Zhang affiliation
Signed-off-by: Loic Dachary <ldachary@redhat.com>
2015-01-15 11:14:01 +01:00
xinxin shu
d532f3ed2e remove unused hold_map_lock in _open_lock_pg
Signed-off-by: xinxin shu <xinxin.shu@intel.com>
2015-01-15 12:47:07 +08:00
Yunchuan Wen
9748655921 man: add help for rbd merge-diff command
Signed-off-by: Yunchuan Wen <yunchuanwen@ubuntukylin.com>
Signed-off-by: Li Wang <liwang@ubuntukylin.com>
2015-01-15 02:31:41 +00:00
Dan Mick
9542416890 Merge pull request #3366 from ceph/wip-formatter
formatter: improve pretty output, rename factory method

Reviewed-by: Dan Mick <dan.mick@redhat.com>
2015-01-14 15:25:25 -08:00
Loic Dachary
5b0e8aef67 mailmap: Yehuda Sadeh name normalization
Signed-off-by: Loic Dachary <ldachary@redhat.com>
2015-01-15 00:12:07 +01:00
Sage Weil
3f03a7b2ee doc/release-notes: v0.91
Signed-off-by: Sage Weil <sage@redhat.com>
2015-01-14 15:11:19 -08:00
Sage Weil
4ca69313e5 doc/release-notes: typo
Signed-off-by: Sage Weil <sage@redhat.com>
2015-01-14 15:11:19 -08:00
Josh Durgin
e7cc6117ad qa: ignore duplicates in rados ls
These can happen with split or with state changes due to reordering
results within the hash range requested. It's easy enough to filter
them out at this stage.

Backport: giant, firefly
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2015-01-14 15:02:38 -08:00
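
Filtering the duplicates mentioned in e7cc6117ad is straightforward once the listing is in hand. The helper below is a hypothetical sketch of one way to do it while keeping the first occurrence of each name in its original position; it is not the change made in the qa script.

```python
def unique_in_order(names):
    """Drop repeated object names, keeping the first occurrence of each."""
    seen = set()
    result = []
    for name in names:
        if name not in seen:
            seen.add(name)
            result.append(name)
    return result


print(unique_in_order(["obj1", "obj2", "obj1", "obj3", "obj2"]))
# ['obj1', 'obj2', 'obj3']
```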
Gregory Farnum
6fa29f6f19 Merge pull request #3372 from ceph/wip-10539
qa: fail_all_mds between fs reset and fs rm

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
2015-01-14 14:50:46 -08:00
John Spray
e5591f8a98 qa: fail_all_mds between fs reset and fs rm
Because fs reset opens a brief window for the previously
failed MDSs to spring back into life.

Fixes: #10539

Signed-off-by: John Spray <john.spray@redhat.com>
2015-01-14 22:08:09 +00:00
Loic Dachary
26a2df2835 mailmap: Josh Durgin name normalization
Signed-off-by: Loic Dachary <ldachary@redhat.com>
2015-01-14 23:00:32 +01:00
Sage Weil
d6a9d25cf1 doc/release-notes: v0.80.8
Signed-off-by: Sage Weil <sage@redhat.com>
2015-01-14 13:48:32 -08:00
Matt Benjamin
45e9cd5bd4 Fix make check blockers.
Replace ceph-helpers.sh check for ms_nocrc with the new formula
for this.  Fixes make check for default build.

	Additionally, fix linkage of several unittests when building with
	--enable-xio.
	xio:  add missing noinst headers
		The common/address_helper.h file was not mentioned, also
		msg/xio/XioSubmit.h.
	Fix for Message.cc compilation error when Xio disabled.
	Mention simple_dispatcher.h and xio_dispatcher.h in noinst_HEADERS.
	xio:  require boost-regex.
	Make address_helper conditional on Xio.
		This carries over to simple_client/simple_server,
		for convenience.

Signed-off-by: Matt Benjamin <matt@cohortfs.com>
2015-01-14 16:44:47 -05:00
Vu Pham
daefad7a4b xio: enable accelio debug on level 2
Enable accelio debug (mostly on connection) on level 2
and sync with XioConnection debug events

Signed-off-by: Vu Pham <vu@mellanox.com>
Signed-off-by: Matt Benjamin <matt@cohortfs.com>
2015-01-14 16:44:37 -05:00
Vu Pham
aa5f1955a8 xio: Get the right Accelio errno code
Get the right Accelio errno code on xio_send_msg in
order to correctly requeue or fail the xmsg

Signed-off-by: Vu Pham <vu@mellanox.com>
Signed-off-by: Matt Benjamin <matt@cohortfs.com>
2015-01-14 16:44:30 -05:00