Commit Graph

36769 Commits

Author SHA1 Message Date
Sage Weil
da6a8a36e2 mon: move log config parsing into LogClient.h helper
Signed-off-by: Sage Weil <sage@redhat.com>
2014-11-05 01:06:02 -08:00
Sage Weil
0fd54a7e4a move Monitor::update_log_client to LogChannel::update_config
None of this is specific to the monitor.

Signed-off-by: Sage Weil <sage@redhat.com>
2014-11-05 01:06:02 -08:00
Sage Weil
4561aff746 move get_conf_str_map_helper to str_map.h (from Monitor.h)
Signed-off-by: Sage Weil <sage@redhat.com>
2014-11-05 01:06:02 -08:00
Sage Weil
84fec864ca osd: add 'cluster_log [type] [message ...]' tell command
Useful for debugging.

Signed-off-by: Sage Weil <sage@redhat.com>
2014-11-05 01:06:02 -08:00
Sage Weil
4f40975013 common/LogEntry: string_to_clog_type
Signed-off-by: Sage Weil <sage@redhat.com>
2014-11-05 01:06:02 -08:00
Sage Weil
4a9ad7dc2d osd/ReplicatedPG: fix compile error
From 1fef4c3d54.

Signed-off-by: Sage Weil <sage@redhat.com>
2014-10-31 19:33:59 -07:00
Sage Weil
4d0bba8b22 Merge pull request #2816 from XinzeChi/master
Get the current atime of the object in cache pool for eviction

Backport: giant, firefly
Reviewed-by: Sage Weil <sage@redhat.com>
2014-10-31 17:16:30 -07:00
Sage Weil
f30fddd07b Merge pull request #2796 from ceph/wip-rwtimer
common/Timer: kill RWTimer

Reviewed-by: Yehuda Sadeh <yehuda@redhat.com>
2014-10-31 10:55:01 -07:00
Sage Weil
1ef9e2f71a Merge pull request #2826 from wonzhq/evict-atime-nohitset
osd: tiering: calculate object age during eviction when there is no hit set

Reviewed-by: Sage Weil <sage@redhat.com>
2014-10-31 10:50:31 -07:00
Sage Weil
08d5945522 Merge pull request #2827 from thesues/fix-hang
Fix rados_shutdown hang forever when using radosstriper

Reviewed-by: Sage Weil <sage@redhat.com>
2014-10-31 10:17:24 -07:00
Gregory Farnum
ed2ff15c94 Merge pull request #2813 from ceph/wip-9894
client: fix I_COMPLETE_ORDERED checking

Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Greg Farnum <gfarnum@redhat.com>
2014-10-31 08:18:12 -07:00
Dongmao Zhang
75332450e3 Fix rados_shutdown hang forever when using radosstriper
Dear list,

I ran into this while using the radosstriper C API. My program is
roughly like this:

    rados_striper_aio_write
    rados_aio_flush
    rados_aio_wait_for_safe
    rados_aio_release
    rados_striper_destroy
    rados_ioctx_destroy
    rados_shutdown /* hangs here */

Most of the time this works well, but the program occasionally
hangs forever. Output of gstack:

Thread 1 (Thread 0x7fe0afba0760 (LWP 18509)):
0 0x000000330f20822d in pthread_join () from /lib64/libpthread.so.0
1 0x000000347566cea2 in Thread::join(void**) () from /usr/lib64/librados.so.2
2 0x00000034755ac535 in librados::RadosClient::shutdown() () from /usr/lib64/librados.so.2
3 0x0000003475592269 in rados_shutdown () from /usr/lib64/librados.so.2
4 0x0000000000402349 in main ()

Thread 4 (Thread 0x7fe0ab14d700 (LWP 18541)):
0 0x000000330f20e264 in __lll_lock_wait () from /lib64/libpthread.so.0
1 0x000000330f209508 in _L_lock_854 () from /lib64/libpthread.so.0
2 0x000000330f2093d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
3 0x0000003475633af1 in Mutex::Lock(bool) () from /usr/lib64/librados.so.2
4 0x00000034755abd37 in librados::RadosClient::put() () from /usr/lib64/librados.so.2
5 0x0000003475592501 in librados::Rados::shutdown() () from /usr/lib64/librados.so.2
6 0x00007fe0afbba9f7 in libradosstriper::RadosStriperImpl::CompletionData::~CompletionData() () from /usr/lib64/libradosstriper.so.1
7 0x00007fe0afbbaad9 in libradosstriper::RadosStriperImpl::WriteCompletionData::~WriteCompletionData() () from /usr/lib64/libradosstriper.so.1
8 0x00007fe0afbc1d75 in RefCountedObject::put() () from /usr/lib64/libradosstriper.so.1
9 0x00007fe0afbc224d in libradosstriper::MultiAioCompletionImpl::safe_request(long) () from /usr/lib64/libradosstriper.so.1
10 0x00000034755c5ce8 in librados::C_AioSafe::finish(int) () from /usr/lib64/librados.so.2
11 0x00000034755a0e89 in Context::complete(int) () from /usr/lib64/librados.so.2
12 0x000000347564d4c8 in Finisher::finisher_thread_entry() () from /usr/lib64/librados.so.2
13 0x000000330f2079d1 in start_thread () from /lib64/libpthread.so.0
14 0x000000330eae886d in clone () from /lib64/libc.so.6

librados::Rados::shutdown is evidently not thread-safe here, which is
why the program hangs forever. When a CompletionData is released, it
first notifies "rados_aio_wait_for_safe" to continue, and only then
calls put() to release its other data. If the main thread (Thread 1
here) runs fast enough, rados_striper_destroy is executed before the
other thread (Thread 4 here) has released its refcount. In that case
the main thread and the other thread both run Rados::shutdown() at the
same time.

My suggestion is to make RadosStriperImpl::aio_flush block until all
CompletionData objects have been released. This ensures that no other
thread ever calls rados_shutdown.
2014-10-31 10:52:20 +08:00
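
The suggestion in the commit message above (make the flush wait until every completion object is gone) can be sketched as follows. This is only an illustration of the idea, not the actual RadosStriperImpl change: the class InflightTracker and its method names are hypothetical, and it uses std::mutex and std::condition_variable rather than Ceph's own Mutex and Cond classes.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical helper, not the real RadosStriperImpl code: it counts
// completion objects that are still alive so that a flush can block
// until every one of them has been destroyed.
class InflightTracker {
public:
  // Call when a completion object is created.
  void get() {
    std::lock_guard<std::mutex> l(lock_);
    ++inflight_;
  }

  // Call as the very last step of a completion object's destructor.
  void put() {
    std::lock_guard<std::mutex> l(lock_);
    assert(inflight_ > 0);
    if (--inflight_ == 0)
      cond_.notify_all();
  }

  // Block until no completion object is left alive.
  void wait_for_all_released() {
    std::unique_lock<std::mutex> l(lock_);
    cond_.wait(l, [this] { return inflight_ == 0; });
  }

  // Current number of live completion objects.
  int inflight() {
    std::lock_guard<std::mutex> l(lock_);
    return inflight_;
  }

private:
  std::mutex lock_;
  std::condition_variable cond_;
  int inflight_ = 0;
};
```

If a flush calls wait_for_all_released() before it returns, the main thread cannot reach the destroy and shutdown calls while a finisher thread is still inside a completion destructor, which is the interleaving shown in the stack traces.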
Gregory Farnum
2474f0ccfa Merge pull request #2843 from dachary/wip-9752-past-intervals
osd: past_interval display bug on acting

Reviewed-by: Greg Farnum <greg@inktank.com>
2014-10-30 18:33:15 -07:00
Loic Dachary
c5f8d6eded osd: past_interval display bug on acting
The acting array was incorrectly including the primary and up_primary.

http://tracker.ceph.com/issues/9752 Fixes: #9752

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
2014-10-31 00:49:21 +01:00
Josh Durgin
c489aafed0 Merge pull request #2835 from leseb/doc-rbd-juno
doc: update RBD for Juno

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2014-10-30 13:22:06 -07:00
Sage Weil
936c74fdad Merge pull request #2831 from yuyuyu101/async-kqueue
AsyncMessenger: Add kqueue support
2014-10-30 11:35:43 -07:00
Josh Durgin
632c145563 Merge pull request #2839 from ceph/wip-9944
osdc/Objecter: fix null dref when pool dne

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2014-10-30 11:31:34 -07:00
Dan Mick
0778a4f243 Merge pull request #2811 from ceph/wip-vstart
init-ceph: make ./init-ceph behave from src dir on systemd
2014-10-30 11:19:01 -07:00
Sage Weil
50c2c7589a osdc/Objecter: fix null dref when pool dne
If the base pool does not exist, we need to avoid dereferencing pi.
The simplest fix is to return with POOL_DNE early and skip all of the
checks.

Note that there is one other small semantic change in this function: if
we are using the precalc_pgid then base_oloc pool has to match.  But
the list_objects() caller does that, so we're fine.

Backport: giant
Fixes: #9944
Signed-off-by: Sage Weil <sage@redhat.com>
2014-10-30 10:56:36 -07:00
Sage Weil
0ba01583c5 Merge pull request #2837 from ceph/wip-9945
messages: fix COMPAT_VERSION on MClientSession

Reviewed-by: Sage Weil <sage@redhat.com>
2014-10-30 10:05:36 -07:00
John Spray
1eb9bcb1d3 messages: fix COMPAT_VERSION on MClientSession
This was incorrectly incremented to 2 by omission
of an explicit COMPAT_VERSION value.

Fixes: #9945

Signed-off-by: John Spray <john.spray@redhat.com>
2014-10-30 16:50:32 +00:00
Gregory Farnum
48c9f8c440 Merge pull request #2830 from ceph/wip-9800-giant
client: allow xattr caps in inject_release_failure

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
2014-10-30 09:14:00 -07:00
Loic Dachary
51e189c1b0 Merge pull request #2834 from dachary/wip-warning
tests: fix signed/unsigned warning

Reviewed-by: Christophe Courtaut <christophe.courtaut@gmail.com>
2014-10-30 12:20:36 +01:00
Sébastien Han
c96fe592f2 doc: update RBD for Juno
This commit introduces some updates for the OpenStack Juno release. New
flags have been added, many trailing spaces were removed and a new
recommendation for Glance cache management has been added too.

Signed-off-by: Sébastien Han <sebastien.han@enovance.com>
2014-10-30 11:59:14 +01:00
Sage Weil
56ee3b4157 doc/release-notes: it's 8MB, not 32MB
Signed-off-by: Sage Weil <sage@redhat.com>
2014-10-29 22:54:26 -07:00
Sage Weil
f7431cc3c2 msg/Pipe: discard delay queue before incoming queue
Shutdown the delayed delivery before the incoming queue in case the
DelayedDelivery thread is busy queuing messages.

Fixes: #9910
Signed-off-by: Sage Weil <sage@redhat.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2014-10-29 14:45:54 -07:00
Sage Weil
e2e6f9739d v0.87
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJUUSwLAAoJEH6/3V0X7TFtsLMQAM0xPn3NFFOGMrZobs4ogB6Q
 kPCSf21cHdreExNpUcUDIgaH8Vff63yUKghkkSBYESI8IA0/tuJcClL98sWuWyyj
 aU1zEomjOMtKgb5cKdQSjX3ss2GYZgQGLWAeAawdIaNO1WaXXPjg/mVSdWL2tFAJ
 EkhPg3THS2Bvnm+B1g3QY9QZTU9EA3fm4Np/UjBxZToD6TL+GNXXIjYUSE11PTIB
 gfnWhpvhqK3DTFkjtKvlPTEiYRd60nnnbhYXI3Ry2bmrJIJ+lIzXUlFfjtuBRjc6
 ZQvwBPXuxbUvo3dfI5c75PKk8BCSdBtA5gZ8rrgpdcp8AC8pX/5DhuNamfgBMOug
 s+H5j07De9/FrVJ5JW8CkSQLyQt2HD2E8cNAa5me87kOv9DIWC1fMFmA/mGPlDCz
 NJhpl/z4BBfmB0AtCVvjqpeP7vJWV74rrnWUET7FTj/1xCY4EmX5CalCvbE1Q7e5
 1nA0RoZ8EPtP/VLfBzlglv7MPelrTsq1BaUzP5YtZ5XPVShCZCIc/lvJZz4tOFaU
 0PFA9GrHIGRn6WPzQGDLiyN6XE8W+t/fWEs6N7ToFrRsMpmxdgwWtERfXhGGBNVJ
 8HYrIlfOKLAsQ1HpOEyn9cMF1AW2gVAn6wdmyPuahmm83Z6XprhL6i3V+sdLyhRx
 LSzWJ+Dufn4+K4AA73mi
 =oz3o
 -----END PGP SIGNATURE-----

Merge tag 'v0.87'

v0.87
2014-10-29 13:50:24 -07:00
Sage Weil
675f1c7ece Merge pull request #2829 from ceph/wip-doc-fs-quickstart
doc: include 'fs new' stuff in cephfs quickstart
2014-10-29 13:08:52 -07:00
Josh Durgin
5a473a9ea6 Merge pull request #2828 from Vicente-Cheng/master
rbd: Fix the rbd export when image size more than 2G

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2014-10-29 12:16:08 -07:00
Jenkins
c51c8f9d80 0.87
2014-10-29 11:03:55 -07:00
Haomai Wang
ce6f22d698 AsyncMessenger: Add kqueue support
AsyncMessenger selects its event driver in the order epoll, kqueue,
select (the select driver does not exist yet).

Fix #9926
Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
2014-10-30 01:15:33 +08:00
John Spray
5a4c3aa5c9 client: allow xattr caps in inject_release_failure
Because some test environments generate spurious
rmxattr operations, allow the client to release
'X' caps.  Allows xattr operations to proceed
while still preventing client releasing other caps.

Fixes: #9800
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit 5691c68a0a)
2014-10-29 11:09:55 +00:00
John Spray
214ac9fb06 doc: include 'fs new' stuff in cephfs quickstart
Not sure how 'quick' this really is now compared with
the full filesystem instructions, but let's not leave
it incomplete.

Signed-off-by: John Spray <john.spray@redhat.com>
2014-10-29 11:02:19 +00:00
Xinze Chi
1fef4c3d54 Get the current atime of the object in cache pool for eviction
If there are multiple atimes in agent_state for the same object, we should use the most recent one.

Signed-off-by: Xinze Chi <xmdxcxz@gmail.com>
2014-10-29 07:11:11 +00:00
Loic Dachary
66b4cd9d2c tests: fix signed/unsigned warning
Signed-off-by: Loic Dachary <loic-201408@dachary.org>
2014-10-29 08:09:33 +01:00
Vicente Cheng
4b87a81c86 rbd: Fix the rbd export when image size more than 2G
When running export <image-name> <path> on an image larger than 2G,
the previous version of finish() could not seek to the offset in the
image and returned an error.

This was caused by an incorrect variable type. Use the correct
variable type to fix it.

I use another variable of type uint64_t to check the seek result
and still use the previous r for the returned error.

uint64_t is a better fit than int for handling the result of lseek64().

Signed-off-by: Vicente Cheng <freeze.bilsted@gmail.com>
2014-10-29 12:21:11 +08:00
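
The type problem described in the commit above can be sketched in isolation. The snippet does not reproduce the rbd export code: fake_seek is a hypothetical stand-in for a 64-bit seek call, it uses int64_t where the commit message mentions uint64_t, and the two checker functions only contrast keeping the result in an int with keeping it in a 64-bit variable.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a 64-bit seek call: returns the resulting
// offset as a signed 64-bit value, as lseek64() does.
int64_t fake_seek(uint64_t target) {
  assert(target <= static_cast<uint64_t>(INT64_MAX));
  return static_cast<int64_t>(target);
}

// Problematic pattern: the 64-bit result is squeezed into an int, so
// offsets of 2 GiB and above no longer compare equal to the target.
bool seek_ok_int(uint64_t target) {
  int r = static_cast<int>(fake_seek(target));
  return r >= 0 && static_cast<uint64_t>(r) == target;
}

// Safer pattern: keep the seek result in a 64-bit variable and use a
// separate int only for the error code that is returned to the caller.
bool seek_ok_64(uint64_t target) {
  int64_t pos = fake_seek(target);
  return pos >= 0 && static_cast<uint64_t>(pos) == target;
}
```

On common platforms with a 32-bit int, seek_ok_int reports a failure for a 3 GiB offset even though the seek itself succeeded, while seek_ok_64 accepts it.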
Zhiqiang Wang
ef1980f528 osd: tiering: calculate object age during eviction when there is no hit set

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
2014-10-29 10:59:09 +08:00
Sage Weil
5c051f5c0c Merge pull request #2823 from dachary/wip-9919-injectargs-side-effects
qa: avoid qa/workunits/cephtool/test.sh instability

Backport: giant, firefly
Reviewed-by: Sage Weil <sage@redhat.com>
2014-10-28 17:03:17 -07:00
Loic Dachary
6fca23f610 qa: avoid qa/workunits/cephtool/test.sh instability
For testing injectargs a configuration option was changed that has side
effects on the cluster. It could introduce random failures later. It is
replaced with a configuration option that cannot have adverse side
effects on the cluster.

http://tracker.ceph.com/issues/9919 Fixes: #9919

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
2014-10-28 22:23:11 +01:00
Josh Durgin
548dfa5f02 Merge remote-tracking branch 'origin/giant'
2014-10-28 13:10:52 -07:00
Josh Durgin
5d74c8101c Merge remote-tracking branch 'origin/wip-9806-giant' into giant
Reviewed-by: Samuel Just <sam.just@inktank.com>
2014-10-28 13:09:06 -07:00
John Spray
54abbc61fd Merge pull request #2809 from ceph/wip-9800
client: allow xattr caps in inject_release_failure

Reviewed-by: Greg Farnum <greg@inktank.com>
2014-10-28 12:29:29 +00:00
John Spray
5691c68a0a client: allow xattr caps in inject_release_failure
Because some test environments generate spurious
rmxattr operations, allow the client to release
'X' caps.  Allows xattr operations to proceed
while still preventing client releasing other caps.

Fixes: #9800
Signed-off-by: John Spray <john.spray@redhat.com>
2014-10-28 11:40:42 +00:00
Loic Dachary
aa640840dc Merge pull request #2818 from hjwsm1989/master
Fix the match error when starting OSD daemons.

Reviewed-by: Loic Dachary <loic-201408@dachary.org>
2014-10-28 08:28:17 +01:00
huangjun
59507101a0 Fix the match error when starting OSD daemons.
If we have osd.7 and osd.77 on the same host, osd.7 will not be mounted if
osd.77 is mounted.

Signed-off-by: huangjun <hjwsm1989@gmail.com>
2014-10-28 15:09:28 +08:00
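
The commit above does not show the matching code, so the following is only a guess at the class of bug: a substring test treats "osd.7" as present whenever "osd.77" is. The function names and the list of mounted names are hypothetical, and the real fix may live in a shell script rather than C++.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Problematic pattern: substring search, so "osd.7" is found inside
// "osd.77" and osd.7 is wrongly considered mounted.
bool is_mounted_substring(const std::vector<std::string>& mounted,
                          const std::string& name) {
  assert(!name.empty());  // an empty name would match every entry
  for (const auto& m : mounted)
    if (m.find(name) != std::string::npos)
      return true;
  return false;
}

// Safer pattern: compare the whole daemon name.
bool is_mounted_exact(const std::vector<std::string>& mounted,
                      const std::string& name) {
  for (const auto& m : mounted)
    if (m == name)
      return true;
  return false;
}
```

With only "osd.77" in the list, the substring version reports osd.7 as mounted and the exact version does not.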
Sage Weil
9fbfb3919f Merge pull request #2815 from wonzhq/evict-atime
osd: cache tiering: fix the atime logic of the eviction

Reviewed-by: Sage Weil <sage@redhat.com>
2014-10-27 19:44:37 -07:00
Zhiqiang Wang
622c5ac417 osd: cache tiering: fix the atime logic of the eviction
Reported-by: Xinze Chi <xmdxcxz@gmail.com>
Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
2014-10-28 09:37:11 +08:00
Yan, Zheng
35a387a4bb Merge pull request #2786 from ceph/wip-9869
client: cast m->get_client_tid() to compare to 16-bit Inode::flushing_cap_tid
2014-10-27 18:32:14 -07:00
Greg Farnum
a5184cf46a client: cast m->get_client_tid() to compare to 16-bit Inode::flushing_cap_tid
m->get_client_tid() is 64 bits (as it should be), but Inode::flushing_cap_tid
is only 16 bits. 16 bits should be plenty to let the cap flush updates
pipeline appropriately, but we need to cast in the proper direction when
comparing these differently-sized versions. So downcast the 64-bit one
to 16 bits.

Fixes: #9869
Backport: giant, firefly, dumpling

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-10-27 16:19:13 -07:00
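
A minimal sketch of the comparison problem described in the commit above. The function names are hypothetical and the snippet is not the client code; it only shows why the 64-bit value has to be narrowed before it is compared with a 16-bit stored copy.

```cpp
#include <cassert>
#include <cstdint>

// Problematic pattern: the 16-bit value is widened to 64 bits for the
// comparison, so the two stop matching once the 64-bit tid exceeds 65535.
bool tid_matches_naive(uint64_t client_tid, uint16_t flushing_tid) {
  return client_tid == flushing_tid;
}

// Pattern described in the commit: downcast the 64-bit tid to 16 bits
// and compare the truncated values.
bool tid_matches_downcast(uint64_t client_tid, uint16_t flushing_tid) {
  return static_cast<uint16_t>(client_tid) == flushing_tid;
}
```

For a tid of 65541, the stored 16-bit copy is 5: the naive comparison fails, while the downcast comparison still matches.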
Yan, Zheng
a4caed8a53 client: fix I_COMPLETE_ORDERED checking
The current code marks a directory inode as complete and ordered when
readdir finishes, but it does not check whether the directory was modified
in the middle of readdir. This is wrong: a directory inode should not be
marked as ordered if it was modified during readdir.

The fix is to introduce a new counter in the inode data struct and
increment it each time the directory is modified. When readdir finishes,
we check the counter to decide if the directory should be marked as ordered.

Fixes: #9894
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2014-10-27 15:27:37 -07:00
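
The counter scheme from the commit above can be sketched as follows. The struct and function names are hypothetical and much simpler than the client's Inode; the sketch only shows the snapshot-and-compare step and leaves out what happens to a flag that was already set before a later modification.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified directory inode.
struct DirInode {
  uint64_t mod_count = 0;         // incremented on every directory modification
  bool complete_ordered = false;  // stands in for the I_COMPLETE_ORDERED flag
};

// Any create, unlink, or rename in the directory would call this.
void modify_dir(DirInode& d) {
  ++d.mod_count;
}

// Remember the counter value when readdir starts.
uint64_t readdir_begin(const DirInode& d) {
  return d.mod_count;
}

// Mark the directory as ordered only if nothing changed in between.
void readdir_end(DirInode& d, uint64_t snapshot) {
  if (d.mod_count == snapshot)
    d.complete_ordered = true;
}
```

A readdir that overlaps with a modification sees a different counter value at the end and leaves the flag unset.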