Commit Graph

18364 Commits

Author SHA1 Message Date
Sage Weil
41425f6be9 osd: skip threadpool pause on shutdown when blackholed
We can't pause the threadpools if they're blocked on a blackholed
filestore.  Instead, just call _exit().

Signed-off-by: Sage Weil <sage@newdream.net>
2012-02-16 15:18:58 -08:00
Sage Weil
4b3bb5ab37 osd: fix _activate_committed replica->primary message
Normally we take a fresh map reference in PG::lock().  However,
_activate_committed needs to make sure the map hasn't changed significantly
before acting.  In the case of #2068, the OSD map has moved forward and
the mapping has changed, but the PG hasn't processed that yet, and thus
mis-tags the MOSDPGInfo message.

Tag the message with the e epoch, and also pass down the primary's address
to send the message to the right location.

Fixes: #2068
Signed-off-by: Sage Weil <sage@newdream.net>
2012-02-16 09:12:45 -08:00
Sage Weil
82eceb9a3b osd: fix do not always clear DEGRADED/set CLEAN on recovery finish
Clean means we have exactly the right number of replicas and recovery is
complete.  Degraded means we do not have enough replicas, either because
recovery is in progress, or because acting is too small.

A consequence is that if we have a PG with len(up) == 1 but a pg_temp
mapping so that len(acting) == 2, it will be active and not clean.

Fixes: #2060
Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
2012-02-15 15:20:35 -08:00
Wido den Hollander
45701f5b68 init: Only check if auto start is disabled when the issued command is "start"
This still makes sure daemons don't start on boot.

When auto start was disabled it would also prevent logrotate from doing its job.

Signed-off-by: Wido den Hollander <wido@widodh.nl>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-02-15 09:29:15 -08:00
Holger Macht
543e8b98d0 ceph.spec.in: Move libcls_*.so from -devel to base package
OSDs (src/osd/ClassHandler.cc) specifically look for libcls_*.so in
/usr/$libdir/rados-classes, so libcls_rbd.so and libcls_rgw.so need to
be shipped along with the base package.

Signed-off-by: Holger Macht <hmacht@suse.de>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-02-15 09:28:41 -08:00
Sage Weil
1a994bed63 objclass: add debug_objclass knob, default to off
Signed-off-by: Sage Weil <sage@newdream.net>
2012-02-15 09:04:22 -08:00
Sage Weil
ba0ef62f86 osd: reduce watch/notify debug noise
Signed-off-by: Sage Weil <sage@newdream.net>
2012-02-15 09:03:28 -08:00
Sage Weil
ebbfdefa12 msgr: mark_all_down on shutdown
This ensures we destroy all the Pipes and discard their messages.  Among
other things, this can avoid

2012-02-15 03:16:46.385242 7fe712b9a700 mon.f@5(peon) e1 *** Got Signal Terminated ***
2012-02-15 03:16:46.470227 7fe712b9a700 mon.f@5(peon) e1 shutdown
msg/SimpleMessenger.h: In function 'virtual SimpleMessenger::Pipe::~Pipe()' thread 7fe716a37780 time 2012-02-15 03:16:46.471005
msg/SimpleMessenger.h: 234: FAILED assert(!i->second->is_on_list())
 ceph version 0.41-362-g40802ae (commit:40802ae883a94d205a8716065b80ad5d7ff57d12)
 1: (SimpleMessenger::Pipe::~Pipe()+0x199) [0x4669d9]
 2: (SimpleMessenger::~SimpleMessenger()+0x31) [0x552231]
 3: (main()+0x3026) [0x4614a6]
 4: (__libc_start_main()+0xfe) [0x7fe714dd6d8e]
 5: /tmp/cephtest/binary/usr/local/bin/ceph-mon() [0x45e219]
 ceph version 0.41-362-g40802ae (commit:40802ae883a94d205a8716065b80ad5d7ff57d12)
 1: (SimpleMessenger::Pipe::~Pipe()+0x199) [0x4669d9]
 2: (SimpleMessenger::~SimpleMessenger()+0x31) [0x552231]
 3: (main()+0x3026) [0x4614a6]
 4: (__libc_start_main()+0xfe) [0x7fe714dd6d8e]
 5: /tmp/cephtest/binary/usr/local/bin/ceph-mon() [0x45e219]

Signed-off-by: Sage Weil <sage@newdream.net>
2012-02-15 08:21:10 -08:00
Sage Weil
c1b6b218d2 osd: do not sync_and_flush if blackholed
If we have blackholed this will block forever.  In that case don't bother.

Signed-off-by: Sage Weil <sage@newdream.net>
2012-02-15 08:21:02 -08:00
Sage Weil
e6ffe31bdf workqueue: make pause/unpause count
We can pause() multiple times, and we need as many unpause()s to actually
resume work.

This resolves problems where we have two actors interested in pausing a
queue, both want to stop work, and they aren't interacting/coordinating.

Signed-off-by: Sage Weil <sage@newdream.net>
2012-02-15 08:20:32 -08:00
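The counted pause/unpause behaviour described in e6ffe31bdf can be sketched roughly as follows.  This is a minimal illustration, not the actual workqueue code: the class name PauseCounter and all of its members are hypothetical, and the real thread pool has more state than a single counter.

```cpp
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for a work queue's pause state (not Ceph's ThreadPool).
class PauseCounter {
public:
  // Each actor that wants work stopped adds one to the count.
  void pause() {
    std::lock_guard<std::mutex> l(lock_);
    ++pause_count_;
  }

  // Work resumes only once every pause() has been matched by an unpause().
  void unpause() {
    std::lock_guard<std::mutex> l(lock_);
    if (pause_count_ > 0 && --pause_count_ == 0)
      cond_.notify_all();
  }

  bool is_paused() const {
    std::lock_guard<std::mutex> l(lock_);
    return pause_count_ > 0;
  }

  // A worker thread would call this before taking the next item.
  void wait_while_paused() {
    std::unique_lock<std::mutex> l(lock_);
    cond_.wait(l, [this] { return pause_count_ == 0; });
  }

private:
  mutable std::mutex lock_;
  std::condition_variable cond_;
  int pause_count_ = 0;
};
```

With a plain boolean flag, the first unpause() from either actor would restart work while the other still expects it to be stopped; the counter avoids that without the two actors having to coordinate.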
Sage Weil
40802ae883 osd: exit code 0 on SIGINT/SIGTERM
This makes daemon-handler happy...

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2012-02-14 22:05:36 -08:00
Sage Weil
2aafdeada8 signals: check write(2) return values
Signed-off-by: Sage Weil <sage@newdream.net>
2012-02-14 21:04:05 -08:00
Sage Weil
9cd090038f osd: semi-clean shutdown on signal
Make some effort to stop work in progress, remove pid file, and exit with
informative error code.

Note that this is much simpler than the shutdown() exit path; I'm not sure
whether a complete teardown is useful.  It's also difficult to maintain
and get right with everything else going on, and it's not clear that it's
worth the effort right now.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2012-02-14 21:03:54 -08:00
Sage Weil
ec066829a7 mds: remove some cruft
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2012-02-14 21:03:54 -08:00
Sage Weil
395dc659b9 mds: remove pidfile
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2012-02-14 21:03:53 -08:00
Sage Weil
bbe5cd755f mon: do a clean shutdown on SIGINT/SIGTERM
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2012-02-14 21:03:53 -08:00
Sage Weil
eafe832791 mon: install async signal handlers for SIG{HUP,INT,TERM}
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2012-02-14 21:03:53 -08:00
Sage Weil
e905564bb2 osd: install async signal handlers for SIG{HUP,INT,TERM}
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2012-02-14 21:03:53 -08:00
Sage Weil
be704fe1d9 mds: install async signal handlers for SIG{HUP,INT,TERM}
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2012-02-14 21:03:53 -08:00
Sage Weil
afa1f9e392 signal: remove unused/obsolete handle_shutdown_signal
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2012-02-14 21:03:53 -08:00
Sage Weil
7951315580 signals: do not install default SIGHUP, SIGINT, SIGTERM handlers
These should be app specific and async.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2012-02-14 21:03:53 -08:00
Sage Weil
ecd280253a signals: implement safe async signal handler framework
Based on http://evbergen.home.xs4all.nl/unix-signals.html.

Instead of his design, though, we write single bytes, and create a pipe per
signal we have handlers registered for.

Signed-off-by: Sage Weil <sage@newdream.net>
2012-02-14 21:03:53 -08:00
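The pipe-per-signal, single-byte design mentioned in ecd280253a is a form of the self-pipe technique.  A minimal sketch for one signal, assuming POSIX, might look like this.  The names sig_pipe, on_signal, install_async_handler and wait_for_signal are hypothetical and are not taken from the Ceph source.

```cpp
#include <errno.h>
#include <signal.h>
#include <unistd.h>

// Hypothetical names; one pipe for one signal, as the commit describes.
static int sig_pipe[2] = {-1, -1};

// Runs in signal context, so it only uses async-signal-safe calls.
extern "C" void on_signal(int) {
  int saved_errno = errno;
  const char byte = 0;
  ssize_t r = write(sig_pipe[1], &byte, 1);  // a single byte per delivery
  (void)r;  // result captured; nothing safe to do about failure in here
  errno = saved_errno;
}

// Create the pipe and install the handler.  Returns 0 on success, -1 on error.
int install_async_handler(int signum) {
  if (pipe(sig_pipe) < 0)
    return -1;
  struct sigaction sa = {};
  sa.sa_handler = on_signal;
  sigemptyset(&sa.sa_mask);
  sa.sa_flags = SA_RESTART;
  return sigaction(signum, &sa, nullptr);
}

// Called from an ordinary thread: blocks until one delivery has been recorded.
bool wait_for_signal() {
  char byte;
  ssize_t r;
  do {
    r = read(sig_pipe[0], &byte, 1);
  } while (r < 0 && errno == EINTR);
  return r == 1;
}
```

The real work (logging, shutdown, reopening files) then happens in whichever thread reads from the pipe, outside signal context, where locks and allocation are safe to use.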
Sage Weil
4425f3b34b libradospp: add config_t typedef
Don't expose internal CephContext type name.

Signed-off-by: Sage Weil <sage@newdream.net>
2012-02-14 17:03:54 -08:00
Sage Weil
06fa26853d librados: use rados_config_t typedef instead of CephContext
Signed-off-by: Sage Weil <sage@newdream.net>
2012-02-14 17:03:00 -08:00
Tommi Virtanen
e32668f8b8 doc: Balance backticks.
Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
2012-02-14 15:52:55 -08:00
Sage Weil
8d19e735c1 Merge branch 'wip-osd-hb'
Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
2012-02-14 14:01:22 -08:00
Sage Weil
2281a00942 librados: expose CephContext via C API
We can already create rados cluster handles with an existing CephContext,
but that is only useful if you are building something that has access to
ceph internals; the cct isn't exposed via the API itself.

Do so, for both the cluster and pool handles.  Add cluster handle accessor
for the C++ API too.

Fixes: #1821
Signed-off-by: Sage Weil <sage@newdream.net>
2012-02-14 13:59:30 -08:00
Sage Weil
bc4e78ddfa mds: use new tmap_get pbl argument
Signed-off-by: Sage Weil <sage@newdream.net>
2012-02-14 13:41:29 -08:00
Sage Weil
dd32285816 librados: need prval for tmap_get
Signed-off-by: Sage Weil <sage@newdream.net>
2012-02-14 13:39:46 -08:00
Samuel Just
7842bf1246 librados: add aio_operate for reads and tmap_get for ObjectWriteOp
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2012-02-14 13:37:08 -08:00
Sage Weil
704509637f osd: remove unused need_size
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2012-02-14 13:35:04 -08:00
Samuel Just
34145d5dd2 Merge branch 'wip_push_refactor'
Reviewed-by: Sage Weil <sage@newdream.net>
2012-02-14 13:03:38 -08:00
Samuel Just
a53a01740f ReplicatedPG: pull() should return PULL_NONE, not false
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2012-02-14 12:56:32 -08:00
Samuel Just
5a3ef17c39 ReplicatedPG: clean up push/pull
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2012-02-14 12:55:43 -08:00
Samuel Just
f9b7529fd6 osd_types.h: Add constructors for ObjectRecovery*
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2012-02-14 12:52:59 -08:00
Sage Weil
7b1c144f21 test_filestore_idempotent: fix test to create initial object
Filestore now properly fails to clone a non-existent object, which means
we should create one.

Fixes: #2062
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2012-02-14 11:53:05 -08:00
Sage Weil
6b30cd3ba3 libcephfs: define CEPH_SETATTR_*
These are also defined internally in ceph_fs.h, so use a guard.  Annoying,
but gives us consistent naming (ceph_*/CEPH_*, not LIBCEPHFS_SETATTR_*).

Signed-off-by: Sage Weil <sage@newdream.net>
2012-02-14 09:06:32 -08:00
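The guard mentioned in 6b30cd3ba3 could take roughly the following shape.  This is a sketch under assumptions: the choice of CEPH_SETATTR_MODE as the guard macro and the bit values shown are illustrative rather than copied from the headers, and wants_size_change is a hypothetical helper added only to show the constants in use.

```cpp
// Guarded definitions: skipped when an internal header has already defined
// them.  Guard macro and bit values are illustrative assumptions.
#ifndef CEPH_SETATTR_MODE
# define CEPH_SETATTR_MODE   1
# define CEPH_SETATTR_UID    2
# define CEPH_SETATTR_GID    4
# define CEPH_SETATTR_MTIME  8
# define CEPH_SETATTR_ATIME  16
# define CEPH_SETATTR_SIZE   32
# define CEPH_SETATTR_CTIME  64
#endif

// Hypothetical helper: does a setattr mask request a size change?
inline bool wants_size_change(int mask) {
  return (mask & CEPH_SETATTR_SIZE) != 0;
}
```

Testing a single representative macro keeps the public header usable on its own while avoiding redefinition when the internal header is included first.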
Sage Weil
b54bac3061 test/encoding/readable.sh: drop bashisms
=, not ==!

Signed-off-by: Sage Weil <sage@newdream.net>
2012-02-13 14:43:18 -08:00
Sage Weil
ffa1de32c5 filejournal: drop unused variable
Signed-off-by: Sage Weil <sage@newdream.net>
2012-02-13 14:35:01 -08:00
Sage Weil
ccf8867f15 filejournal: aio off by default
For now, until we have a better handle on the ext4 bug, and demonstrate
that it is a clear performance win with the full stack.

Signed-off-by: Sage Weil <sage@newdream.net>
2012-02-13 14:32:07 -08:00
Sage Weil
12035cd4e3 Merge remote-tracking branch 'gh/wip-journal-aio-rebased'
2012-02-13 14:31:17 -08:00
Sage Weil
3d3237fef4 Merge remote-tracking branch 'gh/wip-osd'
Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
2012-02-13 14:09:04 -08:00
Sage Weil
9fded38f53 test/encoding/readable.sh: skip old version with known incompatibilities
Signed-off-by: Sage Weil <sage@newdream.net>
2012-02-13 14:08:25 -08:00
Sage Weil
3e1cc0b951 ceph-dencoder: add osd_peer_stat_t
Signed-off-by: Sage Weil <sage@newdream.net>
2012-02-13 12:41:18 -08:00
Yehuda Sadeh
9065dbd36d rgw: remove extra useless info in bucket entry encoding
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
2012-02-13 12:08:19 -08:00
Samuel Just
1bf037bf76 ReplicatedPG: refactor push and pull
Now, push progress is represented by ObjectRecoveryProgress.  In
particular, rather than tracking data_subset_*ing, we track the furthest
offset before which the data will be consistent once cloning is complete.
sub_op_push now separates the pull response implementation from the
replica push implementation.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2012-02-13 12:07:39 -08:00
Sage Weil
fbbbd01bfe add CEPH_FEATURE_OSDENC
Require it for osd <-> osd and osd <-> mon communication.

This covers all the new encoding changes, except hobject_t, which is used
between the rados command line tool and the OSD for an object listing
position marker.  We can't distinguish between specific types of clients,
though, and we don't want to introduce any incompatibility with other
clients, so we'll just have to make do here.  :(

Signed-off-by: Sage Weil <sage@newdream.net>
2012-02-13 11:27:11 -08:00
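The per-peer-type requirement described in fbbbd01bfe amounts to a feature-bit check when a connection is established.  A minimal sketch, assuming a 64-bit feature mask, follows.  The bit position, the PeerType enum and both function names are hypothetical and are not Ceph's messenger API.

```cpp
#include <cstdint>

// Hypothetical bit position, for illustration only.
constexpr std::uint64_t FEATURE_OSDENC = 1ULL << 13;

enum class PeerType { Client, Osd, Mon };

// Features a local OSD would insist on, by peer type.  Clients are left
// unconstrained, since specific kinds of client cannot be told apart.
constexpr std::uint64_t required_features(PeerType peer) {
  return (peer == PeerType::Osd || peer == PeerType::Mon) ? FEATURE_OSDENC : 0;
}

// A peer is acceptable if it advertises every required bit.
constexpr bool peer_acceptable(std::uint64_t peer_features,
                               std::uint64_t required) {
  return (peer_features & required) == required;
}
```

Under this scheme an old client that lacks the bit can still connect, while an OSD or monitor that lacks it is refused, which matches the compromise the commit message describes.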
Samuel Just
af38ce1f7c ReplicatedPG: consider backfill_pos to be degraded
A write may trigger via make_writeable the creation of a clone which
sorts before the object being written.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-02-13 11:26:52 -08:00
Samuel Just
d0ccf28086 ReplicatedPG: add debugging for in flight backfill ops
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2012-02-13 11:26:52 -08:00
Samuel Just
94a198c87c ReplicatedPG: is_degraded may return true for backfill
If is_degraded returns true for backfill, the object may not be
in any replica's missing set.  Only call start_recovery_op if
we actually started an op.  This bug could cause a stuck-in-backfill
error.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-02-13 11:26:52 -08:00