132 Commits

Author SHA1 Message Date
Bassam Tabbara
b5d57c97c7 embedded: add compression and EC plugins to libcephd
Compression and erasure coding plugins are now statically compiled
into libcephd. A new method is added to load them into the
respective registry.

The static libraries are only built when WITH_EMBEDDED is enabled
and existing plugins are unaffected.

Signed-off-by: Bassam Tabbara <bassam.tabbara@quantum.com>
2016-11-28 23:48:02 -08:00
Kefu Chai
cb1cda9671 common,test: g_ceph_context->put() upon return
prior to this change, global_init() could create a new CephContext
and assign it to g_ceph_context. it's our responsibility to release
the CephContext explicitly using cct->put() before the application
quits. but sometimes, we fail to do so.

in this change, global_init() will return an intrusive_ptr<CephContext>,
which calls `g_ceph_context->put()` in its dtor. this ensures that
the CephContext is always destroyed before main() returns. so the
log is flushed before _log_exp_length is destroyed.

there are two cases where global_pre_init() is called directly.
- ceph_conf.cc: g_ceph_context->put() will be called by an intrusive_ptr<>
  deleter.
- rgw_main.cc: global_init() is called later on the success code
  path, so it will be taken care of.

Fixes: http://tracker.ceph.com/issues/17762
Signed-off-by: Kefu Chai <kchai@redhat.com>
2016-11-24 22:38:28 +08:00
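The ownership pattern described in cb1cda9671 can be sketched roughly as follows. This is a minimal illustration, not Ceph's code: `Context`, `ContextRef` and `init_context` are hypothetical stand-ins for CephContext, the returned intrusive_ptr and global_init(), and the `destroyed` flag exists only so the effect can be observed.

```cpp
#include <utility>

// Hypothetical stand-in for a reference-counted context (not Ceph's CephContext).
struct Context {
  int refs = 1;      // the creator holds the first reference
  bool* destroyed;   // flipped to true once the last reference is dropped
  explicit Context(bool* d) : destroyed(d) {}
  void get() { ++refs; }
  void put() {
    if (--refs == 0) {
      bool* flag = destroyed;
      delete this;
      *flag = true;
    }
  }
};

// Owning handle: adopts the initial reference and calls put() in its destructor.
class ContextRef {
 public:
  explicit ContextRef(Context* c) : ctx_(c) {}
  ContextRef(ContextRef&& o) noexcept : ctx_(std::exchange(o.ctx_, nullptr)) {}
  ContextRef(const ContextRef&) = delete;
  ContextRef& operator=(const ContextRef&) = delete;
  ~ContextRef() {
    if (ctx_) ctx_->put();
  }
  Context* get() const { return ctx_; }

 private:
  Context* ctx_;
};

// Plays the role of global_init(): the caller keeps the handle alive for the
// lifetime of main(), and the context is released when the handle goes out of scope.
ContextRef init_context(bool* destroyed) {
  return ContextRef(new Context(destroyed));
}
```

A caller would write something like `auto cct = init_context(&flag);` at the top of main(); every return path then releases the context without an explicit put().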
Sage Weil
0dbe8fd398 msg: make loopback Connection feature accurate all the time
In 626360aab0 we made the
OSD cluster loopback connection CEPH_FEATURES_ALL, but
all other loopback connections got features == 0.  I
can't come up with any reason we wouldn't want those
connections to have accurate feature bits, so let's just
use CEPH_FEATURES_ALL for all of them.

While we're here, make the cflags argument required.

Signed-off-by: Sage Weil <sage@redhat.com>
2016-10-10 09:55:54 -04:00
Loic Dachary
00873f26e5 Merge pull request #11086 from bassamtabbara/wip-ec-simd-runtime-detection
erasure-code: Runtime detection of SIMD for jerasure and shec

Reviewed-by: Loic Dachary <ldachary@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2016-10-03 11:43:29 +02:00
Bassam Tabbara
e7e0b1bc1e erasure-code: Backward compatibility with legacy EC plugins
Resurrected jerasure_generic, jerasure_sse3, jerasure_sse4, jerasure_neon,
shec_generic, shec_sse3, shec_sse4 and shec_neon. These are all exact
copies of the new jerasure and shec plugins that support SIMD detection.

Moved EC preload code in ceph-mon and ceph-osd to a central location, added
a warning when preloading legacy plugins.

OSDMonitor::get_erasure_code and OSDMonitor::normalize_profile will now check
if legacy EC plugins are used and log a warning.

Added tests to check that warnings make it to the log.

Signed-off-by: Bassam Tabbara <bassam.tabbara@quantum.com>
2016-09-29 10:34:34 -07:00
John Spray
10da6d23cd osd: embed a MgrClient
Signed-off-by: John Spray <john.spray@redhat.com>
2016-09-29 17:26:55 +01:00
Vu Pham
5b218fb794 msgr,xio: flexible Messenger::create configuration flags
Widen Messenger::create and XioMessenger constructor to support
per-Messenger instance creation parameters.

This introduces a minimalist generic set of flags to describe
the type of Messenger and its associated resources.

We apply the usage of these flags to ceph-osd's "workhorse",
"heartbeat" and "light" Messenger instances, ceph-mon and
other Ceph client Messengers.

Signed-off-by: Vu Pham <vu@mellanox.com>
2016-06-01 10:26:23 -07:00
Sage Weil
a5564a664c os/ObjectStore: make device uuid probe output something friendly
Otherwise, all you see is errors about the probes that failed (e.g., a
failure to decode a non-bluestore superblock as bluestore).

Signed-off-by: Sage Weil <sage@redhat.com>
2016-04-05 11:10:54 -04:00
Sage Weil
81f53dfb67 Merge pull request #8214 from athanatos/wip-feature-bits
Deprecate or free up a bunch of feature bits

Reviewed-by: Sage Weil <sage@redhat.com>
2016-03-21 07:35:27 -04:00
Samuel Just
9b27e73564 features: deprecate CEPH_FEATURE_PACKED_RECOVERY
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-03-16 18:10:12 -07:00
Samuel Just
9326c2b4ab features: deprecate CEPH_OSD_FEATURE_SNAPMAPPER
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-03-16 18:10:12 -07:00
Samuel Just
ccbceef594 features: deprecate CEPH_FEATURE_RECOVERY_RESERVATION
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-03-16 18:10:12 -07:00
Samuel Just
4591dc06f2 features: deprecate CEPH_FEATURE_BACKFILL_RESERVATON
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-03-16 18:10:12 -07:00
Samuel Just
37518a3e8a features: deprecate CEPH_FEATURE_CHUNKY_SCRUB
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-03-16 18:10:12 -07:00
Samuel Just
ac443a899a features: deprecate CEPH_FEATURE_INDEP_PG_MAP
Signed-off-by: Samuel Just <sjust@redhat.com>
2016-03-16 18:10:12 -07:00
xie xingguo
a835b0ed51 osd: make os_flags an option
In one of our test environments an OSD was unable to come back up
because its journal was totally unrecoverable. The os_flags field
was introduced to handle such a case but was never made an option
visible to normal users.

This commit makes the os_flags field a configurable option. No flags
are enabled by default, so it should not cause any compatibility
issues.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2016-03-15 19:05:40 +08:00
Avner BenHanoch
027edc3a08 osd: don't crash because of bad ms-type in ceph.conf (or missing enablement for experimental ms-type)
A missing null check causes the process to crash in the following cases:
 a. a bad ms-type is configured in ceph.conf
 b. an experimental ms-type is configured (ms-type-async or ms-type-xio) without explicitly enabling the experimental feature in ceph.conf
 c. ms-type is configured as "random" and at runtime the random choice yields an experimental ms-type that was not explicitly enabled in ceph.conf

Signed-off-by: Avner BenHanoch <avnerb@mellanox.com>
2016-02-17 18:42:16 +02:00
Willem Jan Withagen
34f896b0db src/ceph_osd.cc Add missing newline to usage message
Signed-off-by: Willem Jan Withagen <wjw@digiware.nl>
2016-02-11 21:29:19 +01:00
Sage Weil
2e4fa4b3bd Merge pull request #7377 from liewegas/wip-datadir-search
config: add $data_dir/config to config search path
2016-02-05 20:03:50 -05:00
Sage Weil
3944d560e4 global: add data_dir_option for all daemons
This lets us use a generic $data_dir substitution that will map
to rgw_data, osd_data, etc.

Signed-off-by: Sage Weil <sage@redhat.com>
2016-02-04 17:06:02 -05:00
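As a rough illustration of the idea in 3944d560e4, the sketch below expands `$data_dir` using whichever option a daemon names as its data directory. The function `expand_data_dir` and the plain map used as configuration are assumptions for this example; Ceph's real config code handles metavariables differently.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper: replaces every "$data_dir" in `value` with the value of
// the daemon-specific option named by `data_dir_option` (e.g. "osd_data").
// If that option is not set, the input is returned unchanged.
std::string expand_data_dir(std::string value,
                            const std::string& data_dir_option,
                            const std::map<std::string, std::string>& conf) {
  const std::string var = "$data_dir";
  auto it = conf.find(data_dir_option);
  if (it == conf.end()) {
    return value;
  }
  for (std::size_t pos = value.find(var); pos != std::string::npos;
       pos = value.find(var, pos + it->second.size())) {
    value.replace(pos, var.size(), it->second);
  }
  return value;
}
```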
Sage Weil
43dce135d6 ceph-osd: don't mention journal on mkfs
Some backends don't use it, almost all clusters use the default
path, and there'll always be a symlink.

Signed-off-by: Sage Weil <sage@redhat.com>
2016-02-04 14:36:37 +07:00
Jiaying Ren
4ca6bfd43f ceph_osd.cc: fix unreachable flush call
The calling chain for generic_server_usage():

  generic_server_usage()
    ->generic_usage(true)
    ->exit(1)
  cout.flush()

any statements after generic_server_usage() would not be reached, so we
need to flush cout in generic_usage().

Signed-off-by: Jiaying Ren <jiaying.ren@umcloud.com>
2016-01-29 22:22:34 +08:00
Jiaying Ren
b99b61e9a5 ceph_osd.cc/ceph_mon.cc: cleanup unreachable exit call
The calling chain of usage() is:

  usage()
    ->generic_server_usage()
      ->exit(1)
  exit(0)

so the exit(0) after usage() would not be reached.

Signed-off-by: Jiaying Ren <jiaying.ren@umcloud.com>
2016-01-29 16:25:48 +08:00
Sage Weil
db9ec690e6 osd: mark osd backend type in osd_data dir
When we create an osd, mark the type of the backend in the
osd_data dir in the 'type' file.  On startup, if this file is
present, use this to decide which ObjectStore to instantiate.  If
it is not present, use the osd_objectstore option as before.

Signed-off-by: Sage Weil <sage@redhat.com>
2015-12-01 17:21:00 -05:00
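The startup decision described in db9ec690e6 could be sketched as below. The helper name `choose_objectstore` is hypothetical and the file handling is deliberately simplified; only the 'type' file name and the fallback to the configured option come from the commit message.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: prefer the backend recorded in <osd_data>/type;
// if the file is missing or empty, fall back to the configured value
// (the osd_objectstore option in the commit message).
std::string choose_objectstore(const std::string& osd_data,
                               const std::string& configured) {
  std::ifstream f(osd_data + "/type");
  std::string type;
  if (f && std::getline(f, type) && !type.empty()) {
    return type;
  }
  return configured;
}
```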
Sage Weil
880a59d1b7 osd: make block device fsid probing generic
Currently the option name and invocation assume that the block device
is a journal (and FileStore journal, managed by FileJournal).  Rework
the interface so that we can probe any block device and other ObjectStore
implementations will have a chance to identify the device (and return the
osd fsid).

Switch to a static method while we are at it so we avoid instantiating
each backend.

Note that only FileStore is probed at the moment; that will change soon!

Signed-off-by: Sage Weil <sage@redhat.com>
2015-12-01 17:16:11 -05:00
Yunchuan Wen
0669cba457 use new api and fix some wrong flag caller
Signed-off-by: Yunchuan Wen <yunchuan.wen@kylin-cloud.com>
2015-11-12 16:11:05 +08:00
Jason Dillaman
0009f343a5 osd: conditionally initialize the tracepoint provider
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2015-10-14 12:07:36 -04:00
Sage Weil
8c08b6b061 Merge pull request #5636 from liewegas/wip-12747
make EC plugin path static

Reviewed-by: Loic Dachary <ldachary@redhat.com>
2015-08-25 21:47:35 -04:00
Sage Weil
660ae5bcbb osd: always load erasure plugins from the configured directory
Ignore the profile 'directory' field.

This ensures that we can always find plugins even when the cluster
is installed across a mix of distros.

Rename the option to have no osd_ (or mon_) prefix since anybody
may use the ec factory/plugin code.

We still hard-code .libs in the unit tests... sigh.

Signed-off-by: Sage Weil <sage@redhat.com>
2015-08-21 16:03:30 -04:00
Sage Weil
5837ec69d1 osd: drop explicit sync/flush calls before umount
Do it implicitly on umount.

Signed-off-by: Sage Weil <sage@redhat.com>
2015-08-19 17:03:55 -04:00
Rohan Mars
62bfc7a1ab moved to use boost uuid implementation, based on commit 4fe89a7b14
Signed-off-by: Rohan Mars <code@rohanmars.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>
2015-08-19 06:46:17 -04:00
David Zafman
0b2bab460c ceph_osd: Add required feature bits related to this branch to osd_required mask
Signed-off-by: David Zafman <dzafman@redhat.com>
2015-06-19 17:00:03 -07:00
David Zafman
626360aab0 msg, ceph_osd: Support feature bits for all message type's local connection
Signed-off-by: David Zafman <dzafman@redhat.com>
2015-06-19 17:00:03 -07:00
Loic Dachary
db7936ae1c erasure-code: implement consistent error stream
The error stream in the erasure code path is broken and the error
message is sometimes not reported back to the user. For instance the
ErasureCodePlugin::factory method has no error stream: when an error
happens the user is left with a cryptic error code that needs lookup in
the sources to figure it out.

The error stream is made more systematic by:

  * always pass it as ostream *ss (instead of sometimes passing it as
    a reference and sometimes as a stringstream)

  * ostream *ss is added to ErasureCodePlugin::factory

  * define the ErasureCodeInterface::init pure virtual. It is
    already implemented by all plugins, only in slightly different
    ways. The ostream *ss is added so the init function has a way to
    report errors in a human readable way to the caller, in addition to
    the error code.

The ErasureCodePluginJerasure::init return value was incorrectly ignored
when called from ErasureCodePluginJerasure::factory and now returns when
it fails.

The ErasureCodeLrc::layers_init method is given ostream *ss for error
messages instead of printing them via derr.

The ErasureCodePluginLrc::factory method no longer prints errors via
derr: this workaround is made unnecessary by the ostream *ss argument.

The ErasureCodeShec::init ostream *ss argument is ignored. The
ErasureCodeShec::parse method entirely relies on derr to report errors
and converting it goes beyond the scope of this cleanup. There is a
slight risk of getting it wrong and it deserves a separate commit and
careful and independent review.

The PGBackend, OSDMonitor.{cc,h} changes are only about prototype
changes.

Signed-off-by: Loic Dachary <ldachary@redhat.com>
2015-05-25 16:59:02 +02:00
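The convention introduced by db7936ae1c, an `ostream *ss` threaded through both factory and init, can be sketched as follows. `Codec`, `Profile` and this free-standing `factory` are simplified, hypothetical stand-ins and not the real ErasureCodePlugin or ErasureCodeInterface signatures.

```cpp
#include <cerrno>
#include <map>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

using Profile = std::map<std::string, std::string>;

struct Codec {
  int k = 0;

  // Reports a human-readable reason through *ss in addition to the error code.
  int init(const Profile& profile, std::ostream* ss) {
    auto it = profile.find("k");
    if (it == profile.end()) {
      *ss << "profile is missing the required key k";
      return -EINVAL;
    }
    std::istringstream in(it->second);
    if (!(in >> k) || k <= 0) {
      *ss << "k=" << it->second << " must be a positive integer";
      return -EINVAL;
    }
    return 0;
  }
};

// The factory forwards the same stream and propagates init()'s return value
// rather than ignoring it.
int factory(const Profile& profile, std::unique_ptr<Codec>* out, std::ostream* ss) {
  auto codec = std::make_unique<Codec>();
  int r = codec->init(profile, ss);
  if (r != 0) {
    return r;
  }
  *out = std::move(codec);
  return 0;
}
```

Passing a pointer rather than a reference makes the output parameter visible at the call site (`&ss`), which is one plausible reason for standardizing on it.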
Sage Weil
542820d065 ceph-osd: add --check-wants-journal, --check-allows-journal
Completes OSD side of #9580

Signed-off-by: Sage Weil <sage@redhat.com>
2015-03-31 13:49:35 -07:00
Sage Weil
0c29343548 ceph-osd: fix usage
Signed-off-by: Sage Weil <sage@redhat.com>
2015-03-31 13:44:08 -07:00
Sage Weil
2980f3ac52 Merge branch 'prio_hb_pkts' of git://github.com/wenjianhn/ceph
Conflicts:
	src/msg/Messenger.h
2015-01-29 13:28:26 -08:00
Matt Benjamin
53bc4d1757 Cosmetic ceph_osd.cc.
Signed-off-by: Matt Benjamin <matt@cohortfs.com>
2015-01-14 16:43:08 -05:00
Jian Wen
4aa02f8472 osd: add an option to prioritize heartbeat traffic
By default every hardware queue of a network interface is assigned a
pfifo_fast QDisc. When network congestion occurs, the data traffic may
starve out the heartbeat traffic.

To make sure that heartbeat packets are always transmitted (dequeued) first,
set SO_PRIORITY to 6 on the sockets that are used to transmit heartbeat
messages. Heartbeat messages are small, and an OSD daemon doesn't ping its
peers very often, so the heartbeat traffic is not likely to starve out the
data traffic.

Using fq_codel instead of pfifo_fast is another good choice to avoid
bufferbloat. It's not available until Linux 3.5 though.

Signed-off-by: Jian Wen <wenjianhn@gmail.com>
2015-01-12 13:45:20 +08:00
Sage Weil
e3f370f246 Merge pull request #3101 from yuyuyu101/wip-10147
Messenger Unit Tests

Reviewed-by: Sage Weil <sage@redhat.com>
2014-12-18 14:05:16 -08:00
Haomai Wang
001ea29386 Messenger: Create a Messenger implementation by name.
Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
2014-12-18 21:51:12 +08:00
Dan Mick
17c72f591f ceph-osd: remove extra close of stderr
Otherwise, one loses log messages when running with -f or -d.  When
daemonizing, stderr is already closed in global_init_postfork_finish.

Fixes: #10010, #10113, #9810

Signed-off-by: Dan Mick <dan.mick@redhat.com>
2014-12-09 19:28:49 -08:00
Sage Weil
bc5a22b5dc osd: require SNAPMAPPER feature from peers
This was introduced before cuttlefish.  We require users to upgrade first
to a newer release, so there is no need to support a mixed cluster with
such old code.

Signed-off-by: Sage Weil <sage@redhat.com>
2014-12-02 15:19:40 -08:00
Sage Weil
de52873dcf osd, filestore: move automatic upgrade into mount()
Signed-off-by: Sage Weil <sage@redhat.com>
2014-10-27 16:58:01 -07:00
Sage Weil
86919f5edf osd, filestore: mount in upgrade() caller
Signed-off-by: Sage Weil <sage@redhat.com>
2014-10-27 16:58:00 -07:00
Sage Weil
5f8a1df0d3 osd, filestore: move convertfs into FileStore
There is no reason for this level of detail to be exposed to the OSD.

There is one significant change here: we are not touching the SnapMapper
in the FileStore collection removal.  This is, most likely, fixing a bug.

Signed-off-by: Sage Weil <sage@redhat.com>
2014-10-27 16:58:00 -07:00
Sage Weil
7d6e21d8d1 osd: fix need_journal call
From 2955b3da4e

Signed-off-by: Sage Weil <sage@redhat.com>
2014-10-01 06:34:20 -07:00
Sage Weil
2b441e5a0b Merge pull request #2607 from yuyuyu101/wip-9580
ObjectStore: Add "need_journal" interface to make aware of journal devic...

Reviewed-by: Sage Weil <sage@redhat.com>
2014-10-01 06:11:10 -07:00
Haomai Wang
2955b3da4e ObjectStore: Add "need_journal" interface to make aware of journal device
Impl feature #9580

Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
2014-10-01 10:40:09 +08:00
Sage Weil
46d5518644 osd: do not bind ms_objecter messenger
The objecter messenger is only used as a client to initiate client-side
connections to other OSDs.  It doesn't need to bind to a port.

This was added in 558d9fc956 to push client
traffic to the cluster interface.  This doesn't actually help/work because
we are still connecting to our peers' client-facing addresses.

Signed-off-by: Sage Weil <sage@redhat.com>
2014-09-29 16:11:06 -07:00