a747aeac13
Each log Entry now exists on the stack and uses a large (4k) buffer for its log stream. This Entry is std::move'd to the queues (std::vector and boost::circular_buffer) in the Log, involving only memory copies in the general case. There are two memory copies (std::move) for any given Entry, once in Log::submit_entry and again in Log::_flush In practice, this eliminates 100% of allocations outside of startup allocations I've run a simple experiment with the MDS that copies /usr/bin to CephFS. I got measurements for the number of allocations from the heap profiler and the profile of CPU usage in the MDS. ** Before this patch ** == Heap profile: == $ google-pprof --alloc_objects --text bin/ceph-mds out/mds.a.profile.0001.heap Total: 1105048 objects 433329 39.2% 39.2% 433329 39.2% ceph::logging::Log::create_entry 209311 18.9% 58.2% 209311 18.9% __gnu_cxx::__aligned_membuf::_M_addr (inline) 192963 17.5% 75.6% 192963 17.5% __gnu_cxx::new_allocator::allocate (inline) 61774 5.6% 81.2% 61774 5.6% std::__cxx11::basic_string::_M_mutate 37689 3.4% 84.6% 37689 3.4% ceph::buffer::raw_combined::create (inline) 22773 2.1% 86.7% 22773 2.1% mempool::pool_allocator::allocate (inline) 17761 1.6% 88.3% 20523 1.9% std::pair::pair (inline) 15795 1.4% 89.7% 15797 1.4% std::swap (inline) 11011 1.0% 90.7% 130061 11.8% std::__cxx11::list::_M_insert (inline) 10822 1.0% 91.7% 10822 1.0% std::__cxx11::basic_string::reserve 9108 0.8% 92.5% 32721 3.0% __gnu_cxx::new_allocator::construct (inline) 8608 0.8% 93.3% 8610 0.8% std::_Deque_base::_M_initialize_map (inline) 7694 0.7% 94.0% 7694 0.7% std::__cxx11::basic_string::_M_capacity (inline) 6160 0.6% 94.5% 6160 0.6% Journaler::wrap_finisher 6084 0.6% 95.1% 70892 6.4% std::map::operator[] (inline) 5347 0.5% 95.6% 5347 0.5% MutationImpl::add_projected_fnode 4381 0.4% 96.0% 7706 0.7% mempool::pool_allocator::construct (inline) 3588 0.3% 96.3% 182966 16.6% Locker::_do_cap_update 3049 0.3% 96.6% 5280 0.5% std::__shared_count::__shared_count (inline) 3043 0.3% 96.9% 3043 0.3% MDSLogContextBase::MDSLogContextBase (inline) 3038 0.3% 97.1% 14763 1.3% std::__uninitialized_copy::__uninit_copy (inline) So approximately 430k heap allocations for Entry were created! The basic_string::_M_mutate is also another allocation via PrebufferedStreambuf == Profile data == Selecting interesting functions Samples: 798K of event 'cycles:pp', Event count (approx Children Self Com Shared Object Symbol + 1.04% 1.04% log libceph-common.so.0 [.] ceph::logging::Log::_flush + 0.05% 0.05% log libceph-common.so.0 [.] ceph::logging::Log::flush + 0.00% 0.00% log libceph-common.so.0 [.] ceph::logging::Log::_log_safe_write + 0.00% 0.00% log libceph-common.so.0 [.] ceph::logging::Log::entry + 0.00% 0.00% log libceph-common.so.0 [.] ceph::logging::Log::_flush_logbuf ... Children Self Command Shared Object Symbol + 3.69% 0.00% safe_timer libceph-common.so.0 [.] CachedPrebufferedStreambuf::~CachedPrebufferedStreambuf + 0.53% 0.00% ms_dispatch libceph-common.so.0 [.] CachedPrebufferedStreambuf::~CachedPrebufferedStreambuf + 0.13% 0.00% fn_anonymous libceph-common.so.0 [.] CachedPrebufferedStreambuf::~CachedPrebufferedStreambuf + 0.00% 0.00% ms_dispatch libceph-common.so.0 [.] CachedPrebufferedStreambuf::create + 0.00% 0.00% fn_anonymous libceph-common.so.0 [.] CachedPrebufferedStreambuf::create Children Self Command Shared Object Symbol + 0.07% 0.07% fn_anonymous libceph-common.so.0 [.] ceph::logging::Log::create_entry + 0.00% 0.00% ms_dispatch libceph-common.so.0 [.] ceph::logging::Log::create_entry + 0.00% 0.00% ms_dispatch ceph-mds [.] _ZN4ceph7logging3Log12create_entryEiiPm@plt + 0.00% 0.00% md_submit libceph-common.so.0 [.] ceph::logging::Log::create_entry + 0.00% 0.00% fn_anonymous ceph-mds [.] _ZN4ceph7logging3Log12create_entryEiiPm@plt + 0.00% 0.00% safe_timer libceph-common.so.0 [.] ceph::logging::Log::create_entry + 0.00% 0.00% mds_rank_progr libceph-common.so.0 [.] ceph::logging::Log::create_entry + 0.00% 0.00% mds_rank_progr ceph-mds [.] _ZN4ceph7logging3Log12create_entryEiiPm@plt + 0.00% 0.00% msgr-worker-0 libceph-common.so.0 [.] ceph::logging::Log::create_entry + 0.00% 0.00% msgr-worker-2 libceph-common.so.0 [.] ceph::logging::Log::create_entry + 0.00% 0.00% md_submit ceph-mds [.] _ZN4ceph7logging3Log12create_entryEiiPm@plt + 0.00% 0.00% msgr-worker-1 libceph-common.so.0 [.] ceph::logging::Log::create_entry Children Self Command Shared Object Symbol + 8.29% 0.00% ms_dispatch libstdc++.so.6.0.24 [.] virtual thunk to std::basic_ostream<char, std::char_traits<char> >::~basic_ostream + 7.55% 1.46% ms_dispatch libstdc++.so.6.0.24 [.] std::ostream::_M_insert<long> + 3.87% 0.00% fn_anonymous libstdc++.so.6.0.24 [.] std::basic_ostream<char, std::char_traits<char> >::~basic_ostream + 2.92% 0.00% md_submit libstdc++.so.6.0.24 [.] virtual thunk to std::basic_ostream<char, std::char_traits<char> >::~basic_ostream + 2.38% 0.00% fn_anonymous libstdc++.so.6.0.24 [.] virtual thunk to std::basic_ostream<char, std::char_traits<char> >::~basic_ostream + 2.22% 2.22% fn_anonymous libstdc++.so.6.0.24 [.] std::ostream::sentry::sentry + 1.89% 0.13% fn_anonymous libstdc++.so.6.0.24 [.] std::__ostream_insert<char, std::char_traits<char> > + 0.71% 0.00% ms_dispatch libstdc++.so.6.0.24 [.] std::basic_ostream<char, std::char_traits<char> >::~basic_ostream + 0.39% 0.06% fn_anonymous libstdc++.so.6.0.24 [.] std::ostream::_M_insert<long> + 0.29% 0.21% ms_dispatch libstdc++.so.6.0.24 [.] std::__ostream_insert<char, std::char_traits<char> > + 0.27% 0.27% ms_dispatch libstdc++.so.6.0.24 [.] std::ostream::sentry::~sentry + 0.27% 0.27% fn_anonymous libstdc++.so.6.0.24 [.] std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::_M_insert_int<long> + 0.22% 0.22% ms_dispatch libstdc++.so.6.0.24 [.] std::basic_streambuf<char, std::char_traits<char> >::xsputn + 0.20% 0.20% ms_dispatch libstdc++.so.6.0.24 [.] std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::_M_insert_int<long> + 0.15% 0.15% fn_anonymous libstdc++.so.6.0.24 [.] std::ostream::sentry::~sentry + 0.14% 0.14% ms_dispatch libstdc++.so.6.0.24 [.] std::ostream::sentry::sentry + 0.13% 0.00% ms_dispatch libstdc++.so.6.0.24 [.] std::ostream::_M_insert<unsigned long> + 0.13% 0.13% fn_anonymous libstdc++.so.6.0.24 [.] std::basic_streambuf<char, std::char_traits<char> >::xsputn + 0.00% 0.00% fn_anonymous libstdc++.so.6.0.24 [.] std::ostream::_M_insert<unsigned long> + 0.00% 0.00% ms_dispatch libstdc++.so.6.0.24 [.] std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::_M_insert_int<unsigned long> And the unittest_log time: $ bin/unittest_log [==========] Running 15 tests from 1 test case [----------] Global test environment set-up [----------] 15 tests from Log [ RUN ] Log.Simple [ OK ] Log.Simple (0 ms) [ RUN ] Log.ReuseBad [ OK ] Log.ReuseBad (1 ms) [ RUN ] Log.ManyNoGather [ OK ] Log.ManyNoGather (0 ms) [ RUN ] Log.ManyGatherLog [ OK ] Log.ManyGatherLog (12 ms) [ RUN ] Log.ManyGatherLogStringAssign [ OK ] Log.ManyGatherLogStringAssign (27 ms) [ RUN ] Log.ManyGatherLogStringAssignWithReserve [ OK ] Log.ManyGatherLogStringAssignWithReserve (27 ms) [ RUN ] Log.ManyGatherLogPrebuf [ OK ] Log.ManyGatherLogPrebuf (15 ms) [ RUN ] Log.ManyGatherLogPrebufOverflow [ OK ] Log.ManyGatherLogPrebufOverflow (15 ms) [ RUN ] Log.ManyGather [ OK ] Log.ManyGather (8 ms) [ RUN ] Log.InternalSegv [WARNING] /home/pdonnell/cephfs-shell/src/googletest/googletest/src/gtest-death-test.cc:836:: Death tests use fork(), which is unsafe particularly in a threaded context. For this test, Google Test detected 3 threads [ OK ] Log.InternalSegv (8 ms) [ RUN ] Log.LargeLog [ OK ] Log.LargeLog (43 ms) [ RUN ] Log.TimeSwitch [ OK ] Log.TimeSwitch (1 ms) [ RUN ] Log.TimeFormat [ OK ] Log.TimeFormat (0 ms) [ RUN ] Log.Speed_gather [ OK ] Log.Speed_gather (1779 ms) [ RUN ] Log.Speed_nogather [ OK ] Log.Speed_nogather (64 ms) [----------] 15 tests from Log (2000 ms total) [----------] Global test environment tear-down [==========] 15 tests from 1 test case ran. (2000 ms total) [ PASSED ] 15 tests ** After Patch ** The StackStreamBuf uses 4k for its default buffer. This appears to be more than reasonable for preventing allocations for logging == Heap profile: == $ google-pprof --alloc_objects --text bin/ceph-mds out/mds.a.profile.0001.heap Total: 1052127 objects 384957 36.6% 36.6% 384957 36.6% __gnu_cxx::new_allocator::allocate (inline) 274720 26.1% 62.7% 274720 26.1% __gnu_cxx::__aligned_membuf::_M_addr (inline) 68496 6.5% 69.2% 68496 6.5% std::__cxx11::basic_string::_M_mutate 44140 4.2% 73.4% 51828 4.9% std::pair::pair (inline) 43091 4.1% 77.5% 43091 4.1% ceph::buffer::raw_combined::create (inline) 27706 2.6% 80.1% 236407 22.5% std::__cxx11::list::_M_insert (inline) 25699 2.4% 82.6% 25699 2.4% std::__cxx11::basic_string::reserve 23183 2.2% 84.8% 23183 2.2% mempool::pool_allocator::allocate (inline) 19466 1.9% 86.6% 81716 7.8% __gnu_cxx::new_allocator::construct (inline) 17606 1.7% 88.3% 17606 1.7% std::__cxx11::basic_string::_M_capacity (inline) 16879 1.6% 89.9% 16881 1.6% std::swap (inline) 8572 0.8% 90.7% 8574 0.8% std::_Deque_base::_M_initialize_map (inline) 8477 0.8% 91.5% 11808 1.1% mempool::pool_allocator::construct (inline) 6166 0.6% 92.1% 6166 0.6% Journaler::wrap_finisher 6080 0.6% 92.7% 134975 12.8% std::map::operator[] (inline) 6079 0.6% 93.3% 6079 0.6% MutationImpl::add_projected_fnode == Profile data == Samples: 62K of event 'cycles:u', Event count (approx.) Overhead Command Shared Object Symbol + 5.91% log libc-2.23.so [.] vfprintf + 5.75% ms_dispatch libstdc++.so.6.0.24 [.] std::__ostream_insert<char, std::char_traits<char> > + 4.59% ms_dispatch ceph-mds [.] StackStringBuf<4096ul>::xsputn + 4.26% ms_dispatch libc-2.23.so [.] __memmove_ssse3_back + 4.07% log libceph-common.so.0 [.] ceph::logging::Log::_flush + 2.48% ms_dispatch libstdc++.so.6.0.24 [.] std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::_M_insert_int<l + 2.13% fn_anonymous libstdc++.so.6.0.24 [.] std::__ostream_insert<char, std::char_traits<char> > + 2.09% ms_dispatch ceph-mds [.] CDir::check_rstats + 2.06% ms_dispatch libstdc++.so.6.0.24 [.] std::ostream::sentry::sentry + 1.98% ms_dispatch libstdc++.so.6.0.24 [.] std::ostream::sentry::~sentry + 1.87% log libc-2.23.so [.] __strcpy_sse2_unaligned + 1.60% fn_anonymous ceph-mds [.] StackStringBuf<4096ul>::xsputn + 1.46% log libc-2.23.so [.] _IO_default_xsputn + 1.45% log libc-2.23.so [.] _itoa_word + 1.43% fn_anonymous libc-2.23.so [.] __memmove_ssse3_back + 1.40% ms_dispatch libstdc++.so.6.0.24 [.] std::ostream::_M_insert<long> + 0.98% fn_anonymous libstdc++.so.6.0.24 [.] std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::_M_insert_int<l + 0.89% ms_dispatch libstdc++.so.6.0.24 [.] 0x + 0.88% ms_dispatch libstdc++.so.6.0.24 [.] std::_Rb_tree_increment And the unittest_log time: $ bin/unittest_log [==========] Running 13 tests from 1 test case. [----------] Global test environment set-up. [----------] 13 tests from Log [ RUN ] Log.Simple [ OK ] Log.Simple (1 ms) [ RUN ] Log.ReuseBad [ OK ] Log.ReuseBad (0 ms) [ RUN ] Log.ManyNoGather [ OK ] Log.ManyNoGather (0 ms) [ RUN ] Log.ManyGatherLog [ OK ] Log.ManyGatherLog (83 ms) [ RUN ] Log.ManyGatherLogStringAssign [ OK ] Log.ManyGatherLogStringAssign (79 ms) [ RUN ] Log.ManyGatherLogStackSpillover [ OK ] Log.ManyGatherLogStackSpillover (81 ms) [ RUN ] Log.ManyGather [ OK ] Log.ManyGather (80 ms) [ RUN ] Log.InternalSegv [WARNING] /home/pdonnell/ceph/src/googletest/googletest/src/gtest-death-test.cc:836:: Death tests use fork(), which is unsafe particularly in a threaded context. For this test, Google Test detected 3 threads. [ OK ] Log.InternalSegv (7 ms) [ RUN ] Log.LargeLog [ OK ] Log.LargeLog (55 ms) [ RUN ] Log.TimeSwitch [ OK ] Log.TimeSwitch (4 ms) [ RUN ] Log.TimeFormat [ OK ] Log.TimeFormat (1 ms) [ RUN ] Log.Speed_gather [ OK ] Log.Speed_gather (1441 ms) [ RUN ] Log.Speed_nogather [ OK ] Log.Speed_nogather (63 ms) [----------] 13 tests from Log (1895 ms total) [----------] Global test environment tear-down [==========] 13 tests from 1 test case ran. (1895 ms total) [ PASSED ] 13 tests. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com> |
||
---|---|---|
.github | ||
admin | ||
alpine | ||
bin | ||
ceph-erasure-code-corpus@2d7d78b9cc | ||
ceph-object-corpus@e32bf8ca3d | ||
cmake/modules | ||
debian | ||
doc | ||
etc | ||
examples | ||
fusetrace | ||
keys | ||
man | ||
mirroring | ||
monitoring/grafana | ||
qa | ||
selinux | ||
share | ||
src | ||
sudoers.d | ||
systemd | ||
udev | ||
.gitattributes | ||
.githubmap | ||
.gitignore | ||
.gitmodule_mirrors | ||
.gitmodules | ||
.mailmap | ||
.organizationmap | ||
.peoplemap | ||
AUTHORS | ||
ceph.spec.in | ||
CMakeLists.txt | ||
CodingStyle | ||
CONTRIBUTING.rst | ||
COPYING | ||
COPYING-GPL2 | ||
COPYING-LGPL2.1 | ||
do_cmake.sh | ||
do_freebsd.sh | ||
doc_deps.deb.txt | ||
Doxyfile | ||
install-deps.sh | ||
make-apk.sh | ||
make-debs.sh | ||
make-dist | ||
make-srpm.sh | ||
PendingReleaseNotes | ||
pom.xml | ||
README.aix | ||
README.alpine.md | ||
README.FreeBSD | ||
README.md | ||
README.solaris | ||
README.xio | ||
run-make-check.sh | ||
SubmittingPatches.rst |
Ceph - a scalable distributed storage system
Please see http://ceph.com/ for current info.
Contributing Code
Most of Ceph is licensed under the LGPL version 2.1. Some miscellaneous code is under BSD-style license or is public domain. The documentation is licensed under Creative Commons Attribution Share Alike 3.0 (CC-BY-SA-3.0). There are a handful of headers included here that are licensed under the GPL. Please see the file COPYING for a full inventory of licenses by file.
Code contributions must include a valid "Signed-off-by" acknowledging the license for the modified or contributed file. Please see the file SubmittingPatches.rst for details on what that means and on how to generate and submit patches.
We do not require assignment of copyright to contribute code; code is contributed under the terms of the applicable license.
Checking out the source
You can clone from github with
git clone git@github.com:ceph/ceph
or, if you are not a github user,
git clone git://github.com/ceph/ceph
Ceph contains many git submodules that need to be checked out with
git submodule update --init --recursive
Build Prerequisites
The list of Debian or RPM packages dependencies can be installed with:
./install-deps.sh
Building Ceph
Note that these instructions are meant for developers who are
compiling the code for development and testing. To build binaries
suitable for installation we recommend you build deb or rpm packages,
or refer to the ceph.spec.in
or debian/rules
to see which
configuration options are specified for production builds.
Prerequisite: CMake 2.8.12
Build instructions:
./do_cmake.sh
cd build
make
This assumes you make your build dir a subdirectory of the ceph.git
checkout. If you put it elsewhere, just replace ..
in do_cmake.sh with a
correct path to the checkout. Any additional CMake args can be specified
setting ARGS before invoking do_cmake. See cmake options
for more details. Eg.
ARGS="-DCMAKE_C_COMPILER=gcc-7" ./do_cmake.sh
To build only certain targets use:
make [target name]
To install:
make install
CMake Options
If you run the cmake
command by hand, there are many options you can
set with "-D". For example the option to build the RADOS Gateway is
defaulted to ON. To build without the RADOS Gateway:
cmake -DWITH_RADOSGW=OFF [path to top level ceph directory]
Another example below is building with debugging and alternate locations for a couple of external dependencies:
cmake -DLEVELDB_PREFIX="/opt/hyperleveldb" -DOFED_PREFIX="/opt/ofed" \
-DCMAKE_INSTALL_PREFIX=/opt/accelio -DCMAKE_C_FLAGS="-O0 -g3 -gdwarf-4" \
..
To view an exhaustive list of -D options, you can invoke cmake
with:
cmake -LH
If you often pipe make
to less
and would like to maintain the
diagnostic colors for errors and warnings (and if your compiler
supports it), you can invoke cmake
with:
cmake -DDIAGNOSTICS_COLOR=always ..
Then you'll get the diagnostic colors when you execute:
make | less -R
Other available values for 'DIAGNOSTICS_COLOR' are 'auto' (default) and 'never'.
Building a source tarball
To build a complete source tarball with everything needed to build from source and/or build a (deb or rpm) package, run
./make-dist
This will create a tarball like ceph-$version.tar.bz2 from git. (Ensure that any changes you want to include in your working directory are committed to git.)
Running a test cluster
To run a functional test cluster,
cd build
make vstart # builds just enough to run vstart
../src/vstart.sh --debug --new -x --localhost --bluestore
./bin/ceph -s
Almost all of the usual commands are available in the bin/ directory. For example,
./bin/rados -p rbd bench 30 write
./bin/rbd create foo --size 1000
To shut down the test cluster,
../src/stop.sh
To start or stop individual daemons, the sysvinit script can be used:
./bin/init-ceph restart osd.0
./bin/init-ceph stop
Running unit tests
To build and run all tests (in parallel using all processors), use ctest
:
cd build
make
ctest -j$(nproc)
(Note: Many targets built from src/test are not run using ctest
.
Targets starting with "unittest" are run in make check
and thus can
be run with ctest
. Targets starting with "ceph_test" can not, and should
be run by hand.)
When failures occur, look in build/Testing/Temporary for logs.
To build and run all tests and their dependencies without other unnecessary targets in Ceph:
cd build
make check -j$(nproc)
To run an individual test manually, run ctest
with -R (regex matching):
ctest -R [regex matching test name(s)]
(Note: ctest
does not build the test it's running or the dependencies needed
to run it)
To run an individual test manually and see all the tests output, run
ctest
with the -V (verbose) flag:
ctest -V -R [regex matching test name(s)]
To run an tests manually and run the jobs in parallel, run ctest
with
the -j
flag:
ctest -j [number of jobs]
There are many other flags you can give ctest
for better control
over manual test execution. To view these options run:
man ctest
Building the Documentation
Prerequisites
The list of package dependencies for building the documentation can be
found in doc_deps.deb.txt
:
sudo apt-get install `cat doc_deps.deb.txt`
Building the Documentation
To build the documentation, ensure that you are in the top-level
/ceph
directory, and execute the build script. For example:
admin/build-doc