We wish to be able to scrape SMART and NVMe metrics from OSD and MON
nodes. For this we require / recommend smartmontools and nvme-cli
dependencies for both the ceph-osd and ceph-mon packages. However, the
sudoers file (which is required for invoking `smartctl` by user 'ceph')
was installed only in the ceph-osd package. Since different packages
cannot own the same file, and because we want to be able to scrape from
every daemon, we move the dependencies and the sudoers installation to
ceph-base. For generalization, we rename:
sudoers.d/ceph-osd-smartctl -> sudoers.d/ceph-smartctl
Fixes: https://tracker.ceph.com/issues/50657
Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
A tool to test the effect (number of pgs, objects, bytes moved)
of a crushmap change. This is a wrapper around osdmaptool, relying
heavily on its --test-map-pgs-dump option to get the list of
changed pgs. Additionally it uses pg stats to calculate the
numbers of objects and bytes moved.
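
As a rough illustration of the accounting described above, here is a minimal Python sketch. It is not the tool's actual code: the function name, the dict layout and the "num_objects" / "num_bytes" keys are hypothetical, and it simplifies by counting a whole PG as moved whenever its OSD set changes.

```python
def summarize_moves(before, after, pg_stats):
    """Return (pgs, objects, bytes) for PGs whose OSD set differs.

    before/after: pgid -> list of OSD ids (hypothetical layout).
    pg_stats: pgid -> {"num_objects": int, "num_bytes": int} (hypothetical keys).
    """
    pgs = objects = nbytes = 0
    for pgid, old_osds in before.items():
        # A PG missing from "after" is treated as unchanged in this sketch.
        if after.get(pgid, old_osds) == old_osds:
            continue
        stats = pg_stats.get(pgid, {})
        pgs += 1
        objects += stats.get("num_objects", 0)
        nbytes += stats.get("num_bytes", 0)
    return pgs, objects, nbytes


before = {"1.0": [0, 1, 2], "1.1": [1, 2, 3]}
after = {"1.0": [0, 1, 2], "1.1": [1, 2, 4]}
pg_stats = {
    "1.0": {"num_objects": 10, "num_bytes": 4096},
    "1.1": {"num_objects": 7, "num_bytes": 2048},
}
moved = summarize_moves(before, after, pg_stats)
print(moved)
```

Only pg 1.1 changes its mapping in this made-up input, so only its stats are counted.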
Signed-off-by: Mykola Golub <mykola.golub@clyso.com>
mgr/cephadm: replace execnet and remoto with asyncssh
Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Dimitri Savineau <dsavinea@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
We can now use LTO when building ceph. The symver issue was fixed by
using the gcc __symver__ attribute. The systems that support it can now
re-enable LTO.
Fixes: https://tracker.ceph.com/issues/40060
Signed-off-by: Boris Ranto <branto@redhat.com>
libcls_kvs was introduced back in
73d016fdb3, but we don't have an internal
user so far. to reduce the build time, let's disable the build of it by
default.
Signed-off-by: Kefu Chai <kchai@redhat.com>
since we've replaced "virtualenv" with "python3 -m venv", there is no
need to have it in the build deps list.
since, on ubuntu, the venv module is not available by default, we need to
install python3-venv.
Signed-off-by: Kefu Chai <kchai@redhat.com>
this change reverts 9132269421
back then, we were using rpm < 4.13, which does not support
the feature of "Debugsource and debuginfo sub-packages", but per
https://bugzilla.redhat.com/show_bug.cgi?id=185590, rpm >= 4.13
has this feature. see also http://rpm.org/wiki/Releases/4.13.0
in CentOS 8, RPM v4.14.3 is available. and by inspecting the log
when building ceph packages on CentOS 8, we have:
Wrote: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-6524-g4e868f9a/rpm/el8/RPMS/x86_64/ceph-debugsource-17.0.0-6524.g4e868f9a.el8.x86_64.rpm
Wrote: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-6524-g4e868f9a/rpm/el8/RPMS/x86_64/ceph-base-debuginfo-17.0.0-6524.g4e868f9a.el8.x86_64.rpm
Wrote: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-6524-g4e868f9a/rpm/el8/RPMS/x86_64/ceph-common-debuginfo-17.0.0-6524.g4e868f9a.el8.x86_64.rpm
Wrote: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-6524-g4e868f9a/rpm/el8/RPMS/x86_64/ceph-mds-debuginfo-17.0.0-6524.g4e868f9a.el8.x86_64.rpm
....
build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-6524-g4e868f9a/rpm/el8/RPMS/x86_64/ceph-test-debuginfo-17.0.0-6524.g4e868f9a.el8.x86_64.rpm
so, rpmbuild does generate debuginfo package for each binary
package. this should make the life of valgrind a lot easier
when reading the dwz -- no need to read the debuginfo of all
the packages, only the .dwz of the related subpackage is read.
this change should help to decrease the size of debuginfo
rpm packages a little bit. see https://tracker.ceph.com/issues/19099#note-7
this change was inspired by Yuanming Chai <ychai@redhat.com>
See-also: https://tracker.ceph.com/issues/19099
Signed-off-by: Kefu Chai <kchai@redhat.com>
because fmt is packaged in EPEL, while librados is packaged
in RHEL, so we cannot have fmt as a runtime dependency of librados.
to address this issue, we should compile librados either with static library
or with header-only library of fmt. but because the fedora packaging
guideline does not encourage us to package static libraries, and it would
be complicated to package both static and dynamic library for fmt.
the simpler solution would be to compile Ceph with the header-only
version of fmt.
in this change, we compile ceph with the header-only version of fmt
on RHEL to address the runtime dependency issue.
Signed-off-by: Kefu Chai <kchai@redhat.com>
The use of $FIRST_ARG was probably required because the SUSE-specific
%service_* rpm macros were playing tricks on the shell positional parameters.
This is bad practice and error-prone, so let's assume that no macros should do
that anymore and hence it's safe to assume that positional parameters remain
unchanged after any rpm macro call.
Thanks to Franck Bui for providing the original patch
926433f5d4 that this patch is modeled after.
NOTE: the use of FIRST_ARG had already been eliminated by
926433f5d4 but was re-introduced later by
9466d70985
Fixes: 9466d70985
Fixes: https://tracker.ceph.com/issues/51797
Signed-off-by: Nathan Cutler <ncutler@suse.com>
the change to build and ship libthrift was added when version 0.13.0 was
not yet shipped via distro pkgs. now that centos 8 and F34 provide the required
version, we do not need to build and ship it with the jaeger library.
Signed-off-by: Deepika Upadhyay <dupadhya@redhat.com>
* since focal and centos both have yaml-cpp 0.6 available, which dropped
boost as its dependency, moving to 0.6 seems a good upgrade.
* cmake: delete Buildyaml, since distros supply v0.6 this is not needed
This fixes the build failure, as jaegertracing requires yaml-cpp v0.6+
```
Could NOT find yaml-cpp: Found unsuitable version "", but required is at
least "0.5.1" (found yaml-cpp_LIBRARY-NOTFOUND)
```
Signed-off-by: Deepika Upadhyay <dupadhya@redhat.com>
In /etc/sysconfig/ceph we allow operators to define if ceph daemons
should be restarted on upgrade: CEPH_AUTO_RESTART_ON_UPGRADE.
But the post selinux scripts will stop ceph.target regardless if this
is set to `no`, leading to operators adding various hacks to prevent
these unexpected or inconvenient daemon restarts. By now, if users
are using rpms directly, they are likely orchestrating their own
daemon restarts so should not rely on the rpm itself to do this.
Fixes: https://tracker.ceph.com/issues/21672
Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
In RPM spec files, comment lines should not include macro invocations,
because RPM can and will expand them, with unpredictable results.
Fixes: https://tracker.ceph.com/issues/51622
Signed-off-by: Nathan Cutler <ncutler@suse.com>
since Seastar has dropped the protobuf dependencies, there is no
need to prepare them for building crimson anymore.
Signed-off-by: Kefu Chai <kchai@redhat.com>
in the KVM instance offered by OBS, we have
[ 346s] + cat /proc/meminfo
[ 347s] MemTotal: 10167736 kB
[ 347s] MemFree: 4983964 kB
[ 347s] MemAvailable: 9826800 kB
[ 347s] Buffers: 85856 kB
[ 347s] Cached: 4615192 kB
[ 347s] SwapCached: 0 kB
...
[ 347s] SwapTotal: 2097148 kB
and its number of hardware threads is
[ 346s] ++ /usr/bin/getconf _NPROCESSORS_ONLN
[ 346s] + _threads=8
so ($MemTotal+$SwapTotal)/1024/2600 = 4.6, which is less
than the # of threads, so "4" was used for the number of jobs.
but per our recent observation in
38be14bc0f, some compiling jobs could
take up to 3GB. in the OOM failure in OBS, we had
[24915s] [24848.843594] Out of memory: Killed process 16894 (cc1plus) total-vm:4293756kB, anon-rss:2970012kB, file-rss:0kB, shmem-rss:0kB, UID:399 pgtables:8324kB oom_score_adj:0
where 4GiB memory was allocated, in which 3GiB was mapped into
memory. this matches with our findings.
in this change, the memory per core is bumped up to 3000MB
in hope to address the OOM. the downside of this change is
that it would take even longer to finish the build if the
building host is limited in memory.
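
the arithmetic above can be replayed with a small Python sketch. the function name and signature are hypothetical; only the formula and the numbers come from the log quoted above.

```python
def build_jobs(mem_total_kb, swap_total_kb, threads, mem_per_job_mb):
    """Cap the number of parallel jobs by memory, never above the thread count."""
    budget_mb = (mem_total_kb + swap_total_kb) // 1024
    return max(1, min(threads, budget_mb // mem_per_job_mb))


# MemTotal, SwapTotal and the thread count are taken from the OBS log above.
jobs_at_2600 = build_jobs(10167736, 2097148, 8, 2600)
jobs_at_3000 = build_jobs(10167736, 2097148, 8, 3000)
print(jobs_at_2600, jobs_at_3000)
```

with 2600MB per job the result is 4, matching the log; with 3000MB the same host would get 3 jobs.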
Signed-off-by: Kefu Chai <kchai@redhat.com>
unlike rbd_rwl_cache, rbd_ssd_cache does not depend on pmdk (libpmem),
so let's enable it on all supported architecture and rpm based distros.
Signed-off-by: Kefu Chai <kchai@redhat.com>
ceph-deploy is not actively maintained anymore, and it was replaced by
ceph-volume and other high-level tools.
so there is no point to package its manpage anymore.
Signed-off-by: Kefu Chai <kchai@redhat.com>
the ceph-volume tool is composed of the cli frontend and ceph_volume
python module. in 02bc369e05, its cli
frontend is moved from ceph-base package to ceph-osd. but the python
module was left in ceph-base.
since the only consumer of ceph_volume python package is ceph-volume,
better off moving this python module into ceph-osd. this also aligns
the rpm packaging with the deb packaging, where ceph-osd deb package
also includes ceph_volume python module.
we could extract ceph-volume into its own package, so it can be an
arch-independent package. let's leave it as a follow-up change.
Signed-off-by: Kefu Chai <kchai@redhat.com>
python3-setuptools was originally added to ceph-base as a dependency of
ceph-detect-init, see https://tracker.ceph.com/issues/14864. but since
ceph-disk and ceph-detect-init were replaced by ceph-volume, and were
removed from the debian packaging in
ee6bc23e89.
there is no need to have python3-setuptools in the ceph-base packages
anymore.
but since we are still using pkg_resources module provided by setuptools
in ceph-volume, we need to preserve this runtime dependency in ceph-osd,
as ceph-osd packages ceph-volume.
please note, pkg_resources module is also used by cephadm to poke around
ceph_iscsi python module installed in a container, so python-setuptools
should be installed along with ceph-iscsi if we need a better
interoperability between ceph-iscsi and cephadm. this is not in the
scope of this change.
Signed-off-by: Kefu Chai <kchai@redhat.com>
6.2.1 is the version packaged by EPEL8, in other words, this is the
version we've been testing. so to be more consistent with the
known-to-be-good version, let's bump up the required version.
Signed-off-by: Kefu Chai <kchai@redhat.com>
to lower the number of jobs. we are experiencing build failures on
a builder with 48c96t, 193 free mem. the failures were caused by
OOM killer which kills the c++ compiler
[498376.128969] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/jenkins.service,task=cc1plus,pid=1387895,uid=1110
[498376.145288] Out of memory: Killed process 1387895 (cc1plus) total-vm:3323312kB, anon-rss:3164568kB, file-rss:0kB, shmem-rss:0kB, UID:1110
[498376.315185] oom_reaper: reaped process 1387895 (cc1plus), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
[498377.882072] cc1plus invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
before this change, we use the total memory to calculate the number
of jobs, and assume that each job takes at most 2.5GiB mem. in the
case above, the # of job is 96.
after this change, we use the free memory, and increase the mem per job
to 3.0GiB. in the case above, the # of job would be 85.
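
a minimal Python sketch of the two policies, for illustration only. the sample meminfo text and the helper names are made up, and the use of the MemAvailable field as "free memory" is an assumption, since this message does not say which field the spec reads.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: value in kB}."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def jobs_for(mem_kb, threads, mem_per_job_mib):
    """Number of jobs allowed by the memory budget, capped by thread count."""
    return max(1, min(threads, mem_kb // 1024 // mem_per_job_mib))


# Hypothetical host, not the builder described above.
sample = """MemTotal:       16384000 kB
MemFree:         4096000 kB
MemAvailable:    9216000 kB
"""
info = parse_meminfo(sample)
old_jobs = jobs_for(info["MemTotal"], 8, 2560)      # total memory, 2.5GiB per job
new_jobs = jobs_for(info["MemAvailable"], 8, 3072)  # free memory, 3.0GiB per job
print(old_jobs, new_jobs)
```

on this made-up host the old policy allows 6 jobs and the new one allows 2.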
Signed-off-by: Kefu Chai <kchai@redhat.com>
Otherwise fedora 33 complains there is no gcc-toolset-9-gcc-c++
when running "WITH_SEASTAR=true ./install_deps.sh"
Related to: 36759b5363
Signed-off-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com>
we need to use libpmem 1.10 in #40493.
without enabling the module stream offering libpmem 1.9.2, we can only
have access to libpmem 1.6.1. and fedora 33 only has libpmem 1.9
packaged. the same applies to openSUSE Tumbleweed and openSUSE Leap. so
let's stop using libpmem packaged by distro by default, until these
distros include libpmem 1.10.
Signed-off-by: Kefu Chai <kchai@redhat.com>
* refs/pull/40526/head:
spec: add nfs to spec file
mgr/nfs: Don't enable nfs module by default
mgr/nfs: check for invalid chars in cluster id
mgr/nfs: Use CLICommand wrapper
mgr/nfs: reorg nfs files
mgr/nfs: Check if transport or protocol are list instance
mgr/nfs: reorg cluster class and common helper methods
mgr/nfs: move common export helper methods to ExportMgr class
mgr/nfs: move validate methods into new ValidateExport class
mgr/nfs: add custom exception module
mgr/nfs: create new module for export utils
mgr/nfs: rename fs dir to export
mgr/volumes/nfs: Move nfs code out of volumes plugin
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
extract the options in common/options.cc into separate .yaml.in
files, and preprocess them using CMake before translating them into .cc
files using a python script.
this change paves the road to render the options using sphinx, and
will allow us to further annotate the options to include more metadata.
also, this YAML file can be consumed by applications like dashboard
and Sphinx to access this metadata in a simpler way.
* use @variable-name@ for substituting the variables in .yaml.in file
* use cmake variable of `mgr_disabled_modules` instead of C macro
to define `mgr_disabled_modules` in global.yaml.in
* debian/control, ceph.spec.in, win32_deps_build.sh: add python3-yaml
as build dep
* add y2c.py (short for YAML to C++) to translate .yaml to .cc file
* common/options/*.yaml.in: extract and split options into .yaml.in
files, the subvars in them are then replaced with CMake variables,
and copied to the corresponding .yaml files
* include/config-h.in.cmake: remove MGR_DISABLED_MODULES, as it
is not a CMake variable.
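
to illustrate the @variable-name@ substitution step, here is a small Python sketch. it only imitates the idea; the real preprocessing is done by CMake, and the YAML fragment, the function name and the variable value below are hypothetical.

```python
import re


def configure(template, variables):
    """Replace each @name@ reference with its value from `variables`.

    Unknown names become an empty string in this sketch.
    """
    return re.sub(
        r"@([A-Za-z0-9_]+)@",
        lambda m: str(variables.get(m.group(1), "")),
        template,
    )


# Hypothetical fragment of a .yaml.in file.
template = "- name: mgr_disabled_modules\n  default: @mgr_disabled_modules@\n"
rendered = configure(template, {"mgr_disabled_modules": "restful"})
print(rendered)
```

the rendered text is what a script like y2c.py would then read as plain YAML.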
Signed-off-by: Kefu Chai <kchai@redhat.com>
This daemon has a systemd service which starts it with --setuser ceph
--setgroup ceph. "ceph" user and group are created by ceph-common and
won't be there unless ceph-common is installed.
Fixes: https://tracker.ceph.com/issues/50207
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
de6c8250a6 added an explicit %dir directive for
a new directory added to the ceph-common package, but -- due to a typo --
neglected to include the "%". As a result, RPM builds started to fail with:
Processing files: ceph-common-17.0.0-2787.gde6c8250.el8.x86_64
error: File must begin with "/": {_libdir}/ceph/denc/
RPM build errors:
File must begin with "/": {_libdir}/ceph/denc/
Fixes: de6c8250a6
Signed-off-by: Nathan Cutler <ncutler@suse.com>
2d3c6561b4 introduced a new library directory
"%{_libdir}/ceph/denc/" in ceph-common but did not explicitly state that it
should be owned by the package. This caused OBS builds to fail as follows:
[ 5515s] ceph-common-17.0.0-2786.1.x86_64.rpm: directories not owned by a package:
[ 5515s] - /usr/lib64/ceph/denc
Fixes: 2d3c6561b4
Signed-off-by: Nathan Cutler <ncutler@suse.com>
to reduce the memory footprint when linking ceph-dencoder.
* src/tools/ceph-dencoder:
* build dencoders as shared libraries named with the prefix of
"den-mod-". so ceph-dencoder can find them
* install dencoders into $prefix/lib/ceph/denc, so ceph-dencoder
can find them
* only expose "register_dencoders()" function from plugins.
* load plugins in specified directory
* ceph.spec.in: package plugins
* debian: package plugins
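
the directory-based discovery can be pictured with this Python sketch. it is not how ceph-dencoder is implemented (that is C++ and actually loads the libraries); the ".so" suffix, the function name and the plugin file names are assumptions made for the example.

```python
import tempfile
from pathlib import Path


def find_dencoder_plugins(plugin_dir, prefix="den-mod-", suffix=".so"):
    """List candidate plugin files in plugin_dir, sorted by path."""
    pattern = f"{prefix}*{suffix}"
    return sorted(p for p in Path(plugin_dir).glob(pattern) if p.is_file())


# Demo with made-up file names in a temporary directory.
with tempfile.TemporaryDirectory() as d:
    for name in ("den-mod-common.so", "den-mod-osd.so", "README"):
        (Path(d) / name).touch()
    found = [p.name for p in find_dencoder_plugins(d)]
print(found)
```

files without the prefix are ignored, which is why naming the shared libraries consistently matters.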
Signed-off-by: Kefu Chai <kchai@redhat.com>
Commit 75980798f1 introduced a new package,
libcephsqlite, with a hard RPM dependency on a package "sqlite-libs" which
does not exist in openSUSE.
Since the runtime library dependencies of libcephsqlite are handled by RPM
transparently, this line is not needed.
Fixes: https://tracker.ceph.com/issues/50007
Signed-off-by: Nathan Cutler <ncutler@suse.com>
The ceph-resource-agents package contains an architecture-independent
bash script and parent directories. There are no architecture-dependent
files here, so we can use a single noarch RPM across all host
architectures.
Signed-off-by: Ken Dreyer <kdreyer@redhat.com>
And since the dependency is now distro-conditional, moving it down
to the 'distro-conditional make check dependencies' section.
Signed-off-by: Kyr Shatskyy <kyrylo.shatskyy@suse.com>
* build with WITH_SYSTEM_PMDK=ON on fedora, as f32 and f33 ship
libpmem1.8 and libpmem1.9 respectively. and we need libpmem v1.7
* build with WITH_SYSTEM_PMDK=ON on el8, as el8 and CentOS8 AppStream
ships libpmem v1.6,
quote from nvml.spec:
> By design, PMDK does not support any 32-bit architecture.
> Due to dependency on some inline assembly, PMDK can be compiled only
> on these architectures:
> - x86_64
> - ppc64le (experimental)
> - aarch64 (unmaintained, supporting hardware doesn't exist?)
so far, only x86_64 and ppc64le packages are built.
see also,
https://src.fedoraproject.org/rpms/nvml/blob/rawhide/f/nvml.spec
this change addresses a regression introduced by
a49d1dbb32
Signed-off-by: Kefu Chai <kchai@redhat.com>
to ease the build for developers using SUSE, Fedora, CentOS or RHEL.
so install-deps.sh can install ninja for them.
Fixes: https://tracker.ceph.com/issues/49694
Signed-off-by: Kefu Chai <kchai@redhat.com>
The recently merged commit 0e511973f7 replaced
make %{_smp_mflags}
with
%make_build
for the stated purpose of hiding the %_smp_mflags macro in a higher-level macro.
But, on SUSE, the higher-level macro (%make_build) expands to:
make -O %{_smp_mflags}
The addition of the -O flag makes the build considerably slower and increases
the memory requirement. The exact reason for this is unknown - possibly it's due
to a bug in make, although the same slowness was observed with ninja as well.
In any event, this is a deal-breaker when building in the OBS, because the build
infrastructure there is optimized for builds that do not require huge amounts of
memory and we would rather have a fast build with mixed up compiler messages
than a very slow one with synced compiler messages.
Fixes: 0e511973f7
Signed-off-by: Nathan Cutler <ncutler@suse.com>
for encryption, aws s3 provides an "encryption" context to vary per-object
keys. The encryption context is a base64 encoded json structure, which
must be converted to a deterministic form -- "canonical json". This
requires converting all strings to a normalized canonical form: "utf-8 nfc",
it also requires that keys in objects be sorted in a fixed order; so some
form of sorting based on nfc.
It turns out that libicu was the best way to produce utf-8 nfc (boost also
provides a mechanism, but it has many quirks). So, here are the hooks
to pull the system libicu into the build.
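
As an illustration of what "canonical json" means here, the following Python sketch normalizes strings to NFC and sorts object keys. It is not the RGW code: it uses Python's unicodedata instead of libicu, the function names are hypothetical, and sorting keys by code point is an assumption about the required order.

```python
import base64
import json
import unicodedata


def nfc(value):
    """Recursively normalize every string (keys included) to Unicode NFC."""
    if isinstance(value, str):
        return unicodedata.normalize("NFC", value)
    if isinstance(value, list):
        return [nfc(v) for v in value]
    if isinstance(value, dict):
        return {nfc(k): nfc(v) for k, v in value.items()}
    return value


def canonical_context(encoded_context):
    """Decode a base64 JSON context and re-serialize it deterministically."""
    data = json.loads(base64.b64decode(encoded_context))
    return json.dumps(
        nfc(data), sort_keys=True, ensure_ascii=False, separators=(",", ":")
    )


# "e" followed by a combining acute accent becomes the single NFC code point.
raw = json.dumps({"b": "x", "a": "e\u0301"}).encode()
canonical = canonical_context(base64.b64encode(raw).decode())
print(canonical)
```

Two contexts that differ only in key order or in composed versus decomposed characters produce the same output string.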
Fixes: http://tracker.ceph.com/issues/48746
Signed-off-by: Marcus Watts <mwatts@redhat.com>
As of a49d1dbb32, when the rbd_rwl_cache and
rbd_ssd_cache bconds are enabled and WITH_SYSTEM_PMDK is disabled (as it is by
default), the RPM build attempts to
git clone https://github.com/ceph/pmdk.git
but of course that won't work in the OBS, where the build workers have no
Internet connectivity.
Fortunately, the openSUSE/SLE versions targeted by Ceph master and pacific ship
the necessary PMDK libraries as RPM packages.
Fixes: a49d1dbb32
Fixes: https://tracker.ceph.com/issues/49550
Signed-off-by: Nathan Cutler <ncutler@suse.com>
43b441f9a3 removed a bunch of code which the SUSE
builds were relying on to avoid OOM. This commit brings back that code in
a much-streamlined form: the SUSE-specific %limit_build macro.
This also has the advantage of not breaking the build on older RPMs which only
know about %_smp_mflags, and not the newer %_smp_build_ncpus etc. macros.
Fixes: 43b441f9a3
Fixes: https://tracker.ceph.com/issues/49556
Signed-off-by: Nathan Cutler <ncutler@suse.com>
This code causes the Ceph build in OBS to fail due to OOM, because the typical
setting of %_smp_build_ncpus in the OBS is 16, but available memory is
insufficient to sustain such a high degree of parallelism for the in-memory
compression operation.
Fixes: b50fc9e61c
Fixes: https://tracker.ceph.com/issues/49583
Signed-off-by: Nathan Cutler <ncutler@suse.com>
The luarocks conditional had gotten hard to read, and the openSUSE Leap 15.3
build needs lua53 as well.
Signed-off-by: Nathan Cutler <ncutler@suse.com>
This partially reverts 2b1e646f7a which
mistakenly changed a line inside an "%if 0%{?suse_version}" conditional.
Fixes: 2b1e646f7a
Signed-off-by: Nathan Cutler <ncutler@suse.com>
this change partially reverts da7030db79
which uses the %cmake rpm macro in the place of "cmake". but
%cmake sets BUILD_SHARED_LIBS=ON. so quite a few internal libraries
defined using add_library() are now compiled into shared libraries which
are not installed or packaged. when we are installing the rpm packages
compiled with this option, rpm complains because the linked libraries are
missing, for instance, `libgmock.so.1.10.0` was compiled as a static
library before da7030db79, and was
included by the test executables. but after that change it's compiled
as a shared library.
so we need to either package the linked shared libraries or just link
against them statically. at this moment, the latter approach is simpler,
albeit at the cost of larger executables and debug symbols.
Fixes: https://tracker.ceph.com/issues/49395
Signed-off-by: Kefu Chai <kchai@redhat.com>
RPM's parseSpec() Python method internally strips this whitespace
character.
Tools that process this spec file with parseSpec() and evaluate
RPMTAG_DESCRIPTION cannot match this exact %description string as
written here.
Strip the trailing whitespace so that the RPMTAG_DESCRIPTION header
matches what we've written in the spec.
Signed-off-by: Ken Dreyer <kdreyer@redhat.com>
gperftools' libprofiler did not build on ppc64le until 2.7.90.
The EPEL 8 package is being updated accordingly.
Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com>
it errors while building without lua_packages:
+ -DBOOST_J=102 -DWITH_GRAFANA=ON
/var/tmp/rpm-tmp.aq2X3J: line 91: -DBOOST_J=102: command not found
Signed-off-by: luo.runbing <luo.runbing@zte.com.cn>
The rgw-gap-list tool can produce a number of false positives when the
cluster is being used during its run. One technique to minimize the
number of false positives is to run the tool twice and look for the
objects that appear in both lists. The rgw-gap-list-comparator tool is
designed to do this comparison.
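
The comparison idea can be sketched in a few lines of Python. This is not the rgw-gap-list-comparator implementation; the function name and the object names are hypothetical, and it assumes each run yields one object name per line.

```python
def confirmed_gaps(first_run, second_run):
    """Return the object names that appear in both runs, sorted."""
    first = {line.strip() for line in first_run if line.strip()}
    second = {line.strip() for line in second_run if line.strip()}
    return sorted(first & second)


# Made-up output of two runs; obj-b and obj-c show up only once each.
run1 = ["obj-a\n", "obj-b\n", "obj-d\n"]
run2 = ["obj-d\n", "obj-a\n", "obj-c\n"]
both = confirmed_gaps(run1, run2)
print(both)
```

Objects reported by only one run are treated as likely false positives and dropped.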
Signed-off-by: Michael Kidd <linuxkidd@gmail.com>
Due to a prior bug (pr: 38228) tail rados objects of some RGW objects
could have been incorrectly deleted. This tool is designed to look for
such cases. It essentially does the opposite of rgw-orphan-list,
looking for rados objects that RGW expects to be there, but which are
not to be found.
IMPORTANT: This is very experimental at this point in time, and any
"results" produced should be verified by other means.
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
Signed-off-by: Michael Kidd <linuxkidd@gmail.com>