mirror of
https://github.com/ceph/ceph
synced 2025-01-10 13:10:46 +00:00
b11226d3ee
Signed-off-by: Sage Weil <sage@newdream.net>
708 lines
32 KiB
ReStructuredText
=======
Pacific
=======

Pacific is the 16th stable release of Ceph. It is named after the
giant Pacific octopus (Enteroctopus dofleini).

v16.2.3 Pacific
===============

This is the third backport release in the Pacific series. We recommend all users
update to this release.

Notable Changes
---------------

* This release fixes a cephadm upgrade bug that caused some systems to get stuck in a loop
  restarting the first mgr daemon.

v16.2.2 Pacific
===============

This is the second backport release in the Pacific series. We recommend all
users update to this release.

Notable Changes
---------------

* Cephadm now supports an *ingress* service type that provides load
  balancing and HA (via haproxy and keepalived on a virtual IP) for
  the RGW service (see :ref:`orchestrator-haproxy-service-spec`). (The experimental
  *rgw-ha* service has been removed.)

Changelog
---------

* ceph-fuse: src/include/buffer.h: 1187: FAILED ceph_assert(_num <= 1024) (`pr#40628 <https://github.com/ceph/ceph/pull/40628>`_, Yanhu Cao)
* ceph-volume: fix "device" output (`pr#41054 <https://github.com/ceph/ceph/pull/41054>`_, Sébastien Han)
* ceph-volume: fix raw listing when finding OSDs from different clusters (`pr#40985 <https://github.com/ceph/ceph/pull/40985>`_, Sébastien Han)
* ceph.spec.in: Enable tcmalloc on IBM Power and Z (`pr#39488 <https://github.com/ceph/ceph/pull/39488>`_, Nathan Cutler, Yaakov Selkowitz)
* cephadm april batch 3 (`issue#49737 <http://tracker.ceph.com/issues/49737>`_, `pr#40922 <https://github.com/ceph/ceph/pull/40922>`_, Adam King, Sage Weil, Daniel Pivonka, Shreyaa Sharma, Sebastian Wagner, Juan Miguel Olmo Martínez, Zac Dover, Jeff Layton, Guillaume Abrioux, 胡玮文, Melissa Li, Nathan Cutler, Yaakov Selkowitz)
* cephadm: april batch 1 (`pr#40544 <https://github.com/ceph/ceph/pull/40544>`_, Sage Weil, Daniel Pivonka, Joao Eduardo Luis, Adam King)
* cephadm: april batch backport 2 (`pr#40746 <https://github.com/ceph/ceph/pull/40746>`_, Guillaume Abrioux, Sage Weil, Paul Cuzner)
* cephadm: specify addr on bootstrap's host add (`pr#40554 <https://github.com/ceph/ceph/pull/40554>`_, Joao Eduardo Luis)
* cephfs: minor ceph-dokan improvements (`pr#40627 <https://github.com/ceph/ceph/pull/40627>`_, Lucian Petrut)
* client: items pinned in cache preventing unmount (`pr#40629 <https://github.com/ceph/ceph/pull/40629>`_, Xiubo Li)
* client: only check pool permissions for regular files (`pr#40686 <https://github.com/ceph/ceph/pull/40686>`_, Xiubo Li)
* cmake: define BOOST_ASIO_USE_TS_EXECUTOR_AS_DEFAULT globally (`pr#40706 <https://github.com/ceph/ceph/pull/40706>`_, Kefu Chai)
* cmake: pass unparsed args to add_ceph_test() (`pr#40523 <https://github.com/ceph/ceph/pull/40523>`_, Kefu Chai)
* cmake: use --smp 1 --memory 256M to crimson tests (`pr#40568 <https://github.com/ceph/ceph/pull/40568>`_, Kefu Chai)
* crush/CrushLocation: do not print logging message in constructor (`pr#40679 <https://github.com/ceph/ceph/pull/40679>`_, Alex Wu)
* doc/cephfs/nfs: add user id, fs name and key to FSAL block (`pr#40687 <https://github.com/ceph/ceph/pull/40687>`_, Varsha Rao)
* include/librados: fix doxygen syntax for docs build (`pr#40805 <https://github.com/ceph/ceph/pull/40805>`_, Josh Durgin)
* mds: "cluster [WRN] Scrub error on inode 0x1000000039d (/client.0/tmp/blogbench-1.0/src/blogtest_in) see mds.a log and `damage ls` output for details" (`pr#40825 <https://github.com/ceph/ceph/pull/40825>`_, Milind Changire)
* mds: skip the buffer in UnknownPayload::decode() (`pr#40682 <https://github.com/ceph/ceph/pull/40682>`_, Xiubo Li)
* mgr/PyModule: put mgr_module_path before Py_GetPath() (`pr#40517 <https://github.com/ceph/ceph/pull/40517>`_, Kefu Chai)
* mgr/dashboard: Device health status is not getting listed under hosts section (`pr#40494 <https://github.com/ceph/ceph/pull/40494>`_, Aashish Sharma)
* mgr/dashboard: Fix for alert notification message being undefined (`pr#40588 <https://github.com/ceph/ceph/pull/40588>`_, Nizamudeen A)
* mgr/dashboard: Fix for broken User management role cloning (`pr#40398 <https://github.com/ceph/ceph/pull/40398>`_, Nizamudeen A)
* mgr/dashboard: Improve descriptions in some parts of the dashboard (`pr#40545 <https://github.com/ceph/ceph/pull/40545>`_, Nizamudeen A)
* mgr/dashboard: Remove username and password from request body (`pr#40981 <https://github.com/ceph/ceph/pull/40981>`_, Nizamudeen A)
* mgr/dashboard: Remove username, password fields from Manager Modules/dashboard,influx (`pr#40489 <https://github.com/ceph/ceph/pull/40489>`_, Aashish Sharma)
* mgr/dashboard: Revoke read-only user's access to Manager modules (`pr#40648 <https://github.com/ceph/ceph/pull/40648>`_, Nizamudeen A)
* mgr/dashboard: Unable to login to ceph dashboard until clearing cookies manually (`pr#40586 <https://github.com/ceph/ceph/pull/40586>`_, Avan Thakkar)
* mgr/dashboard: debug nodeenv hangs (`pr#40815 <https://github.com/ceph/ceph/pull/40815>`_, Ernesto Puerta)
* mgr/dashboard: filesystem pool size should use stored stat (`pr#40980 <https://github.com/ceph/ceph/pull/40980>`_, Avan Thakkar)
* mgr/dashboard: fix broken feature toggles (`pr#40474 <https://github.com/ceph/ceph/pull/40474>`_, Ernesto Puerta)
* mgr/dashboard: fix duplicated rows when creating NFS export (`pr#40990 <https://github.com/ceph/ceph/pull/40990>`_, Alfonso Martínez)
* mgr/dashboard: fix errors when creating NFS export (`pr#40822 <https://github.com/ceph/ceph/pull/40822>`_, Alfonso Martínez)
* mgr/dashboard: improve telemetry opt-in reminder notification message (`pr#40887 <https://github.com/ceph/ceph/pull/40887>`_, Waad Alkhoury)
* mgr/dashboard: test prometheus rules through promtool (`pr#40929 <https://github.com/ceph/ceph/pull/40929>`_, Aashish Sharma, Kefu Chai)
* mon: Modifying trim logic to change paxos_service_trim_max dynamically (`pr#40691 <https://github.com/ceph/ceph/pull/40691>`_, Aishwarya Mathuria)
* monmaptool: Don't call set_port on an invalid address (`pr#40690 <https://github.com/ceph/ceph/pull/40690>`_, Brad Hubbard, Kefu Chai)
* os/FileStore: don't propagate split/merge error to "create"/"remove" (`pr#40989 <https://github.com/ceph/ceph/pull/40989>`_, Mykola Golub)
* os/bluestore/BlueFS: do not _flush_range deleted files (`pr#40677 <https://github.com/ceph/ceph/pull/40677>`_, weixinwei)
* osd/PeeringState: fix acting_set_writeable min_size check (`pr#40759 <https://github.com/ceph/ceph/pull/40759>`_, Samuel Just)
* packaging: require ceph-common for immutable object cache daemon (`pr#40665 <https://github.com/ceph/ceph/pull/40665>`_, Ilya Dryomov)
* pybind/mgr/volumes: deadlock on async job hangs finisher thread (`pr#40630 <https://github.com/ceph/ceph/pull/40630>`_, Kefu Chai, Patrick Donnelly)
* qa/suites/krbd: don't require CEPHX_V2 for unmap subsuite (`pr#40826 <https://github.com/ceph/ceph/pull/40826>`_, Ilya Dryomov)
* qa/suites/rados/cephadm: stop testing on broken focal kubic podman (`pr#40512 <https://github.com/ceph/ceph/pull/40512>`_, Sage Weil)
* qa/tasks/ceph.conf: shorten cephx TTL for testing (`pr#40663 <https://github.com/ceph/ceph/pull/40663>`_, Sage Weil)
* qa/tasks/cephfs: create enough subvolumes (`pr#40688 <https://github.com/ceph/ceph/pull/40688>`_, Ramana Raja)
* qa/tasks/vstart_runner.py: start max required mgrs (`pr#40612 <https://github.com/ceph/ceph/pull/40612>`_, Alfonso Martínez)
* qa/tasks: Add wait_for_clean() check prior to initiating scrubbing (`pr#40461 <https://github.com/ceph/ceph/pull/40461>`_, Sridhar Seshasayee)
* qa: "AttributeError: 'NoneType' object has no attribute 'mon_manager'" (`pr#40645 <https://github.com/ceph/ceph/pull/40645>`_, Rishabh Dave)
* qa: "log [ERR] : error reading sessionmap 'mds2_sessionmap'" (`pr#40852 <https://github.com/ceph/ceph/pull/40852>`_, Patrick Donnelly)
* qa: fix ino_release_cb racy behavior (`pr#40683 <https://github.com/ceph/ceph/pull/40683>`_, Patrick Donnelly)
* qa: fs:cephadm mount does not wait for mds to be created (`pr#40528 <https://github.com/ceph/ceph/pull/40528>`_, Patrick Donnelly)
* qa: test standby_replay in workloads (`pr#40853 <https://github.com/ceph/ceph/pull/40853>`_, Patrick Donnelly)
* rbd-mirror: fix UB while registering perf counters (`pr#40680 <https://github.com/ceph/ceph/pull/40680>`_, Arthur Outhenin-Chalandre)
* rgw: add latency to the request summary of an op (`pr#40448 <https://github.com/ceph/ceph/pull/40448>`_, Ali Maredia)
* rgw: Backport of datalog improvements to Pacific (`pr#40559 <https://github.com/ceph/ceph/pull/40559>`_, Yuval Lifshitz, Adam C. Emerson)
* test: disable mgr/mirroring for `test_mirroring_init_failure_with_recovery` test (`issue#50020 <http://tracker.ceph.com/issues/50020>`_, `pr#40684 <https://github.com/ceph/ceph/pull/40684>`_, Venky Shankar)
* tools/cephfs_mirror/PeerReplayer.cc: add missing include (`pr#40678 <https://github.com/ceph/ceph/pull/40678>`_, Duncan Bellamy)
* vstart.sh: disable "auth_allow_insecure_global_id_reclaim" (`pr#40957 <https://github.com/ceph/ceph/pull/40957>`_, Kefu Chai)

v16.2.1 Pacific
===============

This is the first bugfix release in the Pacific stable series. It addresses a
security vulnerability in the Ceph authentication framework.

We recommend all Pacific users upgrade.

Security fixes
--------------

* This release includes a security fix that ensures the global_id
  value (a numeric value that should be unique for every authenticated
  client or daemon in the cluster) is reclaimed after a network
  disconnect or ticket renewal in a secure fashion. Two new health
  alerts may appear during the upgrade indicating that there are
  clients or daemons that are not yet patched with the appropriate
  fix.

  To temporarily mute the health alerts around insecure clients for the duration of the
  upgrade, you may want to run::

    ceph health mute AUTH_INSECURE_GLOBAL_ID_RECLAIM 1h
    ceph health mute AUTH_INSECURE_GLOBAL_ID_RECLAIM_ALLOWED 1h

  For more information, see :ref:`CVE-2021-20288`.

v16.2.0 Pacific
===============

This is the first stable release of Ceph Pacific.

Major Changes from Octopus
--------------------------

General
~~~~~~~

* Cephadm can automatically upgrade an Octopus cluster to Pacific with a single
  command to start the process.
* Cephadm has improved significantly over the past year, with improved
  support for RGW (standalone and multisite), and new support for NFS
  and iSCSI. Most of these changes have already been backported to
  recent Octopus point releases, but with the Pacific release we will
  switch to backporting bug fixes only.
* :ref:`Packages <packages>` are built for the following distributions:

  - CentOS 8
  - Ubuntu 20.04 (Focal)
  - Ubuntu 18.04 (Bionic)
  - Debian Buster
  - :ref:`Container image <containers>` (based on CentOS 8)

  With the exception of Debian Buster, packages and containers are
  built for both x86_64 and aarch64 (arm64) architectures.

  Note that cephadm clusters may work on many other distributions,
  provided Python 3 and a recent version of Docker or Podman is
  available to manage containers. For more information, see
  :ref:`cephadm-host-requirements`.

Dashboard
~~~~~~~~~

The :ref:`mgr-dashboard` brings improvements in the following management areas:

* Orchestrator/Cephadm:

  - Host management: maintenance mode, labels.
  - Services: display placement specification.
  - OSD: disk replacement, display status of ongoing deletion, and improved
    health/SMART diagnostics reporting.

* Official :ref:`mgr ceph api`:

  - OpenAPI v3 compliant.
  - Stability commitment starting from Pacific release.
  - Versioned via HTTP ``Accept`` header (starting with v1.0).
  - Thoroughly tested (>90% coverage and per Pull Request validation).
  - Fully documented.

* RGW:

  - Multi-site synchronization monitoring.
  - Management of multiple RGW daemons and their resources (buckets and users).
  - Bucket and user quota usage visualization.
  - Improved configuration of S3 tenanted users.

* Security (multiple enhancements and fixes resulting from penetration testing conducted by IBM):

  - Account lock-out after a configurable number of failed log-in attempts.
  - Improved cookie policies to mitigate XSS/CSRF attacks.
  - Reviewed and improved security in HTTP headers.
  - Sensitive information reviewed and removed from logs and error messages.
  - TLS 1.0 and 1.1 support disabled.
  - Debug mode, when enabled, triggers HEALTH_WARN.

* Pools:

  - Improved visualization of replication and erasure coding modes.
  - CLAY erasure code plugin supported.

* Alerts and notifications:

  - Alert triggered on MTU mismatches in the cluster network.
  - Favicon changes according to cluster status.

* Other:

  - Landing page: improved charts and visualization.
  - Telemetry configuration wizard.
  - OSDs: management of individual OSD flags.
  - RBD: per-RBD image Grafana dashboards.
  - CephFS: Dirs and Caps displayed.
  - NFS: v4 support only (v3 backward compatibility planned).
  - Front-end: Angular 10 update.

RADOS
~~~~~

* Pacific introduces :ref:`bluestore-rocksdb-sharding`, which reduces disk space requirements.

* Ceph now provides QoS between client I/O and background operations via the
  mclock scheduler.

* The balancer is now on by default in upmap mode to improve distribution of
  PGs across OSDs.

* The output of ``ceph -s`` has been improved to show recovery progress in
  one progress bar. More detailed progress bars are visible via the
  ``ceph progress`` command.

RBD block storage
~~~~~~~~~~~~~~~~~

* The image live-migration feature has been extended to support external data
  sources. Images can now be instantly imported from local files, remote
  files served over HTTP(S) or remote S3 buckets in ``raw`` (``rbd export v1``)
  or basic ``qcow`` and ``qcow2`` formats. Support for ``rbd export v2``
  format, advanced QCOW features and ``rbd export-diff`` snapshot differentials
  is expected in future releases.

* Initial support for client-side encryption has been added. This is based
  on LUKS and in future releases will allow using per-image encryption keys
  while maintaining snapshot and clone functionality -- so that parent image
  and potentially multiple clone images can be encrypted with different keys.

* A new persistent write-back cache is available. The cache operates in
  a log-structured manner, providing full point-in-time consistency for the
  backing image. It should be particularly suitable for PMEM devices.

* A Windows client is now available in the form of ``librbd.dll`` and
  ``rbd-wnbd`` (Windows Network Block Device) daemon. It allows mapping,
  unmapping and manipulating images similar to ``rbd-nbd``.

* The librbd API now offers quiesce/unquiesce hooks, allowing for coordinated
  snapshot creation.

RGW object storage
~~~~~~~~~~~~~~~~~~

* Initial support for S3 Select. See :ref:`s3-select-feature-table` for supported queries.

* Bucket notification topics can be configured as ``persistent``, where events
  are recorded in rados for reliable delivery.

* Bucket notifications can be delivered to SSL-enabled AMQP endpoints.

* Lua scripts can be run during requests and access their metadata.

* SSE-KMS now supports KMIP as a key management service.

* Multisite data logs can now be deployed on ``cls_fifo`` to avoid large omap
  cluster warnings and make their trimming cheaper. See ``rgw_data_log_backing``.

CephFS distributed file system
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* The CephFS MDS modifies on-RADOS metadata such that the new format is no
  longer backwards compatible. It is not possible to downgrade a file system from
  Pacific (or later) to an older release.

* Support for multiple file systems in a single Ceph cluster is now stable. New Ceph
  clusters enable support for multiple file systems by default. Existing clusters
  must still set the "enable_multiple" flag on the FS. See also
  :ref:`cephfs-multifs`.

* A new ``mds_autoscaler`` ``ceph-mgr`` plugin is available for automatically
  deploying MDS daemons in response to changes to the ``max_mds`` configuration.
  Expect further enhancements in the future to simplify and automate MDS scaling.

* ``cephfs-top`` is a new utility for looking at performance metrics from CephFS
  clients. It is development preview quality and will have bugs. For more
  information, see :ref:`cephfs-top`.

* A new ``snap_schedule`` ``ceph-mgr`` plugin provides a command toolset for
  scheduling snapshots on a CephFS file system. For more information, see
  :ref:`snap-schedule`.

* First class NFS gateway support in Ceph is here! It's now possible to create
  scale-out ("active-active") NFS gateway clusters that export CephFS using
  a few commands. The gateways are deployed via cephadm (or Rook, in the future).
  For more information, see :ref:`cephfs-nfs`.

* Multiple active MDS file system scrub is now stable. It is no longer necessary
  to set ``max_mds`` to 1 and wait for non-zero ranks to stop. Scrub commands
  can only be sent to rank 0: ``ceph tell mds.<fs_name>:0 scrub start /path ...``.
  For more information, see :ref:`mds-scrub`.

* Ephemeral pinning -- policy based subtree pinning -- is considered stable.
  ``mds_export_ephemeral_random`` and ``mds_export_ephemeral_distributed`` now
  default to true. For more information, see :ref:`cephfs-ephemeral-pinning`.

* A new ``cephfs-mirror`` daemon is available to mirror CephFS file systems to
  a remote Ceph cluster. For more information, see :ref:`cephfs-mirroring`.

* A Windows client is now available for connecting to CephFS. This is offered
  through a new ``ceph-dokan`` utility which operates via the Dokan userspace
  API, similar to FUSE. For more information, see :ref:`ceph-dokan`.

Upgrading from Octopus or Nautilus
----------------------------------

Before starting, make sure your cluster is stable and healthy (no down or
recovering OSDs). (This is optional, but recommended.)

Upgrading cephadm clusters
~~~~~~~~~~~~~~~~~~~~~~~~~~

If your cluster is deployed with cephadm (first introduced in Octopus), then
the upgrade process is entirely automated. To initiate the upgrade,

.. prompt:: bash #

   ceph orch upgrade start --ceph-version 16.2.0

The same process is used to upgrade to future minor releases.

Upgrade progress can be monitored with ``ceph -s`` (which provides a simple
progress bar) or more verbosely with

.. prompt:: bash #

   ceph -W cephadm

The upgrade can be paused or resumed with

.. prompt:: bash #

   ceph orch upgrade pause   # to pause
   ceph orch upgrade resume  # to resume

or canceled with

.. prompt:: bash #

   ceph orch upgrade stop

Note that canceling the upgrade simply stops the process; there is no ability to
downgrade back to Octopus.

Upgrading non-cephadm clusters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. note::
   If your cluster is running Octopus (15.2.x), you might choose
   to first convert it to use cephadm so that the upgrade to Pacific
   is automated (see above). For more information, see
   :ref:`cephadm-adoption`.

#. Set the ``noout`` flag for the duration of the upgrade. (Optional,
   but recommended.)::

     # ceph osd set noout

#. Upgrade monitors by installing the new packages and restarting the
   monitor daemons. For example, on each monitor host,::

     # systemctl restart ceph-mon.target

   Once all monitors are up, verify that the monitor upgrade is
   complete by looking for the ``pacific`` string in the mon
   map. The command::

     # ceph mon dump | grep min_mon_release

   should report::

     min_mon_release 16 (pacific)

   If it doesn't, that implies that one or more monitors haven't been
   upgraded and restarted and/or the quorum does not include all monitors.

#. Upgrade ``ceph-mgr`` daemons by installing the new packages and
   restarting all manager daemons. For example, on each manager host,::

     # systemctl restart ceph-mgr.target

   Verify the ``ceph-mgr`` daemons are running by checking ``ceph
   -s``::

     # ceph -s

     ...
       services:
        mon: 3 daemons, quorum foo,bar,baz
        mgr: foo(active), standbys: bar, baz
     ...

#. Upgrade all OSDs by installing the new packages and restarting the
   ceph-osd daemons on all OSD hosts::

     # systemctl restart ceph-osd.target

   Note that if you are upgrading from Nautilus, the first time each
   OSD starts, it will do a format conversion to improve the
   accounting for "omap" data. This may take a few minutes to as much
   as a few hours (for an HDD with lots of omap data). You can
   disable this automatic conversion with::

     # ceph config set osd bluestore_fsck_quick_fix_on_mount false

   You can monitor the progress of the OSD upgrades with the
   ``ceph versions`` or ``ceph osd versions`` commands::

     # ceph osd versions
     {
        "ceph version 14.2.5 (...) nautilus (stable)": 12,
        "ceph version 16.2.0 (...) pacific (stable)": 22
     }

#. Upgrade all CephFS MDS daemons. For each CephFS file system,

   #. Disable standby_replay::

        # ceph fs set <fs_name> allow_standby_replay false

   #. Reduce the number of ranks to 1. (Make note of the original
      number of MDS daemons first if you plan to restore it later.)::

        # ceph status
        # ceph fs set <fs_name> max_mds 1

   #. Wait for the cluster to deactivate any non-zero ranks by
      periodically checking the status::

        # ceph status

   #. Take all standby MDS daemons offline on the appropriate hosts with::

        # systemctl stop ceph-mds@<daemon_name>

   #. Confirm that only one MDS is online and is rank 0 for your FS::

        # ceph status

   #. Upgrade the last remaining MDS daemon by installing the new
      packages and restarting the daemon::

        # systemctl restart ceph-mds.target

   #. Restart all standby MDS daemons that were taken offline::

        # systemctl start ceph-mds.target

   #. Restore the original value of ``max_mds`` for the volume::

        # ceph fs set <fs_name> max_mds <original_max_mds>

#. Upgrade all radosgw daemons by upgrading packages and restarting
   daemons on all hosts::

     # systemctl restart ceph-radosgw.target

#. Complete the upgrade by disallowing pre-Pacific OSDs and enabling
   all new Pacific-only functionality::

     # ceph osd require-osd-release pacific

#. If you set ``noout`` at the beginning, be sure to clear it with::

     # ceph osd unset noout

#. Consider transitioning your cluster to use the cephadm deployment
   and orchestration framework to simplify cluster management and
   future upgrades. For more information on converting an existing
   cluster to cephadm, see :ref:`cephadm-adoption`.
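
The two verification checks in the procedure above (``min_mon_release`` in the
mon map, and the per-version OSD counts) can be scripted. The following is a
minimal sketch, not an official tool: it assumes you have already captured the
text output of ``ceph mon dump`` and ``ceph osd versions``, and the helper
names ``mon_release_ok`` and ``osds_not_on_release`` are hypothetical.

```python
import json
import re


def mon_release_ok(mon_dump_text, expected="16 (pacific)"):
    """Return True if the mon map text reports the expected min_mon_release."""
    match = re.search(r"^min_mon_release\s+(.+)$", mon_dump_text, re.MULTILINE)
    return match is not None and match.group(1).strip() == expected


def osds_not_on_release(osd_versions_json, release="pacific"):
    """Count OSDs whose version string does not name the given release.

    Expects the JSON object printed by `ceph osd versions`, which maps a
    version string to the number of OSDs running it.
    """
    versions = json.loads(osd_versions_json)
    return sum(
        count for name, count in versions.items()
        if f" {release} " not in name
    )


if __name__ == "__main__":
    sample = (
        '{"ceph version 14.2.5 (...) nautilus (stable)": 12,'
        ' "ceph version 16.2.0 (...) pacific (stable)": 22}'
    )
    print("OSDs still to upgrade:", osds_not_on_release(sample))
```

A result of zero from ``osds_not_on_release`` suggests it is reasonable to move
on to ``ceph osd require-osd-release pacific``; anything else means some OSDs
still need to be upgraded and restarted.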

Post-upgrade
~~~~~~~~~~~~

#. Verify the cluster is healthy with ``ceph health``.

   If your CRUSH tunables are older than Hammer, Ceph will now issue a
   health warning. If you see a health alert to that effect, you can
   revert this change with::

     ceph config set mon mon_crush_min_required_version firefly

   If Ceph does not complain, however, then we recommend you also
   switch any existing CRUSH buckets to straw2, which was added back
   in the Hammer release. If you have any 'straw' buckets, this will
   result in a modest amount of data movement, but generally nothing
   too severe::

     ceph osd getcrushmap -o backup-crushmap
     ceph osd crush set-all-straw-buckets-to-straw2

   If there are problems, you can easily revert with::

     ceph osd setcrushmap -i backup-crushmap

   Moving to 'straw2' buckets will unlock a few recent features, like
   the ``crush-compat`` :ref:`balancer <balancer>` mode added back in Luminous.

#. If you did not already do so when upgrading from Mimic, we
   recommend you enable the new :ref:`v2 network protocol <msgr2>`
   by issuing the following command::

     ceph mon enable-msgr2

   This will instruct all monitors that bind to the old default port
   6789 for the legacy v1 protocol to also bind to the new 3300 v2
   protocol port. To see if all monitors have been updated,::

     ceph mon dump

   and verify that each monitor has both a ``v2:`` and ``v1:`` address
   listed.

#. Consider enabling the :ref:`telemetry module <telemetry>` to send
   anonymized usage statistics and crash information to the Ceph
   upstream developers. To see what would be reported (without actually
   sending any information to anyone),::

     ceph mgr module enable telemetry
     ceph telemetry show

   If you are comfortable with the data that is reported, you can opt in to
   automatically report the high-level cluster metadata with::

     ceph telemetry on

   The public dashboard that aggregates Ceph telemetry can be found at
   `https://telemetry-public.ceph.com/ <https://telemetry-public.ceph.com/>`_.

   For more information about the telemetry module, see :ref:`the
   documentation <telemetry>`.
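
The msgr2 check above (each monitor listing both a ``v2:`` and a ``v1:``
address) can also be automated. This is a rough sketch under stated
assumptions: the helper name ``monitors_missing_msgr2`` is hypothetical, and
the sample line layout (``<rank>: <addresses> mon.<name>``) is an illustrative
approximation of ``ceph mon dump`` output rather than a guaranteed format.

```python
def monitors_missing_msgr2(mon_dump_text):
    """Return the names of monitors lacking a v2: or a v1: address.

    Assumes each monitor appears on one line that ends in "mon.<name>"
    and carries its addresses earlier on the same line.
    """
    missing = []
    for line in mon_dump_text.splitlines():
        if " mon." not in line:
            continue  # header lines such as "epoch 3" are skipped
        name = line.rsplit(" mon.", 1)[1].strip()
        if "v2:" not in line or "v1:" not in line:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Illustrative sample only; addresses and names are made up.
    sample = (
        "epoch 3\n"
        "min_mon_release 16 (pacific)\n"
        "0: [v2:10.0.0.1:3300/0,v1:10.0.0.1:6789/0] mon.a\n"
        "1: v1:10.0.0.2:6789/0 mon.b\n"
    )
    print("monitors without both addresses:", monitors_missing_msgr2(sample))
```

An empty list suggests every monitor is bound to both protocol ports; any
names returned are monitors worth inspecting by hand.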

Upgrade from pre-Nautilus releases (like Mimic or Luminous)
-----------------------------------------------------------

You must first upgrade to Nautilus (14.2.z) or Octopus (15.2.z) before
upgrading to Pacific.

Notable Changes
---------------

* A new library is available, libcephsqlite. It provides a SQLite Virtual File
  System (VFS) on top of RADOS. The database and journals are striped over
  RADOS across multiple objects for virtually unlimited scaling and throughput,
  limited only by the SQLite client. Applications using SQLite may change to
  the Ceph VFS with minimal changes, usually just by specifying the alternate
  VFS. We expect the library to be most impactful and useful for applications
  that were storing state in RADOS omap, especially without striping, which
  limits scalability.

* New ``bluestore_rocksdb_options_annex`` config parameter. Complements
  ``bluestore_rocksdb_options`` and allows setting rocksdb options without
  repeating the existing defaults.

* ``$pid`` expansion in config paths like ``admin_socket`` will now properly expand
  to the daemon pid for commands like ``ceph-mds`` or ``ceph-osd``. Previously
  only ``ceph-fuse``/``rbd-nbd`` expanded ``$pid`` with the actual daemon pid.

* The allowable options for some ``radosgw-admin`` commands have been changed.

  * ``mdlog-list``, ``datalog-list``, ``sync-error-list`` no longer accept
    start and end dates, but do accept a single optional start marker.
  * ``mdlog-trim``, ``datalog-trim``, ``sync-error-trim`` only accept a
    single marker giving the end of the trimmed range.
  * Similarly the date ranges and marker ranges have been removed on
    the RESTful DATALog and MDLog list and trim operations.

* ceph-volume: The ``lvm batch`` subcommand received a major rewrite. This
  closes a number of bugs and improves usability in terms of size specification
  and calculation, as well as idempotency behaviour and the disk replacement
  process.
  Please refer to https://docs.ceph.com/en/latest/ceph-volume/lvm/batch/ for
  more detailed information.

* Configuration variables for permitted scrub times have changed. The legal
  values for ``osd_scrub_begin_hour`` and ``osd_scrub_end_hour`` are 0-23.
  The use of 24 is now illegal. Specifying ``0`` for both values causes every
  hour to be allowed. The legal values for ``osd_scrub_begin_week_day`` and
  ``osd_scrub_end_week_day`` are 0-6. The use of 7 is now illegal.
  Specifying ``0`` for both values causes every day of the week to be allowed.

* volume/nfs: The "ganesha-" prefix was recently removed from the cluster id and
  the nfs-ganesha common config object, to ensure a consistent namespace across
  different orchestrator backends. Please delete any existing nfs-ganesha clusters
  prior to upgrading and redeploy new clusters after upgrading to Pacific.

* A new health check, DAEMON_OLD_VERSION, will warn if different versions of Ceph are running
  on daemons. It will generate a health error if multiple versions are detected.
  This condition must exist for longer than ``mon_warn_older_version_delay`` (set to 1 week
  by default) in order for the health condition to be triggered. This allows most
  upgrades to proceed without falsely seeing the warning. If the upgrade is paused
  for an extended time period, the health check can be muted with
  ``ceph health mute DAEMON_OLD_VERSION --sticky``. After the upgrade has
  finished, unmute it with ``ceph health unmute DAEMON_OLD_VERSION``.

* MGR: progress module can now be turned on/off, using the commands:
  ``ceph progress on`` and ``ceph progress off``.

* An AWS-compliant API, "GetTopicAttributes", was added to replace the existing "GetTopic" API. The new API
  should be used to fetch information about topics used for bucket notifications.

* librbd: The ``immutable_object_cache_watermark`` config option of the shared, read-only
  parent cache has been updated to properly reflect the upper cache utilization before
  space is reclaimed. The default ``immutable_object_cache_watermark`` is now ``0.9``.
  If the capacity reaches 90%, the daemon will delete cold cache.

* OSD: the option ``osd_fast_shutdown_notify_mon`` has been introduced to allow
  the OSD to notify the monitor it is shutting down even if ``osd_fast_shutdown``
  is enabled. This helps with the monitor logs on larger clusters, which may
  otherwise get many 'osd.X reported immediately failed by osd.Y' messages
  that confuse tools.

* The mclock scheduler has been refined. A set of built-in profiles is now available that
  provides QoS between the internal and external clients of Ceph. To enable the mclock
  scheduler, set the config option ``osd_op_queue`` to ``mclock_scheduler``. The
  "high_client_ops" profile is enabled by default, and allocates more OSD bandwidth to
  external client operations than to internal client operations (such as background recovery
  and scrubs). Other built-in profiles include "high_recovery_ops" and "balanced". These
  built-in profiles optimize the QoS provided to clients of the mclock scheduler.

* The balancer is now on by default in upmap mode. Since upmap mode requires
  ``require_min_compat_client`` luminous, new clusters will only support luminous
  and newer clients by default. Existing clusters can enable upmap support by running
  ``ceph osd set-require-min-compat-client luminous``. It is still possible to turn
  the balancer off using the ``ceph balancer off`` command. In earlier versions,
  the balancer was included in the ``always_on_modules`` list, but needed to be
  turned on explicitly using the ``ceph balancer on`` command.

* Version 2 of the cephx authentication protocol (``CEPHX_V2`` feature bit) is
  now required by default. It was introduced in 2018, adding replay attack
  protection for authorizers and making msgr v1 message signatures stronger
  (CVE-2018-1128 and CVE-2018-1129). Support is present in Jewel 10.2.11,
  Luminous 12.2.6, Mimic 13.2.1, Nautilus 14.2.0 and later; upstream kernels
  4.9.150, 4.14.86, 4.19 and later; various distribution kernels, in particular
  CentOS 7.6 and later. To enable older clients, set ``cephx_require_version``
  and ``cephx_service_require_version`` config options to 1.

* ``blacklist`` has been replaced with ``blocklist`` throughout. The following commands have changed:

  - ``ceph osd blacklist ...`` is now ``ceph osd blocklist ...``
  - ``ceph <tell|daemon> osd.<NNN> dump_blacklist`` is now ``ceph <tell|daemon> osd.<NNN> dump_blocklist``

* The following config options have changed:

  - ``mon osd blacklist default expire`` is now ``mon osd blocklist default expire``
  - ``mon mds blacklist interval`` is now ``mon mds blocklist interval``
  - ``mon mgr blacklist interval`` is now ``mon mgr blocklist interval``
  - ``rbd blacklist on break lock`` is now ``rbd blocklist on break lock``
  - ``rbd blacklist expire seconds`` is now ``rbd blocklist expire seconds``
  - ``mds session blacklist on timeout`` is now ``mds session blocklist on timeout``
  - ``mds session blacklist on evict`` is now ``mds session blocklist on evict``

* The following librados API calls have changed:

  - ``rados_blacklist_add`` is now ``rados_blocklist_add``; the former will issue a deprecation warning and be removed in a future release.
  - ``rados.blacklist_add`` is now ``rados.blocklist_add`` in the C++ API.

* The JSON output for the following commands now shows ``blocklist`` instead of ``blacklist``:

  - ``ceph osd dump``
  - ``ceph <tell|daemon> osd.<N> dump_blocklist``

* Monitors now have the config option ``mon_allow_pool_size_one``, which is disabled
  by default. Even when it is enabled, users must pass the
  ``--yes-i-really-mean-it`` flag to ``osd pool set size 1`` if they are really
  sure they want to configure a pool size of 1.

* ``ceph pg #.# list_unfound`` output has been enhanced to provide
  might_have_unfound information which indicates which OSDs may
  contain the unfound objects.

* OSD: A new configuration option ``osd_compact_on_start`` has been added which triggers
  an OSD compaction on start. Setting this option to ``true`` and restarting an OSD
  will result in an offline compaction of the OSD prior to booting.

* OSD: the option named ``bdev_nvme_retry_count`` has been removed. SPDK v20.07
  offers no easy access to bdev_nvme options, and this option was hardly used.

* Alpine build-related scripts, documentation and tests have been removed, since
  the most up-to-date APKBUILD script for Ceph is already included in Alpine Linux's
  aports repository.
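
To make the scrub-window rules from the list above concrete
(``osd_scrub_begin_hour`` and ``osd_scrub_end_hour`` must be 0-23, and ``0``
for both allows every hour), here is a minimal sketch. The function names are
hypothetical, and the handling of windows that wrap past midnight, or where
both values are equal but non-zero, is an assumption for illustration and is
not stated in these notes.

```python
def validate_scrub_hours(begin_hour, end_hour):
    """Raise ValueError if either hour is outside the legal 0-23 range."""
    options = (
        ("osd_scrub_begin_hour", begin_hour),
        ("osd_scrub_end_hour", end_hour),
    )
    for name, value in options:
        if not 0 <= value <= 23:
            raise ValueError(f"{name} must be in 0-23, got {value}")


def scrub_hour_allowed(hour, begin_hour=0, end_hour=0):
    """Return True if scrubbing would be permitted at the given hour."""
    validate_scrub_hours(begin_hour, end_hour)
    if begin_hour == 0 and end_hour == 0:
        return True  # documented: 0 for both allows every hour
    if begin_hour < end_hour:
        return begin_hour <= hour < end_hour
    # Assumption: a window such as 22 -> 6 wraps around midnight.
    return hour >= begin_hour or hour < end_hour


if __name__ == "__main__":
    for hour in (3, 12, 23):
        print(hour, scrub_hour_allowed(hour, begin_hour=22, end_hour=6))
```

The same shape of check would apply to ``osd_scrub_begin_week_day`` and
``osd_scrub_end_week_day`` with a 0-6 range.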
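
The ``blacklist`` to ``blocklist`` config option renames listed above lend
themselves to a mechanical rewrite of an existing configuration file. The
sketch below is illustrative only: ``rename_blocklist_options`` is a
hypothetical helper, it only looks at simple ``name = value`` lines, and it
assumes option names are written with spaces or underscores.

```python
# Old option names taken from the list of renamed config options above.
OLD_OPTIONS = {
    "mon osd blacklist default expire",
    "mon mds blacklist interval",
    "mon mgr blacklist interval",
    "rbd blacklist on break lock",
    "rbd blacklist expire seconds",
    "mds session blacklist on timeout",
    "mds session blacklist on evict",
}


def rename_blocklist_options(conf_text):
    """Rename the known old option names in 'name = value' lines.

    Only the option name is touched; values, comments and section
    headers are left exactly as they were.
    """
    lines = []
    for line in conf_text.splitlines():
        key, sep, value = line.partition("=")
        normalized = " ".join(key.replace("_", " ").split())
        if sep and normalized in OLD_OPTIONS:
            line = key.replace("blacklist", "blocklist") + sep + value
        lines.append(line)
    return "\n".join(lines)


if __name__ == "__main__":
    old = "[mon]\nmon_osd_blacklist_default_expire = 3600\n"
    print(rename_blocklist_options(old))
```

Review the rewritten file before deploying it; options set through
``ceph config set`` rather than a file are outside the scope of this sketch.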