ceph/PendingReleaseNotes

14.0.1
------
* ceph pg stat output has been modified in json
  format to match ceph df output:
  - "raw_bytes" field renamed to "total_bytes"
  - "raw_bytes_avail" field renamed to "total_bytes_avail"
  - "raw_bytes_used" field renamed to "total_bytes_raw_used"
  - "total_bytes_used" field added to represent the space (accumulated over
    all OSDs) allocated purely for data objects kept at the block (slow) device
* ceph df [detail] output (GLOBAL section) has been modified in plain
  format:
  - new 'USED' column shows the space (accumulated over all OSDs) allocated
    purely for data objects kept at the block (slow) device.
  - 'RAW USED' is now the sum of 'USED' space and the space allocated/reserved
    at the block device for Ceph purposes, e.g. the BlueFS part for BlueStore.
* ceph df [detail] output (GLOBAL section) has been modified in json
  format:
  - 'total_used_bytes' column now shows the space (accumulated over all OSDs)
    allocated purely for data objects kept at the block (slow) device.
  - new 'total_used_raw_bytes' column shows the sum of 'USED' space and the
    space allocated/reserved at the block device for Ceph purposes, e.g. the
    BlueFS part for BlueStore.
* ceph df [detail] output (POOLS section) has been modified in plain
  format:
  - 'BYTES USED' column renamed to 'STORED'. Represents the amount of data
    stored by the user.
  - 'USED' column now represents the amount of space allocated purely for
    data by all OSD nodes, in KB.
  - 'QUOTA BYTES' and 'QUOTA OBJECTS' are no longer shown in non-detailed mode.
  - new column 'USED COMPR' - amount of space allocated for compressed
    data, i.e. compressed data plus all the allocation, replication and
    erasure coding overhead.
  - new column 'UNDER COMPR' - amount of data passed through compression
    (summed over all replicas) and beneficial enough to be stored in a
    compressed form.
  - some columns have been reordered.
* ceph df [detail] output (POOLS section) has been modified in json
  format:
  - 'bytes used' column renamed to 'stored'. Represents the amount of data
    stored by the user.
  - 'raw bytes used' column renamed to 'stored_raw'. The total of user data
    over all OSDs, excluding degraded.
  - new 'bytes_used' column now represents the amount of space allocated by
    all OSD nodes.
  - 'kb_used' column - the same as 'bytes_used' but in KB.
  - new column 'compress_bytes_used' - amount of space allocated for
    compressed data, i.e. compressed data plus all the allocation,
    replication and erasure coding overhead.
  - new column 'compress_under_bytes' - amount of data passed through
    compression (summed over all replicas) and beneficial enough to be
    stored in a compressed form.
* rados df [detail] output (POOLS section) has been modified in plain
  format:
  - 'USED' column now shows the space (accumulated over all OSDs) allocated
    purely for data objects kept at the block (slow) device.
  - new column 'USED COMPR' - amount of space allocated for compressed
    data, i.e. compressed data plus all the allocation, replication and
    erasure coding overhead.
  - new column 'UNDER COMPR' - amount of data passed through compression
    (summed over all replicas) and beneficial enough to be stored in a
    compressed form.
* rados df [detail] output (POOLS section) has been modified in json
  format:
  - 'size_bytes' and 'size_kb' columns now show the space (accumulated
    over all OSDs) allocated purely for data objects kept at the block
    device.
  - new column 'compress_bytes_used' - amount of space allocated for
    compressed data, i.e. compressed data plus all the allocation,
    replication and erasure coding overhead.
  - new column 'compress_under_bytes' - amount of data passed through
    compression (summed over all replicas) and beneficial enough to be
    stored in a compressed form.
* ceph pg dump output (totals section) has been modified in json
  format:
  - new 'USED' column shows the space (accumulated over all OSDs) allocated
    purely for data objects kept at the block (slow) device.
  - 'USED_RAW' is now the sum of 'USED' space and the space allocated/reserved
    at the block device for Ceph purposes, e.g. the BlueFS part for BlueStore.
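  The reshaped output described in the items above can be inspected with
  the usual commands, for example::

    ceph df detail
    ceph df detail --format=json-pretty
    rados df
    ceph pg stat --format=json-pretty
    ceph pg dump --format=json-pretty
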
* The 'ceph osd rm' command has been deprecated. Users should use
'ceph osd destroy' or 'ceph osd purge' (but after first confirming it is
safe to do so via the 'ceph osd safe-to-destroy' command).
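  For example, assuming the OSD being removed is osd.123 (the id is only
  illustrative)::

    ceph osd safe-to-destroy 123
    ceph osd purge 123 --yes-i-really-mean-it
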
* The MDS now supports dropping its cache for the purposes of benchmarking::

    ceph tell mds.* cache drop <timeout>

  Note that the MDS cache is cooperatively managed by the clients. It is
  necessary for clients to give up capabilities in order for the MDS to fully
  drop its cache. This is accomplished by asking all clients to trim as many
  caps as possible. The timeout argument to the `cache drop` command controls
  how long the MDS waits for clients to complete trimming caps. This is
  optional and is 0 by default (no timeout). Keep in mind that clients may
  still retain caps for open files, which will prevent the metadata for those
  files from being dropped by both the client and the MDS. (This is an
  equivalent scenario to dropping the Linux page/buffer/inode/dentry caches
  with some processes pinning some inodes/dentries/pages in cache.)
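  For example, to ask every MDS to drop its cache, waiting up to 300 seconds
  for clients to trim their caps (the timeout value is only illustrative)::

    ceph tell mds.* cache drop 300
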
* The mon_health_preluminous_compat and mon_health_preluminous_compat_warning
config options are removed, as the related functionality is more
than two versions old. Any legacy monitoring system expecting Jewel-style
health output will need to be updated to work with Nautilus.
* Nautilus is not supported on any distros still running upstart, so
  upstart-specific files and references have been removed.
* The 'ceph pg <pgid> list_missing' command has been renamed to
'ceph pg <pgid> list_unfound' to better match its behaviour.
* The 'rbd-mirror' daemon can now retrieve remote peer cluster configuration
secrets from the monitor. To use this feature, the 'rbd-mirror' daemon
CephX user for the local cluster must use the 'profile rbd-mirror' mon cap.
The secrets can be set using the 'rbd mirror pool peer add' and
'rbd mirror pool peer set' actions.
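  For example, a typical setup might grant caps to the local 'rbd-mirror'
  daemon user and register the remote peer as follows (the user, pool, and
  cluster names are only illustrative, and the osd cap shown is the usual
  'profile rbd' used by rbd clients)::

    ceph auth caps client.rbd-mirror.local mon 'profile rbd-mirror' osd 'profile rbd'
    rbd mirror pool peer add mypool client.rbd-mirror-peer@remote
    rbd mirror pool peer set mypool <peer-uuid> <config-key> <value>
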
* The `ceph mds deactivate` command is fully obsolete and references to it in
  the docs have been removed or clarified.
* The libcephfs bindings added the ceph_select_filesystem function
for use with multiple filesystems.
* The cephfs python bindings now include mount_root and filesystem_name
options in the mount() function.
* erasure-code: adds experimental *Coupled LAYer (CLAY)* erasure code
  support. It reduces network traffic and disk I/O when performing
  recovery.
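  A CLAY profile is created like any other erasure code profile; for example
  (the profile name and parameter values are only illustrative)::

    ceph osd erasure-code-profile set clay-4-2 plugin=clay k=4 m=2 d=5
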
* The 'cache drop' OSD command has been added to drop an OSD's caches:
    - ``ceph tell osd.x cache drop``
* The 'cache status' OSD command has been added to get the cache stats of an
  OSD:
    - ``ceph tell osd.x cache status``
* libcephfs has added several functions that allow a restarted client to
  destroy or reclaim state held by its previous incarnation. These functions
  are intended for NFS servers.
* The `ceph` command line tool now accepts keyword arguments in
the format "--arg=value" or "--arg value".
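  For example, both of the following invocations are accepted (using the
  ``--format`` option purely as an illustration)::

    ceph health --format=json-pretty
    ceph health --format json-pretty
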
* librados::IoCtx::nobjects_begin() and librados::NObjectIterator now communicate
errors by throwing a std::system_error exception instead of std::runtime_error.
* The callback function passed to LibRGWFS.readdir() now accepts a ``flags``
  parameter. It will be the last parameter passed to the ``readdir()`` method.

>=13.1.0
--------
* The Telegraf module for the Manager allows sending statistics to a
  Telegraf agent over TCP, UDP or a UNIX socket. Telegraf can then send the
  statistics to databases like InfluxDB, Elasticsearch, Graphite and many
  more.
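  As with other manager modules, it must first be enabled, e.g.::

    ceph mgr module enable telegraf
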
* The graylog fields naming the originator of a log event have
changed: the string-form name is now included (e.g., ``"name":
"mgr.foo"``), and the rank-form name is now in a nested section
(e.g., ``"rank": {"type": "mgr", "num": 43243}``).
* If the cluster log is directed at syslog, the entries are now
prefixed by both the string-form name and the rank-form name (e.g.,
``mgr.x mgr.12345 ...`` instead of just ``mgr.12345 ...``).
* The JSON output of the ``osd find`` command has replaced the ``ip``
field with an ``addrs`` section to reflect that OSDs may bind to
multiple addresses.
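  For example (the OSD id is illustrative)::

    ceph osd find 0 --format=json-pretty
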
* CephFS clients without the 's' flag in their authentication capability
string will no longer be able to create/delete snapshots. To allow
``client.foo`` to create/delete snapshots in the ``bar`` directory of
filesystem ``cephfs_a``, use the command:
- ``ceph auth caps client.foo mon 'allow r' osd 'allow rw tag cephfs data=cephfs_a' mds 'allow rw, allow rws path=/bar'``
* The ``osd_heartbeat_addr`` option has been removed as it served no
(good) purpose: the OSD should always check heartbeats on both the
public and cluster networks.
* The ``rados`` tool's ``mkpool`` and ``rmpool`` commands have been
removed because they are redundant; please use the ``ceph osd pool
create`` and ``ceph osd pool rm`` commands instead.
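  For example (the pool name and PG count are illustrative)::

    ceph osd pool create mypool 64
    ceph osd pool rm mypool mypool --yes-i-really-really-mean-it
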
* The ``auid`` property for cephx users and RADOS pools has been
  removed. This was an undocumented and partially implemented
  capability that allowed cephx users to map capabilities to RADOS
  pools that they "owned". Because there are no users of this feature,
  we have removed this support. If any cephx capabilities exist in the
  cluster that restrict based on auid then they will no longer parse,
  and the cluster will report a health warning like::

    AUTH_BAD_CAPS 1 auth entities have invalid capabilities
        client.bad osd capability parse failed, stopped at 'allow rwx auid 123' of 'allow rwx auid 123'

  The capability can be adjusted with the ``ceph auth caps`` command. For
  example::

    ceph auth caps client.bad osd 'allow rwx pool foo'

* The ``ceph-kvstore-tool`` ``repair`` command has been renamed
``destructive-repair`` since we have discovered it can corrupt an
otherwise healthy rocksdb database. It should be used only as a last-ditch
attempt to recover data from an otherwise corrupted store.
* The default memory utilization for the mons has been increased
somewhat. Rocksdb now uses 512 MB of RAM by default, which should
be sufficient for small to medium-sized clusters; large clusters
should tune this up. Also, the ``mon_osd_cache_size`` has been
increased from 10 OSDMaps to 500, which will translate to an
additional 500 MB to 1 GB of RAM for large clusters, and much less
for small clusters.
* The ``mgr/balancer/max_misplaced`` option has been replaced by a new
global ``target_max_misplaced_ratio`` option that throttles both
balancer activity and automated adjustments to ``pgp_num`` (normally as a
result of ``pg_num`` changes). If you have customized the balancer module
option, you will need to adjust your config to set the new global option
or revert to the default of .05 (5%).
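  For example, one way to relax the threshold cluster-wide (the value shown
  is only illustrative)::

    ceph config set global target_max_misplaced_ratio 0.07
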
* By default, Ceph no longer issues a health warning when there are
misplaced objects (objects that are fully replicated but not stored
on the intended OSDs). You can reenable the old warning by setting
``mon_warn_on_misplaced`` to ``true``.
* The ``ceph-create-keys`` tool is now obsolete. The monitors
automatically create these keys on their own. For now the script
prints a warning message and exits, but it will be removed in the
next release. Note that ``ceph-create-keys`` would also write the
admin and bootstrap keys to /etc/ceph and /var/lib/ceph, but this
script no longer does that. Any deployment tools that relied on
this behavior should instead make use of the ``ceph auth export
<entity-name>`` command for whichever key(s) they need.
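  For example, a deployment tool that previously relied on the generated
  OSD bootstrap key can fetch it with::

    ceph auth export client.bootstrap-osd
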
* The ``mon_osd_pool_ec_fast_read`` option has been renamed
``osd_pool_default_ec_fast_read`` to be more consistent with other
``osd_pool_default_*`` options that affect default values for newly
created RADOS pools.
* The ``mon addr`` configuration option is now deprecated. It can
still be used to specify an address for each monitor in the
``ceph.conf`` file, but it only affects cluster creation and
bootstrapping, and it does not support listing multiple addresses
(e.g., both a v2 and v1 protocol address). We strongly recommend
the option be removed and instead a single ``mon host`` option be
specified in the ``[global]`` section to allow daemons and clients
to discover the monitors.
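  For example, a minimal ``[global]`` section might contain (the addresses
  are illustrative)::

    [global]
        mon host = 10.0.0.1,10.0.0.2,10.0.0.3
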
Upgrading from Luminous
-----------------------
* During the upgrade from luminous to nautilus, it will not be possible to create
a new OSD using a luminous ceph-osd daemon after the monitors have been
upgraded to nautilus.