ceph/PendingReleaseNotes

>=20.0.0

* RBD: All Python APIs that produce timestamps now return "aware" `datetime`
  objects (i.e. those that include time zone information) instead of "naive"
  ones (those that do not).  All timestamps remain in UTC, but including
  `timezone.utc` makes this explicit and avoids the risk of the returned
  timestamp being misinterpreted -- in Python 3, many `datetime` methods
  treat "naive" `datetime` objects as local times.
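
As a rough illustration of the pitfall described above, the sketch below shows
how a caller might normalize both kinds of value before converting to epoch
seconds. It uses only the standard library, not the RBD bindings; the helper
name `to_epoch_seconds` and the sample timestamp are made up for this example.

```python
from datetime import datetime, timezone


def to_epoch_seconds(ts: datetime) -> float:
    """Convert a UTC timestamp to seconds since the epoch.

    An aware datetime converts unambiguously.  A naive one is assumed to be
    UTC and is tagged explicitly, because calling .timestamp() on a naive
    datetime would interpret it as local time.
    """
    if ts.tzinfo is None:
        ts = ts.replace(tzinfo=timezone.utc)
    return ts.timestamp()


# Hypothetical value standing in for a timestamp returned by the bindings.
aware = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
naive = aware.replace(tzinfo=None)

# Both forms describe the same instant once the naive one is tagged as UTC.
print(to_epoch_seconds(aware) == to_epoch_seconds(naive))
```
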
* RBD: `rbd group info` and `rbd group snap info` commands are introduced to
  show information about a group and a group snapshot respectively.
* RBD: `rbd group snap ls` output now includes the group snapshot IDs. The header
  of the column showing the state of a group snapshot in the unformatted CLI
  output is changed from 'STATUS' to 'STATE'. The state of a group snapshot
  that was shown as 'ok' is now shown as 'complete', which is more descriptive.
* Based on tests performed at scale on an HDD based Ceph cluster, it was found
  that scheduling with mClock was not optimal with multiple OSD shards. For
  example, in the test cluster with multiple OSD node failures, the client
  throughput was found to be inconsistent across test runs coupled with multiple
  reported slow requests. However, the same test with a single OSD shard and
  with multiple worker threads yielded significantly better results in terms of
  consistency of client and recovery throughput across multiple test runs.
  Therefore, as an interim measure until the issue with multiple OSD shards
  (or multiple mClock queues per OSD) is investigated and fixed, the following
  changes to the default option values have been made:
   - osd_op_num_shards_hdd = 1 (was 5)
   - osd_op_num_threads_per_shard_hdd = 5 (was 1)
  For more details see https://tracker.ceph.com/issues/66289.
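
A small arithmetic sketch of the change above, assuming the number of OSD op
worker threads is the product of the two options (the helper name is
hypothetical): the new HDD defaults keep the same thread count but funnel all
of it through a single shard.

```python
def total_op_threads(num_shards: int, threads_per_shard: int) -> int:
    """Worker threads available across all shards (assumed to be the product)."""
    return num_shards * threads_per_shard


# Previous HDD defaults: 5 shards with 1 thread each.
old_hdd = total_op_threads(num_shards=5, threads_per_shard=1)
# New HDD defaults: 1 shard with 5 threads.
new_hdd = total_op_threads(num_shards=1, threads_per_shard=5)

print(old_hdd, new_hdd)
```
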
* MGR: The Ceph Manager's always-on modules/plugins can now be force-disabled.
  This can be necessary in cases where we wish to prevent the manager from being
  flooded by module commands when Ceph services are down or degraded.

* CephFS: Modifying the setting "max_mds" when a cluster is
  unhealthy now requires users to pass the confirmation flag
  (--yes-i-really-mean-it). This has been added as a precaution to tell the
  users that modifying "max_mds" may not help with troubleshooting or recovery
  effort. Instead, it might further destabilize the cluster.

* mgr/restful, mgr/zabbix: both modules, deprecated since 2020, have finally
  been removed. They have not been actively maintained in recent years and
  have started suffering from vulnerabilities in their dependency chain (e.g.
  CVE-2023-46136).  As alternatives, for the `restful` module, the `dashboard` module
  provides a richer and better maintained RESTful API. Regarding the `zabbix` module,
  there are alternative monitoring solutions, like `prometheus`, which is the most
  widely adopted among the Ceph user community.

* CephFS: EOPNOTSUPP (Operation not supported) is now returned by the CephFS
  fuse client for `fallocate` for the default case (i.e. mode == 0) since
  CephFS does not support disk space reservation. The only flags supported are
  `FALLOC_FL_KEEP_SIZE` and `FALLOC_FL_PUNCH_HOLE`.

>=19.0.0

* cephx: key rotation is now possible using `ceph auth rotate`. Previously,
  this was only possible by deleting and then recreating the key.
* Ceph: a new --daemon-output-file switch is available for `ceph tell` commands
  to dump output to a file local to the daemon. For commands which produce
  large amounts of output, this avoids a potential spike in memory usage on the
  daemon, allows for faster streaming writes to a file local to the daemon, and
  reduces time holding any locks required to execute the command. For analysis,
  it is necessary to retrieve the file from the host running the daemon
  manually. Currently, only --format=json|json-pretty are supported.
* RGW: GetObject and HeadObject requests now return an x-rgw-replicated-at
  header for replicated objects. This timestamp can be compared against the
  Last-Modified header to determine how long the object took to replicate.
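
A minimal sketch of the comparison described above, assuming both headers
carry HTTP-date strings (the format of `x-rgw-replicated-at` is an assumption
here) and that the response headers are already available as a dict. The
helper name and the sample values are made up.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime


def replication_delay(headers: dict) -> timedelta:
    """Return how long after Last-Modified the object was replicated."""
    modified = parsedate_to_datetime(headers["Last-Modified"])
    replicated = parsedate_to_datetime(headers["x-rgw-replicated-at"])
    return replicated - modified


# Made-up header values for illustration only.
sample = {
    "Last-Modified": "Mon, 01 Jan 2024 12:00:00 GMT",
    "x-rgw-replicated-at": "Mon, 01 Jan 2024 12:00:07 GMT",
}
print(replication_delay(sample).total_seconds())
```
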
* The cephfs-shell utility is now packaged for RHEL / CentOS / Rocky 9 as required
  Python dependencies are now available in EPEL9.
* RGW: S3 multipart uploads using Server-Side Encryption now replicate correctly in
  multi-site deployments. Previously, replicas of such objects were corrupted on decryption.
  A new tool, ``radosgw-admin bucket resync encrypted multipart``, can be used to
  identify these original multipart uploads. The ``LastModified`` timestamp of any
  identified object is incremented by one ns to cause peer zones to replicate it again.
  For multi-site deployments that make use of Server-Side Encryption, we
  recommend running this command against every bucket in every zone after all
  zones have upgraded.
* Tracing: The blkin tracing feature (see https://docs.ceph.com/en/reef/dev/blkin/)
  is now deprecated in favor of Opentracing (https://docs.ceph.com/en/reef/dev/developer_guide/jaegertracing/)
  and will be removed in a later release.
* RGW: Introducing a new data layout for the Topic metadata associated with S3
  Bucket Notifications, where each Topic is stored as a separate RADOS object
  and the bucket notification configuration is stored in a bucket attribute.
  This new representation supports multisite replication via metadata sync and
  can scale to many topics. This is on by default for new deployments, but is
  not enabled by default on upgrade. Once all radosgws have upgraded (on all
  zones in a multisite configuration), the ``notification_v2`` zone feature can
  be enabled to migrate to the new format. See
  https://docs.ceph.com/en/squid/radosgw/zone-features for details. The "v1"
  format is now considered deprecated and may be removed after 2 major releases.
* CephFS: The MDS now evicts clients that are not advancing their request tids.
  Such clients cause a large buildup of session metadata, which in turn results
  in the MDS going read-only due to RADOS operations exceeding the size
  threshold. The `mds_session_metadata_threshold` config option controls the
  maximum size to which the (encoded) session metadata can grow.
* CephFS: A new "mds last-seen" command is available for querying the last time
  an MDS was in the FSMap, subject to a pruning threshold.
* CephFS: For clusters with multiple CephFS file systems, all snap-schedule
  commands now expect the '--fs' argument.
* CephFS: The period specifier ``m`` now implies minutes and the period specifier
  ``M`` now implies months. This is consistent with the rest of the system.
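
To make the case sensitivity concrete, here is a hedged sketch of a parser
that distinguishes the two specifiers. It is not the actual implementation;
the function name is hypothetical and only ``m`` and ``M`` are modelled.

```python
import re

# Only the two specifiers named in the note are modelled here; any other
# units that may be accepted are deliberately left out of this sketch.
_UNITS = {"m": "minutes", "M": "months"}


def parse_period(spec: str):
    """Split a period such as '15m' or '2M' into (count, unit_name)."""
    match = re.fullmatch(r"(\d+)([A-Za-z])", spec)
    if match is None or match.group(2) not in _UNITS:
        raise ValueError(f"unsupported period specifier: {spec!r}")
    return int(match.group(1)), _UNITS[match.group(2)]


print(parse_period("15m"), parse_period("2M"))
```
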
* RGW: New tools have been added to radosgw-admin for identifying and
  correcting issues with versioned bucket indexes. Historical bugs with the
  versioned bucket index transaction workflow made it possible for the index
  to accumulate extraneous "book-keeping" olh entries and plain placeholder
  entries. In some specific scenarios where clients made concurrent requests
  referencing the same object key, it was likely that extra index
  entries would accumulate. When a significant number of these entries are
  present in a single bucket index shard, they can cause high bucket listing
  latency and lifecycle processing failures. To check whether a versioned
  bucket has unnecessary olh entries, users can now run ``radosgw-admin
  bucket check olh``. If the ``--fix`` flag is used, the extra entries will
  be safely removed. An additional issue is that some versioned buckets
  may maintain extra unlinked objects that are not listable via the S3/Swift
  APIs. These extra objects are typically a result of PUT requests that 
  exited abnormally in the middle of a bucket index transaction, and thus 
  the client would not have received a successful response. Bugs in prior 
  releases made these unlinked objects easy to reproduce with any PUT 
  request made on a bucket that was actively resharding. In certain 
  scenarios, a client of a bucket that was a victim of this bug may find 
  the object associated with the key to be in an inconsistent state. To check 
  whether a versioned bucket has unlinked entries, users can now run 
  ``radosgw-admin bucket check unlinked``. If the ``--fix`` flag is used, 
  the unlinked objects will be safely removed. Finally, a third issue made 
  it possible for versioned bucket index stats to be accounted inaccurately. 
  The tooling for recalculating versioned bucket stats also had a bug, and 
  was not previously capable of fixing these inaccuracies.  This release 
  resolves those issues and users can now expect that the existing 
  ``radosgw-admin bucket check`` command will produce correct results. 
  We recommend that users with versioned buckets, especially those that 
  existed on prior releases, use these new tools to check whether their 
  buckets are affected and to clean them up accordingly.
* RGW: The "user accounts" feature unlocks several new AWS-compatible IAM APIs
  for self-service management of users, keys, groups, roles, policy and
  more. Existing users can be adopted into new accounts. This process is optional
  but irreversible. See https://docs.ceph.com/en/squid/radosgw/account and
  https://docs.ceph.com/en/squid/radosgw/iam for details.
* RGW: On startup, radosgw and radosgw-admin now validate the ``rgw_realm``
  config option. Previously, they would ignore invalid or missing realms and
  go on to load a zone/zonegroup in a different realm. If startup fails with
  a  "failed to load realm" error, fix or remove the ``rgw_realm`` option.
* RGW: The radosgw-admin commands ``realm create`` and ``realm pull`` no
  longer set the default realm without ``--default``.
* CephFS: Running the command "ceph fs authorize" for an existing entity now
  upgrades the entity's capabilities instead of printing an error. It can now
  also change read/write permissions in a capability that the entity already
  holds. If the capability passed by the user is the same as one of the
  capabilities that the entity already holds, idempotency is maintained.
* CephFS: Two FS names can now be swapped, optionally along with their IDs,
  using the "ceph fs swap" command. The function of this API is to facilitate
  file system swaps for disaster recovery. In particular, it avoids situations
  where a named file system is temporarily missing, which would prompt a higher
  level storage operator (like Rook) to recreate the missing file system.
  See https://docs.ceph.com/en/latest/cephfs/administration/#file-systems
  for more information.
* CephFS: Before running the command "ceph fs rename", the file system to be
  renamed must be offline and the config "refuse_client_session" must be set
  for it. The config "refuse_client_session" can be removed/unset and the
  file system can be brought back online after the rename operation is complete.
* RADOS: A POOL_APP_NOT_ENABLED health warning will now be reported if
  the application is not enabled for the pool, irrespective of whether
  the pool is in use or not. Always tag a pool with an application
  using the ``ceph osd pool application enable`` command to avoid the
  POOL_APP_NOT_ENABLED health warning being reported for that pool.
  Users can temporarily mute this warning using
  ``ceph health mute POOL_APP_NOT_ENABLED``.
* The `mon_cluster_log_file_level` and `mon_cluster_log_to_syslog_level` options
  have been removed. Henceforth, users should use the new generic option
  `mon_cluster_log_level` to control the cluster log level verbosity for the cluster
  log file as well as for all external entities.
* CephFS: Disallow delegating preallocated inode ranges to clients. Config
  `mds_client_delegate_inos_pct` defaults to 0 which disables async dirops
  in the kclient.
* S3 Get/HeadObject now support query parameter `partNumber` to read a specific
  part of a completed multipart upload.
* RGW: Fixed a S3 Object Lock bug with PutObjectRetention requests that specify
  a RetainUntilDate after the year 2106. This date was truncated to 32 bits when
  stored, so a much earlier date was used for object lock enforcement. This does
  not affect PutBucketObjectLockConfiguration where a duration is given in Days.
  The RetainUntilDate encoding is fixed for new PutObjectRetention requests, but
  cannot repair the dates of existing object locks. Such objects can be identified
  with a HeadObject request based on the x-amz-object-lock-retain-until-date
  response header.
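
The year 2106 boundary follows from the range of an unsigned 32-bit seconds
counter. The sketch below works through that arithmetic; modelling the
truncation as a wraparound modulo 2**32 is an assumption made for
illustration, and the requested date is made up.

```python
from datetime import datetime, timezone

UINT32_RANGE = 2 ** 32  # number of distinct values in an unsigned 32-bit field


def truncate_to_32_bits(when: datetime) -> datetime:
    """Model storing an epoch-seconds value in 32 bits (assumed wraparound)."""
    seconds = int(when.timestamp())
    return datetime.fromtimestamp(seconds % UINT32_RANGE, tz=timezone.utc)


# First instant that no longer fits in an unsigned 32-bit seconds counter.
limit = datetime.fromtimestamp(UINT32_RANGE, tz=timezone.utc)

requested = datetime(2110, 1, 1, tzinfo=timezone.utc)  # made-up RetainUntilDate
stored = truncate_to_32_bits(requested)

# The stored date ends up far earlier than the one that was requested.
print(limit.isoformat(), stored < requested)
```
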
* RADOS: `get_pool_is_selfmanaged_snaps_mode` C++ API has been deprecated
  due to being prone to false negative results.  Its safer replacement is
  `pool_is_in_selfmanaged_snaps_mode`.
* RADOS: For bug 62338 (https://tracker.ceph.com/issues/62338), in order to
  simplify backporting, we chose not to condition the fix on a server flag.  As
  a result, in rare cases it may be possible for a PG to flip between two acting
  sets while an upgrade to a version with the fix is in progress.  If you observe
  this behavior, you should be able to work around it by completing the upgrade or
  by disabling async recovery by setting osd_async_recovery_min_cost to a very
  large value on all OSDs until the upgrade is complete:
  ``ceph config set osd osd_async_recovery_min_cost 1099511627776``
* RADOS: A detailed version of the `balancer status` CLI command in the balancer
  module is now available. Users may run `ceph balancer status detail` to see more
  details about which PGs were updated in the balancer's last optimization.
  See https://docs.ceph.com/en/latest/rados/operations/balancer/ for more information.
* CephFS: Full support for subvolumes and subvolume groups is now available
  for snap_schedule Manager module.
* RGW: The SNS CreateTopic API now enforces the same topic naming requirements as AWS:
  Topic names must be made up of only uppercase and lowercase ASCII letters, numbers,
  underscores, and hyphens, and must be between 1 and 256 characters long.
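
Clients that want to check names before calling CreateTopic could use
something like the following sketch. The pattern is derived directly from the
rule stated above; the function name is hypothetical and this is not RGW's
own validation code.

```python
import re

# ASCII letters, digits, underscores and hyphens, 1 to 256 characters.
_TOPIC_NAME = re.compile(r"[A-Za-z0-9_-]{1,256}")


def is_valid_topic_name(name: str) -> bool:
    """Return True when the whole name satisfies the naming rule."""
    return _TOPIC_NAME.fullmatch(name) is not None


print(is_valid_topic_name("my-topic_1"), is_valid_topic_name("bad.name"))
```
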
* RBD: When diffing against the beginning of time (`fromsnapname == NULL`) in
  fast-diff mode (`whole_object == true` with `fast-diff` image feature enabled
  and valid), diff-iterate is now guaranteed to execute locally if exclusive
  lock is available.  This brings a dramatic performance improvement for QEMU
  live disk synchronization and backup use cases.
* RBD: The ``try-netlink`` mapping option for rbd-nbd has become the default
  and is now deprecated. If the NBD netlink interface is not supported by the
  kernel, then the mapping is retried using the legacy ioctl interface.
* RADOS: Read balancing may now be managed automatically via the balancer
  manager module. Users may choose between two new modes: ``upmap-read``, which
  offers upmap and read optimization simultaneously, or ``read``, which may be used
  to only optimize reads. For more detailed information see https://docs.ceph.com/en/latest/rados/operations/read-balancer/#online-optimization.
* CephFS: MDS log trimming is now driven by a separate thread which tries to
  trim the log every second (`mds_log_trim_upkeep_interval` config). Also,
  a couple of configs govern how much time the MDS spends in trimming its
  logs. These configs are `mds_log_trim_threshold` and `mds_log_trim_decay_rate`.
* RGW: Notification topics are now owned by the user that created them. 
  By default, only the owner can read/write their topics. Topic policy documents
  are now supported to grant these permissions to other users. Preexisting topics
  are treated as if they have no owner, and any user can read/write them using the SNS API. 
  If such a topic is recreated with CreateTopic, the issuing user becomes the new owner.
  For backward compatibility, all users still have permission to publish bucket 
  notifications to topics owned by other users. A new configuration parameter:
  ``rgw_topic_require_publish_policy`` can be enabled to deny ``sns:Publish``
  permissions unless explicitly granted by topic policy.
* RGW: Fixed an issue with persistent notifications: changes made to topic
  parameters while persistent notifications were in the queue are now reflected
  in those notifications. If a user sets up a topic with an incorrect
  configuration (e.g. password or SSL) that causes delivery to the broker to
  fail, they can now modify the incorrect topic attribute, and the new
  configuration will be used on the next delivery retry.
* RBD: The option ``--image-id`` has been added to `rbd children` CLI command,
  so it can be run for images in the trash.
* PG dump: The default output of `ceph pg dump --format json` has changed. The
  default json format produces a rather massive output in large clusters and
  isn't scalable. So we have removed the 'network_ping_times' section from
  the output. Details in the tracker: https://tracker.ceph.com/issues/57460
* mgr/REST: The REST manager module will trim requests based on the 'max_requests' option.
  Without this feature, and in the absence of manual deletion of old requests,
  the accumulation of requests in the array can lead to Out Of Memory (OOM) issues, 
  resulting in the Manager crashing.

* CephFS: The `subvolume snapshot clone` command now depends on the config option
  `snapshot_clone_no_wait` which is used to reject the clone operation when
  all the cloner threads are busy. This config option is enabled by default which means 
  that if no cloner threads are free, the clone request errors out with EAGAIN.
  The value of the config option can be fetched by using:
   `ceph config get mgr mgr/volumes/snapshot_clone_no_wait`
  and it can be disabled by using:
   `ceph config set mgr mgr/volumes/snapshot_clone_no_wait false`
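
Callers that leave the option enabled may want to retry on EAGAIN. The sketch
below shows one way to do that, assuming a caller-supplied callable that
issues the clone request and raises `OSError` with `errno.EAGAIN` on
rejection. Both the callable and the helper name are hypothetical.

```python
import errno
import time


def clone_with_retry(start_clone, attempts=5, delay=1.0, sleep=time.sleep):
    """Call start_clone() until it stops failing with EAGAIN.

    start_clone is a caller-supplied (hypothetical) callable that issues the
    clone request and raises OSError(errno.EAGAIN) when it is rejected.
    """
    for attempt in range(attempts):
        try:
            return start_clone()
        except OSError as exc:
            # Give up on any other error, or once the attempts are exhausted.
            if exc.errno != errno.EAGAIN or attempt == attempts - 1:
                raise
            sleep(delay * (attempt + 1))  # simple linear backoff
```

The `sleep` parameter exists only so the wait can be replaced in tests.
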
* RBD: `RBD_IMAGE_OPTION_CLONE_FORMAT` option has been exposed in Python
  bindings via `clone_format` optional parameter to `clone`, `deep_copy` and
  `migration_prepare` methods.
* RBD: `RBD_IMAGE_OPTION_FLATTEN` option has been exposed in Python bindings via
  `flatten` optional parameter to `deep_copy` and `migration_prepare` methods.

* CephFS: The commands "ceph mds fail" and "ceph fs fail" now require a
  confirmation flag when some MDSs exhibit health warning MDS_TRIM or
  MDS_CACHE_OVERSIZED. This is to prevent accidental MDS failover causing
  further delays in recovery.
* CephFS: fixes to the implementation of the ``root_squash`` mechanism enabled
  via cephx ``mds`` caps on a client credential require a new client feature
  bit, ``client_mds_auth_caps``. Clients using credentials with ``root_squash``
  without this feature will trigger the MDS to raise a HEALTH_ERR on the
  cluster, MDS_CLIENTS_BROKEN_ROOTSQUASH. See the documentation on this warning
  and the new feature bit for more information.
* CephFS: Expanded removexattr support for CephFS virtual extended attributes.
  Previously, one had to use setxattr to restore the default value in order to
  "remove" an attribute. removexattr can now be used directly. The layout on
  the root inode can now also be removed, which restores it to the default layout.

* cls_cxx_gather is marked as deprecated.
* CephFS: cephfs-journal-tool is guarded against running on an online file system.
  The 'cephfs-journal-tool --rank <fs_name>:<mds_rank> journal reset' and
  'cephfs-journal-tool --rank <fs_name>:<mds_rank> journal reset --force'
  commands require '--yes-i-really-really-mean-it'.

* Dashboard: Rearranged Navigation Layout: The navigation layout has been reorganized
  for improved usability and easier access to key features.
* Dashboard: CephFS Improvements
  * Support for managing CephFS snapshots and clones, as well as snapshot schedule
    management
  * Manage authorization capabilities for CephFS resources
  * Helpers on mounting a CephFS volume
* Dashboard: RGW Improvements
  * Support for managing bucket policies
  * Add/Remove bucket tags
  * ACL Management
  * Several UI/UX Improvements to the bucket form
* Monitoring: Grafana dashboards are now loaded into the container at runtime rather than
  building a grafana image with the grafana dashboards. Official Ceph grafana images
  can be found in quay.io/ceph/grafana
* Monitoring: RGW S3 Analytics: A new Grafana dashboard is now available, enabling you to
  visualize per bucket and user analytics data, including total GETs, PUTs, Deletes,
  Copies, and list metrics.
* RBD: `Image::access_timestamp` and `Image::modify_timestamp` Python APIs now
  return timestamps in UTC.
* RBD: Support for cloning from non-user type snapshots is added.  This is
  intended primarily as a building block for cloning new groups from group
  snapshots created with `rbd group snap create` command, but has also been
  exposed via the new `--snap-id` option for `rbd clone` command.
* RBD: The output of `rbd snap ls --all` command now includes the original
  type for trashed snapshots.
* CephFS: "ceph fs clone status" command will now print statistics about clone
  progress in terms of how much data has been cloned (in both percentage as
  well as bytes) and how many files have been cloned.
* CephFS: "ceph status" command will now print a progress bar when cloning is
  ongoing. If there are more clone jobs than cloner threads, it will print one
  more progress bar that shows the total amount of progress made by both ongoing
  and pending clones. Both progress bars are accompanied by messages that
  show the number of clone jobs in the respective categories and the amount of
  progress made by each of them.
* RGW: In bucket notifications, the `principalId` inside `ownerIdentity` now contains
  the complete user ID, prefixed with the tenant ID.

* NFS: The export create/apply of CephFS based exports will now have an
  additional parameter `cmount_path` under the FSAL block, which specifies the
  path within the CephFS to mount this export on. If this and the other
  `EXPORT { FSAL {} }` options are the same between multiple exports, those
  exports will share a single CephFS client. If not specified, the default is `/`.

>=18.0.0

* The RGW policy parser now rejects unknown principals by default. If you are
  mirroring policies between RGW and AWS, you may wish to set
  "rgw policy reject invalid principals" to "false". This affects only newly set
  policies, not policies that are already in place.
* The CephFS automatic metadata load (sometimes called "default") balancer is
  now disabled by default. The new file system flag `balance_automate`
  can be used to toggle it on or off. It can be enabled or disabled via
  `ceph fs set <fs_name> balance_automate <bool>`.
* RGW's default backend for `rgw_enable_ops_log` changed from RADOS to file.
  The default value of `rgw_ops_log_rados` is now false, and `rgw_ops_log_file_path`
  defaults to "/var/log/ceph/ops-log-$cluster-$name.log".
* The SPDK backend for BlueStore is now able to connect to an NVMeoF target.
  Please note that this is not an officially supported feature.
* RGW's pubsub interface now returns boolean fields using bool. Before this change,
  `/topics/<topic-name>` returned "stored_secret" and "persistent" using a string
  of "true" or "false" with quotes around them. After this change, these fields
  are returned without quotes so they can be decoded as boolean values in JSON.
  The same applies to the `is_truncated` field returned by `/subscriptions/<sub-name>`.
* RGW's response of `Action=GetTopicAttributes&TopicArn=<topic-arn>` REST API now
  returns `HasStoredSecret` and `Persistent` as boolean in the JSON string
  encoded in `Attributes/EndPoint`.
* All boolean fields previously rendered as strings by the `radosgw-admin` command
  when the JSON format is used are now rendered as booleans. If your scripts/tools
  rely on the old behavior, please update them accordingly. The impacted field names
  are:
  * absolute
  * add
  * admin
  * appendable
  * bucket_key_enabled
  * delete_marker
  * exists
  * has_bucket_info
  * high_precision_time
  * index
  * is_master
  * is_prefix
  * is_truncated
  * linked
  * log_meta
  * log_op
  * pending_removal
  * read_only
  * retain_head_object
  * rule_exist
  * start_with_full_sync
  * sync_from_all
  * syncstopped
  * system
  * truncated
  * user_stats_sync
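
Scripts that must work against both old and new releases could tolerate
either representation. A minimal sketch follows; the helper name `as_bool` and
the sample JSON documents are made up for illustration.

```python
import json


def as_bool(value) -> bool:
    """Accept both the old string form ("true"/"false") and real booleans."""
    if isinstance(value, bool):
        return value
    if isinstance(value, str) and value.lower() in ("true", "false"):
        return value.lower() == "true"
    raise ValueError(f"not a boolean: {value!r}")


# Made-up fragments showing the old (quoted) and new (unquoted) rendering.
old_style = json.loads('{"is_truncated": "false"}')
new_style = json.loads('{"is_truncated": false}')

print(as_bool(old_style["is_truncated"]) == as_bool(new_style["is_truncated"]))
```
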
* RGW: The beast frontend's HTTP access log line uses a new debug_rgw_access
  configurable. This has the same defaults as debug_rgw, but can now be controlled
  independently.
* RBD: The semantics of compare-and-write C++ API (`Image::compare_and_write`
  and `Image::aio_compare_and_write` methods) now match those of C API.  Both
  compare and write steps operate only on `len` bytes even if the respective
  buffers are larger. The previous behavior of comparing up to the size of
  the compare buffer was prone to subtle breakage upon straddling a stripe
  unit boundary.
* RBD: compare-and-write operation is no longer limited to 512-byte sectors.
  Assuming proper alignment, it now allows operating on stripe units (4M by
  default).
* RBD: New `rbd_aio_compare_and_writev` API method to support scatter/gather
  on both compare and write buffers.  This complements existing `rbd_aio_readv`
  and `rbd_aio_writev` methods.
* The 'AT_NO_ATTR_SYNC' macro is deprecated, please use the standard 'AT_STATX_DONT_SYNC'
  macro. The 'AT_NO_ATTR_SYNC' macro will be removed in the future.
* Trimming of PGLog dups is now controlled by the size instead of the version.
  This fixes the PGLog inflation issue that was happening when the on-line
  (in OSD) trimming got jammed after a PG split operation. Also, a new off-line
  mechanism has been added: `ceph-objectstore-tool` got `trim-pg-log-dups` op
  that targets situations where OSD is unable to boot due to those inflated dups.
  If that is the case, in OSD logs the "You can be hit by THE DUPS BUG" warning
  will be visible.
  Relevant tracker: https://tracker.ceph.com/issues/53729
* RBD: `rbd device unmap` command gained `--namespace` option.  Support for
  namespaces was added to RBD in Nautilus 14.2.0 and it has been possible to
  map and unmap images in namespaces using the `image-spec` syntax since then
  but the corresponding option available in most other commands was missing.
* RGW: Compression is now supported for objects uploaded with Server-Side Encryption.
  When both are enabled, compression is applied before encryption. Earlier releases
  of multisite do not replicate such objects correctly, so all zones must upgrade to
  Reef before enabling the `compress-encrypted` zonegroup feature: see
  https://docs.ceph.com/en/reef/radosgw/multisite/#zone-features and note the
  security considerations.
* RGW: the "pubsub" functionality for storing bucket notifications inside Ceph
  is removed. Together with it, the "pubsub" zone should not be used anymore.
  The REST operations and radosgw-admin commands for manipulating
  subscriptions, as well as those for fetching and acking the notifications,
  are removed as well.
  In case the endpoint to which the notifications are sent may be down or
  disconnected, it is recommended to use persistent notifications to guarantee
  the delivery of the notifications. In case the system that consumes the
  notifications needs to pull them (instead of the notifications being pushed
  to it), an external message bus (e.g. RabbitMQ, Kafka) should be used for
  that purpose.
* RGW: The serialized format of notification and topics has changed, so that 
  new/updated topics will be unreadable by old RGWs. We recommend completing 
  the RGW upgrades before creating or modifying any notification topics.
* RBD: Trailing newline in passphrase files (`<passphrase-file>` argument in
  `rbd encryption format` command and `--encryption-passphrase-file` option
  in other commands) is no longer stripped.
* RBD: Support for layered client-side encryption is added.  Cloned images
  can now be encrypted each with its own encryption format and passphrase,
  potentially different from that of the parent image.  The efficient
  copy-on-write semantics intrinsic to unformatted (regular) cloned images
  are retained.
* CEPHFS: Rename the `mds_max_retries_on_remount_failure` option to
  `client_max_retries_on_remount_failure` and move it from mds.yaml.in to
  mds-client.yaml.in because this option has only ever been used by the MDS
  client.
* The `perf dump` and `perf schema` commands are deprecated in favor of new
  `counter dump` and `counter schema` commands. These new commands add support
  for labeled perf counters and also emit existing unlabeled perf counters. Some
  unlabeled perf counters became labeled in this release, with more to follow in
  future releases; such converted perf counters are no longer emitted by the
  `perf dump` and `perf schema` commands.
* `ceph mgr dump` command now outputs `last_failure_osd_epoch` and
  `active_clients` fields at the top level.  Previously, these fields were
  output under `always_on_modules` field.
* `ceph mgr dump` command now displays the name of the mgr module that
  registered a RADOS client in the `name` field added to elements of the
  `active_clients` array. Previously, only the address of a module's RADOS
  client was shown in the `active_clients` array.
* RBD: All rbd-mirror daemon perf counters became labeled and as such are now
  emitted only by the new `counter dump` and `counter schema` commands.  As part
  of the conversion, many also got renamed to better disambiguate journal-based
  and snapshot-based mirroring.
* RBD: list-watchers C++ API (`Image::list_watchers`) now clears the passed
  `std::list` before potentially appending to it, aligning with the semantics
  of the corresponding C API (`rbd_watchers_list`).
* The rados python binding is now able to process (opt-in) omap keys as bytes
  objects. This enables interacting with RADOS omap keys that are not decodeable as
  UTF-8 strings.
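
The need for bytes keys can be seen with plain Python, independent of the
rados binding: some byte sequences simply have no UTF-8 decoding. The helper
below is a hypothetical illustration, not part of the binding.

```python
def decode_omap_key(raw: bytes):
    """Return the key as str when it is valid UTF-8, otherwise keep the bytes."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw


# 0xff can never appear in valid UTF-8, so this key stays as bytes.
print(decode_omap_key(b"name"), decode_omap_key(b"\xff\x00"))
```
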
* Telemetry: Users who are opted-in to telemetry can also opt-in to
  participating in a leaderboard in the telemetry public
  dashboards (https://telemetry-public.ceph.com/). Users can now also add a
  description of the cluster to publicly appear in the leaderboard.
  For more details, see:
  https://docs.ceph.com/en/latest/mgr/telemetry/#leaderboard
  See a sample report with `ceph telemetry preview`.
  Opt-in to telemetry with `ceph telemetry on`.
  Opt-in to the leaderboard with
  `ceph config set mgr mgr/telemetry/leaderboard true`.
  Add leaderboard description with:
  `ceph config set mgr mgr/telemetry/leaderboard_description 'Cluster description'`.
* CEPHFS: After recovering a Ceph File System by following the disaster recovery
  procedure, the recovered files under the `lost+found` directory can now be deleted.
* core: cache-tiering is now deprecated.
* mClock Scheduler: The mClock scheduler (default scheduler in Quincy) has
  undergone significant usability and design improvements to address the slow
  backfill issue. Some important changes are:
  * The 'balanced' profile is set as the default mClock profile because it
    represents a compromise between prioritizing client IO or recovery IO. Users
    can then choose either the 'high_client_ops' profile to prioritize client IO
    or the 'high_recovery_ops' profile to prioritize recovery IO.
  * QoS parameters like reservation and limit are now specified in terms of a
    fraction (range: 0.0 to 1.0) of the OSD's IOPS capacity.
  * The cost parameters (osd_mclock_cost_per_io_usec_* and
    osd_mclock_cost_per_byte_usec_*) have been removed. The cost of an operation
    is now determined using the random IOPS and maximum sequential bandwidth
    capability of the OSD's underlying device.
  * Degraded object recovery is given higher priority when compared to misplaced
    object recovery because degraded objects present a data safety issue not
    present with objects that are merely misplaced. Therefore, backfilling
    operations with the 'balanced' and 'high_client_ops' mClock profiles may
    progress slower than what was seen with the 'WeightedPriorityQueue' (WPQ)
    scheduler.
  * The QoS allocations in all the mClock profiles are optimized based on the above
    fixes and enhancements.
  * For more detailed information see:
    https://docs.ceph.com/en/latest/rados/configuration/mclock-config-ref/
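
The fraction-based QoS parameters described above can be illustrated with a
minimal sketch. The helper name `fraction_to_iops` and the capacity and
fraction values are hypothetical, chosen only for illustration; they are not
Ceph defaults and this is not how the OSD computes its allocations internally.

```python
def fraction_to_iops(fraction: float, osd_iops_capacity: float) -> float:
    """Convert a QoS fraction of an OSD's IOPS capacity into absolute IOPS."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be within the range 0.0 to 1.0")
    return fraction * osd_iops_capacity


# Hypothetical OSD capacity and allocations (assumptions, not Ceph defaults).
capacity = 300.0
reservation_iops = fraction_to_iops(0.5, capacity)
limit_iops = fraction_to_iops(1.0, capacity)

print(reservation_iops, limit_iops)
```

Expressing reservation and limit as fractions means the same profile scales
with whatever IOPS capacity is measured for each OSD.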
* mgr/snap_schedule: The snap-schedule mgr module now retains one fewer snapshot
  than the number specified by the config tunable `mds_max_snaps_per_dir`
  so that a new snapshot can be created and retained during the next schedule
  run.
* `ceph config dump --format <json|xml>` output now displays localized
  option names instead of their normalized versions. For example,
  "mgr/prometheus/x/server_port" is displayed instead of
  "mgr/prometheus/server_port". This matches the output of the non pretty-print
  formatted version of the command.
* CEPHFS: The MDS config option name "mds_kill_skip_replaying_inotable" is easily
  confused with "mds_inject_skip_replaying_inotable", so it has been renamed to
  "mds_kill_after_journal_logs_flushed".


>=17.2.1

* The "BlueStore zero block detection" feature (first introduced to Quincy in
  https://github.com/ceph/ceph/pull/43337) has been turned off by default with a
  new global configuration called `bluestore_zero_block_detection`. This feature,
  intended for large-scale synthetic testing, does not interact well with some RBD
  and CephFS features. Any side effects experienced in previous Quincy versions
  no longer occur, provided that the configuration remains set to false.
  Relevant tracker: https://tracker.ceph.com/issues/55521

* telemetry: Added new Rook metrics to the 'basic' channel to report Rook's
  version, Kubernetes version, node metrics, etc.
  See a sample report with `ceph telemetry preview`.
  Opt-in with `ceph telemetry on`.

  For more details, see:

  https://docs.ceph.com/en/latest/mgr/telemetry/

* OSD: The issue of high CPU utilization during recovery/backfill operations
  has been fixed. For more details, see: https://tracker.ceph.com/issues/56530.

>=15.2.17

* OSD: Octopus modified the SnapMapper key format from
  <LEGACY_MAPPING_PREFIX><snapid>_<shardid>_<hobject_t::to_str()>
  to
  <MAPPING_PREFIX><pool>_<snapid>_<shardid>_<hobject_t::to_str()>
  When this change was introduced, 94ebe0e also introduced a conversion
  with a crucial bug which essentially destroyed legacy keys by mapping them
  to
  <MAPPING_PREFIX><poolid>_<snapid>_
  without the object-unique suffix. The conversion is fixed in this release.
  Relevant tracker: https://tracker.ceph.com/issues/56147
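
  Why the missing suffix is destructive can be shown with a small sketch. The
  prefix strings, the function names and the plain integer formatting below are
  placeholders for illustration only; this note does not give the real prefix
  values or field encodings, and the sketch is not the OSD's conversion code.

```python
# Placeholder prefixes: this note does not give the real values of
# LEGACY_MAPPING_PREFIX and MAPPING_PREFIX.
LEGACY_MAPPING_PREFIX = "LEGACY_"
MAPPING_PREFIX = "NEW_"


def legacy_key(snapid: int, shardid: int, hobject: str) -> str:
    """Legacy layout: <LEGACY_MAPPING_PREFIX><snapid>_<shardid>_<hobject>."""
    return f"{LEGACY_MAPPING_PREFIX}{snapid}_{shardid}_{hobject}"


def converted_key(pool: int, snapid: int, shardid: int, hobject: str) -> str:
    """Octopus layout: <MAPPING_PREFIX><pool>_<snapid>_<shardid>_<hobject>."""
    return f"{MAPPING_PREFIX}{pool}_{snapid}_{shardid}_{hobject}"


def buggy_converted_key(pool: int, snapid: int) -> str:
    """Faulty conversion result: the object-unique suffix is missing."""
    return f"{MAPPING_PREFIX}{pool}_{snapid}_"


# Two hypothetical objects captured by the same snapshot in the same pool.
objects = [(4, 0, "obj_a"), (4, 0, "obj_b")]
pool = 1

legacy = {legacy_key(*obj) for obj in objects}
buggy = {buggy_converted_key(pool, snapid) for snapid, _, _ in objects}
fixed = {converted_key(pool, *obj) for obj in objects}

print(len(legacy))  # 2: distinct legacy keys
print(len(buggy))   # 1: both objects collapse onto one key
print(len(fixed))   # 2: each object keeps its own key
```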
  
* Cephadm may now be configured to carry out CephFS MDS upgrades without
  reducing ``max_mds`` to 1. Previously, Cephadm would reduce ``max_mds`` to 1 to
  avoid having two active MDS daemons modifying on-disk structures with new versions,
  communicating cross-version-incompatible messages, or other potential
  incompatibilities. This could be disruptive for large-scale CephFS deployments
  because the cluster cannot easily reduce active MDS daemons to 1.
  NOTE: A staggered upgrade of the mons/mgrs may be necessary to take advantage
  of this feature. For instructions, see:
  https://docs.ceph.com/en/quincy/cephadm/upgrade/#staggered-upgrade
  Relevant tracker: https://tracker.ceph.com/issues/55715

* Introduced a new file system flag `refuse_client_session` that can be set using the
  `fs set` command. This flag allows blocking any incoming session
  request from client(s). This can be useful during some recovery situations
  where it's desirable to bring MDS up but have no client workload.
  Relevant tracker: https://tracker.ceph.com/issues/57090

* New MDSMap field `max_xattr_size` which can be set using the `fs set` command.
  This MDSMap field configures the maximum size allowed for the full
  key/value set of a file system's extended attributes.  It effectively replaces
  the old per-MDS `max_xattr_pairs_size` setting, which is now dropped.
  Relevant tracker: https://tracker.ceph.com/issues/55725

* Introduced a new file system flag `refuse_standby_for_another_fs` that can be
  set using the `fs set` command. This flag prevents using a standby for another
  file system (join_fs = X) when standby for the current filesystem is not available.
  Relevant tracker: https://tracker.ceph.com/issues/61599

* mon: Added an NVMe-oF gateway monitor and high availability (HA) support for
  the nvmeof Ceph service. High availability means that even if a certain
  gateway (GW) is down, another path remains available for the initiator to
  continue I/O through another GW. Two new mon commands have also been added to
  notify the monitor about gateway creation and deletion:
   - nvme-gw create
   - nvme-gw delete
  Relevant tracker: https://tracker.ceph.com/issues/64777

>=20.0.0

* RBD: All Python APIs that produce timestamps now return "aware" `datetime`
  objects instead of "naive" ones (i.e. those including time zone information
  instead of those not including it).  All timestamps remain in UTC, but
  including `timezone.utc` makes it explicit and avoids the potential of the
  returned timestamp getting misinterpreted -- in Python 3, many `datetime`
  methods treat "naive" `datetime` objects as local times.
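
  The difference between "naive" and "aware" objects can be seen with the
  standard library alone. This is a minimal sketch: the timestamp value is
  made up and no RBD API is called; `naive` merely stands in for a value
  returned by the older bindings.

```python
from datetime import datetime, timezone

# An "aware" UTC timestamp, as the bindings now return (value is made up).
aware = datetime(2024, 6, 24, 13, 25, 11, tzinfo=timezone.utc)

# A "naive" equivalent, standing in for what older bindings returned.
naive = aware.replace(tzinfo=None)

# naive.timestamp() would interpret the value as local time, so its result
# depends on the host's time zone. Re-attaching UTC removes the ambiguity.
repaired = naive.replace(tzinfo=timezone.utc)

# Ordering comparisons between naive and aware objects raise TypeError, so
# code that mixes the two kinds may need updating.
try:
    naive < aware
    mixed_comparison_ok = True
except TypeError:
    mixed_comparison_ok = False

print(aware.isoformat())    # 2024-06-24T13:25:11+00:00
print(repaired == aware)    # True
print(mixed_comparison_ok)  # False
```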
* RBD: `rbd group info` and `rbd group snap info` commands are introduced to
  show information about a group and a group snapshot respectively.
* RBD: `rbd group snap ls` output now includes the group snapshot IDs. The header
  of the column showing the state of a group snapshot in the unformatted CLI
  output is changed from 'STATUS' to 'STATE'. The state of a group snapshot
  that was shown as 'ok' is now shown as 'complete', which is more descriptive.
* Based on tests performed at scale on an HDD based Ceph cluster, it was found
  that scheduling with mClock was not optimal with multiple OSD shards. For
  example, in the test cluster with multiple OSD node failures, the client
  throughput was found to be inconsistent across test runs coupled with multiple
  reported slow requests. However, the same test with a single OSD shard and
  with multiple worker threads yielded significantly better results in terms of
  consistency of client and recovery throughput across multiple test runs.
  Therefore, as an interim measure until the issue with multiple OSD shards
  (or multiple mClock queues per OSD) is investigated and fixed, the following
  changes to the default option values have been made:
   - osd_op_num_shards_hdd = 1 (was 5)
   - osd_op_num_threads_per_shard_hdd = 5 (was 1)
  For more details see https://tracker.ceph.com/issues/66289.
* MGR: The Ceph Manager's always-on modules/plugins can now be force-disabled.
  This can be necessary in cases where we wish to prevent the manager from being
  flooded by module commands when Ceph services are down or degraded.
* CephFS: Modifying the setting "max_mds" when a cluster is
  unhealthy now requires users to pass the confirmation flag
  (--yes-i-really-mean-it). This has been added as a precaution to tell the
  users that modifying "max_mds" may not help with troubleshooting or recovery
  effort. Instead, it might further destabilize the cluster.
* mgr/restful, mgr/zabbix: both modules, already deprecated since 2020, have been
  finally removed. They have not been actively maintained in recent years,
  and started suffering from vulnerabilities in their dependency chain (e.g.:
  CVE-2023-46136).  As alternatives, for the `restful` module, the `dashboard` module
  provides a richer and better maintained RESTful API. Regarding the `zabbix` module,
  there are alternative monitoring solutions, like `prometheus`, which is the most
  widely adopted among the Ceph user community.
* CephFS: EOPNOTSUPP (Operation not supported) is now returned by the CephFS
  fuse client for `fallocate` for the default case (i.e. mode == 0) since
  CephFS does not support disk space reservation. The only flags supported are
  `FALLOC_FL_KEEP_SIZE` and `FALLOC_FL_PUNCH_HOLE`.

>=19.0.0

* cephx: key rotation is now possible using `ceph auth rotate`. Previously,
  this was only possible by deleting and then recreating the key.
* Ceph: a new --daemon-output-file switch is available for `ceph tell` commands
  to dump output to a file local to the daemon. For commands which produce
  large amounts of output, this avoids a potential spike in memory usage on the
  daemon, allows for faster streaming writes to a file local to the daemon, and
  reduces time holding any locks required to execute the command. For analysis,
  it is necessary to retrieve the file from the host running the daemon
  manually. Currently, only --format=json|json-pretty are supported.
* RGW: GetObject and HeadObject requests now return an x-rgw-replicated-at
  header for replicated objects. This timestamp can be compared against the
  Last-Modified header to determine how long the object took to replicate.
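
  The comparison can be sketched as follows. This sketch assumes that both
  header values are HTTP-date strings; that is an assumption for
  x-rgw-replicated-at, whose format is not stated in this note. The header
  values and the helper name `replication_delay_seconds` are hypothetical.

```python
from email.utils import parsedate_to_datetime


def replication_delay_seconds(headers: dict) -> float:
    """Return seconds elapsed between Last-Modified and x-rgw-replicated-at.

    Assumes both values are HTTP-date strings (an assumption for the
    x-rgw-replicated-at header).
    """
    last_modified = parsedate_to_datetime(headers["Last-Modified"])
    replicated_at = parsedate_to_datetime(headers["x-rgw-replicated-at"])
    return (replicated_at - last_modified).total_seconds()


# Hypothetical response headers from a HeadObject request.
sample_headers = {
    "Last-Modified": "Tue, 13 Feb 2024 16:03:04 GMT",
    "x-rgw-replicated-at": "Tue, 13 Feb 2024 16:03:09 GMT",
}
delay = replication_delay_seconds(sample_headers)
print(delay)
```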
* The cephfs-shell utility is now packaged for RHEL / CentOS / Rocky 9 as required
  Python dependencies are now available in EPEL9.
* RGW: S3 multipart uploads using Server-Side Encryption now replicate correctly in
  multi-site deployments. Previously, replicas of such objects were corrupted on decryption.
  A new tool, ``radosgw-admin bucket resync encrypted multipart``, can be used to
  identify these original multipart uploads. The ``LastModified`` timestamp of any
  identified object is incremented by one ns to cause peer zones to replicate it again.
  For multi-site deployments that make use of Server-Side Encryption, we
  recommend running this command against every bucket in every zone after all
  zones have upgraded.
* Tracing: The blkin tracing feature (see https://docs.ceph.com/en/reef/dev/blkin/)
  is now deprecated in favor of Opentracing (https://docs.ceph.com/en/reef/dev/developer_guide/jaegertracing/)
  and will be removed in a later release.
* RGW: A new data layout has been introduced for the Topic metadata associated with S3
  Bucket Notifications, where each Topic is stored as a separate RADOS object
  and the bucket notification configuration is stored in a bucket attribute.
  This new representation supports multisite replication via metadata sync and
  can scale to many topics. This is on by default for new deployments, but is
  not enabled by default on upgrade. Once all radosgws have upgraded (on all
  zones in a multisite configuration), the ``notification_v2`` zone feature can
  be enabled to migrate to the new format. See
  https://docs.ceph.com/en/squid/radosgw/zone-features for details. The "v1"
  format is now considered deprecated and may be removed after 2 major releases.
* CephFS: The MDS now evicts clients that are not advancing their request tids.
  Such clients cause a large buildup of session metadata, which in turn results
  in the MDS going read-only due to RADOS operations exceeding the size threshold.
  The `mds_session_metadata_threshold` config option controls the maximum size to
  which (encoded) session metadata can grow.
* CephFS: A new "mds last-seen" command is available for querying the last time
  an MDS was in the FSMap, subject to a pruning threshold.
* CephFS: For clusters with multiple CephFS file systems, all snap-schedule
  commands now expect the '--fs' argument.
* CephFS: The period specifier ``m`` now implies minutes and the period specifier
  ``M`` now implies months. This is consistent with the rest of the system.
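
  The case-sensitive distinction can be illustrated with a small parser. This
  is a sketch under stated assumptions, not the snap-schedule module's code:
  the function name `parse_period` is hypothetical, a period is assumed to be
  written as a count followed by a single-letter specifier (for example
  "15m"), and only the two specifiers named above are modeled.

```python
import re

# Only the two specifiers named in this note are modeled here.
PERIOD_UNITS = {"m": "minutes", "M": "months"}


def parse_period(spec: str) -> tuple:
    """Split a period such as '15m' into (count, unit name).

    The <count><specifier> form is an assumption made for this sketch.
    """
    match = re.fullmatch(r"(\d+)([A-Za-z])", spec)
    if match is None:
        raise ValueError(f"malformed period: {spec!r}")
    count, specifier = match.groups()
    if specifier not in PERIOD_UNITS:
        raise ValueError(f"specifier not modeled in this sketch: {specifier!r}")
    return int(count), PERIOD_UNITS[specifier]


print(parse_period("15m"))  # (15, 'minutes')
print(parse_period("1M"))   # (1, 'months')
```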
* RGW: New tools have been added to radosgw-admin for identifying and
  correcting issues with versioned bucket indexes. Historical bugs with the
  versioned bucket index transaction workflow made it possible for the index
  to accumulate extraneous "book-keeping" olh entries and plain placeholder
  entries. In some specific scenarios where clients made concurrent requests
  referencing the same object key, it was likely that extra index
  entries would accumulate. When a significant number of these entries are
  present in a single bucket index shard, they can cause high bucket listing
  latency and lifecycle processing failures. To check whether a versioned
  bucket has unnecessary olh entries, users can now run ``radosgw-admin
  bucket check olh``. If the ``--fix`` flag is used, the extra entries will
  be safely removed. An additional issue is that some versioned buckets
  may maintain extra unlinked objects that are not listable via the S3/Swift
  APIs. These extra objects are typically a result of PUT requests that
  exited abnormally in the middle of a bucket index transaction, and thus
  the client would not have received a successful response. Bugs in prior
  releases made these unlinked objects easy to reproduce with any PUT
  request made on a bucket that was actively resharding. In certain
  scenarios, a client of a bucket that was a victim of this bug may find
  the object associated with the key to be in an inconsistent state. To check
  whether a versioned bucket has unlinked entries, users can now run
  ``radosgw-admin bucket check unlinked``. If the ``--fix`` flag is used,
  the unlinked objects will be safely removed. Finally, a third issue made
  it possible for versioned bucket index stats to be accounted inaccurately.
  The tooling for recalculating versioned bucket stats also had a bug, and
  was not previously capable of fixing these inaccuracies.  This release
  resolves those issues and users can now expect that the existing
  ``radosgw-admin bucket check`` command will produce correct results.
  We recommend that users with versioned buckets, especially those that
  existed on prior releases, use these new tools to check whether their
  buckets are affected and to clean them up accordingly.
* RGW: The "user accounts" feature unlocks several new AWS-compatible IAM APIs
  for self-service management of users, keys, groups, roles, policy and
  more. Existing users can be adopted into new accounts. This process is optional
  but irreversible. See https://docs.ceph.com/en/squid/radosgw/account and
  https://docs.ceph.com/en/squid/radosgw/iam for details.
* RGW: On startup, radosgw and radosgw-admin now validate the ``rgw_realm``
  config option. Previously, they would ignore invalid or missing realms and
  go on to load a zone/zonegroup in a different realm. If startup fails with
  a "failed to load realm" error, fix or remove the ``rgw_realm`` option.
* RGW: The radosgw-admin commands ``realm create`` and ``realm pull`` no
  longer set the default realm without ``--default``.
* CephFS: Running the command "ceph fs authorize" for an existing entity now
  upgrades the entity's capabilities instead of printing an error. It can now
  also change read/write permissions in a capability that the entity already
  holds. If the capability passed by the user is the same as one of the
  capabilities that the entity already holds, idempotency is maintained.
* CephFS: Two FS names can now be swapped, optionally along with their IDs,
  using the "ceph fs swap" command. The function of this API is to facilitate
  file system swaps for disaster recovery. In particular, it avoids situations
  where a named file system is temporarily missing which would prompt a higher
  level storage operator (like Rook) to recreate the missing file system.
  See https://docs.ceph.com/en/latest/cephfs/administration/#file-systems
  for more information.
* CephFS: Before running the command "ceph fs rename", the file system to be
  renamed must be offline and the config "refuse_client_session" must be set
  for it. The config "refuse_client_session" can be removed/unset and the
  file system can be brought online after the rename operation is complete.
-												PendingReleaseNotes: Add note for POOL_APP_NOT_ENABLED

Adds release notes for the fix added in #47560

Signed-off-by: Prashant D <pdhange@redhat.com>

											
										
										
											2023-11-13 21:34:59 +00:00
+								* RADOS: A POOL_APP_NOT_ENABLED health warning will now be reported if
 								  the application is not enabled for the pool irrespective of whether
 								  the pool is in use or not. Always tag a pool with an application
 								  using the ``ceph osd pool application enable`` command to avoid a
 								  POOL_APP_NOT_ENABLED health warning for that pool.
 								  The warning can be muted temporarily using
 								  ``ceph health mute POOL_APP_NOT_ENABLED``.
-												mon/LogMonitor: Use generic cluster log level config

We do not control the verbosity of the LogEntry
which is getting logged to stderr, graylog and
journald. This causes excessive flooding of logs
to /var/log, making a filesystem to fill up quickly.
Also we have different config variables namely
mon_cluster_log_file_level and mon_cluster_log_to_syslog_level
to control verbosity at cluster log file and
syslog level respectively. Add a generic cluster log
level config variable which controls cluster log
verbosity for all external entities.

Additionally, this patch addresses the regression of
`mon_cluster_log_file_level` option which doesn't take effect
because of code refactoring of LogMonitor::update_from_paxos
(commit : 7c84e06).

Fixes: https://tracker.ceph.com/issues/57061
Fixes: https://tracker.ceph.com/issues/57049

Signed-off-by: Prashant D <pdhange@redhat.com>

											
										
										
											2022-08-08 14:55:23 +00:00
+								* The `mon_cluster_log_file_level` and `mon_cluster_log_to_syslog_level` options
 								  have been removed. Henceforth, users should use the new generic option
 								  `mon_cluster_log_level` to control the cluster log level verbosity for the cluster
 								  log file as well as for all external entities.
-												ReleaseNotes: document support for partNumber

Signed-off-by: Casey Bodley <cbodley@redhat.com>

											
										
										
											2023-08-03 20:52:43 +00:00
+								* CephFS: Disallow delegating preallocated inode ranges to clients. Config
 								  `mds_client_delegate_inos_pct` defaults to 0 which disables async dirops
 								  in the kclient.
 								* S3 Get/HeadObject now support query parameter `partNumber` to read a specific
 								  part of a completed multipart upload.
-												rgw: object lock uses 64-bit encoding for RetainUntilDate

the default encoding of ceph::real_time truncates seconds to uint32_t,
so stores the wrong timestamp for object lock enforcement

Fixes: https://tracker.ceph.com/issues/63537

Signed-off-by: Casey Bodley <cbodley@redhat.com>

											
										
										
											2023-11-15 21:24:47 +00:00
+								* RGW: Fixed an S3 Object Lock bug with PutObjectRetention requests that specify
 								  a RetainUntilDate after the year 2106. This date was truncated to 32 bits when
 								  stored, so a much earlier date was used for object lock enforcement. This does
 								  not affect PutBucketObjectLockConfiguration where a duration is given in Days.
 								  The RetainUntilDate encoding is fixed for new PutObjectRetention requests, but
 								  cannot repair the dates of existing object locks. Such objects can be identified
 								  with a HeadObject request based on the x-amz-object-lock-retain-until-date
 								  response header.
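The truncation described above can be illustrated with a minimal sketch. It assumes only what the note states (seconds since the epoch reduced to 32 bits); the helper name `truncate_to_uint32_seconds` is hypothetical and this is not RGW's encoding code.

```python
from datetime import datetime, timezone

UINT32_MAX = 2**32 - 1


def truncate_to_uint32_seconds(when: datetime) -> datetime:
    """Model keeping only the low 32 bits of the epoch seconds (hypothetical helper)."""
    seconds = int(when.timestamp())
    return datetime.fromtimestamp(seconds & UINT32_MAX, tz=timezone.utc)


# Latest instant that fits in 32 bits of seconds; it falls in the year 2106.
limit = datetime.fromtimestamp(UINT32_MAX, tz=timezone.utc)

# A retention date beyond the limit wraps around to a much earlier date.
requested = datetime(2200, 1, 1, tzinfo=timezone.utc)
stored = truncate_to_uint32_seconds(requested)

print("limit:    ", limit.isoformat())
print("requested:", requested.isoformat())
print("stored:   ", stored.isoformat())
```

Dates at or before the limit pass through this model unchanged, which is consistent with the note that only RetainUntilDate values after 2106 were affected.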
-												librados: make querying pools for selfmanaged snaps reliable

If get_pool_is_selfmanaged_snaps_mode() is invoked on a fresh RADOS
client instance that still lacks an osdmap, it returns false, same as
for "this pool is not in selfmanaged snaps mode".  The same happens if
the pool in question doesn't exist since the signature doesn't allow to
return an error.

The motivation for this API was to prevent users from running "rados
cppool" on a pool with unmanaged snapshots and deleting the original
thinking that they have a full copy.  Unfortunately, it's exactly
"rados cppool" that fell into this trap, so no warning is printed and
--yes-i-really-mean-it flag isn't enforced.

Fixes: https://tracker.ceph.com/issues/63607
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

											
										
										
											2023-11-22 13:39:13 +00:00
+								* RADOS: `get_pool_is_selfmanaged_snaps_mode` C++ API has been deprecated
 								  due to being prone to false negative results.  Its safer replacement is
 								  `pool_is_in_selfmanaged_snaps_mode`.
-												doc: improve pending release notes and CephFS

fixup

Signed-off-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>

											
										
										
											2024-10-26 23:23:35 +00:00
+								* RADOS: For bug 62338 (https://tracker.ceph.com/issues/62338), in order to simplify
 								  backporting, we choose to not
 								  condition the fix on a server flag.  As
-												PendingReleaseNotes: add release note for 62338

See https://tracker.ceph.com/issues/62338 and
2fc5486e.

Signed-off-by: Samuel Just <sjust@redhat.com>

											
										
										
											2023-11-22 03:12:12 +00:00
+								  a result, in rare cases it may be possible for a PG to flip between two acting
 								  sets while an upgrade to a version with the fix is in progress.  If you observe
 								  this behavior, you should be able to work around it by completing the upgrade or
 								  by disabling async recovery by setting osd_async_recovery_min_cost to a very
 								  large value on all OSDs until the upgrade is complete:
 								  ``ceph config set osd osd_async_recovery_min_cost 1099511627776``
-												doc/rados/operations: document `ceph balancer status detail`

Document change in https://github.com/ceph/ceph/pull/54801

Signed-off-by: Laura Flores <lflores@ibm.com>

											
										
										
											2023-12-22 22:55:29 +00:00
+								* RADOS: A detailed version of the `balancer status` CLI command in the balancer
 								  module is now available. Users may run `ceph balancer status detail` to see more
 								  details about which PGs were updated in the balancer's last optimization.
 								  See https://docs.ceph.com/en/latest/rados/operations/balancer/ for more information.
-												PendingReleaseNotes: support for subvolumes and subvolume groups in snap_schedule

Signed-off-by: Milind Changire <mchangir@redhat.com>

											
										
										
											2023-12-14 07:25:08 +00:00
+								* CephFS: Full support for subvolumes and subvolume groups is now available
 								  for the snap_schedule Manager module.
-												rgw/pubsub: CreateTopic validates topic name

existing topics may have invalid names, so this is only enforced by
CreateTopic

Fixes: https://tracker.ceph.com/issues/65212

Signed-off-by: Casey Bodley <cbodley@redhat.com>

											
										
										
											2024-03-28 17:47:30 +00:00
+								* RGW: The SNS CreateTopic API now enforces the same topic naming requirements as AWS:
 								  Topic names must be made up of only uppercase and lowercase ASCII letters, numbers,
 								  underscores, and hyphens, and must be between 1 and 256 characters long.
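As a rough client-side illustration of the naming rule above, the following sketch checks a candidate name before calling CreateTopic. The function name `is_valid_topic_name` is hypothetical and the pattern is derived only from the wording of this note, so the server remains the authority on what is accepted.

```python
import re

# ASCII letters, digits, underscores and hyphens; 1 to 256 characters.
_TOPIC_NAME = re.compile(r"[A-Za-z0-9_-]{1,256}")


def is_valid_topic_name(name: str) -> bool:
    """Return True if `name` satisfies the rule quoted in the release note."""
    return _TOPIC_NAME.fullmatch(name) is not None


for candidate in ("my-topic_1", "", "bad.name", "a" * 257):
    print(repr(candidate[:20]), is_valid_topic_name(candidate))
```

Because the check is enforced only by CreateTopic, existing topics with names that fail this test continue to exist.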
-												PendingReleaseNotes: add rbd_diff_iterate2 note

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

											
										
										
											2024-01-20 15:00:46 +00:00
+								* RBD: When diffing against the beginning of time (`fromsnapname == NULL`) in
 								  fast-diff mode (`whole_object == true` with `fast-diff` image feature enabled
 								  and valid), diff-iterate is now guaranteed to execute locally if exclusive
 								  lock is available.  This brings a dramatic performance improvement for QEMU
 								  live disk synchronization and backup use cases.
-												rbd-nbd: map using netlink interface by default

Mapping rbd images to nbd devices using ioctl interface is not
robust. It was discovered that the device size or the md5 checksum
of the nbd device was incorrect immediately after mapping using
ioctl method. When using the nbd netlink interface to map RBD images
the issue was not encountered. Switch to using nbd netlink interface
for mapping.

Fixes: https://tracker.ceph.com/issues/64063
Signed-off-by: Ramana Raja <rraja@redhat.com>

											
										
										
											2024-01-17 18:24:36 +00:00
+								* RBD: The ``try-netlink`` mapping option for rbd-nbd has become the default
 								  and is now deprecated. If the NBD netlink interface is not supported by the
 								  kernel, then the mapping is retried using the legacy ioctl interface.
-												PendingReleaseNotes: add note about read balancer mgr module integration

Signed-off-by: Laura Flores <lflores@ibm.com>

											
										
										
											2023-12-22 19:23:41 +00:00
+								* RADOS: Read balancing may now be managed automatically via the balancer
 								  manager module. Users may choose between two new modes: ``upmap-read``, which
 								  offers upmap and read optimization simultaneously, or ``read``, which may be used
 								  to only optimize reads. For more detailed information see https://docs.ceph.com/en/latest/rados/operations/read-balancer/#online-optimization.
-												PendingReleaseNotes: add note about new mdlog trimming configurations

Signed-off-by: Venky Shankar <vshankar@redhat.com>

											
										
										
											2023-09-26 12:22:03 +00:00
+								* CephFS: MDS log trimming is now driven by a separate thread which tries to
 								  trim the log every second (`mds_log_trim_upkeep_interval` config). Also,
 								  a couple of configs govern how much time the MDS spends in trimming its
 								  logs. These configs are `mds_log_trim_threshold` and `mds_log_trim_decay_rate`.
-												rgw: modify topic owner check when creating

add tests to cover topic policies
as well as behavior when no policies are defined

Fixes: https://tracker.ceph.com/issues/64124

Signed-off-by: Zhipeng Li <qiuxinyidian@gmail.com>

											
										
										
											2024-01-23 06:50:52 +00:00
+								* RGW: Notification topics are now owned by the user that created them.
 								  By default, only the owner can read/write their topics. Topic policy documents
 								  are now supported to grant these permissions to other users. Preexisting topics
 								  are treated as if they have no owner, and any user can read/write them using the SNS API.
 								  If such a topic is recreated with CreateTopic, the issuing user becomes the new owner.
 								  For backward compatibility, all users still have permission to publish bucket
 								  notifications to topics owned by other users. A new configuration parameter:
 								  ``rgw_topic_require_publish_policy`` can be enabled to deny ``sns:Publish``
 								  permissions unless explicitly granted by topic policy.
-												rgw/notification doc: doc: Update pendingreleasenotes for notification.

Signed-off-by: kchheda3 <kchheda3@bloomberg.net>

											
										
										
											2024-06-03 18:44:31 +00:00
+								* RGW: Fixed an issue with persistent notifications: changes made to topic parameters
 								  while persistent notifications were in the queue are now reflected in those notifications.
 								  If a user sets up a topic with an incorrect configuration (password/SSL) that causes delivery
 								  to the broker to fail, the user can now modify the incorrect topic attribute, and the new
 								  configuration will be used when delivery of the notifications is retried.
-												tools/rbd: make 'children' command support --image-id

Fixes: https://tracker.ceph.com/issues/64376
Signed-off-by: Mykola Golub <mykola.golub@clyso.com>

											
										
										
											2024-02-11 09:43:30 +00:00
+								* RBD: The ``--image-id`` option has been added to the `rbd children` CLI command,
 								  so it can be run for images in the trash.
-												mon, mgr: do not output network ping stats

When doing PG dump using 'ceph pg dump --format json-pretty'
the output is so big that the command hangs and also
the ceph-mgr hangs and eventually fails over.

The exact size depends on the number of OSDs in the cluster
and the number of peers for each OSD.

In tests, it's been identified that the network ping times
is the largest component in terms of size which is removed
from the output now so as to limit the overall size.

Fixes https://tracker.ceph.com/issues/57460

Signed-off-by: Ponnuvel Palaniyappan <pponnuvel@gmail.com>

											
										
										
											2022-09-15 14:55:06 +00:00
+								* PG dump: The default output of `ceph pg dump --format json` has changed. The
 								  default json format produces a rather massive output in large clusters and
 								  isn't scalable. So we have removed the 'network_ping_times' section from
 								  the output. Details in the tracker: https://tracker.ceph.com/issues/57460
-												PendingReleaseNotes: Adding note about rest module change and adding max_request option

Signed-off-by: Nitzan Mordechai <nmordech@redhat.com>

											
										
										
											2024-02-21 09:21:25 +00:00
+								* mgr/REST: The REST manager module will trim requests based on the 'max_requests' option.
 								  Without this feature, and in the absence of manual deletion of old requests,
 								  the accumulation of requests in the array can lead to Out Of Memory (OOM) issues,
 								  resulting in the Manager crashing.
-												doc: add the reject the clone when threads are not available feature in the document

Fixes: https://tracker.ceph.com/issues/59714
Signed-off-by: Neeraj Pratap Singh <neesingh@redhat.com>

											
										
										
											2023-08-22 08:02:58 +00:00
+								* CephFS: The `subvolume snapshot clone` command now depends on the config option
 								  `snapshot_clone_no_wait` which is used to reject the clone operation when
 								  all the cloner threads are busy. This config option is enabled by default which means
 								  that if no cloner threads are free, the clone request errors out with EAGAIN.
 								  The value of the config option can be fetched by using:
 								   `ceph config get mgr mgr/volumes/snapshot_clone_no_wait`
 								  and it can be disabled by using:
 								   `ceph config set mgr mgr/volumes/snapshot_clone_no_wait false`
-												pybind/rbd: expose RBD_IMAGE_OPTION_CLONE_FORMAT option

It takes effect with clone(), deep_copy() and migration_prepare().

Fixes: https://tracker.ceph.com/issues/65624
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

											
										
										
											2024-04-28 17:19:22 +00:00
+								* RBD: `RBD_IMAGE_OPTION_CLONE_FORMAT` option has been exposed in Python
 								  bindings via `clone_format` optional parameter to `clone`, `deep_copy` and
 								  `migration_prepare` methods.
-												pybind/rbd: expose RBD_IMAGE_OPTION_FLATTEN option

It takes effect with deep_copy() and migration_prepare().

Fixes: https://tracker.ceph.com/issues/65624
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

											
										
										
											2024-05-01 13:49:47 +00:00
+								* RBD: `RBD_IMAGE_OPTION_FLATTEN` option has been exposed in Python bindings via
 								  `flatten` optional parameter to `deep_copy` and `migration_prepare` methods.
-												PendingReleaseNotes: note need of confirmation for "ceph fs fail"

Signed-off-by: Rishabh Dave <ridave@redhat.com>

											
										
										
											2024-04-19 11:38:50 +00:00
+								* CephFS: Commands "ceph mds fail" and "ceph fs fail" now require a
 								  confirmation flag when some MDSs exhibit health warning MDS_TRIM or
 								  MDS_CACHE_OVERSIZED. This is to prevent accidental MDS failover causing
 								  further delays in recovery.
-												PendingReleaseNotes: add note on the client incompatibility health warning and feature bit

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>

											
										
										
											2024-05-03 00:45:43 +00:00
+								* CephFS: fixes to the implementation of the ``root_squash`` mechanism enabled
 								  via cephx ``mds`` caps on a client credential require a new client feature
 								  bit, ``client_mds_auth_caps``. Clients using credentials with ``root_squash``
 								  without this feature will trigger the MDS to raise a HEALTH_ERR on the
 								  cluster, MDS_CLIENTS_BROKEN_ROOTSQUASH. See the documentation on this warning
 								  and the new feature bit for more information.
-												PendingReleaseNotes: add note about CephFS set_vxattrs

Signed-off-by: Christopher Hoffman <choffman@redhat.com>

											
										
										
											2024-02-06 20:59:21 +00:00
+								* CephFS: Expanded removexattr support for cephfs virtual extended attributes.
 								  Previously one had to use setxattr to restore the default in order to "remove".
 								  You may now properly use removexattr to remove. You can also now remove the layout
 								  on the root inode, which restores the default layout.
-												objclass: deprecate cls_cxx_gather

cls_cxx_gather is not maintained and having issues with retry.
since there is no current use of it, we will deprecate it.

Fixes: https://tracker.ceph.com/issues/64258
Signed-off-by: Nitzan Mordechai <nmordech@redhat.com>

											
										
										
											2024-03-04 13:34:39 +00:00
+								* cls_cxx_gather is marked as deprecated.
-												doc: update 'journal reset' command with --yes-i-really-really-mean-it

Fixes: https://tracker.ceph.com/issues/62925
Signed-off-by: Jos Collin <jcollin@redhat.com>

											
										
										
											2024-02-27 08:45:26 +00:00
+								* CephFS: cephfs-journal-tool is guarded against running on an online file system.
 								  The 'cephfs-journal-tool --rank <fs_name>:<mds_rank> journal reset' and
 								  'cephfs-journal-tool --rank <fs_name>:<mds_rank> journal reset --force'
 								  commands require '--yes-i-really-really-mean-it'.
-												PendingReleaseNotes: note need of confirmation for "ceph mds fail"

Signed-off-by: Rishabh Dave <ridave@redhat.com>

											
										
										
											2024-04-19 11:32:29 +00:00
-												doc: Update pendingreleasenotes for dashboard

Signed-off-by: Nizamudeen A <nia@redhat.com>

											
										
										
											2024-05-21 05:11:26 +00:00
+								* Dashboard: Rearranged Navigation Layout: The navigation layout has been reorganized
 								  for improved usability and easier access to key features.
 								* Dashboard: CephFS Improvements
 								  * Support for managing CephFS snapshots and clones, as well as snapshot schedule
 								    management
 								  * Manage authorization capabilities for CephFS resources
 								  * Helpers on mounting a CephFS volume
 								* Dashboard: RGW Improvements
 								  * Support for managing bucket policies
 								  * Add/Remove bucket tags
 								  * ACL Management
 								  * Several UI/UX Improvements to the bucket form
 								* Monitoring: Grafana dashboards are now loaded into the container at runtime rather than
 								  building a grafana image with the grafana dashboards. Official Ceph grafana images
 								  can be found in quay.io/ceph/grafana
 								* Monitoring: RGW S3 Analytics: A new Grafana dashboard is now available, enabling you to
 								  visualize per bucket and user analytics data, including total GETs, PUTs, Deletes,
 								  Copies, and list metrics.
-												pybind/rbd: parse access and modify timestamps in UTC

It appears that commits 08cee16d0a4b ("pybind/rbd: always parse
timestamps in UTC") and 809c5430c292 ("librbd: add image access/last
modified timestamps") raced with each other and we ended up with two
more timezone-dependent timestamps.

Fixes: https://tracker.ceph.com/issues/66359
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

											
										
										
											2024-06-05 06:36:12 +00:00
+								* RBD: `Image::access_timestamp` and `Image::modify_timestamp` Python APIs now
 								  return timestamps in UTC.
-												rbd: add --snap-id option to "rbd clone"

Enable cloning from non-user snapshots via the CLI.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

											
										
										
											2024-05-30 09:38:53 +00:00
+								* RBD: Support for cloning from non-user type snapshots is added.  This is
 								  intended primarily as a building block for cloning new groups from group
 								  snapshots created with `rbd group snap create` command, but has also been
 								  exposed via the new `--snap-id` option for `rbd clone` command.
-												rbd: include original namespace type in "rbd snap ls --all" output

Before (snap 22 comes from "rbd group snap create", snap 23 created
manually with "rbd snap create"):

SNAPID  NAME                                  SIZE   PROTECTED  TIMESTAMP                 NAMESPACE
    21  f7cfdcfe-5f71-40e4-be82-3fb0e7caf2aa  1 GiB             Mon Jun 10 09:23:40 2024  trash (mysnap)
    22  bd67397f-32cb-48fe-b1ac-ef6f02319239  1 GiB             Mon Jun 10 09:26:06 2024  trash (.group.2_1491b049b556_1497bf66f586)
    23  27a5f053-8431-428e-ab33-be9d8b6cf51e  1 GiB             Mon Jun 10 09:28:30 2024  trash (.group.2_1491b049b556_1497bf66f586)

After:

SNAPID  NAME                                  SIZE   PROTECTED  TIMESTAMP                 NAMESPACE
    21  f7cfdcfe-5f71-40e4-be82-3fb0e7caf2aa  1 GiB             Mon Jun 10 09:23:40 2024  trash (user mysnap)
    22  bd67397f-32cb-48fe-b1ac-ef6f02319239  1 GiB             Mon Jun 10 09:26:06 2024  trash (group .group.2_1491b049b556_1497bf66f586)
    23  27a5f053-8431-428e-ab33-be9d8b6cf51e  1 GiB             Mon Jun 10 09:28:30 2024  trash (user .group.2_1491b049b556_1497bf66f586)

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

											
										
										
											2024-06-10 11:19:25 +00:00
+								* RBD: The output of `rbd snap ls --all` command now includes the original
 								  type for trashed snapshots.
-												doc/cephfs: add release notes and docs for clone progress report

Update docs and add release notes about the progress report that is
printed in output of "ceph fs clone status" command and progress bars
that is/are printed in output of "ceph status" command.

Signed-off-by: Rishabh Dave <ridave@redhat.com>

											
										
										
											2024-06-28 05:45:44 +00:00
+								* CephFS: "ceph fs clone status" command will now print statistics about clone
 								  progress in terms of how much data has been cloned (in both percentage as
 								  well as bytes) and how many files have been cloned.
 								* CephFS: "ceph status" command will now print a progress bar when cloning is
 								  ongoing. If there are more clone jobs than cloner threads, it will print one
 								  more progress bar that shows the total amount of progress made by both ongoing
 								  as well as pending clones. Both progress bars are accompanied by messages that
 								  show the number of clone jobs in the respective categories and the amount of
 								  progress made by each of them.
-												rgw/notifications: update release notes with fix to principalId

Fixes: https://tracker.ceph.com/issues/67857

Signed-off-by: Yuval Lifshitz <ylifshit@redhat.com>

											
										
										
											2024-09-02 11:07:19 +00:00
+								* RGW: In bucket notifications, the `principalId` inside `ownerIdentity` now contains
 								  the complete user ID, prefixed with the tenant ID.
-												doc: nit fixes for nfs doc

Signed-off-by: Avan Thakkar <athakkar@redhat.com>

Fixes some doc lint and also fixed qa tests for having both 3 & 4 protocols
by default in export config.

											
										
										
											2024-09-03 13:15:47 +00:00
+								* NFS: The export create/apply of CephFS based exports will now have an additional parameter `cmount_path` under the FSAL block,
-												doc: Update pendingreleasenotes for CephFS NFS exports

Signed-off-by: Avan Thakkar <athakkar@redhat.com>

											
										
										
											2024-08-27 07:43:11 +00:00
+								  which specifies the path within the CephFS to mount this export on. If this and the other
 								  `EXPORT { FSAL {} }` options are the same between multiple exports, those exports will share a single CephFS client. If not specified, the default is `/`.
-												PendingReleaseNotes for ops log backend

Signed-off-by: Casey Bodley <cbodley@redhat.com>

											
										
										
											2022-04-05 21:20:22 +00:00
+								>=18.0.0
-												rgw: Add `rgw_policy_reject_invalid_principals` and messages

Reject policies with invalid principals by default and provide more
useful error messages while doing so.

(Log them but do *not* reject the policy if it's set to false.)

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>

											
										
										
											2022-12-13 01:40:33 +00:00
+								* The RGW policy parser now rejects unknown principals by default. If you are
 								  mirroring policies between RGW and AWS, you may wish to set
 								  "rgw policy reject invalid principals" to "false". This affects only newly set
 								  policies, not policies that are already in place.
-												mds: add balance_automate fs setting

To turn off the automatic ("default") balancer in multiple MDS clusters. The
new default is "off" as the balancer  is a constant source of problems and
surprise for administrators trying multiple actives. Instead, it should be a
deliberate decision to turn it on and usually with customization like the
"bal_rank_mask" setting or pinning.

Fixes: https://tracker.ceph.com/issues/61378
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>

											
										
										
											2023-06-23 21:01:00 +00:00
+								* The CephFS automatic metadata load (sometimes called "default") balancer is
 								  now disabled by default. The new file system flag `balance_automate`
 								  can be used to toggle it on or off. It can be enabled or disabled via
 								  `ceph fs set <fs_name> balance_automate <bool>`.
-												PendingReleaseNotes for ops log backend

Signed-off-by: Casey Bodley <cbodley@redhat.com>

											
										
										
											2022-04-05 21:20:22 +00:00
+								* RGW's default backend for `rgw_enable_ops_log` changed from RADOS to file.
 								  The default value of `rgw_ops_log_rados` is now false, and `rgw_ops_log_file_path`
 								  defaults to "/var/log/ceph/ops-log-$cluster-$name.log".
-												blk/spdk: Add the support to use nvme device provided by NVMe-of Target

This patch is used to add the support to use the nvmedevice provided
by NVMe-oF target.

Signed-off-by: Ziye Yang <ziye.yang@intel.com>

											
										
										
											2022-04-17 23:40:24 +00:00
+								* The SPDK backend for BlueStore is now able to connect to an NVMeoF target.
 								  Please note that this is not an officially supported feature.
-												common/ceph_json: dump bool using f->dump_bool()

as per https://www.json.org/json-en.html, JSON encodes bool as
"true" or "false", without the quotes. before this change, the quotes
are always added when encoding boolean values.

but this change is not backward compatible.

encode_json()'s bool overload is used by rgw. it uses JSONObj
defined in common/ceph_json.h to decode JSON-encoded structs.
and it does not differentiate bool from str when decoding a boolean
value despite that it could have check the "quoted" member variable
of JSONObj for validating the type of value. so we should be fine.

Fixes: https://tracker.ceph.com/issues/55189
Signed-off-by: Kefu Chai <tchaikov@gmail.com>

											
										
										
											2022-04-10 01:23:59 +00:00
+								* RGW's pubsub interface now returns boolean fields using bool. Before this change,
 								  `/topics/<topic-name>` returns "stored_secret" and "persistent" using a string
 								  of "true" or "false" with quotes around them. After this change, these fields
 								  are returned without quotes so they can be decoded as boolean values in JSON.
 								  The same applies to the `is_truncated` field returned by `/subscriptions/<sub-name>`.
 								* RGW's response of `Action=GetTopicAttributes&TopicArn=<topic-arn>` REST API now
 								  returns `HasStoredSecret` and `Persistent` as boolean in the JSON string
 								  encoded in `Attributes/EndPoint`.
 								* All boolean fields previously rendered as string by `rgw-admin` command when
 								  the JSON format is used are now rendered as boolean. If your scripts/tools
 								  rely on the previous behavior, please update them accordingly. The impacted field names
 								  are:
 								  * absolute
 								  * add
 								  * admin
 								  * appendable
 								  * bucket_key_enabled
 								  * delete_marker
 								  * exists
 								  * has_bucket_info
 								  * high_precision_time
 								  * index
 								  * is_master
 								  * is_prefix
 								  * is_truncated
 								  * linked
 								  * log_meta
 								  * log_op
 								  * pending_removal
 								  * read_only
 								  * retain_head_object
 								  * rule_exist
 								  * start_with_full_sync
 								  * sync_from_all
 								  * syncstopped
 								  * system
 								  * truncated
 								  * user_stats_sync
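For scripts that must work against both older and newer releases, a tolerant reader along the following lines may help. This is a minimal sketch: the helper name `as_bool` and the two sample payloads are made up for illustration and are not actual `rgw-admin` output.

```python
import json


def as_bool(value):
    """Accept a JSON boolean or the legacy quoted "true"/"false" string."""
    if isinstance(value, bool):
        return value
    if isinstance(value, str) and value.lower() in ("true", "false"):
        return value.lower() == "true"
    raise ValueError(f"not a boolean: {value!r}")


# Hypothetical payloads showing the old (quoted) and new (unquoted) encodings.
old_style = json.loads('{"is_truncated": "false"}')
new_style = json.loads('{"is_truncated": false}')

print(as_bool(old_style["is_truncated"]), as_bool(new_style["is_truncated"]))

# Pitfall with the old encoding: any non-empty string is truthy in Python.
print(bool(old_style["is_truncated"]))
```

The explicit conversion avoids the truthiness pitfall with the quoted form, where the string "false" would otherwise be treated as true.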
-												rgw: add 'rgw_access' log subsys for frontend http access log

this allows the log level of this http access log to be configured
separately from the 'rgw' subsystem, though the defaults are the same

Fixes: https://tracker.ceph.com/issues/54405

Signed-off-by: Casey Bodley <cbodley@redhat.com>

											
										
										
											2022-05-05 15:36:34 +00:00
+								* RGW: The beast frontend's HTTP access log line uses a new debug_rgw_access
 								  configurable. This has the same defaults as debug_rgw, but can now be controlled
 								  independently.
-												PendingReleaseNotes: add rbd compare-and-write notes

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

											
										
										
											2022-08-12 11:55:01 +00:00
+								* RBD: The semantics of compare-and-write C++ API (`Image::compare_and_write`
 								  and `Image::aio_compare_and_write` methods) now match those of C API.  Both
 								  compare and write steps operate only on `len` bytes even if the respective
 								  buffers are larger. The previous behavior of comparing up to the size of
 								  the compare buffer was prone to subtle breakage upon straddling a stripe
 								  unit boundary.
 								* RBD: compare-and-write operation is no longer limited to 512-byte sectors.
 								  Assuming proper alignment, it now allows operating on stripe units (4M by
 								  default).
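The `len`-bounded semantics described above can be sketched with a toy in-memory model. This is not librbd: the function below is a hypothetical stand-in that treats the image as a `bytearray`, and the way it reports a mismatch offset is a modeling choice made for the example.

```python
def compare_and_write(image, offset, length, cmp_buf, write_buf):
    """Toy model: compare and write exactly `length` bytes, ignoring buffer tails.

    Returns (True, None) on success or (False, mismatch_index) on mismatch.
    """
    if offset < 0 or length < 0 or offset + length > len(image):
        raise ValueError("range falls outside the image")
    if len(cmp_buf) < length or len(write_buf) < length:
        raise ValueError("buffers must hold at least `length` bytes")
    # Compare step: only the first `length` bytes of cmp_buf take part.
    for i in range(length):
        if image[offset + i] != cmp_buf[i]:
            return False, i
    # Write step: only the first `length` bytes of write_buf are written.
    image[offset:offset + length] = write_buf[:length]
    return True, None


image = bytearray(b"aaaaaaaa")
# Both buffers are larger than `length`; their tails ("XXXX", "YYYY") are ignored.
result = compare_and_write(image, 0, 4, b"aaaaXXXX", b"bbbbYYYY")
print(result, bytes(image))
```

In this model the extra bytes in the compare buffer never influence the outcome, which is the behavior the note describes for the C++ API after the change.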
-												PendingReleaseNotes: add rbd_aio_compare_and_writev note

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

											
										
										
											2022-10-06 10:36:00 +00:00
+								* RBD: New `rbd_aio_compare_and_writev` API method to support scatter/gather
 								  on both compare and write buffers.  This complements existing `rbd_aio_readv`
 								  and `rbd_aio_writev` methods.
-												libcephfs: define AT_NO_ATTR_SYNC back for backward compatibility

This was introduce by commit e2a67f2a65553ad45721bb391081bc61aa97e0e9,
for the third part applications they may still use the old macro.

Add it back and marked it as deprecated.

Fixes: https://tracker.ceph.com/issues/56638
Signed-off-by: Xiubo Li <xiubli@redhat.com>

											
										
										
											2022-07-20 01:37:25 +00:00
+								* The 'AT_NO_ATTR_SYNC' macro is deprecated; please use the standard 'AT_STATX_DONT_SYNC'
 								  macro instead. The 'AT_NO_ATTR_SYNC' macro will be removed in the future.
-												PendingReleaseNotes: document online and offline trimming of PG Log's dups

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

											
										
										
											2022-08-23 19:50:48 +00:00
+								* Trimming of PGLog dups is now controlled by the size instead of the version.
 								  This fixes the PGLog inflation issue that was happening when the on-line
 								  (in OSD) trimming got jammed after a PG split operation. Also, a new off-line
 								  mechanism has been added: `ceph-objectstore-tool` gained a `trim-pg-log-dups` op
 								  that targets situations where an OSD is unable to boot due to those inflated dups.
 								  In that case, the "You can be hit by THE DUPS BUG" warning will be visible in
 								  the OSD logs.
 								  Relevant tracker: https://tracker.ceph.com/issues/53729
-												PendingReleaseNotes: add "rbd device unmap --namespace" note

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

											
										
										
											2022-10-10 18:18:12 +00:00
+								* RBD: `rbd device unmap` command gained `--namespace` option.  Support for
 								  namespaces was added to RBD in Nautilus 14.2.0 and it has been possible to
 								  map and unmap images in namespaces using the `image-spec` syntax since then
 								  but the corresponding option available in most other commands was missing.
-												PendingReleaseNotes: add note for rgw compression+encryption

adds release notes for the feature added in
https://github.com/ceph/ceph/pull/46188

Signed-off-by: Casey Bodley <cbodley@redhat.com>

											
										
										
											2022-10-24 16:40:07 +00:00
+								* RGW: Compression is now supported for objects uploaded with Server-Side Encryption.
-												PendingReleaseNotes: note rgw's compress-encrypted zonegroup feature flag

Signed-off-by: Casey Bodley <cbodley@redhat.com>

											
										
										
											2023-07-03 19:06:29 +00:00
+								  When both are enabled, compression is applied before encryption. Earlier releases
 								  of multisite do not replicate such objects correctly, so all zones must upgrade to
 								  Reef before enabling the `compress-encrypted` zonegroup feature: see
 								  https://docs.ceph.com/en/reef/radosgw/multisite/#zone-features and note the
 								  security considerations.
-												rgw: update release notes on the removal of pubsub

Signed-off-by: yuval Lifshitz <ylifshit@redhat.com>

											
										
										
											2022-12-01 15:43:35 +00:00
+								* RGW: the "pubsub" functionality for storing bucket notifications inside Ceph
 								  is removed. Together with it, the "pubsub" zone should not be used anymore.
 								  The REST operations and radosgw-admin commands for manipulating
 								  subscriptions, and for fetching and acking the notifications, are removed
 								  as well.
 								  If the endpoint to which the notifications are sent may be down or
 								  disconnected, it is recommended to use persistent notifications to guarantee
 								  the delivery of the notifications. If the system that consumes the
 								  notifications needs to pull them (instead of the notifications being pushed
 								  to it), an external message bus (e.g. RabbitMQ, Kafka) should be used for
 								  that purpose.
-												rgw/notifications: add const to APIs when possible

Signed-off-by: Yuval Lifshitz <ylifshit@redhat.com>

											
										
										
											2022-11-25 14:15:27 +00:00
+								* RGW: The serialized format of notifications and topics has changed, so that
 								  new/updated topics will be unreadable by old RGWs. We recommend completing
 								  the RGW upgrades before creating or modifying any notification topics.
-												rbd, rbd-nbd: don't strip trailing newline in passphrase files

One of the stated goals is compatibility with standard LUKS tools,
in particular being able to load encryption on images formatted with
cryptsetup.  cryptsetup doesn't do this and this really interferes
with randomly generated (binary) passphrases.

While at it, open passphrase files as binary -- it communicates the
intent if nothing else on POSIX.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

											
										
										
											2022-11-14 12:24:00 +00:00
+								* RBD: Trailing newline in passphrase files (`<passphrase-file>` argument in
 								  `rbd encryption format` command and `--encryption-passphrase-file` option
 								  in other commands) is no longer stripped.
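A minimal sketch of why this matters, assuming the passphrase file is read as raw bytes with nothing stripped (the helper names and the sample passphrase are hypothetical): two files that differ only by a trailing newline now yield two different passphrases.

```python
import os
import tempfile


def write_passphrase(path, passphrase: bytes):
    # Binary mode so that exactly these bytes end up in the file.
    with open(path, "wb") as f:
        f.write(passphrase)


def read_passphrase(path) -> bytes:
    # Binary mode, no stripping: every byte is part of the passphrase.
    with open(path, "rb") as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    with_newline = os.path.join(tmp, "with_newline")
    without_newline = os.path.join(tmp, "without_newline")
    write_passphrase(with_newline, b"s3cret\n")   # e.g. written by `echo`
    write_passphrase(without_newline, b"s3cret")  # e.g. written by `printf '%s'`
    pass_a = read_passphrase(with_newline)
    pass_b = read_passphrase(without_newline)

same = pass_a == pass_b
```

A passphrase file created with a trailing newline therefore has to be used with that newline consistently, both when formatting and when loading encryption.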
-												doc/rbd: add clone encryption details and examples

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

											
										
										
											2022-10-28 10:42:14 +00:00
+								* RBD: Support for layered client-side encryption is added.  Cloned images
 								  can now be encrypted each with its own encryption format and passphrase,
 								  potentially different from that of the parent image.  The efficient
 								  copy-on-write semantics intrinsic to unformatted (regular) cloned images
 								  are retained.
-												client: move a client's option to mds-client.yaml

mds_max_retries_on_remount_failure option is used by Client.cc only.

Fixes: https://tracker.ceph.com/issues/56532
Signed-off-by: Xiubo Li <xiubli@redhat.com>

											
										
										
											2022-07-15 09:13:37 +00:00
+								* CEPHFS: Rename the `mds_max_retries_on_remount_failure` option to
 								  `client_max_retries_on_remount_failure` and move it from mds.yaml.in to
 								  mds-client.yaml.in because this option has only ever been used by the MDS
 								  client.
-												common: Add labeled perf counters

Add the ability to dump labeled perf counters
for a daemon. Labeled perf counters are stored
in a CephContext's PerfCountersCollection.

Labeled and unlabeled perf counters are dumped
to the admin socket via `counters dump` command.

The schema for labeled and unlabeled perf
counters are dumped to the admin socket via
`counters schema` command.

This commit includes docs and additional unit tests

Signed-off-by: Ali Maredia <amaredia@redhat.com>

											
										
										
											2022-07-19 21:39:02 +00:00
+								* The `perf dump` and `perf schema` commands are deprecated in favor of new
 								  `counter dump` and `counter schema` commands. These new commands add support
 								  for labeled perf counters and also emit existing unlabeled perf counters. Some
+								  unlabeled perf counters became labeled in this release, with more to follow in
 								  future releases; such converted perf counters are no longer emitted by the
 								  `perf dump` and `perf schema` commands.
-												mon/MgrMap: dump last_failure_osd_epoch and active_clients at top level

Currently last_failure_osd_epoch and active_clients are dumped in the
always_on_modules dictionary in "ceph mgr dump" output.  This goes back
to when these fields were added in commits f2986a4400bb ("mon/MgrMonitor:
blacklist previous instance") and df507cde8d71 ("mgr: forward RADOS
client instances for potential blacklist") but is wrong as these fields
have nothing to do with always-on modules.

Fixes: https://tracker.ceph.com/issues/58647
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

											
										
										
											2023-02-06 16:56:00 +00:00
+								* `ceph mgr dump` command now outputs `last_failure_osd_epoch` and
 								  `active_clients` fields at the top level.  Previously, these fields were
 								  output under `always_on_modules` field.
-												mgr: store names of modules that register RADOS clients in the MgrMap

The MgrMap stores a list of RADOS clients' addresses registered by the
mgr modules. During failover of ceph-mgr, the list is used to blocklist
clients belonging to the failed ceph-mgr.

Store the names of the mgr modules that registered the RADOS clients
along with the clients' addresses in the MgrMap. During debugging, this
allows easy identification of the mgr module that registered a
particular RADOS client by just dumping the MgrMap (`ceph mgr dump`).

Following is the MgrMap output with a module's client name displayed
along with its client addrvec,
$ ceph mgr dump | jq '.active_clients[0]'
{
  "name": "devicehealth",
  "addrvec": [
    {
      "type": "v2",
      "addr": "10.0.0.148:0",
      "nonce": 612376578
    }
  ]
}

Fixes: https://tracker.ceph.com/issues/58691
Signed-off-by: Ramana Raja <rraja@redhat.com>

											
										
										
											2023-01-30 07:21:54 +00:00
+								* `ceph mgr dump` command now displays the name of the mgr module that
 								  registered a RADOS client in the `name` field added to elements of the
 								  `active_clients` array. Previously, only the address of a module's RADOS
 								  client was shown in the `active_clients` array.
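As a sketch of how the new layout might be consumed, the snippet below parses a trimmed-down `ceph mgr dump` document and groups client addresses by module name. The `active_clients` element is taken from the sample in the commit message; the `last_failure_osd_epoch` value and the helper name are assumptions for illustration.

```python
import json

# Trimmed-down stand-in for `ceph mgr dump` output. The epoch value is
# a placeholder; the active_clients entry follows the documented sample.
mgr_dump = json.loads("""
{
  "last_failure_osd_epoch": 0,
  "active_clients": [
    {
      "name": "devicehealth",
      "addrvec": [
        {"type": "v2", "addr": "10.0.0.148:0", "nonce": 612376578}
      ]
    }
  ]
}
""")


def clients_by_module(dump):
    """Map each mgr module name to its RADOS client addresses (addr/nonce)."""
    result = {}
    for client in dump.get("active_clients", []):
        addrs = [f'{a["addr"]}/{a["nonce"]}' for a in client.get("addrvec", [])]
        result.setdefault(client["name"], []).extend(addrs)
    return result


modules = clients_by_module(mgr_dump)
epoch = mgr_dump["last_failure_osd_epoch"]  # now read from the top level
```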
-												PendingReleaseNotes: add a note for rbd-mirror daemon perf counters

This was missed in commit 1a1477b9fd7f ("rbd-mirror: add and rename
perf counters for journal and snapshot mirroring").

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

											
										
										
											2023-04-06 10:32:11 +00:00
+								* RBD: All rbd-mirror daemon perf counters became labeled and as such are now
 								  emitted only by the new `counter dump` and `counter schema` commands.  As part
 								  of the conversion, many also got renamed to better disambiguate journal-based
 								  and snapshot-based mirroring.
-												librbd: clear Image::list_watchers() list before populating it

The "append to the passed list" behavior is confusing and not what the
corresponding C API (rbd_watchers_list) or other similar C++ APIs (e.g.
list_lockers) do.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

											
										
										
											2023-03-30 11:58:20 +00:00
+								* RBD: list-watchers C++ API (`Image::list_watchers`) now clears the passed
 								  `std::list` before potentially appending to it, aligning with the semantics
 								  of the corresponding C API (`rbd_watchers_list`).
-												PendingReleaseNotes: add note that pyrados may have omap keys as bytes

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>

											
										
										
											2023-05-11 16:25:51 +00:00
+								* The rados python binding is now able to process (opt-in) omap keys as bytes
 								  objects. This enables interacting with RADOS omap keys that are not decodable as
 								  UTF-8 strings.
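The motivation can be shown without a cluster. The sketch below uses a hypothetical `decode_key` helper (it is not part of the rados binding, and the opt-in switch shown here is only a stand-in) to show that a key containing bytes that are invalid in UTF-8 cannot be returned as `str` but survives intact as `bytes`.

```python
# 0xff can never appear in valid UTF-8, so this key has no str form.
raw_key = b"\xff\xfeobject-key"


def decode_key(key: bytes, as_bytes: bool = False):
    """Hypothetical helper: return the key as bytes, or decode it as UTF-8."""
    return key if as_bytes else key.decode("utf-8")


try:
    decode_key(raw_key)
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False

# Opting in to bytes keeps the key exactly as stored.
key = decode_key(raw_key, as_bytes=True)
```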
-												PendingReleaseNotes: add a note about telemetry leaderboard

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>

											
										
										
											2023-04-12 12:00:31 +00:00
+								* Telemetry: Users who are opted-in to telemetry can also opt-in to
 								  participating in a leaderboard in the telemetry public
 								  dashboards (https://telemetry-public.ceph.com/). Users can now also add a
 								  description of the cluster to publicly appear in the leaderboard.
 								  For more details, see:
 								  https://docs.ceph.com/en/latest/mgr/telemetry/#leaderboard
 								  See a sample report with `ceph telemetry preview`.
 								  Opt-in to telemetry with `ceph telemetry on`.
 								  Opt-in to the leaderboard with
 								  `ceph config set mgr mgr/telemetry/leaderboard true`.
 								  Add leaderboard description with:
 								  `ceph config set mgr mgr/telemetry/leaderboard_description 'Cluster description'`.
-												PendingReleaseNotes: add a note about deleting files from lost+found directory

Signed-off-by: Venky Shankar <vshankar@redhat.com>

											
										
										
											2023-05-06 14:54:28 +00:00
+								* CEPHFS: After recovering a Ceph File System by following the disaster recovery
 								  procedure, the recovered files under the `lost+found` directory can now be deleted.
-												doc: deprecate the cache tiering

This topic has been discussed many times; recently at the Dev
Summit of Cephalocon 2023.

This commit is the minial version of the work, contained entirely
within the `doc`. However, likely it will be expanded as there
were ideas like e.g. adding cache tiering back experimental feature
list (Sam) to warn users when deploying a new cluster.

Signed-off-by: Radosław Zarzyński <rzarzyns@redhat.com>

											
										
										
											2023-05-02 15:52:23 +00:00
+								* core: cache-tiering is now deprecated.
-												PendingReleaseNotes: Document mClock scheduler fixes and enhancements

Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>

											
										
										
											2023-06-05 08:11:28 +00:00
+								* mClock Scheduler: The mClock scheduler (default scheduler in Quincy) has
 								  undergone significant usability and design improvements to address the slow
 								  backfill issue. Some important changes are:
 								  * The 'balanced' profile is set as the default mClock profile because it
 								    represents a compromise between prioritizing client IO or recovery IO. Users
 								    can then choose either the 'high_client_ops' profile to prioritize client IO
 								    or the 'high_recovery_ops' profile to prioritize recovery IO.
 								  * QoS parameters like reservation and limit are now specified in terms of a
 								    fraction (range: 0.0 to 1.0) of the OSD's IOPS capacity.
 								  * The cost parameters (osd_mclock_cost_per_io_usec_* and
 								    osd_mclock_cost_per_byte_usec_*) have been removed. The cost of an operation
 								    is now determined using the random IOPS and maximum sequential bandwidth
 								    capability of the OSD's underlying device.
 								  * Degraded object recovery is given higher priority when compared to misplaced
 								    object recovery because degraded objects present a data safety issue not
 								    present with objects that are merely misplaced. Therefore, backfilling
 								    operations with the 'balanced' and 'high_client_ops' mClock profiles may
 								    progress slower than what was seen with the 'WeightedPriorityQueue' (WPQ)
 								    scheduler.
 								  * The QoS allocations in all the mClock profiles are optimized based on the above
 								    fixes and enhancements.
 								  * For more detailed information see:
 								    https://docs.ceph.com/en/latest/rados/configuration/mclock-config-ref/
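To make the fractional QoS parameters concrete, here is a small sketch of the arithmetic, assuming reservation and limit are simply multiplied by the OSD's IOPS capacity. The function name and the capacity and fraction values are hypothetical and are not taken from any mClock profile.

```python
def qos_iops(capacity_iops: float, reservation: float, limit: float):
    """Convert fractional QoS settings into absolute IOPS figures."""
    for name, frac in (("reservation", reservation), ("limit", limit)):
        if not 0.0 <= frac <= 1.0:
            raise ValueError(f"{name} must be within 0.0 to 1.0, got {frac}")
    return capacity_iops * reservation, capacity_iops * limit


# Hypothetical OSD with a measured capacity of 315 IOPS,
# reserving half of it and allowing use of the full capacity.
res_iops, lim_iops = qos_iops(315.0, 0.5, 1.0)
```

Expressing the settings as fractions means the same values scale with whatever capacity is measured for each OSD's device.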
-												doc: add note about snap-schedule snapshot retention

Signed-off-by: Milind Changire <mchangir@redhat.com>

											
										
										
											2023-05-16 07:55:59 +00:00
+								* mgr/snap_schedule: The snap-schedule mgr module now retains one less snapshot
 								  than the number mentioned against the config tunable `mds_max_snaps_per_dir`
 								  so that a new snapshot can be created and retained during the next schedule
 								  run.
-												PendingReleaseNotes: Note change to 'ceph config dump' pretty-print output.

Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>

											
										
										
											2023-08-29 04:29:15 +00:00
+								* `ceph config dump --format <json|xml>` output will display the localized
 								  option names instead of their normalized versions. For example,
 								  "mgr/prometheus/x/server_port" will be displayed instead of
 								  "mgr/prometheus/server_port". This matches the output of the non pretty-print
 								  formatted version of the command.
-												mds: fix the description for inotable testing only options

The description text are mixed for mds_kill_skip_replaying_inotable
and mds_inject_skip_replaying_inotable.

At the same time rename "mds_kill_skip_replaying_inotable", which
is a bit confusing to "mds_kill_after_journal_logs_flushed".

Fixes: https://tracker.ceph.com/issues/61660
Signed-off-by: Xiubo Li <xiubli@redhat.com>

											
										
										
											2023-06-13 10:30:34 +00:00
+								* CEPHFS: The MDS config option name "mds_kill_skip_replaying_inotable" is easily
 								  confused with "mds_inject_skip_replaying_inotable", so it has been renamed to
 								  "mds_kill_after_journal_logs_flushed".
-												PendingReleaseNotes: add note about `bluestore_zero_block_detection` config option

Signed-off-by: Laura Flores <lflores@redhat.com>

											
										
										
											2022-05-27 18:28:19 +00:00
 								>=17.2.1
 								* The "BlueStore zero block detection" feature (first introduced to Quincy in
 								https://github.com/ceph/ceph/pull/43337) has been turned off by default with a
 								new global configuration called `bluestore_zero_block_detection`. This feature,
 								intended for large-scale synthetic testing, does not interact well with some RBD
 								and CephFS features. Any side effects experienced in previous Quincy versions
 								would no longer occur, provided that the configuration remains set to false.
 								Relevant tracker: https://tracker.ceph.com/issues/55521
-												PendingReleaseNotes: add a note about Rook telemetry

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>

											
										
										
											2022-06-06 19:34:19 +00:00
 								* telemetry: Added new Rook metrics to the 'basic' channel to report Rook's
 								  version, Kubernetes version, node metrics, etc.
 								  See a sample report with `ceph telemetry preview`.
 								  Opt-in with `ceph telemetry on`.
 								  For more details, see:
 								  https://docs.ceph.com/en/latest/mgr/telemetry/
-												PendingReleaseNotes: Note the fix for high CPU utilization during recovery

Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>

											
										
										
											2022-08-16 11:45:29 +00:00
+								* OSD: The issue of high CPU utilization during recovery/backfill operations
 								  has been fixed. For more details, see: https://tracker.ceph.com/issues/56530.
-												PendingReleaseNotes: add a note about SnapMapper key coversion

Signed-off-by: Matan Breizman <mbreizma@redhat.com>

											
										
										
											2022-07-21 16:23:58 +00:00
+								>=15.2.17
 								* OSD: Octopus modified the SnapMapper key format from
 								  <LEGACY_MAPPING_PREFIX><snapid>_<shardid>_<hobject_t::to_str()>
 								  to
 								  <MAPPING_PREFIX><pool>_<snapid>_<shardid>_<hobject_t::to_str()>
 								  When this change was introduced, 94ebe0e also introduced a conversion
 								  with a crucial bug which essentially destroyed legacy keys by mapping them
 								  to
 								  <MAPPING_PREFIX><poolid>_<snapid>_
 								  without the object-unique suffix. The conversion is fixed in this release.
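The effect of the dropped suffix can be sketched with plain strings. The prefix value and the object names below are placeholders, not the real on-disk constants; the point is only that keys built without the object-unique part collapse into one.

```python
MAPPING_PREFIX = "MAP_"  # placeholder, not the actual prefix value


def new_key(pool, snapid, shardid, obj):
    # <MAPPING_PREFIX><pool>_<snapid>_<shardid>_<hobject_t::to_str()>
    return f"{MAPPING_PREFIX}{pool}_{snapid}_{shardid}_{obj}"


def buggy_converted_key(pool, snapid):
    # What the faulty conversion produced: no shard id or object suffix.
    return f"{MAPPING_PREFIX}{pool}_{snapid}_"


objects = ["obj_a", "obj_b"]  # hypothetical object names
correct = {new_key(1, 4, 0, o) for o in objects}
buggy = {buggy_converted_key(1, 4) for _ in objects}
```

Two objects in the same pool and snapshot keep two distinct keys in the intended format, while the faulty conversion maps both to a single key.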
-												PendingReleaseNotes: fix typo in 15.2.17

Signed-off-by: Matan Breizman <mbreizma@redhat.com>

											
										
										
											2022-08-17 16:33:39 +00:00
+								  Relevant tracker: https://tracker.ceph.com/issues/56147
-												PendingReleaseNotes: added note related to new mds upgrade option using cephadm

Signed-off-by: Dhairya Parmar <dparmar@redhat.com>

											
										
										
											2022-06-13 14:11:40 +00:00
 								* Cephadm may now be configured to carry out CephFS MDS upgrades without
 								reducing ``max_mds`` to 1. Previously, Cephadm would reduce ``max_mds`` to 1 to
 								avoid having two active MDS modifying on-disk structures with new versions,
 								communicating cross-version-incompatible messages, or other potential
 								incompatibilities. This could be disruptive for large-scale CephFS deployments
 								because the cluster cannot easily reduce active MDS daemons to 1.
+								NOTE: Staggered upgrade of the mons/mgrs may be necessary to take advantage
 								of the feature. Refer to this link for how to perform it:
 								https://docs.ceph.com/en/quincy/cephadm/upgrade/#staggered-upgrade
 								Relevant tracker: https://tracker.ceph.com/issues/55715
-												PendingReleaseNotes: noted new MDSMap field refuse_client_session

Signed-off-by: Dhairya Parmar <dparmar@redhat.com>

											
										
										
											2022-11-07 13:23:41 +00:00
+								* Introduced a new file system flag `refuse_client_session` that can be set using the
 								`fs set` command. This flag allows blocking any incoming session
 								request from client(s). This can be useful during some recovery situations
 								where it's desirable to bring MDS up but have no client workload.
 								Relevant tracker: https://tracker.ceph.com/issues/57090
-												PendingReleaseNotes: add reference to the new mdsmap max_xattr_size field

Signed-off-by: Luís Henriques <lhenriques@suse.de>

											
										
										
											2022-06-02 14:12:29 +00:00
 								* New MDSMap field `max_xattr_size` which can be set using the `fs set` command.
 								  This MDSMap field configures the maximum size allowed for the full
 								  key/value set of a file system's extended attributes.  It effectively replaces
 								  the old per-MDS `max_xattr_pairs_size` setting, which is now dropped.
 								  Relevant tracker: https://tracker.ceph.com/issues/55725
-												mds: optionally forbid to use standby for another fs as last resort

Signed-off-by: Mykola Golub <mykola.golub@clyso.com>

											
										
										
											2023-06-07 12:57:38 +00:00
 								* Introduced a new file system flag `refuse_standby_for_another_fs` that can be
 								set using the `fs set` command. This flag prevents using a standby for another
 								file system (join_fs = X) when standby for the current filesystem is not available.
 								Relevant tracker: https://tracker.ceph.com/issues/61599
-												mon: add NVMe-oF gateway monitor and HA

- gateway submodule

Fixes: https://tracker.ceph.com/issues/64777

This PR adds high availability support for the nvmeof Ceph service. High availability means that even in the case that a certain GW is down, there will be another available path for the initiator to be able to continue the IO through another GW. High availability is achieved by running nvmeof service consisting of at least 2 nvmeof GWs in the Ceph cluster. Every GW will be seen by the host (initiator) as a separate path to the nvme namespaces (volumes).

The implementation consists of the following main modules:

- NVMeofGWMon - a PaxosService. It is a monitor that tracks the status of the nvmeof running services, and take actions in case that services fail, and in case services restored.
- NVMeofGwMonitorClient – It is an agent that is running as a part of each nvmeof GW. It is sending beacons to the monitor to signal that the GW is alive. As a part of the beacon, the client also sends information about the service. This information is used by the monitor to take decisions and perform some operations.
- MNVMeofGwBeacon – It is a structure used by the client and the monitor to send/recv the beacons.
- MNVMeofGwMap – The map is tracking the nvmeof GWs status. It also defines what should be the new role of every GW. So in the events of GWs go down or GWs restored, the map will reflect the new role of each GW resulted by these events. The map is distributed to the NVMeofGwMonitorClient on each GW, and it knows to update the GW with the required changes.

It is also adding 3 new mon commands:
- nvme-gw create
- nvme-gw delete
- nvme-gw show

The commands are used by the ceph adm to update the monitor that a new GW is deployed. The monitor will update the map accordingly and will start tracking this GW until it is deleted.

Signed-off-by: Leonid Chernin <lechernin@gmail.com>
Signed-off-by: Alexander Indenbaum <aindenba@redhat.com>

											
										
										
											2023-10-17 13:25:07 +00:00
+								* mon: add NVMe-oF gateway monitor and HA
 								  This adds high availability support for the nvmeof Ceph service. High availability
 								means that even if a certain GW is down, there will be another available
 								path for the initiator to continue the IO through another GW.
 								Two new mon commands are also added to notify the monitor about gateway creation/deletion:
 								  - nvme-gw create
 								  - nvme-gw delete
 								Relevant tracker: https://tracker.ceph.com/issues/64777