RepoMirrors/ceph

mirror of https://github.com/ceph/ceph synced 2025-01-11 21:50:26 +00:00

Author	SHA1	Message	Date
SrinivasaBharathKanta	1017b1d230	Merge pull request #59743 from sseshasa/wip-fix-mclock-low-iops-capacity-threshold common,osd: Use last valid OSD IOPS value if measured IOPS is unrealistic	2024-11-06 15:46:54 +05:30
Kamoltat (Junior) Sirivadhna	28e38e30bb	Merge pull request #59483 from kamoltat/wip-ksirivad-exit-stretch-mode mon [stretch mode]: support disable_stretch_mode Reviewed-by: Nitzan Mordechai <nmordech@redhat.com>	2024-11-05 13:07:06 -05:00
Samuel Just	048ce81f45	Merge pull request #56677 from athanatos/sjust/for-review/wip-replica-read osd,crimson/osd: rework of replica read and related state Reviewed-by: Matan Breizman <mbreizma@redhat.com>	2024-11-04 09:49:09 -08:00
Ernesto Puerta	8ccb634804	mgr/zabbix: remove deprecated module This (already deprecated) module is removed as a side-effect of the deprecation and removal of the `restful` module. Fixes: https://tracker.ceph.com/issues/47066 Signed-off-by: Ernesto Puerta <epuertat@redhat.com>	2024-10-28 14:17:19 +01:00
Ernesto Puerta	96ec7badb8	mgr/restful: remove deprecated module Detailed changes: * Remove `restful` mgr module dir, * Remove Python dependencies (`pecan`, `werkzeug`) from ceph.spec and debian control, * Remove docs, * Remove associated QA tests, * Update vstart. Fixes: https://tracker.ceph.com/issues/47066 Signed-off-by: Ernesto Puerta <epuertat@redhat.com>	2024-10-28 14:17:18 +01:00
Samuel Just	dda683b20c	suites/rados/thrash-erasure-code/.../ec-small-objects-balanced.yaml: remove We don't support balanced reads on ec pools. Additionally, the yaml actually specifies 'balanced_reads' rather than 'balance_reads' and therefore has no actual effect. Signed-off-by: Samuel Just <sjust@redhat.com>	2024-10-21 17:04:51 +00:00
Sridhar Seshasayee	da4b85c55a	common,osd: Use last valid OSD IOPS value if measured IOPS is unrealistic The OSD's IOPS capacity is used by the mClock scheduler to determine the quantum of bandwidth allocation for the various operations on the OSD. Prior to this commit, maybe_override_max_osd_capacity_for_qos() only checked if the measured IOPS capacity exceeded the higher threshold defined by 'osd_mclock_iops_capacity_threshold_[hdd|ssd]' and, if so, fell back to the last valid or the default IOPS capacity as defined by osd_mclock_max_capacity_iops_[hdd|ssd]. It's quite possible that the reported IOPS is unrealistically low. This could be due to transient factors on the underlying device or it could indicate bad health of the device. Either way, the safer option would be to fall back to the last valid or the default IOPS setting for that OSD in order to avoid cluster performance (slow or stalled ops) issues down the line. Therefore, to handle this case, the commit introduces additional config options viz., - osd_mclock_iops_capacity_low_threshold_hdd - set to 50 IOPS and - osd_mclock_iops_capacity_low_threshold_ssd - set to 1000 IOPS If the measured IOPS capacity doesn't fall within the low and high threshold range, the default or the last valid IOPS capacity is used. The existing cluster log warning is suitably modified to convey the reason. Additionally, for a couple of valgrind related teuthology tests, the cluster warning is added to the ignorelist since the reported IOPS can be very low due to slowness. Fixes: https://tracker.ceph.com/issues/67421 Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>	2024-10-17 16:38:20 +05:30
Kamoltat (Junior) Sirivadhna	abfff2b714	Merge pull request #57146 from kamoltat/wip-ksirivad-fix-connection-score-json src/mon/ConnectionTracker.cc: Fix dump function Reviewed-by Kamoltat Sirivadhna <ksirivad@redhat.com>	2024-09-26 10:15:04 -04:00
Kamoltat Sirivadhna	4d2f8879be	qa: Added tests for disabling stretch mode Test disabling stretch mode with the following scenario: 1. Healthy Stretch Mode 2. Degraded Stretch Mode Fixes: https://tracker.ceph.com/issues/67467 Signed-off-by: Kamoltat Sirivadhna <ksirivad@redhat.com>	2024-09-22 17:12:07 +00:00
Adam Kupczyk	a787a91719	Merge pull request #54504 from aclamk/wip-aclamk-bs-refactor-write-path os/bluestore: Recompression, part 2. New write path.	2024-08-13 15:15:50 +02:00
Laura Flores	bd1082daaa	Merge pull request #58736 from amathuria/wip-66922-amat qa/rados/dashboard: Add PG_DEGRADED to ignorelist	2024-08-08 15:41:18 -05:00
Patrick Donnelly	cfed7c0baa	Merge PR #59029 into main * refs/pull/59029/head: qa: simplify postmerge construction Reviewed-by: Samuel Just <sjust@redhat.com> Reviewed-by: Brad Hubbard <bhubbard@redhat.com>	2024-08-07 20:58:17 -04:00
Kamoltat (Junior) Sirivadhna	6a0d503a59	Merge pull request #56233 from kamoltat/wip-ksirivad-fix-64802 RADOS: Generalize stretch mode pg temp handling to be usable without stretch mode Samuel Just <sjust@redhat.com>	2024-08-07 09:45:54 -04:00
Adam Kupczyk	8bd233bef5	qa/bluestore: Add write_v1/v2 selection Add framework for various random options for debug bluestore. Use framework to select: - write_v1 - write_v2 - write_v1 / write_v2 selected at random Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>	2024-08-07 10:55:46 +00:00
Adam Kupczyk	e88ab6547e	Merge pull request #58664 from aclamk/wip-aclamk-qa-less-bluestore-debug qa/suites/rados: Reduced BlueStore log levels	2024-08-06 12:53:02 +02:00
Patrick Donnelly	382357dcd4	qa: simplify postmerge construction and avoid errors when "clusternodes" is not defined. Fixes: https://tracker.ceph.com/issues/67352 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2024-08-05 21:07:24 -04:00
Yuri Weinstein	dd87d573cf	Merge pull request #58635 from badone/wip-tracker-50371-rados_api_test-timeout-failures qa: Restrict rados api tests to large clusters and increase timeout Reviewed-by: Laura Flores <lflores@redhat.com>	2024-08-01 06:44:34 -07:00
Yuri Weinstein	a2c60161de	Merge pull request #57863 from NitzanMordhai/wip-nitzan-thrash-erasure-code-crush-4-nodes-8-6-overrides suites/ec-rados-plugin=jerasure-k=8-m=6-crush: roles set with overrides Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com> Reviewed-by: Matan Breizman <Matan.Brz@gmail.com>	2024-08-01 06:42:09 -07:00
Yuri Weinstein	3956c4278a	Merge pull request #58205 from NitzanMordhai/wip-nitzan-rados-dashboard-test-update-ignorelist suites: test should ignore osd_down warnings Reviewed-by: Ronen Friedman <rfriedma@redhat.com>	2024-07-26 10:24:44 -07:00
Yuri Weinstein	4adc795c49	Merge pull request #58215 from badone/wip-tracker-59380-admin-socket-injectfull qa/suites/rados: Cancel injectfull to allow cleanup Reviewed-by: Neha Ojha <nojha@redhat.com>	2024-07-23 10:57:08 -07:00
Yuri Weinstein	1fa959e982	Merge pull request #57485 from sseshasa/wip-fix-validator-osd-down-grace-tmout qa/suites/rados/verify/validater: increase heartbeat grace timeout Reviewed-by: Samuel Just <sjust@redhat.com> Reviewed-by: Laura Flores <lflores@redhat.com>	2024-07-23 10:50:32 -07:00
Laura Flores	39a09a3590	Merge pull request #58275 from NitzanMordhai/wip-nitzn-host-thraser-fix-min-in-checks suites: host thrasher should check min_in before thrashing host	2024-07-22 13:22:30 -05:00
Aishwarya Mathuria	4a4f9a3e99	qa/rados/dashboard: Add PG_DEGRADED to ignorelist Eventually, the PG_DEGRADED warning goes away and cluster goes back to healthy state before the end of the test Fixes: https://tracker.ceph.com/issues/66922 Signed-off-by: Aishwarya Mathuria <amathuri@redhat.com>	2024-07-22 22:22:59 +05:30
Adam Kupczyk	8ee137f662	qa/suites/rados: Reduced BlueStore log levels Having debug 20 is impractical. Slows down execution and takes disk space, but gives little help in eventual debugging. Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>	2024-07-22 15:34:02 +02:00
Adam Kupczyk	337c8bf901	Merge pull request #57002 from aclamk/wip-aclamk-bs-storetest-expand-synthetic Improved structure for objectstore unit tests.	2024-07-22 13:48:06 +02:00
Brad Hubbard	d034fec463	qa: Restrict rados api tests to large clusters and increase timeout Running these tests with thrashers on small clusters leads to many very slow ops due to the cluster being overloaded. That has a tendency to make some of the API tests timeout and fail. Fixes: https://tracker.ceph.com/issues/50371 Signed-off-by: Brad Hubbard <bhubbard@redhat.com>	2024-07-18 09:09:22 +10:00
Kamoltat	ed7f4e8829	qa: Added mon connection score tests Basically when we deploy a 3 MONS Check if the connection scores are clean with a 60 seconds grace period Fixes: https://tracker.ceph.com/issues/65695 Signed-off-by: Kamoltat <ksirivad@redhat.com>	2024-07-17 22:26:55 +00:00
Kamoltat	7b41aff3f0	qa/suites/rados: 3-az-stretch-cluster-netsplit test Test the case where 2 DC loses connection with each other for a 3 AZ stretch cluster with stretch pool enabled. Check if cluster is accessible and PGs are active+clean after reconnected. Signed-off-by: Kamoltat <ksirivad@redhat.com>	2024-07-17 22:16:01 +00:00
Kamoltat	4ca1320727	qa/suites/rados/singleton/all: init mon-stretch-pool.yaml Test the following new Ceph CLI commands: `ceph osd pool stretch set` `ceph osd pool stretch unset` `ceph osd pool stretch show` `qa/workunits/mon/mon-stretch-pool.sh` will create the stretch cluster while performing input validation for the CLI Commands mentioned above. `qa/tasks/stretch_cluster.py` is in charge of setting a pool to stretch cluster and checks whether it prevents PGs from going active when there are not enough buckets available in the acting set of PGs to go active. Also, test different MON fail over scenarios after setting pool as stretch `qa/suites/rados/singleton/all/mon-stretch-pool.yaml` brings the scripts together. Fixes: https://tracker.ceph.com/issues/64802 Signed-off-by: Kamoltat <ksirivad@redhat.com>	2024-07-17 22:12:04 +00:00
Nitzan Mordechai	e5cd5469b2	suites/ec-rados-plugin=jerasure-k=8-m=6-crush: roles set with overrides roles being set without overrides causing too many values to unpack (expected 1) Fixes: https://tracker.ceph.com/issues/66209 Signed-off-by: Nitzan Mordechai <nmordech@redhat.com>	2024-06-27 06:29:02 +00:00
Nitzan Mordechai	89d695fb8b	suites: check for host thrasher The last PR modified the suites to only check for host thrasher. This update fixes that issue by implementing different settings with dedicated YAML files for host thrashing Fixes: https://tracker.ceph.com/issues/66657 Signed-off-by: Nitzan Mordechai <nmordech@redhat.com>	2024-06-26 12:16:48 +00:00
Brad Hubbard	4c5d0e30d2	qa/suites/rados: Cancel injectfull to allow cleanup IO is frozen when the injectfull command is sent as part of the test which causes the cleanup to hang so we need to clear it. Fixes: https://tracker.ceph.com/issues/59380 Signed-off-by: Brad Hubbard <bhubbard@redhat.com>	2024-06-26 10:03:43 +10:00
Yuri Weinstein	359d20f326	Merge pull request #58141 from ljflores/wip-tracker-65852 qa/suites/rados/thrash/workloads: remove cache tiering workload Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com> Reviewed-by: Samuel Just <sjust@redhat.com>	2024-06-25 06:47:14 -07:00
Nitzan Mordechai	2c65f1da96	suites: test should ignore osd_down warnings Fixes: https://tracker.ceph.com/issues/64870 Signed-off-by: Nitzan Mordechai <nmordech@redhat.com>	2024-06-23 08:49:45 +00:00
Adam King	98c986f1f5	Merge pull request #57412 from adk3798/stray-laundry2 qa/cephadm: fix ignorelist of CEPHADM_STRAY_DAEMON for rados_api_tests Reviewed-by: Laura Flores <lflores@ibm.com>	2024-06-20 12:06:18 -04:00
Laura Flores	35505a7f1f	qa/suites/rados/thrash/workloads: remove cache tiering workload Fixes: https://tracker.ceph.com/issues/65852 Signed-off-by: Laura Flores <lflores@ibm.com>	2024-06-19 12:53:44 -05:00
Laura Flores	820e4004f3	qa/suites/rados/thrash-old-clients: update supported releases and distro thrash-old-clients tests should only support N-3 releases. To fix this for main, I have removed all releases < quincy and have added squid. Also, we are fully switching to centos.9_stream packages/containers after the centos.8_stream end of life, so I changed the distro from centos.8_stream to centos.9_stream. *** Note: If this commit is backported, it should be done in such a way that only releases >= quincy reference centos.9_stream. For instance, if backporting to squid, a reef/squid thrash test is okay to make references to centos.9_stream since both reef and squid support this, but a pacific/squid test will have to take a different approach since pacific does not support centos.9_stream. Fixes: https://tracker.ceph.com/issues/66398 Signed-off-by: Laura Flores <lflores@ibm.com>	2024-06-10 17:34:27 -05:00
nmordech@redhat.com	3f26a965f6	suites: adding dencoder test multi versions We are currently conducting regular ceph-dencoder tests for backward compatibility. However, we are omitting tests for forward compatibility. This suite will introduce tests against the ceph-objects-corpus to address forward compatibility issues that may arise. the script will install N-2 version and run against the latest version corpus objects that we have, then install N-1 to N version and check them as well. Signed-off-by: Nitzan Mordechai <nmordech@redhat.com>	2024-05-16 05:16:17 +00:00
Sridhar Seshasayee	aae02b6af4	qa/suites/rados/verify/validater: increase heartbeat grace timeout OSD_DOWN cluster log warning is raised on rare occasions due to the osd_heartbeat_grace timeout getting exceeded. The warning is soon cleared. Given the nature of the test (valgrind), the grace timeout is increased to 160 secs to avoid generating the warning. Fixes: https://tracker.ceph.com/issues/65768 Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>	2024-05-16 10:02:43 +05:30
Adam King	0a67436e36	qa/cephadm: fix ignorelist of CEPHADM_STRAY_DAEMON for rados_api_tests Not every log with this error has the parentheses, so these warnings were still causing the test to fail [ERR] [WRN] CEPHADM_STRAY_DAEMON: 2 stray daemon(s)... in cluster log Signed-off-by: Adam King <adking@redhat.com>	2024-05-10 16:21:45 -04:00
Patrick Donnelly	07afb4ae09	Merge PR #56997 into main * refs/pull/56997/head: pybind/mgr: disable sqlite3/python autocommit qa/tasks/mgr: add tests for sqlite autocommit qa/tasks/vstart_runner: run daemons in foreground qa/tasks/vstart_runner: add missing poll method qa/suites/rados/mgr: add cli/devicehealth tasks qa: reorganize mgr unit tests qa: use position-independent link qa: add missing terminating newline pybind/mgr: add killpoint for sqlite3 database setup mgr: allow specifying module option level mon/MgrMonitor: promote standby when unsetting down flag mon/MgrMonitor: only drop active if exists Reviewed-by: Ernesto Puerta <epuertat@redhat.com>	2024-04-30 16:46:06 -04:00
Adam Kupczyk	9eb14fc01c	qa/rados: Adapt bluestore tests to new naming in ceph_test_objectstore Plus: fixed bluestore compression test invocation. Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>	2024-04-30 14:24:49 +00:00
Patrick Donnelly	fb82b6d35a	qa/tasks/mgr: add tests for sqlite autocommit That autocommit is properly turned off and that commits via context managers work as expected. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2024-04-29 16:33:32 -04:00
Adam King	1d97f673d3	qa/cephadm: ignore stray daemon warning during rados_api_tests The "stray daemon" that is getting logged about in this test is from "stray daemon laundry.pid70383 on host smithi027 not managed by cephadm". It seems the rados_api_tests is creating some additional "laundry" entity during these tests that gets reported as an actual daemon in the mgr, but cephadm is unaware of it, resulting in the warning. Originally we thought to maybe add "laundry" itself to the ignorelist, but without an additional patch that added extra logging for debug purposes (which can't be merged) the log statement found in the logs due to this problem will not say what daemon it found to be stray. There will just be a generic warning about a stray daemon. In a real cluster, a user would then check "ceph health detail" to find out what daemon is stray, but the log scraper can't do this and just fails the test due to the presence of the warning. Signed-off-by: Adam King <adking@redhat.com>	2024-04-29 13:54:37 -04:00
Patrick Donnelly	440f25e1ec	qa/suites/rados/mgr: add cli/devicehealth tasks These should have been part of the commit adding the tests. Fixes: `9ebcbdbed0` Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2024-04-29 12:22:27 -04:00
Patrick Donnelly	2f48dc9a00	qa: reorganize mgr unit tests Refactor common tasks and allow loading mgrmodules before unittests start. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2024-04-29 12:22:27 -04:00
Patrick Donnelly	1749edd668	qa: use position-independent link Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2024-04-29 12:22:27 -04:00
Casey Bodley	0c72fcc26a	Merge pull request #56008 from kchheda3/wip-notification-subsys rgw/notification: add rgw notification specific debug log subsystem Reviewed-by: Yuval Lifshitz <ylifshit@redhat.com>	2024-03-21 15:08:35 +00:00
Laura Flores	88f8db5c4b	Merge pull request #56146 from ljflores/wip-tracker-64725 qa/suites/rados/singleton: add POOL_APP_NOT_ENABLED to ignorelist	2024-03-20 16:50:47 -05:00
Yuri Weinstein	98a7421080	Merge pull request #53308 from NitzanMordhai/wip-nitzan-qa-tasks-with-crush-rules suites: qa tasks with crush rules Reviewed-by: Samuel Just <sjust@redhat.com>	2024-03-20 08:37:45 -07:00


938 Commits
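
The range check described in commit da4b85c55a can be illustrated with a minimal Python sketch. This is not the Ceph C++ code: the helper name `effective_iops_capacity` and its parameters are hypothetical, treating the threshold bounds as inclusive is an assumption, and the high-threshold and default-capacity numbers in the usage lines are placeholders. Only the two low thresholds (50 IOPS for HDD, 1000 IOPS for SSD) come from the commit message.

```python
from typing import Optional

# Low thresholds quoted in the commit message, in IOPS.
LOW_THRESHOLD_IOPS = {"hdd": 50.0, "ssd": 1000.0}


def effective_iops_capacity(
    measured: float,
    low_threshold: float,
    high_threshold: float,
    default_capacity: float,
    last_valid: Optional[float] = None,
) -> float:
    """Pick the IOPS capacity to use (hypothetical helper).

    A measurement inside [low_threshold, high_threshold] is accepted.
    Anything outside that range is treated as unrealistic, and the last
    valid value is used if one exists, otherwise the default.
    Inclusive bounds are an assumption of this sketch.
    """
    if low_threshold <= measured <= high_threshold:
        return measured
    return last_valid if last_valid is not None else default_capacity


# Placeholder high threshold and default capacity, for illustration only.
HIGH, DEFAULT = 500.0, 300.0

too_low = effective_iops_capacity(10.0, LOW_THRESHOLD_IOPS["hdd"], HIGH, DEFAULT)
in_range = effective_iops_capacity(120.0, LOW_THRESHOLD_IOPS["hdd"], HIGH, DEFAULT)
too_high = effective_iops_capacity(9000.0, LOW_THRESHOLD_IOPS["hdd"], HIGH, DEFAULT,
                                   last_valid=250.0)
```

With these placeholder inputs, the too-low measurement falls back to the default, the in-range measurement is kept, and the too-high measurement falls back to the last valid value.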