Commit Graph

41 Commits

Author SHA1 Message Date
Kefu Chai
ec8a40b08f qa/tasks/mgr: clean crash reports before waiting for clean
otherwise we have the following warning in the health report:

{"status":"HEALTH_WARN","checks":{"RECENT_MGR_MODULE_CRASH":{"severity":"HEALTH_WARN","summary":{"message":"1 mgr modules have recently crashed","count":1},"muted":false}},"mutes":[]}

and it does not disappear after the test waits for 30 seconds,
so the tasks.mgr.test_module_selftest.TestModuleSelftest test
fails like:

2021-07-21T09:59:52.560 INFO:tasks.cephfs_test_runner:======================================================================
2021-07-21T09:59:52.561 INFO:tasks.cephfs_test_runner:ERROR: test_module_commands (tasks.mgr.test_module_selftest.TestModuleSelftest)
2021-07-21T09:59:52.561 INFO:tasks.cephfs_test_runner:----------------------------------------------------------------------
2021-07-21T09:59:52.561 INFO:tasks.cephfs_test_runner:Traceback (most recent call last):
2021-07-21T09:59:52.562 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/git.ceph.com_ceph-c_6a5d5abc027f706687dec92f92ff6fc6f074d2ae/qa/tasks/mgr/test_module_selftest.py", line 201, in test_module_commands
2021-07-21T09:59:52.562 INFO:tasks.cephfs_test_runner:    self.wait_for_health_clear(timeout=30)
2021-07-21T09:59:52.562 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/git.ceph.com_ceph-c_6a5d5abc027f706687dec92f92ff6fc6f074d2ae/qa/tasks/ceph_test_case.py", line 172, in wait_for_health_clear
2021-07-21T09:59:52.563 INFO:tasks.cephfs_test_runner:    self.wait_until_true(is_clear, timeout)
2021-07-21T09:59:52.563 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/git.ceph.com_ceph-c_6a5d5abc027f706687dec92f92ff6fc6f074d2ae/qa/tasks/ceph_test_case.py", line 209, in wait_until_true
2021-07-21T09:59:52.563 INFO:tasks.cephfs_test_runner:    raise TestTimeoutError("Timed out after {0}s and {1} retries".format(elapsed, retry_count))
2021-07-21T09:59:52.564 INFO:tasks.cephfs_test_runner:tasks.ceph_test_case.TestTimeoutError: Timed out after 30s and 0 retries

in this change, the crash reports are nuked right after
we see the warning, so that we can have a clean health
report.

Fixes: https://tracker.ceph.com/issues/51743
Signed-off-by: Kefu Chai <kchai@redhat.com>
2021-07-21 22:46:18 +08:00
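
The ordering described in ec8a40b08f (remove the crash reports first, then wait for the health report to clear) can be sketched roughly as follows. This is a self-contained illustration, not the code from the change: `FakeCluster`, `clear_crash_reports` and this simplified `wait_until_true` are hypothetical stand-ins, and the real helpers in qa/tasks/ceph_test_case.py may behave differently.

```python
import time


class FakeCluster:
    """Hypothetical stand-in for a cluster; not part of Ceph's QA code."""

    def __init__(self):
        # One lingering crash report, as left behind by a crashed module.
        self.crash_reports = ["crash-1"]

    def health_checks(self):
        # The warning persists for as long as any crash report remains.
        if self.crash_reports:
            return {"RECENT_MGR_MODULE_CRASH": {"severity": "HEALTH_WARN"}}
        return {}

    def clear_crash_reports(self):
        self.crash_reports.clear()


def wait_until_true(condition, timeout, period=0.01):
    """Poll condition() until it returns True or timeout seconds elapse."""
    deadline = time.monotonic() + timeout
    while not condition():
        if time.monotonic() >= deadline:
            raise TimeoutError("Timed out after {0}s".format(timeout))
        time.sleep(period)


def wait_for_health_clear(cluster, timeout):
    """Wait until the (fake) health report contains no checks."""
    wait_until_true(lambda: not cluster.health_checks(), timeout)


cluster = FakeCluster()
# Without this step the wait below would time out, because the crash
# report keeps the health warning alive.
cluster.clear_crash_reports()
wait_for_health_clear(cluster, timeout=1)
print(cluster.health_checks())
```

In this sketch, skipping `clear_crash_reports()` makes `wait_for_health_clear` raise `TimeoutError`, which mirrors the failure in the traceback above.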
Patrick Donnelly
0d9032771c
qa: fix api test failures
"device_health_metrics" pool is gone -- .mgr pool is in.

I don't think the pool removal code in some test cases is necessary any
longer with recent changes to remove those warnings; so that code is
gone too.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-06-11 19:35:17 -07:00
Kefu Chai
39b2b5edc0 qa/tasks/mgr: skip test_diskprediction_local on python>=3.8
query the python version before trying to test diskprediction_local

Fixes: https://tracker.ceph.com/issues/50196
Signed-off-by: Kefu Chai <kchai@redhat.com>
2021-04-07 21:27:44 +08:00
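
A version-gated skip like the one 39b2b5edc0 describes might look like the sketch below. It assumes the version being checked is that of the local interpreter (`sys.version_info`); the actual change may query the Python version elsewhere, for example on the host running the mgr. The helper and class names are hypothetical.

```python
import sys
import unittest


def should_skip_diskprediction_local(version_info=sys.version_info):
    """Return True when the given interpreter version is 3.8 or newer."""
    return tuple(version_info[:2]) >= (3, 8)


class SelftestSketch(unittest.TestCase):
    def test_diskprediction_local(self):
        # Query the version first, and only then try to run the self-test.
        if should_skip_diskprediction_local():
            self.skipTest("diskprediction_local is skipped on python>=3.8")
        # Placeholder for the real self-test invocation (not shown here).
        self.assertTrue(True)
```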
Kefu Chai
70e99e76b5 mgr: do not migrate conf from config-key store to new-style conf
since all module options are using the new-style config framework.
the migration was offered for the use case of upgrading from luminous to mimic,
but pacific can only be upgraded from octopus. the mimic monitors are already
able to populate the configurations to mgr, not to mention the octopus
monitors, so there is no need to migrate the options stored in the config-key
store anymore.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2020-11-25 23:30:15 +08:00
Kefu Chai
dc977d543d qa/tasks/mgr: drop commented code
the test for diskprediction_cloud is never enabled, and the cloud-based
service it used is not reachable anymore. let's just remove the dead
code.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2020-08-19 11:08:39 +08:00
Kefu Chai
31704b3ecc qa/tasks/mgr: skip test_diskprediction_local on python>=3.8
See-also: https://tracker.ceph.com/issues/45147
Signed-off-by: Kefu Chai <kchai@redhat.com>
2020-05-12 11:54:23 +08:00
Kefu Chai
7d37226548 qa/tasks/mgr: use relative import
for better readability, and to spare developers from tracking back
to the top-level python package when referencing a submodule

Signed-off-by: Kefu Chai <kchai@redhat.com>
2020-03-27 14:51:24 +08:00
David Zafman
4841100b4e test: Disable self-test of diskprediction_cloud since it isn't loaded
See qa/packages/packages.yaml

Signed-off-by: David Zafman <dzafman@redhat.com>
2020-02-27 13:12:45 -08:00
Sebastian Wagner
f2c5472286 mgr/orchestrator_cli: rename to mgr/orchestrator
* Move `mgr/orchestrator.py` to `orchestrator/_interface.py`

Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
2020-02-17 10:24:01 +01:00
Kefu Chai
7d262db114 qa/tasks: call super class's setUp()
to address the regression introduced by
8729281121

Signed-off-by: Kefu Chai <kchai@redhat.com>
2020-02-15 12:39:08 +08:00
Sebastian Wagner
0342479835 mgr/mgr_module: Allow resetting module options
Introduced in 4872cc5aa3

`_ceph_set_module_option` also accepts `None`, not just strings.

Fixes: http://tracker.ceph.com/issues/40779

Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
2019-07-17 15:41:45 +02:00
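
One way to read 0342479835 is that passing `None` clears a stored value so that the default applies again. The sketch below illustrates that reading only. `OptionStoreSketch` and its methods are hypothetical and are not the `MgrModule` API, and the fall-back-to-default behaviour is an assumption based on the commit subject.

```python
class OptionStoreSketch:
    """Hypothetical option store; not the real mgr_module implementation."""

    def __init__(self, defaults):
        self._defaults = dict(defaults)
        self._values = {}

    def set_module_option(self, key, value):
        if value is None:
            # Assumed semantics: None resets the option to its default.
            self._values.pop(key, None)
        else:
            self._values[key] = str(value)

    def get_module_option(self, key):
        return self._values.get(key, self._defaults.get(key))


store = OptionStoreSketch({"testkey": "default"})
store.set_module_option("testkey", "foo")
print(store.get_module_option("testkey"))
store.set_module_option("testkey", None)
print(store.get_module_option("testkey"))
```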
Lenz Grimmer
96a65fbfb7
Merge pull request #26914 from votdev/issue_38331
mgr/dashboard: Add separate option to config SSL port

Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Sebastian Wagner <swagner@suse.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
2019-03-28 10:55:27 +01:00
Volker Theile
86f47f6bfd mgr/dashboard: Add separate option to config SSL port
There is a need to introduce this new config option because the MgrModule::get_module_option() and MgrModule::get_localized_module_option() methods will be refactored soon and will no longer support the default parameter. Instead, the default value must be configured in MODULE_OPTIONS. Currently we misuse server_port depending on whether SSL is enabled or not.

Fixes: https://tracker.ceph.com/issues/38331

Signed-off-by: Volker Theile <vtheile@suse.com>
2019-03-13 13:50:14 +01:00
Sage Weil
ebdd003bf4 qa/tasks/mgr/test_module_selftest: fix localized value test
When mgr/selftest/testkey = foo and mgr/selftest/x/testkey is not set,
then get_localized() should return foo.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-03-13 07:11:47 -05:00
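
The fallback that ebdd003bf4 tests for can be sketched with a plain dictionary. The key layout (`mgr/<module>/<key>` and `mgr/<module>/<id>/<key>`) follows the commit message; the `get_localized` function below is an illustrative stand-in with a made-up signature, not the mgr implementation.

```python
def get_localized(store, module, mgr_id, key):
    """Return the per-daemon value if set, else the module-wide value."""
    localized_key = "mgr/{0}/{1}/{2}".format(module, mgr_id, key)
    global_key = "mgr/{0}/{1}".format(module, key)
    if localized_key in store:
        return store[localized_key]
    return store.get(global_key)


store = {"mgr/selftest/testkey": "foo"}
# mgr/selftest/x/testkey is not set, so the module-wide value is returned.
print(get_localized(store, "selftest", "x", "testkey"))

# Once the localized key exists, it takes precedence.
store["mgr/selftest/x/testkey"] = "bar"
print(get_localized(store, "selftest", "x", "testkey"))
```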
Volker Theile
bc9643657a mgr: Fix broken get_localized_module_option function
Fixes: https://tracker.ceph.com/issues/38560

Signed-off-by: Volker Theile <vtheile@suse.com>
2019-03-11 17:25:18 +01:00
hsiang41
9799eb67eb mgr: Separate diskprediction cloud plugin from the diskprediction plugin
Separate diskprediction local and cloud from the diskprediction plugin.
Devicehealth invokes the device prediction function according to the global
configuration "device_failure_prediction_mode".

Signed-off-by: Rick Chen <rick.chen@prophetstor.com>
2018-11-16 00:15:41 -06:00
Volker Theile
34525ba3af Relocate cluster_log(). Only active modules can use it.
Signed-off-by: Volker Theile <vtheile@suse.com>
2018-10-05 14:46:58 +02:00
Volker Theile
95746ecce9 mgr: Add ability to trigger a cluster/audit log message from Python
Fixes: https://tracker.ceph.com/issues/36194

Signed-off-by: Volker Theile <vtheile@suse.com>
2018-10-04 13:33:18 +02:00
Rick Chen
4abb79f159 mgr/diskprediction: add prototype diskprediction module
This module is written by Rick Chen <rick.chen@prophetstor.com> and
provides both a built-in local predictor and a cloud mode that queries
a cloud service (provided by ProphetStor) to predict device failures.

Signed-off-by: Rick Chen <rick.chen@prophetstor.com>
Signed-off-by: Sage Weil <sage@redhat.com>
2018-09-17 08:20:57 -05:00
Noah Watkins
ea15b625f3 qa/mgr/selftest: handle always-on module fall out
need a non-always-on module. hello doesn't work because it isn't
installed. so switch to selftest.

Signed-off-by: Noah Watkins <nwatkins@redhat.com>
2018-08-28 13:45:58 -07:00
Sage Weil
00223d2364 qa/tasks/mgr/test_module_selftest: use hello instead of status for disabled command test
Signed-off-by: Sage Weil <sage@redhat.com>
2018-08-18 09:29:00 -05:00
John Spray
d24e6cb32a mgr: generic self test command
Avoid need for each module to expose a self-test
command: they can just implement the method,
and then get it called via the selftest module.

As well as fewer LOC, this means that the self
test commands are not cluttering the interface
for end users, as they're invisible until
the selftest module is loaded.

Signed-off-by: John Spray <john.spray@redhat.com>
2018-07-20 13:09:19 -04:00
John Spray
f02316adb4 mgr: enable inter-module calls
This is being done by passing native CPython objects
back and forth.  It's safe because sub-interpreters in CPython
share memory allocation infrastructure and share the GIL.

With a view to PEP554, we limit inter-interpreter calls
to pickleable objects, so that this may be implemented
using byte-arrays in future.

This infrastructure should enable:
 - the dashboard to display the status of other modules, for
   example the set of progress indicators from `progress`
 - dashboard and restful to share an underlying long running
   job mechanism.

Signed-off-by: John Spray <john.spray@redhat.com>
2018-07-20 13:07:17 -04:00
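
The pickleable-objects restriction mentioned in f02316adb4 can be illustrated with a round trip through `pickle`. The sketch assumes nothing about how the mgr passes objects between sub-interpreters; `call_with_pickled_args` is a hypothetical helper that only shows what limiting calls to pickleable objects implies for arguments and return values.

```python
import pickle


def call_with_pickled_args(func, *args):
    """Call func, forcing arguments and result through a pickle round trip.

    Anything that cannot be pickled is rejected before the call happens,
    which is the property a byte-array based transport would need.
    """
    payload = pickle.dumps(args)  # raises if an argument is not pickleable
    result = func(*pickle.loads(payload))
    return pickle.loads(pickle.dumps(result))


def progress_summary(events):
    # Plain dicts, lists and strings survive the round trip unchanged.
    return {"count": len(events), "names": sorted(events)}


print(call_with_pickled_args(progress_summary, ["rebalance", "backfill"]))
```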
John Spray
d7601c546f qa/mgr: delete devicehealth pool after selftest
This prevents tests getting hung up on the health
warnings from its very low pg count.

Signed-off-by: John Spray <john.spray@redhat.com>
2018-07-10 12:54:52 -04:00
Dan Mick
8145598f59 qa/tasks/mgr: add test_crash, call from test_module_selftest
Signed-off-by: Dan Mick <dan.mick@redhat.com>
2018-06-29 14:51:45 -07:00
Sage Weil
dd6ad72b90 mgr/devicehealth: add self-test
Signed-off-by: Sage Weil <sage@redhat.com>
2018-06-23 17:01:55 -05:00
Wido den Hollander
394b10049e mgr/telemetry: Add Ceph Telemetry module to send reports back to project
This Manager Module will send statistics and version information from
a Ceph cluster back to telemetry.ceph.com if the user has opted in to sending
this information.

Additionally a user can tell that the information is allowed to be made
public which then allows other users to see this information.

Signed-off-by: Wido den Hollander <wido@42on.com>
(cherry picked from commit 8f6137d162)
2018-05-14 23:34:25 +08:00
Wido den Hollander
d15d510bab
mgr/telegraf: Telegraf module for Ceph Mgr
Telegraf is an agent for collecting and reporting metrics.

It has multiple inputs and can send data to various outputs, for
example InfluxDB or ElasticSearch.

This module works by using the socket_listener of Telegraf and can
send data over UDP, TCP and a local Unix Socket.

Signed-off-by: Wido den Hollander <wido@42on.com>
2018-05-09 16:00:15 +02:00
John Spray
b7a5da8bf0 qa: update dashboard tests for https://
Signed-off-by: John Spray <john.spray@redhat.com>
2018-04-27 09:58:47 -04:00
John Spray
4b3f026d07 qa: update mgr test for MgrModule.OPTIONS
Signed-off-by: John Spray <john.spray@redhat.com>
2018-04-23 10:14:31 -04:00
John Spray
d4ed33c2e0 qa: test mgr live configuration updates
Signed-off-by: John Spray <john.spray@redhat.com>
2018-04-23 07:29:47 -04:00
Mohamad Gebai
fb638381b2 mgr/iostat: add self-test
Signed-off-by: Mohamad Gebai <mgebai@suse.com>
2018-04-12 00:26:24 -04:00
Ricardo Dias
86264d4b02
qa/tasks/mgr: move test initialization to setUpClass method
With this change, we avoid the disabling/enabling of the ceph-mgr module
being tested for each test function declared in each test case. Now
the ceph-mgr module being tested is disabled/enabled only once for each
test case.

Signed-off-by: Ricardo Dias <rdias@suse.com>
2018-03-05 13:07:18 +00:00
Kefu Chai
52bf90bb10
Merge pull request #20047 from jcsp/wip-prometheus-qa
qa: add new prometheus test to rados/mgr suite

Reviewed-By: Nathan Cutler <ncutler@suse.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2018-01-25 23:43:07 +08:00
John Spray
d9a47181c4 mgr: add health checks for failed modules
Signed-off-by: John Spray <john.spray@redhat.com>
2018-01-24 13:08:20 -05:00
John Spray
f95b079c21 qa/mgr: add test for command execution errors
Signed-off-by: John Spray <john.spray@redhat.com>
2018-01-24 13:08:20 -05:00
John Spray
e2c68d5e25 qa: assign prometheus ports during selftest
This was throwing IOError("Port 9283 not free on '::'",)
when trying to serve, since merging https://github.com/ceph/ceph/pull/19744

It's because the standbys (on the same node as the active) are
now trying to listen too.

Fixes: https://tracker.ceph.com/issues/22755
Signed-off-by: John Spray <john.spray@redhat.com>
2018-01-23 10:23:39 +00:00
John Spray
c64c9ff00d qa: configure zabbix properly before selftest
Even though the selftest routine doesn't care about
the settings, we should set them to avoid emitting
nasty log/health messages when enabling the module.

Fixes: http://tracker.ceph.com/issues/22514
Signed-off-by: John Spray <john.spray@redhat.com>
2017-12-21 08:28:55 -05:00
John Spray
05e648be6a qa: expand mgr testing
Some extra coverage of the dashboard, including its standby
redirect mode and the publishing of URIs.

Also invoking the command_spam mode of the selftest module.

Signed-off-by: John Spray <john.spray@redhat.com>
2017-11-01 08:21:42 -04:00
John Spray
d96a59e74b qa/mgr: fix influx/prometheus test names
This was a typo: they were swapped around.

Signed-off-by: John Spray <john.spray@redhat.com>
2017-10-11 17:00:01 +01:00
John Spray
99352ceced qa: add mgr module selftest task
The module self test commands give us a chance to
catch any other ceph changes that change something
that a module was relying on reading.

Signed-off-by: John Spray <john.spray@redhat.com>
2017-09-27 14:20:22 -04:00