* refs/pull/33064/head:
cephadm: add version to `command_ls` output
cephadm: add type checking to `update_firewalld`
cephadm: allow prepare-host to start an enabled service
cephadm: add type checking for `check_host` and `prepare_host`
cephadm: generalize logic for checking and enabling units
cephadm: add 'CEPH_CONF' to the NFS ganesha container envs
cephadm: trim nfs.json sample
qa/workunits/cephadm/test_cephadm.sh: systemctl stop nfs-server
qa/workunits/cephadm/test_cephadm.sh: make pgs available
cephadm: add some log lines
cephadm: check port in use
cephadm: add/remove nfs ganesha grace
cephadm: update firewalld with nfs service
qa/workunits/cephadm/test_cephadm.sh: add nfs-ganesha test
cephadm: add ganesha.conf
cephadm: add NFSGanesha deployment type
cephadm: consolidate list of supported daemons
cephadm: use keyword instead of positional args
Reviewed-by: Sebastian Wagner <swagner@suse.com>
When a non-global level does not have a schedule and a higher level
is used as the parent, schedules from all branches under the parent
were wrongly listed, instead of only those from the branch of
interest.
Signed-off-by: Mykola Golub <mgolub@suse.com>
we normalize object-locator to object_locator when parsing command line
options. but object-locator is more consistent with other options
supported by "rados" cli, and "-" is easier to type than "_". it's also
more widely used in command line options.
Signed-off-by: Kefu Chai <kchai@redhat.com>
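A minimal sketch of the idea behind the option spelling change above, assuming a parser that accepts both spellings and treats the dashed form as canonical. The helper name `normalize_option` is hypothetical and is not part of the "rados" cli, which is written in C++.

```python
def normalize_option(arg: str) -> str:
    """Rewrite underscores to dashes in a long option name only.

    Hypothetical helper: the value after '=' is left untouched, and
    anything that is not a long option passes through unchanged.
    """
    if not arg.startswith("--"):
        return arg  # positional arguments and short options pass through
    name, sep, value = arg.partition("=")
    return name.replace("_", "-") + sep + value


print(normalize_option("--object_locator=foo_bar"))
```

With this approach both `--object_locator` and `--object-locator` resolve to the same canonical name, while values that contain underscores are preserved.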
* refs/pull/33885/head:
Merge pull request #33848 from mchangir/octopus-tests-remove-suprious-whitespace
Merge PR #33746 into octopus
Merge PR #33830 into octopus
Merge PR #33732 into octopus
Merge PR #33620 into octopus
Merge pull request #33876 from tchaikov/octopus-cephadm-mypy
cephadm: add "assert foo is not None" for mypy check
Merge pull request #33067 from tspmelo/wip-rbd-delete-with-snapshot
cephadm: add grafana adopt
Merge PR #33771 into octopus
Merge PR #33850 into octopus
Merge PR #33853 into octopus
Merge PR #33857 into octopus
Merge PR #32990 into octopus
Merge PR #33713 into octopus
Merge PR #33838 into octopus
qa/tasks/cephadm: no default mon|mgr|crash service specs
qa/suites/rados/cephadm/upgrade: upgrade start point that supports the no-spec option
Merge PR #33832 into octopus
cephadm: bootstrap: wait for mgr to restart after enabling a module
mgr: add 'mgr_status' tell command
Merge pull request #33839 from rhcs-dashboard/44538-fix-rgw-grafana-get-put-latencies
Merge pull request #33743 from votdev/issue_43869_fix_qa_test
cephadm: create initial mon and mgr service specs too
cephadm: no need to pregenerate a crash key for the bootstrap host
mgr/cephadm: do not complain when we don't have enough hosts
mgr/cephadm: remove orphan daemons
mgr/cephadm: report size=0 for fabricated ServiceDescription
mgr/cephadm: safety check to prevent removing all mon|mgr daemons
mgr/cephadm: prevent scaling mon|mgr below count=1
mgr/cephadm: do not remove daemons from remove_service
Merge pull request #33805 from tchaikov/wip-44500
spec: Podman (temporarily) requires apparmor-abstractions on suse
mgr/cephadm: Make sure we don't co-locate the same daemon
monitoring: fix RGW grafana chart 'Average GET/PUT Latencies'
tests: remove spurious whitespace
mgr/cephadm: fix service list filtering
Merge PR #33825 into octopus
Merge PR #33811 into octopus
Revert "Merge pull request #33673 from cbodley/wip-denc-enum"
mgr/cephadm: fix upgrade order
Merge PR #33801 into octopus
Merge PR #33822 into octopus
cephadm: bootstrap: tolerate error return from -h
Merge PR #33809 into octopus
Merge PR #32678 into octopus
cephadm: use `sh` instead of `bash` during enter
ceph.in: only shut down rados on clean exit
common/ceph_timer: Pass reference to waited time on stack
common/ceph_timer: Add test
common/ceph_timer: Use unique_function, allowing noncopyable events
common/ceph_timer: Couple cleanups
common/ceph_timer: Fix namespaces
common/ceph_timer: Add missing includes
common/ceph_timer.h: Don't indent contents of a namespace
mgr/dashboard: Crush rule modal
mgr/dashboard: Preserve rule selection on pool type change
mgr/dashboard: Crush rule is only sent during replicated pool creation
mgr/dashboard: Explicit returns in pool form
mgr/dashboard: Removes fork join in pool form
mgr/dashboard: Hide ECP actions during ec pool edit
mgr/dashboard: Pool form erasure/replicated boolean
mgr/dashboard: Change pool info API endpoint
mgr/dashboard: Moves ECP info endpoint to UI-API
mgr/cephadm: add _remove_osds_bg back to main loop
mgr/cephadm/osd: update removal report immediately
qa/tasks/ceph_manager: use StringIO for capturing COT output
qa/standalone/scrub/osd-scrub-repair: force osdmap prop to osds
qa/standalone/scrub/osd-scrub-test: wait longer for update
qa/tasks/ceph_manager: capture stderr for COT
qa/suites/rados/ceph: drop opensuse for now
mon/MonClient: send logs to mon on separate schedule than pings
mgr/dashboard: Fix missing ImageSpec usage
mgr/dashboard: Allow removing RBD with snapshots
mgr/dashboard: Refactor and cleanup tasks.mgr.dashboard.test_user
mgr/dashboard: support multiple DriveGroups when creating OSDs
mon/MonClient: send logs to mon even if we have no keepalive2
cephadm: flag dashboard user to change password
Reviewed-by: Sebastian Wagner <swagner@suse.com>
The OpenStack Tempest tests do not stay stable and break approximately
every six months. Remove the test suite for now.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
The unmap action only sends a signal to the kernel to notify the
rbd-nbd daemon to disconnect. Therefore, it's possible that an
unmap followed by an immediate re-map to the same device might
fail since the unmap is still in-progress.
Fixes: https://tracker.ceph.com/issues/44567
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
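One way a caller could cope with the asynchronous unmap described above is to poll until the device is really gone before re-mapping. The sketch below is an assumption about how such a wait could look, not the fix from this commit; `wait_until` and the simulated `device_is_free` predicate are hypothetical names.

```python
import time
from typing import Callable


def wait_until(predicate: Callable[[], bool],
               timeout: float = 30.0,
               interval: float = 0.5) -> bool:
    """Poll predicate() until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Simulated state: the "device" becomes free on the third check.
# A real caller would inspect the device state instead.
checks = {"count": 0}


def device_is_free() -> bool:
    checks["count"] += 1
    return checks["count"] >= 3


print(wait_until(device_is_free, timeout=5.0, interval=0.01))
```

Using a monotonic clock keeps the timeout correct even if the wall clock is adjusted while waiting.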
* refs/pull/33830/head:
qa/tasks/cephadm: no default mon|mgr|crash service specs
qa/suites/rados/cephadm/upgrade: upgrade start point that supports the no-spec option
cephadm: create initial mon and mgr service specs too
cephadm: no need to pregenerate a crash key for the bootstrap host
mgr/cephadm: do not complain when we don't have enough hosts
mgr/cephadm: remove orphan daemons
mgr/cephadm: report size=0 for fabricated ServiceDescription
mgr/cephadm: safety check to prevent removing all mon|mgr daemons
mgr/cephadm: prevent scaling mon|mgr below count=1
mgr/cephadm: do not remove daemons from remove_service
Reviewed-by: Sebastian Wagner <swagner@suse.com>
* refs/pull/33620/head:
mgr/dashboard: Crush rule modal
mgr/dashboard: Preserve rule selection on pool type change
mgr/dashboard: Crush rule is only sent during replicated pool creation
mgr/dashboard: Explicit returns in pool form
mgr/dashboard: Removes fork join in pool form
mgr/dashboard: Hide ECP actions during ec pool edit
mgr/dashboard: Pool form erasure/replicated boolean
mgr/dashboard: Change pool info API endpoint
mgr/dashboard: Moves ECP info endpoint to UI-API
Reviewed-by: Tiago Melo <tmelo@suse.com>
The rados api tests are failing WatchNotify because the OSDs are so
heavily lagged, in large part due to the high debug levels of debug_ms=20
and debug_osd=25. Reduce them.
Also increase the heartbeat grace so slow valgrind-y osds don't get marked
down.
Signed-off-by: Sage Weil <sage@redhat.com>
see also qa/suites/krbd/rbd/tasks/rbd_workunit_suites_fsx.yaml
Fixes: https://tracker.ceph.com/issues/44552
Signed-off-by: Kefu Chai <kchai@redhat.com>
mgr/dashboard: Allow deletion of RBD with snapshots
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Stephan Müller <smueller@suse.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
Ensure that snapshot-based mirroring is tested in different RBD image
feature combinations.
Fixes: https://tracker.ceph.com/issues/44396
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
mgr/dashboard: Refactor and cleanup tasks.mgr.dashboard.test_user
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Stephan Müller <smueller@suse.com>
* refs/pull/33809/head:
qa/standalone/scrub/osd-scrub-repair: force osdmap prop to osds
qa/standalone/scrub/osd-scrub-test: wait longer for update
Reviewed-by: David Zafman <dzafman@redhat.com>
Adds option `mon_allow_pool_size_one`, which is disabled by default
to ensure pools are not configured without replicas.
If the user still wants to use pool size 1, they have to set
`mon_allow_pool_size_one` to true and then pass the flag
`--yes-i-really-mean-it` to the cli command.
Example:
`ceph osd pool set test size 1 --yes-i-really-mean-it`
Fixes: https://tracker.ceph.com/issues/44025
Signed-off-by: Deepika Upadhyay <dupadhya@redhat.com>
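The two-step guard described above can be sketched as follows. This is only an illustration of the decision logic as the commit message states it; the function name, parameters, and error strings are assumptions and do not come from the monitor code.

```python
from typing import Optional


def check_pool_size(size: int,
                    allow_size_one: bool,
                    yes_i_really_mean_it: bool) -> Optional[str]:
    """Return an error message if the request must be refused, else None.

    allow_size_one stands in for the `mon_allow_pool_size_one` option and
    yes_i_really_mean_it for the `--yes-i-really-mean-it` flag.
    """
    if size != 1:
        return None  # the guard only applies to pools without replicas
    if not allow_size_one:
        return "pool size 1 is disabled (mon_allow_pool_size_one is false)"
    if not yes_i_really_mean_it:
        return "pass --yes-i-really-mean-it to confirm pool size 1"
    return None


print(check_pool_size(1, allow_size_one=False, yes_i_really_mean_it=True))
print(check_pool_size(1, allow_size_one=True, yes_i_really_mean_it=True))
```

Both conditions must hold, so enabling the option alone is not enough to create an unreplicated pool by accident.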
* refs/pull/33793/head:
qa/suites/rados/cephadm/upgrade: new start point
qa/tasks/cephadm: put bootstrap config etc directly in /etc/ceph
cephadm: shell: default to config and keyring in /etc/ceph, if present
Reviewed-by: Ricardo Marques <rimarques@suse.com>
Now a crush rule can be created and deleted through the pool form,
similar to the ECP profile.
The creation form is somewhat more intelligent as it checks the crush
map to help create a usable rule, with only a few clicks
through preselections.
Fixes: https://tracker.ceph.com/issues/43260
Signed-off-by: Stephan Müller <smueller@suse.com>
Moves the "_info" endpoint of pool into an equivalent
UI-API call with the name "info".
Added three more attributes to the info dict, which enables the dashboard
to get all the needed data by calling only info; currently three calls
are used to do that.
Removed pool_name parameter as the outcome was not used.
Updated the tests and related angular files accordingly.
Fixes: https://tracker.ceph.com/issues/44371
Signed-off-by: Stephan Müller <smueller@suse.com>
Moves the "_info" endpoint of erasure code profile into an equivalent
UI-API call with the name "info".
The serialization of the profile was outsourced into "ceph-service" as
it's used somewhere else (follow up commit).
Removed unused methods in angular service and REST controller.
Fixed path in angular service.
Fixes: https://tracker.ceph.com/issues/44371
Signed-off-by: Stephan Müller <smueller@suse.com>
there are a couple of factors we should consider when choosing between
BytesIO and StringIO:
- if the producer is producing binary
- if we are expecting binary
- if the layers in between them are doing the decoding/encoding
automatically.
in our case, the producer is either the ChannelFile instances returned
by paramiko.SSHClient or subprocess.CompletedProcess instances returned
by subprocess.run(). the former are file-like objects opened in "r" mode,
but their contents are decoded with utf-8 when reading if
ChannelFile.FLAG_BINARY is not specified. that's why we always try to
add this flag in orchestra/run.py when collecting the stdout and stderr
from paramiko.SSHClient after executing a command.
back in python2, this worked just fine, as we didn't differentiate bytes
from str back then.
but in python3, we have to make a decision. in the case of
ceph-objectstore-tool (COT for short), it does not produce binary and
we don't check its output with binary, so, if neither Remote.run() nor
LocalRemote.run() decodes/encodes for us, it's fine.
so it boils down to `copy_to_log()`:
i think we should respect the consumer's expectation, and only decode
the output if a StringIO is passed in as stdout or stderr.
as we always log the output with logging we could either set
`ChannelFile.FLAG_BINARY` depending on the type of `capture` or not.
if it's not set, paramiko will return str (bytes) on python2, and str on
python3. if it's set, paramiko will return str (bytes) on python2,
and bytes on python3.
if there is non-ASCII in the output, logging will fail with a
`UnicodeDecodeError` exception. and paramiko throws the same exception
when trying to decode for us if `ChannelFile.FLAG_BINARY` is not
specified.
so to ensure that we always have logging messages no matter if the
producer follows the rule of "use StringIO if you only emit text" or
not, we have to use `ChannelFile.FLAG_BINARY`, and force paramiko
to send us the bytes. but we still have the luxury to use StringIO
and do the decode when the caller asks for str explicitly. that'd save
the pain of using `str.decode()` or `six.ensure_str()` everywhere
even if we can assure that the program does not write binary.
Signed-off-by: Kefu Chai <kchai@redhat.com>
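The approach argued for above (always read bytes, always log, decode only when the caller passed a StringIO) can be sketched like this. `copy_to_log` is the name used in the message, but the signature and body here are assumptions for illustration and are not the code in orchestra/run.py. The source is modelled as any iterable of byte lines.

```python
import io
import logging


def copy_to_log(src, logger, capture=None, level=logging.INFO):
    """Log every line from a bytes source and optionally capture it.

    Assumed behaviour: undecodable bytes are replaced instead of raising,
    so a log message is always produced.
    """
    for raw in src:
        text = raw.decode("utf-8", errors="replace")
        logger.log(level, text.rstrip("\n"))
        if capture is None:
            continue
        if isinstance(capture, io.StringIO):
            capture.write(text)   # caller asked for str
        else:
            capture.write(raw)    # caller asked for bytes


log = logging.getLogger("copy_to_log_demo")
text_out, binary_out = io.StringIO(), io.BytesIO()
copy_to_log(io.BytesIO(b"ok\n\xff\n"), log, capture=text_out)
copy_to_log(io.BytesIO(b"ok\n\xff\n"), log, capture=binary_out)
```

The design choice is that the consumer's buffer type states its expectation, so callers that only handle text do not need to decode by hand.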
as we are expecting the error message written to stderr, and we need to
check for the error messages in it.
this change addresses the regression introduced by
204ceee156
Fixes: https://tracker.ceph.com/issues/44500
Signed-off-by: Kefu Chai <kchai@redhat.com>
This puts the conf and keyring in /etc/ceph earlier rather than later,
making them useful for debugging a live system *during* bootstrap. It's
also less code.
Signed-off-by: Sage Weil <sage@redhat.com>
This version understands how to apply a mgr spec like '2;host=x' with a
semicolon. This particular test build does.
Signed-off-by: Sage Weil <sage@redhat.com>
The 'timeout' option in the environment may conflict with the
ones in some ceph commands, like:
$ timeout 120 ./bin/ceph daemon mds.b session config 8718 timeout 45
The old code also gives an incorrect result. For example:
['adjust-ulimits', 'ceph-coverage', 'timeout', '120', 'ceph', 'fs', 'dump']
is transformed to:
['adjust-ulimits', 'ceph-coverage', '120', 'ceph', 'fs', 'dump']
The '120' is left behind.
Fixes: https://tracker.ceph.com/issues/44437
Signed-off-by: Xiubo Li <xiubli@redhat.com>
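A sketch of one way to avoid both problems described above: drop the `timeout <seconds>` wrapper as a pair, and only look for it before the ceph command itself so that a `timeout` argument belonging to the command survives. The helper name and its matching rule (basename equal to `ceph`) are assumptions, not the code from this commit.

```python
import os


def strip_timeout_wrapper(args, command="ceph"):
    """Remove a leading 'timeout <secs>' wrapper without touching the
    arguments that follow the actual command."""
    # Index of the first argument whose basename is the command name.
    cmd_index = next(
        (i for i, arg in enumerate(args) if os.path.basename(arg) == command),
        len(args),
    )
    prefix, rest = args[:cmd_index], args[cmd_index:]
    cleaned = []
    i = 0
    while i < len(prefix):
        if prefix[i] == "timeout" and i + 1 < len(prefix):
            i += 2  # drop both the wrapper name and its duration
            continue
        cleaned.append(prefix[i])
        i += 1
    return cleaned + rest


print(strip_timeout_wrapper(
    ["adjust-ulimits", "ceph-coverage", "timeout", "120", "ceph", "fs", "dump"]))
```

Applied to the `session config 8718 timeout 45` example, only the wrapper in front of `./bin/ceph` is removed and the trailing `timeout 45` stays in place.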
... to verify the attributes of clone and source subvolume belonging
to different subvolume groups.
Introduced in e22d546beb
Fixes: https://tracker.ceph.com/issues/44438
Signed-off-by: Ramana Raja <rraja@redhat.com>
- use string.ascii_uppercase instead of string.uppercase
- use six.ensure_str for bytes when required
- use six.ensure_binary if needed
- get rid of dict.itervalues in favor of dict.values
- get rid of cStringIO.StringIO in favor of io.BytesIO
Signed-off-by: Kyr Shatskyy <kyrylo.shatskyy@suse.com>
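The replacements listed above can be illustrated with a small standard-library-only sketch. It is not taken from the changed files; the variable names are made up, and plain `encode()`/`decode()` stand in for `six.ensure_binary` and `six.ensure_str` so the example has no third-party dependency.

```python
import io
import string

# string.uppercase is gone in Python 3; ascii_uppercase exists in both.
first_letters = string.ascii_uppercase[:3]

# dict.itervalues() is gone in Python 3; values() exists in both.
counts = {"a": 1, "b": 2}
total = sum(counts.values())

# cStringIO.StringIO is gone in Python 3; io.BytesIO holds bytes explicitly.
buf = io.BytesIO()
buf.write("héllo".encode("utf-8"))      # text -> bytes before writing
text = buf.getvalue().decode("utf-8")   # bytes -> text after reading

print(first_letters, total, text)
```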
* Improve @DashboardTestCase.RunAs decorator
* Use command line to create the test user in the 'test_pwd_expiration_date_update' test case.
* Test return codes of various REST API calls.
Fixes: https://tracker.ceph.com/issues/43869
Signed-off-by: Volker Theile <vtheile@suse.com>
otherwise bash will interpret "kind" as a file when handling a command like
```
sudo zgrep <kind> /var/log/ceph/valgrind/* /dev/null | sort | uniq
```
and try to feed its content to zgrep, and write the output of zgrep
to /var/log/ceph/valgrind/*. this is not the intended behavior. what we
want to do is to pass "<kind>" as an argument to zgrep, along with
the globbed file names which match "/var/log/ceph/valgrind/*".
in this change, "<kind>" is quoted as in the command line. it's also
what `pipes.quote()` does before the change of
35cf5131e7.
this addresses the regression introduced by
35cf5131e7.
Fixes: https://tracker.ceph.com/issues/44454
Signed-off-by: Kefu Chai <kchai@redhat.com>
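The quoting described above can be reproduced with `shlex.quote()`, the Python 3 home of what `pipes.quote()` used to provide. The helper `build_zgrep_command` is a hypothetical name for illustration and is not the function changed by this commit.

```python
import shlex


def build_zgrep_command(kind, log_glob="/var/log/ceph/valgrind/*"):
    """Build the shell command with the pattern quoted, so that '<' and
    '>' are not treated as redirections. The glob is left unquoted on
    purpose so the shell still expands it."""
    return "sudo zgrep {} {} /dev/null | sort | uniq".format(
        shlex.quote(kind), log_glob)


print(build_zgrep_command("<kind>"))
# prints: sudo zgrep '<kind>' /var/log/ceph/valgrind/* /dev/null | sort | uniq
```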
The `create_osds` call in orchestrator uses multiple named DriveGroups as
the parameter. Adapt the change in Dashboard.
Some minor polishes:
- Use task manager to wrap the operation.
- The submit button in Preview modal is changed from `Add` to `Create`.
- POST `/api/osd` to create OSDs:
  - Bare OSDs for OSD service container
    {
      "method": "bare",
      "data": {
        "uuid": "xxxx",
        "svc_id": 5
      }
    }
  - OSDs with devices (DriveGroups)
    {
      "method": "drive_groups",
      "data": {
        <drive group spec here>
      }
    }
- `/orchestrator/osd` endpoint is removed.
Fixes: https://tracker.ceph.com/issues/43615
Signed-off-by: Kiefer Chang <kiefer.chang@suse.com>
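The two request bodies for POST `/api/osd` described above could be assembled as in the sketch below. The helper names are hypothetical, and the drive group spec is passed through as an opaque dict because its schema is not shown in the message.

```python
import json


def bare_osd_request(uuid, svc_id):
    """Body for creating a bare OSD for an OSD service container."""
    return {"method": "bare", "data": {"uuid": uuid, "svc_id": svc_id}}


def drive_groups_request(drive_groups):
    """Body for creating OSDs with devices. drive_groups is whatever
    spec the caller supplies; its structure is not assumed here."""
    return {"method": "drive_groups", "data": drive_groups}


print(json.dumps(bare_osd_request("xxxx", 5), indent=2))
# An empty dict is used as a placeholder; a real spec goes here.
print(json.dumps(drive_groups_request({}), indent=2))
```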
The ps output names daemons like 'type.foo', e.g., 'mgr.x'. Now that
the test_orchestrator impl is less bonkers this needs to be adjusted to
match reality.
Signed-off-by: Sage Weil <sage@redhat.com>
A first step to do more automatic code checks on the qa/
directory. This is useful while transitioning to python3.
Also use log_exc at top-level to avoid running into:
error: Argument 1 to "log_exc" has incompatible type
"Callable[[OSDThrasher], Any]"; expected "OSDThrasher"
Signed-off-by: Thomas Bechtold <tbechtold@suse.com>
* refs/pull/33705/head:
qa/suites/upgrade/nautilus-x/parallel: restart mgr.x before mons
Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Use io.BytesIO instead of cStringIO.StringIO
Use six.ensure_str whenever it needs to convert binary to str.
Signed-off-by: Kyr Shatskyy <kyrylo.shatskyy@suse.com>
this reverts 9639acfefe, as the test does
make sense. what fails this test is that the machinery to marshal/unmarshal
exceptions fails to handle un-picklable exceptions. the previous commit
is supposed to use a fallback to handle them.
Signed-off-by: Kefu Chai <kchai@redhat.com>
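A fallback for un-picklable exceptions, as mentioned above, might look like the following sketch. It is an assumption about the general technique and not the code from the previous commit; `marshal_exception` and the `Unpicklable` class are hypothetical.

```python
import pickle


def marshal_exception(exc):
    """Pickle an exception, falling back to a plain RuntimeError that
    carries the type name and message when a round trip is not possible."""
    try:
        payload = pickle.dumps(exc)
        pickle.loads(payload)  # make sure it can be restored as well
        return payload
    except Exception:
        fallback = RuntimeError("{}: {}".format(type(exc).__name__, exc))
        return pickle.dumps(fallback)


class Unpicklable(Exception):
    """Example exception holding an attribute that pickle cannot handle."""

    def __init__(self, msg):
        super().__init__(msg)
        self.callback = lambda: None


restored = pickle.loads(marshal_exception(Unpicklable("bad")))
print(type(restored).__name__, restored)
```

The receiver loses the original exception type in the fallback case but still gets a readable message instead of a marshalling failure.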
* refs/pull/33636/head:
qa: add upgrade test for volume upgrade from legacy
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Ramana Raja <rraja@redhat.com>
This tests that volumes created using the ceph_volume_client.py library
continue to be accessible/function via the Nautilus/Octopus ceph-mgr
volumes plugin.
Fixes: https://tracker.ceph.com/issues/42723
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
* refs/pull/33506/head:
client: add client_fs mount option support
Reviewed-by: Ramana Raja <rraja@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
- Display services and daemons in the cluster/services page.
- Display daemons in the cluster/hosts/host-detail page (Daemons tab).
This PR also partially addresses https://tracker.ceph.com/issues/43165:
The endpoint `/api/orchestrator/service` is removed.
Create new endpoints:
- `/api/service`: listing all services in the Ceph cluster.
- `/api/service/<service_name>/daemons`: listing daemons for a
service. e.g. daemons of OSD.
- `/api/host/<hostname>/daemons`: listing daemons of a host.
Fixes: https://tracker.ceph.com/issues/44221
Signed-off-by: Kiefer Chang <kiefer.chang@suse.com>
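For reference, the new endpoint paths listed above can be built as in this sketch. The helper names are hypothetical and only the path templates come from the message; percent-encoding the variable segment is an assumption about what a client would want.

```python
from urllib.parse import quote

SERVICE_LIST_PATH = "/api/service"


def service_daemons_path(service_name):
    """Path listing the daemons of one service, e.g. the daemons of OSD."""
    return "/api/service/{}/daemons".format(quote(service_name, safe=""))


def host_daemons_path(hostname):
    """Path listing the daemons running on one host."""
    return "/api/host/{}/daemons".format(quote(hostname, safe=""))


print(service_daemons_path("osd"))
print(host_daemons_path("node1"))
```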