To avoid a possible deadlock. Quoting the documentation of Popen.wait():
> This will deadlock when using stdout=PIPE and/or stderr=PIPE and the
> child process generates enough output to a pipe such that it blocks
> waiting for the OS pipe buffer to accept more data. Use communicate() to
> avoid that.
Also print the command's stdout and stderr using LOG.warn() if the
command fails.
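
A minimal sketch of the pattern described above, not the actual ceph-disk
code. The helper name run_command and the LOG logger are assumptions made
for illustration; LOG.warning() is used here because it is the
non-deprecated spelling of LOG.warn() in current Python.

```python
import logging
import subprocess

# Hypothetical module-level logger, standing in for the LOG object
# mentioned in the commit message.
LOG = logging.getLogger(__name__)


def run_command(args):
    """Run args and drain both pipes with communicate() instead of wait()."""
    proc = subprocess.Popen(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    # communicate() reads stdout and stderr until EOF and then waits for
    # the child, so the child cannot block on a full OS pipe buffer.
    out, err = proc.communicate()
    if proc.returncode != 0:
        # Only log the captured output when the command fails.
        LOG.warning("command %s failed with status %d", args, proc.returncode)
        LOG.warning("stdout: %s", out)
        LOG.warning("stderr: %s", err)
    return proc.returncode, out, err
```

With wait() in place of communicate(), a child that writes more than the
pipe buffer holds could block forever, since nothing would be reading
from the pipe.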
Signed-off-by: Kefu Chai <kchai@redhat.com>
The former semantics of ceph-disk destroy are now implemented with the
--purge flag. Use that for the ceph-disk suite.
Signed-off-by: Loic Dachary <loic@dachary.org>
When using --dmcrypt, the lockbox stores the OSD id and the cephx secret
that will be used when activating the OSD.
When activating, the OSD id is copied from the lockbox if available,
otherwise it is obtained from osd new.
Add support for re-using an OSD id via the --osd-id option to prepare.
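
The id selection at activation time can be sketched roughly as follows.
This is an illustration only: choose_osd_id and allocate_new_id are
hypothetical names, and allocate_new_id merely stands in for whatever
issues osd new.

```python
def choose_osd_id(lockbox_osd_id, allocate_new_id):
    """Return the OSD id to use when activating.

    lockbox_osd_id: id read from the lockbox, or None if not available.
    allocate_new_id: hypothetical callable standing in for `osd new`.
    """
    # Compare against None explicitly: 0 is a valid OSD id.
    if lockbox_osd_id is not None:
        return lockbox_osd_id
    # No id in the lockbox, so obtain one from the monitor.
    return allocate_new_id()
```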
Signed-off-by: Loic Dachary <loic@dachary.org>
- simplifies interaction with monitor, makes it atomic
- marks the OSD as DESTROYED so that the id may be potentially reused
- adds a --purge option to also remove the OSD from CRUSH and deallocate the id
Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Loic Dachary <loic@dachary.org>
Add a set of new tests for the case when public_addr and public_bind_addr
are different for a mon. In order to test this properly I had to employ
port forwarding with socat. This helps simulate what would happen in an
environment like Kubernetes. socat is now a build dependency.
Also, moved jq_success to ceph-helpers.sh and refactored run_mon to enable
creating the mons without creating the rbd pool immediately.
Signed-off-by: Bassam Tabbara <bassam.tabbara@quantum.com>
To support running in dynamic environments (like Kubernetes), the mon needs
to be able to advertise an IP address that is different from the IP address
that it listens on locally.
Added a new config option, "public_bind_addr", which, if set, becomes the
address that the mon will bind to locally. If empty (the default), the
public_addr will be used to bind locally.
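
The fallback rule can be sketched as below. This is not the ceph-mon
implementation; resolve_mon_addrs is a hypothetical helper and the
addresses in the usage lines are made-up examples.

```python
def resolve_mon_addrs(public_addr, public_bind_addr=""):
    """Return (bind_addr, advertised_addr) for a mon.

    An empty public_bind_addr (the default) means the mon binds to
    public_addr; the advertised address is public_addr in both cases.
    """
    bind_addr = public_bind_addr if public_bind_addr else public_addr
    return bind_addr, public_addr


# Default: bind and advertise the same address.
print(resolve_mon_addrs("10.0.0.5:6789"))
# Kubernetes-like case: bind locally, advertise a different address.
print(resolve_mon_addrs("10.0.0.5:6789", "172.17.0.2:6789"))
```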
Added a new function, set_addr, on Messenger, which is called by ceph-mon to
set the advertised address after doing the bind.
Also relaxed the "wrong node!" errors in AsyncMessenger and SimpleMessenger, as
it's now valid to talk to a peer whose peer_addr_of_me is different from what
we expect.
Signed-off-by: Bassam Tabbara <bassam.tabbara@quantum.com>
Valgrind runs itself on forked children, and does its cleanup when they
complete, and this is slow... slow enough that it frequently makes the
test time out.
Valgrind lets you ignore child *processes* that you exec, but I can't
find a way to skip forked children in the same address space.
Work around this by skipping this validation when running under valgrind.
Fixes: http://tracker.ceph.com/issues/20602
Signed-off-by: Sage Weil <sage@redhat.com>
This option allows us to disable the crush smoke test when creating pools,
injecting crush maps, or making other changes. DANGER DANGER.
Signed-off-by: Sage Weil <sage@redhat.com>
If we start_boot and see that we don't have luminous mons, we will stop.
But we don't currently reliably notice when the luminous upgrade completes.
If we happen to be connected to the last mon we will start_boot() because
of the trigger in ms_handle_connect(), but if we are not connected to the
last mon we'll eventually get a monmap update but not restart booting.
Fix by setting a flag if we are waiting, and restart boot if the flag is
set, we are in preboot, and we see we now have luminous mons.
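
A simplified model of the fix, written in Python for illustration only.
The class, attribute, and method names below are hypothetical and do not
mirror the real code; the sketch also assumes that start_boot leaves us
in preboot.

```python
class BootState:
    """Toy model of the waiting flag described above."""

    def __init__(self):
        self.in_preboot = False
        self.waiting_for_luminous_mons = False
        self.boot_attempts = 0

    def start_boot(self, have_luminous_mons):
        """Try to boot; stop and remember we are waiting if mons are old."""
        self.boot_attempts += 1
        self.in_preboot = True
        if not have_luminous_mons:
            # Record that we stopped, so a later monmap can restart boot.
            self.waiting_for_luminous_mons = True
            return False
        self.waiting_for_luminous_mons = False
        return True

    def handle_monmap_update(self, have_luminous_mons):
        """Restart boot only if flagged, in preboot, and mons are luminous."""
        if (self.waiting_for_luminous_mons
                and self.in_preboot
                and have_luminous_mons):
            return self.start_boot(have_luminous_mons)
        return False
```

Without the flag, a monmap update arriving on a connection to any mon
other than the last one would not trigger another boot attempt.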
Fixes: http://tracker.ceph.com/issues/20631
Signed-off-by: Sage Weil <sage@redhat.com>
This takes account of the new health format, and also
expands and visually cleans up the front page
where we put the health information.
Dark backgrounds make it much easier to use
red/amber/green colours to grab attention.
Signed-off-by: John Spray <john.spray@redhat.com>