In the 0.82 release, standbyreplay MDS daemons would try
to reformat the journal if they saw an older version on
disk, where this should have only been done by the active
MDS for the rank. Depending on timing, this could cause
fatal corruption of the journal.
This change handles the following cases:
* only reformat if not in standbyreplay (else raise EAGAIN
to keep trying until an active MDS reformats it)
* if journal header goes away while in standbyreplay then raise
EAGAIN (handle rewrite happening in background)
* if journal version is greater than the max supported, suicide
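The cases above can be sketched as a small decision function. This is an illustrative Python model, not the MDS code: `journal_action`, `MAX_SUPPORTED_FORMAT` and the returned strings are hypothetical names, and the result for an active MDS that finds no header is an assumption, since that case is not covered above.

```python
from typing import Optional

# Hypothetical value: the newest journal format this daemon understands.
MAX_SUPPORTED_FORMAT = 1


def journal_action(on_disk_format: Optional[int], standby_replay: bool) -> str:
    """Decide what to do with the journal header found on disk.

    on_disk_format is None when the header could not be read.
    """
    if on_disk_format is None:
        if standby_replay:
            # The active MDS may be rewriting the journal in the background.
            return "EAGAIN"
        # Assumption: this case is not described in the commit message.
        return "error"
    if on_disk_format > MAX_SUPPORTED_FORMAT:
        # Newer than anything we understand: give up rather than guess.
        return "suicide"
    if on_disk_format < MAX_SUPPORTED_FORMAT:
        # Only the active MDS for the rank may reformat; standbyreplay retries.
        return "EAGAIN" if standby_replay else "reformat"
    return "ok"
```

The point of the sketch is that no path reaches "reformat" while `standby_replay` is true.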
Fixes: #8811
Signed-off-by: John Spray <john.spray@redhat.com>
Previously if the journal header contained invalid
write, expire or trimmed offsets, we would end up
hitting a hard-to-understand assertion much later.
Instead, raise the error right away if the fields
are identifiably bad at load time, and assert that
they're valid before persisting them.
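A minimal Python sketch of the idea, for illustration only. `JournalHeader`, `load_header` and `persist_header` are hypothetical names, and the invariant used here (trimmed <= expire <= write, none negative) is an assumption about what "identifiably bad" means.

```python
from dataclasses import dataclass


@dataclass
class JournalHeader:
    # Hypothetical stand-in for the on-disk header fields named above.
    trimmed_pos: int
    expire_pos: int
    write_pos: int

    def is_valid(self) -> bool:
        # Assumed invariant: 0 <= trimmed <= expire <= write.
        return 0 <= self.trimmed_pos <= self.expire_pos <= self.write_pos


def load_header(header: JournalHeader) -> JournalHeader:
    """Reject an identifiably bad header at load time."""
    if not header.is_valid():
        raise ValueError(f"invalid journal header offsets: {header}")
    return header


def persist_header(header: JournalHeader, store: list) -> None:
    """Assert validity before writing; `store` is a hypothetical sink."""
    assert header.is_valid(), "refusing to persist an invalid header"
    store.append(header)
```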
Signed-off-by: John Spray <john.spray@redhat.com>
Previously this test assumed no pre-existing
filesystem and no MDS running. Generalize it
to nuke any existing filesystems found before
running, so that you can use it inside a vstart
cluster that had MDS>0.
Signed-off-by: John Spray <john.spray@redhat.com>
So that new MDSs in a new filesystem are guaranteed
to be up to date with anything we blacklisted
in a previous filesystem.
Signed-off-by: John Spray <john.spray@redhat.com>
Detect leveldb, but do not let autoconf blindly link it with everything on the
planet.
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Signed-off-by: Sage Weil <sage@redhat.com>
In commit 7411c3c6a4 we generalized this
enumeration code by copying what was in the upstart scripts. However,
while the mon and mds directories get a 'done' file, the OSDs get a 'ready'
file. Bah! Trigger off of either one.
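A rough illustration of "trigger off of either one", written in Python rather than the init script's own language. `initialized_daemons` is a hypothetical helper, and the directory layout (one subdirectory per daemon under a per-type directory) is an assumption.

```python
import os
from typing import List

# Marker file names taken from the description above.
MARKERS = ("done", "ready")


def initialized_daemons(type_dir: str) -> List[str]:
    """Return the subdirectories of type_dir that contain either marker."""
    found = []
    for name in sorted(os.listdir(type_dir)):
        path = os.path.join(type_dir, name)
        if not os.path.isdir(path):
            continue
        # Accept a daemon directory if any one of the markers exists.
        if any(os.path.exists(os.path.join(path, m)) for m in MARKERS):
            found.append(name)
    return found
```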
Backport: firefly
Signed-off-by: Sage Weil <sage@redhat.com>
Enable us to obtain the erasure-code-profile for a given erasure-pool.
Signed-off-by: Ma Jianpeng <jianpeng.ma@intel.com>
Signed-off-by: Sage Weil <sage@inktank.com>
We need to return success if we get a dup command. Simply check whether
the fs is already enabled with the same pools and name.
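The check can be pictured with a small Python model. This is a sketch, not the monitor code: `fs_new` and the dictionary of filesystems are hypothetical, and returning -EINVAL for a name clash with different pools is an assumption.

```python
import errno
from typing import Dict, Tuple

# Hypothetical in-memory model: fs name -> (metadata pool, data pool).
Filesystems = Dict[str, Tuple[str, str]]


def fs_new(existing: Filesystems, name: str,
           metadata_pool: str, data_pool: str) -> int:
    """Create a filesystem, treating an identical repeat as success."""
    if name in existing:
        if existing[name] == (metadata_pool, data_pool):
            # Duplicate command: already enabled with the same name and pools.
            return 0
        # Assumption: a conflicting request is rejected with EINVAL.
        return -errno.EINVAL
    existing[name] = (metadata_pool, data_pool)
    return 0
```

Sending the same command twice then returns 0 both times, which is what a client retrying after a lost reply needs.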
Fixes: #8857
Signed-off-by: Sage Weil <sage@redhat.com>
Fixes: #8846
Backport: firefly, dumpling
This was broken at ea68b93723. We ended
up calling wait_pending_front() when the pending list was empty.
This commit also moves the need_to_wait check to a different place,
where we actually throttle (and not just drain completed IOs).
Reported-by: Sylvain Munaut <s.munaut@whatever-company.com>
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
(cherry picked from commit f9f2417d7d)
Treat rbd_default_{format,order,stripe_unit,stripe_count} as defaults for
the usual arguments for specifying those properties.
librbd::create() is affected by rbd_default_format, so we need to
explicitly override it if --image-format is set. The rest of the
parameters are passed explicitly when they are used, so their rbd_default
equivalents don't matter.
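The precedence rule can be sketched in Python as follows. `resolve_image_options` is a hypothetical helper and the dictionaries stand in for parsed command-line arguments and configuration; only the four option names come from the text above.

```python
from typing import Dict, Optional

OPTION_NAMES = ("format", "order", "stripe_unit", "stripe_count")


def resolve_image_options(cli: Dict[str, Optional[int]],
                          config: Dict[str, int]) -> Dict[str, int]:
    """Use the explicit argument when given, else the rbd_default_* value."""
    resolved = {}
    for key in OPTION_NAMES:
        value = cli.get(key)
        if value is None:
            value = config["rbd_default_" + key]
        resolved[key] = value
    return resolved
```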
Fixes: #8821
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
This way the default striping style of splitting into
object-sized chunks still works with non-default orders
specified.
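As a worked example in Python, assuming the object size is 2^order bytes and that the default striping style means a stripe unit equal to the object size with a stripe count of 1 (`default_striping` is a hypothetical helper):

```python
from typing import Tuple


def default_striping(order: int) -> Tuple[int, int]:
    """Return (stripe_unit, stripe_count) for object-sized chunks."""
    object_size = 1 << order  # 2**order bytes per object
    return object_size, 1
```

With order 22 this gives a 4 MiB stripe unit, and with order 23 an 8 MiB one, so the stripe unit follows the order instead of staying fixed.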
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Fixes: #8846
Backport: firefly, dumpling
This was broken at ea68b93723. We ended
up calling wait_pending_front() when the pending list was empty.
This commit also moves the need_to_wait check to a different place,
where we actually throttle (and not just drain completed IOs).
Reported-by: Sylvain Munaut <s.munaut@whatever-company.com>
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
We want to make sure the daemon runs in its own systemd environment. Check
for systemd as pid 1 and, when present, use systemd-run -r <cmd> to do
this.
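A minimal sketch of the check in Python, not the init script itself. Reading /proc/1/comm to identify pid 1 is an assumption about how the detection is done, and `pid1_is_systemd` and `wrap_command` are hypothetical names.

```python
from typing import List


def pid1_is_systemd(comm_path: str = "/proc/1/comm") -> bool:
    """Return True if the process name of pid 1 is systemd."""
    try:
        with open(comm_path) as f:
            return f.read().strip() == "systemd"
    except OSError:
        # No /proc or no permission: assume systemd is not in charge.
        return False


def wrap_command(cmd: List[str], under_systemd: bool) -> List[str]:
    """Prefix the daemon command with systemd-run -r when appropriate."""
    if under_systemd:
        return ["systemd-run", "-r"] + list(cmd)
    return list(cmd)
```

Usage would look like `wrap_command(daemon_cmd, pid1_is_systemd())`, with the result passed to whatever launches the daemon.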
Probably fixes: #7627
Signed-off-by: Sage Weil <sage@redhat.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
Tested-by: Dan Mick <dan.mick@inktank.com>
This fixes the size of some integers that are visible in the network
protocol. There should be no change for machines where sizeof(int) ==
4.
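The same concern can be shown with Python's `struct` module, as an analogy rather than the actual change: standard-size format codes give a wire layout that does not depend on the host, while native codes follow the local C compiler. The two-field message here is hypothetical.

```python
import struct
from typing import Tuple

# "<" selects little-endian standard sizes with no padding:
# "i" is always 4 bytes and "q" is always 8 bytes.
WIRE_FORMAT = struct.Struct("<iq")

# "@" selects native sizes, so this value follows the platform's C int.
NATIVE_INT_SIZE = struct.calcsize("@i")


def encode(flags: int, offset: int) -> bytes:
    """Pack a hypothetical (flags, offset) message with fixed widths."""
    return WIRE_FORMAT.pack(flags, offset)


def decode(data: bytes) -> Tuple[int, int]:
    """Unpack a message produced by encode()."""
    return WIRE_FORMAT.unpack(data)
```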
Signed-off-by: Kevin Cox <kevincox@kevincox.ca>
This clarifies how to deal with layouts in CephFS
using vxattrs. We can point people here if they
ask what they should use instead of the deprecated
`cephfs set_layout`.
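For illustration, a short Python sketch of setting layout fields through vxattrs. `layout_xattrs` and `apply_layout` are hypothetical helpers, the pool name in the usage line is made up, and `os.setxattr` is Linux-only; the call only has an effect on a path inside a mounted CephFS.

```python
import os
from typing import Dict, List, Tuple


def layout_xattrs(is_dir: bool, fields: Dict[str, object]) -> List[Tuple[str, bytes]]:
    """Build (name, value) pairs such as ceph.dir.layout.pool."""
    prefix = "ceph.dir.layout." if is_dir else "ceph.file.layout."
    return [(prefix + key, str(value).encode())
            for key, value in sorted(fields.items())]


def apply_layout(path: str, is_dir: bool, fields: Dict[str, object]) -> None:
    """Set each layout field as an extended attribute on path."""
    for name, value in layout_xattrs(is_dir, fields):
        os.setxattr(path, name, value)
```

For example, `apply_layout("/mnt/cephfs/mydir", True, {"pool": "data2"})` would set `ceph.dir.layout.pool` on that directory.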
Signed-off-by: John Spray <john.spray@redhat.com>
A sample command to run the test on Hadoop 2.x is:
TESTDIR=/home/test HADOOP_HOME=/usr/lib/hadoop HADOOP_MR_HOME=/usr/lib/hadoop-mapreduce sh workunits/hadoop-wordcount/test.sh starting hadoop-wordcount test
Signed-off-by: rootfs <hchen@redhat.com>
`cephfs set_layout` was broken and is now deprecated
in favour of using xattrs for layout. Retire the
kclient-specific test.
Fixes: #8773
Signed-off-by: John Spray <john.spray@redhat.com>