The tid returned by reads is ignored, and would make tracking writes
internally more difficult by using the same id-space as them. Make read
void and update all implementations.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
There's no reason to check the duration of a watch. The notify will
time out after 30s on the OSD, but there's no guarantee the client will
see that in any bounded time. This test is really meant as a stress
test of the OSDs anyway, not of the clients, so just remove asserts
about operation duration.
Fixes: #4591
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Sam Just <sam.just@inktank.com>
Second guessing the first sequence number from the FileStore
was silly and broke tests which had the temerity to start at
1 instead of 2...
Fixes: #4687
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
This reverts commit a309177466. That commit
includes calls that involve Mutexes, Lockers, and lockdep -- which isn't
yet set up, so things break horribly. A more subtle approach is required.
Signed-off-by: Greg Farnum <greg@inktank.com>
Allow argparse functions to fail if no argument is given by using
special versions that avoid the default CLI behavior of "cerr/exit".
Fixes: #4678
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
mon needs to call argparse for a couple of -- options, and the
argparse_witharg routines were attempting to cerr/exit on missing
arguments. This is appropriate for the CLI usage, but not the daemon
usage. Add a 'cli' flag that can be set false for the daemon usage
(and cause the parsing routine to return false instead of exit).
The daemon's parsing code is due for a rewrite soon.
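A minimal sketch of the 'cli' flag behavior described above. The helper
name parse_witharg and its signature are hypothetical and do not reproduce
the actual argparse_witharg routines; only the exit-versus-return-false
split is taken from the description.

```cpp
#include <cassert>
#include <cstdlib>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical helper (not the Ceph signature): if args[i] equals `name`,
// consume the following element as the option's value.
// Returns true and sets `out` on success. When the value is missing:
//   cli == true  -> print an error and exit (command-line tool behavior)
//   cli == false -> return false so the daemon can handle the failure
bool parse_witharg(const std::vector<std::string>& args, std::size_t& i,
                   const std::string& name, std::string& out, bool cli) {
  assert(i < args.size());
  if (args[i] != name)
    return false;  // not the option we are looking for
  if (i + 1 >= args.size()) {
    if (cli) {
      std::cerr << "Option " << name << " requires an argument." << std::endl;
      std::exit(1);
    }
    return false;  // daemon usage: report failure instead of exiting
  }
  out = args[i + 1];
  i += 2;  // skip the option and its value
  return true;
}
```

In this sketch a false return covers both "not this option" and "value
missing"; a caller that needs to tell them apart can check whether args[i]
matched the name.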
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
The MDRequest is destroyed once the client reply is sent, but
we need the reference to the LogSegment for updating the backtrace, so
store a temporary ref to the LogSegment for later.
Fixes: #4660
Signed-off-by: Sam Lang <sam.lang@inktank.com>
The _prefetch() function interprets temp_fetch_len as the amount of
data we need from read_pos, which is the beginning
of read_buf. So by setting it to the amount *more* we needed, we were
getting stuck forever if we actually hit this condition. Fix it by
setting temp_fetch_len based on the amount of data we need in aggregate.
Furthermore, we were previously rounding *down* the requested amount in
order to read only full log segments. Round up instead!
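The arithmetic can be sketched as follows. The function names and the
'period' parameter (the unit that reads are aligned to) are assumptions
for illustration, not the Journaler code itself.

```cpp
#include <cassert>
#include <cstdint>

// Round `len` down to a multiple of `period` (the old behavior, which
// could request less than what was actually needed).
uint64_t round_down_to(uint64_t len, uint64_t period) {
  assert(period > 0);
  return len - (len % period);
}

// Round `len` up to a multiple of `period` (the new behavior).
uint64_t round_up_to(uint64_t len, uint64_t period) {
  assert(period > 0);
  uint64_t rem = len % period;
  return rem == 0 ? len : len + (period - rem);
}

// Hypothetical helper: the fetch length is measured from read_pos, so it
// must cover what is already buffered plus what is still missing, not
// just the missing part.
uint64_t fetch_len_from_read_pos(uint64_t buffered, uint64_t more_needed,
                                 uint64_t period) {
  return round_up_to(buffered + more_needed, period);
}
```

With 6 bytes buffered, 3 more needed, and a period of 4, the aggregate is
9 bytes. Rounding down gives 8, which never satisfies the request, while
rounding up gives 12.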
Fixes: #4618
Signed-off-by: Greg Farnum <greg@inktank.com>
Currently we don't start logging on daemon startup unless the log_file
parameter was adjusted by ceph.conf. Instead, we should call all config
observers so that the logging subsystem is fully configured and we log
even prior to the daemonize and common_init_finish (when we call observers
again). This fixes logging for the initial period before we daemonize.
For some of the daemons (osd, mon), this includes significant work. It
also fixes the problem where users don't see the 'ceph version ...' banner
on daemon start.
Backport: bobtail
Signed-off-by: Sage Weil <sage@inktank.com>
Ensure that we push log data out before we restart logging. This may not
be strictly necessary, but it avoids a whole class of possible pitfalls.
Signed-off-by: Sage Weil <sage@inktank.com>
- fix seed
- the array indices are points in time; no need to subtract one from i!
- pick a random seed and print it to stdout
I ran this with several different seeds without failure, so I am confident
we are in good shape. And if we ever get a future failure, we'll have the
seed to reproduce.
Signed-off-by: Sage Weil <sage@inktank.com>
This lets us put a cap on outstanding client IOs. This is particularly
important for clients issuing lots of small IOs.
Fixes: #4579
Signed-off-by: Sage Weil <sage@inktank.com>
We already have a throttler that lets us limit the amount of memory
consumed by messages from a given source. Currently this is based only
on the size of the message payload. Add a second throttler that limits
the number of messages so that we can effectively throttle small requests
as well.
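A rough sketch of pairing a byte throttle with a message-count throttle.
SimpleThrottle and MessagePolicy are hypothetical names, and this version
fails fast rather than blocking, which is a simplification chosen for the
example rather than a description of the messenger.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical counter with an upper bound; no locking or blocking.
class SimpleThrottle {
 public:
  explicit SimpleThrottle(uint64_t max) : max_(max), cur_(0) {}

  // Take `n` units if they fit under the limit, else leave state untouched.
  bool get_or_fail(uint64_t n) {
    if (cur_ + n > max_)
      return false;
    cur_ += n;
    return true;
  }

  // Return `n` previously taken units.
  void put(uint64_t n) {
    assert(n <= cur_);
    cur_ -= n;
  }

  uint64_t current() const { return cur_; }

 private:
  uint64_t max_;
  uint64_t cur_;
};

// Admits a message only if both the byte budget and the message budget
// have room, so many tiny messages are limited as well as a few large ones.
class MessagePolicy {
 public:
  MessagePolicy(uint64_t max_bytes, uint64_t max_messages)
      : bytes(max_bytes), messages(max_messages) {}

  bool admit(uint64_t payload_len) {
    if (!messages.get_or_fail(1))
      return false;
    if (!bytes.get_or_fail(payload_len)) {
      messages.put(1);  // undo the count so a failed admit has no effect
      return false;
    }
    return true;
  }

  void release(uint64_t payload_len) {
    bytes.put(payload_len);
    messages.put(1);
  }

  SimpleThrottle bytes;
  SimpleThrottle messages;
};
```

With a 1024-byte budget and a 2-message budget, a third 1-byte message is
refused even though only 2 bytes are in flight, which is the case the
byte-only throttle could not catch.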
Signed-off-by: Sage Weil <sage@inktank.com>
If we write to an interval that didn't previously exist and then discard
it so that it again doesn't exist, all during the same interval, then we
should not include it in the 'written' set (or exists set, obviously).
Similarly, when we go to look at a merged diff, we can ignore extents
that were written (and possibly zeroed) if they neither existed before nor
after.
Bump up the iteration count to get more confidence that this is actually
correct.
Signed-off-by: Sage Weil <sage@inktank.com>
Fixes: #4600
Object marker should be treated as an object, so that name is formatted
correctly when getting the raw oid.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
FileStore::header_t::start_seq now encodes the op seq which may be
written at FileStore::header_t::start. This way, FileStore::open()
can pass a valid sequence number to read_entry for validation.
Otherwise, read_entry has no way of knowing whether a failure of a
read at header.start indicates that the journal was empty, or that
the entry is corrupt. With start_seq, read_entry can assume
corruption if start_seq <= committed_up_to.
Fixes: #4527
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Move 50-rbd.rules into the ceph base package since the related
ceph-rbdnamer binary is part of this package. Use correct install
pattern.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
Reorder file list of ceph package. Fix handling of placeholder
directories, make use of directory macros like %{_localstatedir}
for /var.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
Install ceph-* scripts directly to $(prefix)$(sbindir) (which
normally would be /usr/sbin) instead of moving them around after
installation in the SPEC file or debian files.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
Open target with O_CREAT|O_EXCL to ensure we don't overwrite some other
important file (like, say, /etc/passwd). This is irritating because there
is no C++ ofstream equivalent for O_EXCL; kludge around it using
ostringstream instead.
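The approach can be sketched like this, assuming a POSIX system. The
function names and the sample output format are hypothetical; only
open(2) with O_CREAT|O_EXCL and the ostringstream buffering come from
the description above.

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <sstream>
#include <string>
#include <unistd.h>

// Create `path` and write `data` to it. Fails with -EEXIST if the path
// already exists, so an existing file is never overwritten.
// Returns 0 on success or a negative errno value.
int write_new_file(const std::string& path, const std::string& data) {
  assert(!path.empty());
  int fd = ::open(path.c_str(), O_WRONLY | O_CREAT | O_EXCL, 0600);
  if (fd < 0)
    return -errno;
  const char* p = data.data();
  size_t left = data.size();
  while (left > 0) {
    ssize_t r = ::write(fd, p, left);
    if (r < 0) {
      if (errno == EINTR)
        continue;  // interrupted before writing anything; retry
      int err = -errno;
      ::close(fd);
      return err;
    }
    p += r;
    left -= static_cast<size_t>(r);
  }
  if (::close(fd) < 0)
    return -errno;
  return 0;
}

// Format the output in memory with ostringstream, then hand the finished
// string to the exclusive-create writer (hypothetical output format).
int write_report_exclusive(const std::string& path, int value) {
  std::ostringstream ss;
  ss << "value = " << value << "\n";
  return write_new_file(path, ss.str());
}
```

A second call with the same path returns -EEXIST instead of truncating
the file, which is the property ofstream alone cannot provide.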
Fixes: #3266
Signed-off-by: Sage Weil <sage@inktank.com>
A standby MDS can attempt the handle_mds_failure paths for itself, if
it sees the transition from up to down. This leads it to insert itself
into the resolve_gather set, which is bad. So check if the failed MDS
is the same as whoami, and abort if so. This fixes #4637.
Signed-off-by: Greg Farnum <greg@inktank.com>
Reviewed-by: Yan, Zheng <zheng.z.yan@intel.com>