BuildRequires: cryptopp-devel has been replaced by nss-devel. Skip
google-perftools-devel because that package is not available for x86-64.
Add python.
Don't install libcls_rbd.so.1.0.0.debug.
Package crbdnamer and librados-config.
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
mkostemps isn't present in older glibc versions, like the ones in CentOS
5.5. We don't really use any of the extra functionality of mkostemps in
this test.
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
Replay PGs already accept and queue transactions. PGs will now go to
active during replay in order to simplify the state reported to the user
and to allow recovery to begin.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
On S/390, the earlier rjhash<size_t> failed with
"no match for call to '(rjhash<long unsigned int>) (size_t&)'".
It seems the rjhash<size_t> logic was only enabled
on some architectures, and relied on some pretty deep
internals of the bit layout (__LP64__).
Use an explicitly 32-bit type as early as possible, and
convert back to size_t only when really needed. This
should work, and simplifies the code. In theory, we might
have a narrower output (size_t might be 64-bit, max value
we now output is 32-bit), but this doesn't matter as this
is only ever used for picking a slot in an in-memory hash
table (hash(key) modulo num_of_buckets), and there won't be >4G
buckets.
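A minimal sketch of the idea, in Python for brevity: narrow the key to 32 bits first, mix it, and only then pick a bucket. The names (mix32, hash_size_t, pick_bucket), the mixing constants, and the way the upper half is folded in are illustrative assumptions, not taken from this change.

```python
MASK32 = 0xFFFFFFFF  # keep every intermediate value within 32 bits


def mix32(a: int) -> int:
    """A Jenkins-style 32-bit integer mix (constants are illustrative)."""
    a &= MASK32
    a = ((a + 0x7ED55D16) + (a << 12)) & MASK32
    a = ((a ^ 0xC761C23C) ^ (a >> 19)) & MASK32
    a = ((a + 0x165667B1) + (a << 5)) & MASK32
    a = ((a + 0xD3A2646C) ^ (a << 9)) & MASK32
    a = ((a + 0xFD7046C5) + (a << 3)) & MASK32
    a = ((a ^ 0xB55A4F09) ^ (a >> 16)) & MASK32
    return a


def hash_size_t(key: int) -> int:
    """Narrow a possibly 64-bit key to 32 bits as early as possible, then mix.

    Folding the upper half into the lower half is an assumption made here;
    plain truncation would also satisfy the description above.
    """
    folded = (key ^ (key >> 32)) & MASK32
    return mix32(folded)


def pick_bucket(key: int, num_buckets: int) -> int:
    """Slot selection: hash(key) modulo the number of buckets."""
    return hash_size_t(key) % num_buckets
```

The result never exceeds 32 bits, which is enough as long as the table has fewer than 4G buckets.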
Closes: #837
Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
There will be problems if two messengers use the same entity_addr_t because
they are on the same ip and choose the same nonce (e.g., because they are
in the same process). Let the caller sort this out in whatever way it
finds most appropriate.
For libceph, librados, and csyn, add N million to the pid.
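One way a caller might derive distinct nonces for several messengers in the same process is sketched below. make_nonce and NONCE_STRIDE are hypothetical names, and the sketch assumes pids stay below the stride.

```python
import os
from typing import Optional

NONCE_STRIDE = 1_000_000  # hypothetical: one "million" per instance in a process


def make_nonce(instance: int, pid: Optional[int] = None) -> int:
    """Return pid + instance * 1,000,000 so instances in one process differ."""
    if instance < 0:
        raise ValueError("instance must be non-negative")
    if pid is None:
        pid = os.getpid()
    return pid + instance * NONCE_STRIDE
```

Two messengers created with instance 0 and instance 1 in the same process then get different nonces, and therefore different entity_addr_t values, even though they share an ip.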
Fixes: #877
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Turn off the logging and symlink rotation, not just symlink rotation.
This is a somewhat arbitrary distinction (log per instance only for
daemons), but it's only used by vstart and only really useful for
development/debugging, so who cares.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Fixes the warning
src/Makefile.am:299: variable `libradosgw_a_LDFLAGS' is defined but no program or
src/Makefile.am:299: library has `libradosgw_a' as canonic name (possible typo)
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
During rejoin we may find that different MDSs have different fragmentation
for directories. When that happens we should refragment as needed on the
replicas to match what's on the primary.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Strip leading . only, to tolerate osd0 and osd.0.
This also turns osd.....foo -> osd.foo, but that's better than
osd.foo.bar -> osd.foobar.
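The behavior described above can be sketched as follows. canonical_name is a hypothetical helper, not the function touched by this change.

```python
def canonical_name(name: str, type_prefix: str) -> str:
    """Normalize 'osd0' and 'osd.0' to 'osd.0' by stripping leading dots only."""
    if not name.startswith(type_prefix):
        raise ValueError(f"{name!r} does not start with {type_prefix!r}")
    # Strip only the dots that directly follow the type; later dots are kept.
    ident = name[len(type_prefix):].lstrip(".")
    return f"{type_prefix}.{ident}"
```

Removing every dot instead would merge osd.foo.bar into osd.foobar, which is the case this approach avoids.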
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
The PG info.history.last_epoch_started is important because it bounds how
far back in time we think we need to look in order to fully recover the
contents of the PG. That's because every replica commits the PG peering
result (the info and pg log) when it activates.
In order for this to work properly, we can only advance last_epoch_started
_after_ the peer results are stable on disk on all replicas. Otherwise a
poorly timed failure (or set of failures) could lose the PG peer results
and we wouldn't go back far enough in time to find them.
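The ordering rule can be illustrated with a small model. PGHistory and its methods are hypothetical and only mirror the rule stated above: last_epoch_started moves forward once every replica has reported the peering result as committed.

```python
from typing import Iterable, Optional, Set


class PGHistory:
    """Tracks last_epoch_started for one PG (illustrative model only)."""

    def __init__(self, last_epoch_started: int) -> None:
        self.last_epoch_started = last_epoch_started
        self._activating_epoch: Optional[int] = None
        self._waiting_on: Set[str] = set()

    def begin_activation(self, epoch: int, replicas: Iterable[str]) -> None:
        """Peering finished at `epoch`; replicas still have to commit the result."""
        self._activating_epoch = epoch
        self._waiting_on = set(replicas)
        self._maybe_advance()

    def replica_committed(self, replica: str) -> None:
        """A replica reports that the info and pg log are stable on its disk."""
        self._waiting_on.discard(replica)
        self._maybe_advance()

    def _maybe_advance(self) -> None:
        # Advance only when nothing is outstanding; until then a failure could
        # still lose the peering result, so the older bound must stay in place.
        if self._activating_epoch is not None and not self._waiting_on:
            self.last_epoch_started = self._activating_epoch
            self._activating_epoch = None
```

With two replicas, the value stays at the old epoch after the first commit and advances only after the second.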
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
-f now just means stay in the foreground.
-d now means stay in the foreground and log to foreground.
Both options now disable pid-files.
Update man pages.
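The resulting option semantics can be summarized in a short sketch. daemon_settings and its keys are hypothetical, and reading "log to foreground" as logging to stderr is an assumption.

```python
from typing import Dict, Sequence


def daemon_settings(argv: Sequence[str]) -> Dict[str, bool]:
    """Map -f / -d to the behavior described above (illustrative only)."""
    foreground = "-f" in argv or "-d" in argv
    return {
        "daemonize": not foreground,       # both options stay in the foreground
        "log_to_stderr": "-d" in argv,     # assumption: "log to foreground"
        "write_pid_file": not foreground,  # both options disable pid-files
    }
```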
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
The underlying FS (btrfs at least) will block writes for a period while it
is doing a commit. If an OSD workload is write limited, we should raise
the op_queue max (operations that are queued to be applied to disk) during
the commit period.
For example, for a normally journal throughput limited (writeahead mode)
workload:
- journal queue throttle normally limits things.
- sync starts
- journaled items moving to op_queue soon fill it up to op_queue max
- all writes stop
- sync completes
- op_queue drains, new writes come in again
- journal queue throttle fills up, again starts limiting tput
For an fs throughput limited workload (writeahead):
- kernel buffer cache hits dirty limit
- op_queue throttle limits tput
- sync starts
- opq stalls, new writes stall on throttler
- sync completes
- opq drains (quickly: kernel has no dirty pages)
- new writes flood in
- etc.
(Actually this isn't super realistic, because hitting the kernel dirty
limit will do all sorts of other weird things with userland memory
allocations.)
In both cases, the commit phase blocks up the op queue, and raising the
limit temporarily will keep things flowing. This should be ok because the
disks are still busy during this period; they're just flushing dirty
data and metadata. Once the sync completes the opq will quickly dump dirty
data into the kernel page cache and "catch up".
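A minimal model of the mechanism, assuming a simple counting throttle: the limit is raised while a commit is in progress and drops back afterwards. OpQueueThrottle, its method names, and the numbers used with it are hypothetical.

```python
class OpQueueThrottle:
    """Counting throttle whose limit is higher while the FS is committing."""

    def __init__(self, max_ops: int, committing_max_ops: int) -> None:
        if committing_max_ops < max_ops:
            raise ValueError("the commit-time limit should not be lower")
        self.max_ops = max_ops
        self.committing_max_ops = committing_max_ops
        self.queued = 0
        self.committing = False

    def limit(self) -> int:
        """Current ceiling on queued operations."""
        return self.committing_max_ops if self.committing else self.max_ops

    def try_queue(self) -> bool:
        """Admit one op if there is room under the current limit."""
        if self.queued >= self.limit():
            return False
        self.queued += 1
        return True

    def op_applied(self) -> None:
        """An op left the queue (it was applied to disk)."""
        if self.queued > 0:
            self.queued -= 1

    def commit_start(self) -> None:
        self.committing = True

    def commit_finish(self) -> None:
        # The limit drops back; ops already queued above it simply drain.
        self.committing = False
```

While a commit is running, new writes keep being accepted up to the higher limit instead of stalling. After the commit, nothing new is admitted until the queue has drained below the normal limit.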
Signed-off-by: Sage Weil <sage@newdream.net>
In rados_conf_read_file, read from the default configuration file
locations if the library user passes NULL as the location of the
configuration file.
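The fallback can be sketched as below, with Python's None standing in for NULL. read_conf_file and the DEFAULT_CONF_PATHS list are hypothetical; this message does not name the actual default locations.

```python
import os
from typing import Iterable, Optional

# Hypothetical default search order, used only for illustration.
DEFAULT_CONF_PATHS = (
    "/etc/ceph/ceph.conf",
    os.path.expanduser("~/.ceph/config"),
    "ceph.conf",
)


def read_conf_file(path: Optional[str],
                   defaults: Iterable[str] = DEFAULT_CONF_PATHS) -> str:
    """Read `path`, or the first readable default location when path is None."""
    candidates = [path] if path is not None else list(defaults)
    for candidate in candidates:
        try:
            with open(candidate, "r", encoding="utf-8") as f:
                return f.read()
        except OSError:
            continue  # try the next candidate, if any
    raise FileNotFoundError("no readable configuration file among: "
                            + ", ".join(candidates))
```

An explicit path is never silently replaced by a default; the search applies only when the caller passes None.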
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
The read_iterate can cover > addressable memory on 32-bit archs.
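A small worked example of why the range needs a 64-bit length: a 5 GiB range silently becomes 1 GiB when its length is squeezed into 32 bits, while iterating in bounded chunks with a wide length covers the whole range. The helper names are hypothetical and Python integers stand in for the C types.

```python
from typing import Iterator, Tuple

GIB = 2**30


def truncate_to_32bit(n: int) -> int:
    """What happens to a length stored in a 32-bit size type."""
    return n & 0xFFFFFFFF


def iterate_extents(offset: int, length: int, chunk: int) -> Iterator[Tuple[int, int]]:
    """Yield (offset, size) pieces covering [offset, offset + length)."""
    end = offset + length
    while offset < end:
        size = min(chunk, end - offset)
        yield offset, size
        offset += size
```

Each piece stays small enough to address, while the offset and total length are free to exceed 4 GiB.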
Reported-by: Jeff Wu <cpwu@tnsoft.com.cn>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
bufferlist->claim already clears the source bufferlist,
but setting it to NULL prevented it from being destroyed.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Since cfuse usually runs as a nonprivileged user, its defaults must be a
little different from those of the other daemons. Add a flag to
common_init which can be used to set unprivileged daemon defaults.
SimpleMessenger::start() now just takes a boolean telling it whether to
daemonize. It doesn't need to check global variables or other arguments;
it just daemonizes if you tell it to; otherwise not.
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>