Handle extended attributes that contain NUL bytes correctly, rather
than treating everything as zero-terminated C strings.
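
A minimal sketch of the difference, not the actual patch. Both helper
names are hypothetical. The first copies the value by explicit length,
so embedded NUL bytes survive. The second shows the old pattern, which
stops at the first NUL.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: copy an attribute value by explicit length.
// Embedded NUL bytes are preserved because no terminator is searched for.
std::string xattr_value_from_buffer(const char *buf, std::size_t len) {
  assert(buf != nullptr || len == 0);
  return len ? std::string(buf, len) : std::string();
}

// The problematic pattern, for contrast: the buffer is treated as a
// zero-terminated C string, so the value is cut at the first NUL byte.
std::string xattr_value_as_cstring(const char *buf) {
  return std::string(buf);
}
```

With the five-byte value {'a', 'b', 0, 'c', 'd'}, the first helper
returns all five bytes and the second returns only "ab".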
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
Previously, _activate_committed would access the osdmap epoch racing
with handle_osd_map's osdmap update. This would allow a message to be
sent from a replica to the primary tagged with the same epoch as
last_warm_restart, though the event actually occurred before
last_warm_restart. Thus the primary would fail to ignore the event and
transition to crashed.
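
One way to picture the fix, as a sketch under assumptions rather than
the code in this commit: the sender reads the epoch under the same lock
that guards the map update, so a message can never carry an epoch newer
than the state it describes. MapState, Event, make_event and
primary_should_ignore are hypothetical names, and the "older than
last_warm_restart" comparison is an assumption drawn from the
description above.

```cpp
#include <mutex>

using epoch_t = unsigned;

// Hypothetical stand-in for the OSD's view of the osdmap epoch.
struct MapState {
  std::mutex lock;
  epoch_t epoch = 0;

  // Map update path: only moves the epoch forward, under the lock.
  void advance(epoch_t e) {
    std::lock_guard<std::mutex> l(lock);
    if (e > epoch)
      epoch = e;
  }

  // Sender path: read the epoch under the same lock as the update.
  epoch_t snapshot() {
    std::lock_guard<std::mutex> l(lock);
    return epoch;
  }
};

struct Event {
  epoch_t sent_epoch;
};

// Tag the event with the epoch that was current when it happened.
Event make_event(MapState &s) {
  return Event{s.snapshot()};
}

// Assumed primary-side rule: events older than the restart are ignored.
bool primary_should_ignore(const Event &ev, epoch_t last_warm_restart) {
  return ev.sent_epoch < last_warm_restart;
}
```

An event created at epoch 5 stays tagged with 5 even if the map moves
to 6 afterwards, so a primary with last_warm_restart == 6 ignores it.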
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
There was an old change in file_eval() that was allowing us to switch from
SYNC to MIX or EXCL while there were rdlocks, which either caused lots of
lock thrashing or could (I think) hang things up completely. This was
from ea10a672, an ancient fix for something related that appears to have
taken out the rdlocked check by accident.
In my tests (one writer, one stat-er), this took things from long stalls
(up to 20 seconds) to very responsive stats. Yay!
Fixes: #791
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
If the user didn't specify any actions, print out a usage message rather
than silently exiting.
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
If "rados put" uses write instead of write_full, the resulting object on
the server may be a mishmash of old and new objects, if the old object
was longer than the new one. This is fairly counterintuitive behavior
for radostool, so remove it.
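
The effect can be modelled on plain strings. This is an illustrative
sketch, not librados code: apply_write and apply_write_full are
hypothetical helpers that mimic a write at offset 0 and a full-object
replacement.

```cpp
#include <string>

// Model of a write at offset 0: the new data overwrites the start of
// the object, and any old bytes beyond the new data are left in place.
std::string apply_write(std::string object, const std::string &data) {
  if (data.size() > object.size())
    object.resize(data.size());
  object.replace(0, data.size(), data);
  return object;
}

// Model of write_full: the object becomes exactly the new data.
std::string apply_write_full(const std::string & /*old_object*/,
                             const std::string &data) {
  return data;
}
```

Starting from an old object "AAAAAAAA" and new data "bbb", apply_write
yields "bbbAAAAA" while apply_write_full yields "bbb".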
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
Create a libcommon service thread. Use it to handle SIGHUP.
Handle it by means of a flag that gets set. Using a queue would raise
the complicated question of what to do when the queue was full.
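
A rough sketch of the flag approach, assuming a POSIX system. The names
below are hypothetical and the service thread itself is left out: the
handler only sets a flag, and the thread would call consume_sighup()
periodically. Several signals that arrive between two checks collapse
into one, which is why nothing can overflow.

```cpp
#include <signal.h>

// Set by the handler, cleared by the service thread.
static volatile sig_atomic_t sighup_pending = 0;

// Async-signal-safe: the handler does nothing except set the flag.
extern "C" void on_sighup(int) {
  sighup_pending = 1;
}

// Install the handler with sigaction so it stays installed after the
// first delivery.
void install_sighup_handler() {
  struct sigaction sa = {};
  sa.sa_handler = on_sighup;
  sigemptyset(&sa.sa_mask);
  sa.sa_flags = 0;
  sigaction(SIGHUP, &sa, nullptr);
}

// Returns true if at least one SIGHUP arrived since the last call.
bool consume_sighup() {
  if (!sighup_pending)
    return false;
  sighup_pending = 0;
  return true;
}
```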
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
Instead of automatically marking unfound objects lost (once we've tried
every location we can think of), do it when the administrator explicitly
says to. This avoids incorrectly marking things lost when there are
peering issues, and also allows the administrator to decide whether there
may be offline osds that are worth bringing online.
Signed-off-by: Sage Weil <sage@newdream.net>
We may not want to do this automatically until we have more confidence in
the recovery code. Even then, possibly not. In particular, the OSDs may
believe they have contacted all possible homes for the data even though
there is some long-lost OSD that has the data on disk but is offline.
For now, we make the marking process explicit so that the administrator can
make the call.
Signed-off-by: Sage Weil <sage@newdream.net>
Don't use the global g_ceph_context. Instead, store the CephContext in
the structures provided by the library user.
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
A CephContext represents the context held by a single library user.
There can be multiple CephContexts in the same process.
For daemons and utility programs, there will be only one CephContext.
The CephContext contains the configuration, the dout object, and
anything else that you might want to pass to libcommon with every
function call.
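
The idea can be sketched with a small stand-in type. LibContext and
lib_log are hypothetical and much simpler than the real CephContext;
the point is only that state which used to be global travels with each
call, so two library users in one process do not share it.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical per-user context: configuration plus a log sink.
struct LibContext {
  std::map<std::string, std::string> conf;
  std::ostringstream log;
};

// Library functions take the context explicitly instead of reaching
// for a global, so each user gets its own config and its own output.
void lib_log(LibContext &cct, const std::string &msg) {
  auto it = cct.conf.find("name");
  const std::string who =
      (it != cct.conf.end()) ? it->second : std::string("unknown");
  cct.log << who << ": " << msg << "\n";
}
```

Two LibContext instances with different "name" settings log
independently. Writing through one leaves the other's log untouched.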
Move some non-config things out of md_config_t and into CephContext.
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
Split off common_init_daemonize from common_init_finish. cfuse is a
daemon that calls common_init_finish, but handles daemonization itself.
This fixes cfuse.
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
Get rid of the initialize-then-shutdown-crypto hack. We just initialize
crypto once, after it is safe to do so. There is now a single callback,
common_init_finish, which does the final stage of initialization,
including starting crypto and daemonization (if required).
common_init_finish needs to be done before messenger::start().
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>