Commit cefa55b288 moved PG initialization
into init(), but passed acting for both up and acting args. This led to
confusion between primary and replica.
Also fix debug print so that the output is useful.
Fixes: #2075, #2070
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Hold journal_lock during replay so that we don't stomp on variables like
op_seq and open_ops that the commit thread cares about.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
It's now possible to send the ack and deregister the repop before the
op_applied() happens. And when that happens, we'll call eval_repop() once
more. Don't do anything in that case.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Creating a snapshot requires using "rbd snap create",
as opposed to just "rbd create". Also for purposes of
clarification, add note that removing a snapshot similarly
requires "rbd snap rm".
Thanks to Josh Durgin for the explanation on IRC.
Signed-off-by: Florian Haas <florian@hastexo.com>
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
For repop completion, we want waitfor_ack and _commit to be empty. For
replicas, a commit reply implies ack, so ack is always a subset of commit.
But for the local write, we wait for applied separately, so a repop can
stay open after we have already sent the reply to the client. Such a repop
keeps consuming memory and generates 'old request' warnings in the logs
(when the filestore is taking a long time to apply to the fs).
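
The bookkeeping described above can be sketched roughly as follows. This is
only an illustration of the stated rule (completion needs both sets empty,
and a replica commit implies its ack). The names RepOpSketch,
on_replica_ack, on_replica_commit and repop_done are hypothetical and are
not the OSD's actual types or functions, and the separate handling of the
local apply is deliberately left out.

```cpp
#include <cassert>
#include <set>

// Hypothetical, simplified view of a replicated op's outstanding replies.
struct RepOpSketch {
  std::set<int> waitfor_ack;     // OSD ids that have not acked yet
  std::set<int> waitfor_commit;  // OSD ids that have not committed yet
};

// An ack only clears the ack set; the commit is still outstanding.
inline void on_replica_ack(RepOpSketch& op, int osd) {
  op.waitfor_ack.erase(osd);
}

// A commit reply implies ack, so clear the OSD from both sets. This keeps
// waitfor_ack a subset of waitfor_commit.
inline void on_replica_commit(RepOpSketch& op, int osd) {
  op.waitfor_commit.erase(osd);
  op.waitfor_ack.erase(osd);
}

// Completion rule from the text: both sets must be empty.
inline bool repop_done(const RepOpSketch& op) {
  return op.waitfor_ack.empty() && op.waitfor_commit.empty();
}
```

With two replicas outstanding, an ack from one of them is not enough to
finish the op; only commits from both empty the two sets.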
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Set this bit whenever up != acting. This tells you that the OSDMap is
explicitly remapping the PG to different nodes (than what CRUSH specified).
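
As a minimal sketch of that rule, assuming up and acting are plain vectors
of OSD ids: the flag constant PG_STATE_REMAPPED_SKETCH and the helper
update_remapped are hypothetical names made up for this illustration, not
the identifiers used in the tree.

```cpp
#include <cassert>
#include <vector>

// Hypothetical flag value, for illustration only.
constexpr unsigned PG_STATE_REMAPPED_SKETCH = 1u << 0;

// Return `state` with the remapped bit set exactly when up != acting,
// and cleared otherwise.
inline unsigned update_remapped(unsigned state,
                                const std::vector<int>& up,
                                const std::vector<int>& acting) {
  if (up != acting)
    return state | PG_STATE_REMAPPED_SKETCH;
  return state & ~PG_STATE_REMAPPED_SKETCH;
}
```

The comparison is order-sensitive, so the same OSDs in a different order
still count as remapped in this sketch.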
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- rename is_all_update() -> needs_recovery(), reverse logic.
- drop up != acting check; that has nothing to do with
recovery itself
- drop trigger in Active::react(const ActMap&)... it's nonsensical
- CompleteRecovery always leads to finish_recovery (or acting set change)
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Since clean now means not degraded, we need some other indication that
recovery has completed and we are "done" (given the current up/down state
of the OSDs).
Adding a 'recovering' state also makes it clearer to users that work is
being done, as opposed to the current situation, where they look for the
absence of 'clean'.
Signed-off-by: Sage Weil <sage@newdream.net>
If our last_committed == 1, we don't need a separate stash. This is the
logic that slurp() follows, so fix is_consistent() to match.
Fixes: #2077
Signed-off-by: Sage Weil <sage@newdream.net>
Normally we take a fresh map reference in PG::lock(). However,
_activate_committed needs to make sure the map hasn't changed significantly
before acting. In the case of #2068, the OSD map has moved forward and
the mapping has changed, but the PG hasn't processed that yet, and thus
mis-tags the MOSDPGInfo message.
Tag the message with the e epoch, and also pass down the primary's address
to send the message to the right location.
Fixes: #2068
Signed-off-by: Sage Weil <sage@newdream.net>
Clean means we have exactly the right number of replicas and recovery is
complete. Degraded means we do not have enough replicas, either because
recovery is in progress, or because acting is too small.
A consequence is that if we have a PG with len(up) == 1 but a pg_temp
mapping so that len(acting) == 2, it will be active and not clean.
Fixes: #2060
Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
This still makes sure daemons don't start on boot.
When auto start was disabled it would also prevent logrotate from doing its job.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
Signed-off-by: Sage Weil <sage@newdream.net>
OSDs (src/osd/ClassHandler.cc) specifically look for libcls_*.so in
/usr/$libdir/rados-classes, so libcls_rbd.so and libcls_rgw.so need to
be shipped along with the base package.
Signed-off-by: Holger Macht <hmacht@suse.de>
Signed-off-by: Sage Weil <sage@newdream.net>
We can pause() multiple times, and we need as many unpause()s to actually
resume work.
This resolves problems where we have two actors interested in pausing a
queue, both want to stop work, and they aren't interacting/coordinating.
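
The counting behaviour can be sketched like this. PauseCounter is a
hypothetical stand-in written for this note, not the work queue class
itself, and it leaves out the locking a real queue would need.

```cpp
#include <cassert>

// Hypothetical stand-in for a queue's pause state.
class PauseCounter {
 public:
  // Each actor that wants work stopped bumps the count.
  void pause() { ++depth_; }

  // Undo one earlier pause(). Work resumes only once every pause()
  // has been matched by an unpause().
  void unpause() {
    assert(depth_ > 0 && "unpause() without matching pause()");
    --depth_;
  }

  // True while at least one pause() is still outstanding.
  bool paused() const { return depth_ > 0; }

 private:
  int depth_ = 0;
};
```

With two independent actors each calling pause(), the first unpause()
leaves the queue paused and the second one resumes it.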
Signed-off-by: Sage Weil <sage@newdream.net>
Make some effort to stop work in progress, remove pid file, and exit with
informative error code.
Note that this is much simpler than the shutdown() exit path; I'm not sure
whether a complete teardown is useful. It's also difficult to maintain
and get right with everything else going on, and it's not clear that it's
worth the effort right now.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>