Fixes bug #2369. The problem was that we sometimes sent the
notification with the un-normalized bucket/obj pair. We
should make sure that we use the canonical name before doing
any cache update.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
This only affects the decision to queue or do things inline, so it is safe
to change while the filestore is up and running.
Also adjust the #ifdef so that we share a single path through the
code when sync_file_range() is missing.
Fixes: #2368
Signed-off-by: Sage Weil <sage@newdream.net>
We only want to include down osds if *all* of the prior acting osds are
down. If osd->whoami is one of them, then we're okay.
For example, if osd.13 is down, then the check below should be satisfied
because osd.14 (osd->whoami) is alive:
2012-04-27 10:46:38.746681 7f5258a63700 15 osd.14 27 calc_priors_during 6.5 [9,25)
2012-04-27 10:46:38.746688 7f5258a63700 20 osd.14 27 6.5 in epoch 9 was [13,14]
2012-04-27 10:46:38.746695 7f5258a63700 20 osd.14 27 6.5 in epoch 10 was [13,14]
2012-04-27 10:46:38.746701 7f5258a63700 20 osd.14 27 6.5 in epoch 11 was [13,14]
2012-04-27 10:46:38.746709 7f5258a63700 20 osd.14 27 6.5 in epoch 12 was [13,14]
2012-04-27 10:46:38.746715 7f5258a63700 20 osd.14 27 6.5 in epoch 13 was [13,14]
2012-04-27 10:46:38.746722 7f5258a63700 20 osd.14 27 6.5 in epoch 14 was [13,14]
2012-04-27 10:46:38.746729 7f5258a63700 20 osd.14 27 6.5 in epoch 15 was [14]
2012-04-27 10:46:38.746735 7f5258a63700 20 osd.14 27 6.5 in epoch 16 was [14]
2012-04-27 10:46:38.746742 7f5258a63700 20 osd.14 27 6.5 in epoch 17 was [14]
2012-04-27 10:46:38.746748 7f5258a63700 20 osd.14 27 6.5 in epoch 18 was [13,14]
2012-04-27 10:46:38.746755 7f5258a63700 20 osd.14 27 6.5 in epoch 19 was [13,14]
2012-04-27 10:46:38.746762 7f5258a63700 20 osd.14 27 6.5 in epoch 20 was [13,14]
2012-04-27 10:46:38.746768 7f5258a63700 20 osd.14 27 6.5 in epoch 21 was [13,14]
2012-04-27 10:46:38.746775 7f5258a63700 20 osd.14 27 6.5 in epoch 22 was [14]
2012-04-27 10:46:38.746781 7f5258a63700 20 osd.14 27 6.5 in epoch 23 was [14]
2012-04-27 10:46:38.746788 7f5258a63700 20 osd.14 27 6.5 in epoch 24 was [14]
2012-04-27 10:46:38.746790 7f5258a63700 10 osd.14 27 calc_priors_during 6.5 [9,25) = 13
In that case, it wasn't, and the pg creation was blocked.
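The intended rule can be sketched roughly as follows. This is a simplified
illustration, not the actual OSD code: calc_priors_sketch, its parameters,
and the assumption that whoami always counts as alive are hypothetical.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical, simplified stand-in for the prior-set calculation.
// acting_by_epoch: the acting set recorded for each epoch in the range.
// up_osds: osds currently known to be up; whoami: our own osd id,
// which is assumed to be alive because we are the one running.
std::set<int> calc_priors_sketch(
    const std::vector<std::vector<int>>& acting_by_epoch,
    const std::set<int>& up_osds, int whoami)
{
  std::set<int> priors;
  for (const auto& acting : acting_by_epoch) {
    int up = 0;
    for (int osd : acting) {
      if (osd == whoami || up_osds.count(osd)) {
        ++up;                    // we count ourselves as a live member
        if (osd != whoami)
          priors.insert(osd);    // live peers we need to talk to
      }
    }
    if (up == 0) {
      // Nobody from this acting set is alive: fall back to the down osds.
      for (int osd : acting)
        priors.insert(osd);
    }
  }
  return priors;
}
```

With the log above (osd.13 down, osd.14 being whoami), every epoch has at
least one live member, so under these assumptions the sketch yields an empty
prior set and pg creation would not block on osd.13.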
Fixes: #2355
Signed-off-by: Sage Weil <sage@newdream.net>
Create an is_unmanaged_snaps_mode() function to parallel
is_pool_snaps_mode(), and replace all the checks directly referencing
removed_snaps or snaps with calls to these functions.
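As a rough sketch of the idea, assuming the mode can be inferred from
whether removed_snaps is empty once a snapshot sequence exists (PoolSketch
and its fields are a pared-down, hypothetical stand-in for the real pool
type, and the exact predicates may differ):

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>

// Hypothetical, pared-down pool descriptor for illustration only.
struct PoolSketch {
  std::map<uint64_t, std::string> snaps;  // pool-managed snapshots
  std::set<uint64_t> removed_snaps;       // removed self-managed snapshots
  uint64_t snap_seq = 0;                  // 0 means no snapshot was ever taken

  // Assumed predicate: snapshots exist and none were removed by clients.
  bool is_pool_snaps_mode() const {
    return removed_snaps.empty() && snap_seq > 0;
  }
  // Assumed predicate: clients manage (and have removed) snapshots.
  bool is_unmanaged_snaps_mode() const {
    return !removed_snaps.empty() && snap_seq > 0;
  }
};
```

Callers then ask the pool which mode it is in rather than inspecting
removed_snaps or snaps themselves.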
Fixes: #2345
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Give each Messenger a logical name describing its role. For instance, the
OSD will have client, cluster, and heartbeat messengers.
Signed-off-by: Sage Weil <sage@newdream.net>
There are plenty of scenarios where the user doesn't need a config file.
Instead, just print a warning and let things move on.
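A minimal sketch of that behavior, using only the standard library
(try_load_config_sketch and the warning text are hypothetical, not the
actual config code):

```cpp
#include <cassert>
#include <fstream>
#include <iostream>
#include <iterator>
#include <string>

// Hypothetical helper: a missing config file is reported, not fatal.
// Returns true and fills `contents` only if the file could be opened.
bool try_load_config_sketch(const std::string& path, std::string& contents)
{
  std::ifstream in(path);
  if (!in) {
    std::cerr << "warning: unable to open config file " << path
              << "; continuing without it\n";
    return false;  // caller moves on with defaults
  }
  contents.assign(std::istreambuf_iterator<char>(in),
                  std::istreambuf_iterator<char>());
  return true;
}
```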
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
There is nobody responding to CLOSE_STDERR, but this block sure looks
like it should be doing so. Fix that!
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
These may be NULL if we expand the addr vectors but haven't ever stored an
address yet. Check for NULL and return a reference to a blank
entity_addr_t as needed.
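The pattern looks roughly like this. AddrSketch and AddrTableSketch are
hypothetical stand-ins for entity_addr_t and the addr vectors, so treat
this as an illustration of the NULL check rather than the real accessor:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for entity_addr_t.
struct AddrSketch {
  std::string host;
  int port = 0;
};

class AddrTableSketch {
  std::vector<std::shared_ptr<AddrSketch>> addrs_;
public:
  // Growing the vector leaves the new slots as null pointers.
  void resize(std::size_t n) { addrs_.resize(n); }
  void set(std::size_t i, AddrSketch a) {
    addrs_.at(i) = std::make_shared<AddrSketch>(std::move(a));
  }
  // Never dereference an empty slot: hand back a shared blank instead.
  const AddrSketch& get(std::size_t i) const {
    static const AddrSketch blank{};
    const std::shared_ptr<AddrSketch>& p = addrs_.at(i);
    return p ? *p : blank;
  }
};
```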
Signed-off-by: Sage Weil <sage@newdream.net>
This allows the rbd tool to provide a useful error message, instead of
compounding more possible causes into one error code.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
In 313c1566d3 we switched to using the
get_addr() accessor methods, which assert that the osd exists. Check that
before calling.
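In outline, the caller-side guard looks like the sketch below.
OsdMapSketch and addr_or_empty are hypothetical; only the idea that
get_addr() asserts existence comes from this change.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical, minimal map with an asserting accessor.
class OsdMapSketch {
  std::map<int, std::string> addrs_;
public:
  void add(int osd, std::string addr) { addrs_[osd] = std::move(addr); }
  bool exists(int osd) const { return addrs_.count(osd) > 0; }
  const std::string& get_addr(int osd) const {
    assert(exists(osd));  // the accessor insists that the osd exists
    return addrs_.at(osd);
  }
};

// Caller checks exists() first, so the assert can never fire here.
std::string addr_or_empty(const OsdMapSketch& m, int osd)
{
  return m.exists(osd) ? m.get_addr(osd) : std::string();
}
```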
Fixes: #2361
Signed-off-by: Sage Weil <sage@newdream.net>
Currently we drop and retake locks during handle_osd_map calls to
advance_map and activate_map. Instead, take them all once, and hold them.
This avoids leaving dirty in-core state in the PG without the lock held.
This will clearly go away as soon as the map threading stuff is redone.
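A rough sketch of the locking shape, with std::mutex standing in for the
PG lock (PGLockSketch and handle_map_sketch are hypothetical names, and
the two loops only mimic the advance_map and activate_map passes):

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical PG with its own lock and some in-core state.
struct PGLockSketch {
  std::mutex lock;
  int epoch = 0;
  bool active = false;
};

void handle_map_sketch(std::vector<std::unique_ptr<PGLockSketch>>& pgs,
                       int new_epoch)
{
  // Take every PG lock once, up front, and keep them all held.
  std::vector<std::unique_lock<std::mutex>> held;
  held.reserve(pgs.size());
  for (auto& pg : pgs)
    held.emplace_back(pg->lock);

  for (auto& pg : pgs)
    pg->epoch = new_epoch;  // stands in for the advance_map pass
  for (auto& pg : pgs)
    pg->active = true;      // stands in for the activate_map pass
}  // all locks are released here, after both passes are done
```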
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Make sure we record any rewind_divergent_log. In the activate case, this
will happen anyway, but mark it dirty here for correctness/completeness.
The merge_log case might be a bug.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
all_activated_and_committed() is called from _activate_committed(), which
is called from an objectstore completion, and also from the state machine,
where it is part of a larger transaction.
Instead, set dirty_info, and build/apply a transaction in the caller
(the completion) as needed. Fixes part of #2360.
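The split of responsibilities can be sketched like this. All names other
than dirty_info and all_activated_and_committed are hypothetical, and the
transaction is reduced to a list of op names:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an objectstore transaction.
struct TransactionSketch {
  std::vector<std::string> ops;
};

struct PGSketch {
  bool dirty_info = false;

  // May run inside someone else's transaction, so it only sets the flag.
  void all_activated_and_committed() { dirty_info = true; }

  void write_info(TransactionSketch& t) {
    t.ops.push_back("write_info");
    dirty_info = false;
  }
};

// Completion path: there is no enclosing transaction here, so the caller
// builds one and applies it, but only if the info was actually dirtied.
bool activate_committed_sketch(PGSketch& pg,
                               std::vector<TransactionSketch>& applied)
{
  pg.all_activated_and_committed();
  if (!pg.dirty_info)
    return false;
  TransactionSketch t;
  pg.write_info(t);
  applied.push_back(t);  // stands in for applying the transaction
  return true;
}
```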
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
We shouldn't modify the local notion of the history without recording it to
disk. And we (probably) also don't need to do that at all on query.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
In proc_replica_info and proc_primary_info, we may or may not update
the pg_info_t. If we do, set dirty_info, so that it will be recorded.
Same goes for when the primary pushes out updated stats to us.
Also, do not write a purged_snaps() update directly; rely on the caller
to write out dirty info.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Previously we would check and write dirty_info *without the pg lock* after
doing the advance and activate map calls. This was unlikely to race with
anything because the queues were drained, but definitely not right.
Instead, do the write in activate_map, or explicitly if activate_map is
not called (so that we record our progress after handling maps when we are
not up).
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Set this up in either global_init() or common_init_finish(), both opportune
times that occur after config parsing has happened and the user has the
option to modify this behavior. The exception would be libraries like
librados, which can't use rados_conf_* to enable this. Arguably flush
functionality should be exposed through the librados API directly, instead
of futzing with on_exit().
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Share past intervals when starting up new replicas. This can happen via
an MOSDPGInfo or an MOSDPGLog message.
Fix up get_or_create_pg() so the past_intervals arg is required (and a ref,
like the other args). Fix doxygen comment.
Now the only time generate_past_intervals() should do any work is when
upgrading old clusters, during pg creation, and (possibly) during pg
split (when that is fully implemented).
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>