Sage Weil
a01d2dfc87
osd: accessors for num_pgs
...
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:27:00 -05:00
Sage Weil
bfbf2044b2
osd: fix old wake_pg_waiters references
...
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:59 -05:00
Sage Weil
dc66a055ea
osd: fix 'stale' message
...
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:59 -05:00
Sage Weil
08381749f6
osd: constify arg for handle_pg_create_info, maybe_wait_for_max_pg
...
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:59 -05:00
Sage Weil
db35bbd352
osd: constify arg to prime_splits
...
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:59 -05:00
Sage Weil
dba7521d92
osd: constify arg to identify_splits
...
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:59 -05:00
Sage Weil
c9bf02f481
osd: drop unused pushes_to_free variable on _process
...
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:59 -05:00
Sage Weil
3bd333810c
osd: handle pushes_to_free in consume_map
...
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:59 -05:00
Sage Weil
b57d40991c
osd: synchronously remove pgids when pool tombstone is missing or invalid
...
This is needed for upgraded clusters (e.g., v13.0.2 clusters with a
missing ec_profile or upgraded clusters with partially-deleted pools/pgs).
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:59 -05:00
Sage Weil
26f00dd67c
qa/suites: mon warn on pool no app = false for api tests
...
Among other things, the list.cc tests set pg_num which waits for cluster
healthy.
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:58 -05:00
Sage Weil
c2cce3bc88
qa/suites/rados/basic/tasks/rados_api_tests: debug ms = 1
...
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:58 -05:00
Sage Weil
0e6db5e320
osd: periodically request newer map from mon if waiting peering events
...
If we have peering events waiting on a newer map than we have, request it
from the mon. Do this periodically in tick so that we normally wait to get
it from a peer first.
This avoids a deadlock situation where we are, say, waiting for a newer
map to create a pg but never get the map to do it (because the
cluster is idle).
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:58 -05:00
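The tick-driven request described in 0e6db5e320 can be pictured roughly as follows. This is a minimal sketch, not the code from the commit: `MapWaiter`, its members, and the three-tick interval are all hypothetical names and values chosen for illustration. It only shows the idea of waiting a few ticks (giving a peer the chance to share the map) before asking the mon.

```cpp
#include <cassert>
#include <cstdint>

using epoch_t = uint32_t;

// Hypothetical stand-in for the OSD state involved; not the real OSD class.
struct MapWaiter {
  epoch_t current_epoch = 0;       // newest map epoch we hold
  epoch_t max_waiting_epoch = 0;   // newest epoch any queued peering event needs
  int ticks_since_request = 0;     // ticks spent waiting since the last request
  int ticks_between_requests = 3;  // assumed interval, not taken from the source

  // A peering event was queued that needs map epoch e.
  void note_waiting(epoch_t e) {
    if (e > max_waiting_epoch) max_waiting_epoch = e;
  }

  // A newer map arrived, from a peer or from the mon.
  void got_map(epoch_t e) {
    if (e > current_epoch) current_epoch = e;
  }

  // Called from the periodic tick. Returns the first epoch to request from
  // the mon, or 0 if nothing should be requested on this tick.
  epoch_t tick() {
    assert(ticks_between_requests > 0);
    if (max_waiting_epoch <= current_epoch) {
      ticks_since_request = 0;     // nothing is blocked on a newer map
      return 0;
    }
    if (++ticks_since_request < ticks_between_requests) {
      return 0;                    // give a peer time to send the map first
    }
    ticks_since_request = 0;
    return current_epoch + 1;      // ask the mon, starting at the next epoch
  }
};
```

On an idle cluster no peer will ever send the map, so without the periodic request the waiting event would stay queued indefinitely.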
Sage Weil
740b7809af
osd: use rctx transaction for PG removal
...
In the normal case, queue up the removal work on the rctx transaction.
For the final cleanup, since we need to block, dispatch it ourselves, and
do not do so in OSD.cc.
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:58 -05:00
Sage Weil
11a9fbecf9
osd: some debug output in identify_split_children
...
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:58 -05:00
Sage Weil
68dac914ed
osd/PG: do final pg delete transaction on pg sequencer
...
Simpler, cleaner. Also, this way we flush before returning.
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:58 -05:00
Sage Weil
1eec5bb6a2
osd: better debug output in identify_splits
...
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:58 -05:00
Sage Weil
5155baf323
osd: handle NOUP flag vs boot race
...
If we digest maps that show a NOUP flag change *and* we also go active,
there is no need to restart the boot process--we can just go/stay active.
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:58 -05:00
Sage Weil
29a885c915
qa/suites/rados/singleton/all/recovery_preemption: make test more reliable
...
A 30 second run did only 7000 ops, which means ~50 log entries per pg...
not enough to trigger backfill.
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:57 -05:00
Sage Weil
c3589df320
qa/suites/rados/singleton/all/mon-seesaw: whitelist PG_AVAILABILITY
...
The seesaw might delay pg creation by more than 60s.
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:57 -05:00
Sage Weil
494d02c349
osd/PG: ensure an actual transaction gets queued for recovery finish
...
Otherwise, this context gets leaked and lost.
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:57 -05:00
Sage Weil
ce699ff870
osd: close split vs query race in consume_map
...
Consider the race:
- shard 0 consumes epoch E
- shard 1 consumes epoch E
- shard 1 pg P will split to C
- shard 0 processes query on C, returns DNE
- shard 0 primes slot C
Close the race by priming split children before consuming the map into each
OSDShard. That way the query will either (1) arrive before E and before
slot C is primed and wait for E, or (2) find the slot present with
waiting_for_split true.
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:57 -05:00
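The ordering fix in ce699ff870 can be illustrated with a small model. This is a sketch under stated assumptions: `Shard`, `Slot`, `QueryResult`, and this `consume_map` signature are simplified, hypothetical stand-ins rather than the real OSD types. The point is only that every split child is primed before any shard advances to epoch E.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

using epoch_t = uint32_t;

// Hypothetical, simplified pg slot.
struct Slot {
  bool waiting_for_split = false;
  bool has_pg = false;
};

// Hypothetical, simplified shard.
struct Shard {
  enum class QueryResult { WaitForMap, WaitForSplit, Exists, DoesNotExist };

  epoch_t epoch = 0;
  std::map<std::string, Slot> slots;

  QueryResult handle_query(const std::string& pgid, epoch_t query_epoch) const {
    if (query_epoch > epoch) return QueryResult::WaitForMap;  // case (1)
    auto it = slots.find(pgid);
    if (it == slots.end()) return QueryResult::DoesNotExist;
    if (it->second.waiting_for_split) return QueryResult::WaitForSplit;  // case (2)
    return it->second.has_pg ? QueryResult::Exists : QueryResult::DoesNotExist;
  }
};

// split_children holds (shard index, child pgid) pairs; how children map to
// shards is assumed to be computed elsewhere.
void consume_map(std::vector<Shard>& shards, epoch_t e,
                 const std::vector<std::pair<std::size_t, std::string>>& split_children) {
  // Step 1: prime every split child slot while all shards are still pre-E.
  for (const auto& child : split_children) {
    shards.at(child.first).slots[child.second].waiting_for_split = true;
  }
  // Step 2: only now let each shard consume epoch E.
  for (auto& s : shards) {
    assert(e >= s.epoch);  // maps never move backwards
    s.epoch = e;
  }
}
```

With the two steps reversed, a shard already at epoch E could look up child C, find no slot, and answer DNE, which is the race listed above.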
Sage Weil
b4d96be92d
osd: improve documentation for event queue ordering and requeueing rules
...
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:57 -05:00
Sage Weil
ff0f798e1b
osd/PG: flush sequencer/collection on shutdown
...
This should catch any in-flight work we have.
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:57 -05:00
Sage Weil
40a92a1f56
osd/PG: move shutdown into PG
...
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:57 -05:00
Sage Weil
c454184d5e
osd/osd_types: fix pg_t::pool() return type (uint64_t -> int64_t)
...
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:57 -05:00
Sage Weil
38319f8300
mon/OSDMonitor: disallow pg_num changes until after pool is created
...
The pg create handling OSD code does not handle races between a mon create
message and a split message.
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:56 -05:00
Sage Weil
334bf7e3dc
osd/PG: set send_notify on child
...
If we are a non-primary, we need to ensure the split children send
notifies.
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:56 -05:00
Sage Weil
0713586300
osd: kill broken _process optimization; simplify null pg flow
...
- drop fast queue to waiting list optimization: it breaks ordering and is
a useless optimization
- restructure so that we don't drop the lock and revalidate the world if
pg == nullptr
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:56 -05:00
Sage Weil
f9667a9ef3
osd: fix fast pg create vs limits
...
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:56 -05:00
Sage Weil
b4af83d735
osd: (pre)publish map before distributing to shards (and pgs)
...
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:56 -05:00
Sage Weil
cf50361066
osd: update numpg_* counters when removing a pg
...
Usually on a pg create we see an OSDMap update; on PG removal completion
we may not.
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:56 -05:00
Sage Weil
9d6425ab25
osd: decrement deleting pg count in _delete_some
...
The exit() method for ToDelete state doesn't run on PG destruction.
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:56 -05:00
Sage Weil
3b970e32b0
osd: clear shard osdmaps during shutdown
...
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:56 -05:00
Sage Weil
bfeae027aa
osd: make safe osdmap accessor for OSDShard
...
The advance_pg needs to get the shard osdmap without racing against
consume_map().
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:56 -05:00
Sage Weil
540b1bc9e6
osd: clean up mutex naming for OSDShard
...
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:55 -05:00
Sage Weil
183e7d7bc2
common/tracked_int_ptr: fix operator= return value
...
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:55 -05:00
Sage Weil
3a0b197cd1
osd: fix pg removal vs _process race
...
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:55 -05:00
Sage Weil
7fb35ff961
osd: lookup_*pg must return PGRef
...
Otherwise it is fundamentally unsafe, as the PG might get destroyed out
from under us without a reference.
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:55 -05:00
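The safety argument in 7fb35ff961 can be sketched as below. This is an illustration only: `PGRegistry` is a hypothetical container, and `std::shared_ptr` stands in for the reference-counted `PGRef` type so the example stays self-contained. The lookup copies the reference while the lock is held, so the caller's PG stays alive even if it is removed from the map immediately afterwards.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <string>

// Hypothetical, simplified PG.
struct PG {
  std::string id;
};

// Stand-in for the reference-counted PGRef; shared_ptr is an assumption
// made here for self-containment.
using PGRef = std::shared_ptr<PG>;

// Hypothetical registry, not the real OSD pg map.
class PGRegistry {
  std::mutex lock;
  std::map<std::string, PGRef> pgs;

public:
  void add(const std::string& id) {
    auto pg = std::make_shared<PG>();
    pg->id = id;
    std::lock_guard<std::mutex> l(lock);
    pgs[id] = pg;
  }

  void remove(const std::string& id) {
    std::lock_guard<std::mutex> l(lock);
    pgs.erase(id);
  }

  // Returns a counted reference taken under the lock, or an empty PGRef if
  // the pg is unknown. A raw PG* returned here could dangle as soon as the
  // lock is released and another thread calls remove().
  PGRef lookup(const std::string& id) {
    std::lock_guard<std::mutex> l(lock);
    auto it = pgs.find(id);
    return it == pgs.end() ? PGRef() : it->second;
  }
};
```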
Sage Weil
1270b49fb5
osd: kill pass-through _open_pg
...
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:55 -05:00
Sage Weil
486faa482a
osd: remove old min pg epoch tracking
...
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:55 -05:00
Sage Weil
bc9436bcb5
osd/PG: remove RecoveryCtx on_applied and on_commit
...
These were awkward and unnecessary.
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:55 -05:00
Sage Weil
7a9153c4b3
osd/PG: register delete completion directly on Transaction
...
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:55 -05:00
Sage Weil
ed72f30db7
osd: register split completion directly on Transaction
...
No need to use wonky RecoveryCtx C_Contexts.
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:54 -05:00
Sage Weil
2c2378c49e
osd/PG: drop unused context list accessors for RecoveryCtx
...
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:54 -05:00
Sage Weil
45e07480df
osd/PG: register recovery finish context directly on Transaction
...
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:54 -05:00
Sage Weil
643714ff96
osd/PG: drop unused activate() context list arg
...
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:54 -05:00
Sage Weil
a5494b815c
osd/PG: register flush completions directly on the Transaction
...
No need for an awkward list passed as an arg; all of these callbacks end up
on the Transaction anyway.
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:54 -05:00
Sage Weil
6c52e5d1c7
osd: wait for pg epochs based on shard tracking
...
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:54 -05:00
Sage Weil
9895c9f1a9
osd: index pg (slots) by map epoch within each shard
...
This will replace the epoch tracking in OSDService shortly.
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:54 -05:00
Sage Weil
e178a6d876
osd/PG: link back to pg slot
...
Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:54 -05:00