Commit Graph

15851 Commits

Author SHA1 Message Date
Sage Weil
29158d7d0f mds: fix validation of (slave) request attempts
Verify that slave requests received are not stale.

Verify that slave replies match the currently processing request.

Clean up the code a bit.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-07-29 15:14:20 -07:00
Sage Weil
6ad7dfbcb7 mds: identify slave requests with reqid + attempt number
We need to distinguish between different attempts to process a request, or
else we can get annoying races in the slave request handling code.  E.g.,

- request sent to mds A
- A authpins items on B, B registered slave_request
- A forwards request to C, sends slave finish to B
- C receives request, sends authpin slave request to B
- B receives C's authpin request, discards (*)
- B receives A's finish, closes slave request

First we just add tracking of the attempt number.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-07-29 15:14:19 -07:00
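
The race described in the entry above can be illustrated with a small, self-contained sketch. It is not the MDS code: `SlaveRequest`, `SlaveTable`, `Verdict`, and the policy of restarting on a newer attempt are all hypothetical, chosen only to show why tagging each slave request with reqid + attempt lets B tell A's old attempt apart from C's new one.

```cpp
#include <cstdint>
#include <map>

// Hypothetical message: a request id plus the attempt number that sent it.
struct SlaveRequest {
  uint64_t reqid;
  uint32_t attempt;
};

enum class Verdict { Accept, DropStale, Restart };

// Hypothetical per-MDS table of slave requests currently registered.
class SlaveTable {
  std::map<uint64_t, uint32_t> current_;  // reqid -> attempt being processed

public:
  Verdict on_request(const SlaveRequest& r) {
    auto it = current_.find(r.reqid);
    if (it == current_.end()) {
      current_[r.reqid] = r.attempt;  // first time we see this request
      return Verdict::Accept;
    }
    if (r.attempt < it->second) return Verdict::DropStale;  // older attempt
    if (r.attempt > it->second) {
      it->second = r.attempt;  // assumed policy: newer attempt supersedes
      return Verdict::Restart;
    }
    return Verdict::Accept;  // same attempt, e.g. a further authpin
  }

  // A finish closes the slave request only if it names the tracked attempt.
  bool on_finish(const SlaveRequest& r) {
    auto it = current_.find(r.reqid);
    if (it == current_.end() || it->second != r.attempt) return false;
    current_.erase(it);
    return true;
  }
};
```

With this shape, a late finish from the first attempt no longer closes state that belongs to the forwarded attempt.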
Greg Farnum
97c3bcb7fd scatterlock: fix flag assignments.
Want |= to set a flag, not &=!

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-07-29 14:58:01 -07:00
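
For readers unfamiliar with the bug class fixed above, here is a minimal sketch of setting and clearing bits in a packed flag word. The flag names and helper functions are hypothetical and are not taken from the scatterlock code.

```cpp
#include <cstdint>

// Hypothetical flag bits, for illustration only.
enum : uint8_t {
  FLAG_SCATTER_WANTED   = 1 << 0,
  FLAG_UNSCATTER_WANTED = 1 << 1,
  FLAG_DIRTY            = 1 << 2,
};

// Setting a flag ORs its bit in; all other bits are left untouched.
inline uint8_t set_flag(uint8_t flags, uint8_t f) { return flags | f; }

// Clearing a flag ANDs with the inverted mask.
inline uint8_t clear_flag(uint8_t flags, uint8_t f) { return flags & ~f; }

// The buggy form: ANDing with the flag itself can never turn the bit on,
// and it wipes every other bit as a side effect.
inline uint8_t buggy_set_flag(uint8_t flags, uint8_t f) { return flags & f; }

inline bool has_flag(uint8_t flags, uint8_t f) { return (flags & f) != 0; }
```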
Colin Patrick McCabe
7b574ffc46 osdmap: in json dump, dump out/in, up/down status
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
2011-07-29 14:49:38 -07:00
Yehuda Sadeh
c775c03d91 rgw: get current utc epoch differently
Previously, tm.tm_isdst was returning random results, which happened
to work correctly most of the time because we are currently in DST.
2011-07-29 14:37:07 -07:00
Yehuda Sadeh
aba88f5220 rgw: init correctly req_state->{bucket, object}
2011-07-29 13:17:38 -07:00
Yehuda Sadeh
a4e4c08343 rgw: fix total time reporting in rgw_admin
2011-07-29 11:48:23 -07:00
Yehuda Sadeh
5c194f5d58 rgw: tweak content-md5 handling
2011-07-29 11:48:23 -07:00
Greg Farnum
86c7260bfc heartbeatmap: fix/clarify the commenting
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-07-29 09:23:24 -07:00
Greg Farnum
925cb462f9 scatterlock: compress boolean flags into a set of state flags
While we're at it, unify the naming structure a bit and remove
the unused stale flag.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-07-29 09:17:45 -07:00
Greg Farnum
acca584b2b scatterlock: rename scatter_flags -> state_flags
We want to use this for all the bools, not just the scatter ones.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-07-29 09:17:45 -07:00
Sage Weil
521940242d Makefile: remove from libglobal
Signed-off-by: Sage Weil <sage@newdream.net>
2011-07-28 16:42:55 -07:00
Colin Patrick McCabe
90ce2f7d5b Add -ltr to libcommon
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
2011-07-28 16:31:09 -07:00
Sage Weil
8524ed5dfe Makefile: -lrt for libglobal.la only
Debugging linking is a pita.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-07-28 16:28:27 -07:00
Sage Weil
fadb5ae75a unittest_bufferlist: change include order
fixes a build error (int type conflicts) for me on fatty.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-07-28 16:26:58 -07:00
Sage Weil
f82b6298df mds: fix log trimming races
trim() would iterate over segments.  It would take the *p segment, ++p,
then call try_expire().  But the _expired() function would also clean up
and (if possible) retire subsequent segments on the list if they were on
the expired list, invalidating the p iterator.

Untangle the mess by making expired segment trimming (i.e. removing from
segment list) a separate operation performed only by trim() (probably a
good idea anyway).  This keeps the iterator safe/stable.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-07-28 16:01:07 -07:00
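
The shape of the fix described above (expiry only records state, and a single function removes entries from the list) can be sketched as follows. This is an illustration under assumptions, not the MDLog code: segments are plain integers, `Log`, `mark_expired`, and the front-to-back trimming policy are hypothetical.

```cpp
#include <list>
#include <set>

// Hypothetical stand-in for the log: segments are identified by an int.
struct Log {
  std::list<int> segments;  // ordered oldest first
  std::set<int> expired;    // segments whose expiry has completed

  // Expiry only records the fact. It never erases from `segments`,
  // so it cannot invalidate an iterator held by a caller.
  void mark_expired(int seg) { expired.insert(seg); }

  // trim() is the only place that removes segments from the list.
  // Assumed policy: stop at the first segment that has not expired.
  int trim() {
    int removed = 0;
    auto p = segments.begin();
    while (p != segments.end() && expired.count(*p)) {
      expired.erase(*p);
      p = segments.erase(p);  // erase() returns the next valid iterator
      ++removed;
    }
    return removed;
  }
};
```

Because removal happens in one loop that always continues from the iterator returned by `erase()`, no other code path can pull a node out from under it.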
Sage Weil
8fe50b84d8 mds: separate type for gratuitous debug ESubtreeMaps
Give these a different type so they are not interpreted as subtree
boundaries during replay.  Otherwise we break the truncate_finish code,
which references the truncate_start logsegment by offset.  Probably other
stuff too.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-07-28 16:01:07 -07:00
Sage Weil
61a501283f mon: 'ceph mon dump [--format=json]'
Signed-off-by: Sage Weil <sage@newdream.net>
2011-07-28 16:01:07 -07:00
Sage Weil
70dee89692 heartbeatmap: unit test
Signed-off-by: Sage Weil <sage@newdream.net>
2011-07-28 16:01:06 -07:00
Sage Weil
3336665339 heartbeatmap: we don't care about pthread_t
Workers don't have to be threads.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-07-28 16:01:06 -07:00
Sage Weil
7815237ebc client: open session with all mds targets
If we have an open session with an mds, we need to have an open session
with each of its targets as well.

The problem arises in a sequence such as:

- client has old mdsmap
- mds A adds B as target in mdsmap
- send request to mds A
- A exports to B
- we get the EXPORT, but B isn't listed as a target for A in client map
- client gets updated map

At the time we receive the map we need to open the session to B.   We can't
really do it when we get the EXPORT because we don't know the target MDS.

We can either track which exports are pending to do it, or just blindly
open sessions with targets for any MDSs we have caps with.  Which is
basically every session we have open.  That's simplest for now.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-07-28 16:01:06 -07:00
Sage Weil
a3ed402bb5 Makefile: fix unittest_ceph_argparse build
Signed-off-by: Sage Weil <sage@newdream.net>
2011-07-28 15:55:53 -07:00
Colin Patrick McCabe
c304c2c692 injectargs: complain about unparsed args
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
2011-07-28 15:17:38 -07:00
Colin Patrick McCabe
498dd53729 injectargs: print out what is changing
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
2011-07-28 14:49:18 -07:00
Yehuda Sadeh
0b66ed2cd7 rgw: fix base64 check
2011-07-28 14:40:02 -07:00
Yehuda Sadeh
ef509b312e rgw: check content md5 validity when doing auth
2011-07-28 14:29:52 -07:00
Yehuda Sadeh
4a8d8f0eaf rgw: fix date checks
2011-07-28 14:03:38 -07:00
Yehuda Sadeh
07e60616d6 rgw: fix authentication
2011-07-28 14:03:38 -07:00
Greg Farnum
dc4834b6ea scatterlock: convert [un]scatter_wanted to a bitfield
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-07-28 12:55:33 -07:00
Greg Farnum
579f2e92c5 mds: Handle unscatter_wanted in try_eval(lock, need_issue)
commit:dac1dc83ee5598ca97c29cd5d0b12150685cd05b added handling
for scatter_wanted, but we need to handle unscatter_wanted here too.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-07-28 12:55:33 -07:00
Greg Farnum
f5f6b120ad mds: Split the CInode::scatter_wanted field in two
We use this field to indicate we want a scatter or an unscatter. Make
that distinction explicit.
Also, clear the unscatter_wanted in simple_lock when we start a gather!

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-07-28 12:55:33 -07:00
Sage Weil
3d9621febf heartbeatmap: fix mode
Signed-off-by: Sage Weil <sage@newdream.net>
2011-07-28 10:11:00 -07:00
Sage Weil
8e4a35884b heartbeatmap: warn if previous deadline is missed
This will generate missed deadline noise in the log that may otherwise be
missed by an infrequent heartbeat_interval.  We generally want to know if
deadlines are missed, but we don't necessarily need to touch the heartbeat
file every second.  This gets us both.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-07-28 10:10:51 -07:00
Sage Weil
a9813336b1 ceph_context: only wake up periodically if heartbeat_interval is set
Signed-off-by: Sage Weil <sage@newdream.net>
2011-07-28 09:50:53 -07:00
Sage Weil
6eb213d34b osd: no need to explicitly check health
The service thread does it now.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-07-28 09:49:24 -07:00
Sage Weil
9e8bb84e90 vstart: set heartbeat file
Signed-off-by: Sage Weil <sage@newdream.net>
2011-07-28 09:49:23 -07:00
Sage Weil
7265e5cc20 ceph_context: check internal heartbeat in cct service thread
Signed-off-by: Sage Weil <sage@newdream.net>
2011-07-28 09:49:23 -07:00
Sage Weil
31d5cbbbe3 heartbeatmap: config options, method to touch a file if healthy
Signed-off-by: Sage Weil <sage@newdream.net>
2011-07-28 09:49:23 -07:00
Sage Weil
3dfe830e3b heartbeatmap: use atomic_t
Signed-off-by: Sage Weil <sage@newdream.net>
2011-07-28 09:49:23 -07:00
Sage Weil
e68e4f3321 heartbeatmap: put in ceph namespace
Signed-off-by: Sage Weil <sage@newdream.net>
2011-07-28 09:49:23 -07:00
Sage Weil
058647f9c2 heartbeatmap: simplify api
reset_timeout(), clear_timeout() makes more sense than "touch".

Signed-off-by: Sage Weil <sage@newdream.net>
2011-07-28 09:49:23 -07:00
Sage Weil
7aad8f03e0 heartbeatmap: fix stupid race
atomic_t is probably better here, actually... :/

Signed-off-by: Sage Weil <sage@newdream.net>
2011-07-28 09:49:23 -07:00
Sage Weil
f5db9afb45 heartbeatmap: use a list<> instead of map<>
Don't need a map<> here.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-07-28 09:49:23 -07:00
Sage Weil
7c5f3bc1b5 workqueue: register and time out worker threads
Register and unregister worker threads.  Periodically touch heartbeat
when idle.  Set heartbeat timeout before processing a queue item.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-07-28 09:49:23 -07:00
Sage Weil
d7b45882f2 workqueue: provide op timeout to workqueue constructor
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-07-28 09:49:23 -07:00
Sage Weil
bdfccb0997 heartbeatmap: introduce heartbeat_map
Each thread registers and gets a private structure it can write a timeout
value to.  The timeout is time_t and always fits in a single word, so no
locking is used to update it.

Anyone can call is_healthy() to find out if any timeouts have expired.
Eventually some background thread will do this.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-07-28 09:49:23 -07:00
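
A rough sketch of the idea in the entry above: each worker owns a slot holding a deadline, and anyone can ask whether any deadline has passed. This is not the Ceph implementation. `Handle` and `add_worker` are hypothetical names, `std::atomic` is swapped in for the plain time_t word the message describes, registration is assumed to happen before workers start (the list itself is not locked), and the current time is passed in explicitly to keep the example deterministic.

```cpp
#include <atomic>
#include <ctime>
#include <list>
#include <string>
#include <utility>

// Hypothetical per-worker slot. A deadline of 0 means "nothing armed".
struct Handle {
  std::string name;
  std::atomic<std::time_t> deadline{0};
  explicit Handle(std::string n) : name(std::move(n)) {}
};

class HeartbeatMap {
  std::list<Handle> workers_;  // std::list keeps Handle addresses stable

public:
  // Assumption: called during setup, before any worker runs.
  Handle* add_worker(const std::string& name) {
    workers_.emplace_back(name);
    return &workers_.back();
  }

  // Arm a deadline `grace` seconds after `now`, e.g. before a queue item.
  static void reset_timeout(Handle* h, std::time_t now, std::time_t grace) {
    h->deadline.store(now + grace);
  }

  // Disarm the deadline, e.g. when the worker goes idle.
  static void clear_timeout(Handle* h) { h->deadline.store(0); }

  // Healthy means no armed deadline lies strictly in the past.
  bool is_healthy(std::time_t now) const {
    for (const Handle& h : workers_) {
      std::time_t d = h.deadline.load();
      if (d != 0 && d < now) return false;
    }
    return true;
  }
};
```

A worker would typically call `reset_timeout` before starting a unit of work and `clear_timeout` afterwards, while a background thread polls `is_healthy`.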
Sage Weil
cc82707f14 mds: mark ambig imports in ESubtreeMap during resolve
During resolve we may journal EImportFinish(true/false) as we resolve our
imports/exports.  And as a side-effect we may journal an ESubtreeMap.  We
need to properly mark ambig subtrees in that entry based on the
my_ambiguous_imports (resolve state), not just the migrator state (for the
active mds).

Note that the other Migrator::is_ambiguous_import() user
(send_resolve_now()) already does this correctly.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-07-28 09:49:23 -07:00
Sage Weil
06ebba7f7b mds: pin inodes on LogSegment::truncating_inodes list
For active MDS, pin when we add to the list, unpin when we finish
truncating.

For replay, pin when we replay a truncate start, unpin when we replay a
truncate finish.  Use a nice helper for both.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-07-28 09:49:23 -07:00
Sage Weil
a20b15cfa5 mds: handle aborted slave rename while waiting for second prep
When we get the first prep, we may respond to the master with an expanded
list of witnesses for the rename before making any change (or rollback
plan).  If the master fails before sending the second prep attempt, we
may end up in the abort path of _commit_slave_rename() with an empty
rollback_bl.  That's fine; don't crash.  We still need to unfreeze the
srci, but can skip the do_rename_rollback since we didn't actually journal
a change.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-07-28 09:49:22 -07:00
Sage Weil
dac1dc83ee mds: honor scatter_wanted while freezing
- mds A authpins item on mds B
- mds B starts to freeze tree containing item
- mds A tries wrlock_start on A, sends REQSCATTER to B
- mds B lock is unstable, sets scatter_wanted
- mds B lock stabilizes, calls try_eval, defers because freezing.
-> deadlock

In general, we want to avoid the eval while freezing to prevent starvation.
However, in this case with the multi-mds locking, we need to honor
the scatter_wanted even so.

Insert this check in try_eval().  This will catch it on the first try_eval
call after the lock stabilizes.  The ambiguous auth will never catch us
while freezing, and the master holds an auth_pin to prevent a freeze, so
we will never defer the eval; no need to do the same logic in the other
eval method (eval(MDSCacheObject*, ...)) used for retry.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-07-28 09:49:22 -07:00