9470 Commits

Author SHA1 Message Date
Colin Patrick McCabe
e2ba601be1 logger: fix EINTR handling
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
2010-12-07 14:00:01 -08:00
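
The EINTR fix above is only a subject line, so the sketch below illustrates the general pattern rather than the actual change: a write loop that retries when write(2) is interrupted by a signal and continues after a short write. The helper name write_fully is hypothetical and does not come from the Ceph tree.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <unistd.h>

// Hypothetical helper (not from the Ceph tree): write all of buf to fd.
// Retries when write(2) fails with EINTR and continues after short writes.
// Returns 0 on success, or a negative errno value on failure.
int write_fully(int fd, const char *buf, std::size_t len)
{
  assert(buf != nullptr || len == 0);
  while (len > 0) {
    ssize_t r = ::write(fd, buf, len);
    if (r < 0) {
      if (errno == EINTR)
        continue;            // interrupted by a signal: retry the same write
      return -errno;         // a real error: report it to the caller
    }
    buf += r;                // short write: advance past what was written
    len -= static_cast<std::size_t>(r);
  }
  return 0;
}
```

A caller would typically treat any nonzero return as a failed log write and fall back to another output.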
Colin Patrick McCabe
bacdd49352 logging: rename_output_file: fix bug
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
2010-12-07 13:57:38 -08:00
Colin Patrick McCabe
d70851ef01 logging: DoutStreambuf: Implement log-to-file
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
2010-12-07 13:55:54 -08:00
Colin Patrick McCabe
952111451c logging: Add log_to_file option
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
2010-12-07 13:46:40 -08:00
Colin Patrick McCabe
df5d4e6297 logging: DoutStreambuf improvements
Write to stdout_fileno directly rather than using a buffer, which we
would then have to flush. Fix a bug in the buffering of priorities.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
2010-12-07 12:11:18 -08:00
Colin Patrick McCabe
d4043e818f logging: add DoutStreambuf::set_prio
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
2010-12-06 23:46:07 -08:00
Colin Patrick McCabe
6c7735f692 logging: DoutStreambuf must handle stdout + stderr
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
2010-12-06 23:14:27 -08:00
Colin Patrick McCabe
12544a4912 logging: Add log_to_syslog option
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
2010-12-06 23:03:43 -08:00
Colin Patrick McCabe
5ac581df02 Rename SyslogStreambuf -> DoutStreambuf
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
2010-12-06 15:57:56 -08:00
Colin Patrick McCabe
d1e0a2ae15 logging: debug.h: move some debug functions
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
2010-12-06 15:38:14 -08:00
Colin Patrick McCabe
c94e0d2d38 logging: optimize with likely/unlikely macros
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
2010-12-06 15:38:14 -08:00
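
As an illustration of the likely/unlikely commit above, here is a minimal sketch of branch-hint macros guarding a log-level check. It assumes GCC or Clang (for __builtin_expect). The macro names are upper-cased here, and g_debug_level and should_log are hypothetical stand-ins, not the identifiers used in the commit.

```cpp
#include <cassert>

// Assumption: GCC or Clang, which provide __builtin_expect. Other compilers
// get a plain fallback so the code still builds.
#if defined(__GNUC__) || defined(__clang__)
#  define LIKELY(x)   __builtin_expect(!!(x), 1)
#  define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#  define LIKELY(x)   (!!(x))
#  define UNLIKELY(x) (!!(x))
#endif

// Hypothetical configured verbosity; messages at or below it are emitted.
static int g_debug_level = 1;

// Returns true if a message at the given level should be logged.
inline bool should_log(int level)
{
  assert(level >= 0);
  // Most log statements sit above the configured level, so hint to the
  // compiler that the "emit" branch is the rarely taken one.
  if (UNLIKELY(level <= g_debug_level))
    return true;
  return false;
}
```

The hint changes only code layout and branch prediction hints, never the result of the condition.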
Colin Patrick McCabe
9811fbd047 logging: Replace derr with dout
derr was really just an alias for STDERR. Unfortunately, after we call
daemonize, STDERR is connected to /dev/null. So just replace calls to
derr with dout so that our important messages don't get lost.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
2010-12-06 15:38:14 -08:00
Colin Patrick McCabe
ab18aaec4a logging: add g_conf.clog_to_syslog
Add a new configuration option that allows you to send central log
messages to syslog.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
2010-12-06 15:38:12 -08:00
Colin Patrick McCabe
ab61823e29 logging: LogEntry: don't pass enums by reference
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
2010-12-06 15:35:52 -08:00
Colin Patrick McCabe
fcae8a7aa0 logging: MLog.h: const cleanup
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
2010-12-06 15:30:17 -08:00
Colin Patrick McCabe
82fa7f2d0f logging: LogClient: refactor handle_log_ack
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
2010-12-06 15:30:17 -08:00
Colin Patrick McCabe
4ef069c3f0 logging: Move LogEntry.h into common with LogClient
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
2010-12-06 15:30:17 -08:00
Colin Patrick McCabe
f2ead26e7b logging: better syntax for LogClient
Rather than having to write logclient.log(LOG_ERROR, ss), coders can now
write clog.error() << "str". Auto-flushing, if enabled, is still
handled automatically.

Rename instances of LogClient to clog (central log) for consistency and
brevity.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
2010-12-06 15:29:32 -08:00
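
The clog.error() << "str" syntax described above can be sketched with a temporary object that collects the streamed text and submits it when it is destroyed at the end of the statement. Everything below (CentralLog, Entry, LogLevel, the entries vector) is a hypothetical illustration, not the LogClient code from this commit, and auto-flushing is not modeled.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical names throughout; this is not the Ceph LogClient API.
enum LogLevel { LEVEL_INFO, LEVEL_WARN, LEVEL_ERROR };

class CentralLog {
public:
  // Submitted messages, kept in memory for this sketch.
  std::vector<std::pair<LogLevel, std::string> > entries;

  void submit(LogLevel level, const std::string &msg) {
    entries.push_back(std::make_pair(level, msg));
  }

  // Temporary returned by warn()/error(). It gathers streamed values and
  // hands the finished text to the log in its destructor.
  class Entry {
  public:
    Entry(CentralLog *log, LogLevel level) : log_(log), level_(level) {
      assert(log != nullptr);
    }
    Entry(Entry &&o)
      : log_(o.log_), level_(o.level_), ss_(std::move(o.ss_)) {
      o.log_ = nullptr;      // the moved-from entry must not submit
    }
    ~Entry() {
      if (log_)
        log_->submit(level_, ss_.str());
    }
    template <typename T>
    Entry &operator<<(const T &v) { ss_ << v; return *this; }
  private:
    CentralLog *log_;
    LogLevel level_;
    std::ostringstream ss_;
  };

  Entry warn()  { return Entry(this, LEVEL_WARN); }
  Entry error() { return Entry(this, LEVEL_ERROR); }
};
```

With a CentralLog named clog, a call site reads clog.error() << "osd" << 3 << " failed"; and the message is submitted once the statement ends.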
Colin Patrick McCabe
0ea601ab26 Create SyslogStreambuf
SyslogStreambuf is a kind of stream buffer that allows you to output
characters from an ostream to syslog. Most standard IO streams can make
use of this Streambuf.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
2010-12-01 15:00:23 -08:00
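
To make the idea of a syslog-backed stream buffer concrete, here is a minimal sketch of a std::streambuf subclass that gathers characters into lines and passes each line to a sink, which defaults to syslog(3). The class name LineSinkStreambuf and the pluggable sink are assumptions made for illustration; the SyslogStreambuf in this commit may be structured differently.

```cpp
#include <cassert>
#include <functional>
#include <ostream>
#include <streambuf>
#include <string>
#include <syslog.h>
#include <utility>

// Hypothetical sketch, not the Ceph class: a stream buffer that collects
// characters into a line and passes each complete line to a sink.
class LineSinkStreambuf : public std::streambuf {
public:
  typedef std::function<void(const std::string &)> Sink;

  // Default sink: forward each line to syslog(3) at LOG_INFO priority.
  LineSinkStreambuf()
    : sink_([](const std::string &line) {
        ::syslog(LOG_INFO, "%s", line.c_str());
      }) {}

  explicit LineSinkStreambuf(Sink sink) : sink_(std::move(sink)) {
    assert(sink_ != nullptr);
  }

protected:
  // Called for every character, because no put area is configured.
  int_type overflow(int_type c) override {
    if (traits_type::eq_int_type(c, traits_type::eof()))
      return traits_type::not_eof(c);
    char ch = traits_type::to_char_type(c);
    if (ch == '\n')
      emit();                // a newline ends the current line
    else
      line_.push_back(ch);
    return c;
  }

  // Called on std::flush and std::endl: push out any partial line.
  int sync() override {
    if (!line_.empty())
      emit();
    return 0;
  }

private:
  void emit() { sink_(line_); line_.clear(); }

  Sink sink_;
  std::string line_;
};
```

An ostream constructed over this buffer, for example std::ostream os(&buf), then sends each newline-terminated line it receives to the sink.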
Sage Weil
f216b0200b Merge remote branch 'origin/lost' into unstable
Conflicts:
	src/osd/osd_types.h
2010-11-30 16:11:20 -08:00
Colin Patrick McCabe
0cc8d34e7f osd: refactor object_info_t constructor a bit
Create a copy constructor for object_info_t, since we often want to copy
an object_info_t and would rather not try to remember all the fields.
Drop the lost parameter from one of the other constructors, because it's
not used that much.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
2010-11-30 15:48:49 -08:00
Colin Patrick McCabe
cee3cd51fc osd: share_pg_log: update peer_missing
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
2010-11-30 15:48:48 -08:00
Colin Patrick McCabe
e9ccd7eb09 osd: mark_obj_as_lost: fix oloc init, eversion
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
2010-11-30 15:48:48 -08:00
Colin Patrick McCabe
c29fbb12e0 osd: mark_all_unfound_as_lost: bugfix, refactor
mark_all_unfound_as_lost: just delete items from the rmissing set as we
find them, rather than using a multi-pass system.

Update info.last_update as we go so that log printouts will look correct
(the log printout function checks info.last_update)

Don't remove from missing or missing_loc in mark_obj_as_lost.
PG::missing_loc should never have the soid, and PG::missing we handle
elsewhere.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
2010-11-30 15:48:48 -08:00
Colin Patrick McCabe
b46f847cf9 osd: mark_obj_as_lost: don't assume we have obj
In PG::mark_obj_as_lost, we have to mark a missing object as lost. We
should not assume that we have an old version of the missing object in
the ObjectStore. If the object doesn't exist in the object store, we
have to create it so that recovery can function correctly.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
2010-11-30 15:48:48 -08:00
Colin Patrick McCabe
5e243f3ee8 osd: create lost2 test
This one verifies:
1. Client asks for an unfound object and gets put to sleep
2. Object gets declared lost
3. Client wakes up

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
2010-11-30 15:48:48 -08:00
Colin Patrick McCabe
55f7e567de osd: mark_all_unfound_as_lost: set lost attr
In mark_all_unfound_as_lost, we need to set the lost bit in the objects'
object_info_t.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
2010-11-30 15:48:48 -08:00
Colin Patrick McCabe
d5e6cae2f4 radostool: fix memleak in error path
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
2010-11-30 15:48:48 -08:00
Colin Patrick McCabe
c281e1e073 osd: mark_all_unfound_as_lost: wake waiters
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
2010-11-30 15:48:48 -08:00
Colin Patrick McCabe
b15a97c71e test_lost: add lost1 test
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
2010-11-30 15:48:47 -08:00
Colin Patrick McCabe
ad4e5f36d4 osd: ReplicatedPG::do_op: error on read-from-lost
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
2010-11-30 15:48:47 -08:00
Colin Patrick McCabe
136dfdeb70 osd: don't mark objs as lost unless we're active
We don't have enough information to mark objects as lost until we
activate the PG. might_have_unfound isn't even built until PG::activate.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
2010-11-30 15:47:09 -08:00
Sage Weil
08bd4eadd2 mds: fix resolve for surviving observers
Make all survivors participate in resolve stage, so that survivors can
properly determine the outcome of migrations to the failed node that did
not complete.

The sequence (before):
 - A starts to export /foo to B
 - C has ambiguous auth (A,B) in its subtree map
 - B journals import_start
 - B fails
...
 - B restarts
 - B sends resolves to everyone
   - does not claim /foo
 - A sends resolve _only_ to B
   - does claim /foo
 - B knows its import did not complete
 - C doesn't know anything.  Also, the maybe_resolve_finish stuff was
   totally broken because the recovery_set wasn't initialized

See new (commented out) assert in Migrator.cc to reproduce the above.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-30 15:43:53 -08:00
Colin Patrick McCabe
c0e60afea5 test: dump_osd_store: sort dump output
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
2010-11-30 15:43:44 -08:00
Colin Patrick McCabe
e555899cd3 osd: active replicas process logs from primaries
In _process_pg_info, if the primary sends us a PG::Log, a replica should
merge that log into its own.

mark_all_unfound_as_lost / share_pg_log: don't send the whole PG::Log.
Just send the new entries that were just added when marking the objects
as lost.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
2010-11-30 15:43:44 -08:00
Colin Patrick McCabe
de09422497 osd: object_info_t: add lost field
We can now permanently mark objects as lost by setting the lost bit in
their object_info_t. Rev the object_info_t struct.

get_object_context: re-arrange this so that we're always setting the
lost bit. Also avoid some unnecessary steps.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
2010-11-30 15:43:44 -08:00
Colin Patrick McCabe
80f3ea10f5 Add ./ceph dump pg debug degraded_pgs_exist
./ceph dump pg debug degraded_pgs_exist returns TRUE if some pgs are
degraded; false otherwise.

tests: move start_recovery into test_common.sh.
Create recovery1 test.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
2010-11-30 15:43:44 -08:00
Colin Patrick McCabe
fb4734be56 (re)add mechanism for marking objects as lost
In activate_map, we now mark objects that we know are unfindable as
lost. This relies on the might_have_unfound set introduced earlier.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
2010-11-30 15:43:44 -08:00
Sage Weil
bf784cdb4f osd: fix object_info_t() initialization of oloc
Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-30 12:57:43 -08:00
Sage Weil
91a7559061 mds: add debug output to make completions easier to track
Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-30 12:56:15 -08:00
Sage Weil
ba1f3cb980 osd: fix misuses of OLOC_BLANK
Commit 6e2b594b fixed a bunch of bad get_object_context() calls, but even
with the parameter fixed some were still broken.  Pass in a valid oloc in
those cases.  The only places where OLOC_BLANK _is_ still used is when we
know we have the object locally and will load a valid value off disk.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-30 12:48:32 -08:00
Sage Weil
2ad901b34c Revert "mds: resolve cleanup"
This reverts commit cd53719f3c.

We need this on surviving nodes too to resolve ambiguous migrations to/from
recovering nodes.
2010-11-30 12:23:18 -08:00
Sage Weil
b39f042501 Merge branch 'testing' into unstable
Conflicts:
	src/os/FileJournal.cc
2010-11-30 12:19:39 -08:00
Sage Weil
1b06332de6 osd: make recovery_oids debug list per-pg
Otherwise we hit bad asserts if an object of the same name in different
pools is getting recovered simultaneously.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-30 11:43:19 -08:00
Greg Farnum
05ad97b6ab client: Set the DirResult buffer to NULL when deleting it.
This should fix a crash exposed by our bonnie workunit. Previously
the client would keep trying to read out of the (deleted) buffer!
2010-11-30 10:56:44 -08:00
Sage Weil
5eb8ef7f11 filejournal: fix throttle vs FULL behavior
We don't want to add to the throttler if we aren't going to queue the
write, or else we'll never take it off again.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-30 08:55:29 -08:00
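
The rule stated in the commit above, charge the throttler only for writes that are actually queued, can be sketched as follows. Throttle and Journal here are hypothetical, heavily simplified types; the real FileJournal and its throttler have different interfaces.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>

// Hypothetical, simplified types; the real FileJournal and Throttle differ.
struct Throttle {
  std::uint64_t current = 0;
  void take(std::uint64_t n) { current += n; }
  void put(std::uint64_t n) { assert(n <= current); current -= n; }
};

struct Journal {
  bool full = false;
  Throttle throttle;
  std::deque<std::uint64_t> write_queue;   // sizes of queued writes

  // Returns true if the write was queued.
  bool submit(std::uint64_t bytes) {
    if (full)
      return false;            // not queued, so nothing is charged
    throttle.take(bytes);      // charge only what will later be released
    write_queue.push_back(bytes);
    return true;
  }

  // Called when the oldest queued write has completed.
  void complete_one() {
    assert(!write_queue.empty());
    throttle.put(write_queue.front());
    write_queue.pop_front();
  }
};
```

If take() ran before the full check, a rejected write would leave its bytes in the throttler with no completion to release them.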
Sage Weil
132f74c560 Merge branch 'osd_journaling' into unstable
2010-11-30 08:32:55 -08:00
Sage Weil
7af9ffdf26 filestore: make sure blocked op_start's wake up in order
If they wake up out of order (which, theoretically, they could before) we
can screw up journal submitting order in writebehind mode, or apply order
in parallel and writeahead journal mode.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-30 08:30:57 -08:00
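
One common way to make blocked callers wake in arrival order is a ticket scheme: each caller takes a sequence number and proceeds only when that number is being served. The sketch below shows that technique under assumed names (OrderedGate and its methods); it is not the FileStore implementation from this commit.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical sketch of ticket-style ordering; not the FileStore code.
class OrderedGate {
public:
  // Hand out sequence numbers in arrival order.
  std::uint64_t take_ticket() {
    std::lock_guard<std::mutex> l(lock_);
    return next_ticket_++;
  }

  // Block until every earlier ticket has finished.
  void wait_turn(std::uint64_t ticket) {
    std::unique_lock<std::mutex> l(lock_);
    cond_.wait(l, [&] { return ticket == now_serving_; });
  }

  // Non-blocking check, mainly useful for tests and assertions.
  bool is_turn(std::uint64_t ticket) {
    std::lock_guard<std::mutex> l(lock_);
    return ticket == now_serving_;
  }

  // Mark this ticket done and let the next holder proceed.
  void finish(std::uint64_t ticket) {
    {
      std::lock_guard<std::mutex> l(lock_);
      assert(ticket == now_serving_);
      ++now_serving_;
    }
    // Wake every waiter; only the one whose ticket matches passes its
    // predicate, the rest go back to sleep.
    cond_.notify_all();
  }

private:
  std::mutex lock_;
  std::condition_variable cond_;
  std::uint64_t next_ticket_ = 0;
  std::uint64_t now_serving_ = 0;
};
```

Waking all waiters and rechecking the predicate trades a few spurious wakeups for a guarantee that the scheduler cannot reorder the callers.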
Sage Weil
fac7266d4c filestore: assert op_submit_finish is called in order
Verify/assert that we aren't screwing up the submission pipeline ordering.
Namely, we want to make sure that if op_apply_start() blocks, we wake up
in the proper order and don't screw up the journaling.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-30 08:24:57 -08:00
Sage Weil
5e391db0a6 filejournal: rework journal FULL behavior and fix throttling
Keep distinct states for FULL, WAIT, and NOTFULL.

The old code was more or less correct at one point, but assumed the seq
changed on each commit, not each operation; in its prior state it was
totally broken.

Also fix throttling (we were leaking items in the throttler that were
submitted while the journal was full).

Signed-off-by: Sage Weil <sage@newdream.net>
2010-11-30 07:54:42 -08:00