13158 Commits

Author SHA1 Message Date
Sage Weil 82282f25c1 config: back to 6 pg bits for now
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-03-05 13:30:42 -08:00
Sage Weil 46b01f4a78 Merge branch 'osd_recovery' into next 2011-03-04 14:19:40 -08:00
Yehuda Sadeh 08af63da1e rgw: put object request returns etag 2011-03-04 14:25:47 -08:00
Sage Weil c07f357885 test_missing_unfound: asdf
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-03-04 14:19:33 -08:00
Sage Weil 2494d59303 osd: requeue pg for recovery if we may have found something
If we get a peer log/missing and call search_for_missing, requeue the pg
for recovery so we can pull anything we may have just found.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-03-04 14:00:02 -08:00
Sage Weil 53428c0ea5 osd: include all up peers in might_have_unfound when desperate
If our might_have_unfound calculation was off (it currently can be, see
#865) we could prematurely give up.  Try any up OSD at this stage just to
be sure.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-03-04 13:59:24 -08:00
Sage Weil 30c5091c54 osd: recover_primary if recover_replicas starts no ops
recover_replicas may fail to start anything if we see an unexpected error.
In that case, try recover_primary immediately instead of waiting for the
PG to (hopefully) get requeued for recovery later.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-03-04 09:39:59 -08:00
Sage Weil 836f72a166 osd: discover more missing if unfound and do_recovery can't start anything
If we couldn't start any recovery ops and things are still
unfound, see if we can discover more missing object locations.
It may be that our initial locations were bad and we errored
out while trying to pull.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-03-04 09:38:47 -08:00
Colin Patrick McCabe 964f1e197f Fix test/signals.cc
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
2011-03-03 22:32:21 -08:00
Sage Weil ab74d498a9 librados: cosmetic header changes
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-03-03 21:34:26 -08:00
Josh Durgin e779a3c03c librados, librbd: use separate IoCtxs for data and metadata
Adds deep copy method IoCtx::dup, so that the data and metadata
contexts can have different snap_seqs and snap contexts.

Also avoid calling Rados::shutdown explicitly, since the destructor
will do this, and it must run after the Image destructor.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-03-03 16:10:39 -08:00
Josh Durgin 37edd473c8 librbd: fix error message and unnamed constant
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-03-03 16:10:26 -08:00
Josh Durgin 4934329634 librbd: change RBD::open to take a reference to an Image instead of a pointer
This makes the API more consistent with the librados API.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-03-03 16:08:58 -08:00
Josh Durgin fdd50a150a librados: remove unused member of IoCtx
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-03-03 16:08:41 -08:00
Yehuda Sadeh 1de0b274cb librados: IoCtx destructor should put reference only if initialized 2011-03-03 16:08:27 -08:00
Yehuda Sadeh 2e3b84486c librados: can set up object locator 2011-03-03 16:06:27 -08:00
Colin Patrick McCabe f45a790f6e librados: rados_ioctx_stat -> rados_ioctx_pool_stat
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
2011-03-03 16:05:59 -08:00
Yehuda Sadeh 3ebaa4c7d9 object_locator: fix clear() 2011-03-03 16:05:47 -08:00
Colin Patrick McCabe b2ceb75cf5 librados: use atomic_t for reference count
Use an atomic_t for the reference count in IoCtxImpl.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
2011-03-03 16:04:27 -08:00
Josh Durgin 6f797af1ad librados: make IoCtxImpl a pointer in WatchContext
Adds get and put to IoCtxImpl for refcounting,
and uses them in WatchContext, which shouldn't
be creating a copy of the IoCtxImpl.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-03-03 16:04:20 -08:00
Josh Durgin 773f0034b1 librados: decrement refcount of old io_ctx_impl in assignment operator
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-03-03 16:04:13 -08:00
Colin Patrick McCabe 0e32cd2f55 librados: fix IoCtx::from_rados_ioctx_t
IoCtx::from_rados_ioctx_t creates an IoCtx out of a rados_ioctx_t.
However, this IoCtx must share ownership of the IoCtxImpl pointer with
the C API user who first called rados_ioctx_create. This must be done
via a reference count inside the IoCtxImpl.

Also add a copy constructor and assignment operator to class IoCtx,
since it's now cheap to have them.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
2011-03-03 16:04:06 -08:00
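The shared-ownership scheme described in the librados refcounting commits above can be sketched roughly as follows. This is a minimal illustration, not the librados code: `Impl` and `Handle` are hypothetical stand-ins for IoCtxImpl and IoCtx, and `std::atomic<int>` is swapped in for Ceph's `atomic_t`.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for IoCtxImpl: an object shared by several owners.
struct Impl {
    std::atomic<int> ref{1};  // the creator (e.g. a C API user) holds the first reference

    void get() { ref.fetch_add(1, std::memory_order_relaxed); }

    void put() {
        int prev = ref.fetch_sub(1, std::memory_order_acq_rel);
        assert(prev > 0);
        if (prev == 1) delete this;  // last owner frees the object
    }
};

// Hypothetical stand-in for IoCtx: a cheap-to-copy handle that shares one Impl.
class Handle {
public:
    Handle() : impl_(nullptr) {}

    // Share ownership with whoever already holds `impl`.
    explicit Handle(Impl* impl) : impl_(impl) { if (impl_) impl_->get(); }

    Handle(const Handle& other) : impl_(other.impl_) { if (impl_) impl_->get(); }

    Handle& operator=(const Handle& other) {
        if (other.impl_) other.impl_->get();  // take the new reference first (self-assignment safe)
        if (impl_) impl_->put();              // then drop the reference to the old Impl
        impl_ = other.impl_;
        return *this;
    }

    // Put the reference only if the handle was ever initialized.
    ~Handle() { if (impl_) impl_->put(); }

private:
    Impl* impl_;
};
```

Copying a handle costs one atomic increment, which is why a copy constructor and assignment operator become cheap to offer once the count lives inside the shared object.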
Colin Patrick McCabe ecab94cac3 Rename rados_ioctx_{open,close} to create/destroy
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
2011-03-03 16:03:54 -08:00
Colin Patrick McCabe ae77624bc8 librados: remove IoCtx::close()
We decided we don't want IoCtx::close(), since IoCtx::~IoCtx() exists.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
2011-03-03 16:03:46 -08:00
Tommi Virtanen 7d06b1b226 Make git ignore core files. 2011-03-03 16:03:23 -08:00
Tommi Virtanen 77880416c7 Make git ignore python generated files. 2011-03-03 16:03:07 -08:00
Tommi Virtanen 7d6a4fc6a4 librados: Crashed on shutdown if connect was never called.
Add a trivial unit test to trigger this.
2011-03-03 16:02:38 -08:00
Colin Patrick McCabe b734043192 libradoshpp: put ceph stuff in namespace librados
Try a little bit harder to avoid polluting the user's global namespace
with our stuff.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
2011-03-03 16:00:35 -08:00
Colin Patrick McCabe b97388f975 librados: don't create unused SnapContext objs
There were some unused temporary variables hanging around.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
2011-03-03 16:00:20 -08:00
Colin Patrick McCabe 062dd5ebf8 librados: fix copy ctor of ObjectIterator
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
2011-03-03 16:00:08 -08:00
Josh Durgin 46d6214ba1 testrados: add object stat test
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-03-03 15:59:38 -08:00
Josh Durgin 60482f5a3f testlibrbd: recreate test pool each time
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-03-03 15:59:25 -08:00
Colin Patrick McCabe b941cfdc87 common: block SIGPIPE everywhere we can
It's much better to get EPIPE than SIGPIPE.

Block SIGPIPE in all threads we create. In the daemon, block SIGPIPE in
the main thread.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
2011-03-03 07:50:35 -08:00
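As an illustration of why EPIPE is preferable to SIGPIPE, here is a minimal sketch using standard POSIX calls. The helper names `block_sigpipe` and `write_to_closed_pipe` are hypothetical and are not taken from the Ceph tree.

```cpp
#include <cassert>
#include <cerrno>
#include <csignal>
#include <pthread.h>
#include <unistd.h>

// Block SIGPIPE in the calling thread. Threads created afterwards inherit
// this signal mask. Returns 0 on success or an error number on failure.
int block_sigpipe() {
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, SIGPIPE);
    return pthread_sigmask(SIG_BLOCK, &set, nullptr);
}

// Write one byte to a pipe whose read end is already closed.
// Returns the errno from write(), or 0 if the write unexpectedly succeeded.
int write_to_closed_pipe() {
    int fds[2];
    if (pipe(fds) != 0) return errno;
    close(fds[0]);                      // no reader is left
    ssize_t n = write(fds[1], "x", 1);  // with SIGPIPE blocked this fails instead of killing us
    int err = (n < 0) ? errno : 0;
    close(fds[1]);
    return err;
}
```

With the signal blocked, the failed write is reported through the return value and `errno`, so the caller can handle it like any other I/O error. Without the mask, the default SIGPIPE action would terminate the process.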
Yehuda Sadeh 3b34e2f363 messenger: shouldn't throw sigpipe on failed socket 2011-03-03 07:50:31 -08:00
Colin Patrick McCabe ed01fa1f49 dout: Log version message when (re)opening log
Log a version message whenever we open the dout log, not just the first
time. However, only output it to log files and syslog. Spewing versions
to stderr and stdout was determined to be annoying.

Rename dout_emergency_impl to dout_emergency_to_file_and_syslog to
better reflect its function.

Rename ceph_version_to_string to pretty_version_to_string.

Add get_process_name to do just that. Re-arrange some version.h methods.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>

Conflicts:

	src/common/common_init.cc
2011-03-03 04:30:24 -08:00
Colin Patrick McCabe 82c5f3a8b2 Thread: don't mask signals except in library code
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
2011-03-03 04:24:12 -08:00
Sage Weil d467fbfd17 mds: rip out rename linkmerge support
It turns out POSIX says rename(a,b) is a no-op when a and b link to the
same inode.  This is super weird but good news because it means we can
rip out a bunch of poorly tested code.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-03-02 16:13:54 -08:00
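The POSIX behavior mentioned in the commit above can be observed on a local filesystem with a short sketch like this one. The function name and the temporary directory template are made up for the demonstration, and it assumes a filesystem that supports hard links.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <fcntl.h>
#include <string>
#include <sys/stat.h>
#include <unistd.h>

// Create a file "a" and a hard link "b", then call rename(a, b).
// Returns true if rename succeeded and both names still exist and
// refer to the same inode, i.e. the rename did nothing.
bool rename_between_hard_links_is_noop() {
    char dir[] = "/tmp/rename_demo_XXXXXX";
    if (mkdtemp(dir) == nullptr) return false;
    const std::string a = std::string(dir) + "/a";
    const std::string b = std::string(dir) + "/b";

    bool ok = false;
    int fd = open(a.c_str(), O_CREAT | O_EXCL | O_WRONLY, 0600);
    if (fd >= 0) {
        close(fd);
        if (link(a.c_str(), b.c_str()) == 0) {
            int rc = std::rename(a.c_str(), b.c_str());
            struct stat sa{}, sb{};
            ok = rc == 0
              && stat(a.c_str(), &sa) == 0   // "a" was not removed
              && stat(b.c_str(), &sb) == 0
              && sa.st_dev == sb.st_dev
              && sa.st_ino == sb.st_ino;
        }
    }
    // Clean up whatever was created.
    unlink(a.c_str());
    unlink(b.c_str());
    rmdir(dir);
    return ok;
}
```

Because the old name survives the call, a filesystem that follows this rule never has to merge two links to one inode during rename.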
Colin Patrick McCabe 64186f995e dout: Reopen dout after parsing all config opts
Reopen the dout stream only after we parse all configuration options.
Specifying --log-file on the command line now works as expected.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
2011-03-02 07:36:00 -08:00
Colin Patrick McCabe bf1ae37425 dout: remove g_conf.log_to_file
Remove the log_to_file configuration option. Instead, only log to a file
if either log_file or log_dir is set.

This way, command-line options like --log-file=/tmp/foo work as
expected.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
2011-03-02 07:33:21 -08:00
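The rule described in the commit above amounts to deriving the decision from the destination settings instead of a separate flag. A minimal sketch, assuming string-valued settings; `should_log_to_file` is a hypothetical helper, not a Ceph function:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: file logging is enabled only when a destination
// (a log file path or a log directory) has actually been configured.
bool should_log_to_file(const std::string& log_file, const std::string& log_dir) {
    return !log_file.empty() || !log_dir.empty();
}
```

With no separate switch to forget, passing `--log-file=/tmp/foo` is enough on its own to turn file logging on.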
Colin Patrick McCabe 5ebd4f8608 logging: default to foreground logging
At global constructor time: default to logging everything to stderr.

During common_init: set appropriate logging defaults based on the type
of program (daemon or other).

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
2011-03-02 07:29:56 -08:00
Alexandre Oliva 75e2a07739 cmds/cosd: Fix IsHeapProfilerRunning implicit return type cast.
G++ complains about the difference between the return type of tcmalloc's
IsHeapProfilerRunning (int) and the return type of the function that
g_conf.profiler_running is supposed to point to (bool). We could
probably get away with a type-cast, but as a compiler developer and
former C++ language lawyer, I'd rather not take the risk of destroying
the universe by invoking undefined behavior ;-)

Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br>
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-03-02 13:39:09 -08:00
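One well-defined way to bridge such a return-type mismatch is a thin wrapper with exactly the expected signature. The sketch below is an assumption about the general technique, not a copy of the commit's change. `is_profiler_running_int` is a hypothetical stand-in for the int-returning library function, and `profiler_running` for the bool-returning function pointer.

```cpp
#include <cassert>

// Hypothetical stand-in for a third-party function that returns int.
int is_profiler_running_int() { return 1; }

// Wrapper whose type matches the pointer exactly, so no cast is needed.
bool is_profiler_running() { return is_profiler_running_int() != 0; }

// Pointer type the configuration is assumed to expect.
using bool_fn = bool (*)();

// Casting &is_profiler_running_int to bool_fn and calling through it would be
// undefined behavior. Pointing at the wrapper is well defined.
bool_fn profiler_running = &is_profiler_running;
```

The wrapper costs one extra call at most and keeps the int-to-bool conversion explicit.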
Sage Weil a6167332fb mds: drop some dead code
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-03-02 09:50:44 -08:00
Sage Weil b6bfa8c54a mds: fix one rename dentry linkage projection case
There are more.  :(

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-03-02 09:41:20 -08:00
Sage Weil f353f596c8 osd: simple test for random missing objects during recovery
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-03-01 16:10:49 -08:00
Sage Weil b86461dc8e osd: recovery cleanups, better error messages
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-03-01 16:03:35 -08:00
Sage Weil 29cb6f86c2 osd: update missing_loc when inferring an empty missing set
We infer an empty missing set, but weren't calculating object locations
based on that.  Usually it was okay because we already had another
location, but not always!  And especially not when one location turns out
to be bad and we need to go to another.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-03-01 16:02:48 -08:00
Sage Weil f74593ee5e osd: fix unfound output
We were printing unfound when not, and vice versa.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-03-01 16:01:49 -08:00
Sage Weil f3bdfc57dc osd: add object to missing if we find it missing on disk
If recovery finds the object missing on disk, add it
to the local missing set so we can (hopefully) recover it from another
replica.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-03-01 15:11:47 -08:00
Sage Weil 145923450a osd: (semi-)handle case where primary copy isn't there
Continue recovering, at least.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-03-01 14:42:44 -08:00
Sage Weil 5997059a60 osd: continue recovery after encountering missing objects
1- If we try to pull an object that isn't there, send an empty push in
reply.

2- If we get an empty push, call a new failed_push helper.  Also called
when we pull partial/bad data.

3- Fix the fail behavior to close out our attempt, adjust our missing_loc,
but let the calling recovery code handle the retry.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-03-01 14:25:59 -08:00