Commit Graph

22979 Commits

Author SHA1 Message Date
Sage Weil
1290671f15 Merge branch 'wip-scrub' into next
Reviewed-by: Sage Weil <sage@inktank.com>
Conflicts:
	src/osd/PG.cc
2012-12-23 14:42:51 -08:00
Sage Weil
8362e6403e monclient: fix get_monmap_privately retry interval
Use mon_client_hunt_interval (default 3) instead of hardcoding 1 second.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-12-23 13:53:21 -08:00
Sage Weil
d843a64a3a Makefile: fix 'base' rule
Signed-off-by: Sage Weil <sage@inktank.com>
2012-12-23 13:53:18 -08:00
Sage Weil
a09f5b1b46 init-ceph,mkcephfs: default inode64 for mounting xfs
According to hch this is now the default for new kernels.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-12-23 11:18:45 -08:00
Sage Weil
5f25f9f8cf init-ceph: default osd_data path
Signed-off-by: Sage Weil <sage@inktank.com>
2012-12-22 11:10:03 -08:00
Samuel Just
f6b2ca8b38 OSD: always do a deep scrub when repairing
Otherwise, errors turned up in a deep-scrub will be
swept under the rug without being repaired.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-12-21 20:37:06 -08:00
Samuel Just
ad9bcc705f PG: don't use a self-transition for WaitRemoteRecoveryReserved
Previously, using the state on active worked, but now we might
go back through WaitRemoteRecoveryReserved without resetting
Active.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-12-21 20:37:06 -08:00
Samuel Just
2e96bb1817 PG: Handle repair once in scrub_finish
We don't want to change missing sets during a chunky
scrub since it would cause !is_clean() and derail
the rest of the scrub.  Instead, move the missing,
inconsistent, and authoritative sets into scrubber
and add to during scrub_compare_maps().  Then,
handle repairing objects all at once in scrub_finish().

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-12-21 20:35:19 -08:00
Dan Mick
6325a4800d import_export.sh: sparse import export
Add tests for:
   - sparse import makes expected sparse images
   - sparse export makes expected sparse files
   - sparse import from stdin also creates sparse images
   - import from partially-sparse file leads to partially-sparse image
   - import from stdin with zeros leads to sparse
   - export from zeros-image to file leads to sparse file

Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2012-12-21 17:03:38 -08:00
Dan Mick
5905d7fae7 rbd: harder-working sparse import from stdin
Try to accumulate image-sized blocks when importing from stdin, even if
each read is shorter than requested; if we get a full block, and it's
all zeroes, we can seek and make a sparse output file

Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2012-12-21 17:03:38 -08:00
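
The approach described in the commit above (accumulate a full block even when reads come back short, then skip all-zero blocks by seeking) can be sketched roughly as follows. This is an illustrative Python sketch, not the rbd implementation: `read_full_block` and `sparse_copy` are hypothetical names, and `buf_is_zero` here only mirrors the name of the helper mentioned in the neighbouring commits.

```python
import io


def buf_is_zero(buf: bytes) -> bool:
    """Return True if every byte in buf is zero (an empty buffer counts as zero)."""
    return not any(buf)


def read_full_block(src, block_size: int) -> bytes:
    """Keep reading until block_size bytes are collected or the input ends.

    A pipe or stdin may return fewer bytes than requested on each read,
    so a single read() is not enough to fill a block.
    """
    chunks = []
    remaining = block_size
    while remaining > 0:
        chunk = src.read(remaining)
        if not chunk:  # end of input
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def sparse_copy(src, dst, block_size: int) -> int:
    """Copy src to dst, seeking over full all-zero blocks.

    Returns the number of blocks actually written. dst must be seekable.
    """
    written = 0
    total = 0
    while True:
        block = read_full_block(src, block_size)
        if not block:
            break
        if len(block) == block_size and buf_is_zero(block):
            # Full block of zeroes: move the output position, write nothing.
            dst.seek(len(block), io.SEEK_CUR)
        else:
            dst.write(block)
            written += 1
        total += len(block)
    # Set the final size explicitly in case the input ended on a skipped block.
    dst.truncate(total)
    return written
```

On a regular file opened in binary mode, the skipped ranges become holes on filesystems that support sparse files.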
Dan Mick
410903fe7a rbd: check for all-zero buf in export, seek output if so
Use buf_is_zero in common/util.cc

Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2012-12-21 17:03:38 -08:00
Dan Mick
4a558048cf librbd: move buf_is_zero() to new common/util.cc and include/util.h
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2012-12-21 17:03:38 -08:00
Sage Weil
8f5de15605 osd: fix pg stat msgs vs timeout
We can get a pattern like so:

- new mon session
- after say 120 seconds, we decide to send a stats msg
- outstanding_pg_stats is finally true, we immediately time out (30 second
  grace), and reconnect to a new mon
-> repeat

The problem is that we don't reset the last_sent timestamp when we send.
Or that we do this check after sending instead of before.  Fix both.

This should resolve issue #3661, where osds that don't have pgs
updating are not sending stats messages to the mon to check in, and are
eventually getting marked down as a result.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
2012-12-21 16:47:50 -08:00
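
The two fixes named in the commit above (reset the last-sent timestamp on send, and run the timeout check before sending rather than after) can be illustrated with a small model. This is a hedged sketch, not the OSD code: `StatsReporter`, `tick`, `send`, `ack` and `reconnect` are hypothetical names, and only the 30 second grace comes from the commit message.

```python
class StatsReporter:
    """Toy model of a stats sender with an ack timeout."""

    def __init__(self, grace: float = 30.0):
        self.grace = grace
        self.outstanding = False  # a stats message is awaiting an ack
        self.last_sent = 0.0      # time of the most recent send
        self.reconnects = 0

    def tick(self, now: float, want_send: bool) -> None:
        # Fix 2: check for a timeout before sending anything new, so a
        # message sent in this tick cannot be judged against a stale time.
        if self.outstanding and now - self.last_sent > self.grace:
            self.reconnect()
        if want_send and not self.outstanding:
            self.send(now)

    def send(self, now: float) -> None:
        self.outstanding = True
        # Fix 1: the grace period is measured from this send.
        self.last_sent = now

    def ack(self) -> None:
        self.outstanding = False

    def reconnect(self) -> None:
        self.reconnects += 1
        self.outstanding = False
```

With both changes, a first send 120 seconds into a session no longer times out immediately; a reconnect only happens once a message has really gone unacknowledged for longer than the grace period.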
Samuel Just
00ed6657c9 PG::scrub_compare_maps increment scrubber.fixed for missing repairs
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-12-21 15:20:22 -08:00
Samuel Just
c9e051746e PG::_compare_scrubmaps: increment scrubber.errors on missing object
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-12-21 15:16:19 -08:00
Sage Weil
206ffcd82e mkcephfs: error out if 'devs' defined but 'osd fs type' not defined
We can infer btrfs if they use btrfs devs, but if they use devs there is
no default fs.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-12-21 14:23:14 -08:00
Sage Weil
11fb314153 Merge remote-tracking branch 'gh/wip-scrub' into next
2012-12-21 13:56:16 -08:00
Sage Weil
47145d8009 Merge remote-tracking branch 'gh/wip-3643' into next
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2012-12-21 13:45:39 -08:00
Sage Weil
999ba1b2e7 monc: only warn about missing keyring if we fail to authenticate
This avoids the situation where a librados or other user with the default
of 'cephx,none' and no keyring is authenticating against a cluster with
required of 'none' and an annoying warning is generated every time.  Now
we only print a helpful message if we actually failed.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-12-21 13:44:19 -08:00
Sage Weil
5d5a42bc71 osd: clear CLEAN on exit from Clean state
This means we can drop the scrub repair state_clear() call.  We probably
can drop others, but let's leave that for another day.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-12-21 13:10:32 -08:00
Yehuda Sadeh
b3e62ad692 auth: use none auth if keyring not found
If both cephx and none are accepted auth methods, and
cephx keyring cannot be found then resort to using
none, instead of failing.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-12-21 12:19:41 -08:00
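
The fallback rule in the commit above can be sketched as a small selection function. This is only an illustration of the rule as stated in the message: `choose_auth_method` is a hypothetical name, and checking for the keyring with `os.path.exists` is an assumption about what "cannot be found" means.

```python
import os


def choose_auth_method(accepted, keyring_path):
    """Pick an auth method from the accepted list.

    Prefer cephx when a keyring is available; if the keyring cannot be
    found and 'none' is also accepted, fall back to 'none' instead of failing.
    """
    have_keyring = keyring_path is not None and os.path.exists(keyring_path)
    if "cephx" in accepted and have_keyring:
        return "cephx"
    if "none" in accepted:
        return "none"
    raise RuntimeError("no usable auth method: keyring not found")
```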
Samuel Just
4d661e0d01 PG::sched_scrub: only set PG_STATE_DEEP_SCRUB once reserved
Otherwise we would have +DEEP before we have +SCRUB.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-12-21 11:36:54 -08:00
Samuel Just
7c56d8fad0 PG::sched_scrub: return true if scrub newly kicked off
The previous return value wasn't really what OSD::sched_scrub
wanted to know.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-12-21 11:36:54 -08:00
Sage Weil
ae044e6405 osd: allow transition from Clean -> WaitLocalRecoveryReserved for repair
If we do a scrub repair, we need to go from clean to recovery again to
copy objects around.

This fixes a simple repair of a missing object, either on the primary or
replica.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-12-21 11:37:48 -08:00
Samuel Just
670afc6c0c PG: in sched_scrub() set PG_STATE_DEEP_SCRUB not scrubber.deep
scrubber.deep gets reset in scrub() to match
state_test(PG_STATE_DEEP_SCRUB).

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-12-21 11:29:47 -08:00
Sage Weil
19e44bff37 osd: clear scrub state if queued scrub doesn't start
We set SCRUBBING when we queue a pg for scrub.  If we dequeue and
call scrub() but abort for some reason (!active, degraded, etc.), clear
that state bit.

Bug is easily reproduced with 'ceph osd scrub N' during cluster startup
when PGs are peering; some PGs can get left in the scrubbing state.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-12-21 11:29:47 -08:00
Sage Weil
e765dcb4f1 osd: only dec_scrubs_active if we were active
This fixes a bug that puts scrubs_active negative.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-12-20 21:45:09 -08:00
Sage Weil
ada3e27fa5 osd: reintroduce inc_scrubs_active helper
This mostly generates nice debug output.  It also slightly simplifies
code and makes things symmetric.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-12-20 21:44:34 -08:00
Samuel Just
accce83051 Merge remote-tracking branch 'upstream/wip_notify' into next
Reviewed-by: Sage Weil <sage@inktank.com>
2012-12-20 16:24:05 -08:00
Dan Mick
129a49ada1 cephtool: mention ceph osd ls, fix ceph osd tell N bench
Add ceph osd ls to help; make help for ceph osd tell N bench look
more like injectargs, which says <osd-id or *> to make it clear you
can benchmark all osds simultaneously

Signed-off-by: Dan Mick <dan.mick@inktank.com>
2012-12-20 15:51:55 -08:00
Yehuda Sadeh
a36d1db10f rgw: remove noisy log message
No need for that log message.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-12-20 15:32:59 -08:00
Yehuda Sadeh
5b5a19ac76 rgw: fix daemonize initialization
Just call the common daemonize function. Otherwise we end up
not initializing stdout / stderr correctly.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-12-20 15:30:53 -08:00
Sage Weil
50914e7a42 log: fix flush/signal race
We need to signal the cond in the same interval where we hold the lock
*and* modify the queue.  Otherwise, we can have a race like:

 queue has 1 item, max is 1.
 A: enter submit_entry, signal cond, wait on condition
 B: enter submit_entry, signal cond, wait on condition
 C: flush wakes up, flushes 1 previous item
 A: retakes lock, enqueues something, exits
 B: retakes lock, condition fails, waits
  -> C is never woken up as there are 2 items waiting

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
2012-12-20 13:48:06 -08:00
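
The rule stated in the commit above (signal the condition in the same locked interval in which the queue is modified) can be illustrated with a bounded queue and a flusher thread. This is a sketch under assumptions, not the Ceph log code: `BoundedLog`, `flush_loop` and `stop` are hypothetical names, and using two condition variables over one lock is a choice made for this example. Only the name `submit_entry` comes from the commit message.

```python
import threading
from collections import deque


class BoundedLog:
    """Bounded queue of log entries drained by a single flusher thread."""

    def __init__(self, max_items: int):
        self.max_items = max_items
        self.queue = deque()
        self.lock = threading.Lock()
        self.cond_submit = threading.Condition(self.lock)  # submitters wait here
        self.cond_flush = threading.Condition(self.lock)   # flusher waits here
        self.flushed = []
        self.stopping = False

    def submit_entry(self, entry) -> None:
        with self.lock:
            while len(self.queue) >= self.max_items:
                self.cond_submit.wait()
            self.queue.append(entry)
            # Signal while still holding the lock, right after the
            # modification, so the flusher cannot miss this entry.
            self.cond_flush.notify()

    def flush_loop(self) -> None:
        with self.lock:
            while True:
                while not self.queue and not self.stopping:
                    self.cond_flush.wait()
                if not self.queue:
                    return  # stopping and nothing left to flush
                self.flushed.extend(self.queue)
                self.queue.clear()
                self.cond_submit.notify_all()

    def stop(self) -> None:
        with self.lock:
            self.stopping = True
            self.cond_flush.notify()
```

Signalling before waiting for space, as in the race listed in the commit, would let the wakeup be consumed before the entry exists; here every append is paired with its own notification under the lock.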
Samuel Just
c0e2371284 ReplicatedPG::remove_notify : don't leak the notify object
Following remove_notify, there are no other references to
notif, delete it.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-12-20 13:29:14 -08:00
Samuel Just
b5031a2233 OSD,ReplicatedPG: do not track notifies on the session
handle_notify_timeout and remove_notify currently do not clean up this
state leaving dangling Notification*.  Further, we only use this mapping
in unwatch in order to determine which notifies to update. We can
accomplish the same thing by iterating through the obc->notifs mapping
since all notifications relevant for a given watch would have been for
the same obc as the watch.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-12-20 13:27:24 -08:00
Sage Weil
17c627b5e4 Merge remote-tracking branch 'gh/wip-cephtool' into next
2012-12-20 11:04:29 -08:00
Sage Weil
f38d891138 Merge branch 'wip-build-fixes' into next
2012-12-20 10:49:34 -08:00
Yehuda Sadeh
a803159b02 rgw: configurable exit timeout
Fixes: #3638

rgw exit timeout secs : number of seconds to wait for process
to exit cleanly before forcing exit. If set to 0, it'll wait
indefinitely.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-12-20 10:13:16 -08:00
Yehuda Sadeh
92b59e9059 rgw: don't try to assign content type if not found
Fixes: #3648
Cannot assign a NULL pointer into stl string. This is only
relevant to swift, when uploading an object without specifying
content type, and when the suffix cannot be determined.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-12-20 09:45:35 -08:00
Sage Weil
c02e9062b9 Merge remote-tracking branch 'gh/wip-crushtool' into next
Reviewed-by: Caleb Miles <caleb.miles@inktank.com>
2012-12-20 08:53:19 -08:00
Yehuda Sadeh
08c64249eb rgw: don't initialize keystone if not set up
Fixes: #3653
No need to initialize keystone, including the keystone
revocation thread, which was verbose if keystone was
not set up. This removes some useless errors from the
log.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-12-19 22:03:56 -08:00
Yehuda Sadeh
799c59ae89 rgw: remove useless configurable, fix swift auth error handling
Fixes: #3649
No need to have an extra configurable to use keystone. Use keystone
whenever keystone url has been specified. Also, fix bad error
handling that turned a failure to authenticate into successfully
authenticating a bad user.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2012-12-19 22:03:56 -08:00
Samuel Just
9a9778fb9c Merge remote-tracking branch 'upstream/wip_pg_temp' into next
Reviewed-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Luis <joao.luis@inktank.com>
2012-12-19 16:51:25 -08:00
Samuel Just
6122a9f62f OSDMonitor: remove temp pg mappings with no up pgs
Otherwise, the pg won't be validly mapped until one of the temp
pgs comes back up.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-12-19 10:33:40 -08:00
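
One reading of the commit above is that a temp mapping is dropped once none of the OSDs it lists is up. Under that assumption, the pruning step might look like the sketch below. `prune_pg_temp`, the dict layout and the `up_osds` set are all hypothetical and are not taken from OSDMonitor.

```python
def prune_pg_temp(pg_temp, up_osds):
    """Return only the temp mappings that still list at least one up OSD.

    pg_temp maps a pg id to the list of OSD ids in its temp mapping;
    up_osds is the set of OSD ids currently up.
    """
    return {
        pg: osds
        for pg, osds in pg_temp.items()
        if any(osd in up_osds for osd in osds)
    }
```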
Samuel Just
2395af9f7a OSDMap: make apply_incremental take a const argument
This requires us to copy bufferlists in two cases since bufferlist
does not have a const iterator at this time.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-12-19 10:32:52 -08:00
Sage Weil
2e49d5c4b7 cephtool: add qa workunit
A few basic sanity checks, including a tell on a down osd.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-12-19 08:37:42 -08:00
Gary Lowell
d9c2396b55 ceph.spec.in: Improve finding location of jni.h for sles11.
Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
2012-12-18 21:00:15 -08:00
Sage Weil
b2eb8bd2ed osd: implement 'version' tell command
Signed-off-by: Sage Weil <sage@inktank.com>
2012-12-18 20:08:42 -08:00
Gary Lowell
46344105e7 ceph.spec.in: Add packages for libcephfs-jni and libcephfs-java
Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
2012-12-18 19:40:32 -08:00
Sage Weil
85763f09fe ceph: report error string to stderr, not stdout
If we return an error, send the message to stderr.  This makes things
more easily scriptable because error messages won't take the place of
expected output.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-12-18 19:21:24 -08:00
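
The behaviour described in the commit above can be shown in a few lines. This is a generic sketch rather than the ceph tool's code: `report` is a hypothetical helper, and treating a negative result as an error is an assumption made for the example.

```python
import sys


def report(result: int, message: str, out=sys.stdout, err=sys.stderr) -> int:
    """Print message to err when result signals an error, otherwise to out."""
    stream = err if result < 0 else out
    print(message, file=stream)
    return result
```

A script that captures stdout then sees only the expected output, and error text stays on stderr.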