Commit Graph

26070 Commits

Author SHA1 Message Date
Gary Lowell
054e96cf79 v0.63 2013-05-28 13:58:22 -07:00
Samuel Just
5bca9c38ef HashIndex: sync top directory during start_split,merge,col_split
Otherwise, the links might be ordered after the in progress
operation tag write.  We need the in progress operation tag to
correctly recover from an interrupted merge, split, or col_split.

Fixes: #5180
Backport: cuttlefish, bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-05-28 12:47:51 -07:00
Samuel Just
8c1c2d98c6 Merge branch 'wip_scrub_tphandle' into next
Fixes: #5159
Reviewed-by: Sage Weil <sage@inktank.com>
2013-05-23 20:08:54 -07:00
Samuel Just
86822485e5 PG: ping tphandle during omap loop as well
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-05-23 19:42:32 -07:00
Samuel Just
d62716dd4c PG: reset timeout in _scan_list for each object, read chunk
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-05-23 19:42:32 -07:00
Samuel Just
b8a25e08a6 OSD,PG: pass tphandle down to _scan_list
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-05-23 19:42:32 -07:00
Yehuda Sadeh
8b3a04dec8 rgw: iterate usage entries from correct entry
Fixes: #5152
When iterating through usage entries, and when user id was
provided, we started at the user's first entry and not from
the entry indexed by the request start time.
This commit fixes the issue.

Backport: bobtail

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-05-23 13:11:01 -07:00
Sage Weil
87cef3d5c3 mon: drop unnecessary conditionals
Signed-off-by: Sage Weil <sage@inktank.com>
2013-05-23 10:23:43 -07:00
Sage Weil
6af640517f Merge pull request #311 from ceph/wip-5102
Reviewed-by: Sage Weil <sage@inktank.com>
2013-05-23 10:21:51 -07:00
Xiaoxi Chen
e09e94424b modified: src/init-ceph.in
fixed bug in init script, the "df" should be run on remote host by do_cmd,
	and use $host instead of "hostname -s"

Signed-off-by: Xiaoxi Chen <xiaoxi.chen@intel.com>
(cherry picked from commit 1dd99f0fc9)

Conflicts:

	src/init-ceph.in
2013-05-23 08:48:24 -07:00
Sage Weil
c2e262fc94 osd: skip mark-me-down message if osd is not up
Fixes crash when the OSD has not successfully booted and gets a
SIGINT or SIGTERM.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-05-22 15:03:50 -07:00
Sage Weil
32dc463ad4 osd, mds: shut down async signal handler on exit
Signed-off-by: Sage Weil <sage@inktank.com>
2013-05-22 14:56:24 -07:00
Sage Weil
eb91f41042 messages/MOSDMarkMeDown: fix uninit field
Fixes valgrind warning:
==14803== Use of uninitialised value of size 8
==14803==    at 0x12E7614: sctp_crc32c_sb8_64_bit (sctp_crc32.c:567)
==14803==    by 0x12E76F8: update_crc32 (sctp_crc32.c:609)
==14803==    by 0x12E7720: ceph_crc32c_le (sctp_crc32.c:733)
==14803==    by 0x105085F: ceph::buffer::list::crc32c(unsigned int) (buffer.h:427)
==14803==    by 0x115D7B2: Message::calc_front_crc() (Message.h:441)
==14803==    by 0x1159BB0: Message::encode(unsigned long, bool) (Message.cc:170)
==14803==    by 0x1323934: Pipe::writer() (Pipe.cc:1524)
==14803==    by 0x13293D9: Pipe::Writer::entry() (Pipe.h:59)
==14803==    by 0x120A398: Thread::_entry_func(void*) (Thread.cc:41)
==14803==    by 0x503BE99: start_thread (pthread_create.c:308)
==14803==    by 0x6C6E4BC: clone (clone.S:112)

Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
2013-05-22 14:29:37 -07:00
Sage Weil
b0d64de484 Merge pull request #316 from ceph/wip-sysvinit
Reviewed-by: Dan Mick <dan.mick@inktank.com>
2013-05-22 13:25:42 -07:00
Sage Weil
d81d0ea5c4 sysvinit: fix osd weight calculation on remote hosts
We need to do df on the remote host, not locally.

Similarly, the ceph command uses the osd key, which exists remotely; run it there.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-05-22 12:39:11 -07:00
Sage Weil
caa15a34cb sysvinit: use known hostname $host instead of (incorrectly) recalculating
We would need to do hostname -s on the remote node, not the local one.
But we already have $host; use it!

Reported-by: Xiaoxi Chen <xiaoxi.chen@intel.com>
Signed-off-by: Sage Weil <sage@inktank.com>
2013-05-22 12:39:10 -07:00
Samuel Just
0289c445be OSDMonitor: skip new pools in update_pools_status() and get_pools_health()
New pools won't be full.  mon->pgmon()->pg_map.pg_pool_sum[poolid] will
implicitly create an entry for poolid causing register_new_pgs() to assume that
the newly created pgs in the new pool are in fact a result of a split
preventing MOSDPGCreate messages from being sent out.

Fixes: #4813
Backport: cuttlefish
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-05-22 10:23:25 -07:00
Joao Eduardo Luis
e15d290945 mon: Paxos: get rid of the 'prepare_bootstrap()' mechanism
We don't need it after all.  If we are in the middle of some proposal,
then we guarantee that said proposal is likely to be retried.  If we
haven't yet proposed, then it's forever more likely that a client will
eventually retry the message that triggered this proposal.

Basically, this mechanism attempted to fix a non-problem, and was in
fact triggering some unforeseen issues that would have required increasing
the code complexity for no good reason.

Fixes: #5102

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-05-22 17:12:38 +01:00
Joao Eduardo Luis
586e8c2075 mon: Paxos: finish queued proposals instead of clearing the list
By finishing these Contexts, we make sure the Contexts they enclose (to be
called once the proposal goes through) will behave as they were initially
planned:  for instance, a C_Command() may retry the command if a -EAGAIN
is passed to 'finish_contexts', while a C_Trimmed() will simply set
'going_to_trim' to false.

This aims at fixing at least a bug in which Paxos will stop trimming if an
election is triggered while a trim is queued but not yet finished.  Such
happens because it is the C_Trimmed() context that is responsible for
resetting 'going_to_trim' back to false.  By clearing all the contexts on
the proposal list instead of finishing them, we stay forever unable to
trim Paxos again as 'going_to_trim' will stay True till the end of time as
we know it.

Fixes: #4895

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-05-22 17:10:42 +01:00
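
The difference between clearing and finishing queued callbacks, as described in 586e8c2075 above, can be sketched roughly as follows. This is a minimal Python illustration, not the monitor's C++ code: the `Trimmer` class and its method names are hypothetical, and only the `going_to_trim` flag and the use of -EAGAIN come from the commit message.

```python
import errno


class Trimmer:
    """Hypothetical stand-in for the Paxos trim state (not Ceph's API)."""

    def __init__(self):
        self.going_to_trim = False
        self.pending = []  # queued completion callbacks

    def queue_trim(self):
        self.going_to_trim = True
        # The callback is the only place where the flag is reset.
        self.pending.append(self._on_trimmed)

    def _on_trimmed(self, result):
        self.going_to_trim = False

    def clear_pending(self):
        # Old behaviour: drop the callbacks without running them.
        self.pending.clear()

    def finish_pending(self, result):
        # New behaviour: run every callback with a result code, then drop them.
        callbacks, self.pending = self.pending, []
        for callback in callbacks:
            callback(result)


# Clearing leaves the flag set, so no further trim would be scheduled.
stuck = Trimmer()
stuck.queue_trim()
stuck.clear_pending()

# Finishing with -EAGAIN lets the callback reset the flag.
recovered = Trimmer()
recovered.queue_trim()
recovered.finish_pending(-errno.EAGAIN)
```

In this sketch `stuck.going_to_trim` stays set after the list is cleared, while `recovered.going_to_trim` is reset because the callback actually ran.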
Joao Eduardo Luis
2ff23fe784 mon: Paxos: finish_proposal() when we're finished recovering
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-05-22 13:33:34 +01:00
Sage Weil
e9d20ffe19 mon: implement --extract-monmap <filename>
This will make for a simpler process for
  http://ceph.com/docs/master/rados/operations/add-or-rm-mons/#removing-monitors-from-an-unhealthy-cluster

Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit c0268e2749)
2013-05-21 15:14:47 -07:00
Yehuda Sadeh
d48f1edb07 rgw: protect ops log socket formatter
Fixes: #4905
Ops log (through the unix domain socket) uses a formatter, which wasn't
protected.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2013-05-21 13:05:22 -07:00
Sage Weil
1c7b9c3505 os/LevelDBStore: fix compression selection
We were always disabling compression.

Fixes: #5131
Reported-by: Sylvain Munaut <s.munaut@whatever-company.com>
Signed-off-by: Sage Weil <sage@inktank.com>
2013-05-21 08:16:56 -07:00
Sage Weil
2f193fb931 debian: stop sysvinit on ceph.prerm
Signed-off-by: Sage Weil <sage@inktank.com>
2013-05-20 14:39:16 -07:00
Mike Kelly
d05a4e5574 ceph df: fix si units for 'global' stats
si_t expects bytes, but it was being given kilobytes.

Signed-off-by: Mike Kelly <pioto@pioto.org>
(cherry picked from commit 0c2b738d8d)
2013-05-20 09:06:09 -07:00
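
The unit mismatch fixed in d05a4e5574 above can be illustrated with a small Python sketch. The `format_si` helper is hypothetical and only mimics a formatter that expects a byte count; the 1024-based scaling and the output format are assumptions made for illustration, not the behaviour of Ceph's `si_t`.

```python
def format_si(num_bytes):
    """Hypothetical formatter that expects a byte count (assumed 1024-based)."""
    units = ["", "k", "M", "G", "T", "P"]
    value = float(num_bytes)
    index = 0
    # Scale down until the value fits the current unit.
    while value >= 1024 and index < len(units) - 1:
        value /= 1024
        index += 1
    return f"{value:.0f}{units[index]}"


used_kb = 2048  # a stat that is reported in kilobytes

wrong = format_si(used_kb)         # kilobytes passed as if they were bytes
right = format_si(used_kb * 1024)  # converted to bytes first
```

Passing the kilobyte figure directly understates the value by a factor of 1024, so the result is off by one unit prefix.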
Sage Weil
d0a5d3a7f4 Merge pull request #295 from ceph/wip-5077
Reviewed-by: Joao Luis <joao.luis@inktank.com>
2013-05-17 09:26:25 -07:00
Sage Weil
c80c6a032c sysvinit: fix enumeration of local daemons when specifying type only
- prepend $local to the $allconf list at the top
- remove $local special case for all case
- fix the type prefix checks to explicitly check for prefixes

Fugly bash, but works!

Backport: cuttlefish, bobtail
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
2013-05-16 20:39:32 -07:00
Sage Weil
d8d7113c35 udev: install disk/by-partuuid rules
Wheezy's udev (175-7.2) has broken rules for the /dev/disk/by-partuuid/
symlinks that ceph-disk relies on.  Install parallel rules that work.  On
new udev, this is harmless; on older udev, this will make life better.

Fixes: #4865
Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
2013-05-16 18:40:29 -07:00
Sage Weil
65072f2e43 mon: clear pg delta after some period
If we have no pg_map updates, the delta doesn't update, and can get stuck
with the velocity right before activity stopped.  This is confusing, and
can cause incorrect health warnings about in-progress recovery.

To fix this, zero the delta if there is no activity for
'mon delta reset interval' seconds.

Fixes: #5077
Signed-off-by: Sage Weil <sage@inktank.com>
2013-05-16 17:58:48 -07:00
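
The reset rule described in 65072f2e43 above can be sketched as follows. This is a rough Python illustration under stated assumptions: the `PGDelta` class, its methods, and the explicit `now` timestamps are hypothetical, and `reset_interval` merely stands in for the 'mon delta reset interval' setting named in the commit message.

```python
class PGDelta:
    """Hypothetical holder for the most recent pg stats delta."""

    def __init__(self, reset_interval):
        self.reset_interval = reset_interval  # seconds
        self.delta = 0
        self.last_update = None

    def update(self, delta, now):
        # Called whenever a new pg_map update arrives.
        self.delta = delta
        self.last_update = now

    def tick(self, now):
        # Called periodically; zero a stale delta once the interval has passed.
        if self.last_update is None:
            return
        if now - self.last_update > self.reset_interval:
            self.delta = 0


stats = PGDelta(reset_interval=10)
stats.update(delta=500, now=100)

stats.tick(now=105)   # still within the interval
before = stats.delta

stats.tick(now=111)   # no updates for more than 10 seconds
after = stats.delta
```

Here `before` keeps the last observed velocity, and `after` is zero because no update arrived within the interval.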
Samuel Just
9b9d322c20 test_filestore_idempotent_sequence: unmount prior to deleting store
FileStoreDiff umounts the stores in its destructor.

Also, DeterministicOpSequence deletes its passed
object store.

Fixes: #5076
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: David Zafman <david.zafman@inktank.com>
2013-05-16 15:46:11 -07:00
Samuel Just
5a27e85cf1 Revert "test_filejournal.cc: cleanup memory in destructor"
The finish() method for Contexts calls delete this.

This reverts commit 36028916c4.

Fixes: #5075
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: David Zafman <david.zafman@inktank.com>
2013-05-16 15:45:42 -07:00
Sage Weil
604c83ff18 debian: make radosgw require matching version of librados2
...indirectly via ceph-common.  We get bad behavior when they diverge, I
think because of libcommon.la being linked both statically and dynamically.

Fixes: #4997
Backport: cuttlefish, bobtail
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Gary Lowell <gary.lowell@inktank.com>
2013-05-16 13:17:45 -07:00
Samuel Just
eaf3abf3f9 FileJournal: adjust write_pos prior to unlocking write_lock
In committed_thru, we use write_pos to reset the header.start value in cases
where seq is past the end of our journalq.  It is therefore important that the
journalq be updated atomically with write_pos (that is, under the write_lock).

The call to align_bl() is moved into do_write in order to ensure that write_pos
is adjusted correctly prior to write_bl().

Also, we adjust pos at the end of write_bl() such that pos \in [get_top(),
header.max_size) after write_bl().

Fixes: #5020
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-05-16 11:14:37 -07:00
Sage Weil
64871e0931 mds: avoid assert after suicide()
Fixes: #5079
Signed-off-by: Sage Weil <sage@inktank.com>
2013-05-16 09:42:29 -07:00
Gary Lowell
2ba167be0e Merge branch 'next' 2013-05-14 15:38:24 -07:00
athanatos
5ff703d60a Merge pull request #283 from dachary/wip-5058
internal documentation proofreading

Reviewed-by: Sam Just <sam.just@inktank.com>
2013-05-14 15:28:45 -07:00
Sage Weil
52b0438c66 doc/rados/configuration: fix [mon] osd min down report* config docs
Fix other osd -> mon section name, and note the old config value name prior
to v0.62.

Fixes: #5044
Signed-off-by: Sage Weil <sage@inktank.com>
2013-05-14 14:02:27 -07:00
Loic Dachary
2a4425af0e reflect recent changes in the pg deletion logic
No need to wait on DeletingStateRef for flush d3dd99b725
Fix typos

http://tracker.ceph.com/issues/5058 refs #5058

Signed-off-by: Loic Dachary <loic@dachary.org>
2013-05-14 22:46:37 +02:00
Loic Dachary
1c53991e4c fix typos and add hyperlink to peering
s/;/:/
s/up_acting_affected/acting_up_affected/
Add relative link to ../../peering

http://tracker.ceph.com/issues/5058 refs #5058

Signed-off-by: Loic Dachary <loic@dachary.org>
2013-05-14 22:46:37 +02:00
Loic Dachary
b7d4012c06 typo s/come/some/
http://tracker.ceph.com/issues/5058 refs #5058

Signed-off-by: Loic Dachary <loic@dachary.org>
2013-05-14 22:46:36 +02:00
Loic Dachary
dbddffef06 update op added to a waiting queue or discarded
The decision to discard an op happens either in OSD or in PG.
The operation queue goes to a single OpWQ object if waiting_map does not impose a delay op_queue.
The decision to add an op to a waiting queue regardless of its type is updated.
The decision to add a CEPH_MSG_OSD_OP to a waiting queue is described in full.

http://tracker.ceph.com/issues/5058 refs #5058

Signed-off-by: Loic Dachary <loic@dachary.org>
2013-05-14 22:44:58 +02:00
Sage Weil
afeb8f2d52 mds/Server.cc: fix straydn assert
From fb222a0a1c, we only know straydn is
non-null if oldin is non-null.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-05-14 10:31:27 -07:00
Sage Weil
29d8ec4ecd Merge pull request #285 from dalgaaf/wip-da-CID-fixes-2-v3
Reviewed-by: Sage Weil <sage@inktank.com>
2013-05-14 10:30:20 -07:00
Danny Al-Gaaf
e69257eaee rgw/rgw_user.cc: fix possible NULL pointer dereference
CID 1019559 (#1 of 1): Dereference after null check (FORWARD_NULL)
  var_deref_model: Passing null pointer "usr" to function
  "RGWUser::get_store()", which dereferences it.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2013-05-14 19:20:29 +02:00
Danny Al-Gaaf
d69290219d mds/Server.cc: fix possible NULL pointer dereference
Assert if straydn is NULL.

CID 1019554 (#2 of 2): Dereference after null check (FORWARD_NULL)
  var_deref_model: Passing null pointer "straydn" to function
  "MDSCacheObject::is_auth() const", which dereferences it.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2013-05-14 19:15:23 +02:00
Danny Al-Gaaf
fb222a0a1c mds/Server.cc: fix possible NULL pointer dereference
Assert if straydn is NULL here.

CID 1019558 (#1 of 1): Dereference after null check (FORWARD_NULL)
  var_deref_model: Passing null pointer "straydn" to function
  "CDentry::get_dir() const", which dereferences it.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2013-05-14 19:07:29 +02:00
Danny Al-Gaaf
c87788b69b mds/Server.cc: fix possible NULL pointer dereference
Assert if destdn == NULL.

CID 1019557 (#1 of 1): Dereference after null check (FORWARD_NULL)
  var_deref_model: Passing null pointer "destdn" to function
  "CDentry::get_dir() const", which dereferences it.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2013-05-14 19:02:20 +02:00
Danny Al-Gaaf
088455f85e librados/AioCompletionImpl.h: add missing Lock
Add missing Lock around code changing AioCompletionImpl::rval/ack and safe
in C_AioCompleteAndSafe::finish().

CID 1019565 (#1 of 1): Data race condition (MISSING_LOCK)
  missing_lock: Accessing "this->c->rval" ("_ZN8librados17AioCompletionImplE.rval")
  requires the "Mutex._m" lock.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2013-05-14 18:50:09 +02:00
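
The pattern behind 088455f85e above, updating the completion state only while holding its lock, can be sketched in Python. The `Completion` class and its methods are hypothetical stand-ins; only the field names `rval`, `ack`, and `safe` come from the commit message, and the real code is C++ using Ceph's own Mutex type.

```python
import threading


class Completion:
    """Hypothetical completion object whose state is guarded by one lock."""

    def __init__(self):
        self.lock = threading.Lock()
        self.rval = None
        self.ack = False
        self.safe = False

    def complete_and_safe(self, rval):
        # All three fields change together, under the lock.
        with self.lock:
            self.rval = rval
            self.ack = True
            self.safe = True

    def snapshot(self):
        # Readers take the same lock, so they never see a partial update.
        with self.lock:
            return (self.rval, self.ack, self.safe)


completion = Completion()
worker = threading.Thread(target=completion.complete_and_safe, args=(0,))
worker.start()
worker.join()
state = completion.snapshot()
```

Because both the writer and the reader take the lock, `state` is either the initial tuple or the fully updated one, never a mix of the two.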
Danny Al-Gaaf
8a52350dd8 src/dupstore.cc: check return value of list_collections()
CID 1019545 (#1 of 1): Unchecked return value (CHECKED_RETURN)
  check_return: Calling function "ObjectStore::list_collections
  (std::vector<coll_t, std::allocator<coll_t> > &)" without
  checking return value (as is done elsewhere 5 out of 6 times).

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2013-05-14 18:44:06 +02:00
Danny Al-Gaaf
70a4a971f4 mds/Server.cc: fix possible NULL pointer dereference
CID 1019555 (#1 of 1): Dereference after null check (FORWARD_NULL)
  var_deref_model: Passing null pointer "in" to function
  "Server::_need_force_journal(CInode *, bool)", which dereferences it.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2013-05-14 18:43:37 +02:00