
27784 Commits

Author SHA1 Message Date
Joao Eduardo Luis
97462a3213 mon: OSDMonitor: work around a full version bug introduced in 7fb3804fb
In 7fb3804fb8 we moved the full version stashing logic to the
encode_trim_extra() function.  However, we forgot to update the osdmap's
'latest_full' key, which should always point to the latest osdmap full
version.  This eventually resulted in a missing full version after a
trim.  This patch works around the bug by looking for the latest
available full osdmap version in the store and updating 'latest_full'
to its proper value.

Related-to: #5704
Backport: cuttlefish

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-07-23 09:48:51 -07:00
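
The workaround described in this commit can be sketched roughly as follows. This is a minimal illustration, not the monitor's actual code: the `Store` alias, the `full_<epoch>` key layout, and the `repair_latest_full()` helper are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for the monitor's key/value store.
using Store = std::map<std::string, std::string>;

// Hypothetical key layout: a full map for an epoch lives under "full_<epoch>".
std::string full_key(uint64_t epoch) {
  return "full_" + std::to_string(epoch);
}

// Scan from the newest committed epoch down to the oldest and point
// 'latest_full' at the first full map that is actually present.
// Returns false if no full map exists in the committed range.
bool repair_latest_full(Store& store, uint64_t first_committed,
                        uint64_t last_committed) {
  if (last_committed < first_committed)
    return false;
  for (uint64_t e = last_committed; ; --e) {
    if (store.count(full_key(e)) > 0) {
      store["latest_full"] = std::to_string(e);
      return true;
    }
    if (e == first_committed)
      break;  // stop here so the unsigned counter never wraps around
  }
  return false;
}
```

Scanning downward means the first hit is the newest surviving full map, which is the value 'latest_full' is supposed to hold.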
Joao Eduardo Luis
bc8d62fe31 mon: OSDMonitor: get rid of encode_full() as we don't use it.
We have delegated this to encode_trim_extra() since
7fb3804fb8 -- no need to keep this code
around.

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-07-23 09:22:32 -07:00
Joao Eduardo Luis
a815547ed3 mon: OSDMonitor: update the osdmap's latest_full with the new full version
We used to do this in encode_full(), but since [1] we no longer rely on
PaxosService to manage the full maps for us.  We also forgot to write
the latest_full version to the store, leaving it in a truly outdated state.

[1] - 7fb3804fb8

Fixes: #5704
Backport: cuttlefish
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-07-23 09:21:48 -07:00
Sage Weil
7b3b989311 qa/workunits/suites/fsync-tester.sh: lsof at end
Trying to track down occasional EBUSY on umount at end of test.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-23 08:43:01 -07:00
Gary Lowell
835dd97301 v0.67-rc1
2013-07-22 11:57:27 -07:00
Noah Watkins
58c78dbaf3 FileJournal: fix posix_fallocate error handling
From the man page for posix_fallocate:

    posix_fallocate() returns zero on success, or an error
    number on failure.  Note that errno is not set.

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-07-22 11:43:01 -07:00
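
The man page excerpt quoted in this commit implies a specific checking pattern, sketched below. The `preallocate()` wrapper is a hypothetical helper for illustration and is not the FileJournal code; only `posix_fallocate()` itself is a real call.

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>  // posix_fallocate, off_t

// Returns 0 on success or a negative error code on failure.
// posix_fallocate() hands back the error number as its return value and
// does not set errno, so testing "ret < 0" or reading errno would be wrong.
int preallocate(int fd, off_t len) {
  assert(len > 0);  // a non-positive length is rejected with EINVAL anyway
  int ret = ::posix_fallocate(fd, 0, len);
  if (ret != 0)
    return -ret;  // convert the positive error number to a negative code
  return 0;
}
```

With this shape, a caller can treat the result like any other negative-errno return, for example comparing it against `-EBADF` or `-ENOSPC`.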
Samuel Just
0897d3a820 OSD::_make_pg: use createmap, not osdmap
The osd lock is not held at this point, so we must use
the createmap passed in.

Fixes: #5656
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-07-22 11:11:08 -07:00
Yehuda Sadeh
20bc09c668 rgw: read attributes when reading bucket entry point
Fixes: #5691

We also need to read the attributes, as the bucket might be a legacy
bucket and might have all bucket instance info in that object.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
Tested-by: Faidon Liambotis <faidon@wikimedia.org>
2013-07-22 10:11:13 -07:00
Samuel Just
d28c18da9d OSD::RemoveWQ: do not apply_transaction while blocking _try_resurrect_pg
Some callbacks take the osd lock, so we need to avoid blocking an
osd lock holding thread while waiting on a filestore callback.
Instead, just queue the transaction, and allow _try_resurrect_pg
to cancel us while we are waiting for the transaction to go through
(CLEARING_WAITING).

Fixes: #5672
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-07-22 10:31:31 -07:00
Samuel Just
6c4cd22e60 FileStore: use complete() instead of finish() and delete
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-07-22 10:31:19 -07:00
Samuel Just
9f591a630d Finisher: use complete() not finish() and delete
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-07-22 10:31:12 -07:00
Samuel Just
8536ff9a43 common/Cond.h: add a simpler C_SaferCond Context
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-07-22 10:31:02 -07:00
Gary Lowell
eabf2f6ae1 ceph.spec.in: Obsolete ceph-libs
Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-07-22 09:52:19 -07:00
Sage Weil
e58b0e9320 Merge remote-tracking branch 'gh/wip-mon-caps' into next
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-07-22 09:27:35 -07:00
Sage Weil
da2cb0901d Merge pull request #453 from dalgaaf/wip-da-SCA-cppcheck-7
Fix SCA and CID issues

Reviewed-by: Sage Weil <sage@inktank.com>
2013-07-21 21:42:07 -07:00
Sage Weil
c456390158 Merge pull request #451 from dalgaaf/wip-da-SCA-cppcheck-6-v2
Fix some issues from SCA - v2 - against ceph:next

Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-07-21 21:40:22 -07:00
Danny Al-Gaaf
617b3f750e cls_replica_log_types.h: pass const std::list<> by reference
Pass the const std::list<> parameter by reference to
cls_replica_log_progress_marker().

From cppcheck:
 [src/cls/replica_log/cls_replica_log_types.h:64]: (performance)
  Function parameter 'b' should be passed by reference.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2013-07-20 20:15:57 +02:00
Danny Al-Gaaf
6319823443 mon/PGMonitor.cc: reduce scope of local 'num_slow_osds' variable
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2013-07-20 20:02:36 +02:00
Danny Al-Gaaf
cf29d17666 rgw/rgw_bucket.cc: use static_cast<>() instead of C-Style cast
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2013-07-20 19:52:18 +02:00
Danny Al-Gaaf
d181aaaed4 test_cls_version.cc: don't free object twice, free the right one
Object 'librados::ObjectWriteOperation *op' is freed twice in the TEST
test_version_inc_read. Free 'librados::ObjectReadOperation *rop' instead.

Related cppcheck warning:
 [src/test/cls_version/test_cls_version.cc:79]: (error) Memory
  pointed to by 'op' is freed twice.

This should also fix:

CID 1049247 (#1 of 1): Use after free (USE_AFTER_FREE)
  deref_arg: Calling "librados::ObjectWriteOperation::~ObjectWriteOperation()"
  dereferences freed pointer "op". (The dereference happens because this is
  a virtual function call.)
CID 1049218 (#4 of 4): Resource leak (RESOURCE_LEAK)
  leaked_storage: Variable "rop" going out of scope leaks the storage it
  points to.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2013-07-20 19:43:29 +02:00
Danny Al-Gaaf
11c51e8485 rgw/rgw_metadata.cc: use static_cast<>() instead of C-Style cast
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2013-07-20 19:30:04 +02:00
Danny Al-Gaaf
e910421719 rgw: change RGWOp::name() to return string instead of char*
Return 'const string' instead of 'const char *' from RGWOp::name() to
avoid the usage of std::string::c_str() to return 'const char *' in
some cases in rgw_rest_replica_log.h.

Returning the result of c_str() from a function is dangerous since the
result becomes invalid once the related string object is destroyed or
goes out of scope (which is the case with return). So you may end up
with garbage in this case.

Related warning from cppcheck:
 [src/rgw/rgw_rest_replica_log.h:39]: (error) Dangerous usage of
  c_str(). The value returned by c_str() is invalid after this call.
 [src/rgw/rgw_rest_replica_log.h:59]: (error) Dangerous usage of
  c_str(). The value returned by c_str() is invalid after this call.
 [src/rgw/rgw_rest_replica_log.h:79]: (error) Dangerous usage of
  c_str(). The value returned by c_str() is invalid after this call.

This should also fix:

CID 1049250 (#1 of 1): Wrapper object use after free (WRAPPER_ESCAPE)
  escape: The internal representation of "s" escapes, but is destroyed
  when it exits scope.
CID 1049251 (#1 of 1): Wrapper object use after free (WRAPPER_ESCAPE)
  escape: The internal representation of "s" escapes, but is destroyed
  when it exits scope.
CID 1049252 (#1 of 1): Wrapper object use after free (WRAPPER_ESCAPE)
  escape: The internal representation of "s" escapes, but is destroyed
  when it exits scope.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2013-07-20 19:02:18 +02:00
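
The hazard described in this commit can be illustrated with a small sketch. The `ReplicaLogOp` class and its `prefix_` member are hypothetical and only loosely modeled on the description; they are not the actual rgw classes.

```cpp
#include <cassert>
#include <string>

// Hypothetical op class used only to illustrate the return-type change.
class ReplicaLogOp {
  std::string prefix_;

public:
  explicit ReplicaLogOp(const std::string& prefix) : prefix_(prefix) {}

  // Dangerous variant, shown only as a comment so nothing here is undefined:
  //   const char *name() {
  //     std::string s = "replicalog_";
  //     s.append(prefix_);
  //     return s.c_str();  // 's' is destroyed on return; pointer dangles
  //   }

  // Safe variant: the string is returned by value, so the caller owns
  // its own copy and nothing refers to the destroyed local.
  const std::string name() const {
    std::string s = "replicalog_";
    s.append(prefix_);
    return s;
  }
};
```

A caller that still needs a `const char *` can keep the returned string alive in a local variable and call `c_str()` on that.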
Sage Weil
3dec530de6 qa/workunits/mon/caps.sh: clean up users; rename
Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-19 22:32:23 -07:00
Sage Weil
675d783aed mon/MonCap: simplify rwx match logic
Make this a positive check instead of double negative.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-19 22:32:23 -07:00
Sage Weil
f79d965049 mon: fix command caps check
We must require something or else the caps check is going to pass in
a degenerate sense.  Use X for commands.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-19 22:32:23 -07:00
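
The degenerate pass mentioned in this commit, together with the positive-form check from the preceding one, can be sketched as below. The bit names and both helper functions are assumptions for illustration and do not reproduce the MonCap code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical permission bits.
enum : uint8_t { PERM_R = 1, PERM_W = 2, PERM_X = 4 };

// Positive form of the check: every required bit must be allowed.
// With required == 0 this is trivially true, which is the degenerate
// pass described in the commit message.
bool is_capable(uint8_t allowed, uint8_t required) {
  return (allowed & required) == required;
}

// Commands always demand at least X, so the required mask is never empty.
bool may_run_command(uint8_t allowed, uint8_t extra_required = 0) {
  return is_capable(allowed, static_cast<uint8_t>(PERM_X | extra_required));
}
```

Forcing a non-empty required mask at the call site keeps the generic check simple while ruling out the case where a client with no caps at all is let through.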
Joao Eduardo Luis
fb2150425b qa: workunits: mon: test mon caps permissions
Set the environment variable TEST_EXIT_ON_ERROR=0 to report all errors
instead of exiting with return code 1 on the first error found.

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-07-19 21:18:07 -07:00
Sage Weil
73b4003f65 Merge remote-tracking branch 'gh/wip-swift' into next
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-07-19 21:08:18 -07:00
Sage Weil
0356eebfa5 mon/PaxosService: update on_active() docs to clarify calling rules
Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-19 16:59:15 -07:00
Sage Weil
6d326b8424 mon/OSDMonitor: discard failure waiters, info on shutdown
This would prevent a leak, if we didn't assert before that in the
failure_reporter_t dtor.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-19 16:57:04 -07:00
Sage Weil
8371680bab mon: OSDMonitor: only thrash and propose if we are the leader
'thrash_map' is only set if we are the leader, so we would only thrash and
propose the pending value if we are the leader.  However, we should keep
the 'is_leader()' check not only for clarity's sake (an unfamiliar reader
may cry OMGBUG, prompting a patch much like this one), but also because
we may lose a subsequent election and become a peon instead, while still
holding a 'thrash_map' value > 0 -- and we really don't want to propose
while being a peon.

[This is a rebased version of 5eac38797d,
complete with the typo fix in d656aed599ee754646e16386ce5a4ab0117f2d6e.]

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-07-19 16:39:49 -07:00
Sage Weil
e4f2e3ecd0 mon/OSDMonitor: do not wait for readable in send_latest()
send_latest() checks for readable and, if not readable, will wait before
sending out the latest OSDMap.  This is completely unnecessary; I think it
is a hold-over from when we had independent paxos states.  An audit of all
callers confirms that everyone would be happy with whatever is committed,
even if we are in the process of committing an even newer version.

Effectively, everyone waits *above* this layer in the usual PaxosService
traps for whether we are readable or not.  This means that waiting_for_map
and send_to_waiting() go away entirely, which is nice.

This addresses, among other things: send_to_waiting() is called from
update_from_paxos(), which can be called when we are not readable due to
the paxos commit/finish timing changes in f1ce8d7c95 and
c711203c0d.  If no subsequent update happens, those waiters never get
their maps.

Instead, we send them immediately--we know they are committed and old
history is as good as future history.

Fixes: #5643
Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-19 16:39:49 -07:00
Sage Weil
6edec516bf Revert "mon/OSDMonitor: send_to_waiting() in on_active()"
This reverts commit f06a124a7f.

On peons, on_active() is only called when we *first* become active after an
election.  Only on the leader is it called after each commit/update.  This
makes this change cause other problems (broken subscriptions on peons, in
particular).  We possibly should fix that, but there is also a simpler fix
for the original problem we were trying to solve.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-19 16:39:47 -07:00
Sage Weil
2795eb1232 Revert "mon: OSDMonitor: only thrash and propose if we are the leader"
This reverts commit 5eac38797d.
2013-07-19 16:23:04 -07:00
Sage Weil
0a9964934d Revert "mon/OSDMonitor: fix typo"
This reverts commit d656aed599.
2013-07-19 16:22:48 -07:00
Dan Mick
8c5e1db4fb ceph_rest_api.py: remove unused imports
Fixes: #5684
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-07-19 15:09:00 -07:00
Dan Mick
ce46961e32 ceph.in: better error message when daemon command returns nothing
Fixes: #5683
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-07-19 15:01:18 -07:00
Sage Weil
06ae53e2b6 mon: improve osdmap subscription debug output
Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-19 14:50:03 -07:00
Sage Weil
d3902e2e31 Merge remote-tracking branch 'gh/wip-stats' into next
Reviewed-by: Samuel Just <sam.just@inktank.com>
2013-07-19 14:49:25 -07:00
Greg Farnum
bc1aca77ea Merge branch 'wip-rgw-next-2' into next
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-07-19 13:25:48 -07:00
Yehuda Sadeh
da8584f15f rgw: remove extra unused param from RGWRados::get_attr()
No user for the extra obj_version param.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2013-07-19 13:21:50 -07:00
Yehuda Sadeh
d44082e421 cls_rgw: quiet down verbose log message
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2013-07-19 13:21:50 -07:00
Yehuda Sadeh
4e05786a58 rgw: replace logic that compares regions
The logic was a bit broken. Basically, we want to make sure
that region names are the same. However, if region name is not
set then we need to check whether it's the master region. This
can happen in upgrade cases where originally we didn't have
a region name set.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2013-07-19 13:21:50 -07:00
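
The comparison rule described in this commit can be sketched as a small predicate. The `Region` struct and `region_matches()` function are hypothetical simplifications, not the rgw types.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified view of a region.
struct Region {
  std::string name;
  bool is_master;
};

// True when a bucket recorded with 'bucket_region' belongs to 'local'.
bool region_matches(const std::string& bucket_region, const Region& local) {
  if (bucket_region.empty())
    return local.is_master;  // upgrade case: no region name was recorded
  return bucket_region == local.name;
}
```

Handling the empty name first keeps the upgrade case explicit instead of letting an empty string accidentally compare equal or unequal to a real region name.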
Yehuda Sadeh
e4d2787b02 rgw-admin: link / unlink should report errors
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2013-07-19 13:21:50 -07:00
Yehuda Sadeh
0024e5aa22 rgw: fix time parsing in replica log
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2013-07-19 13:21:49 -07:00
Yehuda Sadeh
7cd0bd85d4 rgw: bucket entry point object ver fixes
Multiple fixes:
 - sync master, secondary entry point ver on creation
 - use correct entry point version when removing entry point
 - check correct version on bucket removal

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2013-07-19 13:21:49 -07:00
Yehuda Sadeh
89ecba209b rgw: remove s->objv_tracker
It was never initialized correctly anyway. It was only supposed to
be used for buckets, but it was never initialized in that case.
Use s->bucket_info.objv_tracker instead.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2013-07-19 13:21:49 -07:00
Yehuda Sadeh
85f3f09b0a rgw: forward delete bucket request to master after removal
We can only forward the bucket removal to the master if the bucket was
successfully removed locally. The master region has no knowledge about
whether the bucket can be removed or not, e.g., when there are still
objects in the bucket. If we send the request to the master first, it
will happily remove the bucket even though the local removal might fail
in the end.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2013-07-19 13:21:49 -07:00
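
The ordering this commit describes can be sketched as follows. The `remove_bucket()` function and its callback parameters are assumptions made for the example; the real request path in rgw is more involved.

```cpp
#include <cassert>
#include <functional>

// Returns 0 on success or a negative error code.
// 'remove_local' and 'forward_to_master' are hypothetical callbacks
// standing in for the local removal and the forwarded request.
int remove_bucket(bool is_master_region,
                  const std::function<int()>& remove_local,
                  const std::function<int()>& forward_to_master) {
  // Remove locally first: only the local region knows whether the bucket
  // is removable (for example, whether it still contains objects).
  int r = remove_local();
  if (r < 0)
    return r;  // the master never hears about a removal that failed here

  // Forward only after the local removal succeeded.
  if (!is_master_region)
    return forward_to_master();
  return 0;
}
```

Returning early on a local failure is what keeps the master and the secondary region from disagreeing about whether the bucket exists.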
Yehuda Sadeh
989a4d93d8 rgw: adjust error for bucket removal on secondary region
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2013-07-19 13:21:49 -07:00
Yehuda Sadeh
2e51823563 rgw: forward x_amz_meta headers when forwarding a request
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2013-07-19 13:21:49 -07:00
Yehuda Sadeh
4f4bdbd5cb rgw: fix bucket re-creation on secondary region
We had a problem with bucket recreation, where we identified
that the bucket already existed, but missed the fact that it's
the same bucket, so removal of the bucket index was wrong.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2013-07-19 13:21:49 -07:00