Commit Graph

28316 Commits

Author SHA1 Message Date
Yehuda Sadeh
1d1f7f18df rgw: change watch init ordering, don't distribute if can't
Backport: dumpling

Move the watch initialization back to after the zone init,
as the zone info holds the control pool name. Since zone
init might need to create a new system object (that needs
to distribute cache), don't try to distribute cache if
watch is not yet initialized.

Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2013-09-03 13:33:23 -07:00
Roald J. van Loon
e48d6cb402 mon: fix uninitialized Op field
- Uninitialized field in MonitorLevelDB::Op causes random build errors.

Signed-off-by: Roald J. van Loon <roaldvanloon@gmail.com>
2013-08-31 10:30:30 -07:00
Roald J. van Loon
a5d815d233 automake cleanup: uninitialized version_t
This sometimes gives a completely random uint64_t value, because it is
potentially used uninitialized.

Signed-off-by: Roald J. van Loon <roaldvanloon@gmail.com>
2013-08-31 10:28:19 -07:00
João Eduardo Luís
12c8850a7c Merge pull request #530 from ceph/wip-monc-leak
mon/MonClient: release pending outgoing messages on shutdown

Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-08-30 10:36:07 -07:00
Sage Weil
e60d4e09e9 ceph-post-file: use mktemp instead of tempfile
tempfile is a debian thing, apparently; mktemp is present everywhere.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-08-30 09:41:29 -07:00
Sylvain Munaut
7a7361d7e7 rgw: Fix S3 auth when using response-* query string params
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
Signed-off-by: Sylvain Munaut <s.munaut@whatever-company.com>
2013-08-29 10:56:23 -07:00
Gary Lowell
91616ce4ef ceph.spec.in: remove trailing paren in previous commit
Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
2013-08-29 09:12:49 -07:00
Gary Lowell
b03f24173b ceph.spec.in: Don't invoke debug_package macro on centos.
If the redhat-rpm-config package is installed, the debuginfo rpms will
be built by default. The build will fail when the package is installed
and the specfile also invokes the macro.

Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
2013-08-29 09:12:26 -07:00
Sage Weil
ec297ec660 Merge pull request #548 from dmick/next
ceph.in: add to $PATH if needed regardless of LD_LIBRARY_PATH state

Reviewed-by: Sage Weil <sage@inktank.com>
2013-08-27 14:02:26 -07:00
Dan Mick
37850e1be6 ceph.in: add to $PATH if needed regardless of LD_LIBRARY_PATH state
Signed-off-by: Dan Mick <dan.mick@inktank.com>
2013-08-27 13:40:23 -07:00
Sage Weil
c5b5ce120a osd: install admin socket commands after signals
This lets us tell by the presence of the admin socket commands whether
a signal will make us shut down cleanly.  See #5924.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
2013-08-26 13:20:51 -07:00
Sage Weil
4b529c8bce Merge pull request #531 from dmick/wip-6099
ceph_rest_api.py: create own default for log_file

Reviewed-by: Sage Weil <sage@inktank.com>
2013-08-23 15:30:41 -07:00
Dan Mick
2031f391c3 ceph_rest_api.py: create own default for log_file
common/config thinks the default log_file for non-daemons should be "".
Override that so that the default is
    /var/log/ceph/{cluster}-{name}.{pid}.log
since ceph-rest-api is more of a daemon than a client.

Fixes: #6099
Backport: dumpling
Signed-off-by: Dan Mick <dan.mick@inktank.com>
2013-08-23 15:11:03 -07:00
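
The fallback described in 2031f391c3 can be sketched in a few lines. This is a minimal illustration, not the code from ceph_rest_api.py: the helper name `effective_log_file` and its parameters are hypothetical, and only the path template is taken from the commit message.

```python
import os

# Path template quoted in the commit message.
DEFAULT_LOG_TEMPLATE = "/var/log/ceph/{cluster}-{name}.{pid}.log"


def effective_log_file(configured, cluster, name, pid=None):
    """Return the log file to use (hypothetical helper, for illustration).

    An empty `configured` value stands for the non-daemon default of "".
    In that case, fall back to the daemon-style path.
    """
    if configured:
        # An explicit setting always wins over the fallback.
        return configured
    if pid is None:
        pid = os.getpid()
    return DEFAULT_LOG_TEMPLATE.format(cluster=cluster, name=name, pid=pid)
```

An explicitly configured log_file is returned unchanged; only the empty default is replaced.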
Yehuda Sadeh
057588f41a Merge pull request #535 from ceph/wip-readdir-r-sucks
Fix readdir_r invocation

Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
2013-08-23 12:00:30 -07:00
Sage Weil
99a2ff7da9 os: make readdir_r buffers larger
PATH_MAX isn't quite big enough.

Backport: dumpling, cuttlefish, bobtail
Signed-off-by: Sage Weil <sage@inktank.com>
2013-08-23 11:45:35 -07:00
Sage Weil
2df66d9fa2 os: fix readdir_r buffer size
The buffer needs to be big or else we'll walk all over the stack.

Backport: dumpling, cuttlefish, bobtail
Signed-off-by: Sage Weil <sage@inktank.com>
2013-08-23 11:45:08 -07:00
Sage Weil
fe5010380a mon/Paxos: fix another uncommitted value corner case
It is possible that we begin the paxos recovery with an uncommitted
value for, say, commit 100.  During last/collect we discover 100 has been
committed already.  But also, another node provides an uncommitted value
for 101 with the same pn.  Currently, we refuse to learn it, because the
pn is not strictly greater than our current uncommitted pn... even though it is
the next last_committed+1 value that we need.

There are two possible fixes here:

 - make this a >= as we can accept newer values from the same pn.
 - discard our uncommitted value metadata when we commit the value.

Let's do both!

Fixes: #6090
Signed-off-by: Sage Weil <sage@inktank.com>
2013-08-23 10:38:53 -07:00
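
The two fixes listed in fe5010380a can be illustrated with a small model. This is a sketch under stated assumptions, not the Paxos code from the monitor: the class and method names (`Uncommitted`, `Recovery`, `commit`, `maybe_learn`) are hypothetical, and the acceptance rule is reduced to the two conditions the commit message talks about.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Uncommitted:
    version: int   # the version this value would become once committed
    pn: int        # proposal number under which the value was accepted
    value: bytes


class Recovery:
    """Hypothetical, simplified recovery state for one node."""

    def __init__(self, last_committed: int,
                 uncommitted: Optional[Uncommitted] = None):
        self.last_committed = last_committed
        self.uncommitted = uncommitted

    def commit(self, version: int) -> None:
        self.last_committed = max(self.last_committed, version)
        # Fix 2: drop uncommitted metadata once that version is committed.
        if self.uncommitted and self.uncommitted.version <= self.last_committed:
            self.uncommitted = None

    def maybe_learn(self, peer: Uncommitted) -> bool:
        # Only the next version after last_committed is of interest.
        if peer.version != self.last_committed + 1:
            return False
        # Fix 1: accept an equal pn (>=) instead of requiring a strictly
        # greater one, so a newer value from the same pn can be learned.
        if self.uncommitted is not None and peer.pn < self.uncommitted.pn:
            return False
        self.uncommitted = peer
        return True
```

In this model either fix alone lets the node learn version 101 at the same pn after 100 has been committed; applying both matches the "Let's do both!" choice in the message.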
Yehuda Sadeh
0373d749ce rgw: bucket meta remove don't overwrite entry point first
Fixes: #6056

When removing a bucket metadata entry we first unlink the bucket
and then we remove the bucket entrypoint object. Originally
when unlinking the bucket we first overwrote the bucket entrypoint
entry marking it as 'unlinked'. However, this is not really needed
as we're just about to remove it. The original version triggered
a bug, as we needed to propagate the new header version first (which
we didn't do, so the subsequent bucket removal failed).

Reviewed-by: Greg Farnum <greg@inktank.com>
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2013-08-23 10:10:57 -07:00
Alfredo Deza
f040020fb2 ceph-disk: specify the filetype when mounting
Signed-off-by: Alfredo Deza <alfredo.deza@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-08-23 08:15:06 -07:00
Sage Weil
b003e5fddc Merge pull request #532 from dmick/next
PGMonitor: pg dump_stuck should respect --format (plain works fine)

Reviewed-by: Sage Weil <sage@inktank.com>
2013-08-22 21:34:57 -07:00
Sandon Van Ness
40f43a028e QA: Compile fsstress if missing on machine.
Some distros lack the ltp-kernel packages, and all we need is
fsstress. This modifies the shell script to download and compile
fsstress from source and copy it to the right location if it doesn't
currently exist where it is expected. It is a very small, quick
compile, and currently only SLES and Debian do not have it already.

Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
2013-08-22 19:52:16 -07:00
Dan Mick
ab4e85da6a PGMonitor: pg dump_stuck should respect --format (plain works fine)
Signed-off-by: Dan Mick <dan.mick@inktank.com>
2013-08-22 18:53:34 -07:00
Sage Weil
309569a6d0 mon/MonClient: release pending outgoing messages on shutdown
This fixes a small memory leak when we have messages queued for the mon
when we shut down.  It is harmless except for the valgrind leak check
noise that obscures real leaks.

Backport: dumpling
Signed-off-by: Sage Weil <sage@inktank.com>
2013-08-22 17:46:45 -07:00
Yehuda Sadeh
3d55534268 rgw: fix crash when creating new zone on init
Move the watch/notify init before the zone init,
as we might need to send a notification.

Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2013-08-22 14:31:06 -07:00
Alexandre Oliva
617dc36d47 enable mds rejoin with active inodes' old parent xattrs
When the parent xattrs of active inodes that the mds attempts to open
during rejoin lack pool info (struct_v < 5), the pool field is filled
in with -1. The mds then retries fetching the backtrace with the pool
number it expects, which fails and takes the err==-ENOENT branch. That
branch retries pool 1, which succeeds but again yields pool -1, so the
mds keeps bouncing between the two retry cases forever.

This patch arranges for the mds to go along with pool -1 instead of
insisting that it be refetched, enabling it to complete recovery
instead of eating cpu, network bandwidth and metadata osd's resources
like there's no tomorrow, in what AFAICT is an infinite and very busy
loop.

This is not a new problem: I've had it even before upgrading from
Cuttlefish to Dumpling, I'd just never managed to track it down, and
force-unmounting the filesystem and then restarting the mds was an
easier (if inconvenient) work-around, particularly because it always
hit when the filesystem was under active, heavy-ish use (or there
wouldn't be much reason for caps recovery ;-)

There are two issues not addressed in this patch, however.  One is
that nothing seems to proactively update the parent xattr when it is
found to be outdated, so it remains out of date forever.  Not even
renaming top-level directories causes the xattrs to be recursively
rewritten.  AFAICT that's a bug.

The other is that inodes that don't have a parent xattr (created by
even older versions of ceph) are reported as non-existing in the mds
rejoin message, because the absence of the parent xattr is signaled as
a missing inode ("failed to reconnect caps for missing inodes").  I
suppose this may cause more serious recovery problems.

I suppose a global pass over the filesystem tree updating parent
xattrs that are out-of-date would be desirable, if we find any parent
xattrs still lacking current information; it might make sense to
activate it as a background thread from the backtrace decoding
function, when it finds a parent xattr that's too out-of-date, or as a
separate client (ceph-fsck?).

Backport: dumpling, cuttlefish
Signed-off-by: Alexandre Oliva <oliva@gnu.org>
Reviewed-by: Zheng, Yan <zheng.z.yan@intel.com>
2013-08-22 08:13:29 -07:00
Sage Weil
9242d01cc0 ceph-monstore-tool: shut up coverity
Signed-off-by: Sage Weil <sage@inktank.com>
2013-08-21 21:55:10 -07:00
Yan, Zheng
123f79bea8 store: fix issues reported by coverity
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-08-21 21:55:10 -07:00
Josh Durgin
8784564669 objecter: fix keys of dump_linger_ops
The registering flag no longer exists, and registered was using the
wrong property due to a copy-paste error.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Sage Weil <sage.weil@inktank.com>
2013-08-21 16:26:09 -07:00
Josh Durgin
38a0ca66a7 objecter: resend unfinished lingers when osdmap is no longer paused
Plain Ops that haven't finished yet need to be resent if the osdmap
transitions from full or paused to unpaused.  If these Ops are
triggered by LingerOps, they will be cancelled instead (since
should_resend = false), but the LingerOps that triggered them will not
be resent.

Fix this by checking the registered flag for all linger ops, and
resending any of them that aren't paused anymore.

Fixes: #6070
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Sage Weil <sage.weil@inktank.com>
2013-08-21 16:01:04 -07:00
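
The selection rule from 38a0ca66a7 can be sketched as follows. This is an illustration only: the `LingerOp` fields and the `lingers_to_resend` helper are hypothetical stand-ins, assuming that `registered` becomes true once a linger has completed registration and that `paused` reflects whether the current osdmap still blocks the op.

```python
from dataclasses import dataclass


@dataclass
class LingerOp:
    linger_id: int
    registered: bool  # True once registration of the linger has completed
    paused: bool      # True while the osdmap state still blocks this op


def lingers_to_resend(linger_ops):
    """Pick lingers that never finished registering and are no longer paused."""
    return [op for op in linger_ops if not op.registered and not op.paused]
```

Lingers that are already registered are left alone, and lingers that are still paused wait for a later osdmap change.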
Yehuda Sadeh
d26ba3ab03 rgw: change cache / watch-notify init sequence
Fixes: #6046
We were initializing the watch-notify (through the cache
init) before reading the zone info, which was much too
early, as we didn't have the control pool name yet. This
simplifies init/cleanup a bit: the cache no longer calls watch/notify
init and cleanup directly, but rather states its need
through a virtual callback.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-08-21 11:13:09 -07:00
Sage Weil
cf8dbd248b Merge remote-tracking branch 'gh/wip-6004' into next
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Yan, Zheng <zheng.z.yan@intel.com>
2013-08-20 16:57:46 -07:00
Sage Weil
edf2c3449e .gitignore: ignore test-driver
Signed-off-by: Sage Weil <sage@inktank.com>
2013-08-20 16:54:20 -07:00
Sage Weil
9833e9dabe fuse: fix warning when compiled against old fuse versions
client/fuse_ll.cc: In function 'void invalidate_cb(void*, vinodeno_t, int64_t, int64_t)':
warning: client/fuse_ll.cc:540: unused variable 'fino'

Signed-off-by: Sage Weil <sage@inktank.com>
2013-08-20 16:54:10 -07:00
Sage Weil
6abae35a39 json_spirit: remove unused typedef
In file included from json_spirit/json_spirit_writer.cpp:7:0:
json_spirit/json_spirit_writer_template.h: In function 'String_type json_spirit::non_printable_to_string(unsigned int)':
json_spirit/json_spirit_writer_template.h:37:50: warning: typedef 'Char_type' locally defined but not used [-Wunused-local-typedefs]
         typedef typename String_type::value_type Char_type;

(Also, ha ha, this file uses \r\n.)

Signed-off-by: Sage Weil <sage@inktank.com>
2013-08-20 16:54:05 -07:00
Sage Weil
c9cdd19d1c gtest: add build-aux/test-driver to .gitignore
Signed-off-by: Sage Weil <sage@inktank.com>
2013-08-20 16:54:02 -07:00
Dan Mick
0ccb9be3b6 Merge pull request #517 from dmick/wip-6049
mon/PGMap: OSD byte counts 4x too large (conversion to bytes overzealous)

Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-08-20 12:18:43 -07:00
Sage Weil
981eda9f77 mon/Paxos: always refresh after any store_state
If we store any new state, we need to refresh the services, even if we
are still in the midst of Paxos recovery.  This is because the
subscription path will share any committed state even when paxos is
still recovering.  This prevents a race like:

 - we have maps 10..20
 - we drop out of quorum
 - we are elected leader, paxos recovery starts
 - we get one LAST with committed states that trim maps 10..15
 - we get a subscribe for map 10..20
   - we crash because 10 is no longer on disk because the PaxosService
     is out of sync with the on-disk state.

Fixes: #6045
Backport: dumpling
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-08-20 11:27:23 -07:00
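
The race listed in 981eda9f77 can be reproduced with a toy model. This is a sketch, not monitor code: `Store`, `Service`, and `store_state` here are hypothetical simplifications in which the service caches the first and last map versions and only learns about trimming when `refresh()` is called.

```python
class Store:
    """On-disk state: map versions first..last (hypothetical model)."""

    def __init__(self, first, last):
        self.maps = {v: f"map-{v}" for v in range(first, last + 1)}

    def trim_to(self, first):
        # Remove every version older than `first`.
        for v in [v for v in self.maps if v < first]:
            del self.maps[v]


class Service:
    """In-memory view that must be refreshed to match the store."""

    def __init__(self, store):
        self.store = store
        self.refresh()

    def refresh(self):
        self.first = min(self.store.maps)
        self.last = max(self.store.maps)

    def handle_subscribe(self, start):
        # Serve from the cached range; a stale range hits missing versions.
        start = max(start, self.first)
        return [self.store.maps[v] for v in range(start, self.last + 1)]


def store_state(store, service, trim_to, refresh_always=True):
    """Apply committed state; return whether anything was stored."""
    changed = trim_to > min(store.maps)
    if changed:
        store.trim_to(trim_to)
        if refresh_always:
            # The fix: refresh even while recovery is still in progress.
            service.refresh()
    return changed
```

With `refresh_always=False`, a subscribe for map 10 after the trim raises `KeyError`, which stands in for the crash; with the refresh, the subscriber receives maps 16..20.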
Sage Weil
7e0848d8f8 mon/Paxos: return whether store_state stored anything
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-08-20 11:27:09 -07:00
Sage Weil
b9dee2285d mon/Paxos: cleanup: use do_refresh from handle_commit
This avoids duplicated code by using the helper created exactly for this
purpose.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
2013-08-20 11:26:57 -07:00
Sage Weil
6ef1970340 pybind: fix Rados.conf_parse_env test
This happens after we connect, which means we always get ENOSYS.
Instead, call parse_env inside the normal setup method, which has the added
benefit of letting us debug these tests.

Backport: dumpling
Signed-off-by: Sage Weil <sage@inktank.com>
2013-08-20 11:23:46 -07:00
Dan Mick
eca53bbf58 mon/PGMap: OSD byte counts 4x too large (conversion to bytes overzealous)
Fixes: #6049
Signed-off-by: Dan Mick <dan.mick@inktank.com>
2013-08-20 11:14:43 -07:00
Samuel Just
1f851cb248 PG: remove old log when we upgrade log version
Otherwise the log_oid will be non-empty and the next
boot will cause us to try to upgrade again.

Fixes: #6057
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-08-19 22:51:05 -07:00
Samuel Just
00080d785f PGLog: add a config to disable PGLog::check()
This is a debug check which may be causing excessive
cpu usage.

Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-08-19 18:04:55 -07:00
Sage Weil
67a95b9880 ceph: parse CEPH_ARGS environment variable
Fixes: #6052
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
2013-08-19 12:48:58 -07:00
Sage Weil
eef7cacdb1 rados pybind: add conf_parse_env()
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
2013-08-19 12:48:58 -07:00
Sage Weil
68c1c70e1f Merge remote-tracking branch 'gh/next'
2013-08-19 12:41:54 -07:00
Sage Weil
9dda1cc044 doc/release-notes: v0.61.8
Signed-off-by: Sage Weil <sage@inktank.com>
2013-08-19 12:41:44 -07:00
Sage Weil
233fed8c97 Merge pull request #513 from dalgaaf/fix/wip-da-documentation
Fix documentation issues
2013-08-19 12:32:30 -07:00
Danny Al-Gaaf
090e4c4a31 filestore-config-ref.rst: mark some filestore keys as deprecated
Marked the following keys as deprecated since v0.65:
- filestore flusher
- filestore flusher max fds
- filestore sync flush

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2013-08-19 20:56:48 +02:00
Sage Weil
a396e02713 Merge pull request #512 from ceph/wip-5988
Reviewed-by: Sage Weil <sage@inktank.com>
2013-08-19 11:16:57 -07:00