Commit Graph

32416 Commits

Author SHA1 Message Date
Samuel Just
2722a0a487 PrioritizedQueue: cap costs at max_tokens_per_subqueue
Otherwise, you can get a recovery op in the queue which has a cost
higher than the max token value.  It won't get serviced until all other
queues also do not have enough tokens and higher priority queues are
empty.

Fixes: #7706
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-13 14:04:25 -07:00
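The starvation described in 2722a0a487 can be illustrated with a minimal token-bucket sketch. This is not Ceph's PrioritizedQueue code: `struct subqueue`, `cap_cost` and `try_service` are hypothetical names, and the real queue has more state. It only shows why an op whose cost exceeds the bucket capacity can never be paid for unless the cost is capped.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified subqueue; not the real PrioritizedQueue. */
struct subqueue {
	unsigned max_tokens;  /* bucket capacity (max_tokens_per_subqueue) */
	unsigned tokens;      /* tokens currently available */
};

/* Clamp an op's cost so that a full bucket can always pay for it. */
static unsigned cap_cost(unsigned cost, unsigned max_tokens)
{
	return cost > max_tokens ? max_tokens : cost;
}

/* Try to service an op of the given cost.
 * Returns 1 and spends the tokens if affordable, otherwise 0. */
static int try_service(struct subqueue *q, unsigned cost)
{
	assert(q != NULL);
	if (cost > q->tokens)
		return 0;
	q->tokens -= cost;
	return 1;
}
```

With a capacity of 100 tokens, an op of cost 250 is rejected even when the bucket is full; after `cap_cost` it costs 100 and can be serviced.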
Yehuda Sadeh
a19ef011db rgw: manifest hold the actual bucket used for tail objects
Fixes: 7703
Objects can be copied between different buckets, so we need to keep track
of which bucket is used for naming the tail parts. The new manifest
requires that, whereas the older manifest just held all the tail objects
(each containing the appropriate bucket internally).

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
2014-03-13 11:25:24 -07:00
Sage Weil
33b889f3f1 rbd-fuse: fix signed/unsigned warning
rbd_fuse/rbd-fuse.c: In function 'enumerate_images':
rbd_fuse/rbd-fuse.c:113:2: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]

Signed-off-by: Sage Weil <sage@inktank.com>
2014-03-13 11:24:48 -07:00
Danny Al-Gaaf
c973e46c47 mds/Mutation.h: init export_dir with NULL in ctor
CID 1188167 (#1 of 1): Uninitialized pointer field (UNINIT_CTOR)
 2. uninit_member: Non-static class member "export_dir" is not initialized in
 this constructor nor in any functions that it calls.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-03-13 18:48:00 +01:00
Danny Al-Gaaf
fd383a95f0 mds/Migrator.h: init some members of import_state_t in ctor
CID 1188166 (#1 of 1): Uninitialized scalar field (UNINIT_CTOR)
 2. uninit_member: Non-static class member "state" is not initialized in this
   constructor nor in any functions that it calls.
 4. uninit_member: Non-static class member "peer" is not initialized in this
   constructor nor in any functions that it calls.
 6. uninit_member: Non-static class member "tid" is not initialized in this
   constructor nor in any functions that it calls.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-03-13 18:39:32 +01:00
Danny Al-Gaaf
5a53aa82f0 mds/Migrator.h: init some export_state_t members in ctor
CID 1188165 (#1 of 1): Uninitialized scalar field (UNINIT_CTOR)
 2. uninit_member: Non-static class member "state" is not initialized in
  this constructor nor in any functions that it calls.
 4. uninit_member: Non-static class member "peer" is not initialized in this
  constructor nor in any functions that it calls.
 6. uninit_member: Non-static class member "tid" is not initialized in this
  constructor nor in any functions that it calls.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-03-13 18:30:54 +01:00
Danny Al-Gaaf
b10692fb3a CInode::encode_cap_message: add assert for cap
CID 716913 (#1 of 1): Dereference after null check (FORWARD_NULL)
5. var_deref_op: Dereferencing null pointer "cap".

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-03-13 17:59:41 +01:00
Danny Al-Gaaf
58e35a4bc2 test_filejournal.cc: use strncpy and terminate with '\0'
CID 966632 (#1 of 1): Copy into fixed size buffer (STRING_OVERFLOW)
 2. fixed_size_dest: You might overrun the 200 byte fixed-size string
 "path" by copying "args[0UL]" without checking the length.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-03-13 17:21:53 +01:00
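The pattern named in 58e35a4bc2 (bounded `strncpy` plus explicit termination) might look like the sketch below. The helper name `copy_bounded` is hypothetical and is not taken from test_filejournal.cc. The explicit terminator matters because `strncpy` does not write a NUL when the source fills the whole limit.

```c
#include <assert.h>
#include <string.h>

/* Copy src into a fixed-size buffer, truncating if necessary and
 * always leaving dst NUL-terminated. dst_size must be at least 1. */
static void copy_bounded(char *dst, size_t dst_size, const char *src)
{
	assert(dst != NULL && src != NULL && dst_size > 0);
	strncpy(dst, src, dst_size - 1);   /* never writes past dst_size - 1 bytes */
	dst[dst_size - 1] = '\0';          /* strncpy does not guarantee this */
}
```

For a 200-byte `path` buffer the call would be `copy_bounded(path, sizeof(path), args[0])`.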
Sage Weil
0073bc2592 Merge pull request #1433 from fractalcat/fix-crypto-init-race-condition
Work around race condition in libnss

Reviewed-by: Sage Weil <sage@inktank.com>
2014-03-13 08:44:16 -07:00
Sage Weil
bc7aa22144 Merge pull request #1443 from fghaas/doc-fix
doc: fix formatting on PG recommendation

Reviewed-by: Sage Weil <sage@inktank.com>
2014-03-13 08:40:41 -07:00
Sharif Olorin
a2784baae1 Add unit test for race condition in libnss
This isn't in test/crypto.cc because common_init_finish is called prior
to running any tests. Will not build the test function if Ceph hasn't
been configured with NSS.

Signed-off-by: Sharif Olorin <sio@tesser.org>
2014-03-14 01:22:07 +11:00
Sharif Olorin
44aaaaffd5 Work around race condition in libnss
This change prevents a segfault in ceph::crypto::init when using NSS and
calling rados_connect from multiple threads simultaneously on different
rados_t objects (and updates the documentation for rados_connect to
reflect the fix).

It's pretty simple, just one static mutex wrapping the
NSS definition of ceph::crypto::init. More details regarding the race
condition are in this[0] commit (and pull request #1424).

To reproduce the race condition in the existing codebase, the C program
below[1] will work. The number of threads needed to reproduce it reliably
varies with the number of cores and probably other things, but the more
the better; in my environment five is sufficient, with four cores.

[0]: 377c919088

[1]:

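```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <rados/librados.h>

/* Thread count is an assumption; five was sufficient in the
 * author's environment with four cores. */
#define NTHREAD 5

/* Each thread creates its own cluster handle and connects with it. */
void *init(void *p) {
	int err;
	rados_t cluster;
	(void)p;
	err = rados_create(&cluster, NULL);
	if (err < 0) {
		fprintf(stderr, "cannot create cluster handle: %s\n", strerror(-err));
		return NULL;
	}
	err = rados_conf_read_file(cluster, "./ceph.conf");
	if (err < 0) {
		fprintf(stderr, "error reading config file: %s\n", strerror(-err));
		return NULL;
	}
	rados_connect(cluster);
	return NULL;
}

int main() {
	pthread_t ts[NTHREAD];
	int i;
	for (i = 0; i < NTHREAD; i++) {
		pthread_create(&ts[i], NULL, init, NULL);
	}
	/* The threads' return values are not used, so pass NULL. */
	for (i = 0; i < NTHREAD; i++) {
		pthread_join(ts[i], NULL);
	}

	return 0;
}
```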

Signed-off-by: Sharif Olorin <sio@tesser.org>
2014-03-14 01:22:07 +11:00
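The workaround in 44aaaaffd5 is described as one static mutex wrapping the NSS definition of ceph::crypto::init. A minimal C sketch of that idea follows. `unsafe_library_init` is a hypothetical stand-in for the non-thread-safe NSS setup, and the `initialized` flag is an added assumption for illustration, not something the commit message states.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t init_lock = PTHREAD_MUTEX_INITIALIZER;
static int initialized;   /* assumption: run the real init only once */
static int init_calls;    /* counts how often the real init ran */

/* Hypothetical stand-in for a library init that is not thread-safe. */
static void unsafe_library_init(void)
{
	init_calls++;
}

/* Safe to call from many threads: the mutex serializes every caller,
 * so the unsafe init never runs concurrently with itself. */
static void guarded_init(void)
{
	int rc = pthread_mutex_lock(&init_lock);
	assert(rc == 0);
	if (!initialized) {
		unsafe_library_init();
		initialized = 1;
	}
	rc = pthread_mutex_unlock(&init_lock);
	assert(rc == 0);
	(void)rc;   /* silence unused warning when NDEBUG is defined */
}
```

A single process-wide lock is coarse, but initialization happens rarely, so contention on it should not matter in practice.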
Guang Yang
fe8a715c03 Allow the "filestore merge threshold" setting to be negative, which prevents merging. This could help in two cases:
1. We are trying to pre-create the PG folders several levels deep with a standalone tool to avoid runtime splitting, and we need a setting that prevents merging even when a folder contains no files.
2. Since runtime splitting and merging can cause latency issues, customers can use a negative merge threshold to prevent merging while still allowing splitting.
This change is backward compatible.
Signed-off-by: Guang Yang (yguang@yahoo-inc.com)
2014-03-13 13:10:38 +00:00
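The behaviour described in fe8a715c03 could be sketched as a guard in front of the merge decision. `should_merge` is a hypothetical helper, and the "fewer files than the threshold" comparison is an illustrative assumption rather than FileStore's actual rule; only the negative-value check reflects the commit message.

```c
#include <assert.h>

/* Decide whether a directory may be merged into its parent.
 * A negative threshold disables merging entirely. */
static int should_merge(int merge_threshold, int files_in_dir)
{
	assert(files_in_dir >= 0);
	if (merge_threshold < 0)
		return 0;                           /* merging switched off */
	return files_in_dir < merge_threshold;  /* assumed merge condition */
}
```

Existing positive thresholds behave as before under this sketch, which matches the backward-compatibility claim.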
Florian Haas
27f06346e2 doc: fix formatting on PG recommendation
Previous commit (047287afbe) broke
formatting on the formula, and also mixed formula and text oddly,
which on second thought didn't look too good.

Add the note about the power of two to the following paragraph
instead, in prose.

Signed-off-by: Florian Haas <florian@hastexo.com>
2014-03-13 11:35:20 +01:00
Danny Al-Gaaf
7cf81ce214 libcephfs/test.cc: shutdown cmount at end of MountNonExist
CID 966624 (#5 of 5): Resource leak (RESOURCE_LEAK)
 17. leaked_storage: Variable "cmount" going out of scope leaks the
 storage it points to.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-03-13 11:21:42 +01:00
Danny Al-Gaaf
269cf138b7 libcephfs/test.cc: shutdown cmount
CID 743410 (#17 of 17): Resource leak (RESOURCE_LEAK)
 65. leaked_storage: Variable "cmount" going out of scope leaks the storage
 it points to.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-03-13 09:48:44 +01:00
Danny Al-Gaaf
94acb6b313 test_librbd.cc: add missing va_end() to test_ls_pp
CID 1054877 (#1 of 1): Missing varargs init or cleanup (VARARGS)
 17. missing_va_end: va_end was not called for "ap".

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-03-13 09:03:57 +01:00
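The rule behind 94acb6b313 is that every `va_start` needs a matching `va_end` on each return path. A minimal sketch, with a hypothetical `sum_ints` function unrelated to test_librbd.cc:

```c
#include <assert.h>
#include <stdarg.h>

/* Sum `count` int arguments passed after `count`. */
static int sum_ints(int count, ...)
{
	va_list ap;
	int total = 0;

	assert(count >= 0);
	va_start(ap, count);
	for (int i = 0; i < count; i++)
		total += va_arg(ap, int);
	va_end(ap);   /* required before returning; omitting it is undefined behaviour */
	return total;
}
```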
Danny Al-Gaaf
fb4ca9406f mailmap: Danny Al-Gaaf name normalization
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-03-13 07:49:53 +01:00
Sage Weil
fb8ff44516 doc/release-notes: note that WATCH can get ENOENT now
Signed-off-by: Sage Weil <sage@inktank.com>
2014-03-12 21:32:21 -07:00
Samuel Just
fce63aa821 Merge pull request #1440 from ceph/wip-7649
Wip 7649

Reviewed-by: Sage Weil <sage@inktank.com>
2014-03-12 18:33:03 -07:00
Sage Weil
b5d2df4a92 Merge pull request #1441 from ceph/wip-7671
Wip 7671

Reviewed-by: Sage Weil <sage@inktank.com>
2014-03-12 17:09:31 -07:00
Samuel Just
2cbad1b17f test/librados/watch_notify: create foo before watching
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-12 16:26:48 -07:00
Samuel Just
9d549eb2f4 test/system/st_rados_watch: expect ENOENT for watch on non-existent object
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-12 16:26:00 -07:00
Sage Weil
a523595691 Merge pull request #1439 from ceph/wip-7682
ReplicatedPG::already_(complete|ack) should skip temp object ops

Reviewed-by: Sage Weil <sage@inktank.com>
2014-03-12 15:45:35 -07:00
Danny Al-Gaaf
b23a141d54 RGWListBucketMultiparts: init max_uploads/default_max with 0
CID 717377 (#1 of 1): Uninitialized scalar field (UNINIT_CTOR)
 2. uninit_member: Non-static class member "max_uploads" is not initialized
    in this constructor nor in any functions that it calls.
 4. uninit_member: Non-static class member "default_max" is not initialized
    in this constructor nor in any functions that it calls.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-03-12 22:56:44 +01:00
Danny Al-Gaaf
4057a3062e AbstractWrite: initialize m_snap_seq with 0
CID 717223 (#1 of 1): Uninitialized scalar field (UNINIT_CTOR)
 2. uninit_member: Non-static class member "m_snap_seq" is not initialized
 in this constructor nor in any functions that it calls.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-03-12 22:37:12 +01:00
Samuel Just
90a2654ff5 ReplicatedPG::already_(complete|ack) should skip temp object ops
We clearly won't get dup ops on these repops, and they don't
have meaningful versions since they don't carry log
entries.

Fixes: #7682
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-12 14:07:54 -07:00
Danny Al-Gaaf
72bc1ef8b0 AdminSocket: initialize m_getdescs_hook in the constructor
CID 717212 (#1 of 1): Uninitialized pointer field (UNINIT_CTOR)
 2. uninit_member: Non-static class member "m_getdescs_hook" is not
 initialized in this constructor nor in any functions that it calls.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-03-12 21:03:25 +01:00
Danny Al-Gaaf
f7529cf428 RGWPutCORS_ObjStore_S3::get_params: check data before dereference
CID 1063697 (#1 of 1): Explicit null dereferenced (FORWARD_NULL)
 5. var_deref_model: Passing null pointer "data" to function
 "RGWXMLParser::parse(char const *, int, int)", which dereferences it.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-03-12 20:27:57 +01:00
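The fix pattern in f7529cf428 (check a pointer before handing it to a function that dereferences it) can be sketched as follows. `parse_buffer` and `get_params` are hypothetical stand-ins, and returning `-EINVAL` is an assumption about the error code, not taken from the rgw source.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-in for a parser that dereferences its input unconditionally. */
static int parse_buffer(const char *data, int len)
{
	assert(data != NULL);
	return (len > 0 && data[0] == '<') ? 0 : -EINVAL;
}

/* Reject a missing body before the parser ever sees it. */
static int get_params(const char *data, int len)
{
	if (data == NULL)
		return -EINVAL;   /* assumed error code */
	return parse_buffer(data, len);
}
```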
Danny Al-Gaaf
5334d5c808 mds/Server.cc: check straydn before dereference
CID 1019554 (#1 of 1): Dereference after null check (FORWARD_NULL)
 13. var_deref_model: Passing null pointer "straydn" to function
 "MDSCacheObject::is_auth() const", which dereferences it.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-03-12 20:09:22 +01:00
Sage Weil
f6b3a0b741 Merge pull request #1423 from fractalcat/respect-python-env
Update Python hashbang to respect environment
2014-03-12 12:05:05 -07:00
Sage Weil
45f54539c6 Merge pull request #1434 from ceph/wip-7695
build-doc: fix checks for required commands for non-debian

Reviewed-by: Sage Weil <sage@inktank.com>
2014-03-12 11:57:46 -07:00
Sage Weil
bbc228fc86 Merge pull request #1438 from fghaas/doc-fix
doc: Add "nearest power of two" to PG rule-of-thumb

Reviewed-by: Sage Weil <sage@inktank.com>
2014-03-12 11:55:13 -07:00
Florian Haas
047287afbe doc: Add "nearest power of two" to PG rule-of-thumb
Following an IRC discussion, it emerged that it would be helpful
to explain the merit of choosing a number of PGs per pool that is
a power of two, to keep PGs at roughly equal sizes in case of
PG splits.

See http://irclogs.ceph.widodh.nl/index.php?date=2014-03-12 for the
original discussion.

Signed-off-by: Florian Haas <florian@hastexo.com>
2014-03-12 19:51:27 +01:00
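The "nearest power of two" wording from 047287afbe can be made concrete with a small helper. This is only a sketch: `nearest_pow2` is a hypothetical name, and rounding ties upward is an assumption the commit message does not specify.

```c
#include <assert.h>
#include <limits.h>

/* Return the power of two closest to n; ties round up.
 * Requires 1 <= n <= ULONG_MAX / 2 so the upper candidate cannot overflow. */
static unsigned long nearest_pow2(unsigned long n)
{
	unsigned long lower = 1;
	unsigned long upper;

	assert(n >= 1 && n <= ULONG_MAX / 2);
	while (lower <= n / 2)
		lower *= 2;            /* largest power of two <= n */
	if (lower == n)
		return n;              /* already a power of two */
	upper = lower * 2;         /* smallest power of two > n */
	return (n - lower < upper - n) ? lower : upper;
}
```

For example, a raw estimate of 100 PGs becomes 128, and an estimate of 90 becomes 64.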
Danny Al-Gaaf
7bb03598cf OSDMonitor::prepare_pool_op: add missing break in case
CID 1191886 (#1 of 1): Missing break in switch (MISSING_BREAK)
 unterminated_case: This case (value 34) is not terminated by a 'break'
 statement.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-03-12 19:09:38 +01:00
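The MISSING_BREAK class of defect fixed in 7bb03598cf can be shown with a small switch. The enum values and `handle_op` are hypothetical and do not mirror OSDMonitor::prepare_pool_op. Without the first `break`, `OP_CREATE` would fall through into the `OP_DELETE` branch and return 20.

```c
#include <assert.h>

/* Hypothetical operation codes, for illustration only. */
enum pool_op { OP_CREATE = 1, OP_DELETE = 2 };

static int handle_op(int op)
{
	int result;

	switch (op) {
	case OP_CREATE:
		result = 10;
		break;          /* removing this line makes OP_CREATE yield 20 */
	case OP_DELETE:
		result = 20;
		break;
	default:
		result = -1;
		break;
	}
	return result;
}
```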
Sage Weil
bcd41c0fad Merge pull request #1436 from ceph/wip-7681
ECBackend: when removing the temp obj, use the right shard

Reviewed-by: Sage Weil <sage@inktank.com>
2014-03-12 10:46:54 -07:00
Sage Weil
f52af3063e Merge pull request #1437 from ceph/wip-7650
tools/rados/rados.cc: use write_full for sync_write for ec pools

Reviewed-by: Sage Weil <sage@inktank.com>
2014-03-12 10:44:50 -07:00
Samuel Just
a4a91ccdc5 PG: do not wait for flushed before activation
This should reduce the sting of the previous commit somewhat.  We wait
for the activation transactions to clear prior to accepting IO anyway,
so we can go ahead and get that process started without waiting for the
flush.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-12 10:38:21 -07:00
Samuel Just
a576eb3204 PG: do not serve requests until replicas have activated
There are two problems:
1) We choose the min last_update among peers with the max local-les
value as an upper bound on requests which could have been reported to
the client as committed.  We then, for ec pools, roll back to that point
to ensure that we don't inadvertently commit to an update which fewer
than K replicas actually saw.  If the primary sets local-les, accepts an
update from a client, and there is a new interval before any of the
replicas have been activated, we will end up being forced to use that
update which no other replica has seen as the new last_update.  This
will cause the object to become unfound.  We don't have this problem as
long as all active replicas agree on last_update before we accept IO.

2) Even for replicated pools, we would then immediately respond to the
request which created the primary-only update with a commit since it is
in the log and we have no outstanding repops.  If we then lose that
primary before any of the replicas in the new interval record the new
log, we will not only lose the object, but also the log entry recording
it, which will result in a lost write.

For these reasons, it seems like we need to wait for the replicas to
activate before we can process new requests, because whatever update we
select as last_update is essentially regarded as committed as soon as
we accept IO.

Fixes: #7649
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-12 10:38:17 -07:00
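The selection rule in point 1 of a576eb3204 (the minimum last_update among the peers that share the maximum local-les) can be sketched as below. This is a simplification: `struct peer_info` and `committed_upper_bound` are hypothetical, and both fields are reduced to plain integers, which is an assumption about representation only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, flattened view of what a peer reports. */
struct peer_info {
	unsigned les;               /* the peer's local-les value */
	unsigned long last_update;  /* newest update the peer has logged */
};

/* Upper bound on updates that could have been reported as committed:
 * the smallest last_update among peers with the largest les. */
static unsigned long committed_upper_bound(const struct peer_info *peers, size_t n)
{
	unsigned max_les;
	unsigned long bound = 0;
	int found = 0;

	assert(peers != NULL && n > 0);
	max_les = peers[0].les;
	for (size_t i = 1; i < n; i++)
		if (peers[i].les > max_les)
			max_les = peers[i].les;

	for (size_t i = 0; i < n; i++) {
		if (peers[i].les != max_les)
			continue;           /* only peers from the newest interval count */
		if (!found || peers[i].last_update < bound) {
			bound = peers[i].last_update;
			found = 1;
		}
	}
	return bound;
}
```

If the primary is the only peer at the maximum les, its own primary-only update becomes the bound, which is the situation the commit avoids by waiting for replicas to activate before accepting IO.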
Samuel Just
980d2b59e4 ECBackend: when removing the temp obj, use the right shard
Introduced in d0b1094ff7
Fixes: #7681
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-12 10:34:17 -07:00
Samuel Just
dc00661d4b osd_types: print lb if incomplete even if empty
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-12 10:28:43 -07:00
Danny Al-Gaaf
8e76e4e4fa build-doc: fix checks for required commands for non-debian
Fixes: 7695

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-03-12 18:09:59 +01:00
Yehuda Sadeh
85e7f4d8d7 Merge pull request #1412 from ceph/wip-libxfs-flag
FileStore: support compiling without libxfs

Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
2014-03-12 09:50:58 -07:00
Sage Weil
c55da14a3d Merge pull request #1362 from dachary/wip-7548
doc: erasure coded pool developer and operations documentation

Reviewed-by: Sage Weil <sage@inktank.com>
2014-03-11 21:54:02 -07:00
Sage Weil
01a93a2a29 Merge pull request #1425 from ceph/wip-rbd-fuse-enumerate
rbd-fuse: fix enumerate_images() image names buffer size issue

Reviewed-by: Sage Weil <sage@inktank.com>
2014-03-11 21:41:53 -07:00
Sage Weil
9987e486ad Merge pull request #1424 from fractalcat/thread-safety-doc-update
rados_connect not thread-safe when using nss (documentation)

Reviewed-by: Sage Weil <sage@inktank.com>
2014-03-11 21:38:40 -07:00
Sage Weil
e3f8dd03e8 Merge pull request #1419 from ceph/wip-doc-prereq
doc: update build prerequisites
2014-03-11 21:30:07 -07:00
Sage Weil
004bf3b20a Merge pull request #1415 from ceph/wip-build-doc
doc: release-process documentation updates

Reviewed-by: Sage Weil <sage@inktank.com>
2014-03-11 21:29:53 -07:00
Sage Weil
bd2defb9c1 Merge pull request #1409 from enovance/wip-brag
ceph-brag enhancements

Reviewed-by: Sage Weil <sage@inktank.com>
2014-03-11 21:25:25 -07:00
Sage Weil
dc82cd78ae debian: make ceph depend on ceph-common >= 0.67
The older versions of ceph-common (ceph CLI, in particular) can't talk to
newer clusters.  The primary change happened with dumpling when the new
CLI and rest-api changes were made.  Although in reality ceph doesn't
care what version of ceph-common is installed, in practice this forces
ceph-common to get upgraded along with ceph and avoids some user pain.

Fixes: #7641
Signed-off-by: Sage Weil <sage@inktank.com>
2014-03-11 21:22:57 -07:00