RepoMirrors/ceph

mirror of https://github.com/ceph/ceph synced 2025-01-10 21:20:46 +00:00

Author	SHA1	Message	Date
Sage Weil	41364711a6	osd/ReplicatedPG: observed INCOMPLETE_CLONES when doing clone subsets During recovery, we can clone subsets if we know that all clones will be present. We skip this on caching pools because they may not be; do the same when INCOMPLETE_CLONES is set. Signed-off-by: Sage Weil <sage@redhat.com>	2014-07-24 10:07:33 -07:00
Sage Weil	956f28721d	osd/ReplicatedPG: do not complain about missing clones when INCOMPLETE_CLONES is set When scrubbing, do not complain about missing cloens when we are in a caching mode or when the INCOMPLETE_CLONES flag is set. Both are indicators that we may be missing clones and that that is okay. Fixes: #8882 Signed-off-by: Sage Weil <sage@redhat.com>	2014-07-24 10:07:33 -07:00
Sage Weil	54bf055c5d	osd/osd_types: add pg_pool_t FLAG_COMPLETE_CLONES Set a flag on the pg_pool_t when we change cache_mode NONE. This is because object promotion may promote heads without all of the clones, and when we switch the cache_mode back those objects may remain. Do this on any cache_mode change (to or from NONE) to capture legacy pools that were set up before this flag existed. Signed-off-by: Sage Weil <sage@redhat.com>	2014-07-24 10:06:55 -07:00
Sage Weil	67d13d76f5	mon/OSDMonitor: improve no-op cache_mode set check If we have a pending pool value but the cache_mode hasn't changed, this is still a no-op (and we don't need to block). Backport: firefly Signed-off-by: Sage Weil <sage@redhat.com>	2014-07-24 10:06:54 -07:00
Sage Weil	a05a0da3b1	common/random_cache: fix typo Signed-off-by: Sage Weil <sage@redhat.com>	2014-07-23 10:11:59 -07:00
Sage Weil	54ea5c13ac	Merge pull request #2136 from yuyuyu101/fix-randomcache common/RandomCache: Fix inconsistence between contents and count Reviewed-by: Sage Weil <sage@redhat.com>	2014-07-23 09:57:59 -07:00
Haomai Wang	5efdc6236c	common/RandomCache: Fix inconsistence between contents and count The add/clear method may cause count inconsistent with the real size of contents. Signed-off-by: Haomai Wang <haomaiwang@gmail.com>	2014-07-23 11:31:46 +08:00
Josh Durgin	422218a3b3	Merge pull request #2120 from ceph/wip-8858 Wip 8858 Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2014-07-22 16:58:25 -07:00
Gregory Farnum	29401c7e77	Merge pull request #2133 from ceph/wip-8897 os: fix build warnings with name/attr len checks (fixes 8889) Reviewed-by: Greg Farnum <greg@inktank.com>	2014-07-22 15:36:40 -07:00
João Eduardo Luís	d37b2ac1e3	Merge pull request #2128 from ceph/wip-8851 mon: AuthMonitor: always encode full regardless of keyserver having keys Reviewed-by: Gregory Farnum <greg@inktank.com> Reviewed-by: Sage Weil <sage@redhat.com>	2014-07-22 22:10:17 +01:00
Sage Weil	253ca2b902	os: make name/attr max methods unsigned This fixes warnings when we use these in MIN/MAX macros against other unsigned values. Fixes: #8897 Signed-off-by: Sage Weil <sage@redhat.com>	2014-07-22 13:38:32 -07:00
Sage Weil	daac7508d2	os/KeyValueStore: make get_max_object_name_length() sane This is getting the NAME_MAX from the OS, but in reality the backend KV store is the limiter. And for leveldb, there is no real limit. Return 4096 for now. Signed-off-by: Sage Weil <sage@redhat.com>	2014-07-22 13:37:20 -07:00
Josh Durgin	74b386f03e	Merge pull request #2129 from ceph/wip-librbd-oc librbd: reduce cache flush overhead Reviewed-by: Josh Durgin <josh.durgin@inktank.com> Reviewed-by: Haomai Wang <haomaiwang@gmail.com>	2014-07-22 13:33:24 -07:00
Sage Weil	36265d0db0	Merge pull request #2125 from ceph/wip-memstore memstore: a few fixes, and enable the tests! Reviewed-by: Haomai Wang <haomaiwang@gmail.com>	2014-07-22 10:52:40 -07:00
Sage Weil	f7112c5beb	Merge pull request #2105 from rootfs/wip-qa-hadoop-wordcount update hadoop-wordcount test to be able to run on hadoop 2.x. Reviewed-by: Sage Weil <sage@redhat.com>	2014-07-22 08:42:03 -07:00
rootfs	e311a085a8	uncomment cleanup command	2014-07-22 11:31:37 -04:00
Wido den Hollander	d87e5b9f60	powerdns: RADOS Gateway backend for bucket directioning This backend can be used to create one global namespace for multiple RGW regions. Using a CNAME DNS response the traffic is directed towards the RGW region without using HTTP redirects.	2014-07-22 16:51:05 +02:00
Joao Eduardo Luis	b551ae2bce	mon: AuthMonitor: always encode full regardless of keyserver having keys On clusters without cephx, assuming an admin never added a key to the cluster, the monitors have empty key servers. A previous patch had the AuthMonitor not encoding an empty keyserver as a full version. As such, whenever the monitor restarts we will have to read the whole state from disk in the form of incrementals. This poses a problem upon trimming, as we do every now and then: whenever we start the monitor, it will start with an empty keyserver, waiting to be populated from whatever we have on disk. This is performed in update_from_paxos(), and the AuthMonitor's will rely on the keyserver version to decide which incrementals we care about -- basically, all versions > keyserver version. Although we started with an empty keyserver (version 0) and are expecting to read state from disk, in this case it means we will attempt to read version 1 first. If the cluster has been running for a while now, and even if no keys have been added, it's fair to assume that version is greater than 0 (or even 1), as the AuthMonitor also deals and keeps track of auth global ids. As such, we expect to read version 1, then version 2, and so on. If we trim at some point however this will not be possible, as version 1 will not exist -- and we will assert because of that. This is fixed by ensuring the AuthMonitor keeps track of full versions of the key server, even if it's of an empty key server -- it will still keep track of the key server's version, which is incremented each time we update from paxos even if it is empty. Fixes: #8851 Backport: dumpling, firefly Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>	2014-07-22 00:25:37 +01:00
Josh Durgin	34b0efdec7	ObjectCacher: fix bh_{add,remove} dirty_or_tx_bh accounting tx buffers need to go on the bh_lru_rest as well, and removing erases (not inserts) them into dirty_or_tx_bh. Signed-off-by: Josh Durgin <josh.durgin@inktank.com>	2014-07-21 14:13:21 -07:00
Josh Durgin	8a05f1ba0e	ObjectCacher: fix dirty_or_tx_bh logic in bh_set_state() The else-if chain here was wrong. Handling dirty or tx buffers and errors should be in independent conditions. Signed-off-by: Josh Durgin <josh.durgin@inktank.com>	2014-07-21 14:13:11 -07:00
Haomai Wang	d3587419da	Wait tx state buffer in flush_set Signed-off-by: Haomai Wang <haomaiwang@gmail.com>	2014-07-21 13:31:32 -07:00
Haomai Wang	3c7229a2fe	Add rbdcache max dirty object option Librbd will calculate max dirty object according to rbd_cache_max_size, it doesn't suitable for every case. If user set image order 24, the calculating result is too small for reality. It will increase the overhead of trim call which is called each read/write op. Now we make it as option for tunning, by default this value is calculated. Signed-off-by: Haomai Wang <haomaiwang@gmail.com>	2014-07-21 13:31:32 -07:00
Haomai Wang	5cb4b000dd	Reduce ObjectCacher flush overhead Flush op in ObjectCacher will iterate the whole active object set, each dirty object also may own several BufferHead. If the object set is large, it will consume too much time. Use dirty_bh instead to reduce overhead. Now only dirty BufferHead will be checked. Signed-off-by: Haomai Wang <haomaiwang@gmail.com>	2014-07-21 13:31:32 -07:00
Ma, Jianpeng	9061988ec7	osd: init local_connection for fast_dispatch in _send_boot() We were not properly setting up Sessions on the local_connection for fast_dispatch'ed Messages if the cluster_addr was set explicitly: the OSD was not in the dispatch list at bind() time (in ceph_osd.cc), and nothing called it later on. This issue was missed in testing because Inktank only uses unified NICs. That led to errors like the following: When do ec-read, i met a bug which was occured 100%. The messages are: 2014-07-14 10:03:07.318681 7f7654f6e700 -1 osd/OSD.cc: In function 'virtual void OSD::ms_fast_dispatch(Message)' thread 7f7654f6e700 time 2014-07-14 10:03:07.316782 osd/OSD.cc: 5019: FAILED assert(session) ceph version 0.82-585-g79f3f67 (`79f3f67491`) 1: (OSD::ms_fast_dispatch(Message)+0x286) [0x6544b6] 2: (DispatchQueue::fast_dispatch(Message*)+0x56) [0xb059d6] 3: (DispatchQueue::run_local_delivery()+0x6b) [0xb08e0b] 4: (DispatchQueue::LocalDeliveryThread::entry()+0xd) [0xa4a5fd] 5: (()+0x8182) [0x7f7665670182] 6: (clone()+0x6d) [0x7f7663a1130d] NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. To resolve this, we have the OSD invoke ms_handle_fast_connect() explicitly in send_boot(). It's not really an appropriate location, but we're already doing a bunch of messenger twiddling there, so it's acceptable for now. Signed-off-by: Ma Jianpeng <jianpeng.ma@intel.com> Reviewed-by: Greg Farnum <greg@inktank.com>	2014-07-21 13:13:44 -07:00
Sage Weil	c1c5f4b5f5	Merge pull request #2121 from ceph/wip-dencoder limit leveldb linkage; move ceph-dencoder back into ceph-common Reviewed-by: Dan Mick <dan.mick@inktank.com> RGW patch Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>	2014-07-21 13:10:02 -07:00
Sage Weil	27f6dbb64a	Merge pull request #2067 from thorstenb/wip-janitorial-clang-3 [werror] Fix mismatched tags (struct vs. class) inconsistence Reviewed-by: Sage Weil <sage@redhat.com>	2014-07-21 09:08:31 -07:00
Thorsten Behrens	b6f3aff766	Fix mismatched tags (struct vs. class) inconsistency Signed-off-by: Thorsten Behrens <tbehrens@suse.com>	2014-07-21 17:09:17 +02:00
Sage Weil	ff15a43c71	Merge pull request #2111 from ceph/wip-8174 osd: add config for osd_max_object_name_len = 2048 (was hard-coded at 4096) Reviewed-by: Haomai Wang <haomaiwang@gmail.com> and the first patch was Reviewed-by: Samuel Just <sam.just@inktank.com>	2014-07-20 14:21:09 -07:00
Sage Weil	2aa3edcb13	os/FileStore: fix max object name limit Our max object name is not limited by file name size, but by the length of the name we can stuff in an xattr. That will vary from file system to file system, so just make this 4096. In practice, it should be limited via the global tunable, if it is adjusted at all. Signed-off-by: Sage Weil <sage@redhat.com>	2014-07-20 07:48:47 -07:00
Sage Weil	f4bffece8f	ceph_test_objectstore: test memstore Signed-off-by: Sage Weil <sage@redhat.com>	2014-07-19 13:56:07 -07:00
Sage Weil	6f312b0584	os/MemStore: copy attrs on clone Backport: firefly Signed-off-by: Sage Weil <sage@redhat.com>	2014-07-19 13:56:07 -07:00
Sage Weil	8dd6b8f9d8	os/MemStore: fix wrlock ordering checks We can't compare the shared_ptrs themselves; we need to compare the addresses of the actual objects. Signed-off-by: Sage Weil <sage@redhat.com>	2014-07-19 13:56:07 -07:00
Sage Weil	a2594a5472	osd/MemStore: handle collection_move_rename within the same collection Signed-off-by: Sage Weil <sage@redhat.com>	2014-07-19 13:56:07 -07:00
Sage Weil	34671108ce	ceph-dencoder: don't link librgw.la (and rados, etc.) Signed-off-by: Sage Weil <sage@redhat.com>	2014-07-18 22:44:51 -07:00
Sage Weil	b1a641f307	rgw: move a bunch of stuff into rgw_dencoder This will help out ceph-dencoder ... Signed-off-by: Sage Weil <sage@redhat.com>	2014-07-18 22:39:46 -07:00
Sage Weil	1c170776cb	libosd_types, libos_types, libmon_types Signed-off-by: Sage Weil <sage@redhat.com>	2014-07-18 22:33:42 -07:00
Sage Weil	58cc894b32	Revert "ceph.spec: move ceph-dencoder to ceph from ceph-common" This reverts commit `95f5a448b5`.	2014-07-18 20:55:39 -07:00
Sage Weil	f181f78b74	Revert "debian: move ceph-dencoder to ceph from ceph-common" This reverts commit `b37e3bde3b`.	2014-07-18 20:55:35 -07:00
Sage Weil	ad4a4e1346	unittest_osdmap: revert a few broken changes From commit `80ea6067f7`. Signed-off-by: Sage Weil <sage@redhat.com>	2014-07-18 16:51:16 -07:00
Yehuda Sadeh	d7209c1125	rgw: dump prefix unconditionally As part of issue #8858, and to be more in line with S3, dump the Prefix field when listing bucket even if bucket is empty. Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>	2014-07-18 14:56:24 -07:00
Yehuda Sadeh	dc417e477d	rgw: list extra objects to set truncation flag correctly Otherwise we end up returning wrong truncated value, and no data on the next iteration. Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>	2014-07-18 14:56:24 -07:00
Yehuda Sadeh	82d2d612e7	rgw: account common prefixes for MaxKeys in bucket listing To be more in line with the S3 api. Beforehand we didn't account the common prefixes towards the MaxKeys (a single common prefix counts as a single key). Also need to adjust the marker now if it is pointing at a common prefix. Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>	2014-07-18 14:56:23 -07:00
Yehuda Sadeh	924686f0b6	rgw: add NextMarker param for bucket listing Partially fixes #8858. Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>	2014-07-18 14:55:58 -07:00
Wido den Hollander	09a5974fd3	crushtool: Send output to stdout instead of stderr A lot of output was send to stderr instead of stdout and vise versa. Error messages should go to stderr, but all other output to stdout	2014-07-18 20:18:18 +02:00
Gregory Farnum	b9463e3497	Merge pull request #2115 from ceph/wip-8811 Make standby-replay MDSes much more careful about journal formats; both changing them and generally being aware. Reviewed-by: Greg Farnum <greg@inktank.com>	2014-07-18 11:17:52 -07:00
Yehuda Sadeh	e6cf618c25	rgw: improve delmited listing of bucket If found a prefix, calculate a string greater than that so that next request we can skip to that. This is still not the most efficient way to do it. It'll be better to push it down to the objclass, but that'll require a much bigger change. Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>	2014-07-18 10:45:57 -07:00
Yehuda Sadeh	49fc68cf8c	utf8: export encode_utf8() and decode_utf8() Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>	2014-07-18 10:45:57 -07:00
Sage Weil	bd3367eafb	osd: add config for osd_max_attr_name_len = 100 Set a limit on the length of an attr name. The fs can only take 128 bytes, but we were not imposing any limit. Add a test. Reported-by: Haomai Wang <haomaiwang@gmail.com> Signed-off-by: Sage Weil <sage@inktank.com>	2014-07-18 10:44:49 -07:00
Sage Weil	7c0b2a05b9	os: add ObjectStore::get_max_attr_name_length() Most importantly, capture that attrs on FileStore can't be more than about 100 chars. The Linux xattrs can only be 128 chars, but we also have some prefixing we do. Signed-off-by: Sage Weil <sage@redhat.com>	2014-07-18 10:44:05 -07:00
Sage Weil	7e0aca18a0	osd: add config for osd_max_object_name_len = 2048 (was hard-coded at 4096) Previously we had a hard coded limit of 4096. Objects > 3k crash the OSD when running on ext4, although they probably work on xfs. But rgw only generates objects a bit over 1024 bytes (maybe 1200 tops?), so let set a more reasonable limit here. 2048 is a nice round number and should be safe. Add a test. Fixes: #8174 Signed-off-by: Sage Weil <sage@redhat.com>	2014-07-18 10:44:05 -07:00

1 2 3 4 5 ...

34512 Commits