Commit Graph

53258 Commits

Author SHA1 Message Date
Kefu Chai
f7331fdc3f Merge pull request #8786 from tchaikov/wip-btrfs-sudo
test: sudo to rm btrfs subvol

Reviewed-by: Erwan Velu <erwan@redhat.com>
2016-05-09 22:38:07 +08:00
Orit Wasserman
367bd0aec9 Merge pull request #8994 from theanalyst/fix/15776
rgw: log name instead of id for SystemMetaObj on failure
2016-05-09 16:36:43 +02:00
Li Wang
e8dc3f06fc librados: fix aio_operate()
ObjectOperationImpl has changed and is no longer equal to ::ObjectOperation*;
this patch fixes the wrong pointer conversion.

Signed-off-by: Li Wang <li.wang@kylin-cloud.com>
2016-05-09 22:31:27 +08:00
Li Wang
aee6c52822 librados: fix operate()
ObjectOperationImpl has changed and is no longer equal to ::ObjectOperation*;
this patch fixes the wrong pointer conversion.

Signed-off-by: Li Wang <li.wang@kylin-cloud.com>
2016-05-09 22:31:05 +08:00
Sage Weil
c27c684b16 Merge pull request #8919 from stiopaa1/log_moveToPrivateOsdService
osd/OSD.h: change some data members to private

Reviewed-by: Sage Weil <sage@redhat.com>
2016-05-09 09:51:19 -04:00
Sage Weil
162b4fb7d0 Merge pull request #8989 from flyd1005/wip-fix-python-warnings
cleanup: python: remove warnings of 'trailing whitespace' and 'new blank line at EOF'
2016-05-09 09:50:46 -04:00
Sage Weil
bd4a3b1b3e Merge pull request #8992 from runsisi/wip-fix-dup-keygen
cls_journal: remove duplicated key generation

Reviewed-by: Sage Weil <sage@redhat.com>
2016-05-09 09:49:54 -04:00
Sage Weil
453852642a Merge pull request #8991 from emenguy/doc_test-reweight-by-utilization
doc: adding test-reweight-by-utilization documentation
2016-05-09 09:48:52 -04:00
Sage Weil
1712dd5fd8 Merge pull request #8984 from stiopaa1/osd_removeUnorderedSet
osd/OSD.h: remove unneeded include file

Reviewed-by: Sage Weil <sage@redhat.com>
2016-05-09 09:38:12 -04:00
Sage Weil
09f572064c Merge pull request #8097 from aclamk/crushtool-pool-id
crushtool: add ability of precise testing of placement group calculation.

Reviewed-by: Sage Weil <sage@redhat.com>
2016-05-09 09:15:24 -04:00
Sage Weil
810a8ca2a3 Merge pull request #8832 from stiopaa1/log_graylogmove
common/Graylog.cc: use std move to avoid copy

Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2016-05-09 09:14:55 -04:00
Sage Weil
c88a931212 Merge pull request #8920 from XinzeChi/wip-race-shutdown
osd: fix deadlock in OSD::_committed_osd_maps

Reviewed-by: Sage Weil <sage@redhat.com>
2016-05-09 09:14:10 -04:00
Sage Weil
4d10cb86bd Merge pull request #8922 from runsisi/wip-fix-lockdep-assert
lockdep: fix assertion expression if ran out of lock ids

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
2016-05-09 09:13:26 -04:00
Sage Weil
c68357f0cf Merge pull request #8967 from liewegas/wip-15760
osdc/Objecter: upper bound watch_check result

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2016-05-09 09:12:49 -04:00
Sage Weil
a28b71e3c9 Merge pull request #8357 from liewegas/wip-osd-prestart
osd: update crush_location from ceph-osd on startup

Reviewed-by: Kefu Chai <kchai@redhat.com>
2016-05-09 08:59:05 -04:00
Sage Weil
0a6d0c61d7 Merge pull request #8194 from tanghaodong25/fix_4
os/bluestore/KernelDevice.cc: small fix in buffer flush

Reviewed-by: Haomai Wang <haomai@xsky.com>
2016-05-09 08:58:33 -04:00
Sage Weil
bfa5461265 Merge pull request #8349 from Aran85/prepare_entry_cleanup
os/filestore: prepare entry cleanup

Reviewed-by: Sage Weil <sage@redhat.com>
2016-05-09 08:57:51 -04:00
Sage Weil
d2cab3a898 osd: create osd id if it does not exist
Most tools create the osd id before trying to start ceph-osd.  Notably,
teuthology does not.  We could fix that, but we would be changing behavior,
as the osd boot will happily create the osd id on the fly for us.  Other
provisioning tools might rely on that behavior.  Instead, just allocate
the id sooner in the process (if necessary).

Signed-off-by: Sage Weil <sage@redhat.com>
2016-05-09 08:55:00 -04:00
Sage Weil
cf4ec5a8aa osd: change osd_crush_initial_weight = 0 to mean weight to 0
Negative now means auto-weight, 0 means weight to 0.  Change the
default accordingly.

Signed-off-by: Sage Weil <sage@redhat.com>
2016-05-09 08:55:00 -04:00
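
The new meaning of osd_crush_initial_weight can be illustrated with a minimal Python sketch (not Ceph's C++ code). The function name initial_crush_weight and the size_based_weight argument are hypothetical; the latter stands in for whatever weight the OSD would otherwise compute for itself.

```python
def initial_crush_weight(configured, size_based_weight):
    """Return the starting CRUSH weight for a newly created OSD.

    configured: the value of osd_crush_initial_weight.
    size_based_weight: hypothetical auto-computed weight (assumption).
    """
    if configured < 0:
        # Negative means "auto-weight": fall back to the computed value.
        return size_based_weight
    # Zero or positive is taken literally, so 0 really means weight 0.
    return float(configured)
```

With this reading, a value of -1 yields the auto-computed weight, while 0 adds the OSD to the CRUSH map with no weight at all.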
Sage Weil
573c349245 osd: update crush_location on startup from ceph-osd
Update the crush location from ceph-osd instead of relying on
kludgey bash in ceph-osd-prestart.sh.  Among other things, this
lets us get accurate statfs information from the ObjectStore
implementation instead of relying on 'df'.

Fixes: http://tracker.ceph.com/issues/15213
Signed-off-by: Sage Weil <sage@redhat.com>
2016-05-09 08:54:44 -04:00
Sage Weil
4587a379a3 osdc/Objecter: use cct's crush_location
Keep the observer so that we refresh our copy of the multimap.

Signed-off-by: Sage Weil <sage@redhat.com>
2016-05-09 08:54:44 -04:00
Sage Weil
db6e9bedf1 global: init crush_location on daemon start
Signed-off-by: Sage Weil <sage@redhat.com>
2016-05-09 08:54:44 -04:00
Sage Weil
3d71eda4b8 common/ceph_context: add CrushLocation to cct
Signed-off-by: Sage Weil <sage@redhat.com>
2016-05-09 08:54:44 -04:00
Sage Weil
6216461915 crush/CrushLocation: add class to manage crush_location
The crush_location can come from an explicitly set config,
a hook, or a simple fabricated default (root=default host=...).

Signed-off-by: Sage Weil <sage@redhat.com>
2016-05-09 08:54:44 -04:00
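
The three sources named in the commit message above can be sketched in Python as follows. This is an illustration, not the CrushLocation class itself: the function name, its parameters, the precedence order (explicit config, then hook, then default), and the use of the short hostname are all assumptions.

```python
import socket
import subprocess


def resolve_crush_location(configured=None, hook=None):
    """Return the CRUSH location as a dict mapping type to name.

    configured: explicit location string, e.g. "root=default host=a".
    hook: optional path to an executable that prints such a string.
    The precedence used here is an assumption of this sketch.
    """
    if configured:
        text = configured
    elif hook:
        # Run the hook and use whatever it prints on stdout.
        text = subprocess.check_output([hook], text=True)
    else:
        # Fabricated default: root=default host=<short hostname>.
        short_host = socket.gethostname().split(".")[0]
        text = f"root=default host={short_host}"
    # Split "key=value key=value" into a dictionary.
    return dict(pair.split("=", 1) for pair in text.split())
```

Calling resolve_crush_location("root=default rack=r1 host=a") returns a dict with the keys root, rack, and host; calling it with no arguments falls back to the fabricated default.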
Sage Weil
7a17863c73 Merge pull request #8684 from vuhuong/wip-radosaio-copyout-data-xio
librados: copy out data to users' buffer for xio

Reviewed-by: Matt Benjamin <mbenjami@redhat.com>
2016-05-09 08:52:56 -04:00
Sage Weil
01a40155a7 Merge pull request #8759 from xiexingguo/xxg-wip-fixobjecterrace
osdc/Objecter: fix race condition for sortbitwise flag detection

Reviewed-by: Sage Weil <sage@redhat.com>
2016-05-09 08:51:30 -04:00
Sage Weil
99295cab83 Merge pull request #8826 from liewegas/wip-bluestore-bitmap-freelist
os/bluestore: bitmap-based freelist using merge operator

Reviewed-by: Allen Samuels <allen.samuels@sandisk.com>
Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
2016-05-09 08:50:23 -04:00
xie xingguo
b29daf05fd osd: fix potential access violation during handling PING_REPLY
Need to make sure i != heartbeat_peers.end() before dereferencing it.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2016-05-09 20:34:06 +08:00
Kefu Chai
369db99308 osd: remove all stale osdmaps in handle_osd_map()
In a large cluster, the OSD is more likely to fail to trim the cached
osdmaps in a timely manner. Sometimes it is simply unable to keep up
with the incoming osdmaps when skip_maps is true, so the osdmap cache
can keep building up to over 250GB in size. In this change:

* publish_superblock() before trimming the osdmaps, so other osdmap
  consumers of OSDService.superblock won't access the osdmaps being
  removed.
* trim all stale osdmaps in batch of conf->osd_target_transaction_size
  if skip_maps is true. in my test, it happens when the osd only
  receives the osdmap from monitor occasionally because the osd happens
  to be chosen when monitor wants to share a new osdmap with a random
  osd.
* always use dedicated transaction(s) for trimming osdmaps, so even in
  the normal case where we are able to trim all stale osdmaps in a
  single batch, a separate transaction is used. we could piggyback
  the commits for removing maps, but we keep it this way for simplicity.
* use std::min() instead of MIN() for type safety

Fixes: http://tracker.ceph.com/issues/13990
Signed-off-by: Kefu Chai <kchai@redhat.com>
2016-05-09 20:07:51 +08:00
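
The batched trimming described in the commit message above can be sketched in Python. This is a simplified illustration, not the handle_osd_map() code: the function name and its parameters are hypothetical, batch_size stands in for osd_target_transaction_size and must be positive, and the publish_superblock() step is omitted.

```python
def trim_stale_osdmaps(oldest, min_keep, batch_size):
    """Yield one list of epochs per transaction, oldest first.

    Epochs in the range [oldest, min_keep) are treated as stale.
    Each yielded batch stands in for one dedicated transaction
    that removes those maps.
    """
    epoch = oldest
    while epoch < min_keep:
        # Cap the batch so a single transaction never grows too large.
        end = min(epoch + batch_size, min_keep)
        yield list(range(epoch, end))
        epoch = end
```

For example, list(trim_stale_osdmaps(1, 8, 3)) produces three batches covering epochs 1 to 7, so no single transaction removes more than three maps.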
Abhishek Lekshmanan
a3fdd26b02 rgw: drop unnecessary spacing in rgw zg init log
Drop the unneeded space when printing the message about failing to
read the zg info.

Signed-off-by: Abhishek Lekshmanan <abhishek@suse.com>
2016-05-09 14:05:06 +02:00
Xinze Chi
cde3c54145 ReplicatedPG: fix fill_in_copy_get logic
Performance degrades when promoting or flushing an object.

Signed-off-by: Xinze Chi <xinze@xsky.com>
2016-05-09 19:26:37 +08:00
Abhishek Lekshmanan
aee1d643be rgw: log name instead of id for SystemMetaObj on failure
Currently, if we fail to read a SystemMetaObj, we try to log the
MetaObject id. However, the id is usually not set because read_id has
failed, so we end up logging an empty id. Change this to log
the object name instead.

Fixes: http://tracker.ceph.com/issues/15776
Signed-off-by: Abhishek Lekshmanan <abhishek@suse.com>
2016-05-09 11:41:19 +02:00
Etienne Menguy
6bab8cf362 adding test-reweight-by-utilization documentation
Signed-off-by: Etienne Menguy <etienne.menguy@corp.ovh.com>
2016-05-09 10:49:47 +02:00
runsisi
715e99c83e cls_journal: remove duplicated key generation
Signed-off-by: runsisi <runsisi@zte.com.cn>
2016-05-09 15:15:31 +08:00
Li Peng
9c2d785382 cleanup: python: remove warnings of 'new blank line at EOF'
When applying ceph patches, some warnings are reported, e.g.
qa/workunits/mon/caps.py:367: new blank line at EOF.

Signed-off-by: Li Peng <lip@dtdream.com>
2016-05-09 11:28:42 +08:00
Li Peng
969c6d464c cleanup: python: remove warnings of 'trailing whitespace'
When applying ceph patches, some warnings are reported, e.g.
doc/scripts/gen_state_diagram.py:99: trailing whitespace.

Signed-off-by: Li Peng <lip@dtdream.com>
2016-05-09 11:25:08 +08:00
Jianpeng Ma
3c1cf727d9 os/filestore/FileJournal: optimize align_bl.
Use is_aligned_size_and_memory in place of is_aligned &&
is_n_align_sized.

Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
2016-05-08 07:17:30 +08:00
Jianpeng Ma
7ca73f4e6e common/buffer: For bufferlist add func is_aligned_size_and_memory.
To meet the direct I/O requirement, we need to check whether each bufferptr
is both size-aligned and address-aligned. Doing so means calling is_aligned &&
is_n_align_sized, and each of those functions walks every ptr of the bufferlist.
To save one walk, add is_aligned_size_and_memory(align_size,
align_memory), which only needs to walk the list once.

Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
2016-05-08 07:17:26 +08:00
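
The single-pass check can be illustrated with a short Python sketch. It is not the bufferlist implementation: the list of (address, length) pairs is a stand-in for the buffer pointers held by a bufferlist, and only the function name and its two alignment parameters come from the commit message.

```python
def is_aligned_size_and_memory(segments, align_size, align_memory):
    """Check every segment's address and length in a single pass.

    segments: iterable of (address, length) pairs, a stand-in for
    the buffer pointers held by a bufferlist (assumption).
    """
    for address, length in segments:
        # The start address must be a multiple of align_memory.
        if address % align_memory != 0:
            return False
        # The length must be a multiple of align_size.
        if length % align_size != 0:
            return False
    return True
```

Both conditions are tested while visiting each segment once, instead of walking the list once for addresses and a second time for lengths.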
Jianpeng Ma
dc0c99eb4f os/bluestore/KernelDevice: optimize rebuild for aio_write.
To meet the direct I/O requirement, the content may need to be rebuilt.
In fact, rebuild_aligned_size_and_memory first checks is_n_align_sized &&
is_aligned and rebuilds only if needed.
So using rebuild_aligned_size_and_memory lets us remove the explicit check.

Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
2016-05-08 07:16:28 +08:00
Jianpeng Ma
c82e08692e common/buffer: Change rebuild_aligned_* & rebuild_page_aligned API.
Change the return type of those functions from void to bool. The return
value tells us whether the content was really rebuilt, which allows
further optimization.

Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
2016-05-08 06:41:24 +08:00
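
A rebuild function that reports whether it did any work might look like the Python sketch below. It is an illustration, not the buffer API: the list of bytes objects stands in for a bufferlist, only size alignment is modelled, and the zero padding of the merged chunk is a simplification made for this sketch.

```python
def rebuild_aligned(chunks, align_size):
    """Coalesce chunks in place when any chunk length is unaligned.

    chunks: list of bytes objects, a stand-in for a bufferlist.
    Returns True if the list was rebuilt, False if left untouched.
    """
    if all(len(c) % align_size == 0 for c in chunks):
        # Already aligned: nothing to do, and the caller can tell.
        return False
    merged = b"".join(chunks)
    # Pad with zero bytes up to the next multiple of align_size
    # (a simplification of this sketch).
    padding = -len(merged) % align_size
    chunks[:] = [merged + b"\0" * padding]
    return True
```

A caller can then skip any follow-up work when the function returns False, rather than checking alignment separately beforehand.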
Jianpeng Ma
3161ab1c07 os/bluestore/BlueFS: remove the duplicate bufferlist.rebuild().
The later BlockDevice::aio_write() rebuilds the bufferlist again if
it needs rebuilding.

Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
2016-05-08 06:41:21 +08:00
Michal Jarzabek
3a5f4c5c89 osd/OSD.h: remove unneeded include file
Signed-off-by: Michal Jarzabek <stiopa@gmail.com>
2016-05-07 13:54:24 +01:00
Jason Dillaman
44827a3e8e librbd: assertion to ensure no concurrent processing of replay events
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2016-05-07 08:23:19 -04:00
Jason Dillaman
3b8d21ac13 journal: suppress notifications if client still in try_pop_front loop
One such example is popping the last entry from an object.  The next
object will be automatically prefetched.  When that object is received,
we do not want to alert the user that entries are available since
try_pop_front already indicated more records were available.

Fixes: http://tracker.ceph.com/issues/15755
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2016-05-07 08:23:19 -04:00
Jason Dillaman
5d205ef33c librbd: delay processing of next journal entry until flush in-progress
When replaying a journal flush event, do not start processing the next
journal entry until after the flush is in progress to ensure the barrier
is correctly guarding against future writes.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2016-05-07 08:23:18 -04:00
Jianpeng Ma
bd5f06702e common/config_opt: add bool option: bluestore_block_preallocate_file.
This option controls whether space is preallocated when the bluestore
block/db_path/wal_path uses a file instead of a block device.

Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
2016-05-07 18:49:53 +08:00
Jianpeng Ma
75b0f7d1ba os/bluestore/BlueStore: preallocate space when use file instead of blockdevice.
Avoid failures caused by lack of space.

Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
2016-05-07 14:56:07 +08:00
Yehuda Sadeh
5d3882d565 test/rgw: fix bucket checkpoint
We can refer to the incremental sync marker only if the bucket is in the
incremental sync state.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2016-05-06 15:32:50 -07:00
Casey Bodley
e2b27c7266 test/rgw: add test_zonegroup_remove
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2016-05-06 16:51:42 -04:00
Casey Bodley
6327ade12b test/rgw: index zones by name instead of insertion order
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2016-05-06 16:51:42 -04:00