By calling the reweight_by_utilization() method, we aim for a more even
distribution of utilization across all osds. To achieve this, we decrease the
weights of osds which are currently overloaded and, where possible, increase
the weights of osds which are currently underloaded.
However, we can't do this all at once, because that would cause massive pg
migration between osds. Thus we introduce a max_osds limit to smooth the process.
The problem here is that we sort the utilization of all osds in descending
order and always decrease the weights of the most overloaded osds first,
since they are the most likely to hit a nearfull/full transition soon, but
we don't start increasing weights from the most underloaded (least utilized)
osds at the same time, which I think is not quite reasonable.
Actually, the best thing would probably be to iterate over the low and high osds
in parallel, and do the ones that are furthest from the average first.
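The selection order described above can be sketched as follows. This is
only an illustration in Python, not the OSDMonitor code: the function name
pick_osds_to_reweight, the dict of utilizations and the sample values are
all hypothetical. It walks the sorted list from both ends and always takes
whichever end is further from the average, stopping at max_osds.

```python
def pick_osds_to_reweight(utilization, max_osds):
    """Return up to max_osds (osd_id, deviation) pairs, furthest from average first.

    utilization is a hypothetical {osd_id: used_fraction} mapping.
    A positive deviation means overloaded, a negative one underloaded.
    """
    if not utilization or max_osds <= 0:
        return []
    average = sum(utilization.values()) / len(utilization)
    # Sort in descending order, as the existing code does.
    ranked = sorted(utilization.items(), key=lambda item: item[1], reverse=True)
    high, low = 0, len(ranked) - 1
    picked = []
    while high <= low and len(picked) < max_osds:
        over = ranked[high][1] - average   # distance of the fullest remaining osd
        under = average - ranked[low][1]   # distance of the emptiest remaining osd
        if over >= under:
            osd, util = ranked[high]
            high += 1
        else:
            osd, util = ranked[low]
            low -= 1
        if util == average:
            break  # everything left sits exactly on the average
        picked.append((osd, util - average))
    return picked


sample = {0: 0.90, 1: 0.50, 2: 0.55, 3: 0.10}
order = [osd for osd, _ in pick_osds_to_reweight(sample, max_osds=2)]
```

With the sample values, osd 3 (the emptiest) is picked before osd 0 (the
fullest), because it is slightly further from the average.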
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
The grace calculation during check_failure() is now very complicated
and time-consuming. Therefore we skip it whenever possible.
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
systemd: make Ceph daemons dependent upon time-sync.target
Reviewed-by: Tim Serong <tserong@suse.com>
Reviewed-by: James Page <james.page@ubuntu.com>
Reviewed-by: Ken Dreyer <kdreyer@redhat.com>
By using single delete, RocksDB should in theory be able to remove deleted
wal entries with only one compaction, once the wal entries land on level0.
This should reduce the write amplification (WAF) caused by bluestore wal entries.
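Why this helps can be shown with a toy model of compaction. This is a
simplified Python illustration, not RocksDB code: the compact() function
and the (seq, op, key) tuples are hypothetical. It assumes the usual rule
that a regular delete tombstone has to survive until the bottommost level,
while a single delete and its matching put cancel out as soon as they meet
in the same compaction.

```python
from collections import defaultdict


def compact(entries, bottommost=False):
    """Toy compaction over (seq, op, key) tuples.

    op is one of "put", "delete" or "single_delete".
    Returns the entries that would still have to be written out.
    """
    by_key = defaultdict(list)
    for entry in entries:
        by_key[entry[2]].append(entry)

    survivors = []
    for versions in by_key.values():
        versions.sort(reverse=True)  # newest sequence number first
        newest = versions[0]
        if newest[1] == "single_delete":
            if len(versions) > 1 and versions[1][1] == "put":
                # The single delete and its put annihilate each other.
                survivors.extend(versions[2:])
            else:
                # Matching put is not part of this compaction yet.
                survivors.extend(versions)
        elif newest[1] == "delete":
            # Older versions are dropped, but the tombstone must be kept
            # unless nothing older can exist below this level.
            if not bottommost:
                survivors.append(newest)
        else:
            survivors.append(newest)  # latest put wins
    return sorted(survivors)


with_single_delete = compact([(1, "put", "wal_1"), (2, "single_delete", "wal_1")])
with_delete = compact([(1, "put", "wal_1"), (2, "delete", "wal_1")])
```

In this model the single delete case leaves nothing behind after one
compaction, while the regular delete still carries a tombstone forward.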
Signed-off-by: Jianjian Huo <jianjian.huo@ssi.samsung.com>
This is useful for log-structured merge (LSM) tree based key value stores, such
as RocksDB, to avoid extra LSM compactions for already deleted key value pairs.
Signed-off-by: Jianjian Huo <jianjian.huo@ssi.samsung.com>
During wal replay, the released extents are released, but those extents are
never recorded. So if a wal_transaction exists, the released extents should
be recorded in bluestore_wal_transaction_t.
Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
SUSE has settled on "ntp-daemon" as the generic package name. The "ntp" and
"chrony" etc. packages have "Provides: ntp-daemon" in their respective spec
files.
References: http://tracker.ceph.com/issues/15419
Signed-off-by: Nathan Cutler <ncutler@suse.com>
Add short descriptions of the keystone admin tenant, user and
password options to the config reference as well. Also add a note
that this applies only to v2 of the OpenStack Identity API.
Signed-off-by: Abhishek Lekshmanan <abhishek@suse.com>
Explain the configuration of `rgw keystone admin user`, tenant and
password, which avoids the need to set the keystone admin token
(shared secret) in the ceph configuration, since it is recommended that
this token be disabled in production environments.
Fixes: #13066, #13519
Signed-off-by: Abhishek Lekshmanan <abhishek@suse.com>
When running install-deps on a minimal system, we hit the following situation:
dpkg-checkbuilddeps --admindir=/tmp/install-deps.5526 debian/control
sh: 1: gcc: not found
dpkg-checkbuilddeps: warning: Couldn't determine gcc system type, falling back to default (native compilation)
dpkg-checkbuilddeps: error: cannot open /tmp/install-deps.5526/status: No such file or directory
This means that we must install gcc before calling dpkg-checkbuilddeps.
Signed-off-by: Erwan Velu <erwan@redhat.com>
Can be tested via "--op writesame". Requests are currently dispatched
*without* a multiplication factor, i.e. data_len == write_len.
Signed-off-by: David Disseldorp <ddiss@suse.de>
Write a buffer a number of times using writesame, read the full range
back, and check that it matches. Do this using rados_aio_writesame(),
ioctx.aio_writesame() and ioctx.aio_operate(op.writesame()).
Signed-off-by: David Disseldorp <ddiss@suse.de>
This adds a new ceph request, writesame, that writes a buffer of length
writesame.data_length bytes repeatedly, starting at writesame.offset and
covering writesame.length bytes.
This command maps to SCSI's WRITE SAME request, so users like LIO+rbd
can pass this to the OSD. Right now, it only saves having to transfer
writesame.length bytes over the network, but future versions will be
able to fully offload it by passing it directly to the FS/devices if they
support it.
v2:
- Fix tabs/spaces to match coding style.
- Allow zero write length. Check for invalid data lengths.
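The semantics above can be sketched in Python against an in-memory buffer.
This is an illustration only, not the OSD implementation: the writesame()
helper and the bytearray standing in for the object are hypothetical, and
the validation rule (the data buffer must be non-empty and the write length
must be a multiple of the data length) is an assumption about what
"invalid data lengths" covers.

```python
def writesame(obj: bytearray, data: bytes, offset: int, length: int) -> None:
    """Fill obj[offset:offset + length] with repeated copies of data.

    obj stands in for the object contents; it is grown with zero bytes
    if the write extends past its current end.
    """
    # Assumed validity rule: non-empty data, length a multiple of len(data).
    if len(data) == 0 or length % len(data) != 0:
        raise ValueError("write length must be a multiple of a non-zero data length")
    if length == 0:
        return  # zero write length is allowed and changes nothing
    end = offset + length
    if end > len(obj):
        obj.extend(b"\0" * (end - len(obj)))
    # Only len(data) bytes are supplied; they are replicated to cover length bytes.
    obj[offset:end] = data * (length // len(data))


target = bytearray(b"." * 10)
writesame(target, b"ab", offset=2, length=6)
```

Here a 2 byte buffer is expanded to cover 6 bytes, which mirrors the
network saving mentioned above: only data_length bytes are sent for a
write of length bytes.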
Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: David Disseldorp <ddiss@suse.de>