Since we filter monitor addresses based on ms_mode, check that at
least one address was found.
Otherwise, we pass mismatched arguments when calling sysfs/add_single_major,
which emits misleading error messages to dmesg:
libceph: resolve 'name=user1' (ret=-3): failed
libceph: parse_ips bad ip 'name=user1,key=client.user1'
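A minimal sketch of the added guard, assuming the filtered addresses end up
in a mon_addrs string (illustrative names, not the exact krbd code):

    // illustrative sketch, not the exact krbd change
    #include <cerrno>
    #include <iostream>
    #include <string>

    static int validate_mon_addrs(const std::string& mon_addrs) {
      // after filtering by ms_mode, at least one address must remain;
      // otherwise the argument string handed to add_single_major is
      // misaligned and the kernel logs the misleading errors above
      if (mon_addrs.empty()) {
        std::cerr << "rbd: no monitor address found" << std::endl;
        return -ENOENT;
      }
      return 0;
    }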
Fixes: https://tracker.ceph.com/issues/54128
Signed-off-by: Burt Holzman <burt@fnal.gov>
When transitioning an object to the cloud, an early return skipped the
removal of the cloud target. Fix this to happen in the right place, as
sketched below.
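A rough sketch of the reordered control flow, with hypothetical helper
names standing in for the real RGW calls:

    // hypothetical helpers standing in for the real RGW calls
    int do_transition();
    int remove_cloud_target();

    int transition_to_cloud() {
      int ret = do_transition();
      if (ret < 0) {
        return ret;
      }
      // previously an early return fired before this point, so the
      // cloud target was never removed; the removal now runs here
      return remove_cloud_target();
    }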
Signed-off-by: Daniel Gryniewicz <dang@redhat.com>
After https://github.com/ceph/ceph/pull/44059, the monitoring/prometheus
and monitoring/grafana/dashboards directories were changed to
monitoring/ceph-mixins. That broke the shared_folders in the cephadm
bootstrap script.
Changed all the instances of monitoring/prometheus and
monitoring/grafana/dashboards to monitoring/ceph-mixins.
Also renamed all the instances of prometheus_alerts.yaml to
prometheus_alerts.yml.
Fixes: https://tracker.ceph.com/issues/54176
Signed-off-by: Nizamudeen A <nia@redhat.com>
Create a GC thread to clean up the stale tail object data, as sketched
below.
XXX: handle the read + delete use case; a simple approach could be
to use locks or sqlite transactions in the GC
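An illustrative sketch of the thread lifecycle only; the real dbstore GC
scan and schema are not shown:

    // illustrative-only sketch of a background GC thread
    #include <atomic>
    #include <chrono>
    #include <thread>

    class TailObjectGC {
      std::atomic<bool> stopping{false};
      std::thread worker;
    public:
      void start() {
        worker = std::thread([this] {
          while (!stopping.load()) {
            // scan for tail objects no longer referenced by any head
            // object and delete them; per the XXX above, wrapping the
            // read + delete in a lock or an sqlite transaction would
            // keep a concurrent reader from racing the delete
            std::this_thread::sleep_for(std::chrono::seconds(60));
          }
        });
      }
      void stop() {
        stopping = true;
        if (worker.joinable()) worker.join();
      }
    };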
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
Create a unique ID for each object upload, which will be atomically
updated in the head object at the end. This will prevent data corruption
during concurrent writes; a sketch follows below.
In case of multipart uploads, the upload_id is used as the ObjectID.
XXX: The stale or obsolete tail data needs to be deleted
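An illustrative sketch of the scheme (hypothetical types and helpers,
not the dbstore API):

    // illustrative sketch of the commit scheme; not the dbstore API
    #include <mutex>
    #include <random>
    #include <sstream>
    #include <string>

    std::string generate_object_id() {
      static std::mt19937_64 rng{std::random_device{}()};
      std::ostringstream os;
      os << std::hex << rng();
      return os.str();  // multipart uploads reuse upload_id instead
    }

    struct HeadObject {
      std::mutex m;
      std::string object_id;  // names the "live" set of tail objects
    };

    void complete_upload(HeadObject& head, const std::string& new_id) {
      // all tail objects keyed by new_id are fully written before this
      // point; publishing the ID is the single atomic step, so a
      // concurrent writer can replace it but never interleave with it,
      // and the losing upload's tails become stale data for the GC
      std::lock_guard<std::mutex> l(head.m);
      head.object_id = new_id;
    }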
Also addressed invalid usage of CephContext in dbstore tests.
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
Add a new test case that temporarily blocks the network. That case would
make outgoing_bl overflow, so also add an assert checking mechanism to
claim_append, as sketched below.
Use just 2 connections, because we could not generate the large data set
needed to verify it.
Simulate what the EAGAIN situation looks like by skipping the cs.send()
call: EAGAIN would return size 0 and keep the data in outgoing_bl.
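An illustrative sketch of the kind of wrap-around check meant here; the
actual assert added to claim_append may differ:

    // illustrative only: the wrap-around an unbounded outgoing_bl
    // could hit, and an assert that catches it before it silently wraps
    #include <cassert>
    #include <cstdint>

    void checked_append(uint32_t& total_len, uint32_t add_len) {
      assert(static_cast<uint64_t>(total_len) + add_len <= UINT32_MAX);
      total_len += add_len;
    }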
Signed-off-by: Vicente Cheng <vicente_cheng@bigtera.com>
The CopyObject API supports conditional headers, e.g. x-amz-copy-source-if-match, while radosgw misses out the 'source' keyword.
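For reference, a sketch of the distinction (the CGI-style lookup keys are
illustrative):

    // x-amz-copy-source-if-match, -if-none-match, -if-modified-since
    // and -if-unmodified-since are the S3 header names; a lookup key
    // that drops the SOURCE token never matches (illustrative keys)
    const char* wrong_key = "HTTP_X_AMZ_COPY_IF_MATCH";
    const char* right_key = "HTTP_X_AMZ_COPY_SOURCE_IF_MATCH";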
Fixes: https://tracker.ceph.com/issues/53945
Signed-off-by: Wang Hao <wanghao72@baidu.com>
The output of mkfs wasn't previously being included in the OSD's log,
which can make it more difficult to debug issues with mkfs.
ceph-run restarting the OSD every 5 seconds can make it difficult to read
the OSD's stdout.
Signed-off-by: Joseph Sawaya <jsawaya@redhat.com>
For basic, rbd and rbd-nomount subsuites, replace legacy and crc
facets with "legacy or legacy+rxbounce" and "crc or crc+rxbounce"
facets (chosen at random).
For fsx, singleton and thrash subsuites, add legacy+rxbounce and
crc+rxbounce facets and drop prefer-crc facet. The expected behaviour
of the latter depends on cluster configuration and should be tested
separately.
The total number of jobs remains the same.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Issue:
If we provide a random string in the schedule remove command, the entire
schedule at the specified level gets removed.
Fixes: https://tracker.ceph.com/issues/53250
Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
Delay the rgw service creation in the tests until the cluster is
healthy.
Also change the node_ip_offset to 110, due to behavior observed in Jenkins.
Fixes: https://tracker.ceph.com/issues/54030
Signed-off-by: Nizamudeen A <nia@redhat.com>
Replaces the BitmapAllocator used by NCB Recovery code with a dedicated SimpleBitmap.
The SimpleBitmap allows for bits to be set multiple times without any adverse effect.
This is needed because shared-blobs will report the same allocation multiple times.
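A minimal sketch of the idempotent-set property this relies on; not the
actual SimpleBitmap implementation:

    // minimal sketch; not the real SimpleBitmap
    #include <cstddef>
    #include <cstdint>
    #include <vector>

    struct BitmapSketch {
      std::vector<uint64_t> words;
      explicit BitmapSketch(std::size_t bits) : words((bits + 63) / 64, 0) {}
      void set(std::size_t bit) {
        // OR is idempotent: a shared blob reporting the same allocation
        // twice just sets the same bit twice, with no adverse effect
        words[bit / 64] |= uint64_t(1) << (bit % 64);
      }
      bool test(std::size_t bit) const {
        return (words[bit / 64] >> (bit % 64)) & 1;
      }
    };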
Fixes: https://tracker.ceph.com/issues/53678
Signed-off-by: Gabriel Benhanokh <gbenhano@redhat.com>
We weren't previously handling the deallocation of the store when
a realm was reloaded. Now passing a const reference to the pointer.
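A self-contained illustration of the C++ mechanism involved (Store here is
a stand-in, not the rgw::sal type):

    #include <cassert>

    struct Store { };

    int main() {
      Store a, b;
      Store* current = &a;
      Store* const& ref = current;  // what the changed parameter binds to
      current = &b;                 // a realm reload swaps the store
      assert(ref == &b);            // the reference observes the swap;
                                    // a copied Store* would still be &a
      return 0;
    }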
Fixes: https://tracker.ceph.com/issues/54130
Signed-off-by: Cory Snyder <csnyder@iland.com>
Currently, MotrAtomicWriter::cleanup() is called from
MotrAtomicWriter::commit(), which may not be called at all
by rgw in case of an md5 checksum failure.
Solution: call cleanup() from process() when data is zero.
rgw calls Writer::process(data, off) with zero data at the
end of the loop to allow writes to flush the data. From:
src/rgw/rgw_op.cc:RGWPutObj::execute():
      op_ret = filter->process(std::move(data), ofs);
      ...
      ofs += len;
    } while (len > 0);

    // flush any data in filters
    op_ret = filter->process({}, ofs);
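A simplified sketch of the fix (abbreviated, not the verbatim Motr driver
code):

    int MotrAtomicWriter::process(bufferlist&& data, uint64_t offset) {
      if (data.length() == 0) {
        // final flush call shown above: in the real code any buffered
        // data is written out first, then per-upload state is released
        // here, since commit() is never reached on md5 checksum failure
        cleanup();
        return 0;
      }
      // ... accumulate/write the chunk at 'offset' ...
      return 0;
    }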
Signed-off-by: Andriy Tkachuk <andriy.tkachuk@seagate.com>
Reviewed-by: Sining Wu <sining.wu@seagate.com>
The progress module disabled the pg recovery event by default
since the event is expensive and has interrupted other services
when OSDs are being marked in/out of the cluster.
To turn the event on manually:
ceph config set mgr mgr/progress/allow_pg_recovery_event true
Updated qa/tasks/mgr/test_progress.py to enable
the pg recovery event when testing the progress module.
Signed-off-by: Kamoltat <ksirivad@redhat.com>