RepoMirrors/ceph

mirror of https://github.com/ceph/ceph synced 2025-01-31 07:22:56 +00:00

Author	SHA1	Message	Date
Sage Weil	27c2b83a8e	mgr/orchestrator: reformat a few methods Signed-off-by: Sage Weil <sage@newdream.net>	2021-06-01 14:45:11 -04:00
Sage Weil	fb80427ec1	pybind/ceph_argparse: stop parsing when we run out of positional args If we encounter an arg that is not a named flag/arg, and the next item in the command description is non-positional, then raise an 'unexpected argument' exception. Signed-off-by: Sage Weil <sage@newdream.net>	2021-06-01 14:45:11 -04:00
Sage Weil	747eb7d142	pybind/ceph_argparse: remove dead code Signed-off-by: Sage Weil <sage@newdream.net>	2021-06-01 14:45:11 -04:00
Sage Weil	d2e869353e	pybind/mgr/mgr_module: infer non-positional args Once we have an Optional[bool], we can always transition to non-positional, since we never have a non-optional bool. Same goes for the 'format' arg. Signed-off-by: Sage Weil <sage@newdream.net>	2021-06-01 14:45:11 -04:00
Sage Weil	bcd821e1b0	pybind/mgr/mgr_module: add separator for non-positional args Signed-off-by: Sage Weil <sage@newdream.net>	2021-06-01 14:45:11 -04:00
Sage Weil	c441d35bc1	command/cmdparse: use -- to separate positional from non-positional args In a command definition, separate the non-positional args with "--". Signed-off-by: Sage Weil <sage@newdream.net>	2021-06-01 14:45:07 -04:00
Sage Weil	b9a2a71402	pybind/ceph_argparse: adjust help text for non-positional args If an arg is non-positional, always show it as [--arg-name <value>] (All non-positional args are optional.) Signed-off-by: Sage Weil <sage@newdream.net>	2021-06-01 14:39:52 -04:00
Sage Weil	165987c4c9	pybind/ceph_argparse: track a 'positional' property on cli args Signed-off-by: Sage Weil <sage@newdream.net>	2021-06-01 14:39:52 -04:00
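The positional/non-positional handling described in the ceph_argparse commits above can be illustrated with a minimal sketch. This is not the ceph_argparse implementation: the list-of-names definition format and the `split_definition` and `parse` helpers are hypothetical, and only the `--` separator and the 'unexpected argument' behaviour are taken from the commit messages.

```python
from typing import Dict, List

SEPARATOR = "--"  # in a definition, everything after this is non-positional


def split_definition(desc: List[str]) -> Dict[str, List[str]]:
    """Split a (hypothetical) command definition at the '--' separator."""
    if SEPARATOR in desc:
        idx = desc.index(SEPARATOR)
        return {"positional": desc[:idx], "non_positional": desc[idx + 1:]}
    return {"positional": list(desc), "non_positional": []}


def parse(desc: List[str], argv: List[str]) -> Dict[str, str]:
    """Bare words fill positional slots; non-positional args need --name value."""
    spec = split_definition(desc)
    pending = list(spec["positional"])
    result: Dict[str, str] = {}
    i = 0
    while i < len(argv):
        word = argv[i]
        if word.startswith("--"):
            name = word[2:]
            if name not in spec["non_positional"]:
                raise ValueError(f"unknown argument: {word}")
            if i + 1 >= len(argv):
                raise ValueError(f"missing value for {word}")
            result[name] = argv[i + 1]
            i += 2
        elif pending:
            # Still have positional slots left: consume the bare word.
            result[pending.pop(0)] = word
            i += 1
        else:
            # Out of positional slots: a bare word can no longer be matched.
            raise ValueError(f"unexpected argument: {word}")
    return result
```

With the definition `["pool", "--", "format"]`, the input `["rbd", "--format", "json"]` parses, while `["rbd", "json"]` is rejected because `json` cannot be bound to the non-positional `format`. The sketch does not check for missing required positionals.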
Kalpesh	2e0b8a2a1f	qa/tasks: Adding RabbitMQ task for bucket notification tests This commit majorly consists of the RabbitMQ task which is a required and supported endpoint in bucket notification tests. And some related changes in the AMQP tests. Major changes are: 1. Addition of RabbitMQ task 2. Documentation update for the steps to execute AMQP tests 3. Addition of attributes to the tests 4. Tox dependency removal from kafka.py Signed-off-by: Kalpesh Pandya <kapandya@redhat.com>	2021-06-01 23:34:31 +05:30
Ernesto Puerta	34f8e6fd7e	Merge pull request #41421 from s0nea/wip-dashboard-rbd-partially-rm mgr/dashboard: show partially deleted RBDs Reviewed-by: Alfonso Martínez <almartin@redhat.com> Reviewed-by: Avan Thakkar <athakkar@redhat.com> Reviewed-by: Laura Paduano <lpaduano@suse.com> Reviewed-by: Ilya Dryomov <idryomov@redhat.com> Reviewed-by: Tatjana Dehler <tdehler@suse.com> Reviewed-by: Mykola Golub <mgolub@suse.com> Reviewed-by: Volker Theile <vtheile@suse.com>	2021-06-01 19:29:20 +02:00
Samuel Just	7c4a392cfd	Merge pull request #41606 from liu-chunmei/seastore-fix-tracker crimson/seastore: fix assert in read_extent Reviewed-by: Samuel Just <sjust@redhat.com> Reviewed-by: Kefu Chai <kchai@redhat.com>	2021-06-01 08:53:31 -07:00
Ilya Dryomov	d41a998ce1	Merge pull request #41616 from idryomov/wip-rbd-qemu-precise-repos qa/tasks/qemu: precise repos have been archived Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>	2021-06-01 17:48:12 +02:00
Kefu Chai	a061cb683e	Merge pull request #41605 from t-msn/update-podman-detection vstart: detect podman using `command -v` Reviewed-by: Kefu Chai <kchai@redhat.com>	2021-06-01 23:43:59 +08:00
Kefu Chai	f1e822ee47	Merge pull request #41369 from ifed01/wip-ifed-fix-avl-enospc2 os/bluestore: fix unexpected ENOSPC in Avl/Hybrid allocators. Reviewed-by: Adam Kupczyk <akupczyk@redhat.com> Reviewed-by: Kefu Chai <kchai@redhat.com>	2021-06-01 23:00:57 +08:00
Ernesto Puerta	1b312db505	Merge pull request #41395 from rhcs-dashboard/fix-50855-master mgr/dashboard: API Version changes do not apply to pre-defined methods (list, create etc.) Reviewed-by: Aashish Sharma <aasharma@redhat.com> Reviewed-by: Alfonso Martínez <almartin@redhat.com> Reviewed-by: Avan Thakkar <athakkar@redhat.com> Reviewed-by: Ernesto Puerta <epuertat@redhat.com> Reviewed-by: Nizamudeen A <nia@redhat.com>	2021-06-01 16:28:48 +02:00
Ernesto Puerta	4f5f95396f	Merge pull request #41598 from rhcs-dashboard/fix-51026-master mgr/dashboard: pass Grafana datasource in URL Reviewed-by: Avan Thakkar <athakkar@redhat.com> Reviewed-by: Ernesto Puerta <epuertat@redhat.com> Reviewed-by: Guillaume Abrioux <gabrioux@redhat.com> Reviewed-by: Nizamudeen A <nia@redhat.com>	2021-06-01 16:28:07 +02:00
Ernesto Puerta	a0cc81e196	Merge pull request #41184 from rhcs-dashboard/fix-base-href mgr/dashboard: fix base-href Reviewed-by: Aashish Sharma <aasharma@redhat.com> Reviewed-by: Avan Thakkar <athakkar@redhat.com> Reviewed-by: Ernesto Puerta <epuertat@redhat.com>	2021-06-01 16:26:30 +02:00
Sage Weil	091c50cde7	Merge PR #41601 into master * refs/pull/41601/head: doc/foundation: remove amihan Reviewed-by: Mike Perez <miperez@redhat.com>	2021-06-01 09:46:33 -04:00
Igor Fedotov	0eed13a496	os/bluestore: fix unexpected ENOSPC in Avl/Hybrid allocators. Avl allocator mode was returning unexpected ENOSPC in first-fit mode if all size-matching available extents were unaligned but applying the alignment made all of them shorter than required. Since there was no lookup retry with a smaller size, ENOSPC was returned. Additionally we should proceed with a lookup in best-fit mode even when original size has been truncated to match the avail size. (force_range_size_alloc==true) Fixes: https://tracker.ceph.com/issues/50656 Signed-off-by: Igor Fedotov <ifedotov@suse.com>	2021-06-01 16:44:21 +03:00
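The failure mode in the allocator commit above can be shown with a toy first-fit search. This is a simplified model, not BlueStore's Avl allocator: the extent list, the function names and the halving retry are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Extent = Tuple[int, int]  # (offset, length) of a free range


def align_up(value: int, align: int) -> int:
    """Round value up to the next multiple of align."""
    return -(-value // align) * align


def first_fit(extents: List[Extent], want: int, align: int) -> Optional[Extent]:
    """Return the first aligned (start, want) range that fits, else None."""
    for offset, length in extents:
        if length < want:
            continue  # extent is too small even before alignment
        start = align_up(offset, align)
        usable = offset + length - start
        if usable >= want:
            return (start, want)
    return None  # the caller would report ENOSPC here


def first_fit_with_retry(extents: List[Extent], want: int,
                         align: int) -> Optional[Extent]:
    """Hypothetical remedy: retry with smaller sizes before giving up."""
    size = want
    while size >= align:
        found = first_fit(extents, size, align)
        if found is not None:
            return found
        size //= 2
    return None
```

For a single free extent `(0x1001, 0x2000)` with 4 KiB alignment, a request for `0x2000` bytes fails in `first_fit` because alignment leaves only `0x1001` usable bytes, even though the raw extent matched the size. The retry variant still returns a smaller aligned range.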
Casey Bodley	023dcc1952	Merge pull request #41470 from a16bitsysop/rgw_string.h rgw/rgw_string.h: add missing includes for alpine and boost 1.75 Reviewed-by: Kefu Chai <kchai@redhat.com> Reviewed-by: Casey Bodley <cbodley@redhat.com>	2021-06-01 08:28:04 -04:00
Kefu Chai	a687e68de9	Merge pull request #41591 from tchaikov/wip-mgr-selftest-repl pybind/mgr/selftest: add "mgr self-test eval" command Reviewed-by: Ernesto Puerta <epuertat@redhat.com>	2021-06-01 19:43:34 +08:00
Kefu Chai	ad780d1dfe	Merge pull request #41603 from rzarzynski/wip-crimson-fix-use-after-free-alienstore-get_attr crimson/os: fix use-after-free in AlienStore::get_attr(). Reviewed-by: Kefu Chai <kchai@redhat.com>	2021-06-01 19:28:06 +08:00
Ilya Dryomov	dcd193c35e	qa/tasks/qemu: precise repos have been archived Fixes: https://tracker.ceph.com/issues/51033 Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-06-01 12:54:16 +02:00
Cory Snyder	b4316d257e	mgr/DaemonServer.cc: prevent integer underflow that is triggered by large increases to pg_num/pgp_num This fixes a scenario where mgrs continually crash while attempting to apply large increases to pg_num/pgp_num. The max step size (estmax) for each incremental update to the pgp_num is calculated as a percentage of the pg_num, which permits the possibility for the max step size (estmax) to be greater than the current pgp_num when the increase is large; this causes an integer underflow when the max step size is subtracted from the pgp_num in order to calculate the next step size with std::clamp. The integer underflow causes hi < lo in args passed to std::clamp, which causes a failed assertion, SIGABRT, and ultimately crashing mgr. Fixes: https://tracker.ceph.com/issues/47738 Signed-off-by: Cory Snyder <csnyder@iland.com>	2021-06-01 05:34:47 -04:00
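The underflow described in the commit above can be reproduced in a small model. This sketch assumes 32-bit unsigned arithmetic and uses hypothetical helper names; it mirrors the description in the commit message, not the actual code in mgr/DaemonServer.cc, and the guard shown is only one possible way to avoid the wraparound.

```python
U32_MASK = 0xFFFFFFFF  # assumption: the counters behave like 32-bit unsigned ints


def u32_sub(a: int, b: int) -> int:
    """Subtract with 32-bit unsigned wraparound, as C++ unsigned types do."""
    return (a - b) & U32_MASK


def clamp(value: int, lo: int, hi: int) -> int:
    """Like std::clamp, but fail loudly on the lo > hi precondition."""
    if lo > hi:
        raise ValueError(f"clamp called with lo ({lo}) > hi ({hi})")
    return max(lo, min(value, hi))


def next_pgp_num_naive(target: int, pgp_num: int, estmax: int) -> int:
    # Wraps to a huge number when estmax > pgp_num, so lo ends up above hi.
    lo = u32_sub(pgp_num, estmax)
    hi = pgp_num + estmax
    return clamp(target, lo, hi)


def next_pgp_num_guarded(target: int, pgp_num: int, estmax: int) -> int:
    # Saturate at zero instead of wrapping around.
    lo = pgp_num - estmax if pgp_num > estmax else 0
    hi = pgp_num + estmax
    return clamp(target, lo, hi)
```

With `pgp_num=32` and `estmax=100`, the naive version computes a lower bound of 4294967228 and trips the precondition check, which corresponds to the failed assertion in the commit message. The guarded version steps to 132.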
Sebastian Wagner	9d3f5ae9c2	Merge pull request #41595 from zdover23/wip-doc-cephadm-serv-man-daemon-status-2021-05-30 doc/cephadm: enriching "daemon status" Reviewed-by: Sebastian Wagner <sewagner@redhat.com>	2021-06-01 11:30:10 +02:00
Sebastian Wagner	67efe92394	Merge pull request #41608 from zdover23/wip-doc-cephadm-serv-man-service-spec-2021-05-30 doc/cephadm: enriching "Service Specification" Reviewed-by: Sebastian Wagner <sewagner@redhat.com>	2021-06-01 11:29:12 +02:00
Radoslaw Zarzynski	c63a78131f	crimson/os: fix formatting in AlienStore::get_attr(). Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>	2021-06-01 09:20:51 +00:00
Radoslaw Zarzynski	1b7539ee13	crimson/os: fix use-after-free in AlienStore::get_attr(). The `FuturizedStore` interface requires that `get_attr()` take the `name` parameter as `std::string_view`, and thus burdens implementations with extending the lifetime of the data the instance refers to. Unfortunately, `AlienStore` is unaware that prolonging the life of a `std::string_view` instance doesn't prolong the data memory it points to. This problem has manifested in the following use-after-free detected at Sepia: ``` rzarzynski@teuthology:/home/teuthworker/archive/rzarzynski-2021-05-26_12:20:26-rados-master-distro-basic-smithi/6136929$ less ./remote/smithi194/log/ceph-osd.7.log.gz ... DEBUG 2021-05-26 20:24:54,077 [shard 0] osd - do_osd_ops_execute: object 14:55e1a5b4:test-rados-api-smithi067-38889-2::foo:head - handling op call DEBUG 2021-05-26 20:24:54,077 [shard 0] osd - handling op call on object 14:55e1a5b4:test-rados-api-smithi067-38889-2::foo:head DEBUG 2021-05-26 20:24:54,078 [shard 0] osd - calling method lock.lock, num_read=0, num_write=0 DEBUG 2021-05-26 20:24:54,078 [shard 0] osd - handling op getxattr on object 14:55e1a5b4:test-rados-api-smithi067-38889-2::foo:head DEBUG 2021-05-26 20:24:54,078 [shard 0] osd - getxattr on obj=14:55e1a5b4:test-rados-api-smithi067-38889-2::foo:head for attr=_lock.TestLockPP1 DEBUG 2021-05-26 20:24:54,078 [shard 0] bluestore - get_attr ================================================================= ==34068==ERROR: AddressSanitizer: heap-use-after-free on address 0x6030001851d0 at pc 0x7f824d6a5b27 bp 0x7f822b4201c0 sp 0x7f822b41f968 READ of size 17 at 0x6030001851d0 thread T28 (alien-store-tp) ... 
#0 0x7f824d6a5b26 (/lib64/libasan.so.5+0x40b26) #1 0x55e2cbb2e00b (/usr/bin/ceph-osd+0x2b6dc00b) #2 0x55e2d31f086e (/usr/bin/ceph-osd+0x32d9e86e) #3 0x55e2d3467607 in crimson::os::ThreadPool::loop(std::chrono::duration<long, std::ratio<1l, 1000l> >, unsigned long) (/usr/bin/ceph-osd+0x33015607) #4 0x55e2d346b14a (/usr/bin/ceph-osd+0x3301914a) #5 0x7f8249d32ba2 (/lib64/libstdc++.so.6+0xc2ba2) #6 0x7f824a00d149 in start_thread (/lib64/libpthread.so.0+0x8149) #7 0x7f82486edf22 in clone (/lib64/libc.so.6+0xfcf22) 0x6030001851d0 is located 0 bytes inside of 31-byte region [0x6030001851d0,0x6030001851ef) freed by thread T0 here: #0 0x7f824d757688 in operator delete(void*) (/lib64/libasan.so.5+0xf2688) previously allocated by thread T0 here: #0 0x7f824d7567b0 in operator new(unsigned long) (/lib64/libasan.so.5+0xf17b0) Thread T28 (alien-store-tp) created by T0 here: #0 0x7f824d6b7ea3 in __interceptor_pthread_create (/lib64/libasan.so.5+0x52ea3) SUMMARY: AddressSanitizer: heap-use-after-free (/lib64/libasan.so.5+0x40b26) Shadow bytes around the buggy address: 0x0c06800289e0: fd fd fd fa fa fa fd fd fd fa fa fa 00 00 00 fa 0x0c06800289f0: fa fa fd fd fd fa fa fa fd fd fd fa fa fa fd fd 0x0c0680028a00: fd fa fa fa fd fd fd fa fa fa fd fd fd fa fa fa 0x0c0680028a10: fd fd fd fa fa fa fd fd fd fa fa fa fd fd fd fa 0x0c0680028a20: fa fa fd fd fd fa fa fa fd fd fd fa fa fa fd fd =>0x0c0680028a30: fd fd fa fa fd fd fd fd fa fa[fd]fd fd fd fa fa 0x0c0680028a40: fd fd fd fd fa fa fd fd fd fd fa fa 00 00 00 07 0x0c0680028a50: fa fa 00 00 00 fa fa fa 00 00 00 fa fa fa fd fd 0x0c0680028a60: fd fd fa fa fd fd fd fd fa fa fd fd fd fd fa fa 0x0c0680028a70: 00 00 00 00 fa fa fd fd fd fd fa fa fd fd fd fd 0x0c0680028a80: fa fa fd fd fd fd fa fa fd fd fd fd fa fa fd fd Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack 
right redzone: f3 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Container overflow: fc Array cookie: ac Intra object redzone: bb ASan internal: fe Left alloca redzone: ca Right alloca redzone: cb ==34068==ABORTING ``` Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>	2021-06-01 09:19:40 +00:00
Tatjana Dehler	d83c277ac1	mgr/dashboard: show partially deleted RBDs An RBD might be partially deleted if the deletion process has been started but was interrupted. In this case return the RBD as part of the RBD list and mark it as partially deleted. Fixes: https://tracker.ceph.com/issues/48603 Signed-off-by: Tatjana Dehler <tdehler@suse.com>	2021-06-01 10:29:50 +02:00
Kefu Chai	c566181e75	Merge pull request #41607 from liu-chunmei/seastore-cleanup-lba-get-mapping crimson/seastore: cleanup lba manager get_mappings Reviewed-by: Kefu Chai <kchai@redhat.com>	2021-06-01 16:18:00 +08:00
Kefu Chai	b15baca97b	Merge pull request #41597 from rhcs-dashboard/remove-promtool-script test,cmake: remove run-promtool-unitests.sh script Reviewed-by: Willem Jan Withagen <wjw@digiware.nl> Reviewed-by: Kefu Chai <kchai@redhat.com>	2021-06-01 16:17:07 +08:00
chunmei-liu	e81193b648	crimson/seastore: cleanup lba manager get_mappings Signed-off-by: chunmei-liu <chunmei.liu@intel.com>	2021-05-31 23:44:57 -07:00
chunmei-liu	b127fa3cdd	crimson/seastore: fix assert in read_extent The lba btree root leaf is empty after an osd reboot because SegmentStateTracker's states are wrong, which is caused by tracker->do_write not having finished when seastore closed. As a result, read_extent in the transaction manager can't read the extent and hits ceph_assert(0 == "Should be impossible"); Signed-off-by: chunmei-liu <chunmei.liu@intel.com>	2021-05-31 22:59:31 -07:00
Misono Tomohiro	32964b3f64	osd/ECBackend: Fix null pointer dereference when enabling jaeger tracing As comment in header says client_op might be null, we need to check it first before accessing client_op->osd_parent_span. Fixes: #51030 Signed-off-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com>	2021-06-01 14:59:07 +09:00
Aashish Sharma	479d738be1	test,cmake:remove run-promtool-unitests.sh script This PR intends to remove the run-promtool-unittests.sh script as CMakeLists.txt handles the promtool execution (also adding the description to run these tests in Readme.md) Signed-off-by: Aashish Sharma <aasharma@redhat.com>	2021-06-01 11:15:27 +05:30
Aashish Sharma	dc4becfde8	mgr/dashboard: API Version changes do not apply to pre-defined methods (list, create etc.) Methods like list(), create(), get() etc. don't get the version applied. Also, for the endpoints that get the version changed, the docs and the request header still have the version v1.0+ in them. So with the version reduced, it gives a 415 error when trying to make the request. This PR fixes this issue. Fixes: https://tracker.ceph.com/issues/50855 Signed-off-by: Aashish Sharma <aasharma@redhat.com>	2021-06-01 10:39:24 +05:30
Kefu Chai	2f1dd0ce9f	pybind/mgr/selftest: add "mgr self-test eval" command and a simple REPL client allowing developer to peek and poke the selftest module. if this turns out to be useful, we can promote this method into a dedicated mix-in class, so other module can use it if developer wants to test it manually. Signed-off-by: Kefu Chai <kchai@redhat.com>	2021-06-01 11:03:10 +08:00
Mykola Golub	109f0b3c05	Merge pull request #41514 from ideepika/wip-49592-upgrade qa/upgrade: conditionally disable update_features tests Reviewed-by: Kefu Chai <kchai@redhat.com> Reviewed-by: Mykola Golub <mgolub@suse.com>	2021-05-31 19:34:53 +03:00
Sage Weil	f4585775ca	doc/foundation: remove amihan Signed-off-by: Sage Weil <sage@newdream.net>	2021-05-31 11:26:01 -05:00
Ernesto Puerta	957c9c304b	mgr/dashboard: pass Grafana datasource in URL PR https://github.com/ceph/ceph/pull/24314 added support for specifying the Grafana datasource via $datasource template variable, but this hadn't been used from the Dashboard side so far. As per https://grafana.com/docs/grafana/latest/variables/#templates, by adding `var-datasource=Dashboard1`, Dashboard can specify the datasource. Fixes: https://tracker.ceph.com/issues/51026 Signed-off-by: Ernesto Puerta <epuertat@redhat.com>	2021-05-31 16:19:44 +02:00
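The `var-datasource` query parameter mentioned in the commit above can be appended to a Grafana URL as in the following sketch. The base URL and the `with_datasource` helper are hypothetical. Only the parameter name and the example value `Dashboard1` come from the commit message, and this is not the Dashboard's own code.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_datasource(url: str, datasource: str) -> str:
    """Return url with var-datasource set, replacing any existing value."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != "var-datasource"]
    query.append(("var-datasource", datasource))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical Grafana dashboard URL, used only for illustration.
print(with_datasource("https://grafana.example/d/abc/overview?refresh=30s",
                      "Dashboard1"))
```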
Kefu Chai	1df55c2378	Merge pull request #41589 from tchaikov/wip-crimson-start-up-error crimson: handle startup failures properly Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>	2021-05-31 20:07:33 +08:00
Kefu Chai	703545c595	crimson/os/alienstore: do not cleanup if not started There is a chance that the stop() and umount() methods get called in the error handling path even if start() was never called. In that case, just make these methods no-ops, so that the OSD behaves correctly. Signed-off-by: Kefu Chai <kchai@redhat.com>	2021-05-31 20:06:40 +08:00
Kefu Chai	cc59c82483	crimson/os/alienstore: create tp in AlienStore::start() thread pool is not needed until AlienStore::start(). with this change, we are able to tell if the AlienStore is actually started or not in AlienStore::stop(). as seastar::sharded<Service> start a service in two phases: 1. construct the shard instances 2. actually start them and it stops a service in a single shot, which both stops the services and destructs the service instance(s). so we have to implement a proper stop() method for services whose start() might not be called after its instance is created by seastar::sharded<Service>::start() in case of error handling or if we just don't want to call start(). to ensure we can skip the steps to clean up the stuff created by start(), we need to have a flag in the sharded service, because AlienStore is a member variable of OSD, and when we do mkfs, AlienStore is not start()'ed, and as explained above, we have to call OSD::stop() to ensure OSD instance is destructed properly. but OSD::stop() calls store->umount() and store->stop() unconditionally. these methods in AlienStore rely on a functional thread pool. fortunately, we don't need to call these methods if the store is never mounted or started. in a case of failed "mkfs", store is not mounted at all but the store and osd instances are created. so, in this change, thread pool is created in AlienStore::start(), and we will use it to tell if AlienStore is started or not in the following change which makes the related method no-op if AlienStore is not started yet. also, postpone the creation of `store` until in AlienStore::start(), so we don't need to destroy it in the dtor of AlienStore. otherwise, BlueStore::~BlueStore() would need to reference resources which are only available in alien threads, but when OSD::~OSD() is called, we are in seastar's reactor. Signed-off-by: Kefu Chai <kchai@redhat.com>	2021-05-31 20:06:40 +08:00
Kefu Chai	d4671c2ff9	crimson/osd/main: always stop osd as long as it started otherwise the sharded_service's dtor complains if we destruct it without stopping it first, like: FATAL: startup failed: std::system_error (error crimson::net:3, negotiation failure) crimson-osd: ../src/seastar/include/seastar/core/sharded.hh:523: seastar::sharded<T>::~sharded() [with Service = crimson::osd::OSD]: Assertion `_instances.empty()' failed. Aborting on shard 0. Signed-off-by: Kefu Chai <kchai@redhat.com>	2021-05-31 20:06:40 +08:00
Kefu Chai	37b83f4ed7	crimson/osd/main: do cleanup using defer() since we do the startup in a seastar thread, we have the luxury of doing cleanup using the RAII machinery. Signed-off-by: Kefu Chai <kchai@redhat.com>	2021-05-31 20:06:22 +08:00
Kefu Chai	a6314f1542	crimson/osd/main: catch exception thrown in the async() call * use seastar::app_template::run() instead of seastar::app_template::run_deprecated() for returning int, instead of returning `void`. so the application can return int explicitly in the continuation passed to run(). more readable this way. * wrap the whole block in run() in a giant try-catch block, so the exceptions thrown by the startup code can be captured and handled. * do not capture the exceptions individually, in the try-catch block anymore. the outer catch block takes care of them. this change improves the error handling when crimson-osd launches. Signed-off-by: Kefu Chai <kchai@redhat.com>	2021-05-31 20:05:52 +08:00
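The startup error handling described in the crimson commits above follows a general pattern: register cleanup up front, catch startup failures in one outer block, and make `stop()` a no-op when `start()` never ran. The sketch below shows that pattern in Python using `contextlib.ExitStack` as a stand-in for seastar's `defer()`. The `Store` class and `run` function are hypothetical and do not correspond to the crimson code.

```python
from contextlib import ExitStack
from typing import List, Tuple


class Store:
    """Toy service whose resources are only created in start()."""

    def __init__(self) -> None:
        self.pool = None  # stands in for a thread pool; created in start()
        self.log: List[str] = []

    def start(self) -> None:
        self.pool = object()
        self.log.append("start")

    def stop(self) -> None:
        if self.pool is None:
            return  # never started: nothing to clean up
        self.pool = None
        self.log.append("stop")


def run(fail_before_start: bool) -> Tuple[int, List[str]]:
    """Return (exit code, lifecycle log) for one startup attempt."""
    store = Store()
    try:
        with ExitStack() as cleanup:
            # Registered first, so it runs on every exit path.
            cleanup.callback(store.stop)
            if fail_before_start:
                raise RuntimeError("startup failed")
            store.start()
        return 0, store.log
    except RuntimeError:
        # A single outer handler turns any startup failure into an exit code.
        return 1, store.log
```

`run(True)` returns `(1, [])`: the cleanup callback still fires, but `stop()` does nothing because the store was never started. `run(False)` returns `(0, ["start", "stop"])`.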
Misono Tomohiro	571a9e6d53	vstart: update podman detection Since it is possible there is no podman process running when launching vstart, use 'command -v' instead of 'pgrep -f'. Signed-off-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com>	2021-05-31 21:00:10 +09:00
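The distinction in the commit above is between "is a process running" (`pgrep -f`) and "is the executable installed" (`command -v`). A rough Python analogue of the second check, shown here only to illustrate the idea and not taken from vstart, uses `shutil.which`, which searches `PATH` for an executable. The `tool_installed` helper is a hypothetical name.

```python
import shutil


def tool_installed(name: str) -> bool:
    """True if an executable called name is found on PATH.

    Like `command -v`, this does not require the tool to be running.
    """
    return shutil.which(name) is not None


if __name__ == "__main__":
    runtime = "podman" if tool_installed("podman") else "docker"
    print(f"container runtime candidate: {runtime}")
```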
Deepika	9c0b239d70	qa/upgrade: conditionally disable update_features tests With the recent support for async rbd operations from pacific+, when an older client (without async support) is being upgraded and simultaneously interacts with a newer client which expects the requests to be async, a hang occurs: the return code for request completion is taken as the acknowledgement of the async request, and the client then keeps waiting for another acknowledgement of request completion. This should be rare, occurring only when the lock owner is an old client, and should be deferred if compatibility issues arise. see also: 541230475d3b25ab18c4eb9bc5011060462594a6(octopus) Signed-off-by: Deepika <dupadhya@redhat.com>	2021-05-31 16:46:31 +05:30
Ilya Dryomov	16d9a68a3e	librbd: don't stop at the first unremovable image when purging As there is no inherent ordering, there may be multiple removable images past the unremovable image. On top of that, removing a clone may make its parent removable so perform an additional pass if any image gets removed. Fixes: https://tracker.ceph.com/issues/51021 Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-05-31 11:44:47 +02:00
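The purge behaviour described in the commit above (skip unremovable images instead of stopping, and make another pass whenever something was removed) can be modelled as below. This is a toy model, not librbd: the `images` mapping from image name to parent name, the `unremovable` set and the `purge` function are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Set


def purge(images: Dict[str, Optional[str]], unremovable: Set[str]) -> List[str]:
    """Remove every image that can be removed; return names in removal order.

    images maps an image name to its parent's name (None if it has no parent).
    An image with remaining clones cannot be removed until they are gone.
    """
    remaining = dict(images)
    removed: List[str] = []
    progress = True
    while progress:  # keep making passes while the last pass removed something
        progress = False
        for name in list(remaining):
            has_clones = any(parent == name for parent in remaining.values())
            if name in unremovable or has_clones:
                continue  # skip this image, but keep scanning the rest
            del remaining[name]
            removed.append(name)
            progress = True
    return removed
```

For `{"parent": None, "busy": None, "clone": "parent"}` with `busy` unremovable, the first pass removes `clone`, and the second pass removes `parent`, which became removable once its clone was gone. `busy` is left in place without blocking the others.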
Ilya Dryomov	0bcb910217	rbd: combined error message for expected Trash::purge() errors Output to stderr instead of the log where regular users wouldn't see it given the elevated log level. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2021-05-31 11:44:47 +02:00


124015 Commits