Commit Graph

1300 Commits

Author SHA1 Message Date
Adam C. Emerson
8777980283 osd: Build ceph-osd without using namespace declarations in headers
This is part of a series of commits to clean up using namespace at top
level in headers.

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
2020-04-06 11:15:06 -04:00
Chunmei Liu
a54d0a90c0 crimson:common add TOPNSPC namespace for ceph and crimson
Some code coexists in both the crimson seastar environment and the posix
environment, so add a namespace to avoid conflicts between functions of the
same name. For example, add a namespace for CephContext. Because of the new
namespace for classic ceph-osd, all files that declare or use CephContext
need to be modified to include "common_fwd.h", which defines the namespace
for each environment.

Signed-off-by: Chunmei Liu <chunmei.liu@intel.com>
2020-02-27 19:56:29 -08:00
Jianpeng Ma
df3eaf1318 osd/OSD: remove unused func add_map_inc_bl
Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
2020-02-24 10:27:07 +08:00
Jianpeng Ma
1a9b5dc437 osd/OSD: remove unused func add_map_bl
Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
2020-02-24 10:27:07 +08:00
Jianpeng Ma
20ecc688b2 osd/OSD.cc : remove unused funcs.
Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
2020-02-24 10:27:07 +08:00
Radoslaw Zarzynski
80da5f9a98 osd: fix racy accesses to OSD::osdmap.
According to cppreference.com [1]:

  "If multiple threads of execution access the same std::shared_ptr
  object without synchronization and any of those accesses uses
  a non-const member function of shared_ptr then a data race will
  occur (...)"

[1]: https://en.cppreference.com/w/cpp/memory/shared_ptr/atomic
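
The rule quoted above can be illustrated with a minimal sketch. It assumes the atomic free functions for `std::shared_ptr` from `<memory>` (the ones documented at [1]) are used for every access to the shared pointer. `Map`, `MapRef`, and `MapHolder` are hypothetical stand-ins, not the real `OSDMap` or `OSD` classes, and this is not necessarily how the commit itself resolves the race.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for the real OSDMap; only the epoch matters here.
struct Map {
  unsigned epoch;
};

using MapRef = std::shared_ptr<const Map>;

// Hypothetical holder that publishes a map to many reader threads.
class MapHolder {
  MapRef current_;  // only touched through the atomic free functions below

public:
  // Readers take a private copy; the load is atomic with respect to set().
  MapRef get() const {
    return std::atomic_load(&current_);
  }

  // The writer swaps in a new map; the old map is destroyed when the last
  // reader drops its copy.
  void set(MapRef next) {
    assert(next);
    std::atomic_store(&current_, std::move(next));
  }
};
```

A reader that calls `get()` keeps its own reference, so a concurrent `set()` cannot destroy the map or its control block underneath it. These overloads are deprecated in C++20, where `std::atomic<std::shared_ptr<T>>` is the replacement.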

One of the coredumps showed the `shared_ptr`-typed `OSD::osdmap`
with healthy looking content but damaged control block:

  ```
  [Current thread is 1 (Thread 0x7f7dcaf73700 (LWP 205295))]
  (gdb) bt
  #0  0x0000559cb81c3ea0 in ?? ()
  #1  0x0000559c97675b27 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x559cba0ec900) at /usr/include/c++/8/bits/shared_ptr_base.h:148
  #2  std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x559cba0ec900) at /usr/include/c++/8/bits/shared_ptr_base.h:148
  #3  0x0000559c975ef8aa in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=<optimized out>, __in_chrg=<optimized out>) at /usr/include/c++/8/bits/shared_ptr_base.h:1167
  #4  std::__shared_ptr<OSDMap const, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=<optimized out>, __in_chrg=<optimized out>) at /usr/include/c++/8/bits/shared_ptr_base.h:1167
  #5  std::shared_ptr<OSDMap const>::~shared_ptr (this=<optimized out>, __in_chrg=<optimized out>) at /usr/include/c++/8/bits/shared_ptr.h:103
  #6  OSD::create_context (this=<optimized out>) at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/osd/OSD.cc:9053
  #7  0x0000559c97655571 in OSD::dequeue_peering_evt (this=0x559ca22ac000, sdata=0x559ca2ef2900, pg=0x559cb4aa3400, evt=std::shared_ptr<PGPeeringEvent> (use count 2, weak count 0) = {...}, handle=...)
      at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/osd/OSD.cc:9665
  #8  0x0000559c97886db6 in ceph::osd::scheduler::PGPeeringItem::run (this=<optimized out>, osd=<optimized out>, sdata=<optimized out>, pg=..., handle=...) at /usr/include/c++/8/ext/atomicity.h:96
  #9  0x0000559c9764862f in ceph::osd::scheduler::OpSchedulerItem::run (handle=..., pg=..., sdata=<optimized out>, osd=<optimized out>, this=0x7f7dcaf703f0) at /usr/include/c++/8/bits/unique_ptr.h:342
  #10 OSD::ShardedOpWQ::_process (this=<optimized out>, thread_index=<optimized out>, hb=<optimized out>) at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/osd/OSD.cc:10677
  #11 0x0000559c97c76094 in ShardedThreadPool::shardedthreadpool_worker (this=0x559ca22aca28, thread_index=14) at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/common/WorkQueue.cc:311
  #12 0x0000559c97c78cf4 in ShardedThreadPool::WorkThreadSharded::entry (this=<optimized out>) at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/common/WorkQueue.h:706
  #13 0x00007f7df17852de in start_thread () from /lib64/libpthread.so.0
  #14 0x00007f7df052f133 in __libc_ifunc_impl_list () from /lib64/libc.so.6
  #15 0x0000000000000000 in ?? ()
  (gdb) frame 7
  #7  0x0000559c97655571 in OSD::dequeue_peering_evt (this=0x559ca22ac000, sdata=0x559ca2ef2900, pg=0x559cb4aa3400, evt=std::shared_ptr<PGPeeringEvent> (use count 2, weak count 0) = {...}, handle=...)
      at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/osd/OSD.cc:9665
  9665      in /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/osd/OSD.cc
  (gdb) print osdmap
  $24 = std::shared_ptr<const OSDMap> (expired, weak count 0) = {get() = 0x559cba028000}
  (gdb) print *osdmap
     # pretty sane OSDMap
  (gdb) print sizeof(osdmap)
  $26 = 16
  (gdb) x/2a &osdmap
  0x559ca22acef0:   0x559cba028000  0x559cba0ec900

  (gdb) frame 2
  #2  std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x559cba0ec900) at /usr/include/c++/8/bits/shared_ptr_base.h:148
  148       /usr/include/c++/8/bits/shared_ptr_base.h: No such file or directory.
  (gdb) disassemble
  Dump of assembler code for function std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release():
  ...
     0x0000559c97675b1e <+62>:      mov    (%rdi),%rax
     0x0000559c97675b21 <+65>:      mov    %rdi,%rbx
     0x0000559c97675b24 <+68>:      callq  *0x10(%rax)
  => 0x0000559c97675b27 <+71>:      test   %rbp,%rbp
  ...
  End of assembler dump.
  (gdb) info registers rdi rbx rax
  rdi            0x559cba0ec900      94131624790272
  rbx            0x559cba0ec900      94131624790272
  rax            0x559cba0ec8a0      94131624790176
  (gdb) x/a 0x559cba0ec8a0 + 0x10
  0x559cba0ec8b0:   0x559cb81c3ea0
  (gdb) bt
  #0  0x0000559cb81c3ea0 in ?? ()
  ...
  (gdb) p $_siginfo._sifields._sigfault.si_addr
  $27 = (void *) 0x559cb81c3ea0
  ```

Helgrind seems to agree:
  ```
  ==00:00:02:54.519 510301== Possible data race during write of size 8 at 0xF123930 by thread #90
  ==00:00:02:54.519 510301== Locks held: 2, at addresses 0xF122A58 0xF1239A8
  ==00:00:02:54.519 510301==    at 0x7218DD: operator= (shared_ptr_base.h:1078)
  ==00:00:02:54.519 510301==    by 0x7218DD: operator= (shared_ptr.h:103)
  ==00:00:02:54.519 510301==    by 0x7218DD: OSD::_committed_osd_maps(unsigned int, unsigned int, MOSDMap*) (OSD.cc:8116)
  ==00:00:02:54.519 510301==    by 0x7752CA: C_OnMapCommit::finish(int) (OSD.cc:7678)
  ==00:00:02:54.519 510301==    by 0x72A06C: Context::complete(int) (Context.h:77)
  ==00:00:02:54.519 510301==    by 0xD07F14: Finisher::finisher_thread_entry() (Finisher.cc:66)
  ==00:00:02:54.519 510301==    by 0xA7E1203: mythread_wrapper (hg_intercepts.c:389)
  ==00:00:02:54.519 510301==    by 0xC6182DD: start_thread (in /usr/lib64/libpthread-2.28.so)
  ==00:00:02:54.519 510301==    by 0xD8B34B2: clone (in /usr/lib64/libc-2.28.so)
  ==00:00:02:54.519 510301==
  ==00:00:02:54.519 510301== This conflicts with a previous read of size 8 by thread #117
  ==00:00:02:54.519 510301== Locks held: 1, at address 0x2123E9A0
  ==00:00:02:54.519 510301==    at 0x6B5842: __shared_ptr (shared_ptr_base.h:1165)
  ==00:00:02:54.519 510301==    by 0x6B5842: shared_ptr (shared_ptr.h:129)
  ==00:00:02:54.519 510301==    by 0x6B5842: get_osdmap (OSD.h:1700)
  ==00:00:02:54.519 510301==    by 0x6B5842: OSD::create_context() (OSD.cc:9053)
  ==00:00:02:54.519 510301==    by 0x71B570: OSD::dequeue_peering_evt(OSDShard*, PG*, std::shared_ptr<PGPeeringEvent>, ThreadPool::TPHandle&) (OSD.cc:9665)
  ==00:00:02:54.519 510301==    by 0x71B997: OSD::dequeue_delete(OSDShard*, PG*, unsigned int, ThreadPool::TPHandle&) (OSD.cc:9701)
  ==00:00:02:54.519 510301==    by 0x70E62E: run (OpSchedulerItem.h:148)
  ==00:00:02:54.519 510301==    by 0x70E62E: OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*) (OSD.cc:10677)
  ==00:00:02:54.519 510301==    by 0xD3C093: ShardedThreadPool::shardedthreadpool_worker(unsigned int) (WorkQueue.cc:311)
  ==00:00:02:54.519 510301==    by 0xD3ECF3: ShardedThreadPool::WorkThreadSharded::entry() (WorkQueue.h:706)
  ==00:00:02:54.519 510301==    by 0xA7E1203: mythread_wrapper (hg_intercepts.c:389)
  ==00:00:02:54.519 510301==    by 0xC6182DD: start_thread (in /usr/lib64/libpthread-2.28.so)
  ==00:00:02:54.519 510301==  Address 0xf123930 is 3,824 bytes inside a block of size 10,296 alloc'd
  ==00:00:02:54.519 510301==    at 0xA7DC0C3: operator new[](unsigned long) (vg_replace_malloc.c:433)
  ==00:00:02:54.519 510301==    by 0x66F766: main (ceph_osd.cc:688)
  ==00:00:02:54.519 510301==  Block was alloc'd by thread #1
  ```

Actually, there are plenty of similar issues reported, like:
  ```
  ==00:00:05:04.903 510301== Possible data race during read of size 8 at 0x1E3E0588 by thread #119
  ==00:00:05:04.903 510301== Locks held: 1, at address 0x1EAD41D0
  ==00:00:05:04.903 510301==    at 0x753165: clear (hashtable.h:2051)
  ==00:00:05:04.903 510301==    by 0x753165: std::_Hashtable<entity_addr_t, std::pair<entity_addr_t const, utime_t>, mempool::pool_allocator<(mempool::pool_index_t)15, std::pair<entity_addr_t const, utime_t> >, std::__detail::_Select1st, std::equal_to<entity_addr_t>, std::hash<entity_addr_t>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::~_Hashtable() (hashtable.h:1369)
  ==00:00:05:04.903 510301==    by 0x75331C: ~unordered_map (unordered_map.h:102)
  ==00:00:05:04.903 510301==    by 0x75331C: OSDMap::~OSDMap() (OSDMap.h:350)
  ==00:00:05:04.903 510301==    by 0x753606: operator() (shared_cache.hpp:100)
  ==00:00:05:04.903 510301==    by 0x753606: std::_Sp_counted_deleter<OSDMap const*, SharedLRU<unsigned int, OSDMap const>::Cleanup, std::allocator<void>, (__gnu_cxx::_Lock_policy)2>::_M_dispose() (shared_ptr_base.h:471)
  ==00:00:05:04.903 510301==    by 0x73BB26: _M_release (shared_ptr_base.h:155)
  ==00:00:05:04.903 510301==    by 0x73BB26: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() (shared_ptr_base.h:148)
  ==00:00:05:04.903 510301==    by 0x6B58A9: ~__shared_count (shared_ptr_base.h:728)
  ==00:00:05:04.903 510301==    by 0x6B58A9: ~__shared_ptr (shared_ptr_base.h:1167)
  ==00:00:05:04.903 510301==    by 0x6B58A9: ~shared_ptr (shared_ptr.h:103)
  ==00:00:05:04.903 510301==    by 0x6B58A9: OSD::create_context() (OSD.cc:9053)
  ==00:00:05:04.903 510301==    by 0x71B570: OSD::dequeue_peering_evt(OSDShard*, PG*, std::shared_ptr<PGPeeringEvent>, ThreadPool::TPHandle&) (OSD.cc:9665)
  ==00:00:05:04.903 510301==    by 0x71B997: OSD::dequeue_delete(OSDShard*, PG*, unsigned int, ThreadPool::TPHandle&) (OSD.cc:9701)
  ==00:00:05:04.903 510301==    by 0x70E62E: run (OpSchedulerItem.h:148)
  ==00:00:05:04.903 510301==    by 0x70E62E: OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*) (OSD.cc:10677)
  ==00:00:05:04.903 510301==    by 0xD3C093: ShardedThreadPool::shardedthreadpool_worker(unsigned int) (WorkQueue.cc:311)
  ==00:00:05:04.903 510301==    by 0xD3ECF3: ShardedThreadPool::WorkThreadSharded::entry() (WorkQueue.h:706)
  ==00:00:05:04.903 510301==    by 0xA7E1203: mythread_wrapper (hg_intercepts.c:389)
  ==00:00:05:04.903 510301==    by 0xC6182DD: start_thread (in /usr/lib64/libpthread-2.28.so)
  ==00:00:05:04.903 510301==    by 0xD8B34B2: clone (in /usr/lib64/libc-2.28.so)
  ==00:00:05:04.903 510301==
  ==00:00:05:04.903 510301== This conflicts with a previous write of size 8 by thread #90
  ==00:00:05:04.903 510301== Locks held: 2, at addresses 0xF122A58 0xF1239A8
  ==00:00:05:04.903 510301==    at 0x7531E1: clear (hashtable.h:2054)
  ==00:00:05:04.903 510301==    by 0x7531E1: std::_Hashtable<entity_addr_t, std::pair<entity_addr_t const, utime_t>, mempool::pool_allocator<(mempool::pool_index_t)15, std::pair<entity_addr_t const, utime_t> >, std::__detail::_Select1st, std::equal_to<entity_addr_t>, std::hash<entity_addr_t>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::~_Hashtable() (hashtable.h:1369)
  ==00:00:05:04.903 510301==    by 0x75331C: ~unordered_map (unordered_map.h:102)
  ==00:00:05:04.903 510301==    by 0x75331C: OSDMap::~OSDMap() (OSDMap.h:350)
  ==00:00:05:04.903 510301==    by 0x753606: operator() (shared_cache.hpp:100)
  ==00:00:05:04.903 510301==    by 0x753606: std::_Sp_counted_deleter<OSDMap const*, SharedLRU<unsigned int, OSDMap const>::Cleanup, std::allocator<void>, (__gnu_cxx::_Lock_policy)2>::_M_dispose() (shared_ptr_base.h:471)
  ==00:00:05:04.903 510301==    by 0x73BB26: _M_release (shared_ptr_base.h:155)
  ==00:00:05:04.903 510301==    by 0x73BB26: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() (shared_ptr_base.h:148)
  ==00:00:05:04.903 510301==    by 0x72191E: operator= (shared_ptr_base.h:747)
  ==00:00:05:04.903 510301==    by 0x72191E: operator= (shared_ptr_base.h:1078)
  ==00:00:05:04.903 510301==    by 0x72191E: operator= (shared_ptr.h:103)
  ==00:00:05:04.903 510301==    by 0x72191E: OSD::_committed_osd_maps(unsigned int, unsigned int, MOSDMap*) (OSD.cc:8116)
  ==00:00:05:04.903 510301==    by 0x7752CA: C_OnMapCommit::finish(int) (OSD.cc:7678)
  ==00:00:05:04.903 510301==    by 0x72A06C: Context::complete(int) (Context.h:77)
  ==00:00:05:04.903 510301==    by 0xD07F14: Finisher::finisher_thread_entry() (Finisher.cc:66)
  ==00:00:05:04.903 510301==  Address 0x1e3e0588 is 872 bytes inside a block of size 1,208 alloc'd
  ==00:00:05:04.903 510301==    at 0xA7DC0C3: operator new[](unsigned long) (vg_replace_malloc.c:433)
  ==00:00:05:04.903 510301==    by 0x6C7C0C: OSDService::try_get_map(unsigned int) (OSD.cc:1606)
  ==00:00:05:04.903 510301==    by 0x7213BD: get_map (OSD.h:699)
  ==00:00:05:04.903 510301==    by 0x7213BD: get_map (OSD.h:1732)
  ==00:00:05:04.903 510301==    by 0x7213BD: OSD::_committed_osd_maps(unsigned int, unsigned int, MOSDMap*) (OSD.cc:8076)
  ==00:00:05:04.903 510301==    by 0x7752CA: C_OnMapCommit::finish(int) (OSD.cc:7678)
  ==00:00:05:04.903 510301==    by 0x72A06C: Context::complete(int) (Context.h:77)
  ==00:00:05:04.903 510301==    by 0xD07F14: Finisher::finisher_thread_entry() (Finisher.cc:66)
  ==00:00:05:04.903 510301==    by 0xA7E1203: mythread_wrapper (hg_intercepts.c:389)
  ==00:00:05:04.903 510301==    by 0xC6182DD: start_thread (in /usr/lib64/libpthread-2.28.so)
  ==00:00:05:04.903 510301==    by 0xD8B34B2: clone (in /usr/lib64/libc-2.28.so)
  ```

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2020-02-14 17:04:34 +01:00
Sage Weil
0db140c15c osd: trim pg logs based on a per-osd budget
Set the default budget based on the current defaults: 3000 per osd, and a
rule of thumb target of 100 PGs per OSD.  Set the per-PG trim target
by dividing the overall value by the number of PGs on the OSD.

Increase the max pg log length alone, so if the OSD has <100 PGs,
those PGs will get more entries.  Reduce the minimum to be smaller than
the max.  Use the min/max config options to bracket what is allocated to
a single PG.
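
The arithmetic described above can be sketched as follows. `per_pg_log_target` and its parameters are hypothetical names, the fallback for an OSD with no PGs is an assumption, and the actual option names and defaults are not taken from this commit.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: split a per-OSD budget of log entries evenly across
// the PGs hosted on that OSD, then bracket the result with min/max limits.
inline std::uint64_t per_pg_log_target(std::uint64_t osd_budget,
                                       std::uint64_t num_pgs,
                                       std::uint64_t min_entries,
                                       std::uint64_t max_entries) {
  assert(min_entries <= max_entries);
  if (num_pgs == 0) {
    // Assumption: with no PGs to divide among, fall back to the upper bound.
    return max_entries;
  }
  return std::clamp(osd_budget / num_pgs, min_entries, max_entries);
}
```

With an illustrative budget of 300000 entries and limits of 250 and 10000, an OSD hosting 100 PGs gives each PG 3000 entries, while an OSD hosting 50 PGs gives each PG 6000.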

Signed-off-by: Sage Weil <sage@redhat.com>
2020-01-22 09:13:00 +08:00
Kefu Chai
e230c400d3
Merge pull request #32514 from majianpeng/osd-remove-osmap_lock_name
osd/OSD: remove unused parameter osdmap_lock_name.

Reviewed-by: Kefu Chai <kchai@redhat.com>
2020-01-10 23:09:33 +08:00
Jianpeng Ma
c56108b561 osd/OSD: remove unused parameter osdmap_lock_name.
Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
2020-01-10 14:59:08 +08:00
Jianpeng Ma
234beadff4 osd/OSD: remove unused func enqueue_peering_evt_front
Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
2020-01-06 16:42:19 +08:00
Sage Weil
86512b71fc Merge PR #32132 into master
* refs/pull/32132/head:
	osd/OSDMap: rename old calc_pg_role -> calc_pg_role_broken
	osd/OSDMap: remove dead osd_is_valid_op_target
	osd/OSDMap: fix get_pg_acting_role()
	osd: use spg_t for pending_creates_from_osd
	osd/OSDMap: drop unused get_pg_acting_role()
	osd/OSDMap: fix+simplify is_up_acting_osd_shard
	osd: use new and improved calc_pg_role()
	osd/OSDMap: new calc_pg_role() that takes a pg_shard_t
	osd/OSDMap: calc_pg_rank -> calc_pg_role
	osd/PeeringState: debug lines for upacting_features, proc_lease
	osd/PeeringState: use pg_vector_string for operator<<

Reviewed-by: Samuel Just <sjust@redhat.com>
2019-12-10 08:49:52 -06:00
Kefu Chai
6c7f395233
Merge pull request #32007 from tchaikov/wip-osd-cleanup
osd: use unique_ptr for managing life cycles

Reviewed-by: Ronen Friedman <rfriedma@redhat.com>
2019-12-10 17:18:03 +08:00
Sage Weil
cfdb569ab2 osd: use spg_t for pending_creates_from_osd
Signed-off-by: Sage Weil <sage@redhat.com>
2019-12-09 10:44:55 -06:00
Venky Shankar
efcebe1eb4 mgr: templatize/generalize metrics collection interface
templatize metrics collection so as to reuse querying routines.
`MetricCollector` can be subclassed, implementing
`process_reports()` to process incoming metrics data.

also, generalize metrics data in `MMgrReport` and metric query
configuration in `MMgrConfigure`.

Signed-off-by: Venky Shankar <vshankar@redhat.com>
2019-12-05 22:51:45 -05:00
Kefu Chai
6651faddba osd: use unique_ptr for managing life cycles
instead of `new` and `delete` manually, use `unique_ptr<>` for managing
life cycles of member variables.
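
A minimal sketch of the pattern, assuming a member that used to be created with `new` in the constructor and released with `delete` in the destructor. `Worker` and `Daemon` are hypothetical names, not classes from the OSD code.

```cpp
#include <cassert>
#include <memory>

// Hypothetical member type that counts live instances so the effect of
// ownership is observable.
struct Worker {
  static inline int live = 0;
  Worker() { ++live; }
  Worker(const Worker&) = delete;
  Worker& operator=(const Worker&) = delete;
  ~Worker() {
    assert(live > 0);
    --live;
  }
};

// Hypothetical owner: the unique_ptr member releases the Worker when the
// Daemon is destroyed, so no hand-written destructor is needed.
class Daemon {
  std::unique_ptr<Worker> worker_;

public:
  Daemon() : worker_(std::make_unique<Worker>()) {}
  bool has_worker() const { return worker_ != nullptr; }
};
```

The member is also released if a later step of the owner's constructor throws, which a manual `delete` in the destructor would not cover.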

Signed-off-by: Kefu Chai <kchai@redhat.com>
2019-12-04 23:51:04 +08:00
Kefu Chai
95d0962c98 osd: use accessors to access OSDService
instead of accessing OSDService's member variable directly, it would be
easier if we can use accessors to do this.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2019-12-04 23:49:01 +08:00
Samuel Just
83eba36c2c osd/: factor OSD::init_op_flags into separate class
We'll want to reuse this in crimson, extract the logic
for setting flags from MOSDOp into osd_types.h.

Signed-off-by: Samuel Just <sjust@redhat.com>
2019-12-02 21:35:36 -08:00
Sage Weil
d6f5918850 Merge PR #31778 into master
* refs/pull/31778/head:
	os/bluestore: pin onodes as they are added to the cache
	Revert "Revert "Merge pull request #30964 from markhpc/wip-bs-cache-trim-pinned""

Reviewed-by: Mark Nelson <mnelson@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2019-11-23 20:30:28 -06:00
Sage Weil
54ddf2c897 Revert "Merge pull request #16715 from adamemerson/wip-I-Object!"
This reverts commit 669453138d, reversing
changes made to 36f5fcbb97.

Signed-off-by: Sage Weil <sage@redhat.com>

- conflicts due to code rearrangement in 14b0db908f
2019-11-22 09:24:25 -06:00
Josh Durgin
0808184b0f Revert "Revert "Merge pull request #30964 from markhpc/wip-bs-cache-trim-pinned""
This reverts commit f03395e5a6.
Re-instate the cache trimming from https://github.com/ceph/ceph/pull/30964

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2019-11-21 00:00:08 -05:00
Casey Bodley
669453138d
Merge pull request #16715 from adamemerson/wip-I-Object!
osdc/Objecter: Boost.Asio (I object!)

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>
2019-11-19 12:48:26 -05:00
Jianpeng Ma
bb201f3974 osd: reduce unnecessary notify.
Only call notify when there is a pre_publish_waiter.

Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
2019-11-07 18:16:04 +08:00
Jianpeng Ma
079dbff3c7 osd: add new api send_message_osd_cluster(std::vector<std::pair<int,Message*>>& messages, epoch_t from_epoch).
Batch send message to osd cluster.

Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
2019-11-07 18:15:31 +08:00
Adam C. Emerson
044e2d64e0 osdc: Use boost::container::small_vector for ops vector
This way making an ObjectOperation won't require allocations just
for a few ops.

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
2019-11-04 15:17:18 -05:00
Adam C. Emerson
73f5249f3f monc: Asifoact MonClient
Of course now everyone has to feed an io_context into the MonClient.

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
2019-11-04 13:28:53 -05:00
Sage Weil
f03395e5a6 Revert "Merge pull request #30964 from markhpc/wip-bs-cache-trim-pinned"
This reverts commit 304c37f521, reversing
changes made to 7b4f9a083f.

This causes some bluestore test failures due to an ENOENT right after a
new object is created.  The simplest reproducer is the
ObjectStore/StoreTest.FiemapEmpty/2 test.

Fixes: https://tracker.ceph.com/issues/42495
Signed-off-by: Sage Weil <sage@redhat.com>
2019-10-27 09:59:10 -05:00
Mark Nelson
304c37f521
Merge pull request #30964 from markhpc/wip-bs-cache-trim-pinned
os/bluestore: Keep separate onode cache pinned list.
2019-10-24 14:50:11 -05:00
Samuel Just
17a05cb15b src/osd: replace OpQueue abstraction in osd with Scheduler
OpQueue is overkill for mclock based schedulers.  The interface doesn't
need to externalize the _strict modifiers, the scheduler can figure that
out from the item itself.  Introduce simpler Scheduler interface and add
an adapter for the existing OpQueue based implementations.

Signed-off-by: Samuel Just <sjust@redhat.com>
2019-10-22 15:11:51 -07:00
Mark Nelson
bfcf80e89c osd/OSD: Set the number of cache shards independently. Default to 32
Signed-off-by: Mark Nelson <mnelson@redhat.com>
2019-10-22 15:34:29 -04:00
Sage Weil
bbc7bb5a22 Merge PR #30217 into master
* refs/pull/30217/head:
	crimson: common/admin_socket kludge so that it builds
	mon/MonClient: fix sending mon command to a specific rank
	src/.gitignore: ignore .tox
	mon/MonClient: interpret numeric mon target name as rank
	mgr,mgr/MgrClient: use fsid to signal mon-mgr vs cli MCommands
	qa/workunits/cephtool: fix error checks for 'ceph daemon' commands
	common/ceph_context: make 'config unset' idempotent
	qa/tasks/dump_stuck: mon.a, not mon.0
	qa/suites/rados/singleton/all/admin-socket: fix test
	common/config: EPERM setting config option after startup
	qa/workunits/cephtool/test.sh: fix tell output error check
	common/admin_socket: pass Formatter from generic infrastructure
	common/admin_socket: pass ostream to call() for error output
	os/bluestore: fix asok hook return value
	rgw: fix asok return value
	common/ceph_context: return error code from asok commands
	test/pybind/test_rados: fix accidental mon tell test
	mon: print entity_name along with caps to debug log
	PendingReleaseNotes: notes about asok changes
	mgr/MgrClient: empty target string for 'tell' means active mgr
	common/admin_socket: report error code as part of output string
	osd: change trigger_[deep_]scrub commands to a pg tell command
	osd: remove old command workqueue, threadpool
	osd: drop MMonCommand handling
	osdc/Objecter: resend OSD tell commands on EAGAIN
	osd: route tell commands to asok; migrate commands
	osd: use unique_ptr<Formatter> for asok_command
	common/ceph_context: add generic asok 'injectargs'
	common/admin_socket: allow dup prefixes
	common/admin_socket: refactor with sync and async execute_command variants
	common/admin_socket: pass input bufferlist
	osd: transition to call_async() for asok
	common/admin_socket: support alternative call_async()
	mon/MonClient: send tell commands out of band via MCommand
	mon: accept tell commands via MCommand and send them to asok handler
	common/admin_socket: return int from hook call()
	mgr/DaemonServer: route MCommand (for octopus+) to asok commands
	do not use 'ceph tell mgr'
	pybind/ceph_argparse: disambiguate mgr tell and CLI commands
	ceph: make 'ceph tell mgr.*' send to the active mgr
	ceph: send 'ceph tell mgr.X' to the right mgr
	librados: add rados_mgr_command_target
	mgr/MgrClient: add start_command variant that takes a target
	common/admin_socket: drop unregister_command(); use per-hook variant
	common/admin_socket: drop explicit prefix arg to register_command
	common/admin_socket: simplify command routing
	common/admin_socket: add ability to process MCommand via asok queue
	common/admin_socket: pass cmdvec to execute_command
	common/admin_socket: use pipe for general wakeup
	include/compat: add flags arg to pipe_cloexec
	common/admin_socket: drop unused args

Reviewed-by: Neha Ojha <nojha@redhat.com>
2019-10-06 09:08:28 -05:00
Sage Weil
adf1486e46 common/admin_socket: pass Formatter from generic infrastructure
The implementation can choose to either use the provided Formatter, or
put something directly into outbl.  The implementation may choose to
flush the formatter to the output buffer|stream, or let the caller do it
for them (usually the latter).

Lots of fiddling/cleanup in the implementations to make this build,
including dropping the (seemingly unused?) ostream& output mode for
the librbd asok implementations.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-10-04 09:07:03 -05:00
Sage Weil
817cca779d osd: remove old command workqueue, threadpool
These are now sent to the asok handler.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-10-04 09:07:02 -05:00
Sage Weil
45adbe011d osd: drop MMonCommand handling
Nothing sends these.. or has in a very very long time, AFAICS.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-10-04 09:07:02 -05:00
Sage Weil
804458bf51 Merge PR #30640 into master
* refs/pull/30640/head:
	osd/PrimaryLogPG: remove unused reply creation path
	osd/PrimaryLogPG: include op_returns in dup replies
	osd/PrimaryLog: drop unused reply_ctx() variant
	osd/PrimaryLogPG: remove dead already_ack()

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2019-10-02 09:50:53 -05:00
Sage Weil
a460a63869 common/admin_socket: pass input bufferlist
Pass this to the async handler only for now, since the sync implementations
don't currently use it.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-10-01 16:30:53 -05:00
Sage Weil
f4786deeab osd: transition to call_async() for asok
And some variable renames, error path fixup.

No other significant functional change.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-10-01 16:30:53 -05:00
Sage Weil
2042f661b6 osd/PrimaryLogPG: include op_returns in dup replies
We are storing the return metadata; actually use it when sending dup
replies!

Signed-off-by: Sage Weil <sage@redhat.com>
2019-09-30 10:04:48 -05:00
Xie Xingguo
97065f5c9a
Merge pull request #30644 from majianpeng/osd-remove-unused-func
osd: remove unused function

Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
2019-09-30 17:01:44 +08:00
Jianpeng Ma
8ba01c40ca osd: remove unused function
Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
2019-09-30 13:38:27 +08:00
Sage Weil
7aec060e0a osd: add CheckReadable pg event, queue_check_readable()
Signed-off-by: Sage Weil <sage@redhat.com>
2019-09-26 12:22:22 -05:00
Sage Weil
d883db7028 osd: schedule regular lease renewals
Signed-off-by: Sage Weil <sage@redhat.com>
2019-09-26 12:21:53 -05:00
Sage Weil
19d770832b osd: send and process lease[_ack] messages
Signed-off-by: Sage Weil <sage@redhat.com>
2019-09-26 12:21:53 -05:00
Sage Weil
5b5aa8b0c8 osd: add mono_timer
Signed-off-by: Sage Weil <sage@redhat.com>
2019-09-26 12:21:53 -05:00
Kefu Chai
545a40e9d7
Merge pull request #30399 from tchaikov/wip-crimson-peering-query
crimson: handle MOSDPGQuery2 properly

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2019-09-17 01:54:03 +08:00
Patrick Donnelly
517bdca529 common/RefCountedObj: cleanup con/des
Also, don't allow children to set nref (to 0). This is the more significant
change as it required fixing various code to not do this:

    <reftype> ptr = new RefCountedObjectFoo(..., 0);

as a way to create a starting reference with nref==1. This is a pretty
bad code smell so I've converted all the code doing this to use the new
factory method which produces the reference safely:

    auto ptr = ceph::make_ref<RefCountedObjectFoo>(...);

libradosstriper was particularly egregious in its abuse of setting the starting
nref. :(
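
The idea behind the factory method can be sketched with a small intrusive reference count. Everything here (`RefCounted`, `Ref`, `make_ref`, `Session`) is a hypothetical, single-threaded stand-in written for illustration; it is not the `ceph::make_ref` implementation.

```cpp
#include <cassert>
#include <utility>

// Hypothetical base class: the count always starts at 1 and subclasses
// have no way to pick a different starting value.
class RefCounted {
  int nref_ = 1;  // not atomic; this sketch is single-threaded only

public:
  RefCounted() = default;
  RefCounted(const RefCounted&) = delete;
  RefCounted& operator=(const RefCounted&) = delete;

  void get() { ++nref_; }
  void put() {
    if (--nref_ == 0) delete this;
  }
  int nref() const { return nref_; }

protected:
  virtual ~RefCounted() = default;
};

// Hypothetical smart pointer that calls get()/put() on the pointee.
template <typename T>
class Ref {
  T* p_ = nullptr;

public:
  Ref() = default;
  // add_ref == false adopts the reference the object was created with.
  Ref(T* p, bool add_ref) : p_(p) {
    if (p_ && add_ref) p_->get();
  }
  Ref(const Ref& o) : p_(o.p_) {
    if (p_) p_->get();
  }
  Ref(Ref&& o) noexcept : p_(std::exchange(o.p_, nullptr)) {}
  Ref& operator=(Ref o) noexcept {
    std::swap(p_, o.p_);
    return *this;
  }
  ~Ref() {
    if (p_) p_->put();
  }

  T* operator->() const { return p_; }
  explicit operator bool() const { return p_ != nullptr; }
};

// Factory: constructs the object and adopts its initial reference, so the
// caller never handles a raw pointer or a starting count.
template <typename T, typename... Args>
Ref<T> make_ref(Args&&... args) {
  return Ref<T>(new T(std::forward<Args>(args)...), false);
}

// Hypothetical ref-counted type used to exercise the sketch.
struct Session : RefCounted {
  int id;
  explicit Session(int i) : id(i) {}
};
```

A call such as `auto s = make_ref<Session>(42);` yields a pointer whose object has exactly one reference, without the caller passing a starting count.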

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2019-09-16 19:53:58 +08:00
Kefu Chai
b1e655b11e osd/OSD.h: remove unused OSD::discard_context()
Signed-off-by: Kefu Chai <kchai@redhat.com>
2019-09-16 19:03:42 +08:00
Sage Weil
991520b869 Merge PR #29820 into master
* refs/pull/29820/head:
	osd/PeeringState: log_weirdness during peering
	osd/PeeringState: send new message types
	osd/PeeringState: give require_osd_release to BufferedRecoveryMessages
	osd: add new notify, query, and info messages
	osd/PeeringState: use send_info() for replica activation ack
	osd/PeeringState: remove old info_map member
	osd/PeeringState: send infos via message_map (not info_map)
	osd/PeeringState: remove old query_map member
	osd/PeeringState: send queries via message_map (not query_map)
	osd/PeeringState: use return value from discover_all_missing
	osd/PeeringState: give PeeringCtxWrapper a BufferedRecoveryMessages ref
	osd/PeeringState: send notifies via message_map (not notify_list)
	osd/PeeringState: add message_map to PeeringCtx/BufferedRecoveryMessages
	osd: dispatch_context inside PG lock

Reviewed-by: Samuel Just <sjust@redhat.com>
2019-09-12 09:57:12 -05:00
David Zafman
b3e1c58b0e osd: Replace active/pending scrub tracking for local/remote
This is similar to how recovery reservations are split between
local and remote.

It was the case that scrubs_pending was used for reservations at
the replicas as well as at the primary while requesting reservations
from the replicas.  There was no need for scrubs_pending to turn
into scrubs_active at the primary as nothing treated that value
as special.  scrubber.active = true when scrubbing is
actually going.

Now scrubber.local_reserved indicates scrubs_local incremented.
Now scrubber.remote_reserved indicates scrubs_remote incremented.

Fixes: https://tracker.ceph.com/issues/41669

Signed-off-by: David Zafman <dzafman@redhat.com>
2019-09-10 13:33:27 -07:00
David Zafman
8dffd68365 osd: Add dump_scrub_reservations to show scrub reserve tracking
Signed-off-by: David Zafman <dzafman@redhat.com>
2019-09-10 13:32:29 -07:00
Sage Weil
ce05c17299 osd: add new notify, query, and info messages
These are streamlined to only include the information we need for the
peering events, as reflected by the old handle_fast_pg_{notify,query,info}
messages.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-09-09 11:22:11 -05:00