If we run an upgrade test where, for example, "jewel" is not in the
ceph-ci.git repo, we should check ceph.git to clone the workunits.
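
A minimal sketch of that fallback, for illustration only. The function
name `pick_repo`, the `branch_exists` predicate and the in-memory `known`
table are hypothetical stand-ins; this is not the actual
qa/tasks/workunit code.

```python
def pick_repo(branch, candidates, branch_exists):
    """Return the first repo in `candidates` that contains `branch`."""
    for repo in candidates:
        if branch_exists(repo, branch):
            return repo
    raise LookupError(f"branch {branch!r} not found in any of {candidates}")


# Hypothetical stand-in for a real lookup (e.g. asking the remote for its heads).
known = {
    "ceph-ci.git": {"wip-foo"},
    "ceph.git": {"jewel", "master"},
}
repos = ["ceph-ci.git", "ceph.git"]  # preferred repo first, fallback second


def has_branch(repo, branch):
    return branch in known[repo]


chosen = pick_repo("jewel", repos, has_branch)
print(chosen)
```

Passing the lookup in as a predicate keeps the ordering logic separate
from however the branch check is really performed.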
Signed-off-by: Kefu Chai <kchai@redhat.com>
ff0e521 resolved the issue for the other daemons but not for rgw since
it calls setuid (via civetweb) after the new code sets PR_SET_DUMPABLE.
Add another prctl call before wait_shutdown.
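
For reference, the kernel clears the dumpable flag when a process changes
its effective user ID, which is why the flag has to be set again after
setuid. A rough Linux-only sketch of the re-arm step using ctypes follows;
the helper name `restore_dumpable` is hypothetical, and the real fix is a
prctl call in the rgw C++ code, not this.

```python
import ctypes
import sys

# Option numbers from <linux/prctl.h>.
PR_GET_DUMPABLE = 3
PR_SET_DUMPABLE = 4


def restore_dumpable():
    """Set the dumpable flag to 1 and return its new value (None off Linux)."""
    if not sys.platform.startswith("linux"):
        return None
    libc = ctypes.CDLL(None, use_errno=True)
    zero = ctypes.c_ulong(0)
    # Meant to be called after any setuid(), which resets the flag.
    if libc.prctl(PR_SET_DUMPABLE, ctypes.c_ulong(1), zero, zero, zero) != 0:
        raise OSError(ctypes.get_errno(), "prctl(PR_SET_DUMPABLE) failed")
    return libc.prctl(PR_GET_DUMPABLE, zero, zero, zero, zero)


flag = restore_dumpable()
```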
Fixes: http://tracker.ceph.com/issues/19089
Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
Each BmapEntry instance stores a pointer to the same CephContext.
As we expect to have thousands of instances, the overhead might
be too high. For instance, serving a 1 TiB SSD on x86-64 with
the default settings results in 32 MiB of extra memory
consumption:
# assuming sizeof(unsigned long) * CHAR_BIT == 64
>>> 1024 * 1024 * 1024 * 1024 / 4096 / 64
4194304
>>> 4194304 * 8 / 1024
32768
Although memory is cheap, CPU caches are not.
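
The same arithmetic as a small function, in case other disk sizes are of
interest. The parameter names are made up for this sketch; the defaults
(4096-byte blocks, 64 bits per entry, 8-byte pointers) are the values
assumed in the calculation above.

```python
def context_pointer_overhead(disk_bytes, block_size=4096,
                             bits_per_entry=64, pointer_size=8):
    """Bytes spent on one pointer per bitmap entry for a disk of `disk_bytes`."""
    entries = disk_bytes // block_size // bits_per_entry
    return entries * pointer_size


overhead = context_pointer_overhead(1024 ** 4)  # 1 TiB
print(overhead // 1024 ** 2, "MiB")
```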
Signed-off-by: Radoslaw Zarzynski <rzarzynski@mirantis.com>
Before the patch, BitMapZone::is_exhausted() required its callers
to acquire the appropriate lock. However, holding the lock is not
really necessary to use the method correctly, and taking it can
significantly hurt performance.
The change allows BitMapAreaLeaf::child_check_n_lock() to not
acquire the lock while examining zones for being exhausted.
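
The general pattern can be sketched as below: an unlocked, possibly stale
check is used only as a hint to skip zones, and the decision is re-checked
under the lock before anything is modified. The `Zone` class and
`find_zone` helper are hypothetical and only illustrate the idea; they do
not mirror the BitMapZone implementation.

```python
import threading


class Zone:
    def __init__(self, total_blocks):
        self.total_blocks = total_blocks
        self.used_blocks = 0
        self.lock = threading.Lock()

    def is_exhausted(self):
        # Unlocked read: the answer may be stale, so treat it as a hint.
        return self.used_blocks >= self.total_blocks

    def try_alloc(self):
        # The authoritative check happens under the lock.
        with self.lock:
            if self.used_blocks >= self.total_blocks:
                return False
            self.used_blocks += 1
            return True


def find_zone(zones):
    """Return a zone that accepted one allocation, or None."""
    for zone in zones:
        if zone.is_exhausted():
            continue  # skipped without touching the lock
        if zone.try_alloc():
            return zone
    return None


zones = [Zone(0), Zone(1)]
first = find_zone(zones)
second = find_zone(zones)
```

A stale hint costs at most one extra lock acquisition or one skipped zone;
it never lets an allocation go through on a full zone.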
Signed-off-by: Radoslaw Zarzynski <rzarzynski@mirantis.com>
This is difficult to break into pieces, so one big fat commit it is.
A few trivial bits:
- include epoch in PGQueueable.
- PGQueueable operator<<
- remove op_wq ref from OSDService; use simple set of queue methods instead
The big stuff:
- Fast dispatch now passes messages directly to the queue based on an
spg_t. The exception is MOSDOps from legacy clients. We add a
waiting_for_map mechanism on the front-side that is similar to but simpler
than the previous one so that we can map those legacy requests to an
accurate spg_t.
- The dequeue path now has a waiting_for_pg mechanism. It also uses a
much simpler set of data structures that should make it much faster than
the previous incarnation.
- Shutdown works a bit differently; we drain the queue instead of trying
to remove work for individual PGs. This lets us remove the dequeue_pg
machinery.
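
A much simplified model of the dequeue-side waiting_for_pg idea and the
drain-on-shutdown behaviour, as a sketch only. The `OpQueue` class, its
method names and the single unsharded deque are assumptions made for
illustration; the real OSD code is C++ and considerably more involved.

```python
from collections import defaultdict, deque


class OpQueue:
    def __init__(self):
        self.queue = deque()                      # items are (pgid, epoch, op)
        self.pgs = {}                             # pgid -> handler callable
        self.waiting_for_pg = defaultdict(deque)  # ops parked per unknown pgid

    def enqueue(self, pgid, epoch, op):
        self.queue.append((pgid, epoch, op))

    def register_pg(self, pgid, handler):
        self.pgs[pgid] = handler
        # Put parked ops back at the front, preserving their original order.
        parked = self.waiting_for_pg.pop(pgid, deque())
        self.queue.extendleft(reversed(parked))

    def dequeue_one(self):
        pgid, epoch, op = self.queue.popleft()
        handler = self.pgs.get(pgid)
        if handler is None:
            # PG not known yet: park the op instead of dropping it.
            self.waiting_for_pg[pgid].append((pgid, epoch, op))
            return
        handler(epoch, op)

    def drain(self):
        # Shutdown path: process everything rather than removing per-PG work.
        while self.queue:
            self.dequeue_one()


seen = []
q = OpQueue()
q.enqueue("1.0", 5, "a")
q.enqueue("1.0", 5, "b")
q.drain()                        # both ops get parked
parked_count = len(q.waiting_for_pg["1.0"])
q.register_pg("1.0", lambda epoch, op: seen.append(op))
q.drain()                        # parked ops are now delivered in order
```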
Signed-off-by: Sage Weil <sage@redhat.com>
The objecter actually always needs to get a response in order to
be able to not continually resend ops (even if the caller didn't
provide a callback). Thus, it makes no sense for an MOSDOp to
ever not have FLAG_ONDISK set. Therefore, we'll just remove the
helper and assume it's always there (it's safe to send a response
the client didn't ask for; the error paths already do that). On
the Objecter side, we'll just unconditionally fill in ONDISK for
the benefit of pre-luminous OSDs.
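
On the sending side this amounts to something like the sketch below. The
flag names, the bit values and `prepare_op_flags` are hypothetical and
chosen for illustration; they are not taken from the Objecter source.

```python
# Hypothetical bit values, for illustration only.
FLAG_ACK = 0x1
FLAG_ONDISK = 0x4


def prepare_op_flags(requested_flags):
    """Always ask for an on-disk reply, whether or not the caller wants one."""
    return requested_flags | FLAG_ONDISK


no_callback = prepare_op_flags(0)
with_ack = prepare_op_flags(FLAG_ACK)
```

Because the flag is OR-ed in unconditionally, an older OSD that still
inspects it sees it set on every op.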
Fixes: http://tracker.ceph.com/issues/18961
Signed-off-by: Samuel Just <sjust@redhat.com>
I think that whole thing was a misguided attempt to avoid deleting head
if it exists in the base tier (in reality it doesn't matter since head
would have to be logically dirty and anything we actually care about
would be preserved by sending a new enough seq to cause a clone).
Introduced in 4843fd510b, but the real
logical error happened in f3df50188b.
I suggest never backporting this patch. If you want to try, keep in
mind that the last version didn't turn up as busted for 2 years.
Fixes: f3df50188b
Signed-off-by: Samuel Just <sjust@redhat.com>
qa/tasks/workunit: use the suite repo for cloning workunit
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>