We now apply StackTraceScope only in the tcmalloc-specific parts,
instead of automatically inside every stacktrace.cc function. All
TCMalloc code must now grab stacktraces via the newly introduced
tcmalloc::GrabBacktrace (which handles emergency malloc wrapping).
The new approach eliminates the need for a fake stacktrace scope. The
CPU profiler, being a distinct .so library, couldn't take advantage of
emergency malloc anyway.
This simplifies the build further and eliminates another potential
point of runtime divergence when the stacktrace code is linked into
both libprofiler and libtcmalloc.
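
A minimal sketch of the "wrap only at the tcmalloc entry point" idea
follows. It is not the real tcmalloc::GrabBacktrace; EmergencyScope,
InEmergencyScope, GrabBacktraceSketch and FakeUnwind are hypothetical
names, and the unwinder is a stand-in that only reports whether the
scope was active when it ran.

```cpp
#include <cassert>

// Hypothetical per-thread nesting counter standing in for the state that
// StackTraceScope manages in the real code.
thread_local int g_emergency_depth = 0;

// RAII guard: the scope is active for exactly the lifetime of the object.
class EmergencyScope {
 public:
  EmergencyScope() { ++g_emergency_depth; }
  ~EmergencyScope() { --g_emergency_depth; }
  EmergencyScope(const EmergencyScope&) = delete;
  EmergencyScope& operator=(const EmergencyScope&) = delete;
};

inline bool InEmergencyScope() { return g_emergency_depth > 0; }

// Hypothetical wrapper: the allocator calls this, so the generic unwinder
// passed in does not need to know anything about emergency malloc.
template <typename Unwinder>
int GrabBacktraceSketch(void** result, int max_depth, Unwinder unwind) {
  EmergencyScope scope;
  return unwind(result, max_depth);
}

// Stand-in unwinder: records one dummy frame only when the scope is active.
inline int FakeUnwind(void** result, int max_depth) {
  if (max_depth < 1 || !InEmergencyScope()) return 0;
  result[0] = nullptr;
  return 1;
}
```

In this shape a separate library (such as the CPU profiler) can call
the unwinder directly and never touches the scope.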
Logic was moved from thread_cache.{h,cc} into
thread_cache_ptr.{h,cc}.
The separation will help possible future evolution, and we have
already changed the logic quite a bit:
* early access (when TLS isn't initialized yet) now uses a global
ThreadCache instance. We therefore have ThreadCachePtr instances
managing the required locking. This eliminates the unnecessary
complication of the PTHREADS_CRASHES_IF_RUN_TOO_EARLY logic, and any
other danger of touching TLS too early. BTW, the previous
implementation actually leaked the initial early-initialized
ThreadCache instance(!).
* old configure-time HAVE_TLS logic is amputated. The config-time part
of it made little sense, as C++17 guarantees availability of
thread_local, but we had a manually curated deny-list of "bad" OSes
that we tested (via compile checks!) at configure time. Now this
is all done at compile time. There is now a compile-time kHaveGoodTLS
variable and we're using it mostly via if constexpr.
* the kHaveGoodTLS case of creating a thread cache is simplified and
made more straightforward (no need to have in_setspecific logic).
* the !kHaveGoodTLS case is fixed and improved too. We avoid
std::thread::get_id, as it deadlocks on mingw. We use the errno
address as a portable and (usually) async-signal-safe 'my thread'
identifier. We also eliminate linear searching for a thread's cache
and replace it with a straightforward hash table lookup.
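
A minimal sketch of the errno-address idea from the last bullet,
assuming a platform where errno is thread-local (as on POSIX systems).
FakeCache, SelfThreadKey, CacheRegistry and KeysDifferAcrossThreads
are hypothetical names, not the real thread_cache_ptr code. The
std::unordered_map is used only to show the lookup shape; an actual
allocator could not use a container that itself allocates.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <thread>
#include <unordered_map>

// Hypothetical stand-in for a per-thread cache.
struct FakeCache {
  int allocations = 0;
};

// The address of errno is distinct for every live thread, so it can act
// as a 'my thread' key without calling into the threading library.
inline std::uintptr_t SelfThreadKey() {
  return reinterpret_cast<std::uintptr_t>(&errno);
}

// Hypothetical registry: hash lookup keyed by the errno address instead
// of a linear scan over all known threads.
class CacheRegistry {
 public:
  FakeCache* GetOrCreate() {
    std::lock_guard<std::mutex> l(mu_);
    return &caches_[SelfThreadKey()];  // element addresses stay stable
  }
  void ReleaseCurrent() {
    std::lock_guard<std::mutex> l(mu_);
    caches_.erase(SelfThreadKey());
  }
  std::size_t size() {
    std::lock_guard<std::mutex> l(mu_);
    return caches_.size();
  }

 private:
  std::mutex mu_;
  std::unordered_map<std::uintptr_t, FakeCache> caches_;
};

// Demo helper (std::thread is used only here): a second thread running
// concurrently with the caller must observe a different key.
inline bool KeysDifferAcrossThreads() {
  const std::uintptr_t mine = SelfThreadKey();
  std::uintptr_t theirs = 0;
  std::thread t([&theirs] { theirs = SelfThreadKey(); });
  t.join();
  return theirs != 0 && theirs != mine;
}
```

Note that a key is only unique among live threads; an entry has to be
released when its thread exits, since a later thread may reuse the
same errno address.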
This allows us, later, to avoid building this stuff in configurations
that don't use it. I have also reduced API and ABI surface to enable
further refactorings.
- Fix CMake builds for MinGW and MSVC
- Ensure the Autotools, CMake and VSProj builds do not reference each others' config.h
- Use std::thread::id instead of our own thread ID wrappers
- Move explicit TLS wrapper functions into the tcmalloc:: namespace and change their visibility to hidden
Resolves #1486
The previous implementation wasn't entirely safe w.r.t. 32-bit off_t
systems, specifically around the mmap replacement hook. Also, the API
was a lot more general and broad than we actually need.
Sadly, the old mmap hooks API was shipped with our public headers. But
thankfully it appears to be unused externally (checked via github
search). So we keep this old API and ABI for the sake of formal API
and ABI compatibility. But this old API is now empty and always
fails (some OS/hardware combinations didn't have functional
implementations of those hooks anyway).
The new API is 64-bit clean and only provides us with what we need,
namely being able to react to virtual address space mapping changes
for logging, heap profiling and the heap leak checker. I.e. no pre
hooks or mmap-replacement hooks. We also explicitly do not ship this
API externally, to give us freedom to change it.
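
A minimal sketch of what a post-only, 64-bit-clean mapping
notification could look like. MapEvent, MapObserver, AddMapObserver,
NotifyMapObservers and RecordOffset are hypothetical names chosen for
illustration and are not the actual internal API; the sketch also
ignores locking and observer removal.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical event record describing a mapping change after the fact.
struct MapEvent {
  void* address = nullptr;
  std::size_t length = 0;
  std::int64_t file_offset = 0;  // always 64-bit, whatever off_t's width is
  bool is_unmap = false;
};

// Observers only see what already happened; they cannot replace or veto
// the mapping call.
using MapObserver = void (*)(const MapEvent&);

constexpr int kMaxObservers = 4;
static MapObserver g_observers[kMaxObservers] = {};

// Registers an observer in the first free slot; returns false when full.
inline bool AddMapObserver(MapObserver fn) {
  for (int i = 0; i < kMaxObservers; ++i) {
    if (g_observers[i] == nullptr) {
      g_observers[i] = fn;
      return true;
    }
  }
  return false;
}

// Called by the (hypothetical) mmap/munmap wrappers after the system call.
inline void NotifyMapObservers(const MapEvent& e) {
  for (int i = 0; i < kMaxObservers; ++i) {
    if (g_observers[i] != nullptr) g_observers[i](e);
  }
}

// Example observer: remembers the last file offset it was told about.
static std::int64_t g_last_offset = -1;
inline void RecordOffset(const MapEvent& e) { g_last_offset = e.file_offset; }
```

Carrying the offset as a fixed 64-bit integer keeps values above 4 GiB
intact even where off_t is 32 bits wide.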
The new code is also hopefully tidier and slightly more portable. At
least there are fewer arch-specific ifdef-s.
Another somewhat notable change is that, since the mmap hook isn't
needed in the "minimal" configuration, we now don't override the
system's mmap/munmap/etc functions in this configuration. No big deal,
but it reduces the risk of damage if we somehow mess those up. E.g.
musl's mmap does a few things that our mmap replacement doesn't, such
as a very fancy vm_lock thingy. That doesn't look critical, but it is
a good thing for us not to interfere with when not necessary.
Fixes issue #1406 and issue #1407. Let's also mention issue #1010,
which is somewhat relevant.
After the change to release the page heap lock around returning memory
back to the kernel, the page heap test got a dependency on the page
heap lock, which was not available on Windows since the relevant
symbols are not exported.
The proposed fix is to simply duplicate all needed .cc files in the
page_heap_test project instead of linking to the dll. This is not
perfect but gets the job done until we figure out a better solution
(GNU/Linux will eventually get hidden visibility and will need it).
This fixes github issue 1189.
1. Remove superfluous per-file settings for include directory and runtime library.
2. Remove unnecessary project tcmalloc_minimal_unittest-static. We can simply build libtcmalloc_minimal as a static library and then link against the single .lib file.
3. Add separate configurations of patching and overriding facility for release mode.