This is off by default for now. And autotools-only. But after
significant preparatory work, we're now able to do it cleanly. Stuff
that was previously exported just for tests (like page heap stuff or
flags) is now unexported.
Just like on windows all symbols explicitly exported by
PERFTOOLS_DLL_DECL are visible and exported and rest are hidden. Those
include all the malloc/new APIs of course, and all the other symbols
we advertise in our headers (e.g. MallocExtension, MallocHook).
Updates issue #600
Biggest part of it is removal of SetTestResourceLimit. The reasoning
behind it is, we don't do it on windows. Tests pass just fine. So,
there is no reason to bother.
Logic was removed from thread_cache.{h,cc} into
thread_cache_ptr.{h,cc}.
Separation will help possible future evolution, and we already changed
the logic quite a bit:
* early access (when TLS isn't initialized yet) now uses global
ThreadCache instance. We therefore have ThreadCachePtr instances
managing required locking. This eliminates unnecessary complication of
PTHREADS_CRASHES_IF_RUN_TOO_EARLY logic, and any other danger of
touching TLS too early. BTW previous implementation actually leaked
initial early-initialized ThreadCache instance(!)
* old configure-time HAVE_TLS logic is amputated. Config-time part of
it made little sense as C++ 17 guarantees availability of
thread_local, but we have manually curated deny-list of "bad" OSes,
that we tested (via compile checks!) at configure time. Now this
is all compile time. There is now compile-time kHaveGoodTLS variable
and we're using it mostly via if constexpr.
* kHaveGoodTLS case of creating thread cache is simplified and made
more straightforward (no need to have in_setspecific logic).
* !kHaveGoodTLS case if fixed and improved too. We avoid
std:🧵:get_id, as it deadlocks on mingw. We use errno address as
a portable and (usually) async-signal safe 'my thread' identifier. We
also eliminate linear searching of thread's cache and replace it with
straightforward hash table lookup.
* Remove build dependency on HAVE_PTHREAD
* Remove build dependency on HAVE_STD_ALIGNED_VAL_T and ENABLE_ALIGNED_NEW_DELETE
* Remove redundant tcmalloc.h files & ensure there are no cross-build-tool references
* Adopt automake commit 26927d1 in the CMake build
- Fix CMake builds for MinGW and MSVC
- Ensure the Autotools, CMake and VSProj builds do not reference each others' config.h
- Use std:🧵:id instead of our own thread ID wrappers
- Moved explicit TLS wrapper functions into the tcmalloc:: namespace and change their visibility to hidden
Resolves#1486
As part of cpu profiler we're extracting current PC (program counter)
of out signal's ucontext. Different OS and hardware combinations have
different ways for that. We had a list of variants that we tested at
compile time and populated PC_FROM_UCONTEXT macro into config.h. It
caused duplication and occasional mismatches between our autoconf and
cmake bits.
So this commit changes testing to be compile-time. We remove
complexity from build system and add some to C++ source.
We use SFINAE to find which of those variants compile (and we silently
assume that 'compiles' implies 'works'; this is what config-time
testing did too). Occasionally we'll face situations where several
variants compile. And we couldn't handle this case in pure C++. So we
have a small Ruby program that generates chain of inheritance among
SFINAE-specialized class templates. This handles prioritization among
variants.
List of ucontext->pc extraction variants is mostly same. We dropped
super-obsolete (circa Linux kernel 2.0) arm variant. And NetBSD case
is now improved. We now use their nice architecture-independent macro
instead of x86-specific access.
This facility allowed us to build tcmalloc without linking in actual
-lpthread. Via weak symbols we checked at runtime if pthread functions
are available and if not, special single-threaded stubs were used
instead. Not always brining in pthread dependency helped performance
of some programs or libraries which depended at runtime on whether
threads are linked or not. Most notable of those are libstdc++ which
uses non-atomic refcounting on single threaded programs.
But such optional dependency on pthreads caused complications for
nearly no benefit. One trouble was reported in github issue #1110.
This days glibc/libstdc++ combo actually depends on
sys/single_threaded.h facility. So bringing pthread at runtime is
fine. Also modern glibc ships pthread symbols inside libc anyways and
libpthread is empty. I also found that for whatever reason on BSDs and
osx we already pulled in proper pthreads too.
So we loose nothing and we get issue #1110 fixed. And we simplify
everything.
- Some small automake changes. Add libc++ for AIX instead of libstdc++
- Add the interface changes for AIX:User-defined malloc replacement
- Add code to avoid use of pthreads library prior to its initialization
- Some small changes to the unittest case
- Update INSTALL for AIX
[alkondratenko@gmail.com]: lower-case/de-trailing-dot for commit subject line
[alkondratenko@gmail.com]: rebase
[alkondratenko@gmail.com]: amputate unused AM_CONDITIONAL for AIX
[alkondratenko@gmail.com]: explicitly mention libc_override_aix.h in Makefile.am
The idea is -momit-leaf-frame-pointer gives us performance pretty much
same as fully omitting frame pointers. And having frame pointers
elsewhere allows us to support cases when user's code is built with
frame pointers. We also pass all tests with
TCMALLOC_STACKTRACE_METHOD=generic_fp (not just libunwind or libgcc).
It is kinda minor feature, and apparently we never had it working. But
is a nice to have. Allows our users to override malloc/free/etc while
still being able to link to us (for tc_malloc for example). With
broken weakening we had this use-case broken for static library
case. And it should now work.
In most practical terms, this expands "official" heap leak checker
support to Linux/arm64 and Linux/riscv (mips-en and legacy arm are
likely to work & pass tests too now).
The code is now explicitly Linux-only, without trying to pretend
otherwise. Main goal of this change is to finally amputate
linux_syscall_support.h, which we historically had trouble maintaining
well. Biggest challenge was around thread listing facility which uses
clone (ptrace explicitly fails between threads) and that causes
difficulties around parent and child tasks sharing
errno. linux_syscall_support stuff had special feature to "redirect"
errno accesses. But it caused us for more trouble. We switched to
regular syscalls, and errno stamping avoidance is now simply via
careful programming.
A number of other cleanups is made (such us thread finding codes in
procfs which clearly was built for some ages old and odd kernels).
sem_post/sem_wait synchronization was previously potentially prone to
deadlock (if parent died at bad time). We now use pipe pair for this
synchronization and it is fully robust.
Previously we blindly tried to use libunwind whenever header is
detected. Even if actually working libunwind library is missing. This
is now fixed, so we attempt to use libunwind when it actually works.
Somehow recent freebsd ships libunwind.h (which seems to belong to
llvm's implementation), but apparently without matching .so. So then building
and linking failed.
As part of that we also upgrade required autoconf version to 2.69
which is what I see in rhel/centos 7 and ubuntu 14.04. Both are old
enough. And, of course, .tar.gz releases still ship "packaged" configure,
so will work on older distros.
This fixes issue #1335.
It is actually needed for libgcc backtracer from time to time. And
we've seen libunwind to need it too. Plus we've not heard of any
problems with it. So lets just always enable it.
This should fix github issue #1248.
Previously it only was respected on x86_64, but this days lots of modern
ABIs are without frame pointers by default (e.g. arm64 and riscv, and
even older mips).