In 57d6ecc4ae I removed obsolete
src/windows/TODO but failed to remove it from Makefile.am.
I also previously failed to arrange distribution of vendored
googletest. And we also failed to distribute headers in src/tests/.
This is all fixed now.
Turns out at least on FreeBSD tmpfile will fail if TMPDIR points to
a nonexistent directory. This also has the nice property of leaving it
down to users to set up TMPDIR the way they want.
There were far too many intermediate libraries, references to obsolete
libtool bugs and whatnot.
We now have libcommon.la as a convenience archive that contains
spinlock, logging and misc stuff. A number of unit tests that need
those facilities are being linked to this.
Then we have libstacktrace.la convenience archive, since it is used by
libtcmalloc.la and libprofiler.la.
And after that we're simply directly creating our main library
products: libtcmalloc_minimal{,_debug}.la, libtcmalloc_{,debug}.la,
libprofiler.la etc.
We now wrap StackTraceScope thingy in tcmalloc-specific parts, instead
of automagically inside every stacktrace.cc function. TCMalloc bits
all need to grab stacktraces via newly introduced
tcmalloc::GrabBacktrace (which handles emergency malloc wrapping).
The new approach eliminates the need for doing fake stacktrace scope.
CPU profiler, being a distinct .so library, couldn't take advantage of
emergency malloc anyway.
This simplifies the build further and eliminates another potential
point of runtime divergence when stacktrace is linked to both
libprofiler and libtcmalloc.
For compiling things automake never needs to be given a full set of
headers. Usually headers are specified so that make dist includes them
into archive, but we can achieve this goal easier.
This reduces size and complexity of our Makefile.am stuff.
Logic was moved from thread_cache.{h,cc} into
thread_cache_ptr.{h,cc}.
Separation will help possible future evolution, and we already changed
the logic quite a bit:
* early access (when TLS isn't initialized yet) now uses global
ThreadCache instance. We therefore have ThreadCachePtr instances
managing required locking. This eliminates unnecessary complication of
PTHREADS_CRASHES_IF_RUN_TOO_EARLY logic, and any other danger of
touching TLS too early. BTW previous implementation actually leaked
initial early-initialized ThreadCache instance(!)
* old configure-time HAVE_TLS logic is amputated. Config-time part of
it made little sense, as C++17 guarantees availability of
thread_local, but we have a manually curated deny-list of "bad" OSes
that we tested (via compile checks!) at configure time. Now this
is all compile time. There is now a compile-time kHaveGoodTLS variable
and we're using it mostly via if constexpr.
* kHaveGoodTLS case of creating thread cache is simplified and made
more straightforward (no need to have in_setspecific logic).
* !kHaveGoodTLS case is fixed and improved too. We avoid
std::thread::get_id, as it deadlocks on mingw. We use errno address as
a portable and (usually) async-signal safe 'my thread' identifier. We
also eliminate linear searching of thread's cache and replace it with
straightforward hash table lookup.
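The last two bullets can be sketched roughly as follows. This is only an illustration, not the code in thread_cache_ptr.cc: the names SelfThreadKey, Entry, FindOrClaim and CacheSlot, the table size and the hash constant are all made up for the example, and locking is left out entirely (the real code has to manage it).

```cpp
#include <cerrno>
#include <cstddef>
#include <cstdint>

// Assumption for this sketch: we pretend the platform has good TLS.
constexpr bool kHaveGoodTLS = true;

// errno is per-thread, so its address differs between live threads.
// Hypothetical helper name.
inline uintptr_t SelfThreadKey() {
  return reinterpret_cast<uintptr_t>(&errno);
}

// Hypothetical fixed-size open-addressing table keyed by the thread key.
// key == 0 marks an empty entry. Not thread-safe as written.
struct Entry {
  uintptr_t key = 0;
  void* cache = nullptr;
};
constexpr size_t kTableSize = 256;  // power of two
inline Entry g_table[kTableSize];

inline size_t HashKey(uintptr_t key) {
  // Multiplicative hash; the top 8 bits index the 256 entries.
  return static_cast<size_t>(
      (static_cast<uint64_t>(key) * 0x9E3779B97F4A7C15ull) >> 56);
}

// Returns the entry for key, claiming an empty one on first use.
// Returns nullptr if the table is full.
inline Entry* FindOrClaim(uintptr_t key) {
  size_t start = HashKey(key);
  for (size_t n = 0; n < kTableSize; n++) {
    Entry* e = &g_table[(start + n) & (kTableSize - 1)];
    if (e->key == key) return e;
    if (e->key == 0) {
      e->key = key;
      return e;
    }
  }
  return nullptr;
}

inline thread_local void* tls_cache = nullptr;

// Picks the lookup strategy at compile time.
inline void** CacheSlot() {
  if constexpr (kHaveGoodTLS) {
    return &tls_cache;
  } else {
    Entry* e = FindOrClaim(SelfThreadKey());
    return e ? &e->cache : nullptr;
  }
}
```

Repeated FindOrClaim(SelfThreadKey()) calls on one thread return the same entry, which is what makes it usable as a per-thread slot without touching TLS.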
This allows us, later, to avoid building this stuff in configurations
that don't use it. I have also reduced API and ABI surface to enable
further refactorings.
* Remove build dependency on HAVE_PTHREAD
* Remove build dependency on HAVE_STD_ALIGNED_VAL_T and ENABLE_ALIGNED_NEW_DELETE
* Remove redundant tcmalloc.h files & ensure there are no cross-build-tool references
* Adopt automake commit 26927d1 in the CMake build
- Fix CMake builds for MinGW and MSVC
- Ensure the Autotools, CMake and VSProj builds do not reference each others' config.h
- Use std::thread::id instead of our own thread ID wrappers
- Move explicit TLS wrapper functions into the tcmalloc:: namespace and change their visibility to hidden
Resolves #1486
See github issue #1474 for immediate reason.
Note, this entire idea of a number of convenience libraries is likely
simply an artifact of Google's codebase past. We don't really need this
complexity. But I am holding big reorganization of this for after API
and ABI work. For now, simply moving dynamic_annotations.cc into
libsysinfo fixes things. Most of the code links both anyway. So let's
just do it.
We use a shell wrapper for the actual test run, so we can inspect output
of pprof. But when we set up sampling_debug_test.sh we simply copied
regular sampling_test.sh, which ran the same non-debug test binary. Now
we sed-replace contents of the shell program when copying, so we test
the right binary.
Another thing we fix here: our (still hardcoded) test output path now
differs between sampling{,_debug}_test.sh. So this fixes the main
cause of flakiness of our unit tests.
We used msync to verify that address is readable. But msync gives
false positives for PROT_NONE mappings. And we recently got bug report
from user hitting this exact condition.
For correct access check, we steal idea from Abseil and do sigprocmask
with address used as new signal mask and with invalid HOW
argument. This works in today's Linux kernels and is among fastest
methods available. But is brittle w.r.t. possible kernel changes. So
we supply fallback method that does 2 syscalls.
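A rough sketch of the sigprocmask probe, modeled on what Abseil does, looks like this. It assumes Linux with an 8-byte kernel sigset (true on common architectures, not all); the function names are hypothetical and this is not the exact code in this commit.

```cpp
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

// Hypothetical name. Returns true if the 8 bytes around addr are readable.
bool ReadableViaSigprocmask(const void* addr) {
  // The kernel copies 8 bytes; align down so we never cross into the
  // next page.
  uintptr_t aligned = reinterpret_cast<uintptr_t>(addr) & ~uintptr_t{7};
  // With a null set pointer the call succeeds without checking anything.
  if (aligned == 0) return false;

  int saved_errno = errno;
  // HOW is invalid on purpose, so the call cannot change the signal mask.
  // Current kernels copy the set in first (EFAULT if unreadable) and only
  // then reject HOW (EINVAL).
  long rv = syscall(SYS_rt_sigprocmask, static_cast<long>(-1),
                    reinterpret_cast<void*>(aligned),
                    static_cast<void*>(nullptr), size_t{8});
  int err = errno;
  errno = saved_errno;
  return rv == -1 && err == EINVAL;
}

// Demo: a PROT_NONE page (the msync false-positive case) is rejected.
bool SigprocmaskRejectsProtNone() {
  void* page = mmap(nullptr, 4096, PROT_NONE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (page == MAP_FAILED) return false;
  bool readable = ReadableViaSigprocmask(page);
  munmap(page, 4096);
  return !readable;
}
```

Anything other than EINVAL is treated as "not readable" here, which is the conservative choice if the kernel ever changes the order of its checks.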
For non-Linux systems we implement the usual "write to pipe" trick, which
also has decent performance, but requires occasional pipe draining and
uses fds which could occasionally be damaged by some forking code.
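The "write to pipe" trick can be sketched like this. Again a minimal illustration with hypothetical names, assuming POSIX pipe/write/read semantics; pipe setup is not thread-safe here and fork handling is ignored.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical name. Asks the kernel to read one byte from addr by
// writing it into a pipe; an unreadable address yields EFAULT.
bool ReadableViaPipe(const void* addr) {
  static int fds[2] = {-1, -1};
  if (fds[0] < 0) {
    if (pipe(fds) != 0) return false;
    // Non-blocking on both ends, so a full or empty pipe never hangs us.
    for (int fd : fds) {
      fcntl(fd, F_SETFL, fcntl(fd, F_GETFL) | O_NONBLOCK);
    }
  }
  for (;;) {
    ssize_t rv = write(fds[1], addr, 1);
    if (rv == 1) return true;
    if (rv < 0 && errno == EINTR) continue;
    if (rv < 0 && errno == EAGAIN) {
      // Pipe is full: drain it and retry. This is the occasional draining.
      char buf[256];
      while (read(fds[0], buf, sizeof(buf)) > 0) {
      }
      continue;
    }
    return false;  // EFAULT or any other failure
  }
}

// Demo: a PROT_NONE page is reported as unreadable.
bool PipeProbeRejectsProtNone() {
  void* page = mmap(nullptr, 4096, PROT_NONE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (page == MAP_FAILED) return false;
  bool readable = ReadableViaPipe(page);
  munmap(page, 4096);
  return !readable;
}
```

Draining only on EAGAIN keeps the common path at one syscall per check.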
We also finally cover all new code with unit test.
Fixes github issue #1426