We do shell wrapper for actual test run, so we can inspect output of
pprof. But when we set up sampling_debug_test.sh we simply copied
regular sampling_test.sh, which ran same non-debug test binary. Now we
sed-replace contents of shell program when copying, so we test right
binary.
Another thing we fix here is our (still hardcoded) test output path is
now different between sampling{,_debug}_test.sh. So this fixes main
cause of flakiness of our unit tests.
We used msync to verify that address is readable. But msync gives
false positives for PROT_NONE mappings. And we recently got bug report
from user hitting this exact condition.
For correct access check, we steal idea from Abseil and do sigprocmask
with address used as new signal mask and with invalid HOW
argument. This works in today's Linux kernels and is among fastest
methods available. But is brittle w.r.t. possible kernel changes. So
we supply fallback method that does 2 syscalls.
For non-Linux systems we implement usual "write to pipe" trick. Which
also has decent performance, but requires occasional pipe draining and
uses fds which could occasionally be damaged by some forking codes.
We also finally cover all new code with unit test.
Fixes github issue #1426
This unbreaks building on older Linux distros. We missed this at
46d3315ad7 when dropped maybe_thread
stuff, since libprofiler indeed uses pthread, and because on newer
libc-s pthread stuff is now part of regular libc.so.
I am also dropping bogus LIBPROFILER stuff referring to some rpath
badness. Unsure what it was, maybe way back we did libstacktrace as a
proper libtool library, so maybe something was needed. But it is just
a convenience archive this days, so we don't really need to add it
everywhere libprofiler.la is linked.
There was this piece of makefile with indention to add stack tracing
functionality (for stuff like growthz, GetCallerStackTrace and
probably heap sampling) to work even in minimal configuration on
mingw.
What is odd is we fail to actually define libstacktrace.la target on
mingw, since libstacktrace.la requires WITH_STACK_TRACE automake
conditional which we don't enable on this platform. And yet somehow it
doesn't fail. It produces empty libstacktrace.la, so build kinda
works. Except at least on my machine it produces racy makefiles. So
lets not pretend and stop breaking our parallel builds.
Previous implementation wasn't entirely safe w.r.t. 32-bit off_t
systems. Specifically around mmap replacement hook. Also, API was a
lot more general and broad than we actually need.
Sadly, old mmap hooks API was shipped with our public headers. But
thankfully it appears to be unused externally (checked via github
search). So we keep this old API and ABI for the sake of formal API
and ABI compatibility. But this old API is now empty and always
fails (some OS/hardware combinations didn't have functional
implementations of those hooks anyways).
New API is 64-bit clean and only provides us with what we need. Namely
being able to react to virtual address space mapping changes for
logging, heap profiling and heap leak checker. I.e. no pre hooks or
mmap-replacement hooks. We also explicitly not ship this API
externally to give us freedom to change it.
New code is also hopefully tidier and slightly more portable. At least
there are fewer arch-specific ifdef-s.
Another somewhat notable change is, since mmap hook isn't needed in
"minimal" configuration, we now don't override system's
mmap/munmap/etc functions in this configuration. No big deal, but it
reduces risk of damage if we somehow mess those up. I.e. musl's mmap
does few things that our mmap replacement doesn't, such as very fancy
vm_lock thingy. Which doesn't look critical, but is good thing for us
not to interfere with when not necessary.
Fixes issue #1406 and issue #1407. Lets also mention issue #1010 which
is somewhat relevant.
This facility allowed us to build tcmalloc without linking in actual
-lpthread. Via weak symbols we checked at runtime if pthread functions
are available and if not, special single-threaded stubs were used
instead. Not always brining in pthread dependency helped performance
of some programs or libraries which depended at runtime on whether
threads are linked or not. Most notable of those are libstdc++ which
uses non-atomic refcounting on single threaded programs.
But such optional dependency on pthreads caused complications for
nearly no benefit. One trouble was reported in github issue #1110.
This days glibc/libstdc++ combo actually depends on
sys/single_threaded.h facility. So bringing pthread at runtime is
fine. Also modern glibc ships pthread symbols inside libc anyways and
libpthread is empty. I also found that for whatever reason on BSDs and
osx we already pulled in proper pthreads too.
So we loose nothing and we get issue #1110 fixed. And we simplify
everything.
- Some small automake changes. Add libc++ for AIX instead of libstdc++
- Add the interface changes for AIX:User-defined malloc replacement
- Add code to avoid use of pthreads library prior to its initialization
- Some small changes to the unittest case
- Update INSTALL for AIX
[alkondratenko@gmail.com]: lower-case/de-trailing-dot for commit subject line
[alkondratenko@gmail.com]: rebase
[alkondratenko@gmail.com]: amputate unused AM_CONDITIONAL for AIX
[alkondratenko@gmail.com]: explicitly mention libc_override_aix.h in Makefile.am
We used to explicitly link to libstdc++, libm and even libpthread, but
this should be handled by libtool since those are dependencies of
libtcmalloc_minimal. What also helps is we now build everything with
C++ compiler, not C. So libstdc++ or (libc++) dependency doesn't need
to be added at all, even if libtool for some reason fails to handle
it.
This fixes issue #1371
From time to time things file inside tcmalloc guts where calling to
malloc is not safe. Regular strerror does locale bits, so will
occasionally open files/malloc/etc. We avoid this by using our own
"safe" variant that hardcodes names of all POSIX errno constants.
We had plenty of old and mostly no more correct i386 cruft. Now that
generic_fp backtracer covers i386 just fine, we can drop explicit x86
backtracer.
With that we refactored and simplified stacktrace.cc mostly around
picking default implementation, but also adding few more minor
cleanups.
The idea is -momit-leaf-frame-pointer gives us performance pretty much
same as fully omitting frame pointers. And having frame pointers
elsewhere allows us to support cases when user's code is built with
frame pointers. We also pass all tests with
TCMALLOC_STACKTRACE_METHOD=generic_fp (not just libunwind or libgcc).
Previously we allowed test programs to be linked at the same time as
weakening is performed, rewriting the .a archives. So lets be more
explicit. We weaken after all-am (which "runs" everything including
libraries and programs), but before all target.
It is kinda minor feature, and apparently we never had it working. But
is a nice to have. Allows our users to override malloc/free/etc while
still being able to link to us (for tc_malloc for example). With
broken weakening we had this use-case broken for static library
case. And it should now work.
In most practical terms, this expands "official" heap leak checker
support to Linux/arm64 and Linux/riscv (mips-en and legacy arm are
likely to work & pass tests too now).
The code is now explicitly Linux-only, without trying to pretend
otherwise. Main goal of this change is to finally amputate
linux_syscall_support.h, which we historically had trouble maintaining
well. Biggest challenge was around thread listing facility which uses
clone (ptrace explicitly fails between threads) and that causes
difficulties around parent and child tasks sharing
errno. linux_syscall_support stuff had special feature to "redirect"
errno accesses. But it caused us for more trouble. We switched to
regular syscalls, and errno stamping avoidance is now simply via
careful programming.
A number of other cleanups is made (such us thread finding codes in
procfs which clearly was built for some ages old and odd kernels).
sem_post/sem_wait synchronization was previously potentially prone to
deadlock (if parent died at bad time). We now use pipe pair for this
synchronization and it is fully robust.
We previously tested wrong assumption that larger than page size size
classes have addresses aligned on page size. New code is making proper
check of size class.
Also added is unit test coverage for this previously failing
condition. And we now also run "assert-ful" unittests for big tcmalloc
too, not only tcmalloc_minimal configuration.
This fixes github issue #1254
This supports frame pointer backtracing on x86-64, aarch64 and
riscv-s (should work for both 32 and 64 bits).
Also added is detection of borked libunwind on aarch64-s. In this case
frame pointer unwinder is preferred.
Previously it only was respected on x86_64, but this days lots of modern
ABIs are without frame pointers by default (e.g. arm64 and riscv, and
even older mips).
Clang mostly ignores those anyways, so our tests needed better way to
disable optimizations (clang is quite aggressive replacing new/delete
pair with stack allocation).
Everyone should be using golang pprof from github.com/google/pprof, but
distros still ship our perl version and not everyone is aware of
better pprof yet.
This is another step in completely dropping perl pprof. We still
default to installing it, but hopefully we'll be able to convince
distros to disable this soon.
We still install pprof under pprof-symbolize name because
stack traces symbolization depends on it, and because golang pprof
won't support this feature.
This is related to issue #1038.
1.Remove superfluous per file settings for include directory and runtime library.
2.Remove unnecessary project tcmalloc_minimal_unittest-static. We can simply build libtcmalloc_minimal as a static library and then link against the single .lib file.
3.Add separate configurations of patching and overriding facility for release mode.
Only add -static to malloc_bench_LDFLAGS and binary_trees_LDFLAGS if
ENABLE_STATC is set otherwise build with some compilers will fail if
user has decided to build only the shared version of gperftools
libraries
Signed-off-by: Fabrice Fontaine <fontaine.fabrice@gmail.com>
- Add auto-detection of std::align_val_t presence to configure scripts. This
indicates that the compiler supports C++17 operator new/delete overloads
for overaligned types.
- Add auto-detection of -faligned-new compiler option that appeared in gcc 7.
The option allows the compiler to generate calls to the new operators. It is
needed for tests.
- Added overrides for the new operators. The overrides are enabled if the
support for std::align_val_t has been detected. The implementation is mostly
based on the infrastructure used by memalign, which had to be extended to
support being used by C++ operators in addition to C functions. In particular,
the debug version of the library has to distinguish memory allocated by
memalign from that by operator new. The current implementation of sized
overaligned delete operators do not make use of the supplied size argument
except for the debug allocator because it is difficult to calculate the exact
allocation size that was used to allocate memory with alignment. This can be
done in the future.
- Removed forward declaration of std::nothrow_t. This was not portable as
the standard library is not required to provide nothrow_t directly in
namespace std (it could use e.g. an inline namespace within std). The <new>
header needs to be included for std::align_val_t anyway.
- Fixed operator delete[] implementation in libc_override_redefine.h.
- Moved TC_ALIAS definition to the beginning of the file in tcmalloc.cc so that
the macro is defined before its first use in nallocx.
- Added tests to verify the added operators.
[alkondratenko@gmail.com: fixed couple minor warnings, and some
whitespace change]
[alkondratenko@gmail.com: removed addition of TC_ALIAS in debug allocator]
Signed-off-by: Aliaksey Kandratsenka <alkondratenko@gmail.com>