Commit Graph

810 Commits

Author SHA1 Message Date
tigeran
4ac4d88a99 Remove unused using declaration
std::sort in `heap-profiler.cc` is not in use, so remove it.
2023-07-06 11:41:24 -04:00
Aliaksey Kandratsenka
4e24f4adaa fix overflow in c-f-l populate when span address is close to top
This fixes github issue #1323.

When populating a span with a linked list of objects, the ptr + size <= limit
condition could overflow before being compared to limit. The fixed code
tests the limit carefully. We also provide a version using
__builtin_add_overflow, since this is a semi-hot loop and we want it to
be fast.
2023-07-04 01:45:33 -04:00
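The overflow described above can be illustrated with a minimal sketch. The function names here are hypothetical and are not taken from the gperftools source; the sketch only assumes a compiler that provides `__builtin_add_overflow` (GCC and clang do) and falls back to a subtraction-based check elsewhere.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helpers, for illustration only.
//
// The naive test `ptr + size <= limit` can wrap around when ptr is close to
// the top of the address space, making an out-of-range object look valid.

// Portable form: never computes ptr + size; compares size to the room left.
inline bool FitsBelowLimitPortable(std::uintptr_t ptr, std::size_t size,
                                   std::uintptr_t limit) {
  return ptr <= limit && size <= limit - ptr;
}

// Builtin form: lets the compiler use the carry flag to detect wraparound.
inline bool FitsBelowLimitBuiltin(std::uintptr_t ptr, std::size_t size,
                                  std::uintptr_t limit) {
#if defined(__GNUC__) || defined(__clang__)
  std::uintptr_t end;
  if (__builtin_add_overflow(ptr, size, &end)) {
    return false;  // ptr + size wrapped past the top of the address space
  }
  return end <= limit;
#else
  return FitsBelowLimitPortable(ptr, size, limit);
#endif
}
```

Both forms reject an object that starts 8 bytes below the top of the address space and claims 16 bytes, which the naive comparison would accept after wrapping.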
Aliaksey Kandratsenka
f15425dc99 implement SafeStrError and use it inside strerror
This fixes issue #1371

From time to time things fail inside tcmalloc guts where calling
malloc is not safe. Regular strerror does locale bits, so will
occasionally open files/malloc/etc. We avoid this by using our own
"safe" variant that hardcodes names of all POSIX errno constants.
2023-07-03 18:14:05 -04:00
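The idea of a "safe" strerror can be sketched roughly as follows. This is not the actual SafeStrError code: the function name, the caller-supplied buffer, and the short list of constants are assumptions made for illustration. The point is only that known errno values map to string literals, and unknown values are formatted by hand, so nothing here allocates or touches locale data.

```cpp
#include <cerrno>
#include <cstddef>

// Hypothetical sketch, not the gperftools implementation.
// Returns a name for `err` without calling malloc, strerror or printf.
inline const char* SafeErrnoName(int err, char* buf, std::size_t buf_size) {
  switch (err) {
    case ENOMEM: return "ENOMEM";
    case EINVAL: return "EINVAL";
    case ENOENT: return "ENOENT";
    case EINTR:  return "EINTR";
    case EAGAIN: return "EAGAIN";
    default: break;  // a real table would list every POSIX constant
  }
  if (buf_size == 0) return "";

  // Unknown value: write "errno <N>" into the caller's buffer by hand.
  static const char kPrefix[] = "errno ";
  char digits[16];
  std::size_t n = 0;
  unsigned value = err < 0 ? 0u - static_cast<unsigned>(err)
                           : static_cast<unsigned>(err);
  do {
    digits[n++] = static_cast<char>('0' + value % 10);
    value /= 10;
  } while (value != 0);

  std::size_t pos = 0;
  for (std::size_t i = 0; kPrefix[i] != '\0' && pos + 1 < buf_size; ++i) {
    buf[pos++] = kPrefix[i];
  }
  if (err < 0 && pos + 1 < buf_size) buf[pos++] = '-';
  while (n > 0 && pos + 1 < buf_size) buf[pos++] = digits[--n];
  buf[pos] = '\0';
  return buf;
}
```

A caller inside the allocator would pass a small stack buffer, so the whole path stays free of heap allocation.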
Aliaksey Kandratsenka
88d0fd5a3b remove dead remains of arm_instruction_set_select header 2023-07-03 17:29:13 -04:00
Aliaksey Kandratsenka
d9cecd6e42 print errno whenever dumping heap profile fails for some reason
Refers to issue #1116
2023-07-03 16:45:00 -04:00
Aliaksey Kandratsenka
fe5ba4b524 use sys/ptrace.h instead of linux/ptrace.h
As was pointed out at
https://github.com/gperftools/gperftools/pull/1329 there is no need
for us to depend on linux-headers thingy. Libc headers should be
enough.
2023-07-03 16:20:26 -04:00
Aliaksey Kandratsenka
55e798623f unbreak compilation with --disable-cpu-profiler 2023-07-03 15:31:21 -04:00
Aliaksey Kandratsenka
dd89dc7d01 install compat headers and .pc files only with matching libs
This closes issue #1356
2023-07-03 15:29:56 -04:00
Ali Saidi
a63ea13b20 Add an arm64 implementation for SpinlockPause()
This patch adds an arm64 implementation of the SpinlockPause() function
allowing the core to adaptively spin to acquire the lock and improving
performance in multi-threaded environments where the locks can be contended.

From experience with other projects, we've found a single isb is the closest
corollary to the rep; nop timing of x86.

Overall, across thirty runs, the binary_trees_shared benchmark improves 6% in
average, median, and P90 runtime with this change.
2023-07-03 14:05:13 -04:00
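As a rough illustration of the pause hint described above, the sketch below issues `rep; nop` on x86 and a single `isb` on arm64 inside a bounded spin loop. The names `CpuRelaxSketch` and `SpinAcquire` are hypothetical, the inline assembly assumes a GCC-compatible compiler, and the loop is far simpler than a real adaptive spinlock.

```cpp
#include <atomic>

// Hypothetical stand-in for a SpinlockPause()-style hint.
inline void CpuRelaxSketch() {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
  __asm__ __volatile__("rep; nop" ::: "memory");  // encodes the PAUSE hint
#elif defined(__GNUC__) && defined(__aarch64__)
  __asm__ __volatile__("isb" ::: "memory");       // single isb, per the commit
#else
  // No known hint on this target: spin without pausing.
#endif
}

// Tries to take the lock, pausing between attempts; gives up after
// max_spins so the caller can fall back to sleeping.
inline bool SpinAcquire(std::atomic<bool>& locked, int max_spins) {
  for (int i = 0; i < max_spins; ++i) {
    bool expected = false;
    if (locked.compare_exchange_strong(expected, true,
                                       std::memory_order_acquire)) {
      return true;
    }
    CpuRelaxSketch();
  }
  return false;
}
```

The pause keeps a waiting core from hammering the cache line that holds the lock word while another thread is about to release it.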
Sergey Fedorov
3c0738ac1c libc_override_osx.h: a small fix for ppc 2023-07-03 13:38:35 -04:00
Aliaksey Kandratsenka
cc4e289a83 drop weakening from cmake build
Weakening is optional and in github issue #1392 we apparently tried to
weaken on windows and failed. So let's not even try.
2023-07-03 13:02:59 -04:00
Aliaksey Kandratsenka
44eb0ee83c bump heap profiler stats fields to 64 bit
People have actually seen overflows there. Fixes github issue #1303
2023-07-03 12:47:35 -04:00
Aliaksey Kandratsenka
972c12f77d refactor stacktrace.cc and drop x86 backtracer
We had plenty of old and mostly no-longer-correct i386 cruft. Now that
generic_fp backtracer covers i386 just fine, we can drop explicit x86
backtracer.

With that we refactored and simplified stacktrace.cc mostly around
picking default implementation, but also adding few more minor
cleanups.
2023-07-02 22:30:00 -04:00
Aliaksey Kandratsenka
d9b178695f support more OSes in generic-fp
We're still x86+arm+riscv only, but netbsd and freebsd work too. OSX
as well.
2023-07-02 22:30:00 -04:00
Aliaksey Kandratsenka
4b78ffd03c try building with -fno-omit-frame-pointer -momit-leaf-frame-pointer
The idea is -momit-leaf-frame-pointer gives us performance pretty much
same as fully omitting frame pointers. And having frame pointers
elsewhere allows us to support cases when user's code is built with
frame pointers. We also pass all tests with
TCMALLOC_STACKTRACE_METHOD=generic_fp (not just libunwind or libgcc).
2023-07-02 22:30:00 -04:00
Aliaksey Kandratsenka
e5ac219780 restore unwind-bench
We previously deleted it, since it wasn't portable enough. But
unportable bits are now ifdef-ed out, so we can return it.
2023-07-02 22:30:00 -04:00
Aliaksey Kandratsenka
cf046c8421 extend generic frame pointer backtracer to support x86-32 (aka x32) 2023-07-02 22:30:00 -04:00
Aliaksey Kandratsenka
f56c27910a extend generic frame pointer backtracer to support i386 2023-07-02 22:30:00 -04:00
Aliaksey Kandratsenka
58d6842576 more coverage for stacktrace_unittest 2023-07-02 22:30:00 -04:00
Aliaksey Kandratsenka
25698cd1b8 improve diagnostics for stacktrace_unittest 2023-07-02 22:30:00 -04:00
Aliaksey Kandratsenka
a25e7fa8b0 cleanup cmake's config.h stuff
Some header defines were not cmakedefine01.
2023-07-02 22:30:00 -04:00
Aliaksey Kandratsenka
88d7e65cc2 drop unused libtool target in our Makefile.am
Not sure what it was for, but it is not useful today.
2023-07-02 22:30:00 -04:00
Aliaksey Kandratsenka
ae4aafa468 freebsd+x86-64 pc-from-ucontext is not untested 2023-07-02 22:30:00 -04:00
Aliaksey Kandratsenka
cddd759bd1 disable libgcc stack trace capturing on freebsd 2023-07-02 22:30:00 -04:00
Aliaksey Kandratsenka
4c72b14d35 support proc maps iterator on NetBSD
Turns out its procfs is available by default, and is fairly similar
to Linux's. So we can just reuse our Linux codes. All tests now pass
on NetBSD!
2023-07-02 22:30:00 -04:00
Aliaksey Kandratsenka
7ffd35a54b correctly detect and link to backtrace_symbols
BSDs need -lexecinfo
2023-07-02 22:30:00 -04:00
Aliaksey Kandratsenka
5b86aa0a51 reimplement GetProgramInvocationName via reading /proc/self/exe
This enables us to support NetBSD, as it lacks
program_invocation_name, but has fairly full-featured procfs.
2023-07-02 22:30:00 -04:00
Aliaksey Kandratsenka
50b5219635 don't NO_INTR close
This is wrong. See man 2 close.
2023-07-02 22:30:00 -04:00
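The distinction behind this change can be sketched as below, assuming a POSIX system. The helper names are hypothetical. Retrying `read` after EINTR is a common and safe pattern; retrying `close` is not, because on Linux the descriptor may already be released when `close` returns EINTR, and a second call could close an unrelated descriptor that another thread has just opened with the same number.

```cpp
#include <cerrno>
#include <cstddef>
#include <unistd.h>

// Hypothetical helper: retrying read() after a signal interruption is fine,
// since an interrupted read has transferred nothing.
inline ssize_t ReadRetryingOnEintr(int fd, void* buf, std::size_t count) {
  ssize_t rv;
  do {
    rv = read(fd, buf, count);
  } while (rv < 0 && errno == EINTR);
  return rv;
}

// Hypothetical helper: close() is called exactly once. Even if it reports
// EINTR, the descriptor must be treated as gone and never closed again.
inline int CloseOnce(int fd) {
  return close(fd);
}
```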
Aliaksey Kandratsenka
7dd1b82378 simplify project by making it C++-only
I.e. no need for any AC_LANG_PUSH stuff in configure. Most usefully,
only CXXFLAGS needs to be set now when you need to tweak compile
flags.
2023-07-02 22:30:00 -04:00
Aliaksey Kandratsenka
cf28e03567 correctly order weakening step to avoid race
Previously we allowed test programs to be linked at the same time as
weakening is performed, rewriting the .a archives. So let's be more
explicit. We weaken after all-am (which "runs" everything including
libraries and programs), but before all target.
2023-07-02 22:30:00 -04:00
Aliaksey Kandratsenka
a39073886a unbreak symbol weakening
It is kinda minor feature, and apparently we never had it working. But
it is nice to have. Allows our users to override malloc/free/etc while
still being able to link to us (for tc_malloc for example). With
broken weakening we had this use-case broken for static library
case. And it should now work.
2023-07-02 22:30:00 -04:00
Aliaksey Kandratsenka
630dac81ea implement simpler ChangeLog generation for source tarballs
We used ax_generate_changelog which works great. But it made our
makefile require GNU make, which was causing annoyance on bsd systems.
2023-07-02 22:30:00 -04:00
Aliaksey Kandratsenka
b30b984281 [osx] make profiler unittest pass
somehow VerifyIdentical tests fail for some formatting details. But
since it tests --raw feature of our deprecated perl pprof
implementation, which we don't intend supporting anyways, we drop this
test.

There are some details about how wc -l returns stuff and how zsh uses
it to compare to 3. So we now explicitly strip whitespace and it
works.
2023-07-02 22:30:00 -04:00
Aliaksey Kandratsenka
7437e0f612 [osx] make heap profiler unittest pass
Apparently awk's comparison $2 > 90 doesn't work when $2 is 100.0. I
frobbed it some and it works now.
2023-07-02 22:30:00 -04:00
Aliaksey Kandratsenka
6683e4f6ee [osx] implement native c++ allocation operators on osx
OSX has really screwed performance for its memory allocator due to
their malloc zones stuff. We and our competitors hook into that
facility instead of directly overriding allocation functions. Which has
massive performance overhead. This is, apparently, so that malloc is
same as allocating from default zone.

As a result, osx'es C++ operator new/delete performance is even worse.
Because we call operator new, it calls malloc, which calls
malloc_zone_malloc, which calls more stuff and eventually tc_malloc.

Thankfully, for C++ API there is no need for such "integration"
between 'stock' memory allocator and malloc zones stuff. So we can
hook those APIs directly. This speeds up malloc_bench numbers on OSX
about 3x.

This change also unbreaks couple tests (e.g. heap profiler unittest)
which depend on MallocHook_GetCallerStackTrace to return precisely
stack trace up to memory allocator call.

Performance-wise, OSX still lacks good TLS and there is still one jump
indirection even for operator new/delete API due to lack of support for
aliases. So Linux or BSD, or even windows would still be significantly
faster on malloc performance on same hardware.
2023-07-02 22:30:00 -04:00
Aliaksey Kandratsenka
11d4253ead [osx] don't crash debugallocator on 0-sized allocations
OSX's malloc zones facility is calling 'usable size' before each (!)
free and uses size to choose between our allocator and something
else. And returning 0 breaks this choice. Which happens if we got
0-sized malloc request, which our tests exercise couple times. So we
don't let debug allocator on OSX have 0-sized chunks. This unbreaks
debug allocation tests on OSX.
2023-07-02 22:30:00 -04:00
Aliaksey Kandratsenka
b0c2ab298a [osx] place malloc-zone-related functions to google_malloc section
On OSX we hook into "system's" malloc through this zones facility, so
we need to place those interface functions into google_malloc
section. This largely unbreaks MallocHook_GetCallerStackTrace on OSX.
2023-07-02 22:30:00 -04:00
Aliaksey Kandratsenka
531ca4fdca [osx] unbreak LowLevelAlloc
For some reason clang shipped with osx miscompiles LowLevelAlloc when
it tries to place Alloc methods into malloc_hooks section. But since
we don't really need that placement on OSX, we can simply drop that
section placement.
2023-07-02 22:30:00 -04:00
Aliaksey Kandratsenka
e78238d94d reworked heap leak checker for more portability
In most practical terms, this expands "official" heap leak checker
support to Linux/arm64 and Linux/riscv (mips-en and legacy arm are
likely to work & pass tests too now).

The code is now explicitly Linux-only, without trying to pretend
otherwise. Main goal of this change is to finally amputate
linux_syscall_support.h, which we historically had trouble maintaining
well. Biggest challenge was around thread listing facility which uses
clone (ptrace explicitly fails between threads) and that causes
difficulties around parent and child tasks sharing
errno. linux_syscall_support stuff had special feature to "redirect"
errno accesses. But it caused us more trouble. We switched to
regular syscalls, and errno stamping avoidance is now simply via
careful programming.

A number of other cleanups are made (such as thread finding codes in
procfs which clearly was built for some ages old and odd kernels).

sem_post/sem_wait synchronization was previously potentially prone to
deadlock (if parent died at bad time). We now use pipe pair for this
synchronization and it is fully robust.
2023-07-02 22:30:00 -04:00
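Why a pipe is more robust than a semaphore for this kind of handshake can be shown with a small sketch, assuming POSIX pipes. The helper names are hypothetical and this is not the heap checker's code. A waiter blocked in `sem_wait` never learns that its peer died, whereas a waiter blocked in `read` on a pipe gets end-of-file (a return value of 0) as soon as every write end is closed, which happens automatically when the peer exits.

```cpp
#include <cerrno>
#include <unistd.h>

// Hypothetical helper: wakes the peer by writing one byte to the pipe.
inline bool SignalPeer(int write_fd) {
  const char byte = 1;
  ssize_t rv;
  do {
    rv = write(write_fd, &byte, 1);
  } while (rv < 0 && errno == EINTR);
  return rv == 1;
}

// Hypothetical helper: blocks until the peer signals.
// Returns true on a signal, false if the peer is gone (EOF) or on error.
inline bool WaitForPeer(int read_fd) {
  char byte;
  ssize_t rv;
  do {
    rv = read(read_fd, &byte, 1);
  } while (rv < 0 && errno == EINTR);
  return rv == 1;
}
```

For two-way synchronization, each side would hold the read end of one pipe and the write end of the other, and close the ends it does not use.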
Aliaksey Kandratsenka
2186967987 fix heap checker unittest
We had shell wrapper for heap checker unittest, but it failed to deal
with heap-checker-debug variant. So we now posix_spawn from .cc test
instead.
2023-07-02 22:30:00 -04:00
Aliaksey Kandratsenka
1c3f0dda1b ensure that heap checker initialization actually calls malloc
I.e. all recent compilers (both gcc and clang) optimize out delete(new
int) sequence to nothing. We call to tc_new and tc_delete directly to
suppress this optimization.
2023-07-02 22:30:00 -04:00
Aliaksey Kandratsenka
b96dc99dbf one trivial config cleanup 2023-07-02 22:30:00 -04:00
Aliaksey Kandratsenka
b7e47a77c0 simplify heap checker building default to be Linux-only
This also fixes cmake and freebsd where previously check for freebsd
wasn't working.
2023-07-02 22:30:00 -04:00
Aliaksey Kandratsenka
b58cbd2e23 make preamble patcher build and run (win64 only)
First, ml64 (amd64 version of masm) doesn't support /coff. Second, it
also doesn't support (nor use) .model directive.

Sadly, asm is inherently amd64-only, so this entire test won't build
in i386 configuration.
2023-07-02 22:30:00 -04:00
Aliaksey Kandratsenka
54605b8a58 amputate old atomic ops implementation 2023-07-02 22:30:00 -04:00
Aliaksey Kandratsenka
ea0988b020 port spinlocks atomic ops usage to std::atomic 2023-07-02 21:28:30 -04:00
Aliaksey Kandratsenka
3494eb29d6 switch malloc hooks to std::atomic
This makes our code one step closer to killing unportable custom
atomics implementation. I did manually inspect generated code that
fast-path is identical and slow path's changes of generated code are
cosmetic.
2023-07-02 21:28:30 -04:00
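A hook stored in a `std::atomic` can be sketched as follows. The `HookSlot` class and `AllocHook` type are hypothetical and much simpler than the real malloc hook machinery; the sketch only shows the general shape, where the fast path is a single atomic load plus a null check.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical hook signature, for illustration only.
using AllocHook = void (*)(const void* ptr, std::size_t size);

class HookSlot {
 public:
  // Installs a new hook and returns the previous one (nullptr if none).
  AllocHook Exchange(AllocHook hook) {
    return hook_.exchange(hook, std::memory_order_acq_rel);
  }

  // Fast path: one atomic load, then a call only if a hook is installed.
  void Invoke(const void* ptr, std::size_t size) const {
    AllocHook hook = hook_.load(std::memory_order_acquire);
    if (hook != nullptr) hook(ptr, size);
  }

 private:
  std::atomic<AllocHook> hook_{nullptr};
};
```

On mainstream targets an atomic load of a pointer compiles to an ordinary load, which fits the commit's observation that the fast path's generated code did not change.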
Aliaksey Kandratsenka
0e21f36843 unbreak frame skipping in generic-fp backtrace method 2023-07-02 21:28:30 -04:00
Aliaksey Kandratsenka
6f3cacf698 stacktrace_unittest: test backtracing from signals more reliably
Somehow I originally chose to segfault to test this, and it caused us
to deal with 'skipping over' null pointer access. Which ended up not
so easy to do semi-portably. But easier way is to do same
itimer/sigprof thing that we do in actual profiler.

We're still somewhat at the mercy of compiler placing code "normally",
but this implementation seems more robust anyways.
2023-07-02 21:28:30 -04:00
Aliaksey Kandratsenka
68b442714a stop working around obsolete compilation bug
This was targeting some obscure (and perhaps even Google-internal)
version of a now-ancient compiler. No need to keep this anymore.
2023-07-02 21:28:30 -04:00