Commit Graph

1132 Commits

Author SHA1 Message Date
Aliaksei Kandratsenka
5273567470 [trivialre] fix past-end-of-string access in MatchSubstring
Thanks go to MSVC's string_view asserts, which check this.
2024-09-16 19:33:14 -04:00
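
Note on 5273567470: a minimal sketch of the kind of bounds check that avoids past-the-end string_view access. MatchesAt is a hypothetical helper written for illustration, not the actual MatchSubstring code from trivialre.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper (not the trivialre implementation): reports whether
// `needle` occurs in `haystack` starting exactly at offset `pos`.
// Both length checks run before any element is touched, so nothing past
// the end of `haystack` is ever read.
bool MatchesAt(std::string_view haystack, std::size_t pos,
               std::string_view needle) {
  if (pos > haystack.size()) return false;
  if (needle.size() > haystack.size() - pos) return false;
  return haystack.substr(pos, needle.size()) == needle;
}
```

Checking the remaining length with a subtraction (rather than `pos + needle.size()`) also avoids overflow for very large offsets.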
Aliaksei Kandratsenka
38abfd9038 make dist windows/CMakeLists.txt 2024-09-16 19:33:14 -04:00
Aliaksei Kandratsenka
d5c94e0bd3 unbreak make dist for benchmark/trivialre.h 2024-09-16 18:17:02 -04:00
Aliaksei Kandratsenka
e0844fa799 [low_level_alloc_unittest] cleanup unused variable usage 2024-09-16 16:30:10 -04:00
Aliaksei Kandratsenka
a4e2f95038 [osx] unbreak EmergencyMallocNoHook test
OSX's malloc zones stuff is essentially incompatible with emergency
malloc. In order to handle it, our unit tests use tc_free directly in
order to free memory allocated by emergency malloc. So let's keep doing
it for our newer code that exercises the calloc/free pair as well.
2024-09-16 13:13:42 -04:00
Aliaksei Kandratsenka
92fd07cc2f bump googletest version to latest 2024-09-14 22:54:29 -04:00
Aliaksey Kandratsenka
ea35d14585 stop checking unused malloc_hook "subsection"
Originally at Google they had to do 2 subsections for hookable
functions because of some linking details (in bazel infrastructure they
still do different libraries as .so-s for tests). So "generic" hooks
(such as mmap/sbrk) were in malloc_hook section and tcmalloc's
were/are in google_malloc. Since those are different bazel/blaze
libraries. And we kept this distinction simply because no-one bothered
to undo it, despite us never needing it.

We recently refactored mmap/sbrk hooking, and we don't use the section
stuff anymore for those hooks. So there is no malloc_hook section
anymore, and we were getting bogus and useless warnings about an empty
section. So let's avoid this.
2024-09-14 17:24:20 -04:00
Aliaksey Kandratsenka
fd09d89fcd [qnx] provide -lregex dependency for googletest
For more context see: https://github.com/gperftools/gperftools/pull/1544
2024-09-14 16:02:46 -04:00
Aliaksey Kandratsenka
a5d86777ce [malloc_bench] add rnd_dependent_8cores benchmark
This benchmark exercises multi-threaded central free list operations,
and this is where we're losing to a bunch of competing mallocs (i.e. those
which shard the heap).
2024-09-13 18:02:37 -04:00
Aliaksey Kandratsenka
ea81e46ff1 improve benchmarks facility
We now support a set of command line flags similar to "abseil"
benchmark thingy. I.e. to let people specify a subset of benchmarks or
run them longer/shorter as needed.

This commit also includes small, portable and very simple regexp
facility. It isn't good enough for some production use, but it is
plenty good for some testing uses or benchmark selection.
2024-09-13 18:01:25 -04:00
Aliaksey Kandratsenka
7fa0c2da53 modernize malloc_bench
Instead of relying on gperftools-specific tc_XYZ functions for sized
deallocation and memalign we use standard C++ facilities. There are
also other minor improvements like mallocing larger buffers rather
than statically allocating them.
2024-09-05 22:56:23 -04:00
Olivier Langlois
f46c141b4e make ThreadCache constructor/destructor private
1. This documents the intent that the way to create/destroy a ThreadCache
   object is through the static methods NewHeap()/DeleteCache()
2. It makes using the class less error prone. The compiler will complain
   if some new code is accidentally creating objects directly
3. This may allow some compilers to optimize code knowing that those
   functions are private

Signed-off-by: Olivier Langlois <olivier@trillion01.com>
2024-09-01 15:45:16 -04:00
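
Note on f46c141b4e: a minimal sketch of the private constructor/destructor pattern described above. The class Cache and its live-object counter are hypothetical stand-ins; only the NewHeap()/DeleteCache() names come from the commit message, and the real ThreadCache carries far more state.

```cpp
#include <cassert>

// Hypothetical stand-in for the pattern, not the real ThreadCache.
class Cache {
 public:
  // The only way to create an instance.
  static Cache* NewHeap() {
    ++LiveCount();
    return new Cache();
  }

  // The only way to destroy an instance.
  static void DeleteCache(Cache* cache) {
    delete cache;
    --LiveCount();
  }

  // Illustration-only counter of live objects.
  static int& LiveCount() {
    static int count = 0;
    return count;
  }

 private:
  Cache() = default;   // `Cache c;` outside the class fails to compile
  ~Cache() = default;  // `delete ptr;` outside the class fails to compile
};
```

With this shape, accidental direct construction or deletion is rejected at compile time instead of being caught in review.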
Aliaksey Kandratsenka
ae15d7a490 fix spurious rare failures in profile handler unittest
Its method of verifying that cpu profiling ticks happen is
inherently brittle. It sleeps for a certain time and then checks if ticks
happened during that time. And sleeping is by wall clock time. But due
to cpu scheduling being unpredictable, it is not impossible to see no
ticks even after waiting 200 ms. We mitigate the issue by having the code
loop a bit until it sees ticks happen.
2024-08-18 15:56:17 -04:00
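
Note on ae15d7a490: a rough sketch of the "loop until ticks are seen" idea, as opposed to a single fixed sleep. WaitForTicks, the counter and the 1 ms polling interval are assumptions made for illustration; the actual unittest code may be structured differently.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical helper: polls `counter` until it grows past `start` or the
// timeout expires. Returns true if a tick was observed in time.
bool WaitForTicks(const std::atomic<int>& counter, int start,
                  std::chrono::milliseconds timeout) {
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  while (counter.load(std::memory_order_relaxed) <= start) {
    if (std::chrono::steady_clock::now() >= deadline) return false;
    // Short naps keep the loop cheap while still reacting quickly.
    std::this_thread::sleep_for(std::chrono::milliseconds(1));
  }
  return true;
}
```

A test using this returns as soon as a tick arrives and only fails after the whole timeout has passed, which makes it much less sensitive to scheduling noise.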
Xiang.Lin
2e0b81852e [qnx] prefer fp unwinder 2024-08-13 14:26:05 -04:00
Xiang.Lin
68ae9e624e [qnx] fix empty proc maps info
Qnx pmap file starts with a header line, skip it to make ForEachLine
continue feeding next valid proc maps line.

vaddr,size,flags,prot,maxprot,dev,ino,offset,rsv,guardsize,refcnt,mapcnt,path
0x0000001c0ae92000,0x0000000000002000,0x00010071,0x05,0x0d,0x00000802,0x00000000c00000ab, \
  0x0000000000000000,0x0000000000000000,0x00000000,0x0000017e,0x0000017c,/bin/cat
2024-08-13 14:26:05 -04:00
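
Note on 68ae9e624e: a small sketch of skipping the header line while walking pmap-style text. MappingLines is a hypothetical function, and recognizing the header by its "vaddr," prefix is an assumption based on the sample above; the real fix lives in the ForEachLine-based parser.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical sketch: returns the mapping lines of QNX-style pmap text,
// dropping empty lines and the header line that starts with "vaddr,".
std::vector<std::string> MappingLines(const std::string& text) {
  std::vector<std::string> out;
  std::istringstream in(text);
  std::string line;
  while (std::getline(in, line)) {
    if (line.empty()) continue;
    if (line.rfind("vaddr,", 0) == 0) continue;  // prefix match: header
    out.push_back(line);
  }
  return out;
}
```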
Aliaksey Kandratsenka
285908e8c7 actualize TCMallocGuard comment
It was incorrectly referring to long gone behavior of updating
environment variables to tell (ancient versions of) libstdc++ to use
new/delete.
2024-06-21 15:52:57 -04:00
Aliaksey Kandratsenka
469da8dac6 override_functions: add _{msize,recalloc}_base
Apparently modern ucrt does have _msize_base and _recalloc_base
defined in the same .obj files as their corresponding "non-base"
functions. And they're seemingly used somewhere in their C or C++
runtime. So let's provide them too in our override_functions.cc, so
that overriding malloc routines in MSVC's static C runtime option
works.

Otherwise, whatever _msize_base usage pulls in the msize.obj file that wants
to define regular _msize, and with _msize already provided by
override_functions, we'd rightfully get a linking error.

Update github ticket
https://github.com/gperftools/gperftools/issues/1430
2024-06-21 13:32:35 -04:00
Aliaksey Kandratsenka
825b6638cf don't try to unit-test generic_fp on known-broken platforms
I.e. 32-bit legacy arm has broken frame pointers backtracing.

This fixes https://github.com/gperftools/gperftools/issues/1512
2024-06-12 19:52:23 -04:00
Aliaksey Kandratsenka
7c736310f9 avoid testing known-to-fail cases of backtracing from ucontext
See https://github.com/gperftools/gperftools/issues/1524
2024-06-12 16:42:31 -04:00
Aliaksey Kandratsenka
5284f386ee allow disabling actual versus given sized delete checking
See https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=279560#c10 and
https://github.com/llvm/llvm-project/pull/90292
2024-06-12 16:42:29 -04:00
Aliaksey Kandratsenka
778df41889 unbreak calloc memsetting extra memory in emergency malloc mode 2024-06-11 19:09:09 -04:00
Aliaksey Kandratsenka
cb2a58fb1b correctly find getpc.h header for GetPC configure test
When building out-of-tree, the include of src/getpc.h fails, so we
incorrectly decide that GetPC isn't functional.

Fixes github issue #1525
2024-06-10 17:22:08 -04:00
Aliaksey Kandratsenka
46c63ae3f6 .gitignore of recently added min_per_thread_cache_size_test
It was added under the wrong name, so let's fix it.
2024-06-10 17:22:03 -04:00
Ishant Goyal
addf751420 Added support to configure lower bound on per-thread cache size
[alkondratenko@gmail.com: removed spurious new line at thread_cache.h]
Signed-off-by: Aliaksey Kandratsenka <alkondratenko@gmail.com>
2024-06-04 12:11:01 -04:00
Aliaksey Kandratsenka
9fb05f3467 Reapply "[osx] implement native c++ allocation operators on osx"
This reverts commit a0880d78f3.

We had to revert it, because some most recent OSX versions had broken
exceptions for unwinding through functions with ATTRIBUTE_SECTION. But
now that ATTRIBUTE_SECTION is dropped on OSX, we can recover it.

Same as before, operator new/delete integration is much faster on
OSX. Unlike malloc/free and similar APIs which are ~fundamentally
broken performance-wise on OSX due to expensive arenas integration, we
don't need to support this overhead for C++ primitives. So let's get
this win at last.
2024-05-29 19:22:09 -04:00
Aliaksey Kandratsenka
5472781f4a drop broken and deprecated ATTRIBUTE_SECTION support for OSX
This support is only strictly necessary for
MallocHook_GetCallerStackTrace which is only needed for heap
checker. And since heap checker is Linux-only, let's just amputate this
code.
2024-05-29 19:20:21 -04:00
Aliaksey Kandratsenka
2b0b119b6d always use full 64 bits of address on Solaris
Solaris has somewhat nice behavior of mmaps where high bits of address
are set. It still uses only 48 bits of address on x86 (and,
presumably, 64-bit arm), just with somewhat non-standard way. But this
behavior causes some inconveniences to us. In particular, we had to
disable mmap sys allocator and had failing emergency malloc tests (due
to assertion in TryGetSizeClass in CheckCachedSizeClass). We could
consider a more comprehensive fix, but let's just do "honest" 64-bit
addresses at least for now.
2024-05-29 18:30:09 -04:00
Aliaksey Kandratsenka
8be31af0fc unbreak debugallocation_test on Solaris
Allow debugallocation death test to accept RUN_ALL_TESTS in backtrace
instead of main. For some reason solaris ends up either optimizing
tail-calling of RUN_ALL_TESTS or something else happens in gtest
platform integration, but failure backtraces don't have main
there. But the intention of the test is simply to ensure that we got
failure with backtrace. So let's accept RUN_ALL_TESTS as well.
2024-05-29 18:30:09 -04:00
Aliaksey Kandratsenka
c61f35f04c simplify tcmalloc/sbrk/sbrk-hooks integration
Instead of relying on __sbrk (on subset of Linux systems) or invoking
sbrk syscall directly (on subset of FreeBSD systems), we have malloc
invoke special tcmalloc_hooked_sbrk function. Which handles hooking
and then invokes regular system's sbrk. Yes, we lose theoretical
ability to hook into non-tcmalloc uses of sbrk, but we gain portable
simplicity.
2024-05-29 18:30:09 -04:00
Aliaksey Kandratsenka
29f394339b refactor and simplify capturing backtraces from mmap hooks
The case of mmap and sbrk hooks is simple enough that we can do
simpler "skip right number of frames" approach. Instead of relying on
less portable and brittle attribute section trick.
2024-05-29 18:30:09 -04:00
Aliaksey Kandratsenka
29b6eff4c7 replace heap-checker "bcad" stuff
The comments in this file stated that it has to be linked in specific
order to get initialized early enough. But our initialization is
de-facto via initial malloc hook. And to deal with latest-possible
destruction, we use more convenient destructor function
attribute, and make things simpler.
2024-05-29 18:30:09 -04:00
Aliaksey Kandratsenka
33cda2c9b3 unbreak grab-backtrace frame skipping logic
When building with -O0 -fno-inlines, +[] () {} (lambda) syntax for
function pointers actually creates "wrapper function" so we see extra
frame (due to disabled inlinings). Fix is to create explicit function
and pass it, instead of lambda thingy.
2024-05-29 18:30:09 -04:00
Yikai Zhao
38b19664d3 generic_fp stacktrace: aarch64 frame pointer may be 8 byte aligned 2024-05-25 12:32:44 -04:00
Aliaksey Kandratsenka
7b9dc8e1fc unbreak skip_count adjustment in tcmalloc::GrabBacktrace
I broke it with "simple" off by one error in big emergency malloc
refactoring change.

It is somewhat shocking that no test caught this, but we'll soon be
adding such test.
2024-05-23 15:08:02 -04:00
Aliaksey Kandratsenka
d9263eea08 unbreak pagemap unittest compilation on msvc 2024-05-23 13:31:42 -04:00
Aliaksey Kandratsenka
d9a99c290a expand emergency malloc integration to !kHaveGoodTLS systems
References github issue #1503.

This significantly reworks both early thread cache access and related
emergency malloc mode checking integration. As a result, we're able
to rely on emergency malloc even on systems without "good"
TLS (e.g. QNX which does emutls).

One big change here is we're undoing early change to have single
"global" thread cache early during process lifetime. It was nice and
somewhat simpler approach. But because of inherent locking during
early process lifetime, we couldn't be sure of certain lock ordering
aspects w.r.t. backtracing/exception-stack-unwinding. So I choose to
keep it safe. So the new idea is we use SlowTLS facility to find
threads' caches when normal tls isn't ready yet. It avoids holding
locks around potentially recursion-ful things (like
ThreadCache::ModuleInit or growing heap). But we then have to be
careful to "re-attach" those early thread cache instances to regular
TLS. There will nearly always be just one of such thread caches. For
initial thread. But we cannot entirely rule out more general case
where someone creates threads before process initializers ran and
main() is reached. Another notable thing is free-ing memory in this
early mode will always use slow-path deletion directly into central
free list.

SlowTLS facility is "simply" a generalization of previous
CreateBadTLSCache code. I.e. we have a small fixed-size cache that
holds "exceptional" instances of thread-identity to
thread-cache+emergency-mode-flag mappings.

We also take advantage of tcmalloc::kInvalidTLSKey we introduced
earlier and remove potentially raceful memory ordering between reading
tls_ready_ and tls_key_.

For emergency malloc detection we previously used thread_local
flag. Which we cannot use on !kHaveGoodTLS systems. So we instead
_remove_ thread's cache from its normal tls storage and place it
"into" SlowTLS instead for the duration of WithStacktraceScope
call (which is how emergency mode is enabled now).
2024-05-23 13:31:42 -04:00
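
Note on d9a99c290a: a loose sketch of what a "small fixed-size cache" of thread-identity to thread-cache+emergency-mode-flag mappings could look like. SlowTable, SlowEntry, the 8-slot capacity and the spin lock are all assumptions made for illustration; this is not the SlowTLS code itself.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical entry: thread_id == 0 marks a free slot.
struct SlowEntry {
  std::uintptr_t thread_id;
  void* cache;
  bool emergency;
};

// Hypothetical fixed-size table guarded by a tiny spin lock. It never
// allocates, so it stays usable while malloc itself is being set up.
class SlowTable {
 public:
  bool Put(std::uintptr_t id, void* cache, bool emergency) {
    Guard g(lock_);
    for (SlowEntry& e : entries_) {
      if (e.thread_id == 0) {
        e.thread_id = id;
        e.cache = cache;
        e.emergency = emergency;
        return true;
      }
    }
    return false;  // table full
  }

  bool Lookup(std::uintptr_t id, SlowEntry* out) {
    Guard g(lock_);
    for (const SlowEntry& e : entries_) {
      if (e.thread_id == id) {
        *out = e;
        return true;
      }
    }
    return false;
  }

  void Remove(std::uintptr_t id) {
    Guard g(lock_);
    for (SlowEntry& e : entries_) {
      if (e.thread_id == id) e.thread_id = 0;
    }
  }

 private:
  struct Guard {
    explicit Guard(std::atomic_flag& f) : flag(f) {
      while (flag.test_and_set(std::memory_order_acquire)) {
      }
    }
    ~Guard() { flag.clear(std::memory_order_release); }
    std::atomic_flag& flag;
  };

  std::atomic_flag lock_ = ATOMIC_FLAG_INIT;
  SlowEntry entries_[8] = {};
};
```

The lock is held only for the table scan, never around anything that could recurse into malloc, which is the property the commit message stresses.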
Aliaksey Kandratsenka
46d9a6293a introduce tcmalloc::kInvalidTLSKey
The intention is to initialize tls-key variable with this invalid
value. This will help us avoid separate "tls ready" flag and possible
memory ordering issues around distinct tls key and tls-ready
variables.

On windows we use the TLS_OUT_OF_INDEXES value, which is properly
"impossible" tls index value. On POSIX systems we add theoretically
unportable, but practically portable assumption that tls keys are
integers. And we make value of -1 be that invalid key.
2024-05-18 13:53:34 -04:00
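
Note on 46d9a6293a: a minimal sketch of why a sentinel key can replace a separate "tls ready" flag. The names kInvalidKey, g_key, PublishKey and TryGetKey are hypothetical, and a plain int stands in for the platform's TLS key type.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical analogue of tcmalloc::kInvalidTLSKey, assuming integer keys.
constexpr int kInvalidKey = -1;

// A single atomic carries both "is it ready?" and "what is the key?", so
// there is no ordering to get wrong between two separate variables.
std::atomic<int> g_key{kInvalidKey};

void PublishKey(int key) { g_key.store(key, std::memory_order_release); }

// Returns true and fills *key once a key has been published.
bool TryGetKey(int* key) {
  const int k = g_key.load(std::memory_order_acquire);
  if (k == kInvalidKey) return false;
  *key = k;
  return true;
}
```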
Aliaksey Kandratsenka
65ce9e899e better abort in internal_logging.cc:Log
We use __builtin_trap (which compiles to explicitly undefined
instruction or "int 3" on x64-en), when available, to make those
crashing Log invocations a little nicer to debug.
2024-05-18 13:53:34 -04:00
Aliaksey Kandratsenka
a6864ae233 clean up FLAGS_tcmalloc_heap_limit_mb value in page heap tests
This will not only make things right w.r.t. possible order of test
runs, but it also unbreaks test failures on windows (where gtest ends
up doing some malloc after test completion, hits the limit and dies).
2024-05-18 13:51:54 -04:00
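
Note on a6864ae233: one possible way to guarantee that a flag is restored after a test is a small RAII guard. The commit message does not say which mechanism the tests use, so ScopedFlagValue and the g_heap_limit_mb variable (a stand-in for FLAGS_tcmalloc_heap_limit_mb) are purely illustrative.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for FLAGS_tcmalloc_heap_limit_mb.
std::int64_t g_heap_limit_mb = 0;

// Sets a flag for the lifetime of the guard and restores the previous
// value on scope exit, including early returns.
class ScopedFlagValue {
 public:
  ScopedFlagValue(std::int64_t* flag, std::int64_t value)
      : flag_(flag), saved_(*flag) {
    *flag_ = value;
  }
  ~ScopedFlagValue() { *flag_ = saved_; }

  ScopedFlagValue(const ScopedFlagValue&) = delete;
  ScopedFlagValue& operator=(const ScopedFlagValue&) = delete;

 private:
  std::int64_t* flag_;
  std::int64_t saved_;
};
```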
Aliaksey Kandratsenka
de12f89d2a expand emergency malloc test coverage
We add coverage of calloc and we also cover no-hooks case.
2024-05-01 17:43:49 -04:00
Aliaksey Kandratsenka
77ba2cf133 don't include malloc_extension.h in page_heap.h 2024-05-01 17:43:49 -04:00
Aliaksey Kandratsenka
d63b2fad17 introduce thread::SelfThreadId
Sadly, certain/many implementations of std::this_thread::get_id invoke
malloc. So we need something more robust. On Unix systems we use
address of errno as thread identifier. Sadly, this doesn't cover
windows where MS's C runtime facility will occasionally malloc when
errno location is grabbed (with some special trickery for when malloc
itself needs to set errno to ENOMEM!). So on windows, we do
GetCurrentThreadId which appears to be roughly as efficient as
"normal" system's __errno_location implementation.
2024-05-01 17:43:49 -04:00
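
Note on d63b2fad17: a tiny sketch of the "address of errno as thread identifier" idea for Unix-like systems. SelfThreadIdSketch is a hypothetical name and this is not the actual thread::SelfThreadId implementation; as the message explains, Windows needs a different approach.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>

// Hypothetical sketch for Unix-like systems: errno is a per-thread
// object, so its address is stable within a thread and distinct between
// threads that are alive at the same time.
inline std::uintptr_t SelfThreadIdSketch() {
  return reinterpret_cast<std::uintptr_t>(&errno);
}
```

Such an id is only meaningful while the thread is alive; a later thread may reuse the same storage and therefore the same value.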
Aliaksey Kandratsenka
13aecbe197 make debugallocation calloc hook invocation order consistent
I.e. we normally call new hook just before returning. In calloc's case
this means after zeroing allocated memory.
2024-05-01 16:54:32 -04:00
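
Note on 13aecbe197: a minimal sketch of the ordering described above, where the new hook runs only after the memory has been zeroed. HookedCalloc, g_new_hook and RecordZeroedHook are hypothetical names, and plain malloc stands in for the debugallocation internals.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical hook type and slot; the real hook machinery is richer.
using NewHook = void (*)(const void* ptr, std::size_t size);
NewHook g_new_hook = nullptr;

// Set by the example hook: true if every byte was zero when the hook ran.
bool g_hook_saw_zeroed = false;

void RecordZeroedHook(const void* ptr, std::size_t size) {
  const unsigned char* bytes = static_cast<const unsigned char*>(ptr);
  g_hook_saw_zeroed = true;
  for (std::size_t i = 0; i < size; ++i) {
    if (bytes[i] != 0) g_hook_saw_zeroed = false;
  }
}

void* HookedCalloc(std::size_t n, std::size_t elem_size) {
  // Reject n * elem_size overflow, as calloc must.
  if (elem_size != 0 && n > SIZE_MAX / elem_size) return nullptr;
  const std::size_t total = n * elem_size;
  void* p = std::malloc(total != 0 ? total : 1);
  if (p == nullptr) return nullptr;
  std::memset(p, 0, total);                         // zero first
  if (g_new_hook != nullptr) g_new_hook(p, total);  // then report
  return p;
}
```

A hook that inspects the block therefore sees the same contents the caller will see.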
Aliaksey Kandratsenka
bc2aac871a re-introduce missing initial-exec attribute for per-thread data 2024-05-01 16:54:10 -04:00
Aliaksey Kandratsenka
786ecdfbc8 handle re-entrancy in check-address facility
We use mmap when we initialize it, which could via heap checker
recurse back into backtracing and check-address. So before we do mmap
and rest of initialization, we now set check-address implementation
to conservative two-syscalls version.
2024-05-01 16:51:29 -04:00
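
Note on 786ecdfbc8: a rough sketch of the re-entrancy guard described above, where the implementation pointer is switched to a conservative version before the risky initialization starts. All names are hypothetical, the checks are trivial placeholders, and thread safety is deliberately left out of the sketch.

```cpp
#include <cassert>

using CheckFn = bool (*)(const void*);

bool ConservativeCheck(const void* p);
bool FastCheck(const void* p);
bool InitAndCheck(const void* p);

// The first call goes through the initializer.
CheckFn g_check = InitAndCheck;
int g_init_runs = 0;

// Placeholders for the slow "two-syscalls" version and the fast version.
bool ConservativeCheck(const void* p) { return p != nullptr; }
bool FastCheck(const void* p) { return p != nullptr; }

bool InitAndCheck(const void* p) {
  // Switch first: any call that recurses into CheckAddress during
  // initialization lands in the conservative version, not back here.
  g_check = ConservativeCheck;
  ++g_init_runs;
  // Stand-in for initialization work that may recurse (e.g. a hooked mmap).
  const bool reentrant_result = g_check(&g_init_runs);
  (void)reentrant_result;
  g_check = FastCheck;
  return g_check(p);
}

bool CheckAddress(const void* p) { return g_check(p); }
```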
Aliaksey Kandratsenka
a038c8c23f unbreak cmake build around HAVE_SBRK check 2024-04-30 23:25:17 -04:00
leap
1bdabc0e37 Unbreak proc_maps_iterator.cc compilation on QNX
We had duplicate definition of flags_tmp variable.

Signed-off-by: Aliaksey Kandratsenka <alkondratenko@gmail.com>
[alkondratenko@gmail.com] updated commit message
2024-04-10 13:49:48 -04:00
Aliaksey Kandratsenka
7c0cbb3a0c avoid depending on {s,}brk on FreeBSD systems without it
Apparently some recent FreeBSDs occasionally lack brk. So our code
which previously hard-coded that this OS has brk (which we use to
implement hooked sbrk) fails to compile.

Our configure script already detects sbrk, so we simply need to pay
attention. Fixes github issue #1499
2024-04-03 15:46:06 -04:00
Aliaksey Kandratsenka
8dde01b4de unbreak emergency malloc tests on systems without "good" TLS
Our implementation of emergency malloc slow-path actually depends on
good TLS implementation. So on systems without good TLS (i.e. OSX), we
lied to ourselves that emergency malloc is available, but then failed
tests.
2024-03-27 18:36:29 -04:00
Aliaksey Kandratsenka
4cbd8ad245 refactor and simplify LowLevelAlloc
We don't expose DefaultArena anymore. Simply passing nullptr implies
default arena.

We also streamline default arena initialization (we
previously relied on a combination of lazy initialization in ArenaInit
and constexpr construction).

There is also no need to have elaborate ArenaLock thingy. We use plain
SpinLockHolder instead.
2024-03-27 18:18:20 -04:00