AARCH64 >= armv8.3-a supports pointer authentication. If this feature is
enabled, it modifies the previously unused upper address bits in a pointer.
The affected bits need to be cleared in order for stacktrace to work.
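
A minimal sketch of clearing those bits, assuming a 48-bit user-space virtual address layout. The helper name and the constant are hypothetical, not what the tree actually uses, and the real address width depends on kernel configuration:

```cpp
#include <cassert>
#include <cstdint>

// Assumption of this sketch: user-space addresses fit in the low 48 bits.
constexpr int kAddressBits = 48;
static_assert(sizeof(uintptr_t) == 8, "sketch assumes 64-bit pointers");

// Hypothetical helper: clears the upper bits of a user-space pointer, which
// pointer authentication may have filled with a signature.
inline void* StripPointerAuthBits(void* p) {
  constexpr uintptr_t kMask = (uintptr_t{1} << kAddressBits) - 1;
  return reinterpret_cast<void*>(reinterpret_cast<uintptr_t>(p) & kMask);
}
```

A frame-pointer based unwinder would apply something like this to each return address before recording it.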
Signed-off-by: Aliaksey Kandratsenka <alkondratenko@gmail.com>
[alkondratenko@gmail.com: added succinct subject line]
As we recently found out, initializing static struct fields or
variables with lambdas sets up runtime initialization instead of
static initialization as we assumed. So let's avoid this too for null
stacktrace implementation.
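
One way to make sure such a table entry is statically initialized is to use a plain function and mark the object constexpr. This is only a sketch; the struct and names below are hypothetical stand-ins, not the actual stacktrace types:

```cpp
#include <cassert>

// Hypothetical stand-in for a stacktrace implementation table entry.
struct StackTraceImplSketch {
  int (*get_stack_trace)(void** result, int max_depth);
  const char* name;
};

// A plain function instead of a lambda.
static int NullGetStackTrace(void** /*result*/, int /*max_depth*/) {
  return 0;  // the "null" implementation captures no frames
}

// constexpr requires a constant initializer, so the compiler rejects this
// definition if it would need runtime initialization.
static constexpr StackTraceImplSketch kNullImpl = {
  &NullGetStackTrace,
  "null",
};
```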
Otherwise mmap's call to do_mmap_with_hooks might become a tail-call
(instead of being inlined), which will then break GetCallerStackTrace
facility (since only mmap is placed into special malloc_hook section).
This unbreaks heap checker on gcc 5, but is in general the right thing
to do.
Testing every 7th size is a bit slow on slower machines. No need to be
as thorough. We now bump by about 1/128th each step which is still
more steps than size classes we have.
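
The stepping can be sketched roughly like this. The "+ 1" term and the bounds are assumptions of the sketch, not the test's actual constants:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the sizes visited when each step grows the size by roughly
// 1/128th. The "+ 1" keeps sizes below 128 moving forward.
inline std::vector<size_t> SizesToTest(size_t first, size_t last) {
  assert(first > 0 && first <= last);
  std::vector<size_t> sizes;
  for (size_t size = first; size <= last; size += size / 128 + 1) {
    sizes.push_back(size);
  }
  return sizes;
}
```

Below 128 bytes this visits every size, and above that the step grows with the size, so the number of steps stays roughly logarithmic in the upper bound.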
We do shell wrapper for actual test run, so we can inspect output of
pprof. But when we set up sampling_debug_test.sh we simply copied
regular sampling_test.sh, which ran same non-debug test binary. Now we
sed-replace contents of shell program when copying, so we test right
binary.
Another thing we fix here is our (still hardcoded) test output path is
now different between sampling{,_debug}_test.sh. So this fixes main
cause of flakiness of our unit tests.
We used msync to verify that address is readable. But msync gives
false positives for PROT_NONE mappings. And we recently got bug report
from user hitting this exact condition.
For correct access check, we steal idea from Abseil and do sigprocmask
with address used as new signal mask and with invalid HOW
argument. This works in today's Linux kernels and is among the fastest
methods available, but it is brittle w.r.t. possible kernel changes. So
we supply a fallback method that does 2 syscalls.
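
A Linux-only sketch of the sigprocmask check, modeled on the Abseil idea. The function name is hypothetical, and the sigset size of 8 is an assumption that holds on common architectures but not on all of them:

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <sys/syscall.h>
#include <unistd.h>

// Returns true if the kernel was able to read a signal mask from addr.
// The invalid HOW value (~0) means no mask is ever installed: a readable
// address fails with EINVAL, an unreadable one with EFAULT.
inline bool AddressLooksReadable(const void* addr) {
  // The kernel reads 8 bytes, so align down to avoid probing the next page.
  uintptr_t aligned = reinterpret_cast<uintptr_t>(addr) & ~uintptr_t{7};
  // A null mask pointer makes the call succeed, so handle it up front.
  if (aligned == 0) return false;

  long rv = syscall(SYS_rt_sigprocmask, ~0,
                    reinterpret_cast<const void*>(aligned), nullptr, 8);
  assert(rv == -1);  // invalid HOW must never succeed
  (void)rv;
  return errno != EFAULT;
}
```

This relies on the kernel copying the mask from user space before it validates HOW, which is exactly the brittle part mentioned above.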
For non-Linux systems we implement the usual "write to pipe" trick. It
also has decent performance, but requires occasional pipe draining and
uses fds which could occasionally be damaged by some forking code.
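
The "write to pipe" trick can be sketched as below. The class name is hypothetical, and for simplicity this version drains the pipe after every probe instead of only occasionally; it is neither thread-safe nor fork-aware:

```cpp
#include <cassert>
#include <cerrno>
#include <unistd.h>

// write() makes the kernel read one byte from addr. If the address is not
// readable, the write fails with EFAULT instead of crashing the process.
class PipeAddressProbe {
 public:
  PipeAddressProbe() {
    int rv = pipe(fds_);
    assert(rv == 0);
    (void)rv;
  }
  ~PipeAddressProbe() {
    close(fds_[0]);
    close(fds_[1]);
  }
  PipeAddressProbe(const PipeAddressProbe&) = delete;
  PipeAddressProbe& operator=(const PipeAddressProbe&) = delete;

  bool IsReadable(const void* addr) {
    ssize_t written;
    do {
      written = write(fds_[1], addr, 1);
    } while (written < 0 && errno == EINTR);
    if (written != 1) return false;

    // Drain the byte we just wrote so the pipe never fills up.
    char sink;
    ssize_t got;
    do {
      got = read(fds_[0], &sink, 1);
    } while (got < 0 && errno == EINTR);
    return true;
  }

 private:
  int fds_[2] = {-1, -1};
};
```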
We also finally cover all new code with unit test.
Fixes github issue #1426
As we see in github issue #1428, msvc arranges full "init on first
use" initialization for local static usage of TrivialOnce even if that
initialization is completely empty. Fair game, even if stupid.
POD with no initialization should be safely zero-initialized with no
games or tricks from the compilers.
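
As an illustration of the zero-initialization point only: the type below is a hypothetical, single-threaded stand-in, not the real TrivialOnce (which has to be thread-safe).

```cpp
#include <cassert>

// No constructor and no member initializers, so an object with static
// storage duration is zero-initialized before any code runs.
struct PodOnceSketch {
  bool done;  // deliberately left without an initializer

  template <typename Body>
  bool RunOnce(Body body) {
    if (done) return false;
    done = true;
    body();
    return true;
  }
};

static PodOnceSketch once_sketch;  // zero-initialized: done == false
static int init_calls;             // zero-initialized: 0

// Runs the init body at most once and returns how many times it has run.
inline int CountInit() {
  once_sketch.RunOnce([] { ++init_calls; });
  return init_calls;
}
```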
We could have and perhaps at some point should do constexpr for
TrivialOnce and SpinLock (abseil has been liberated from
LinkerInitialized for perhaps well over a decade now, including their
fork of SpinLock, of course). But C++ legalese rules are complex
enough and bugs happened in the past, so I don't want to be in the tough
business of interpreting the standard. So at least for now we keep
things simple.
Default MPICH builds use the Hydra process manager (mpiexec) which sets
PMI_RANK in the application environment. Update GetUniquePathFromEnv()
test accordingly.
Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>
This unbreaks building on older Linux distros. We missed this at
46d3315ad7 when we dropped maybe_thread stuff, since libprofiler indeed
uses pthread, and because on newer libc-s pthread stuff is now part of
regular libc.so.
I am also dropping bogus LIBPROFILER stuff referring to some rpath
badness. Unsure what it was, maybe way back we did libstacktrace as a
proper libtool library, so maybe something was needed. But it is just
a convenience archive these days, so we don't really need to add it
everywhere libprofiler.la is linked.
Without this fix we're failing unit tests on ubuntu 18.04 and centos 7
and 6. It looks like clone() in old glibc-s doesn't align stack, so
let's handle it ourselves. How we didn't hit this much earlier (before
massive thread listing refactoring), I am not sure. Most likely pure
luck(?)
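
Aligning the stack top ourselves could look like the sketch below. The helper name is hypothetical; 16 bytes is the stack alignment the x86-64 and AArch64 ABIs require:

```cpp
#include <cassert>
#include <cstdint>

// Rounds a prospective stack top down to the given power-of-two alignment.
// Stacks grow down on these platforms, so rounding down keeps the result
// inside the buffer the stack was carved from.
inline void* AlignStackTop(void* top, uintptr_t alignment = 16) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  uintptr_t value = reinterpret_cast<uintptr_t>(top);
  return reinterpret_cast<void*>(value & ~(alignment - 1));
}
```

The aligned pointer is then what gets passed to clone() as the child stack.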
* Add support for known HPC environments (TODO: needs to be extended
with more environments)
* Added the "CPUPROFILE_USE_PID" environment variable to force appending
PID for the non-covered environments
* Preserve the old way of handling the Child-Parent case
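
A rough sketch of how such environment handling might pick a per-process suffix. PMI_RANK and CPUPROFILE_USE_PID are the variables named in these commits; the function name, the suffix formats and the priority order are invented for illustration:

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <unistd.h>

// Returns a suffix that makes the profile path unique per process, or an
// empty string if no known environment or override is detected.
inline std::string UniqueSuffixFromEnv() {
  // Known HPC environment: the process manager exports the rank.
  if (const char* rank = std::getenv("PMI_RANK")) {
    return std::string(".rank-") + rank;
  }
  // Explicit override for environments not covered above.
  if (std::getenv("CPUPROFILE_USE_PID") != nullptr) {
    return "_" + std::to_string(static_cast<long>(getpid()));
  }
  return std::string();
}
```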
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
It actually found real (but arguably minor) issue with memory region
map locking.
As part of that we're replacing PageHeap::DeleteAndUnlock that had
somewhat ambitious 'move' of SpinLockHolder, with more straightforward
PageHeap::PrepareAndDelete. Doesn't look like we can support move
thingy with thread annotations.
Some years back we fixed memalign vs realloc bug, but we preserved
'wrong' malloc_size/GetAllocatedSize implementation for debug
allocator.
This commit refactors old code making sure we always use right
data_size and it fixes GetAllocatedSize. We update our unittest
accordingly.
Closes #738
As noted on github issue #880 'temporarily' thing saves us not just on
freeing thread cache, but also returning thread's share of thread
cache (max_size_) into common pool. And the latter has caused trouble
to mongo folk who originally proposed 'temporarily' thing. They claim
they don't use it anymore.
And thus with no users and no clear benefit, it makes no sense for us
to keep this API. For API and ABI compat sake we keep it, but it is
now identical to regular MarkThreadIdle.
Fixes issue #880