This allows us, later, to avoid building this stuff in configurations
that don't use it. I have also reduced API and ABI surface to enable
further refactorings.
We used msync to verify that address is readable. But msync gives
false positives for PROT_NONE mappings. And we recently got bug report
from user hitting this exact condition.
For correct access check, we steal idea from Abseil and do sigprocmask
with address used as new signal mask and with invalid HOW
argument. This works in today's Linux kernels and is among fastest
methods available. But is brittle w.r.t. possible kernel changes. So
we supply fallback method that does 2 syscalls.
For non-Linux systems we implement usual "write to pipe" trick. Which
also has decent performance, but requires occasional pipe draining and
uses fds which could occasionally be damaged by some forking codes.
We also finally cover all new code with unit test.
Fixes github issue #1426
Previous implementation wasn't entirely safe w.r.t. 32-bit off_t
systems. Specifically around mmap replacement hook. Also, API was a
lot more general and broad than we actually need.
Sadly, old mmap hooks API was shipped with our public headers. But
thankfully it appears to be unused externally (checked via github
search). So we keep this old API and ABI for the sake of formal API
and ABI compatibility. But this old API is now empty and always
fails (some OS/hardware combinations didn't have functional
implementations of those hooks anyways).
New API is 64-bit clean and only provides us with what we need. Namely
being able to react to virtual address space mapping changes for
logging, heap profiling and heap leak checker. I.e. no pre hooks or
mmap-replacement hooks. We also explicitly not ship this API
externally to give us freedom to change it.
New code is also hopefully tidier and slightly more portable. At least
there are fewer arch-specific ifdef-s.
Another somewhat notable change is, since mmap hook isn't needed in
"minimal" configuration, we now don't override system's
mmap/munmap/etc functions in this configuration. No big deal, but it
reduces risk of damage if we somehow mess those up. I.e. musl's mmap
does few things that our mmap replacement doesn't, such as very fancy
vm_lock thingy. Which doesn't look critical, but is good thing for us
not to interfere with when not necessary.
Fixes issue #1406 and issue #1407. Lets also mention issue #1010 which
is somewhat relevant.
This facility allowed us to build tcmalloc without linking in actual
-lpthread. Via weak symbols we checked at runtime if pthread functions
are available and if not, special single-threaded stubs were used
instead. Not always brining in pthread dependency helped performance
of some programs or libraries which depended at runtime on whether
threads are linked or not. Most notable of those are libstdc++ which
uses non-atomic refcounting on single threaded programs.
But such optional dependency on pthreads caused complications for
nearly no benefit. One trouble was reported in github issue #1110.
This days glibc/libstdc++ combo actually depends on
sys/single_threaded.h facility. So bringing pthread at runtime is
fine. Also modern glibc ships pthread symbols inside libc anyways and
libpthread is empty. I also found that for whatever reason on BSDs and
osx we already pulled in proper pthreads too.
So we loose nothing and we get issue #1110 fixed. And we simplify
everything.
This fixes issue #1371
From time to time things file inside tcmalloc guts where calling to
malloc is not safe. Regular strerror does locale bits, so will
occasionally open files/malloc/etc. We avoid this by using our own
"safe" variant that hardcodes names of all POSIX errno constants.
We previously tested wrong assumption that larger than page size size
classes have addresses aligned on page size. New code is making proper
check of size class.
Also added is unit test coverage for this previously failing
condition. And we now also run "assert-ful" unittests for big tcmalloc
too, not only tcmalloc_minimal configuration.
This fixes github issue #1254
Everyone should be using golang pprof from github.com/google/pprof, but
distros still ship our perl version and not everyone is aware of
better pprof yet.
This is another step in completely dropping perl pprof. We still
default to installing it, but hopefully we'll be able to convince
distros to disable this soon.
We still install pprof under pprof-symbolize name because
stack traces symbolization depends on it, and because golang pprof
won't support this feature.
This is related to issue #1038.
While this is not good representation of real-world production malloc
behavior, it is representative of length (instruction-wise and well as
cycle-wise) of fast-path. So this is better than nothing.