mirror of
https://github.com/gperftools/gperftools
synced 2025-01-18 04:41:34 +00:00
6691226953
Logic was removed from thread_cache.{h,cc} into
thread_cache_ptr.{h,cc}.
Separation will help possible future evolution, and we already changed
the logic quite a bit:
* early access (when TLS isn't initialized yet) now uses global
ThreadCache instance. We therefore have ThreadCachePtr instances
managing required locking. This eliminates unnecessary complication of
PTHREADS_CRASHES_IF_RUN_TOO_EARLY logic, and any other danger of
touching TLS too early. BTW previous implementation actually leaked
initial early-initialized ThreadCache instance(!)
* old configure-time HAVE_TLS logic is amputated. Config-time part of
it made little sense as C++ 17 guarantees availability of
thread_local, but we have manually curated deny-list of "bad" OSes,
that we tested (via compile checks!) at configure time. Now this
is all compile time. There is now compile-time kHaveGoodTLS variable
and we're using it mostly via if constexpr.
* kHaveGoodTLS case of creating thread cache is simplified and made
more straightforward (no need to have in_setspecific logic).
* !kHaveGoodTLS case if fixed and improved too. We avoid
std:🧵:get_id, as it deadlocks on mingw. We use errno address as
a portable and (usually) async-signal safe 'my thread' identifier. We
also eliminate linear searching of thread's cache and replace it with
straightforward hash table lookup.
|
||
---|---|---|
benchmark | ||
cmake | ||
docs | ||
m4 | ||
src | ||
vsprojects | ||
.gitignore | ||
.travis.yml | ||
AUTHORS | ||
autogen.sh | ||
ChangeLog.old | ||
CMakeLists.txt | ||
configure.ac | ||
COPYING | ||
gperftools.sln | ||
INSTALL | ||
Makefile.am | ||
NEWS | ||
README | ||
README_windows.txt |
gperftools
----------
(originally Google Performance Tools)
The fastest malloc we’ve seen; works particularly well with threads
and STL. Also: thread-friendly heap-checker, heap-profiler, and
cpu-profiler.
OVERVIEW
---------
gperftools is a collection of a high-performance multi-threaded
malloc() implementation, plus some pretty nifty performance analysis
tools.
gperftools is distributed under the terms of the BSD License. Join our
mailing list at gperftools@googlegroups.com for updates:
https://groups.google.com/forum/#!forum/gperftools
gperftools was original home for pprof program. But do note that
original pprof (which is still included with gperftools) is now
deprecated in favor of Go version at https://github.com/google/pprof
TCMALLOC
--------
Just link in -ltcmalloc or -ltcmalloc_minimal to get the advantages of
tcmalloc -- a replacement for malloc and new. See below for some
environment variables you can use with tcmalloc, as well.
tcmalloc functionality is available on all systems we've tested; see
INSTALL for more details. See README_windows.txt for instructions on
using tcmalloc on Windows.
when compiling. gcc makes some optimizations assuming it is using its
own, built-in malloc; that assumption obviously isn't true with
tcmalloc. In practice, we haven't seen any problems with this, but
the expected risk is highest for users who register their own malloc
hooks with tcmalloc (using gperftools/malloc_hook.h). The risk is
lowest for folks who use tcmalloc_minimal (or, of course, who pass in
the above flags :-) ).
HEAP PROFILER
-------------
See docs/heapprofile.html for information about how to use tcmalloc's
heap profiler and analyze its output.
As a quick-start, do the following after installing this package:
1) Link your executable with -ltcmalloc
2) Run your executable with the HEAPPROFILE environment var set:
$ HEAPPROFILE=/tmp/heapprof <path/to/binary> [binary args]
3) Run pprof to analyze the heap usage
$ pprof <path/to/binary> /tmp/heapprof.0045.heap # run 'ls' to see options
$ pprof --gv <path/to/binary> /tmp/heapprof.0045.heap
You can also use LD_PRELOAD to heap-profile an executable that you
didn't compile.
There are other environment variables, besides HEAPPROFILE, you can
set to adjust the heap-profiler behavior; c.f. "ENVIRONMENT VARIABLES"
below.
The heap profiler is available on all unix-based systems we've tested;
see INSTALL for more details. It is not currently available on Windows.
HEAP CHECKER
------------
Please note that as of gperftools-2.11 this is deprecated. You should
consider asan and other sanitizers instead.
See docs/heap_checker.html for information about how to use tcmalloc's
heap checker.
In order to catch all heap leaks, tcmalloc must be linked *last* into
your executable. The heap checker may mischaracterize some memory
accesses in libraries listed after it on the link line. For instance,
it may report these libraries as leaking memory when they're not.
(See the source code for more details.)
Here's a quick-start for how to use:
As a quick-start, do the following after installing this package:
1) Link your executable with -ltcmalloc
2) Run your executable with the HEAPCHECK environment var set:
$ HEAPCHECK=1 <path/to/binary> [binary args]
Other values for HEAPCHECK: normal (equivalent to "1"), strict, draconian
You can also use LD_PRELOAD to heap-check an executable that you
didn't compile.
The heap checker is only available on Linux at this time; see INSTALL
for more details.
CPU PROFILER
------------
See docs/cpuprofile.html for information about how to use the CPU
profiler and analyze its output.
As a quick-start, do the following after installing this package:
1) Link your executable with -lprofiler
2) Run your executable with the CPUPROFILE environment var set:
$ CPUPROFILE=/tmp/prof.out <path/to/binary> [binary args]
3) Run pprof to analyze the CPU usage
$ pprof <path/to/binary> /tmp/prof.out # -pg-like text output
$ pprof --gv <path/to/binary> /tmp/prof.out # really cool graphical output
There are other environment variables, besides CPUPROFILE, you can set
to adjust the cpu-profiler behavior; cf "ENVIRONMENT VARIABLES" below.
The CPU profiler is available on all unix-based systems we've tested;
see INSTALL for more details. It is not currently available on Windows.
NOTE: CPU profiling doesn't work after fork (unless you immediately
do an exec()-like call afterwards). Furthermore, if you do
fork, and the child calls exit(), it may corrupt the profile
data. You can use _exit() to work around this. We hope to have
a fix for both problems in the next release of perftools
(hopefully perftools 1.2).
EVERYTHING IN ONE
-----------------
If you want the CPU profiler, heap profiler, and heap leak-checker to
all be available for your application, you can do:
gcc -o myapp ... -lprofiler -ltcmalloc
However, if you have a reason to use the static versions of the
library, this two-library linking won't work:
gcc -o myapp ... /usr/lib/libprofiler.a /usr/lib/libtcmalloc.a # errors!
Instead, use the special libtcmalloc_and_profiler library, which we
make for just this purpose:
gcc -o myapp ... /usr/lib/libtcmalloc_and_profiler.a
CONFIGURATION OPTIONS
---------------------
For advanced users, there are several flags you can pass to
'./configure' that tweak tcmalloc performance. (These are in addition
to the environment variables you can set at runtime to affect
tcmalloc, described below.) See the INSTALL file for details.
ENVIRONMENT VARIABLES
---------------------
The cpu profiler, heap checker, and heap profiler will lie dormant,
using no memory or CPU, until you turn them on. (Thus, there's no
harm in linking -lprofiler into every application, and also -ltcmalloc
assuming you're ok using the non-libc malloc library.)
The easiest way to turn them on is by setting the appropriate
environment variables. We have several variables that let you
enable/disable features as well as tweak parameters.
Here are some of the most important variables:
HEAPPROFILE=<pre> -- turns on heap profiling and dumps data using this prefix
HEAPCHECK=<type> -- turns on heap checking with strictness 'type'
CPUPROFILE=<file> -- turns on cpu profiling and dumps data to this file.
PROFILESELECTED=1 -- if set, cpu-profiler will only profile regions of code
surrounded with ProfilerEnable()/ProfilerDisable().
CPUPROFILE_FREQUENCY=x-- how many interrupts/second the cpu-profiler samples.
PERFTOOLS_VERBOSE=<level> -- the higher level, the more messages malloc emits
MALLOCSTATS=<level> -- prints memory-use stats at program-exit
For a full list of variables, see the documentation pages:
docs/cpuprofile.html
docs/heapprofile.html
docs/heap_checker.html
See also TCMALLOC_STACKTRACE_METHOD_VERBOSE and
TCMALLOC_STACKTRACE_METHOD environment variables briefly documented in
our INSTALL file and on our wiki page at:
https://github.com/gperftools/gperftools/wiki/gperftools'-stacktrace-capturing-methods-and-their-issues
COMPILING ON NON-LINUX SYSTEMS
------------------------------
Perftools was developed and tested on x86, aarch64 and riscv Linux
systems, and it works in its full generality only on those systems.
However, we've successfully ported much of the tcmalloc library to
FreeBSD, Solaris x86 (not tested recently though), and Mac OS X
(aarch64; x86 and ppc have not been tested recently); and we've ported
the basic functionality in tcmalloc_minimal to Windows. See INSTALL
for details. See README_windows.txt for details on the Windows port.
---
Originally written: 17 May 2011
Last refreshed: 10 Aug 2023