gperftools/README

203 lines
7.5 KiB
Plaintext
Raw Normal View History

gperftools
----------
(originally Google Performance Tools)
The fastest malloc weve seen; works particularly well with threads
and STL. Also: thread-friendly heap-checker, heap-profiler, and
cpu-profiler.
OVERVIEW
---------
gperftools is a collection of a high-performance multi-threaded
malloc() implementation, plus some pretty nifty performance analysis
tools.
gperftools is distributed under the terms of the BSD License. Join our
mailing list at gperftools@googlegroups.com for updates:
https://groups.google.com/forum/#!forum/gperftools
gperftools was original home for pprof program. But do note that
2016-03-31 18:27:21 +00:00
original pprof (which is still included with gperftools) is now
deprecated in favor of Go version at https://github.com/google/pprof
TCMALLOC
--------
Just link in -ltcmalloc or -ltcmalloc_minimal to get the advantages of
tcmalloc -- a replacement for malloc and new. See below for some
environment variables you can use with tcmalloc, as well.
tcmalloc functionality is available on all systems we've tested; see
Thu Aug 5 12:48:03 PDT 2010 * google-perftools: version 1.6 release * Add tc_malloc_usable_size for compatibility with glibc (csilvers) * Override malloc_usable_size with tc_malloc_usable_size (csilvers) * Default to no automatic heap sampling in tcmalloc (csilvers) * Add -DTCMALLOC_LARGE_PAGES, a possibly faster tcmalloc (rus) * Make some functions extern "C" to avoid false ODR warnings (jyasskin) * pprof: Add SVG-based output (rsc) * pprof: Extend pprof --tools to allow per-tool configs (csilvers) * pprof: Improve support of 64-bit and big-endian profiles (csilvers) * pprof: Add interactive callgrind suport (weidenri...) * pprof: Improve address->function mapping a bit (dpeng) * Better detection of when we're running under valgrind (csilvers) * Better CPU-speed detection under valgrind (saito) * Use, and recommend, -fno-builtin-malloc when compiling (csilvers) * Avoid false-sharing of memory between caches (bmaurer) * BUGFIX: Fix heap sampling to use correct alloc size (bmauer) * BUGFIX: Avoid gcc 4.0.x bug by making hook-clearing atomic (csilvers) * BUGFIX: Avoid gcc 4.5.x optimization bug (csilvers) * BUGFIX: Work around deps-determining bug in libtool 1.5.26 (csilvers) * BUGFIX: Fixed test to use HAVE_PTHREAD, not HAVE_PTHREADS (csilvers) * BUGFIX: Fix tls callback behavior on windows when using wpo (wtc) * BUGFIX: properly align allocation sizes on Windows (antonm) * BUGFIX: Fix prototypes for tcmalloc/debugalloc wrt throw() (csilvers) * DOC: Updated heap-checker doc to match reality better (fischman) * DOC: Document ProfilerFlush, ProfilerStartWithOptions (csilvers) * DOC: Update docs for heap-profiler functions (csilvers) * DOC: Clean up documentation around tcmalloc.slack_bytes (fikes) * DOC: Renamed README.windows to README_windows.txt (csilvers) * DOC: Update the NEWS file to be non-empty (csilvers) * PORTING: Fix windows addr2line and nm with proper rc code (csilvers) * PORTING: Add CycleClock and atomicops support for arm 5 (sanek) * PORTING: Improve PC finding on cygwin and redhat 7 (csilvers) * PORTING: speed up function-patching under windows (csilvers) git-svn-id: http://gperftools.googlecode.com/svn/trunk@97 6b5cf1ce-ec42-a296-1ba9-69fdba395a50
2010-08-05 20:36:47 +00:00
INSTALL for more details. See README_windows.txt for instructions on
using tcmalloc on Windows.
when compiling. gcc makes some optimizations assuming it is using its
own, built-in malloc; that assumption obviously isn't true with
tcmalloc. In practice, we haven't seen any problems with this, but
the expected risk is highest for users who register their own malloc
hooks with tcmalloc (using gperftools/malloc_hook.h). The risk is
lowest for folks who use tcmalloc_minimal (or, of course, who pass in
the above flags :-) ).
HEAP PROFILER
-------------
2017-03-21 13:07:16 +00:00
See docs/heapprofile.html for information about how to use tcmalloc's
heap profiler and analyze its output.
As a quick-start, do the following after installing this package:
1) Link your executable with -ltcmalloc
2) Run your executable with the HEAPPROFILE environment var set:
* Suppress all large allocs when report threshold==0 * Clarified meaning of various malloc stats * Change from ATTRIBUTED_DEPRECATED to comments * Make array-size a var to compile under clang * Reduce page map key size under x86_64 by 4.4MB * Added full qualification to MemoryBarrier * Support systems that capitalize /proc weirdly * Avoid gcc warning: exporting type in unnamed ns * Add some dynamic annotations for gcc attributes * Add support for census profiler in pprof * Speed up pprof's ExtractSymbols * Speed up GoogleOnce * Add pkg-config (.pc) files * Detect when __environ exists but is NULL * Improve spinlock contention performance * Add GetFreeListSizes * Improve sampling_test, eg by adding no-inline * Relax malloc_extension test-check for big pages * Add proper library version number information * Update from autoconf 2.64 to 2.65 * Better document how to write a server that works with pprof * Change FillProcSelfMaps to better handle out-of-space * No longer hook _aligned_malloc/free in windows * Handle function-forwarding in DLLs when patching (in windows) * Update .vcproj files that had wrong .cc files in them (!) * get rid of unnecessary 'size < 0' * fix comments a bit in sysinfo.cc * another go at improving malloc-stats output * fix comment typo in profiler.cc * Add a few more thread annotations * Try to read TSC frequency from 'tsc_freq_khz' * Fix annotalysis/TSAN incompatibility * Add pprof --evince to go along with --gv * Document need for sampling to use GetHeapSample * Fix flakiness in malloc_extension_test * Separate out synchronization profiling routines git-svn-id: http://gperftools.googlecode.com/svn/trunk@99 6b5cf1ce-ec42-a296-1ba9-69fdba395a50
2010-11-18 01:07:25 +00:00
$ HEAPPROFILE=/tmp/heapprof <path/to/binary> [binary args]
3) Run pprof to analyze the heap usage
$ pprof <path/to/binary> /tmp/heapprof.0045.heap # run 'ls' to see options
$ pprof --gv <path/to/binary> /tmp/heapprof.0045.heap
You can also use LD_PRELOAD to heap-profile an executable that you
didn't compile.
There are other environment variables, besides HEAPPROFILE, you can
set to adjust the heap-profiler behavior; c.f. "ENVIRONMENT VARIABLES"
below.
The heap profiler is available on all unix-based systems we've tested;
see INSTALL for more details. It is not currently available on Windows.
HEAP CHECKER
------------
2023-07-31 19:16:14 +00:00
Please note that as of gperftools-2.11 this is deprecated. You should
consider asan and other sanitizers instead.
2017-03-21 13:07:16 +00:00
See docs/heap_checker.html for information about how to use tcmalloc's
heap checker.
In order to catch all heap leaks, tcmalloc must be linked *last* into
your executable. The heap checker may mischaracterize some memory
accesses in libraries listed after it on the link line. For instance,
it may report these libraries as leaking memory when they're not.
(See the source code for more details.)
Here's a quick-start for how to use:
As a quick-start, do the following after installing this package:
1) Link your executable with -ltcmalloc
2) Run your executable with the HEAPCHECK environment var set:
$ HEAPCHECK=1 <path/to/binary> [binary args]
Other values for HEAPCHECK: normal (equivalent to "1"), strict, draconian
You can also use LD_PRELOAD to heap-check an executable that you
didn't compile.
The heap checker is only available on Linux at this time; see INSTALL
for more details.
CPU PROFILER
------------
2017-03-21 13:07:16 +00:00
See docs/cpuprofile.html for information about how to use the CPU
profiler and analyze its output.
As a quick-start, do the following after installing this package:
1) Link your executable with -lprofiler
2) Run your executable with the CPUPROFILE environment var set:
$ CPUPROFILE=/tmp/prof.out <path/to/binary> [binary args]
3) Run pprof to analyze the CPU usage
$ pprof <path/to/binary> /tmp/prof.out # -pg-like text output
$ pprof --gv <path/to/binary> /tmp/prof.out # really cool graphical output
There are other environment variables, besides CPUPROFILE, you can set
to adjust the cpu-profiler behavior; cf "ENVIRONMENT VARIABLES" below.
The CPU profiler is available on all unix-based systems we've tested;
see INSTALL for more details. It is not currently available on Windows.
NOTE: CPU profiling doesn't work after fork (unless you immediately
do an exec()-like call afterwards). Furthermore, if you do
fork, and the child calls exit(), it may corrupt the profile
data. You can use _exit() to work around this. We hope to have
a fix for both problems in the next release of perftools
(hopefully perftools 1.2).
EVERYTHING IN ONE
-----------------
If you want the CPU profiler, heap profiler, and heap leak-checker to
all be available for your application, you can do:
gcc -o myapp ... -lprofiler -ltcmalloc
However, if you have a reason to use the static versions of the
library, this two-library linking won't work:
gcc -o myapp ... /usr/lib/libprofiler.a /usr/lib/libtcmalloc.a # errors!
Instead, use the special libtcmalloc_and_profiler library, which we
make for just this purpose:
gcc -o myapp ... /usr/lib/libtcmalloc_and_profiler.a
CONFIGURATION OPTIONS
---------------------
For advanced users, there are several flags you can pass to
2019-04-03 07:50:40 +00:00
'./configure' that tweak tcmalloc performance. (These are in addition
to the environment variables you can set at runtime to affect
tcmalloc, described below.) See the INSTALL file for details.
ENVIRONMENT VARIABLES
---------------------
The cpu profiler, heap checker, and heap profiler will lie dormant,
using no memory or CPU, until you turn them on. (Thus, there's no
harm in linking -lprofiler into every application, and also -ltcmalloc
assuming you're ok using the non-libc malloc library.)
The easiest way to turn them on is by setting the appropriate
environment variables. We have several variables that let you
enable/disable features as well as tweak parameters.
Here are some of the most important variables:
HEAPPROFILE=<pre> -- turns on heap profiling and dumps data using this prefix
HEAPCHECK=<type> -- turns on heap checking with strictness 'type'
CPUPROFILE=<file> -- turns on cpu profiling and dumps data to this file.
PROFILESELECTED=1 -- if set, cpu-profiler will only profile regions of code
surrounded with ProfilerEnable()/ProfilerDisable().
CPUPROFILE_FREQUENCY=x-- how many interrupts/second the cpu-profiler samples.
PERFTOOLS_VERBOSE=<level> -- the higher level, the more messages malloc emits
MALLOCSTATS=<level> -- prints memory-use stats at program-exit
For a full list of variables, see the documentation pages:
2016-11-15 08:58:11 +00:00
docs/cpuprofile.html
docs/heapprofile.html
docs/heap_checker.html
2023-08-14 14:12:11 +00:00
See also TCMALLOC_STACKTRACE_METHOD_VERBOSE and
TCMALLOC_STACKTRACE_METHOD environment variables briefly documented in
our INSTALL file and on our wiki page at:
https://github.com/gperftools/gperftools/wiki/gperftools'-stacktrace-capturing-methods-and-their-issues
COMPILING ON NON-LINUX SYSTEMS
------------------------------
2023-07-31 19:16:14 +00:00
Perftools was developed and tested on x86, aarch64 and riscv Linux
systems, and it works in its full generality only on those systems.
However, we've successfully ported much of the tcmalloc library to
FreeBSD, Solaris x86 (not tested recently though), and Mac OS X
(aarch64; x86 and ppc have not been tested recently); and we've ported
the basic functionality in tcmalloc_minimal to Windows. See INSTALL
for details. See README_windows.txt for details on the Windows port.
---
2023-08-14 14:12:11 +00:00
Originally written: 17 May 2011
Last refreshed: 10 Aug 2023