mirror of
https://github.com/gperftools/gperftools
synced 2025-02-27 17:40:24 +00:00
* google-perftools: version 0.90 release * (As the version-number jump hints, this is a major new release: almost every piece of functionality was rewritten. I can't do justice to all the changes, but will concentrate on highlights.) *** USER-VISIBLE CHANGES: * Ability to "release" unused memory added to tcmalloc * Exposed more tweaking knobs via environment variables (see docs) * pprof tries harder to map addresses to functions * tcmalloc_minimal compiles and runs on FreeBSD 6.0 and Solaris 10 *** INTERNAL CHANGES: * Much better 64-bit support * Better multiple-processor support (e.g. multicore contention tweaks) * Support for recent kernel ABI changes (e.g. new arg to mremap) * Addition of spinlocks to tcmalloc to reduce contention cost * Speed up tcmalloc by using __thread on systems that support TLS * Total redesign of heap-checker to improve liveness checking * More portable stack-frame analysis -- no more hard-coded constants! * Disentangled heap-profiler code and heap-checker code * Several new unittests to test, e.g., thread-contention costs * Lots of small (but important!) bug fixes: e.g., fixing GetPC on amd64 *** KNOWN PROBLEMS: * CPU-profiling may crash on x86_64 (64-bit) systems. See the README * Profiling/heap-checking may deadlock on x86_64 systems. See README git-svn-id: http://gperftools.googlecode.com/svn/trunk@28 6b5cf1ce-ec42-a296-1ba9-69fdba395a50 |
||
---|---|---|
doc | ||
packages | ||
src | ||
aclocal.m4 | ||
AUTHORS | ||
ChangeLog | ||
compile | ||
config.guess | ||
config.sub | ||
configure | ||
configure.ac | ||
COPYING | ||
depcomp | ||
INSTALL | ||
install-sh | ||
ltmain.sh | ||
Makefile.am | ||
Makefile.in | ||
missing | ||
mkinstalldirs | ||
NEWS | ||
README | ||
TODO |
IMPORTANT NOTE FOR 64-BIT USERS
-------------------------------
There are known issues with some perftools functionality on x86_64
systems.  See 64-BIT ISSUES, below.


CPU PROFILER
------------
See doc/cpu-profiler.html for information about how to use the CPU
profiler and analyze its output.

As a quick-start, do the following after installing this package:

1) Link your executable with -lprofiler
2) Run your executable with the CPUPROFILE environment var set:
     $ CPUPROFILE=/tmp/prof.out <path/to/binary> [binary args]
3) Run pprof to analyze the CPU usage
     $ pprof <path/to/binary> /tmp/prof.out      # -pg-like text output
     $ pprof --gv <path/to/binary> /tmp/prof.out # really cool graphical output

There are other environment variables, besides CPUPROFILE, you can set
to adjust the cpu-profiler behavior; see "ENVIRONMENT VARIABLES" below.


TCMALLOC
--------
Just link in -ltcmalloc to get the advantages of tcmalloc.  See below
for some environment variables you can use with tcmalloc, as well.


HEAP PROFILER
-------------
See doc/heap-profiler.html for information about how to use tcmalloc's
heap profiler and analyze its output.

As a quick-start, do the following after installing this package:

1) Link your executable with -ltcmalloc
2) Run your executable with the HEAPPROFILE environment var set:
     $ HEAPPROFILE=/tmp/heapprof <path/to/binary> [binary args]
3) Run pprof to analyze the heap usage
     $ pprof <path/to/binary> /tmp/heapprof.0045.heap  # run 'ls' to see options
     $ pprof --gv <path/to/binary> /tmp/heapprof.0045.heap

You can also use LD_PRELOAD to heap-profile an executable that you
didn't compile.

There are other environment variables, besides HEAPPROFILE, you can
set to adjust the heap-profiler behavior; see "ENVIRONMENT VARIABLES"
below.


HEAP CHECKER
------------
See doc/heap-checker.html for information about how to use tcmalloc's
heap checker.

In order to catch all heap leaks, tcmalloc must be linked *last* into
your executable.
The heap checker may mischaracterize some memory accesses in libraries
listed after it on the link line.  For instance, it may report these
libraries as leaking memory when they're not.  (See the source code
for more details.)

As a quick-start, do the following after installing this package:

1) Link your executable with -ltcmalloc
2) Run your executable with the HEAPCHECK environment var set:
     $ HEAPCHECK=1 <path/to/binary> [binary args]

Other values for HEAPCHECK: normal (equivalent to "1"), strict, draconian

You can also use LD_PRELOAD to heap-check an executable that you
didn't compile.


ENVIRONMENT VARIABLES
---------------------
The cpu profiler, heap checker, and heap profiler will lie dormant,
using no memory or CPU, until you turn them on.  (Thus, there's no
harm in linking -lprofiler into every application, and also -ltcmalloc
assuming you're ok using the non-libc malloc library.)

The easiest way to turn them on is by setting the appropriate
environment variables.  We have several variables that let you
enable/disable features as well as tweak parameters.

Here are some of the most important variables:

  CPUPROFILE=<file>  -- turns on cpu profiling and dumps data to this file.
  PROFILESELECTED=1  -- if set, cpu-profiler will only profile regions of code
                        surrounded with ProfilerEnable()/ProfilerDisable().
  PROFILEFREQUENCY=x -- how many interrupts/second the cpu-profiler samples.
  HEAPPROFILE=<pre>  -- turns on heap profiling and dumps data using this prefix
  HEAPCHECK=<type>   -- turns on heap checking with strictness 'type'
  TCMALLOC_DEBUG=<level> -- the higher level, the more messages malloc emits
  MALLOCSTATS=<level>    -- prints memory-use stats at program-exit

For a full list of variables, see the documentation pages:
   doc/cpuprofile.html
   doc/heapprofile.html
   doc/heap_checker.html


COMPILING ON NON-LINUX SYSTEMS
------------------------------
Perftools was developed and tested on x86 Linux systems, and it works
in its full generality only on those systems.  However, we've
successfully built and tested the core tcmalloc library
(tcmalloc_minimal) on both FreeBSD and Solaris x86.  See INSTALL for
details.


64-BIT ISSUES
-------------
There are two issues that can cause program hangs or crashes on x86_64
64-bit systems, which use the libunwind library to get stack-traces.
Neither issue should affect the core tcmalloc library; they both
affect the perftools tools such as cpu-profiler, heap-checker, and
heap-profiler.

1) Some libc's -- at least glibc 2.4 on x86_64 -- have a bug where the
   libc function dl_iterate_phdr() acquires its locks in the wrong
   order.  This bug should not affect tcmalloc, but may cause
   occasional deadlock with the cpu-profiler, heap-profiler, and
   heap-checker.  Its likelihood increases with the number of dlopen()
   calls an executable makes.  Most executables don't make any, though
   several library routines like getgrgid() call dlopen() behind the
   scenes.

2) On x86-64 64-bit systems, while tcmalloc itself works fine, the
   cpu-profiler tool is unreliable: it will sometimes work, but
   sometimes cause a segfault.  I'll explain the problem first, and
   then some workarounds.

   Note that this only affects the cpu-profiler, which is a
   google-perftools feature you must turn on manually by setting the
   CPUPROFILE environment variable.  If you do not turn on
   cpu-profiling, you shouldn't see any crashes due to perftools.

   The gory details: The underlying problem is in the backtrace()
   function, which is a built-in function in libc.  (However, on
   x86-64 we *strongly* recommend that you use the libunwind
   functionality for backtraces instead; see the top of INSTALL.)
   Backtracing is fairly straightforward in the normal case, but can
   run into problems when having to backtrace across a signal frame.
   Unfortunately, the cpu-profiler uses signals in order to register a
   profiling event, so every backtrace that the profiler does crosses
   a signal frame.

   In our experience, the only time there is trouble is when the
   signal fires in the middle of pthread_mutex_lock.
   pthread_mutex_lock is called quite a bit from system libraries,
   particularly at program startup and when creating a new thread.

   The solution: The dwarf debugging format has support for 'cfi
   annotations', which make it easy to recognize a signal frame.  Some
   OS distributions, such as Fedora and Gentoo 2007.0, have already
   added cfi annotations to their libc.  A future version of libunwind
   should recognize these annotations; with it, these systems should
   not see any crashes.

   Workarounds: If you see problems with crashes when running the
   cpu-profiler, consider inserting ProfilerStart()/ProfilerStop()
   into your code, rather than setting CPUPROFILE.  This profiles only
   the sections of code between the two calls.  Though we haven't done
   much testing, in theory this should reduce the chance of crashes by
   limiting the signal generation to only a small part of the
   codebase.  Ideally, you would not use
   ProfilerStart()/ProfilerStop() around code that spawns new threads,
   or is otherwise likely to cause a call to pthread_mutex_lock!

---
12 April 2007
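
The ProfilerStart()/ProfilerStop() workaround described under 64-BIT
ISSUES might look roughly like the sketch below.  It is an illustration
under stated assumptions, not a tested recipe: the header is assumed to
be installed as <google/profiler.h> (newer gperftools releases ship it
as <gperftools/profiler.h>), and ScopedCpuProfile, ProfiledSumOfSquares,
HAVE_PERFTOOLS_PROFILER, and g_profile_depth are hypothetical names
invented for this example.  Without the macro, the sketch falls back to
no-op stand-ins so that it still compiles on a machine without
perftools.

```cpp
// Sketch only.  ScopedCpuProfile, ProfiledSumOfSquares, g_profile_depth and
// the HAVE_PERFTOOLS_PROFILER macro are hypothetical names for this example.
#include <cstdint>

#ifdef HAVE_PERFTOOLS_PROFILER
#include <google/profiler.h>  // assumed path; declares ProfilerStart/ProfilerStop
#else
// No-op stand-ins so the sketch builds without perftools installed.
// They only track nesting depth; no profile data is written.
static int g_profile_depth = 0;
static void ProfilerStart(const char* /*fname*/) { ++g_profile_depth; }
static void ProfilerStop() { --g_profile_depth; }
#endif

// Starts profiling on construction and stops it on destruction, so the
// profiled region is exactly the enclosing scope.
class ScopedCpuProfile {
 public:
  explicit ScopedCpuProfile(const char* path) { ProfilerStart(path); }
  ~ScopedCpuProfile() { ProfilerStop(); }
  ScopedCpuProfile(const ScopedCpuProfile&) = delete;
  ScopedCpuProfile& operator=(const ScopedCpuProfile&) = delete;
};

// Profiles only a single-threaded, compute-bound loop.  Thread creation and
// other code likely to reach pthread_mutex_lock is kept outside the scope.
std::uint64_t ProfiledSumOfSquares(std::uint64_t n) {
  ScopedCpuProfile profile("/tmp/prof.out");
  std::uint64_t total = 0;
  for (std::uint64_t i = 1; i <= n; ++i) {
    total += i * i;
  }
  return total;
}
```

To profile for real, compile with -DHAVE_PERFTOOLS_PROFILER, link with
-lprofiler, leave CPUPROFILE unset, and analyze /tmp/prof.out with
pprof as in the CPU PROFILER quick-start.  The scoped object is a design
choice, not part of perftools: it guarantees ProfilerStop() runs on
every exit path from the profiled region.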