gperftools/doc/heap_checker.html

632 lines
25 KiB
HTML
Raw Normal View History

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<HTML>
<HEAD>
<link rel="stylesheet" href="designstyle.css">
<title>Google Heap Leak Checker</title>
</HEAD>
<BODY>
<p align=right>
<i>Last modified
<script type=text/javascript>
var lm = new Date(document.lastModified);
document.write(lm.toDateString());
</script></i>
</p>
<p>This is the heap checker we use at Google to detect memory leaks in
C++ programs. There are three parts to using it: linking the library
into an application, running the code, and analyzing the output.</p>
<H1>Linking in the Library</H1>
<p>The heap-checker is part of tcmalloc, so to install the heap
checker into your executable, add <code>-ltcmalloc</code> to the
link-time step for your executable. Also, while we don't necessarily
recommend this form of usage, it's possible to add in the profiler at
run-time using <code>LD_PRELOAD</code>:</p>
<pre>% env LD_PRELOAD="/usr/lib/libtcmalloc.so" <binary></pre>
<p>This does <i>not</i> turn on heap checking; it just inserts the
code. For that reason, it's practical to just always link
<code>-ltcmalloc</code> into a binary while developing; that's what we
do at Google. (However, since any user can turn on the profiler by
setting an environment variable, it's not necessarily recommended to
install heapchecker-linked binaries into a production, running
system.) Note that if you wish to use the heap checker, you must
also use the tcmalloc memory-allocation library. There is no way
currently to use the heap checker separate from tcmalloc.</p>
<h1>Running the Code</h1>
<p>Note: For security reasons, heap profiling will not write to a file
-- and is thus not usable -- for setuid programs.</p>
<h2><a name="whole_program">Whole-program Heap Leak Checking</a></h2>
<p>The recommended way to use the heap checker is in "whole program"
mode. In this case, the heap-checker starts tracking memory
allocations before the start of <code>main()</code>, and checks again
at program-exit. If it finds any memory leaks -- that is, any memory
not pointed to by objects that are still "live" at program-exit -- it
aborts the program (via <code>exit(1)</code>) and prints a message
describing how to track down the memory leak (using <A
HREF="heapprofile.html#pprof">pprof</A>).</p>
<p>The heap-checker records the stack trace for each allocation while
it is active. This causes a significant increase in memory usage, in
addition to slowing your program down.</p>
<p>Here's how to run a program with whole-program heap checking:</p>
<ol>
<li> <p>Define the environment variable HEAPCHECK to the <A
HREF="#types">type of heap-checking</A> to do. For instance,
to heap-check
<code>/usr/local/bin/my_binary_compiled_with_tcmalloc</code>:</p>
<pre>% env HEAPCHECK=normal /usr/local/bin/my_binary_compiled_with_tcmalloc</pre>
</ol>
<p>No other action is required.</p>
<p>Note that since the heap-checker uses the heap-profiling framework
internally, it is not possible to run both the heap-checker and <A
HREF="heapprofile.html">heap profiler</A> at the same time.</p>
<h3><a name="types">Flavors of Heap Checking</a></h3>
<p>These are the legal values when running a whole-program heap
check:</p>
<ol>
<li> <code>minimal</code>
<li> <code>normal</code>
<li> <code>strict</code>
<li> <code>draconian</code>
</ol>
<p>"Minimal" heap-checking starts as late as possible ina
initialization, meaning you can leak some memory in your
initialization routines (that run before <code>main()</code>, say),
and not trigger a leak message. If you frequently (and purposefully)
leak data in one-time global initializers, "minimal" mode is useful
for you. Otherwise, you should avoid it for stricter modes.</p>
<p>"Normal" heap-checking tracks <A HREF="#live">live objects</A> and
reports a leak for any data that is not reachable via a live object
when the program exits.</p>
<p>"Strict" heap-checking is much like "normal" but has a few extra
checks that memory isn't lost in global destructors. In particular,
if you have a global variable that allocates memory during program
execution, and then "forgets" about the memory in the global
destructor (say, by setting the pointer to it to NULL) without freeing
it, that will prompt a leak message in "strict" mode, though not in
"normal" mode.</p>
<p>"Draconian" heap-checking is appropriate for those who like to be
very precise about their memory management, and want the heap-checker
to help them enforce it. In "draconian" mode, the heap-checker does
not do "live object" checking at all, so it reports a leak unless
<i>all</i> allocated memory is freed before program exit. (However,
you can use <A HREF="#disable">IgnoreObject()</A> to re-enable
liveness-checking on an object-by-object basis.)</p>
<p>"Normal" mode, as the name implies, is the one used most often at
Google. It's appropriate for everyday heap-checking use.</p>
<p>In addition, there are two other possible modes:</p>
<ul>
<li> <code>as-is</code>
<li> <code>local</code>
</ul>
<p><code>as-is</code> is the most flexible mode; it allows you to
specify the various <A HREF="#options">knobs</A> of the heap checker
explicitly. <code>local</code> activates the <A
HREF="#explicit">explicit heap-check instrumentation</A>, but does not
turn on any whole-program leak checking.</p>
<h3><A NAME="tweaking">Tweaking whole-program checking</A></h3>
<p>In some cases you want to check the whole program for memory leaks,
but waiting for after <code>main()</code> exits to do the first
whole-program leak check is waiting too long: e.g. in a long-running
server one might wish to simply periodically check for leaks while the
server is running. In this case, you can call the static method
<code>NoGlobalLeaks()</code>, to verify no global leaks have happened
as of that point in the program.</p>
<p>Alternately, doing the check after <code>main()</code> exits might
be too late. Perhaps you have some objects that are known not to
clean up properly at exit. You'd like to do the "at exit" check
before those objects are destroyed (since while they're live, any
memory they point to will not be considered a leak). In that case,
you can call <code>NoGlobalLeaks()</code> manually, near the end of
<code>main()</code>, and then call <code>CancelGlobalCheck()</code> to
turn off the automatic post-<code>main()</code> check.</p>
<p>Finally, there's a helper macro for "strict" and "draconian" modes,
which require all global memory to be freed before program exit. This
freeing can be time-consuming and is often unnecessary, since libc
cleans up all memory at program-exit for you. If you want the
benefits of "strict"/"draconian" modes without the cost of all that
freeing, look at <code>REGISTER_HEAPCHECK_CLEANUP</code> (in
<code>heap-checker.h</code>). This macro allows you to mark specific
cleanup code as active only when the heap-checker is turned on.</p>
<h2><a name="explicit">Explicit (Partial-program) Heap Leak Checking</h2>
<p>Instead of whole-program checking, you can check certain parts of
your code to verify they do not have memory leaks. There are two
types of checks you can do. The "no leak" check verifies that between
two parts of a program, no memory is allocated without being freed; it
checks that memory does not grow. The stricter "same heap" check
verifies that two parts of a program share the same heap profile; that
is, that the memory does not grow <i>or shrink</i>, or change in any
way.</p>
<p>To use this kind of checking code, bracket the code you want
checked by creating a <code>HeapLeakChecker</code> object at the
beginning of the code segment, and calling <code>*SameHeap()</code> or
<code>*NoLeaks()</code> at the end. These functions, and all others
referred to in this file, are declared in
<code>&lt;google/heap-checker.h&gt;</code>.
</p>
<p>Here's an example:</p>
<pre>
HeapLeakChecker heap_checker("test_foo");
{
code that exercises some foo functionality;
this code should preserve memory allocation state;
}
if (!heap_checker.SameHeap()) assert(NULL == "heap memory leak");
</pre>
<p>The various flavors of these functions -- <code>SameHeap()</code>,
<code>QuickSameHeap()</code>, <code>BriefSameHeap()</code> -- trade
off running time for accuracy: the faster routines might miss some
legitimate leaks. For instance, the briefest tests might be confused
by code like this:</p>
<pre>
void LeakTwentyBytes() {
char* a = malloc(20);
HeapLeakChecker heap_checker("test_malloc");
char* b = malloc(20);
free(a);
// This will pass: it totes up 20 bytes allocated and 20 bytes freed
assert(heap_checker.BriefNoLeaks()); // doesn't detect that b is leaked
}
</pre>
<p>(This is because <code>BriefSameHeap()</code> does not use <A
HREF="#pprof">pprof</A>, which is slower but is better able to track
allocations in tricky situations like the above.)</p>
<p>Note that adding in the <code>HeapLeakChecker</code> object merely
instruments the code for leak-checking. To actually turn on this
leak-checking on a particular run of the executable, you must still
run with the heap-checker turned on:</p>
<pre>% env HEAPCHECK=local /usr/local/bin/my_binary_compiled_with_tcmalloc</pre>
<p>If you want to do whole-program leak checking in addition to this
manual leak checking, you can run in <code>normal</code> or some other
mode instead: they'll run the "local" checks in addition to the
whole-program check.</p>
<h2><a name="disable">Disabling Heap-checking of Known Leaks</a></h2>
<p>Sometimes your code has leaks that you know about and are willing
to accept. You would like the heap checker to ignore them when
checking your program. You can do this by bracketing the code in
question with an appropriate heap-checking construct:</p>
<pre>
...
{
HeapLeakChecker::Disabler disabler;
&lt;leaky code&gt;
}
...
</pre>
Any objects allocated by <code>leaky code</code> (including inside any
routines called by <code>leaky code</code>) and any objects reachable
from such objects are not reported as leaks.
<p>Alternately, you can use <code>IgnoreObject()</code>, which takes a
pointer to an object to ignore. That memory, and everything reachable
from it (by following pointers), is ignored for the purposes of leak
checking. You can call <code>UnIgnoreObject()</code> to undo the
effects of <code>IgnoreObject()</code>.</p>
<h2><a name="options">Tuning the Heap Checker</h2>
<p>The heap leak checker has many options, some that trade off running
time and accuracy, and others that increase the sensitivity at the
risk of returning false positives. For most uses, the range covered
by the <A HREF="#types">heap-check flavors</A> is enough, but in
specialized cases more control can be helpful.</p>
<p>
These options are specified via environment varaiables.
</p>
<p>This first set of options controls sensitivity and accuracy. These
options are ignored unless you run the heap checker in <A
HREF="#types">as-is</A> mode.
<table frame=box rules=sides cellpadding=5 width=100%>
<tr valign=top>
<td><code>HEAP_CHECK_AFTER_DESTRUCTORS</code></td>
<td>Default: false</td>
<td>
When true, do the final leak check after all other global
destructors have run. When false, do it after all
<code>REGISTER_HEAPCHECK_CLEANUP</code>, typically much earlier in
the global-destructor process.
</td>
</tr>
<tr valign=top>
<td><code>HEAP_CHECK_IGNORE_THREAD_LIVE</code></td>
<td>Default: true</td>
<td>
If true, ignore objects reachable from thread stacks and registers
(that is, do not report them as leaks).
</td>
</tr>
<tr valign=top>
<td><code>HEAP_CHECK_IGNORE_GLOBAL_LIVE</code></td>
<td>Default: true</td>
<td>
If true, ignore objects reachable from global variables and data
(that is, do not report them as leaks).
</td>
</tr>
</table>
<p>These options modify the behavior of whole-program leak
checking.</p>
<table frame=box rules=sides cellpadding=5 width=100%>
<tr valign=top>
<td><code>HEAP_CHECK_REPORT</code></td>
<td>Default: true</td>
<td>
If true, use <code>pprof</code> to report more info about found leaks.
</td>
</tr>
<tr valign=top>
<td><code>HEAP_CHECK_STRICT_CHECK</code></td>
<td>Default: true</td>
<td>
If true, do the program-end check via <code>SameHeap()</code>;
if false, use <code>NoLeaks()</code>.
</td>
</tr>
</table>
<p>These options apply to all types of leak checking.</p>
<table frame=box rules=sides cellpadding=5 width=100%>
<tr valign=top>
<td><code>HEAP_CHECK_IDENTIFY_LEAKS</code></td>
<td>Default: false</td>
<td>
If true, generate the addresses of the leaked objects in the
generated memory leak profile files.
</td>
</tr>
<tr valign=top>
<td><code>HEAP_CHECK_TEST_POINTER_ALIGNMENT</code></td>
<td>Default: false</td>
<td>
If true, check all leaks to see if they might be due to the use
of unaligned pointers.
</td>
</tr>
<tr valign=top>
<td><code>PPROF_PATH</code></td>
<td>Default: pprof</td>
<td>
The location of the <code>pprof</code> executable.
</td>
</tr>
<tr valign=top>
<td><code>HEAP_CHECK_DUMP_DIRECTORY</code></td>
<td>Default: /tmp</td>
<td>
Where the heap-profile files are kept while the program is running.
</td>
</tr>
</table>
<h2>Tips for Handling Detected Leaks</h2>
<p>What do you do when the heap leak checker detects a memory leak?
First, you should run the reported <code>pprof</code> command;
hopefully, that is enough to track down the location where the leak
occurs.</p>
<p>If the leak is a real leak, you should fix it!</p>
<p>If you are sure that the reported leaks are not dangerous and there
is no good way to fix them, then you can use
<code>HeapLeakChecker::Disabler</code> and/or
<code>HeapLeakChecker::IgnoreObject()</code> to disable heap-checking
for certain parts of the codebase.</p>
<p>In "strict" or "draconian" mode, leaks may be due to incomplete
cleanup in the destructors of global variables. If you don't wish to
augment the cleanup routines, but still want to run in "strict" or
"draconian" mode, consider using <A
HREF="#tweaking"><code>REGISTER_HEAPCHECK_CLEANUP</code></A>.</p>
<h2>Hints for Debugging Detected Leaks</h2>
<p>Sometimes it can be useful to not only know the exact code that
allocates the leaked objects, but also the addresses of the leaked objects.
Combining this e.g. with additional logging in the program
one can then track which subset of the allocations
made at a certain spot in the code are leaked.
<br/>
To get the addresses of all leaked objects
define the environment variable <code>HEAP_CHECK_IDENTIFY_LEAKS</code>
to be <code>1</code>.
The object addresses will be reported in the form of addresses
of fake immediate callers of the memory allocation routines.
Note that the performance of doing leak-checking in this mode
can be noticeably worse than the default mode.
</p>
<p>One relatively common class of leaks that don't look real
is the case of multiple initialization.
In such cases the reported leaks are typically things that are
linked from some global objects,
which are initialized and say never modified again.
The non-obvious cause of the leak is frequently the fact that
the initialization code for these objects executes more than once.
<br/>
E.g. if the code of some <code>.cc</code> file is made to be included twice
into the binary, then the constructors for global objects defined in that file
will execute twice thus leaking the things allocated on the first run.
<br/>
Similar problems can occur if object initialization is done more explicitly
e.g. on demand by a slightly buggy code
that does not always ensure only-once initialization.
</p>
<p>
A more rare but even more puzzling problem can be use of not properly
aligned pointers (maybe inside of not properly aligned objects).
Normally such pointers are not followed by the leak checker,
hence the objects reachable only via such pointers are reported as leaks.
If you suspect this case
define the environment variable <code>HEAP_CHECK_TEST_POINTER_ALIGNMENT</code>
to be <code>1</code>
and then look closely at the generated leak report messages.
</p>
<h1>How It Works</h1>
<p>When a <code>HeapLeakChecker</code> object is constructed, it dumps
a memory-usage profile named
<code>&lt;prefix&gt;.&lt;name&gt;-beg.heap</code> to a temporary
directory. When <code>*NoLeaks()</code> or <code>*SameHeap()</code>
is called (for whole-program checking, this happens automatically at
program-exit), it dumps another profile, named
<code>&lt;prefix&gt;.&lt;name&gt;-end.heap</code>.
(<code>&lt;prefix&gt;</code> is typically determined automatically,
and <code>&lt;name&gt;</code> is typically <code>argv[0]</code>.) It
then compares the two profiles. If the second profile shows more
memory use than the first (or, for <code>*SameHeap()</code> calls,
any different pattern of memory use than the first), the
<code>*NoLeaks()</code> or <code>*SameHeap()</code> function will
return false. For "whole program" profiling, this will cause the
executable to abort (via <code>exit(1)</code>). In all cases, it will
print a message on how to process the dumped profiles to locate
leaks.</p>
<h3><A name=live>Detecting Live Objects</A></h3>
<p>At any point during a program's execution, all memory that is
accessible at that time is considered "live." This includes global
variables, and also any memory that is reachable by following pointers
from a global variable. It also includes all memory reachable from
the current stack frame and from current CPU registers (this captures
local variables). Finally, it includes the thread equivalents of
these: thread-local storage and thread heaps, memory reachable from
thread-local storage and thread heaps, and memory reachable from
thread CPU registers.</p>
<p>In all modes except "draconian," live memory is not
considered to be a leak. We detect this by doing a liveness flood,
traversing pointers to heap objects starting from some initial memory
regions we know to potentially contain live pointer data. Note that
this flood might potentially not find some (global) live data region
to start the flood from. If you find such, please file a bug.</p>
<p>The liveness flood attempts to treat any properly aligned byte
sequences as pointers to heap objects and thinks that it found a good
pointer whenever the current heap memory map contains an object with
the address whose byte representation we found. Some pointers into
not-at-start of object will also work here.</p>
<p>As a result of this simple approach, it's possible (though
unlikely) for the flood to be inexact and occasionally result in
leaked objects being erroneously determined to be live. For instance,
random bit patterns can happen to look like pointers to leaked heap
objects. More likely, stale pointer data not corresponding to any
live program variables can be still present in memory regions,
especially in thread stacks. For instance, depending on how the local
<code>malloc</code> is implemented, it may reuse a heap object
address:</p>
<pre>
char* p = new char[1]; // new might return 0x80000000, say.
delete p;
new char[1]; // new might return 0x80000000 again
// This last new is a leak, but doesn't seem it: p looks like it points to it
</pre>
<p>In other words, imprecisions in the liveness flood mean that for
any heap leak check we might miss some memory leaks. This means that
for local leak checks, we might report a memory leak in the local
area, even though the leak actually happened before the
<code>HeapLeakChecker</code> object was constructed. Note that for
whole-program checks, a leak report <i>does</i> always correspond to a
real leak (since there's no "before" to have created a false-live
object).</p>
<p>While this liveness flood approach is not very portable and not
100% accurate, it works in most cases and saves us from writing a lot
of explicit clean up code and other hassles when dealing with thread
data.</p>
<h3><A NAME="pprof">More Exact Checking via pprof</A></h3>
<p>The perftools profiling tool, <code>pprof</code>, is primarily
intended for users to use interactively in order to explore heap and
CPU usage. However, the heap-checker can -- and, by default, does -
call <code>pprof</code> internally, in order to improve its leak
checking.</p>
<p>In particular, the heap-checker calls <code>pprof</code> to utilize
the full call-path for all allocations. <code>pprof</code> uses this
data to disambiguate allocations. When the time comes to do a
<code>SameHeap</code> or <code>NoLeaks</code> check, the heap-checker
asks <code>pprof</code> to do this check on an
allocation-by-allocation basis, rather than just by comparing global
counts.</p>
<p>Here's an example. Consider the following function:</p>
<pre>
void LeakTwentyBytes() {
char* a = malloc(20);
HeapLeakChecker heap_checker("test_malloc");
char* b = malloc(20);
free(a);
heap_checker.NoLeaks();
}
</pre>
<p>Without using pprof, the only thing we will do is count up the
number of allocations and frees inside the leak-checked interval.
Twenty bytes allocated, twenty bytes freed, and the code looks ok.</p>
<p>With pprof, however, we can track the call-path for each
allocation, and account for them separately. In the example function
above, there are two call-paths that end in an allocation, one that
ends in "LeakTwentyBytes:line1" and one that ends in
"LeakTwentyBytes:line3".</p>
<p>Here's how the heap-checker works when it can use pprof in this
way:</p>
<ol>
<li> <b>Line 1:</b> Allocate 20 bytes, mark <code>a</code> as having
call-path "LeakTwentyBytes:line1", and update the count-map
<pre>count["LeakTwentyByte:line1"] += 20;</pre>
<li> <b>Line 2:</b> Dump the current <code>count</code> map to a file.
<li> <b>Line 3:</b> Allocate 20 bytes, mark <code>b</code> as having
call-path "LeakTwentyBytes:line3", and update the count-map:
<pre>count["LeakTwentyByte:line3"] += 20;</pre>
<li> <b>Line 4:</b> Look up <code>a</code> to find its call-path
(stored in line 1), and use that to update the count-map:
<pre>count["LeakTwentyByte:line1"] -= 20;</pre>
<li> <b>Line 5:</b> Look at each bucket in the current count-map,
minus what was dumped in line 2. Here's the diffs we'll have
in each bucket:
<pre>
count["LeakTwentyByte:line1"] == -20;
count["LeakTwentyByte:line3"] == 20;
</pre>
Since <i>at least one</i> bucket has a positive number, we
complain of a leak. (Note if line 5 had been
<code>SameHeap</code> instead of <code>NoLeaks</code>, we would
have complained if any bucket had had a <i>non-zero</i>
number.)
</ol>
<p>Note that one way to visualize the non-<code>pprof</code> mode is
that we do the same thing as above, but always use "unknown" as the
call-path. That is, our count-map always only has one entry in it:
<code>count["unknown"]</code>. Looking at the example above shows how
having only one entry in the map can lead to incorrect results.</p>
<p>Here is when <code>pprof</code> is used by the heap-checker:</p>
<ul>
<li> <code>NoLeaks()</code> and <code>SameHeap()</code> both use
<code>pprof</code>.
<li> <code>BriefNoLeaks()</code> and <code>BriefSameHeap()</code> do
not use <code>pprof</code>.
<li> <code>QuickNoLeaks</code> and <code>QuickSameHeap()</code> are
a kind of compromise: they do <i>not</i> use pprof for their
leak check, but if that check happens to find a leak anyway,
then they re-do the leak calculation using <code>pprof</code>.
This means they do not always find leaks, but when they do,
they will be as accurate as possible in their leak report.
</ul>
<h3>Leak-checking and Threads</h3>
<p>At the time of HeapLeakChecker's construction and during
<code>*NoLeaks()</code>/<code>*SameHeap()</code> calls, we grab a lock
and then pause all other threads so other threads do not interfere
with recording or analyzing the state of the heap.</p>
<p>In general, leak checking works correctly in the presence of
threads. However, thread stack data liveness determination (via
<code>base/thread_lister.h</code>) does not work when the program is
running under GDB, because the ptrace functionality needed for finding
threads is already hooked to by GDB. Conversely, leak checker's
ptrace attempts might also interfere with GDB. As a result, GDB can
result in potentially false leak reports. For this reason, the
heap-checker turns itself off when running under GDB.</p>
<p>Also, <code>thread_lister</code> only works for Linux pthreads;
leak checking is unlikely to handle other thread implementations
correctly.</p>
<p>As mentioned in the discussion of liveness flooding, thread-stack
liveness determination might mis-classify as reachable objects that
very recently became unreachable (leaked). This can happen when the
pointers to now-logically-unreachable objects are present in the
active thread stack frame. In other words, trivial code like the
following might not produce the expected leak checking outcome
depending on how the compiled code works with the stack:</p>
<pre>
int* foo = new int [20];
HeapLeakChecker check("a_check");
foo = NULL;
CHECK(check.NoLeaks()); // this might succeed
</pre>
<hr>
<address>Maxim Lifantsev<br>
<!-- Created: Tue Dec 19 10:43:14 PST 2000 -->
<!-- hhmts start -->
Last modified: Fri Jul 13 13:14:33 PDT 2007
<!-- hhmts end -->
</address>
</body>
</html>