<HTML>

<HEAD>
<title>pprof and Remote Servers</title>
</HEAD>

<BODY>

<h1><code>pprof</code> and Remote Servers</h1>

<p>In mid-2006, we added an experimental facility to <A
HREF="cpu_profiler.html">pprof</A>, the tool that analyzes CPU and
heap profiles. This facility lets you collect profile information
from a running application, without having to stop the program first
and without having to log into the machine where it is running. It
is meant to be used on webservers, but will work with any application
that can be modified to accept TCP connections on a port of its
choosing and to respond to HTTP requests on that port.</p>

<p>We do not currently have infrastructure, such as Apache modules,
that you can drop into a webserver or other application to get the
necessary functionality "for free." However, the necessary data is
easy to generate, so an interested developer should be able to add
the required support to his or her own applications.</p>

<p>To use <code>pprof</code> in this experimental "server" mode, you
give the script the host and port it should query, in place of the
usual command-line arguments of application + profile file:</p>
<pre>
% pprof internalweb.mycompany.com:80
</pre>

<p>The host must be listening on that port and must accept HTTP/1.0
requests (sent via <code>wget</code> and <code>curl</code>) for
several URLs. The following sections list the URLs that
<code>pprof</code> can request and the responses it expects in
return.</p>
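<p>As a rough illustration of the server side, the sketch below maps
an incoming request target to one of the URLs described in the
following sections. The names <code>PprofEndpoint</code> and
<code>ClassifyTarget</code> are hypothetical, not part of
google-perftools, and the paths assume the default values of the
<code>pprof</code> script variables.</p>

```cpp
#include <string>

// Endpoints described in this document; kUnknown covers everything else.
// These names are illustrative only and are not part of google-perftools.
enum class PprofEndpoint {
  kHeap, kGrowth, kProfile, kContention, kCmdline, kSymbol, kUnknown
};

// Maps a request target such as "/pprof/profile?seconds=30" to an endpoint.
// Any query string is ignored here; it is up to the handler to parse it.
PprofEndpoint ClassifyTarget(const std::string& target) {
  // find() returns npos when there is no '?', so substr() keeps everything.
  const std::string path = target.substr(0, target.find('?'));
  if (path == "/pprof/heap") return PprofEndpoint::kHeap;
  if (path == "/pprof/growth") return PprofEndpoint::kGrowth;
  if (path == "/pprof/profile") return PprofEndpoint::kProfile;
  if (path == "/pprof/contention") return PprofEndpoint::kContention;
  if (path == "/pprof/cmdline") return PprofEndpoint::kCmdline;
  if (path == "/pprof/symbol") return PprofEndpoint::kSymbol;
  return PprofEndpoint::kUnknown;
}
```

<p>A server would call something like this once per request and then
hand off to the matching handler. If you change one of the page
variables in the <code>pprof</code> script, the corresponding path
string must change with it.</p>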
<ul><li> <code><b>/pprof/heap</b></code>

<p><code>pprof</code> asks for the URL <code>/pprof/heap</code> to
get heap information. The actual URL is controlled via the variable
<code>HEAP_PAGE</code> in the <code>pprof</code> script, so you
can change it if you'd like.</p>

<p>The server should respond by calling</p>
<pre>
MallocExtension::instance()->GetHeapSample(&amp;output);
</pre>
<p>and sending <code>output</code> back as an HTTP response to
<code>pprof</code>. <code>MallocExtension</code> is defined in the
header file <code>google/malloc_extension.h</code>.</p>

<p>Here's an example, from an actual Google webserver, of what the
output should look like:</p>
<pre>
heap profile: 9369: 126987529 [ 9369: 126987529] @ heap
2: 1024 [ 2: 1024] @ 0x87da913 0x8923ad4 0x891d4c2 0x892de12 0x8930519 0x83a16c2 0x836cb38 0x834cd1c 0x8349ba5 0x10a3177 0x8349961
1: 36 [ 1: 36] @ 0x87da913 0x83a0929 0x836cb38 0x834cd1c 0x8349ba5 0x10a3177 0x8349961
308: 10092544 [ 308: 10092544] @ 0x87da913 0x8970d66 0x8970e64 0x896e8e2 0x88e69d2 0x88e6add 0x88e6dec 0x88e7384 0x88e73fa 0x8838793 0x8838b36 0x88395f8 0x88f5a4b 0x890d03a 0x890d65a 0x8917666 0x890d1f3 0x890e6e4 0x8349c1b 0x10a3177 0x8349961
[...]
</pre>
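<p>To sanity-check what your server sends, it can help to parse one
record of the output shown above. The sketch below is only an
illustration: <code>HeapRecord</code> and
<code>ParseHeapRecord</code> are hypothetical names, and the field
names assume the usual reading of each record as "objects: bytes
[objects: bytes] @ call-stack addresses", which this page does not
spell out.</p>

```cpp
#include <cstdio>
#include <sstream>
#include <string>
#include <vector>

// One stack-trace record from the heap output. The field names are an
// assumption about the meaning of the four numbers, not a documented API.
struct HeapRecord {
  long long in_use_objects = 0;
  long long in_use_bytes = 0;
  long long alloc_objects = 0;
  long long alloc_bytes = 0;
  std::vector<unsigned long long> stack;  // call-stack addresses
};

// Parses a line such as "2: 1024 [ 2: 1024] @ 0x87da913 0x8923ad4".
// Returns false for lines that do not start with that numeric prefix,
// which includes the "heap profile: ..." header line.
// Throws std::invalid_argument if a stack entry is not a hex number.
bool ParseHeapRecord(const std::string& line, HeapRecord* out) {
  int consumed = -1;  // written by %n only if the prefix through '@' matched
  const int fields = std::sscanf(
      line.c_str(), " %lld: %lld [ %lld: %lld] @%n",
      &out->in_use_objects, &out->in_use_bytes,
      &out->alloc_objects, &out->alloc_bytes, &consumed);
  if (fields != 4 || consumed < 0) return false;

  out->stack.clear();
  std::istringstream rest(line.substr(consumed));
  std::string token;
  while (rest >> token) {
    // Base 16 accepts an optional "0x" prefix.
    out->stack.push_back(std::stoull(token, nullptr, 16));
  }
  return true;
}
```

<p>Running each line of a response through a check like this is a
cheap way to confirm that the data reached the HTTP layer intact
before pointing <code>pprof</code> at the server.</p>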
</li><li> <code><b>/pprof/growth</b></code>

<p><code>pprof</code> asks for the URL <code>/pprof/growth</code> to
get heap-profiling delta (growth) information. The actual URL is
controlled via the variable <code>GROWTH_PAGE</code> in the
<code>pprof</code> script, so you can change it if you'd like.</p>

<p>The server should respond by calling</p>
<pre>
MallocExtension::instance()->GetHeapGrowthStacks(&amp;output);
</pre>
<p>and sending <code>output</code> back as an HTTP response to
<code>pprof</code>. <code>MallocExtension</code> is defined in the
header file <code>google/malloc_extension.h</code>.</p>

<p>Here's an example, from an actual Google webserver, of what the
output should look like:</p>
<pre>
heap profile: 741: 812122112 [ 741: 812122112] @ growth
1: 1572864 [ 1: 1572864] @ 0x87da564 0x87db8a3 0x84787a4 0x846e851 0x836d12f 0x834cd1c 0x8349ba5 0x10a3177 0x8349961
1: 1048576 [ 1: 1048576] @ 0x87d92e8 0x87d9213 0x87d9178 0x87d94d3 0x87da9da 0x8a364ff 0x8a437e7 0x8ab7d23 0x8ab7da9 0x8ac7454 0x8348465 0x10a3161 0x8349961
[...]
</pre>

</li><li> <code><b>/pprof/profile</b></code>

<p><code>pprof</code> asks for the URL
<code>/pprof/profile?seconds=XX</code> to get CPU-profiling
information. The actual URL is controlled via the variable
<code>PROFILE_PAGE</code> in the <code>pprof</code> script, so you can
change it if you'd like.</p>

<p>The server should respond by calling
<code>ProfilerStart(filename)</code>, continuing to do its work, and
then, XX seconds later, calling <code>ProfilerStop()</code>. (These
functions are declared in <code>google/profiler.h</code>.) The
application is responsible for picking a unique filename for
<code>ProfilerStart()</code>. After calling
<code>ProfilerStop()</code>, the server should read the contents of
<code>filename</code> and send them back as an HTTP response to
<code>pprof</code>.</p>

<p>Obviously, to get useful profile information the application must
continue to run during the XX seconds that the profiler is running.
Thus, the profile start-stop calls should be done in a separate
thread, or be otherwise non-blocking.</p>

<p>The profiler output file is binary, but near its end it should
contain lines of text, in the format of <code>/proc/self/maps</code>,
somewhat like this:</p>
<pre>
01016000-01017000 rw-p 00015000 03:01 59314 /lib/ld-2.2.2.so
</pre>
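<p>The first step in handling this request is extracting the
<code>seconds=XX</code> value. A minimal sketch follows, assuming the
handler receives the raw request target as a string.
<code>ParseProfileSeconds</code>, the fallback value, and the
one-hour upper bound are all assumptions made for this example; the
calls to <code>ProfilerStart()</code> and <code>ProfilerStop()</code>
are deliberately left out.</p>

```cpp
#include <cstdlib>
#include <string>

// Extracts the "seconds" value from a target like "/pprof/profile?seconds=30".
// Returns default_seconds when the parameter is missing, is not a positive
// number, or exceeds 3600 (an arbitrary cap chosen for this sketch).
int ParseProfileSeconds(const std::string& target, int default_seconds) {
  static const std::string kKey = "seconds=";
  const std::string::size_type query = target.find('?');
  if (query == std::string::npos) return default_seconds;

  // Walk the '&'-separated parameters that follow the '?'.
  std::string::size_type pos = query + 1;
  while (pos < target.size()) {
    std::string::size_type end = target.find('&', pos);
    if (end == std::string::npos) end = target.size();
    if (target.compare(pos, kKey.size(), kKey) == 0) {
      // strtol stops at the first non-digit, such as '&' or the string end.
      const long value =
          std::strtol(target.c_str() + pos + kKey.size(), nullptr, 10);
      const bool in_range = value > 0 && value <= 3600;
      return in_range ? static_cast<int>(value) : default_seconds;
    }
    pos = end + 1;
  }
  return default_seconds;
}
```

<p>The returned value is how long the server should wait between
<code>ProfilerStart(filename)</code> and <code>ProfilerStop()</code>,
on a thread other than the ones serving requests. Bounding the
duration keeps a mistyped request from tying up the profiler for an
unexpectedly long time.</p>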
</li><li> <code><b>/pprof/contention</b></code>

<p>This URL is intended for profiling (thread) lock contention, in
addition to CPU and memory use. It is not yet usable.</p>

</li><li> <code><b>/pprof/cmdline</b></code>

<p><code>pprof</code> asks for the URL <code>/pprof/cmdline</code> to
figure out what application it's profiling. The actual URL is
controlled via the variable <code>PROGRAM_NAME_PAGE</code> in the
<code>pprof</code> script, so you can change it if you'd like.</p>

<p>The server should respond by reading the contents of
<code>/proc/self/cmdline</code>, converting all internal NUL (\0)
characters to newlines, and sending the result back as an HTTP
response to <code>pprof</code>.</p>

<p>Here's an example return value:</p>
<pre>
/root/server/custom_webserver
80
--configfile=/root/server/ws.config
</pre>
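<p>The conversion described above is small enough to sketch in full.
The function names below are hypothetical, and dropping the trailing
NUL (rather than turning it into a final newline) is this example's
interpretation of "internal".</p>

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Converts the raw contents of /proc/self/cmdline into a response body:
// trailing NULs are dropped and every remaining (internal) NUL becomes a
// newline, so each argument ends up on its own line.
std::string CmdlineToLines(std::string raw) {
  while (!raw.empty() && raw.back() == '\0') raw.pop_back();
  for (char& c : raw) {
    if (c == '\0') c = '\n';
  }
  return raw;
}

// Reads /proc/self/cmdline (Linux-specific) and returns it ready to send.
// Returns an empty string if the file cannot be opened.
std::string ReadCmdlineResponse() {
  std::ifstream in("/proc/self/cmdline", std::ios::binary);
  std::string raw((std::istreambuf_iterator<char>(in)),
                  std::istreambuf_iterator<char>());
  return CmdlineToLines(raw);
}
```

<p>The file is read in binary mode, and into a <code>std::string</code>
rather than a C string, because the NUL separators would otherwise cut
the data off after the program name.</p>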
</li><li> <code><b>/pprof/symbol</b></code>

<p><code>pprof</code> asks for the URL <code>/pprof/symbol</code> to
map from hex addresses to function names. The actual URL is
controlled via the variable <code>SYMBOL_PAGE</code> in the
<code>pprof</code> script, so you can change it if you'd like.</p>

<p>This is perhaps the hardest request to write code for, because
the server must accept POST requests. This means that after the HTTP
headers, <code>pprof</code> will pass in a list of hex addresses
joined by <code>+</code>, like so:</p>
<pre>
curl -d '0x0824d061+0x0824d1cf' http://remote_host:80/pprof/symbol
</pre>

<p>The server should read the POST data, which will be on one line,
and for each hex value write one line to the output stream, like
so:</p>
<pre>
&lt;hex address&gt;&lt;tab&gt;&lt;function name&gt;
</pre>
<p>For instance:</p>
<pre>
0x08b2dabd _Update
</pre>

<p>The other reason this is the most difficult request to implement
is that the application will have to figure out for itself how to map
from address to function name. One possibility is to run <code>nm -C
-n &lt;program name&gt;</code> to get the mappings, either statically
(say, at program-compile time) or dynamically, by having the
application call out to <code>nm</code> for every
<code>/pprof/symbol</code> call (presumably with some caching!).</p>

<p><code>pprof</code> itself does just this for local profiles (not
ones that talk to remote servers); look at the subroutine
<code>GetProcedureBoundaries</code>.</p>
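<p>Once the symbol table is in memory, answering the POST request is
mostly a sorted lookup. The sketch below assumes the table has already
been loaded, for example from <code>nm -C -n</code> output, into a map
from symbol start address to name. <code>SymbolTable</code>,
<code>LookupSymbol</code>, <code>HandleSymbolPost</code>, and the
<code>(unknown)</code> placeholder are hypothetical. Symbol sizes are
ignored, so an address past the end of the last function still
resolves to that function.</p>

```cpp
#include <cstdint>
#include <map>
#include <sstream>
#include <string>

// Symbol start address -> function name. Loading it is left to the caller.
using SymbolTable = std::map<std::uint64_t, std::string>;

// Returns the name of the symbol with the greatest start address <= addr,
// or "(unknown)" (a placeholder chosen for this sketch) if there is none.
std::string LookupSymbol(const SymbolTable& symbols, std::uint64_t addr) {
  auto it = symbols.upper_bound(addr);  // first symbol starting after addr
  if (it == symbols.begin()) return "(unknown)";
  --it;
  return it->second;
}

// Turns a POST body such as "0x0824d061+0x0824d1cf" into response lines of
// the form "<hex address>\t<function name>\n", one per address.
// Throws std::invalid_argument if an entry is not a hex number.
std::string HandleSymbolPost(const SymbolTable& symbols, std::string body) {
  // Drop any line terminator that follows the single line of POST data.
  while (!body.empty() && (body.back() == '\n' || body.back() == '\r')) {
    body.pop_back();
  }
  std::istringstream in(body);
  std::string token;
  std::string response;
  while (std::getline(in, token, '+')) {
    if (token.empty()) continue;
    // Base 16 accepts the "0x" prefix; the address is echoed back as sent.
    const std::uint64_t addr = std::stoull(token, nullptr, 16);
    response += token + "\t" + LookupSymbol(symbols, addr) + "\n";
  }
  return response;
}
```

<p>Keeping the table in a sorted container means each lookup is
logarithmic in the number of symbols, which matters because a single
request can carry many addresses.</p>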
</li></ul>

<hr>
Last modified: Mon Jun 12 21:30:14 PDT 2006
</body>
</html>