gperftools/doc/pprof_remote_servers.html
csilvers 8e188310f7 Wed Jun 14 15:11:14 2006 Google Inc. <opensource@google.com>
* google-perftools: version 0.8 release
	* Experimental support for remote profiling added to pprof (many)
	* Fixed race condition in ProfileData::FlushTable (etune)
	* Better support for weird /proc maps (maxim, mec)
	* Fix heap-checker interaction with gdb (markus)
	* Better 64-bit support in pprof (aruns)
	* Reduce scavenging cost in tcmalloc by capping NumMoveSize (sanjay)
	* Cast syscall(SYS_mmap); works on more 64-bit systems now (menage)
	* Document the text output of pprof! (csilvers)
	* Better compiler support for no-THREADS and for old compilers (csilvers)
	* Make libunwind the default stack unwinder for x86-64 (aruns)
	* Somehow the COPYING file got erased.  Regenerate it (csilvers)


git-svn-id: http://gperftools.googlecode.com/svn/trunk@23 6b5cf1ce-ec42-a296-1ba9-69fdba395a50
2007-03-22 04:55:49 +00:00

191 lines
7.4 KiB
HTML

<HTML>
<HEAD>
<title>pprof and Remote Servers</title>
</HEAD>
<BODY>
<h1><code>pprof</code> and Remote Servers</h2>
<p>In mid-2006, we added an experimental facility to <A
HREF="cpu_profiler.html">pprof</A>, the tool that analyzes CPU and
heap profiles. This facility allows you to collect profile
information from running applications. It makes it easy to collect
profile information without having to stop the program first, and
without having to log into the machine where the application is
running. This is meant to be used on webservers, but will work on any
application that can be modified to accept TCP connections on a port
of its choosing, and to respond to HTTP requests on that port.</p>
<p>We do not currently have infrastructure, such as apache modules,
that you can pop into a webserver or other application to get the
necessary functionality "for free." However, it's easy to generate
the necessary data, which should allow the interested developer to add
the necessary support into his or her applications.</p>
<p>To use <code>pprof</code> in this experimental "server" mode, you
give the script a host and port it should query, replacing the normal
commandline arguments of application + profile file:</p>
<pre>
% pprof internalweb.mycompany.com:80
</pre>
<p>The host must be listening on that port, and be able to accept HTTP/1.0
requests -- sent via <code>wget</code> and <code>curl</code> -- for
several urls. The following sections list the urls that
<code>pprof</code> can send, and the responses it expects in
return.</p>
<ul><li> <code><b>/pprof/heap</b></code>
<p><code>pprof</code> asks for the url <code>/pprof/heap</code> to
get heap information. The actual url is controlled via the variable
<code>HEAP_PAGE</code> in the <code>pprof</code> script, so you
can change it if you'd like.</p>
<p>The server should respond by calling</p>
<pre>
MallocExtension::instance()->GetHeapSample(&output);
</pre>
<p>and sending <code>output</code> back as an HTTP response to
<code>pprof</code>. <code>MallocExtension</code> is defined in the
header file <code>google/malloc_extension.h</code>.</p>
<p>Here's an example, from an actual Google webserver, of what the
output should look like:</p>
<pre>
heap profile: 9369: 126987529 [ 9369: 126987529] @ heap
2: 1024 [ 2: 1024] @ 0x87da913 0x8923ad4 0x891d4c2 0x892de12 0x8930519 0x83a16c2 0x836cb38 0x834cd1c 0x8349ba5 0x10a3177 0x8349961
1: 36 [ 1: 36] @ 0x87da913 0x83a0929 0x836cb38 0x834cd1c 0x8349ba5 0x10a3177 0x8349961
308: 10092544 [ 308: 10092544] @ 0x87da913 0x8970d66 0x8970e64 0x896e8e2 0x88e69d2 0x88e6add 0x88e6dec 0x88e7384 0x88e73fa 0x8838793 0x8838b36 0x88395f8 0x88f5a4b 0x890d03a 0x890d65a 0x8917666 0x890d1f3 0x890e6e4 0x8349c1b 0x10a3177 0x8349961
[...]
</pre>
</li><li> <code><b>/pprof/growth</b></code>
<p><code>pprof</code> asks for the url <code>/pprof/growth</code> to
get heap-profiling delta (growth) information. The actual url is
controlled via the variable <code>GROWTH_PAGE</code> in the
<code>pprof</code> script, so you can change it if you'd like.</p>
<p>The server should respond by calling</p>
<pre>
MallocExtension::instance()->GetHeapGrowthStacks(&output);
</pre>
<p>and sending <code>output</code> back as an HTTP response to
<code>pprof</code>. <code>MallocExtension</code> is defined in the
header file <code>google/malloc_extension.h</code>.</p>
<p>Here's an example, from an actual Google webserver, of what the
output should look like:</p>
<pre>
heap profile: 741: 812122112 [ 741: 812122112] @ growth
1: 1572864 [ 1: 1572864] @ 0x87da564 0x87db8a3 0x84787a4 0x846e851 0x836d12f 0x834cd1c 0x8349ba5 0x10a3177 0x8349961
1: 1048576 [ 1: 1048576] @ 0x87d92e8 0x87d9213 0x87d9178 0x87d94d3 0x87da9da 0x8a364ff 0x8a437e7 0x8ab7d23 0x8ab7da9 0x8ac7454 0x8348465 0x10a3161 0x8349961
[...]
</pre>
</li><li> <code><b>/pprof/profile</b></code>
<p><code>pprof</code> asks for the url
<code>/pprof/profile?seconds=XX</code> to get cpu-profiling
information. The actual url is controlled via the variable
<code>PROFILE_PAGE</code> in the <code>pprof</code> script, so you can
change it if you'd like.</p>
<p>The server should respond by calling
<code>ProfilerStart(filename)</code>, continuing to do its work, and
then, XX seconds later, calling <code>ProfilerStop()</code>. (These
functions are declared in <code>google/profiler.h</code>.) The
application is responsible for picking a unique filename for
<code>ProfilerStart()</code>. After calling
<code>ProfilerStop()</code>, the server should read the contents of
<code>filename</code> and send them back as an HTTP response to
<code>pprof</code>.</p>
<p>Obviously, to get useful profile information the application must
continue to run in the XX seconds that the profiler is running. Thus,
the profile start-stop calls should be done in a separate thread, or
be otherwise non-blocking.</p>
<p>The profiler output file is binary, but near the end of it, it
should have lines of text somewhat like this:</p>
<pre>
01016000-01017000 rw-p 00015000 03:01 59314 /lib/ld-2.2.2.so
</pre>
</li><li> <code><b>/pprof/contention</b></code>
<p>This is intended to be able to profile (thread) lock contention in
addition to CPU and memory use. It's not yet usable.</p>
</li><li> <code><b>/pprof/cmdline</b></code>
<p><code>pprof</code> asks for the url <code>/pprof/cmdline</code> to
figure out what application it's profiling. The actual url is
controlled via the variable <code>PROGRAM_NAME_PAGE</code> in the
<code>pprof</code> script, so you can change it if you'd like.</p>
<p>The server should respond by reading the contents of
<code>/proc/self/cmdline</code>, converting all internal NUL (\0)
characters to newlines, and sending the result back as an HTTP
response to <code>pprof</code>.</p>
<p>Here's an example return value:<p>
<pre>
/root/server/custom_webserver
80
--configfile=/root/server/ws.config
</pre>
</li><li> <code><b>/pprof/symbol</b></code>
<p><code>pprof</code> asks for the url <code>/pprof/symbol</code> to
map from hex addresses to variable names. The actual url is
controlled via the variable <code>SYMBOL_PAGE</code> in the
<code>pprof</code> script, so you can change it if you'd like.</p>
<p>This is perhaps the hardest request to write code for, because
it must accept POST requests. This means that after the HTTP headers,
pprof will pass in a list of hex addresses connected by
<code>+</code>, like so:</p>
<pre>
curl -d '0x0824d061+0x0824d1cf' http://remote_host:80/pprof/symbol
</pre>
<p>The server should read the POST data, which will be in one line,
and for each hex value, should write one line of output to the output
stream, like so:</p>
<pre>
&lt;hex address&gt;&lt;tab&gt;&lt;function name&gt;
</pre>
<p>For instance:</p>
<pre>
0x08b2dabd _Update
</pre>
<p>The other reason this is the most difficult request to implement,
is that the application will have to figure out for itself how to map
from address to function name. One possibility is to run <code>nm -C
-n &lt;program name&gt;</code> to get the mappings, either statically
(say at program-compile time), or dynamically, by having the
application call out to <code>nm</code> for every
<code>pprof/symbol</code> call (presumably with some caching!).</p>
<p><code>pprof</code> itself does just this for local profiles (not
ones that talk to remote servers); look at the subroutine
<code>GetProcedureBoundaries</code>.</p>
<hr>
Last modified: Mon Jun 12 21:30:14 PDT 2006
</body>
</html>