Copyright 1994, 1995, 1996, 1999, 2000, 2001, 2002 Free Software
Foundation, Inc.
This file is free documentation; the Free Software Foundation gives
unlimited permission to copy, distribute and modify it.
Perftools-Specific Install Notes
================================
The generic autotools-provided installation notes are at the
end. gperftools-specific details follow immediately below.
*** Building from source repository
As of 2.1 gperftools does not have configure and other autotools
products checked into its source repository. This is common practice
for projects using autotools.
NOTE: Source releases (.tar.gz that you download from
https://github.com/gperftools/gperftools/releases) still have all
required files just as before. Nothing has changed w.r.t. building
from .tar.gz releases.
However, in order to build gperftools checked out from the source
repository you need to have autoconf, automake and libtool
installed. Before running ./configure you have to generate it (and
a bunch of other files) by running the ./autogen.sh script. That script
takes care of calling the correct autotools programs in the correct order.
If you're a maintainer then it's business as usual too. Just run make
dist (or, preferably, make distcheck) and it'll produce a .tar.gz or
.tar.bz2 with all autotools magic already included, so that users can
build our software without having autotools.
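As a rough sketch of the two cases above (release tarball versus
repository checkout), the shell functions below decide whether
./autogen.sh has to run first. The names needs_autogen and
bootstrap_plan are made up for this illustration; only ./autogen.sh,
./configure and make come from these notes. The plan is only printed,
nothing is executed.

```shell
#!/bin/sh
# Sketch only: needs_autogen and bootstrap_plan are hypothetical helper names.

# Return 0 if the tree at $1 still needs ./autogen.sh, 1 if configure is
# already present, 2 if the directory does not look like a source tree.
needs_autogen() {
    if [ -f "$1/configure" ]; then
        return 1
    elif [ -f "$1/autogen.sh" ]; then
        return 0
    fi
    echo "no configure or autogen.sh in $1" >&2
    return 2
}

# Print (but do not run) the commands needed to build the tree at $1.
bootstrap_plan() {
    needs_autogen "$1" && rc=0 || rc=$?
    case $rc in
        0) echo "./autogen.sh" ;;
        1) ;;
        *) return 1 ;;
    esac
    echo "./configure"
    echo "make"
}
```

For a repository checkout, bootstrap_plan . prints ./autogen.sh first;
for an unpacked release tarball it starts directly with ./configure.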
*** Stacktrace capturing details
A number of gperftools facilities capture stack traces, and
occasionally this happens in 'tricky' locations, such as inside a
SIGPROF handler. As a result, some platforms and library versions
occasionally cause trouble (crashes, hangs, or truncated stack traces).
We therefore provide several implementations that users are able to
select at runtime. Pass TCMALLOC_STACKTRACE_METHOD_VERBOSE=t as an
environment variable to ./stacktrace_unittest to see the options.
* frame-pointer-based stacktracing is fully supported on x86 (all 3
kinds: i386, x32 and x86-64 are supported), aarch64 and riscv. But
all modern architectures and ABIs by default build code without
frame pointers (even on i386). So in order to get anything useful
out of this option, you need to build your code with frame
pointers. It adds some performance overhead (people usually quote
on the order of 2%-3%, but it can vary a lot with the workload). It
is also worth mentioning that it is fairly common for various asm
routines not to have frame pointers, so you'll get somewhat
imperfect profiles out of typical asm bits like memcpy. This stack
trace capturing method is also the fastest (roughly 2-3 orders of
magnitude faster), which matters when stacktrace capturing is done a
lot (e.g. heap profiler).
* libgcc-based stacktracing works particularly well on modern
GNU/Linux systems with glibc 2.34 or later and libgcc from gcc 12 or
later. Thanks to the dl_find_object API introduced in recent
glibc versions, this implementation seems to be truly async-signal
safe and it is reasonably fast too. On Linux and other ELF platforms
it uses the eh_frame facility (which is very similar to dwarf unwind
info), originally introduced for exception handling. On most modern
platforms this unwind info is automatically added by compilers. On
others you might need to add -fexceptions and/or
-fasynchronous-unwind-tables to your compiler flags. To make this
option the default, pass --enable-libgcc-unwinder-by-default to
configure. When used without dl_find_object it will occasionally
deadlock, especially when used in cpuprofiler.
* libunwind is another supported mechanism and is the default when
available. It also depends on eh_frame (or dwarf, or some
ARM-specific unwind info when available). When using it, be sure to
use the latest available libunwind version. As with libgcc, some people
occasionally had trouble with it on code with broken or missing
unwind info. If you encounter something like that, first make sure
to file tickets against your compiler vendor. Second, libunwind has
a configure option to check accesses more thoroughly, so consider
that.
* many systems provide backtrace() function either as part of their
libc or in -lexecinfo. On most systems, including GNU/Linux, it is
not built by default, so pass --enable-stacktrace-via-backtrace to
configure to enable it. Occasionally this implementation will call
malloc when capturing backtrace, but we should automagically handle
it via our "emergency malloc" facility which is now built by default
on most systems (but it currently doesn't handle being used by
cpuprofiler).
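The following sketch collects the flags mentioned in the list above
for building your own application, plus a guarded way to list the
available implementations. The helper names are made up for this
illustration. -fno-omit-frame-pointer is the usual GCC/Clang switch for
keeping frame pointers; it is not named in these notes, so check your
compiler's documentation.

```shell
#!/bin/sh
# Sketch only: app_unwind_cflags and show_stacktrace_methods are
# hypothetical helper names.

# Print compiler flags to add to YOUR application's build for a given
# unwinding approach.
app_unwind_cflags() {
    case $1 in
        frame-pointer) echo "-fno-omit-frame-pointer" ;;
        eh-frame)      echo "-fexceptions -fasynchronous-unwind-tables" ;;
        *) echo "unknown approach: $1" >&2; return 1 ;;
    esac
}

# List the stacktrace implementations of this build, provided the unit
# test binary is present in the current directory.
show_stacktrace_methods() {
    if [ -x ./stacktrace_unittest ]; then
        TCMALLOC_STACKTRACE_METHOD_VERBOSE=t ./stacktrace_unittest
    else
        echo "stacktrace_unittest not found; build it in the gperftools tree first" >&2
        return 1
    fi
}
```

For example, CXXFLAGS="-O2 $(app_unwind_cflags frame-pointer)" could be
passed to your application's own build system.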
*** TCMALLOC LARGE PAGES: TRADING TIME FOR SPACE
You can set a compiler directive that makes tcmalloc faster, at the
cost of using more space (due to internal fragmentation).
Internally, tcmalloc divides its memory into "pages." The default
page size is chosen to minimize memory use by reducing fragmentation.
The cost is that keeping track of these pages can cost tcmalloc time.
We've added a new flag to tcmalloc that enables a larger page size.
In general, this will increase the memory needs of applications using
tcmalloc. However, in many cases it will speed up the applications
as well, particularly if they allocate and free a lot of memory. We've
seen average speedups of 3-5% on Google applications.
To build libtcmalloc with large pages you need to use the
--with-tcmalloc-pagesize=ARG configure flag, e.g.:
./configure <other flags> --with-tcmalloc-pagesize=32
The ARG argument can be 4, 8, 16, 32, 64, 128 or 256 which sets the
internal page size to 4K, 8K, 16K, 32K, 64K, 128K and 256K respectively.
The default is 8K.
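Since only a fixed set of values is accepted, a build script may want
to validate the page size before calling configure. A minimal sketch
follows; pagesize_flag is a made-up helper name, and only the flag and
its accepted values come from the text above.

```shell
#!/bin/sh
# Sketch only: pagesize_flag is a hypothetical helper name.

# Print the configure flag for a tcmalloc page size given in K, or fail
# if the value is not one of the sizes listed above.
pagesize_flag() {
    case $1 in
        4|8|16|32|64|128|256)
            echo "--with-tcmalloc-pagesize=$1" ;;
        *)
            echo "unsupported page size '$1' (use 4, 8, 16, 32, 64, 128 or 256)" >&2
            return 1 ;;
    esac
}
```

It could then be used as: ./configure <other flags> "$(pagesize_flag 32)"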
*** SMALL TCMALLOC CACHES: TRADING SPACE FOR TIME
You can set a compiler directive that makes tcmalloc use less memory
for overhead, at the cost of some time.
Internally, tcmalloc keeps information about some of its internal data
structures in a cache. This speeds memory operations that need to
access this internal data. We've added a new, experimental flag to
tcmalloc that reduces the size of this cache, decreasing the memory
needs of applications using tcmalloc.
This feature is still very experimental; it's not even a configure
flag yet. To build libtcmalloc with smaller internal caches, run
./configure <normal flags> CXXFLAGS=-DTCMALLOC_SMALL_BUT_SLOW
(or add -DTCMALLOC_SMALL_BUT_SLOW to your existing CXXFLAGS argument).
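When CXXFLAGS already carries other options, the define has to be
appended rather than replace them. One possible sketch is shown below;
add_small_but_slow is a made-up helper name, and only the define itself
comes from the text above.

```shell
#!/bin/sh
# Sketch only: add_small_but_slow is a hypothetical helper name.

# Print the flags in $1 with -DTCMALLOC_SMALL_BUT_SLOW appended, unless
# the define is already present.
add_small_but_slow() {
    flags=$1
    case " $flags " in
        *" -DTCMALLOC_SMALL_BUT_SLOW "*)
            echo "$flags" ;;
        *)
            echo "${flags:+$flags }-DTCMALLOC_SMALL_BUT_SLOW" ;;
    esac
}
```

For example: ./configure <normal flags> CXXFLAGS="$(add_small_but_slow "$CXXFLAGS")"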
*** TCMALLOC AND DLOPEN
To improve performance, we use the "initial exec" model of Thread
Local Storage in tcmalloc. The price for this is the library will not
work correctly if it is loaded via dlopen(). This should not be a
problem, since loading a malloc-replacement library via dlopen is
asking for trouble in any case: some data will be allocated with one
malloc, some with another.
*** COMPILING ON NON-LINUX SYSTEMS
We regularly build and test on typical modern GNU/Linux systems. You
should expect all tests to pass on modern Linux distros and x86,
aarch64 and riscv machines. Other machine types may fail some tests,
but you should expect at least malloc to be fully functional.
Perftools has been tested on the following non-Linux systems:
Various recent versions of FreeBSD (x86-64 mostly)
Recent version of NetBSD (x86-64)
Recent versions of OSX (aarch64; x86 and ppc haven't been tested for some time)
Solaris 10 (x86_64), but not recently
Windows using both MSVC (starting from MSVC 2015 and later) and mingw toolchains
Windows XP and other obsolete versions have not been tested recently
Windows XP, Cygwin 5.1 (x86), but not recently
Portions of gperftools work on those other systems. The basic
memory-allocation library, tcmalloc_minimal, works on all systems.
The cpu-profiler also works fairly widely. However, the heap-profiler
and heap-checker are not yet as widely supported. Heap checker is now
deprecated. In general, the 'configure' script will detect what OS you
are building for, and only build the components that work on that OS.
Note that tcmalloc_minimal is perfectly usable as a malloc/new
replacement, so it is possible to use tcmalloc on all the systems
above, by linking in libtcmalloc_minimal.
** Solaris 10 x86: (note, this is fairly old)
I've only tested using the GNU C++ compiler, not the Sun C++
compiler. Using g++ requires setting the PATH appropriately when
configuring.
% PATH=${PATH}:/usr/sfw/bin/:/usr/ccs/bin ./configure
% PATH=${PATH}:/usr/sfw/bin/:/usr/ccs/bin make [...]
Again, the binaries and libraries that successfully build are
exactly the same as for FreeBSD. (However, while libprofiler.so can
be used to generate profiles, pprof is not very successful at
reading them -- necessary helper programs like nm don't seem
to be installed by default on Solaris, or perhaps are only
installed as part of the Sun C++ compiler package.) See that
section for a list of binaries, and instructions on building them.
** Windows (MSVC, Cygwin, and MinGW):
Work on Windows is rather preliminary: only tcmalloc_minimal is
supported.
This Windows functionality is also available using MinGW and Msys.
In this case, you can use the regular './configure && make'
process. 'make install' should also work. The Makefile will limit
itself to those libraries and binaries that work on Windows.
** AIX (as of 2021)
I've tested using the IBM XL and IBM Open XL Compilers. The
minimum requirement for IBM XL is V16, which includes C++11
support. IBM XL and gcc are not ABI compatible. If you would
like to use the library with a gcc-built executable then the
library must also be built with gcc. Likewise, to use the library
with an IBM XL-built binary, the library must also be built with
IBM XL.
Both 32-bit and 64-bit builds have been tested.
To do a 32-bit IBM XL build:
% ./configure CC="xlclang" CXX="xlclang++" AR="ar" \
RANLIB="ranlib" NM="nm"
To do a 64-bit IBM XL build:
% ./configure CC="xlclang -q64" CXX="xlclang++ -q64" \
AR="ar -X64" RANLIB="ranlib -X64" NM="nm -X64"
Add your favorite optimization levels via CFLAGS and CXXFLAGS.
Linking to the shared library alone may not work as you
expect: allocations and deallocations that occur from within
the Standard C and C++ libraries will not be redirected to the
tcmalloc library.
The recommended method is to use the AIX User-defined malloc
replacement as documented by IBM. This replaces the default
AIX memory subsystem with a user defined memory subsystem.
The AIX user-defined memory subsystem specifies that the 32-bit
and 64-bit objects must be placed in an archive, with the
32-bit shared object named mem32.o and the 64-bit shared
object named mem64.o.
It is recommended to make a combined 32_64-bit archive: do a
64-bit build, copy the shared library to mem64.o and add mem64.o
to the archive; then do a 32-bit build, copy the shared library
to mem32.o and add it to the same combined archive.
For example, perform a 64-bit build, then:
% cp libtcmalloc_minimal.so.4 mem64.o
% ar -X32_64 -r libtcmalloc_minimal.a mem64.o
Followed by a 32-bit build:
% cp libtcmalloc_minimal.so.4 mem32.o
% ar -X32_64 -r libtcmalloc_minimal.a mem32.o
The final archive should contain both mem32.o and mem64.o.
To use the library you are expected to have the library location
in your LIBPATH or LD_LIBRARY_PATH, and then to export the
environment variable MALLOCTYPE=user:libtcmalloc_minimal.a to
enable the new user-defined memory subsystem.
I recommend using:
% MALLOCTYPE=user:libtcmalloc_minimal.a <user-executable>
to minimize the impact of replacing the memory subsystem. Once
the subsystem is replaced it is used for all commands issued from
the terminal.
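The difference between the per-command form and an exported variable
is plain shell behaviour and can be seen on any system. In the sketch
below, sh -c with echo merely stands in for <user-executable>; outside
AIX the variable has no effect on the allocator.

```shell
#!/bin/sh
# Start from a clean state so the demonstration is deterministic.
unset MALLOCTYPE

# Per-command form: the variable exists only in the child's environment.
MALLOCTYPE=user:libtcmalloc_minimal.a sh -c 'echo "child: ${MALLOCTYPE:-unset}"'
echo "shell after per-command form: ${MALLOCTYPE:-unset}"

# Exported form, confined to a subshell here so it does not leak further:
# every command started after the export inherits the variable.
(
    export MALLOCTYPE=user:libtcmalloc_minimal.a
    sh -c 'echo "child after export: ${MALLOCTYPE:-unset}"'
)
```

The second line of output reports the variable as unset, which is why
the per-command form keeps the replacement limited to one program.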
Basic Installation
==================
These are generic installation instructions.
The `configure' shell script attempts to guess correct values for
various system-dependent variables used during compilation. It uses
those values to create a `Makefile' in each directory of the package.
It may also create one or more `.h' files containing system-dependent
definitions. Finally, it creates a shell script `config.status' that
you can run in the future to recreate the current configuration, and a
file `config.log' containing compiler output (useful mainly for
debugging `configure').
It can also use an optional file (typically called `config.cache'
and enabled with `--cache-file=config.cache' or simply `-C') that saves
the results of its tests to speed up reconfiguring. (Caching is
disabled by default to prevent problems with accidental use of stale
cache files.)
If you need to do unusual things to compile the package, please try
to figure out how `configure' could check whether to do them, and mail
diffs or instructions to the address given in the `README' so they can
be considered for the next release. If you are using the cache, and at
some point `config.cache' contains results you don't want to keep, you
may remove or edit it.
The file `configure.ac' (or `configure.in') is used to create
`configure' by a program called `autoconf'. You only need
`configure.ac' if you want to change it or regenerate `configure' using
a newer version of `autoconf'.
The simplest way to compile this package is:
1. `cd' to the directory containing the package's source code and type
`./configure' to configure the package for your system. If you're
using `csh' on an old version of System V, you might need to type
`sh ./configure' instead to prevent `csh' from trying to execute
`configure' itself.
Running `configure' takes a while. While running, it prints some
messages telling which features it is checking for.
2. Type `make' to compile the package.
3. Optionally, type `make check' to run any self-tests that come with
the package.
4. Type `make install' to install the programs and any data files and
documentation.
5. You can remove the program binaries and object files from the
source code directory by typing `make clean'. To also remove the
files that `configure' created (so you can compile the package for
a different kind of computer), type `make distclean'. There is
also a `make maintainer-clean' target, but that is intended mainly
for the package's developers. If you use it, you may have to get
all sorts of other programs in order to regenerate files that came
with the distribution.
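Steps 1 to 4 above can be chained so that a failure stops the
sequence. The sketch below is one way to do that; run and
build_package are made-up helper names, and DRY_RUN is an assumed
convention of this sketch, not something `configure' or `make' knows
about.

```shell
#!/bin/sh
# Sketch only: run and build_package are hypothetical helper names.

# Run a command, or just print it when DRY_RUN is set to a non-empty value.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Steps 1-4 above; extra arguments are passed through to configure.
# The chain stops at the first step that fails.
build_package() {
    run ./configure "$@" &&
    run make &&
    run make check &&
    run make install
}
```

With DRY_RUN=1 set, build_package --prefix=/opt/gperftools only prints
the four commands, which is a cheap way to review them first.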
Compilers and Options
=====================
Some systems require unusual options for compilation or linking that
the `configure' script does not know about. Run `./configure --help'
for details on some of the pertinent environment variables.
You can give `configure' initial values for configuration parameters
by setting variables in the command line or in the environment. Here
is an example:
./configure CC=c89 CFLAGS=-O2 LIBS=-lposix
See the `Defining Variables' section below for more details.
Compiling For Multiple Architectures
====================================
You can compile the package for more than one kind of computer at the
same time, by placing the object files for each architecture in their
own directory. To do this, you must use a version of `make' that
supports the `VPATH' variable, such as GNU `make'. `cd' to the
directory where you want the object files and executables to go and run
the `configure' script. `configure' automatically checks for the
source code in the directory that `configure' is in and in `..'.
If you have to use a `make' that does not support the `VPATH'
variable, you have to compile the package for one architecture at a
time in the source code directory. After you have installed the
package for one architecture, use `make distclean' before reconfiguring
for another architecture.
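A separate build directory per architecture might be set up as in the
sketch below. configure_in_builddir is a made-up helper name; it relies
only on the behaviour described above, namely that `configure' finds the
source code in the directory it lives in.

```shell
#!/bin/sh
# Sketch only: configure_in_builddir is a hypothetical helper name.

# Usage: configure_in_builddir SRCDIR BUILDDIR [configure arguments...]
# Creates BUILDDIR if needed and runs SRCDIR/configure from inside it, so
# all generated files and object files stay out of the source tree.
configure_in_builddir() {
    if [ $# -lt 2 ]; then
        echo "usage: configure_in_builddir SRCDIR BUILDDIR [args]" >&2
        return 1
    fi
    src=$(cd "$1" && pwd) || return 1   # absolute path to the source tree
    build=$2
    shift 2
    mkdir -p "$build" || return 1
    # The subshell keeps the caller's working directory unchanged.
    ( cd "$build" && "$src/configure" "$@" )
}
```

For example, configure_in_builddir . build-debug CXXFLAGS=-g followed by
running `make' inside build-debug; the directory name is arbitrary.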
Installation Names
==================
By default, `make install' will install the package's files in
`/usr/local/bin', `/usr/local/man', etc. You can specify an
installation prefix other than `/usr/local' by giving `configure' the
option `--prefix=PATH'.
You can specify separate installation prefixes for
architecture-specific files and architecture-independent files. If you
give `configure' the option `--exec-prefix=PATH', the package will use
PATH as the prefix for installing programs and libraries.
Documentation and other data files will still use the regular prefix.
In addition, if you use an unusual directory layout you can give
options like `--bindir=PATH' to specify different values for particular
kinds of files. Run `configure --help' for a list of the directories
you can set and what kinds of files go in them.
If the package supports it, you can cause programs to be installed
with an extra prefix or suffix on their names by giving `configure' the
option `--program-prefix=PREFIX' or `--program-suffix=SUFFIX'.
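To illustrate how `--prefix' and `--exec-prefix' interact, the sketch
below prints where the main kinds of files would end up under the
default layout. install_layout is a made-up helper name, and the paths
in the usage note are examples only.

```shell
#!/bin/sh
# Sketch only: install_layout is a hypothetical helper name.

# Usage: install_layout [PREFIX [EXEC_PREFIX]]
# PREFIX defaults to /usr/local; EXEC_PREFIX defaults to PREFIX.
install_layout() {
    prefix=${1:-/usr/local}
    exec_prefix=${2:-$prefix}
    echo "bindir=$exec_prefix/bin"      # programs
    echo "libdir=$exec_prefix/lib"      # libraries
    echo "data and docs under $prefix"  # architecture-independent files
}
```

For example, install_layout /opt/gp /opt/gp/x86_64 shows programs and
libraries moving under /opt/gp/x86_64 while documentation stays under
/opt/gp.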
Optional Features
=================
Some packages pay attention to `--enable-FEATURE' options to
`configure', where FEATURE indicates an optional part of the package.
They may also pay attention to `--with-PACKAGE' options, where PACKAGE
is something like `gnu-as' or `x' (for the X Window System). The
`README' should mention any `--enable-' and `--with-' options that the
package recognizes.
For packages that use the X Window System, `configure' can usually
find the X include and library files automatically, but if it doesn't,
you can use the `configure' options `--x-includes=DIR' and
`--x-libraries=DIR' to specify their locations.
Specifying the System Type
==========================
There may be some features `configure' cannot figure out
automatically, but needs to determine by the type of machine the package
will run on. Usually, assuming the package is built to be run on the
_same_ architectures, `configure' can figure that out, but if it prints
a message saying it cannot guess the machine type, give it the
`--build=TYPE' option. TYPE can either be a short name for the system
type, such as `sun4', or a canonical name which has the form:
CPU-COMPANY-SYSTEM
where SYSTEM can have one of these forms:
OS or KERNEL-OS
See the file `config.sub' for the possible values of each field. If
`config.sub' isn't included in this package, then this package doesn't
need to know the machine type.
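The canonical form can be taken apart with plain shell parameter
expansion, as in the sketch below. split_system_type is a made-up
helper name; it handles only full CPU-COMPANY-SYSTEM names, not short
names such as `sun4', which `config.sub' would expand.

```shell
#!/bin/sh
# Sketch only: split_system_type is a hypothetical helper name.

# Split a canonical CPU-COMPANY-SYSTEM name into its fields. Everything
# after the second dash is SYSTEM, which covers both OS and KERNEL-OS.
split_system_type() {
    case $1 in
        *-*-*) ;;
        *) echo "not in CPU-COMPANY-SYSTEM form: $1" >&2; return 1 ;;
    esac
    cpu=${1%%-*}          # text before the first dash
    rest=${1#*-}          # text after the first dash
    company=${rest%%-*}
    system=${rest#*-}
    echo "CPU=$cpu COMPANY=$company SYSTEM=$system"
}

split_system_type x86_64-pc-linux-gnu
# prints: CPU=x86_64 COMPANY=pc SYSTEM=linux-gnu
```

Here SYSTEM is linux-gnu, an instance of the KERNEL-OS form.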
If you are _building_ compiler tools for cross-compiling, you should
use the `--target=TYPE' option to select the type of system they will
produce code for.
If you want to _use_ a cross compiler, that generates code for a
platform different from the build platform, you should specify the
"host" platform (i.e., that on which the generated programs will
eventually be run) with `--host=TYPE'.
Sharing Defaults
================
If you want to set default values for `configure' scripts to share,
you can create a site shell script called `config.site' that gives
default values for variables like `CC', `cache_file', and `prefix'.
`configure' looks for `PREFIX/share/config.site' if it exists, then
`PREFIX/etc/config.site' if it exists. Or, you can set the
`CONFIG_SITE' environment variable to the location of the site script.
A warning: not all `configure' scripts look for a site script.
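A site script is ordinary shell code that `configure' reads in. A
minimal sketch of what such a file might contain is shown below; every
value in it is an example, not a recommendation. It assumes the usual
Autoconf placeholders: `prefix' is NONE until a prefix is chosen, and
`cache_file' is `/dev/null' while caching is disabled.

```shell
# Hypothetical PREFIX/share/config.site; all values are examples only.

# Only supply a compiler when the user did not choose one.
if [ -z "${CC:-}" ]; then
    CC=gcc
fi

# Pick an installation prefix when no --prefix option was given.
if [ "${prefix:-NONE}" = NONE ]; then
    prefix=/opt/local
fi

# Turn on result caching when it has not been enabled explicitly.
if [ "${cache_file:-/dev/null}" = /dev/null ]; then
    cache_file=config.cache
fi
```

Guarding each assignment keeps values given by the user from being
overridden by the site defaults.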
Defining Variables
==================
Variables not defined in a site shell script can be set in the
environment passed to `configure'. However, some packages may run
configure again during the build, and the customized values of these
variables may be lost. In order to avoid this problem, you should set
them in the `configure' command line, using `VAR=value'. For example:
./configure CC=/usr/local2/bin/gcc
will cause the specified gcc to be used as the C compiler (unless it is
overridden in the site shell script).
`configure' Invocation
======================
`configure' recognizes the following options to control how it
operates.
`--help'
`-h'
Print a summary of the options to `configure', and exit.
`--version'
`-V'
Print the version of Autoconf used to generate the `configure'
script, and exit.
`--cache-file=FILE'
Enable the cache: use and save the results of the tests in FILE,
traditionally `config.cache'. FILE defaults to `/dev/null' to
disable caching.
`--config-cache'
`-C'
Alias for `--cache-file=config.cache'.
`--quiet'
`--silent'
`-q'
Do not print messages saying which checks are being made. To
suppress all normal output, redirect it to `/dev/null' (any error
messages will still be shown).
`--srcdir=DIR'
Look for the package's source code in directory DIR. Usually
`configure' can determine that directory automatically.
`configure' also accepts some other, not widely useful, options. Run
`configure --help' for more details.