mirror of
https://github.com/gperftools/gperftools
synced 2024-12-30 11:12:03 +00:00
== 11 Sep 2023
gperftools 2.13 is out!

This release includes a few minor fixes:

* Ivan Dlugos has fixed some issues with cmake and config.h defines.

* 32-bit builds no longer require 64-bit atomics (which we wrongly
  introduced in 2.11 and which broke builds on some 32-bit
  architectures).

* The generic_fp backtracing method now uses a robust address probing
  method. The previous approach had occasional false positives, which
  caused rare crashes.

* In some cases, MSVC generated TrivialOnce machine code that
  deadlocked programs on startup. The issue is now fixed.

== 24 Aug 2023
gperftools 2.12 is out!

Brett T. Warden contributed one significant fix. After a change in the
previous release, we installed broken pkg-config files. Brett noticed
and fixed that. Huge thanks!

== 14 Aug 2023
gperftools 2.11 is out!

A few minor fixes since the rc a couple of weeks ago, plus a couple of
notable contributions:

* Artem Polyakov has contributed auto-detection of several MPI systems
  w.r.t. filenames used by HEAPPROFILE and CPUPROFILE environment
  variables. Also, we now support HEAPPROFILE_USE_PID and
  CPUPROFILE_USE_PID environment variables that force profile
  filenames to have the pid appended, which will be useful for some
  programs that fork for parallelism. See
  https://github.com/gperftools/gperftools/pull/1263 for details.

* Ken Raffenetti has extended the MPI detection mentioned above with
  detection of the MPICH system.

Thanks a lot!

== 31 July 2023
gperftools 2.11rc is out!

The most notable change is that Linux/aarch64 and Linux/riscv are now
fully supported. That is, all unit tests pass on those architectures
(previously the heap leak checker was broken).

Also notable is that heap leak checker support is officially
deprecated as of this release. All bug fixes from now on are on a
best-effort basis. For clarity we also declare that it is only
expected to work (for some definition of work) on Linux/x86 (all
kinds), Linux/aarch64, Linux/arm, Linux/ppc (untested as of this
writing) and Linux/mips (untested as well). While some functionality
worked in the past on BSDs, it was never fully functional and never
will be. We strongly recommend that everyone switch to asan and
friends.

Among major internal changes, it is also worth mentioning that we have
now fully switched to C++11 std::atomic. All custom OS- and
arch-specific atomic bits have been removed at last.

Another notable change is that the mmap and sbrk hooks facility is now
a no-op. We keep the API and ABI for formal compatibility, but the
calls to add mmap/sbrk hooks do nothing and return an error (whenever
possible as part of the API). There seem to be no users of it anyway,
and the mmap replacement API that is part of that facility really
screwed up 64-bit offsets on (some/most) 32-bit systems. Internally,
for the heap profiler and heap checker, we have a new but non-public
API (see mmap_hook.h).

Most tests now pass on NetBSD x86-64 (I tested on version 9.2). The
only one that fails is the new stacktrace test for stacktraces from a
signal handler (so there could be some imperfections for cpu
profiles).

We don't warn people away from the libgcc stacktrace capturing method
anymore. In fact, users on the most recent glibc versions are advised
to use it (pass --enable-libgcc-unwinder-by-default). This is thanks
to the dl_find_object API offered by glibc, which allows this
implementation to be fully async-signal-safe. Modern Linux distros
should from now on build their gperftools package with this enabled
(other than those built on top of musl).

The generic_fp and generic_fp_unsafe stacktrace capturing methods have
been expanded to more architectures and even some basic non-Linux
support. We have completely removed the old x86-specific frame pointer
stacktrace implementation in favor of those two. The _unsafe one
should be roughly equivalent to the old x86 method, and the 'safe' one
is recommended as the new default for those who want FP-based
stacktracing. The safe implementation robustly checks memory before
accessing it, preventing unlikely, but not impossible, crashes when
frame pointers are bogus.

On platforms that support it, we now build gperftools with
"-fno-omit-frame-pointer -momit-leaf-frame-pointer". This makes
gperftools mostly frame-pointer-ful, but without a performance hit in
places that matter (this is how Google builds their binaries,
BTW). That should cover gcc (at least) on x86, aarch64 and riscv. The
intention of this change is to make distro-shipped libtcmalloc.so
compatible with frame-pointer stacktrace capturing (for those who
still do heap profiling, for example). Of course, passing
--enable-frame-pointers still gives you full frame pointers (i.e. even
for leaf functions).

There is now support for detecting the actual page size at
runtime. tcmalloc will now allocate memory in units of this page
size. It particularly helps on ARM systems with 64k pages to return
memory back to the kernel. But it is somewhat controversial, because
it effectively bumps tcmalloc's logical page size on those machines,
potentially increasing fragmentation. In any case, there is now a new
environment variable, TCMALLOC_OVERRIDE_PAGESIZE, allowing people to
override this check, i.e. to either reduce the effective page size
down to tcmalloc's logical page size or to increase it.

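To make the runtime page-size detection above more concrete, here is a
minimal C++ sketch. It is not gperftools code. It asks the kernel for
the page size via POSIX sysconf(_SC_PAGESIZE) and rounds a request up
to a whole number of pages. The helper names SystemPageSize and
RoundUpToPage and the 4096-byte fallback are assumptions made for this
example; handling of TCMALLOC_OVERRIDE_PAGESIZE is left out because
its exact semantics are not described here.

```cpp
#include <unistd.h>

#include <cassert>
#include <cstddef>

// Hypothetical helper: returns the kernel page size as reported by POSIX.
// The 4096 fallback is an assumption for this sketch only.
std::size_t SystemPageSize() {
  const long value = sysconf(_SC_PAGESIZE);
  return value > 0 ? static_cast<std::size_t>(value) : 4096;
}

// Hypothetical helper: rounds `bytes` up to a multiple of `page`.
// `page` must be a non-zero power of two, which holds for real page sizes.
std::size_t RoundUpToPage(std::size_t bytes, std::size_t page) {
  assert(page != 0 && (page & (page - 1)) == 0);
  return (bytes + page - 1) & ~(page - 1);
}
```

On a machine with 64k pages, RoundUpToPage(1, SystemPageSize()) yields
65536, which shows why allocating in units of the real page size can
increase fragmentation for small spans.
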
MallocExtension::MarkThreadTemporarilyIdle has been changed to be
identical to MarkThreadIdle. MarkThreadTemporarilyIdle is believed to
be unused anyway. See issue #880 for details.

There are a whole bunch of smaller fixes. Many of those smaller fixes
had no associated ticket, but some did. People are advised to see here
for the list of notable tickets closed in this release:
https://github.com/gperftools/gperftools/issues?q=label%3Afixed-in-2.11+

Some of those tickets are quite notable (fixes for rare deadlocks in
cpu profiler ProfilerStop or while capturing heap growth stacktraces
(aka growthz)).

Here is the list of notable contributions:

* Chris Cambly has contributed initial support for AIX

* Ali Saidi has contributed a SpinlockPause implementation for aarch64

* Henrik Reinstädtler has contributed a fix for cpuprofiler on aarch64
  OSX

* Gabriel Marin has backported Chromium's commit for always sanity
  checking large frees

* User zhangyiru has contributed a fix to report the number of leaked
  bytes as size_t instead of (usually 32-bit) int.

* Sergey Fedorov has contributed a fix for building on older
  ppc-based OSX-es

* User tigeran has removed an unused using declaration

Huge thanks to all contributors.

== 30 May 2022 ==
gperftools 2.10 is out!

Here are notable changes:

* Matt T. Proud contributed a documentation fix to call the Go
  programming language by its true name instead of golang.

* Robert Scott contributed a debugallocator feature to use readable
  (PROT_READ) fence pages. This is activated by the
  TCMALLOC_PAGE_FENCE_READABLE environment variable.

* User stdpain contributed a fix for cmake detection of libunwind.

* Natale Patriciello contributed a fix for OSX Monterey support.

* Volodymyr Nikolaichuk contributed support for returning memory back
  to the OS by using mmap with MAP_FIXED and PROT_NONE. It is off by
  default and enabled by a preprocessor define:
  FREE_MMAP_PROT_NONE. This should help OSes that don't support
  Linux-style madvise MADV_DONTNEED or BSD-style MADV_FREE.

* Jingyun Hua has contributed basic support for LoongArch.

* Github issue #1338 of failing to build on some recent musl versions
  has been fixed.

* Github issue #1321 of failing to ship cmake bits with the .tar.gz
  archive has been fixed.

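The MAP_FIXED plus PROT_NONE technique mentioned in the list above can
be sketched with plain POSIX calls. This is an illustration of the
general idea only, not the gperftools implementation; the function
names MapAnonymous, ReleaseToOs and Recommit are invented for the
example. Mapping fresh anonymous memory over an existing range with
MAP_FIXED discards the old pages, and PROT_NONE keeps the address
range reserved but inaccessible until it is needed again.

```cpp
#include <sys/mman.h>
#include <unistd.h>

#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: maps `len` bytes of private, zero-filled memory.
// Returns nullptr on failure.
void* MapAnonymous(std::size_t len) {
  assert(len > 0);
  void* p = mmap(nullptr, len, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  return p == MAP_FAILED ? nullptr : p;
}

// Hypothetical helper: replaces the range with a fresh inaccessible mapping.
// The old page contents are dropped; the address range stays reserved.
bool ReleaseToOs(void* p, std::size_t len) {
  void* r = mmap(p, len, PROT_NONE,
                 MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
  return r == p;
}

// Hypothetical helper: makes a released range usable again.
// Pages come back zero-filled on first touch.
bool Recommit(void* p, std::size_t len) {
  return mprotect(p, len, PROT_READ | PROT_WRITE) == 0;
}
```

A caller would use ReleaseToOs where it would otherwise call madvise,
and Recommit before handing the range out again.
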
== 2 March 2021 ==
gperftools 2.9.1 is out!

Minor fixes landed since the previous release:

* OSX builds now prefer backtrace() and have somewhat working heap
  sampling.

* An incorrect assertion failure was fixed that crashed tcmalloc if
  assertions were on and sized delete was used. More details in github
  issue #1254.

== 21 February 2021 ==
gperftools 2.9 is out!

A few more changes landed compared to the rc:

* Venkatesh Srinivas has contributed thread-safety annotations
  support.

* a couple more unit test bugs that caused tcmalloc_unittest to fail on
  recent clang have been fixed.

* usage of the unsupportable linux_syscall_support.h has been removed
  from a few places. Building with --disable-heap-checker now
  completely avoids it. Expect complete death of this header in the
  next major release.

== 14 February 2021 ==
gperftools 2.9rc is out!

Here are notable changes:

* Jarno Rajahalme has contributed a fix for a crashing bug in syscalls
  support for aarch64.

* User SSE4 has contributed basic support for the Elbrus 2000
  architecture (!)

* Venkatesh Srinivas has contributed cleanup to atomic ops.

* Đoàn Trần Công Danh has fixed cpu profiler compilation on musl.

* there is now better backtracing support for aarch64 and
  riscv. x86-64 with frame pointers now also defaults to this new
  "generic" frame pointer backtracer.

* emergency malloc is now enabled by default. This fixes a hang on
  musl when the libgcc backtracer is enabled.

* a bunch of legacy config tests have been removed

== 20 December 2020 ==
gperftools 2.8.1 is out!

Here are notable changes:

* the previous release contained a change to release memory without
  the page heap lock, but this change had at least one bug that caused
  crashes and corruption when running under aggressive decommit mode
  (this is not the default). While we check for other bugs, this
  feature was reverted. See github issue #1204 and issue #1227.

* the depth of stack traces captured by gperftools is now up to 254
  levels. Thanks to Kerrick Staley for this small but useful tweak.

* Levon Ter-Grigoryan has contributed a small fix for a compiler
  warning.

* Grant Henke has contributed updated detection of the program counter
  register for OS X on arm64.

* Tim Gates has contributed a small typo fix.

* Steve Langasek has contributed basic build fixes for riscv64 (!).

* Isaac Hier and okhowang have contributed a preliminary port of the
  build infrastructure to cmake. This works, but it is very
  preliminary. The autotools-based build is the only officially
  supported build for now.

== 6 July 2020 ==
gperftools 2.8 is out!

Here are notable changes:

* ProfilerGetStackTrace is now an officially supported API for
  libprofiler. Contributed by Kirill Müller.

* Build failures on mingw were fixed. This fixed issue #1108.

* Build failure of page_heap_test on MSVC was fixed.

* Ryan Macnak contributed a fix for compiling linux syscall support on
  i386 and recent GCCs. This fixed issue #1076.

* test failures caused by new gcc 10 optimizations were fixed. The
  same change also fixed tests on clang.

== 8 Mar 2020 ==
gperftools 2.8rc is out!

Here are notable changes:

* building the code now requires c++11 or later. The bundled MSVC
  project was converted to Visual Studio 2015.

* User obones contributed a fix for windows x64 TLS callbacks. This
  fixed a leak of thread caches on thread exits in 64-bit windows.

* releasing memory back to the kernel is now done with the page heap
  lock dropped.

* HolyWu contributed a fix for correct malloc patching on debug builds
  on windows. This configuration previously crashed.

* Romain Geissler contributed a fix for tls access during early tls
  initialization on dlopen.

* large allocation reports are now silenced by default, since not all
  programs want their stderr polluted by those messages. Contributed
  by Junhao Li.

* HolyWu contributed improvements to MSVC project files. Notably,
  there is now a project for the "overriding" version of tcmalloc.

* MS-specific _recalloc is now correctly zeroing only the malloced
  part. This fix was contributed by HolyWu.

* Brian Silverman contributed a correctness fix to sampler_test.

* Gabriel Marin ported a few fixes from chromium's fork. As part of
  those fixes, we reduced the number of static initializers (forbidden
  in chromium). Also, we now make syscalls via the syscall function
  instead of reimplementing a direct way to make syscalls on each
  platform.

* Brian Silverman fixed flakiness in the page heap test.

* There is now a configure flag to skip installing perl pprof, since
  the external golang pprof is much superior. --disable-deprecated-pprof
  is the flag.

* Fabrice Fontaine contributed fixes to drop use of the nonstandard
  __off64_t type.

* Fabrice Fontaine contributed a build fix to check for presence of
  the nonstandard __sbrk function. It is only used by mmap hooks code
  and (rightfully) not available on musl.

* Fabrice Fontaine contributed a build fix around the mmap64 macro and
  function conflict in some cases.

* there is now a configure-time option to enable aggressive decommit
  by default. Contributed by Laurent
  Stacul. --enable-aggressive-decommit-by-default is the flag.

* Tulio Magno Quites Machado Filho contributed build fixes for ppc
  around ucontext access.

* User pkubaj contributed a couple of build fixes for FreeBSD/ppc.

* configure now always assumes we have mmap. This fixes configure
  failures on some linux guests inside virtualbox. This fixed issue
  #1008.

* User shipujin contributed syscall support fixes for mips64 (big and
  little endian).

* Henrik Edin contributed configurable support for a wide range of
  malloc page sizes. 4K, 8K, 16K, 32K, 64K, 128K and 256K are now
  supported via the existing --with-tcmalloc-pagesize flag to
  configure.

* Jon Kohler added overhead fields to the per-size-class textual
  stats. These stats are available via
  MallocExtension::instance()->GetStats().

* tcmalloc can now avoid fallback from memfs to the default sys
  allocator. TCMALLOC_MEMFS_DISABLE_FALLBACK switches this on. This
  was contributed by Jon Kohler.

* Ilya Leoshkevich fixed mmap syscall support on s390.

* Todd Lipcon contributed a small build warning fix.

* User prehistoricpenguin contributed misc source file mode fixes (we
  still had a few c++ files marked executable).

* User invalid_ms_user contributed a fix for a typo.

* Jakub Wilk contributed typo fixes.

== 29 Apr 2018 ==
gperftools 2.7 is out!

A few people contributed minor, but important fixes since the rc.

Changes:

* a bug in span stats printing introduced by the new scalable page
  heap change was fixed.

* Christoph Müllner has contributed a couple of warning fixes and
  initial support for the aarch64_ilp32 architecture.

* Ben Dang contributed a documentation fix for heap checker.

* Fabrice Fontaine contributed fixes for linking benchmarks with
  --disable-static.

* Holy Wu has added sized deallocation unit tests.

* Holy Wu has enabled support of sized deallocation (c++14) on recent
  MSVC.

* Holy Wu has fixed the MSVC build in WIN32_OVERRIDE_ALLOCATORS
  mode. This closed issue #716.

* Holy Wu has contributed cleanup of config.h used on windows.

* Mao Huang has contributed a couple of simple tcmalloc changes from
  the chromium code base, making our tcmalloc forks a tiny bit closer.

* issue #946 that caused compilation failures on some Linux clang
  installations has been fixed. Much thanks to github user htuch for
  helping to diagnose the issue and proposing a fix.

* Tulio Magno Quites Machado Filho has contributed a build-time fix
  for PPC (for a problem introduced in one of the commits since RC).

== 18 Mar 2018 ==
gperftools 2.7rc is out!

Changes:

* The most notable change in this release is that very large
  allocations (>1MiB) are now handled by an O(log n)
  implementation. This is contributed by Todd Lipcon based on earlier
  work by Aliaksei Kandratsenka and James Golick. Special thanks to
  Alexey Serbin for contributing an OSX fix for that commit.

* detection of sized deallocation support is improved, which should
  fix another set of issues building on OSX. Much thanks to Alexey
  Serbin for reporting the issue, suggesting a fix and verifying it.

* Todd Lipcon made a change to extend page heap freelists to 1 MiB
  (up from 1MiB - 8KiB). This may help a little for some workloads.

* Ishan Arora contributed a typo fix to docs

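To give a feel for what an O(log n) lookup for large allocations can
look like, here is a small C++ sketch. It is an assumption-laden
illustration, not the code that landed in gperftools: free spans are
kept in a std::set ordered by length and then by address, and
lower_bound finds the smallest span that is large enough. The Span
struct, the comparator and the LargeSpanSet class are all hypothetical
names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>

// Hypothetical record for a run of free pages.
struct Span {
  std::uintptr_t start_page;
  std::size_t num_pages;
};

// Orders spans by length first, then by address, so that lower_bound on a
// length finds the smallest sufficient span with the lowest address.
struct BySizeThenAddress {
  bool operator()(const Span& a, const Span& b) const {
    if (a.num_pages != b.num_pages) return a.num_pages < b.num_pages;
    return a.start_page < b.start_page;
  }
};

class LargeSpanSet {
 public:
  void Insert(const Span& span) { spans_.insert(span); }

  // Removes the best-fitting span with at least `num_pages` pages and stores
  // it in `*out`. Returns false if no span is large enough. O(log n).
  bool TakeBestFit(std::size_t num_pages, Span* out) {
    assert(out != nullptr);
    auto it = spans_.lower_bound(Span{0, num_pages});
    if (it == spans_.end()) return false;
    *out = *it;
    spans_.erase(it);
    return true;
  }

  std::size_t size() const { return spans_.size(); }

 private:
  std::set<Span, BySizeThenAddress> spans_;
};
```

Compared with scanning a linked list of large spans, the ordered set
keeps both insertion and best-fit lookup logarithmic in the number of
free spans.
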
== 9 Dec 2017 ==
gperftools 2.6.3 is out!

Just two fixes were made in this release:

* Stephan Zuercher has contributed a build fix for some recent XCode
  versions. See issue #942 for more details.

* an assertion failure on some windows builds introduced by 2.6.2 was
  fixed. Thanks to github user nkeemik for reporting it and testing
  the fix. See issue #944 for more details.

== 30 Nov 2017 ==
gperftools 2.6.2 is out!

The most notable change is recently added support for C++17
over-aligned allocation operators contributed by Andrey Semashev. I've
extended his implementation to have roughly the same performance as
malloc/new. This release also has native support for C11
aligned_alloc.

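For readers unfamiliar with the over-aligned allocation operators
mentioned above, the following C++17 sketch shows what they guarantee
from the caller's side. It uses only standard library facilities and
does not depend on gperftools; the type CacheLinePadded and the helper
functions are names invented for this example. When a type's alignment
exceeds the default new alignment, a plain new-expression calls
operator new(std::size_t, std::align_val_t), which is the entry point
an allocator has to provide.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <memory>

// A 64-byte aligned type. With C++17, `new CacheLinePadded` goes through the
// aligned form of operator new.
struct alignas(64) CacheLinePadded {
  char bytes[64];
};

// Hypothetical helper: true if `p` is a multiple of `alignment`.
bool IsAligned(const void* p, std::size_t alignment) {
  assert(alignment != 0);
  return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Allocates one over-aligned object with a new-expression and checks it.
bool OverAlignedNewIsAligned() {
  auto object = std::make_unique<CacheLinePadded>();
  return IsAligned(object.get(), alignof(CacheLinePadded));
}

// Does the same through C11/C++17 aligned_alloc. The size is kept a multiple
// of the alignment, as the original C11 wording requires.
bool AlignedAllocIsAligned() {
  void* p = std::aligned_alloc(64, 256);
  if (p == nullptr) return false;
  const bool ok = IsAligned(p, 64);
  std::free(p);
  return ok;
}
```

Both helpers should return true with any conforming allocator,
whether that is the system malloc or a preloaded replacement.
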
The rest is mostly bug fixes:

* Jianbo Yang has contributed a fix for a potentially severe data race
  introduced by malloc fast-path work in gperftools 2.6. This race
  could cause occasional violation of the total thread cache size
  constraint. See issue #929 for more details.

* Correct behavior in out-of-memory conditions in fast-path cases was
  restored. This was another bug introduced by the fast-path
  optimization in gperftools 2.6, which caused operator new to
  silently return NULL instead of doing correct C++ OOM handling
  (calling new_handler and throwing bad_alloc).

* Khem Raj has contributed a couple of build fixes for newer glibcs
  (ucontext_t vs struct ucontext and loff_t definition)

* Piotr Sikora has contributed a build fix for OSX (not building the
  unwind benchmark). This was issue #910 (thanks to Yuriy Solovyov for
  reporting it).

* Dorin Lazăr has contributed a fix for a compiler warning

* issue #912 (occasional deadlocking calling getenv too early on
  windows) was fixed. Thanks to github user shangcangriluo for
  reporting it.

* A couple of earlier lsan-related commits still causing occasional
  issues linking on OSX have been reverted. See issue #901.

* Volodimir Krylov has contributed GetProgramInvocationName for FreeBSD

* changsu lee has contributed a couple of minor correctness fixes
  (missing va_end() and missing free() call in the rarely executed
  Symbolize path)

* Andrew C. Morrow has contributed some more page heap stats. See issue
  #935.

* some cases of build-time warnings from various gcc/clang versions
  about throw() declarations have been fixed.

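The out-of-memory behavior referred to in the list above (calling the
new_handler and throwing bad_alloc) follows a loop that the C++
standard prescribes for operator new. The sketch below shows that
loop in isolation. It is a generic illustration and not tcmalloc's
fast path; AllocateLikeOperatorNew, RawAlloc, AlwaysFail and
MallocAlloc are hypothetical names, and the raw allocator is passed in
as a parameter only so the failure path can be exercised.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical low-level allocator signature: returns nullptr on failure.
using RawAlloc = void* (*)(std::size_t);

// Follows the protocol required of operator new: on failure, call the
// installed new_handler and retry; with no handler, throw std::bad_alloc.
// It never returns nullptr.
void* AllocateLikeOperatorNew(std::size_t size, RawAlloc raw_alloc) {
  assert(raw_alloc != nullptr);
  for (;;) {
    if (void* p = raw_alloc(size)) return p;
    const std::new_handler handler = std::get_new_handler();
    if (handler == nullptr) throw std::bad_alloc();
    handler();  // may free memory, install another handler, or throw
  }
}

// Test double that simulates memory exhaustion.
void* AlwaysFail(std::size_t) { return nullptr; }

// Thin wrapper so malloc matches the RawAlloc signature exactly.
void* MallocAlloc(std::size_t size) { return std::malloc(size); }
```

The bug described above amounted to skipping this loop on the fast
path, so callers of operator new could observe a null pointer instead
of an exception.
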
== 9 July 2017 ==

gperftools 2.6.1 is out! This is mostly a bug-fix release.

* issue #901: a build issue on OSX introduced in a last-minute commit
  in 2.6 was fixed (contributed by Francis Ricci)

* tcmalloc_minimal now works on the 32-bit ABI of mips64. This is
  issue #845. Much thanks to Adhemerval Zanella and github user mtone.

* Romain Geissler contributed a build fix for -std=c++17. This is pull
  request #897.

* As part of fixing issue #904, the tcmalloc atfork handler is now
  installed early. This should fix a slight chance of hitting
  deadlocks at fork in some cases.

== 4 July 2017 ==

gperftools 2.6 is out!

* Kim Gräsman contributed a documentation update for the
  HEAPPROFILESIGNAL environment variable

* KernelMaker contributed a fix for population of the min_object_size
  field returned by MallocExtension::GetFreeListSizes

* commit 8c3dc52fcfe0 "issue-654: [pprof] handle split text segments"
  was reverted. Some OSX users reported issues with this commit. Given
  that our pprof implementation is strongly deprecated, it is best to
  drop recently introduced features rather than break it badly.

* Francis Ricci contributed an improvement for interaction with leak
  sanitizer.

== 22 May 2017 ==

gperftools 2.6rc4 is out!

Dynamic sized delete is disabled by default again. There is no hope of
it working with eager dynamic symbol resolution (the -z now linker
flag). More details in
https://bugzilla.redhat.com/show_bug.cgi?id=1452813

== 21 May 2017 ==

gperftools 2.6rc3 is out!

gperftools compilation on older systems (e.g. rhel 5) was fixed. This
was originally reported in github issue #888.

== 14 May 2017 ==

gperftools 2.6rc2 is out!

Just 2 small fixes on top of 2.6rc. In particular, Rajalakshmi
Srinivasaraghavan contributed a build fix for ppc32.

== 14 May 2017 ==

gperftools 2.6rc is out!

Highlights of this release are performance work on the malloc
fast-path, support for more modern visual studio runtimes, and
deprecation of the bundled pprof. Other significant
performance-affecting changes are reverting the central free list
transfer batch size back to 32 and disabling aggressive decommit mode
by default.

Note, while we still ship the perl implementation of pprof, everyone
is strongly advised to use the golang reimplementation of pprof from
https://github.com/google/pprof.

Here are notable changes in more detail (and see ChangeLog for full
details):

* a bunch of performance tweaks to the tcmalloc fast-path were
  merged. This speeds up the critical path of tcmalloc by a few tens
  of %. Well tuned and allocation-heavy programs should see a
  substantial performance boost (should apply to all modern elf
  platforms). This is based on Google-internal tcmalloc changes for
  the fast-path (with the obvious exception of lacking per-cpu mode,
  of course). Original changes were made by Aliaksei
  Kandratsenka. Andrew Hunter, Dmitry Vyukov and Sanjay Ghemawat
  contributed with reviews and discussions.

* Architectures with a 48-bit address space (x86-64 and aarch64) now
  use a faster 2-level page map. This was ported from a
  Google-internal change by Sanjay Ghemawat.

* The default value of TCMALLOC_TRANSFER_NUM_OBJ was returned back to
  32. Larger values have been found to hurt certain programs (but help
  some other benchmarks). The value can still be tweaked at run time
  via the environment variable.

* tcmalloc aggressive decommit mode is now disabled by default
  again. It was found to degrade performance of certain tensorflow
  benchmarks. Users who prefer a smaller heap over a small performance
  win can still set the environment variable
  TCMALLOC_AGGRESSIVE_DECOMMIT=t.

* runtime switchable sized delete support has been fixed and
  re-enabled (on GNU/Linux). Programs that use C++14 or later and use
  sized delete can again be sped up by setting the environment
  variable TCMALLOC_ENABLE_SIZED_DELETE=t. Support for enabling sized
  deallocation at compile time is still present, of course.

* tcmalloc now explicitly avoids use of MADV_FREE on Linux, unless
  TCMALLOC_USE_MADV_FREE is defined at compile time. This is because
  the performance impact of MADV_FREE is not well known. Original
  issue #780 raised by Mathias Stearn.

* issue #786 with occasional deadlocks in stack trace capturing via
  libunwind was fixed. It was originally reported as a Ceph issue:
  http://tracker.ceph.com/issues/13522

* ChangeLog is now automatically generated from git log. The old
  ChangeLog is now ChangeLog.old.

* tcmalloc now provides an implementation of nallocx. The function was
  originally introduced by jemalloc and can be used to return the real
  allocation size given an allocation request size. This is ported
  from a Google-internal tcmalloc change contributed by Dmitry Vyukov.

* issue #843 which made tcmalloc crash when used with the erlang
  runtime was fixed.

* issue #839 which caused tcmalloc's aggressive decommit mode to
  degrade performance in some corner cases was fixed.

* Bryan Chan contributed support for 31-bit s390.

* Brian Silverman contributed a compilation fix for 32-bit ARMs

* Issue #817 that was causing tcmalloc to fail on windows 10 and
  later, as well as on recent msvc, was fixed. We now patch _free_base
  as well.

* a bunch of minor documentation/typo fixes by: Mike Gaffney
  <mike@uberu.com>, iivlev <iivlev@productengine.com>, savefromgoogle
  <savefromgoogle@users.noreply.github.com>, John McDole
  <jtmcdole@gmail.com>, zmertens <zmertens@asu.edu>, Kirill Müller
  <krlmlr@mailbox.org>, Eugene <n.eugene536@gmail.com>, Ola Olsson
  <ola1olsson@gmail.com>, Mostyn Bramley-Moore <mostynb@opera.com>

* Tulio Magno Quites Machado Filho has contributed removal of
  deprecated glibc malloc hooks.

* Issue #827 that caused intercepting malloc on osx 10.12 to fail was
  fixed by copying a fix made by Mike Hommey to jemalloc. Much thanks
  to Koichi Shiraishi and David Ribeiro Alves for reporting it and
  testing the fix.

* Aman Gupta and Kenton Varda contributed minor fixes to pprof (but
  note again that pprof is deprecated)

* Ryan Macnak contributed a compilation fix for aarch64

* Francis Ricci has fixed an unaligned memory access in the debug
  allocator

* TCMALLOC_PAGE_FENCE_NEVER_RECLAIM now actually works thanks to a
  contribution by Andrew Morrow.

== 12 Mar 2016 ==

gperftools 2.5 is out!

Just a single bugfix was merged after rc2, which was the fix for
issue #777.

== 5 Mar 2016 ==

gperftools 2.5rc2 is out!

The new release contains just a few commits on top of the first
release candidate. One of them is a build fix for Visual
Studio. Another significant change is that dynamic sized delete is now
disabled by default. It turned out that IFUNC relocations do not
support our advanced use case on all platforms and in all cases.

== 21 Feb 2016 ==

gperftools 2.5rc is out!

Here are major changes since 2.4:

* we've moved to github!

* Bryan Chan has contributed s390x support

* stacktrace capturing via libgcc's _Unwind_Backtrace was implemented
  (for architectures with missing or broken libunwind).

* "emergency malloc" was implemented. It unbreaks recursive calls to
  malloc/free from stacktrace capturing functions (such as glibc's
  backtrace() or libunwind on arm). It is enabled by the
  --enable-emergency-malloc configure flag or by default on arm when
  --enable-stacktrace-via-backtrace is given. It is another fix for a
  number of common issues people had on platforms with missing or
  broken libunwind.

* C++14 sized-deallocation is now supported (on gcc 5 and recent
  clangs). It is off by default and can be enabled at configure time
  via --enable-sized-delete. On GNU/Linux it can also be enabled at
  run-time by either the TCMALLOC_ENABLE_SIZED_DELETE environment
  variable or by defining a tcmalloc_sized_delete_enabled function
  which should return 1 to enable it.

* we've lowered the default value of transfer batch size to 512. The
  previous value (bumped up in 2.1) was too high and caused a
  performance regression for some users. 512 should still give us a
  performance boost for workloads that need a higher transfer batch
  size while not penalizing other workloads too much.

* Brian Silverman's patch finally stopped arming the profiling timer
  unless profiling is started.

* Andrew Morrow has contributed support for obtaining the cache size
  of the current thread and softer idling (for use in MongoDB).

* we've implemented a few minor performance improvements, particularly
  on the malloc fast-path.

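The sized-deallocation item above can be illustrated with a short C++
sketch. Two things in it are assumptions and should be read as such:
the exact signature of tcmalloc_sized_delete_enabled is not given in
these notes (an extern "C" function taking no arguments and returning
int is assumed here), and the Tracked class is a made-up,
class-scoped stand-in that only shows what "sized delete" means,
namely that the deallocation function receives the object size from
the caller instead of having to look it up.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// ASSUMED signature: the notes only say this function "should return 1" to
// enable sized delete at run time on GNU/Linux.
extern "C" int tcmalloc_sized_delete_enabled(void) { return 1; }

// Hypothetical type with a class-level sized operator delete. The compiler
// passes sizeof(Tracked) as `size` when a Tracked object is deleted.
struct Tracked {
  char payload[48];
  static std::size_t last_freed_size;

  static void* operator new(std::size_t size) { return ::operator new(size); }

  static void operator delete(void* p, std::size_t size) noexcept {
    assert(size == sizeof(Tracked));
    last_freed_size = size;  // an allocator could pick the size class here
    ::operator delete(p);
  }
};

std::size_t Tracked::last_freed_size = 0;

// Allocates and frees one object, returning the size seen by operator delete.
std::size_t DeleteOneTracked() {
  delete new Tracked;
  return Tracked::last_freed_size;
}
```

With the global C++14 form, operator delete(void*, std::size_t), an
allocator gets the same information for every type, which is what
lets it skip the size lookup when freeing.
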
A number of smaller fixes were made. Many of them were contributed:

* an issue that caused spurious profiler_unittest.sh failures was
  fixed.

* Jonathan Lambrechts contributed improved callgrind format support to
  pprof.

* Matt Cross contributed better support for debug symbols in separate
  files to pprof.

* Matt Cross contributed support for printing collapsed stack frames
  from pprof, aimed at producing flame graphs.

* Angus Gratton has contributed a documentation fix mentioning that on
  windows only tcmalloc_minimal is supported.

* Anton Samokhvalov has made tcmalloc use mi_force_{un,}lock on OSX
  instead of pthread_atfork, which apparently fixes forking issues
  tcmalloc had on OSX.

* Milton Chiang has contributed support for building 32-bit gperftools
  on arm8.

* Patrick LoPresti has contributed support for specifying an
  alternative profiling signal via the CPUPROFILE_TIMER_SIGNAL
  environment variable.

* Paolo Bonzini has contributed support for configuring the filename
  for sending malloc tracing output via the TCMALLOC_TRACE_FILE
  environment variable.

* user spotrh has enabled use of futex on arm.

* user mitchblank has contributed better declarations for arg-less
  profiler functions.

* Tom Conerly contributed proper freeing of memory allocated in
  HeapProfileTable::FillOrderedProfile on error paths.

* user fdeweerdt has contributed a curl arguments handling fix in pprof

* Fredrik Mellbin fixed tcmalloc's idea of mangled new and delete
  symbols on windows x64

* Dair Grant has contributed cacheline alignment for ThreadCache
  objects

* Fredrik Mellbin has contributed an updated windows/config.h for
  Visual Studio 2015 and other windows fixes.

* we're not linking libpthread to libtcmalloc_minimal anymore. Instead
  libtcmalloc_minimal links to pthread symbols weakly. As a result
  single-threaded programs remain single-threaded when linking to or
  preloading libtcmalloc_minimal.so.

* Boris Sazonov has contributed a mips compilation fix and a fix for
  printf misuse in pprof.

* Adhemerval Zanella has contributed alignment fixes for statically
  allocated variables.

* Jens Rosenboom has contributed fixes for heap-profiler_unittest.sh

* gshirishfree has contributed a better description for the GetStats
  method.

* cyshi has contributed a spinlock pause fix.

* Chris Mayo has contributed --docdir argument support for configure.

* Duncan Sands has contributed a fix for function aliases.

* Simon Que contributed a better include for malloc_hook_c.h

* user wmamrak contributed a struct timespec fix for Visual Studio
  2015.

* user ssubotin contributed a typo fix in PrintAvailability code.

== 10 Jan 2015 ==

gperftools 2.4 is out! The code is exactly the same as 2.4rc.

== 28 Dec 2014 ==

gperftools 2.4rc is out!

Here are changes since 2.3:

* enabled the aggressive decommit option by default. It was found to
  significantly improve memory fragmentation with negligible impact on
  performance. (Thanks to investigation work performed by Adhemerval
  Zanella)

* added ./configure flags for tcmalloc pagesize and tcmalloc
  allocation alignment. Larger page sizes have been reported to
  improve performance occasionally. (Patch by Raphael Moreira Zinsly)

* sped up the hot-path of malloc/free, by about 5% on the static
  library and about 10% on the shared library, mainly due to more
  efficient checking of malloc hooks.

* improved stacktrace capturing in cpu profiler (due to an issue found
  by Arun Sharma). As part of that issue pprof's handling of cpu
  profiles was also improved.

== 7 Dec 2014 ==
|
|
|
|
gperftools 2.3 is out!
|
|
|
|
Here are changes since 2.3rc:
|
|
|
|
* (issue 658) correctly close socketpair fds on failure (patch by glider)
|
|
|
|
* libunwind integration can be disabled at configure time (patch by
|
|
Raphael Moreira Zinsly)
|
|
|
|
* libunwind integration is disabled by default for ppc64 (patch by
|
|
Raphael Moreira Zinsly)
|
|
|
|
* libunwind integration is force-disabled for OSX. It was not used by
|
|
default anyways. Fixes compilation issue I saw.
|
|
|
|
== 2 Nov 2014 ==
|
|
|
|
gperftools 2.3rc is out!
|
|
|
|
Most small improvements in this release were made to pprof tool.
|
|
|
|
New experimental Linux-only (for now) cpu profiling mode is a notable
|
|
big improvement.
|
|
|
|
Here are notable changes since 2.2.1:
|
|
|
|
* (issue-631) fixed debugallocation miscompilation on mmap-less
|
|
platforms (courtesy of user iamxujian)
|
|
|
|
* (issue-630) reference to wrong PROFILE (vs. correct CPUPROFILE)
|
|
environment variable was fixed (courtesy of WenSheng He)
|
|
|
|
* pprof now has option to display stack traces in output for heap
|
|
checker (courtesy of Michael Pasieka)
|
|
|
|
* (issue-636) pprof web command now works on mingw
|
|
|
|
* (issue-635) pprof now handles library paths that contain spaces
|
|
(courtesy of user mich...@sebesbefut.com)
|
|
|
|
* (issue-637) pprof now has an option to not strip template arguments
|
|
(patch by jiakai)
|
|
|
|
* (issue-644) possible out-of-bounds access in GetenvBeforeMain was
|
|
fixed (thanks to user abyss.7)
|
|
|
|
* (issue-641) pprof now has an option --show_addresses (thanks to user
|
|
yurivict). New option prints instruction address in addition to
|
|
function name in stack traces
|
|
|
|
* (issue-646) pprof now works around some issues of addr2line
|
|
reportedly when DWARF v4 format is used (patch by Adam McNeeney)
|
|
|
|
* (issue-645) heap profiler exit message now includes remaining memory
|
|
allocated info (patch by user yurivict)
|
|
|
|
* pprof code that finds location of /proc/<pid>/maps in cpu profile
|
|
files is now fixed (patch by Ricardo M. Correia)
|
|
|
|
* (issue-654) pprof now handles "split text segments" feature of
|
|
Chromium for Android. (patch by simonb)
|
|
|
|
* (issue-655) potential deadlock on windows caused by early call to
|
|
getenv in malloc initialization code was fixed (bug reported and fix
|
|
proposed by user zndmitry)
|
|
|
|
* incorrect detection of arm 6zk instruction set support
|
|
(-mcpu=arm1176jzf-s) was fixed. (Reported by pedronavf on old
|
|
issue-493)
|
|
|
|
* new cpu profiling mode on Linux is now implemented. It sets up
|
|
separate profiling timers for separate threads. Which improves
|
|
accuracy of profiling on Linux a lot. It is off by default. And is
|
|
enabled if both librt.so is loaded and CPUPROFILE_PER_THREAD_TIMERS
|
|
environment variable is set. But note that all threads need to be
|
|
registered via ProfilerRegisterThread.
|
|
|
|
== 21 Jun 2014 ==
|
|
|
|
gperftools 2.2.1 is out!
|
|
|
|
Here's list of fixes:
|
|
|
|
* issue-626 was closed. Which fixes initialization of statically linked
|
|
tcmalloc.
|
|
|
|
* issue 628 was closed. It adds missing header file into source
|
|
tarball. This fixes compilation on PPC Linux.
|
|
|
|
== 3 May 2014 ==
|
|
|
|
gperftools 2.2 is out!
|
|
|
|
Here are notable changes since 2.2rc:
|
|
|
|
* issue 620 (crash on windows when c runtime dll is reloaded) was
|
|
fixed
|
|
|
|
== 19 Apr 2014 ==
|
|
|
|
gperftools 2.2rc is out!
|
|
|
|
Here are notable changes since 2.1:
|
|
|
|
* a number of fixes for a number of compilers and platforms. Notably
|
|
Visual Studio 2013, recent mingw with c++ threads and some OSX
|
|
fixes.
|
|
|
|
* we now have mips and mips64 support! (courtesy of Jovan Zelincevic,
|
|
Jean Lee, user xiaoyur347 and others)
|
|
|
|
* we now have aarch64 (aka arm64) support! (contributed by Riku
|
|
Voipio)
|
|
|
|
* there's now support for ppc64-le (by Raphael Moreira Zinsly and
|
|
Adhemerval Zanella)
|
|
|
|
* there's now some support of uclibc (contributed by user xiaoyur347)
|
|
|
|
* google/ headers will now give you deprecation warning. They are
|
|
deprecated since 2.0
|
|
|
|
* there's now new api: tc_malloc_skip_new_handler (ported from chromium
|
|
fork)
|
|
|
|
* issue-557: added support for dumping heap profile via signal (by
|
|
Jean Lee)
|
|
|
|
* issue-567: Petr Hosek contributed SysAllocator support for windows
|
|
|
|
* Joonsoo Kim contributed several speedups for central freelist code
|
|
|
|
* TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES environment variable now works
|
|
|
|
* configure scripts are now using AM_MAINTAINER_MODE. It'll only
|
|
affect folks who modify source from .tar.gz and want automake to
|
|
automatically rebuild Makefile-s. See automake documentation for
|
|
that.
|
|
|
|
* issue-586: detect main executable even if PIE is active (based on
|
|
patch by user themastermind1). Notably, it fixes profiler use with
|
|
ruby.
|
|
|
|
* there is now support for switching backtrace capturing method at
|
|
runtime (via TCMALLOC_STACKTRACE_METHOD and
|
|
TCMALLOC_STACKTRACE_METHOD_VERBOSE environment variables)
|
|
|
|
* there is new backtrace capturing method using -finstrument-functions
|
|
prologues contributed by user xiaoyur347
|
|
|
|
* few cases of crashes/deadlocks in profiler were addressed. See
|
|
(famous) issue-66, issue-547 and issue-579.
|
|
|
|
* issue-464 (memory corruption in debugalloc's realloc after
|
|
memalign) is now fixed
|
|
|
|
* tcmalloc is now able to release memory back to OS on windows
|
|
(issue-489). The code was ported from chromium fork (by a number of
|
|
authors).
|
|
|
|
* Together with issue-489 we ported chromium's "aggressive decommit"
|
|
mode. In this mode (settable via malloc extension and via
|
|
environment variable TCMALLOC_AGGRESSIVE_DECOMMIT), free pages are
|
|
returned back to OS immediately.
|
|
|
|
* MallocExtension::instance() is now faster (based on patch by
|
|
Adhemerval Zanella)
|
|
|
|
* issue-610 (hangs on windows in multibyte locales) is now fixed
|
|
|
|
The following people helped with ideas or patches (based on git log,
|
|
some contributions purely in bugtracker might be missing): Andrew
|
|
C. Morrow, yurivict, Wang YanQing, Thomas Klausner,
|
|
davide.italiano@10gen.com, Dai MIKURUBE, Joon-Sung Um, Jovan
|
|
Zelincevic, Jean Lee, Petr Hosek, Ben Avison, drussel, Joonsoo Kim,
|
|
Hannes Weisbach, xiaoyur347, Riku Voipio, Adhemerval Zanella, Raphael
|
|
Moreira Zinsly
|
|
|
|
== 30 July 2013 ==
|
|
|
|
gperftools 2.1 is out!
|
|
|
|
Just a few fixes were merged after rc. Most notably:
|
|
|
|
* Some fixes for debug allocation on POWER/Linux
|
|
|
|
== 20 July 2013 ==
|
|
|
|
gperftools 2.1rc is out!
|
|
|
|
As a result of more than a year of contributions we're ready for 2.1
|
|
release.
|
|
|
|
But before making that step I'd like to create RC and make sure people
|
|
have chance to test it.
|
|
|
|
Here are notable changes since 2.0:
|
|
|
|
* fixes for building on newer platforms. Notably, there's now initial
|
|
support for x32 ABI (--enable-minimal only at this time)
|
|
|
|
* new getNumericProperty stats for cache sizes
|
|
|
|
* added HEAP_PROFILER_TIME_INTERVAL variable (see documentation)
|
|
|
|
* added environment variable to control heap size (TCMALLOC_HEAP_LIMIT_MB)
|
|
|
|
* added environment variable to disable release of memory back to OS
|
|
(TCMALLOC_DISABLE_MEMORY_RELEASE)
|
|
|
|
* cpu profiler can now be switched on and off by sending it a signal
|
|
(specified in CPUPROFILESIGNAL)
|
|
|
|
* (issue 491) fixed race-ful spinlock wake-ups
|
|
|
|
* (issue 496) added some support for fork-ing of process that is using
|
|
tcmalloc
|
|
|
|
* (issue 368) improved memory fragmentation when large chunks of
|
|
memory are allocated/freed
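The signal-based on/off switch for the cpu profiler mentioned in the list above can be pictured with a small sketch. This is not gperftools code: the `sketch` namespace, `InstallToggleFromEnv`, and the flag are hypothetical names, and the only thing taken from the notes is that a signal number is read from an environment variable such as CPUPROFILESIGNAL.

```cpp
#include <cassert>
#include <signal.h>
#include <stdlib.h>

namespace sketch {  // hypothetical names, for illustration only

// Flipped by the handler; sig_atomic_t may be written from a signal handler.
volatile sig_atomic_t g_profiling_on = 0;

extern "C" void ToggleHandler(int) {
  g_profiling_on = g_profiling_on ? 0 : 1;
}

// Reads a signal number from the named environment variable and installs
// the toggle handler for it. Returns the signal number, or -1 when the
// variable is unset, not a positive integer, or the handler cannot be set.
int InstallToggleFromEnv(const char* env_name) {
  const char* value = getenv(env_name);
  if (value == nullptr) return -1;

  char* end = nullptr;
  long signo = strtol(value, &end, 10);
  if (end == value || *end != '\0' || signo <= 0) return -1;

  struct sigaction sa = {};
  sa.sa_handler = ToggleHandler;
  sigemptyset(&sa.sa_mask);
  sa.sa_flags = SA_RESTART;
  if (sigaction(static_cast<int>(signo), &sa, nullptr) != 0) return -1;
  return static_cast<int>(signo);
}

}  // namespace sketch
```

With the variable set to a catchable signal number, each delivery of that signal flips the flag, which is the same on/off behaviour the entry describes from the user's point of view.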
|
|
|
|
== 03 February 2012 ==
|
|
|
|
I've just released gperftools 2.0
|
|
|
|
The `google-perftools` project has been renamed to `gperftools`. I
|
|
(csilvers) am stepping down as maintainer, to be replaced by
|
|
David Chappelle. Welcome to the team, David! David has been
|
|
an active contributor to perftools in the past -- in fact, he's the
|
|
only person other than me that already has commit status. I am
|
|
pleased to have him take over as maintainer.
|
|
|
|
I have both renamed the project (the Google Code site renamed a few
|
|
weeks ago), and bumped the major version number up to 2, to reflect
|
|
the new community ownership of the project. Almost all the
|
|
[http://gperftools.googlecode.com/svn/tags/gperftools-2.0/ChangeLog changes]
|
|
are related to the renaming.
|
|
|
|
The main functional change from google-perftools 1.10 is that
|
|
I've renamed the `google/` include-directory to be `gperftools/`
|
|
instead. New code should `#include <gperftools/tcmalloc.h>`/etc.
|
|
(Most users of perftools don't need any perftools-specific includes at
|
|
all, so this is mostly directed to "power users.") I've kept the old
|
|
names around as forwarding headers to the new, so `#include
|
|
<google/tcmalloc.h>` will continue to work.
|
|
|
|
(The other functional change which I snuck in is getting rid of some
|
|
bash-isms in one of the unittest driver scripts, so it could run on
|
|
Solaris.)
|
|
|
|
Note that some internal names still contain the text `google`, such as
|
|
the `google_malloc` internal linker section. I think that's a
|
|
trickier transition, and can happen in a future release (if at all).
|
|
|
|
|
|
=== 31 January 2012 ===
|
|
|
|
I've just released perftools 1.10
|
|
|
|
There is an API-incompatible change: several of the methods in the
|
|
`MallocExtension` class have changed from taking a `void*` to taking a
|
|
`const void*`. You should not be affected by this API change
|
|
unless you've written your own custom malloc extension that derives
|
|
from `MallocExtension`, but since it is a user-visible change, I have
|
|
upped the `.so` version number for this release.
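To see why a `void*` to `const void*` change matters only to subclasses, here is a minimal sketch. The class and method names are hypothetical stand-ins, not the real `MallocExtension` interface; the point is only how C++ treats a subclass written against the old parameter type.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical base class, shown with the signature after the change.
struct BaseAfterChange {
  virtual ~BaseAfterChange() = default;
  virtual std::size_t AllocatedSize(const void* p) {
    (void)p;
    return 0;  // default answer of the base class
  }
};

// Written against the old `void*` signature. The parameter type no longer
// matches, so this is a separate function that hides the base method
// instead of overriding it.
struct StaleSubclass : BaseAfterChange {
  std::size_t AllocatedSize(void* p) {
    (void)p;
    return 42;
  }
};

// Updated to the new signature; `override` makes the compiler verify it.
struct UpdatedSubclass : BaseAfterChange {
  std::size_t AllocatedSize(const void* p) override {
    (void)p;
    return 42;
  }
};

// Callers that only know the base class see the difference.
std::size_t QueryThroughBase(BaseAfterChange& ext, const void* p) {
  return ext.AllocatedSize(p);
}
```

Called through the base class, `StaleSubclass` silently falls back to the base behaviour while `UpdatedSubclass` is used as intended. Ordinary callers that merely pass pointers are unaffected, since a `void*` argument converts to `const void*` implicitly.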
|
|
|
|
This release focuses on improvements to linux-syscall-support.h,
|
|
including ARM and PPC fixups and general cleanups. I hope this will
|
|
magically fix an array of bugs people have been seeing.
|
|
|
|
There is also exciting news on the porting front, with support for
|
|
patching win64 assembly contributed by IBM Canada! This is an
|
|
important step -- perhaps the most difficult -- to getting perftools
|
|
to work on 64-bit windows using the patching technique (it doesn't
|
|
affect the libc-modification technique). `preamble_patcher_test` has
|
|
been added to help test these changes; it is meant to compile under
|
|
x86_64, and won't work under win32.
|
|
|
|
For the full list of changes, including improved `HEAP_PROFILE_MMAP`
|
|
support, see the
|
|
[http://gperftools.googlecode.com/svn/tags/google-perftools-1.10/ChangeLog ChangeLog].
|
|
|
|
|
|
=== 24 January 2012 ===
|
|
|
|
The `google-perftools` Google Code page has been renamed to
|
|
`gperftools`, in preparation for the project being renamed to
|
|
`gperftools`. In the coming weeks, I'll be stepping down as
|
|
maintainer for the perftools project, and as part of that Google is
|
|
relinquishing ownership of the project; it will now be entirely
|
|
community run. The name change reflects that shift. The 'g' in
|
|
'gperftools' stands for 'great'. :-)
|
|
|
|
=== 23 December 2011 ===
|
|
|
|
I've just released perftools 1.9.1
|
|
|
|
I missed including a file in the tarball, that is needed to compile on
|
|
ARM. If you are not compiling on ARM, or have successfully compiled
|
|
perftools 1.9, there is no need to upgrade.
|
|
|
|
|
|
=== 22 December 2011 ===
|
|
|
|
I've just released perftools 1.9
|
|
|
|
This change has a slew of improvements, from better ARM and freebsd
|
|
support, to improved performance by moving some code outside of locks,
|
|
to better pprof reporting of code with overloaded functions.
|
|
|
|
The full list of changes is in the
|
|
[http://google-perftools.googlecode.com/svn/tags/google-perftools-1.9/ChangeLog ChangeLog].
|
|
|
|
|
|
=== 26 August 2011 ===
|
|
|
|
I've just released perftools 1.8.3
|
|
|
|
The star-crossed 1.8 series continues; in 1.8.1, I had accidentally
|
|
removed some code that was needed for FreeBSD. (Without this code
|
|
many apps would crash at startup.) This release re-adds that code.
|
|
If you are not on FreeBSD, or are using FreeBSD with perftools 1.8 or
|
|
earlier, there is no need to upgrade.
|
|
|
|
=== 11 August 2011 ===
|
|
|
|
I've just released perftools 1.8.2
|
|
|
|
I was incorrectly calculating the patch-level in the configuration
|
|
step, meaning the TC_VERSION_PATCH #define in tcmalloc.h was wrong.
|
|
Since the testing framework checks for this, it was failing. Now it
|
|
should work again. This time, I was careful to re-run my tests after
|
|
upping the version number. :-)
|
|
|
|
If you don't care about the TC_VERSION_PATCH #define, there's no
|
|
reason to upgrade.
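For readers who do use the version #defines, the following sketch shows one way to read them and run a simple consistency check. It assumes the header is installed as `<gperftools/tcmalloc.h>` (the path used from 2.0 onward; at the time of 1.8.2 it lived under `google/`) and that `TC_VERSION_PATCH` is a string such as `""` or `".1"`. The helper names and the placeholder values used when the header is missing are illustrative, and the check is not the project's actual test.

```cpp
#include <cassert>
#include <string>

// Include the real header only when it is installed, so the sketch still
// compiles on machines without gperftools.
#if defined(__has_include)
#  if __has_include(<gperftools/tcmalloc.h>)
#    include <gperftools/tcmalloc.h>
#  endif
#endif

#ifndef TC_VERSION_STRING
// Placeholder values (assumptions), used only when the header was not found.
#  define TC_VERSION_MAJOR 0
#  define TC_VERSION_MINOR 0
#  define TC_VERSION_PATCH ""
#  define TC_VERSION_STRING "placeholder 0.0"
#endif

// Rebuilds "major.minor<patch>" from the individual macros.
std::string VersionFromParts() {
  return std::to_string(TC_VERSION_MAJOR) + "." +
         std::to_string(TC_VERSION_MINOR) + TC_VERSION_PATCH;
}

// True when the rebuilt version appears inside TC_VERSION_STRING. A wrong
// patch level would make this kind of check fail.
bool VersionMacrosAgree() {
  return std::string(TC_VERSION_STRING).find(VersionFromParts()) !=
         std::string::npos;
}
```

Only preprocessor macros are used, so nothing here requires linking against tcmalloc.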
|
|
|
|
=== 26 July 2011 ===
|
|
|
|
I've just released perftools 1.8.1
|
|
|
|
I was missing an #include that caused the build to break under some
|
|
compilers, especially newer gcc's, that wanted it. This only affects
|
|
people who build from source, so only the .tar.gz file is updated from
|
|
perftools 1.8. If you didn't have any problems compiling perftools
|
|
1.8, there's no reason to upgrade.
|
|
|
|
=== 15 July 2011 ===
|
|
|
|
I've just released perftools 1.8
|
|
|
|
Of the many changes in this release, a good number pertain to porting.
|
|
I've revamped OS X support to use the malloc-zone framework; it should
|
|
now Just Work to link in tcmalloc, without needing
|
|
`DYLD_FORCE_FLAT_NAMESPACE` or the like. (This is a pretty major
|
|
change, so please feel free to report feedback at
|
|
google-perftools@googlegroups.com.) 64-bit Windows support is also
|
|
improved, as is ARM support, and the hooks are in place to improve
|
|
FreeBSD support as well.
|
|
|
|
On the other hand, I'm seeing hanging tests on Cygwin. I see the same
|
|
hanging even with (the old) perftools 1.7, so I'm guessing this is
|
|
either a problem specific to my Cygwin installation, or nobody is
|
|
trying to use perftools under Cygwin. If you can reproduce the
|
|
problem, and even better have a solution, you can report it at
|
|
google-perftools@googlegroups.com.
|
|
|
|
Internal changes include several performance and space-saving tweaks.
|
|
One is user-visible (but in "stealth mode", and otherwise
|
|
undocumented): you can compile with `-DTCMALLOC_SMALL_BUT_SLOW`. In
|
|
this mode, tcmalloc will use less memory overhead, at the cost of
|
|
running (likely not noticeably) slower.
|
|
|
|
There are many other changes as well, too numerous to recount here,
|
|
but present in the
|
|
[http://google-perftools.googlecode.com/svn/tags/google-perftools-1.8/ChangeLog ChangeLog].
|
|
|
|
|
|
=== 7 February 2011 ===
|
|
|
|
Thanks to endlessr..., who
|
|
[http://code.google.com/p/google-perftools/issues/detail?id=307 identified]
|
|
why some tests were failing under MSVC 10 in release mode. It does not look
|
|
like these failures point toward any problem with tcmalloc itself; rather, the
|
|
problem is with the test, which made some assumptions that broke under the
|
|
aggressive optimizations used in MSVC 10. I'll fix the test, but in
|
|
the meantime, feel free to use perftools even when compiled under MSVC
|
|
10.
|
|
|
|
=== 4 February 2011 ===
|
|
|
|
I've just released perftools 1.7
|
|
|
|
I apologize for the delay since the last release; so many great new
|
|
patches and bugfixes kept coming in (and are still coming in; I also
|
|
apologize to those folks who have to slip until the next release). I
|
|
picked this arbitrary time to make a cut.
|
|
|
|
Among the many new features in this release is a multi-megabyte
|
|
reduction in the amount of tcmalloc overhead under x86_64, improved
|
|
performance in the case of contention, and many many bugfixes,
|
|
especially architecture-specific bugfixes. See the
|
|
[http://google-perftools.googlecode.com/svn/tags/google-perftools-1.7/ChangeLog ChangeLog]
|
|
for full details.
|
|
|
|
One architecture-specific change of note is added comments in the
|
|
[http://google-perftools.googlecode.com/svn/tags/perftools-1.7/README README]
|
|
for using tcmalloc under OS X. I'm trying to get my head around the
|
|
exact behavior of the OS X linker, and hope to have more improvements
|
|
for the next release, but I hope these notes help folks who have been
|
|
having trouble with tcmalloc on OS X.
|
|
|
|
*Windows users*: I've heard reports that some unittests fail on
|
|
Windows when compiled with MSVC 10 in Release mode. All tests pass in
|
|
Debug mode. I've not heard of any problems with earlier versions of
|
|
MSVC. I don't know if this is a problem with the runtime patching (so
|
|
the static patching discussed in README_windows.txt will still work),
|
|
a problem with perftools more generally, or a bug in MSVC 10. Anyone
|
|
with windows expertise that can debug this, I'd be glad to hear from!
|
|
|
|
|
|
=== 5 August 2010 ===
|
|
|
|
I've just released perftools 1.6
|
|
|
|
This version also has a large number of minor changes, including
|
|
support for `malloc_usable_size()` as a glibc-compatible alias to
|
|
`malloc_size()`, the addition of SVG-based output to `pprof`, and
|
|
experimental support for tcmalloc large pages, which may speed up
|
|
tcmalloc at the cost of greater memory use. To use tcmalloc large
|
|
pages, see the
|
|
[http://google-perftools.googlecode.com/svn/tags/perftools-1.6/INSTALL
|
|
INSTALL file]; for all changes, see the
|
|
[http://google-perftools.googlecode.com/svn/tags/perftools-1.6/ChangeLog
|
|
ChangeLog].
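As a quick illustration of the `malloc_usable_size()` / `malloc_size()` pair mentioned above, here is a minimal sketch that works with whichever allocator the program is linked against. It assumes a glibc-style `<malloc.h>` on non-Apple systems and `<malloc/malloc.h>` on OS X; `UsableSizeFor` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#if defined(__APPLE__)
#include <malloc/malloc.h>  // malloc_size
#else
#include <malloc.h>         // malloc_usable_size (glibc-compatible name)
#endif

// Allocates `request` bytes and reports how many bytes the allocator says
// are actually usable in that block. Returns 0 if the allocation fails.
std::size_t UsableSizeFor(std::size_t request) {
  void* p = std::malloc(request);
  if (p == nullptr) return 0;
#if defined(__APPLE__)
  std::size_t usable = malloc_size(p);
#else
  std::size_t usable = malloc_usable_size(p);
#endif
  std::free(p);
  return usable;
}
```

The reported size is at least the requested size and is often larger, because allocators round requests up to their internal size classes.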
|
|
|
|
OS X NOTE: improvements in the profiler unittest have turned up an OS
|
|
X issue: in multithreaded programs, it seems that OS X often delivers
|
|
the profiling signal (from setitimer()) to the main thread, even when
|
|
it's sleeping, rather than spawned threads that are doing actual work.
|
|
If anyone knows details of how OS X handles SIGPROF events (from
|
|
setitimer) in threaded programs, and has insight into this problem,
|
|
please send mail to google-perftools@googlegroups.com.
|
|
|
|
To see if you're affected by this, look for profiling time that pprof
|
|
attributes to `___semwait_signal`. This is work being done in other
|
|
threads, that is being attributed to sleeping-time in the main thread.
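For anyone who wants to experiment with the mechanism behind this note, here is a minimal single-threaded sketch of a `setitimer`-driven SIGPROF handler. It is not the profiler's code: `CountProfTicks`, the 10 ms period, and the tick counter are illustrative choices. In a multithreaded program, which thread receives each signal is up to the OS, and that is exactly the behaviour in question.

```cpp
#include <cassert>
#include <signal.h>
#include <sys/time.h>

// Number of SIGPROF deliveries seen so far.
static volatile sig_atomic_t g_prof_ticks = 0;

extern "C" void OnSigprof(int) { g_prof_ticks = g_prof_ticks + 1; }

// Arms ITIMER_PROF with a 10 ms period, burns CPU until `wanted` ticks have
// arrived, disarms the timer, and returns the number of ticks seen.
// Returns -1 if the handler or the timer could not be set up.
int CountProfTicks(int wanted) {
  assert(wanted > 0);

  struct sigaction sa = {};
  sa.sa_handler = OnSigprof;
  sigemptyset(&sa.sa_mask);
  sa.sa_flags = SA_RESTART;
  if (sigaction(SIGPROF, &sa, nullptr) != 0) return -1;

  g_prof_ticks = 0;
  itimerval period = {};
  period.it_interval.tv_usec = 10000;  // repeat every 10 ms of CPU time
  period.it_value.tv_usec = 10000;     // first expiry after 10 ms
  if (setitimer(ITIMER_PROF, &period, nullptr) != 0) return -1;

  while (g_prof_ticks < wanted) {
    // Busy-wait on purpose: ITIMER_PROF only advances while the process
    // is consuming CPU time.
  }

  itimerval off = {};  // all-zero value disarms the timer
  setitimer(ITIMER_PROF, &off, nullptr);
  return g_prof_ticks;
}
```

A profiler built on this mechanism attributes each tick to whatever the receiving thread is doing, so ticks delivered to a sleeping main thread end up charged to its wait routine.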
|
|
|
|
|
|
=== 20 January 2010 ===
|
|
|
|
I've just released perftools 1.5
|
|
|
|
This version has a slew of changes, leading to somewhat faster
|
|
performance and improvements in portability. It adds features like
|
|
`ITIMER_REAL` support to the cpu profiler, and `tc_set_new_mode` to
|
|
mimic the windows function of the same name. Full details are in the
|
|
[http://google-perftools.googlecode.com/svn/tags/perftools-1.5/ChangeLog
|
|
ChangeLog].
|
|
|
|
|
|
=== 11 September 2009 ===
|
|
|
|
I've just released perftools 1.4
|
|
|
|
The major change this release is the addition of a debugging malloc
|
|
library! If you link with `libtcmalloc_debug.so` instead of
|
|
`libtcmalloc.so` (and likewise for the `minimal` variants) you'll get
|
|
a debugging malloc, which will catch double-frees, writes to freed
|
|
data, `free`/`delete` and `delete`/`delete[]` mismatches, and even
|
|
(optionally) writes past the end of an allocated block.
|
|
|
|
We plan to do more with this library in the future, including
|
|
supporting it on Windows, and adding the ability to use the debugging
|
|
library with your default malloc in addition to using it with
|
|
tcmalloc.
|
|
|
|
There is also the usual complement of bug fixes, documented in the
|
|
ChangeLog, and a few minor user-tunable knobs added to components like
|
|
the system allocator.
|
|
|
|
|
|
=== 9 June 2009 ===
|
|
|
|
I've just released perftools 1.3
|
|
|
|
Like 1.2, this has a variety of bug fixes, especially related to the
|
|
Windows build. One of my bugfixes is to undo the weird `ld -r` fix to
|
|
`.a` files that I introduced in perftools 1.2: it caused problems on
|
|
too many platforms. I've reverted back to normal `.a` files. To work
|
|
around the original problem that prompted the `ld -r` fix, I now
|
|
provide `libtcmalloc_and_profiler.a`, for folks who want to link in
|
|
both.
|
|
|
|
The most interesting API change is that I now not only override
|
|
`malloc`/`free`/etc, I also expose them via a unique set of symbols:
|
|
`tc_malloc`/`tc_free`/etc. This enables clients to write their own
|
|
memory wrappers that use tcmalloc:
|
|
{{{
|
|
void* malloc(size_t size) { void* r = tc_malloc(size); Log(r); return r; }
|
|
}}}
|
|
|
|
|
|
=== 17 April 2009 ===
|
|
|
|
I've just released perftools 1.2.
|
|
|
|
This is mostly a bugfix release. The major change is internal: I have
|
|
a new system for creating packages, which allows me to create 64-bit
|
|
packages. (I still don't do that for perftools, because there is
|
|
still no great 64-bit solution, with libunwind still giving problems
|
|
and --disable-frame-pointers not practical in every environment.)
|
|
|
|
Another interesting change involves Windows: a
|
|
[http://code.google.com/p/google-perftools/issues/detail?id=126 new
|
|
patch] allows users to choose to override malloc/free/etc on Windows
|
|
rather than patching, as is done now. This can be used to create
|
|
custom CRTs.
|
|
|
|
My fix for this
|
|
[http://groups.google.com/group/google-perftools/browse_thread/thread/1ff9b50043090d9d/a59210c4206f2060?lnk=gst&q=dynamic#a59210c4206f2060
|
|
bug involving static linking] ended up being to make libtcmalloc.a and
|
|
libperftools.a a big .o file, rather than a true `ar` archive. This
|
|
should not yield any problems in practice -- in fact, it should be
|
|
better, since the heap profiler, leak checker, and cpu profiler will
|
|
now all work even with the static libraries -- but if you find it
|
|
does, please file a bug report.
|
|
|
|
Finally, the profile_handler_unittest provided in the perftools
|
|
testsuite (new in this release) is failing on FreeBSD. The end-to-end
|
|
test that uses the profile-handler is passing, so I suspect the
|
|
problem may be with the test, not the perftools code itself. However,
|
|
I do not know enough about how itimers work on FreeBSD to be able to
|
|
debug it. If you can figure it out, please let me know!
|
|
|
|
=== 11 March 2009 ===
|
|
|
|
I've just released perftools 1.1!
|
|
|
|
It has many changes since perftools 1.0 including
|
|
|
|
* Faster performance due to dynamically sized thread caches
|
|
* Better heap-sampling for more realistic profiles
|
|
* Improved support on Windows (MSVC 7.1 and cygwin)
|
|
* Better stacktraces in linux (using VDSO)
|
|
* Many bug fixes and feature requests
|
|
|
|
Note: if you use the CPU-profiler with applications that fork without
|
|
doing an exec right afterwards, please see the README. Recent testing
|
|
has shown that profiles are unreliable in that case. The problem has
|
|
existed since the first release of perftools. We expect to have a fix
|
|
for perftools 1.2. For more details, see
|
|
[http://code.google.com/p/google-perftools/issues/detail?id=105 issue 105].
|
|
|
|
Everyone who uses perftools 1.0 is encouraged to upgrade to perftools
|
|
1.1. If you see any problems with the new release, please file a bug
|
|
report at http://code.google.com/p/google-perftools/issues/list.
|
|
|
|
Enjoy!
|