diff --git a/NEWS b/NEWS
index e69de29..064bd4b 100644
--- a/NEWS
+++ b/NEWS
@@ -0,0 +1,109 @@
+=== 20 January 2010 ===
+
+I've just released perftools 1.5
+
+This version has a slew of changes, leading to somewhat faster
+performance and improvements in portability. It adds features like
+`ITIMER_REAL` support to the cpu profiler, and `tc_set_new_mode` to
+mimic the Windows function of the same name. Full details are in the
+[http://google-perftools.googlecode.com/svn/tags/perftools-1.5/ChangeLog
+ChangeLog].
+
+=== 11 September 2009 ===
+
+I've just released perftools 1.4
+
+The major change in this release is the addition of a debugging malloc
+library! If you link with `libtcmalloc_debug.so` instead of
+`libtcmalloc.so` (and likewise for the `minimal` variants) you'll get
+a debugging malloc, which will catch double-frees, writes to freed
+data, `free`/`delete` and `delete`/`delete[]` mismatches, and even
+(optionally) writes past the end of an allocated block.
+
+We plan to do more with this library in the future, including
+supporting it on Windows, and adding the ability to use the debugging
+library with your default malloc in addition to using it with
+tcmalloc.
+
+There is also the usual complement of bug fixes, documented in the
+ChangeLog, and a few minor user-tunable knobs added to components like
+the system allocator.
+
+
+=== 9 June 2009 ===
+
+I've just released perftools 1.3
+
+Like 1.2, this has a variety of bug fixes, especially related to the
+Windows build. One of my bugfixes is to undo the weird `ld -r` fix to
+`.a` files that I introduced in perftools 1.2: it caused problems on
+too many platforms. I've reverted to normal `.a` files. To work
+around the original problem that prompted the `ld -r` fix, I now
+provide `libtcmalloc_and_profiler.a`, for folks who want to link in
+both.
+
+The most interesting API change is that I now not only override
+`malloc`/`free`/etc, I also expose them via a unique set of symbols:
+`tc_malloc`/`tc_free`/etc. This enables clients to write their own
+memory wrappers that use tcmalloc:
+{{{
+ void* malloc(size_t size) { void* r = tc_malloc(size); Log(r); return r; }
+}}}
+
+
+=== 17 April 2009 ===
+
+I've just released perftools 1.2.
+
+This is mostly a bugfix release. The major change is internal: I have
+a new system for creating packages, which allows me to create 64-bit
+packages. (I still don't do that for perftools, because there is
+still no great 64-bit solution, with libunwind still giving problems
+and --disable-frame-pointers not practical in every environment.)
+
+Another interesting change involves Windows: a
+[http://code.google.com/p/google-perftools/issues/detail?id=126 new
+patch] allows users to choose to override malloc/free/etc on Windows
+rather than patching, as is done now. This can be used to create
+custom CRTs.
+
+My fix for this
+[http://groups.google.com/group/google-perftools/browse_thread/thread/1ff9b50043090d9d/a59210c4206f2060?lnk=gst&q=dynamic#a59210c4206f2060
+bug involving static linking] ended up being to make libtcmalloc.a and
+libperftools.a a big .o file, rather than a true `ar` archive. This
+should not yield any problems in practice -- in fact, it should be
+better, since the heap profiler, leak checker, and cpu profiler will
+now all work even with the static libraries -- but if you find it
+does, please file a bug report.
+
+Finally, the profile_handler_unittest provided in the perftools
+testsuite (new in this release) is failing on FreeBSD. The end-to-end
+test that uses the profile-handler is passing, so I suspect the
+problem may be with the test, not the perftools code itself. However,
+I do not know enough about how itimers work on FreeBSD to be able to
+debug it. If you can figure it out, please let me know!
+
+=== 11 March 2009 ===
+
+I've just released perftools 1.1!
+
+It has many changes since perftools 1.0, including:
+
+ * Faster performance due to dynamically sized thread caches
+ * Better heap-sampling for more realistic profiles
+ * Improved support on Windows (MSVC 7.1 and Cygwin)
+ * Better stacktraces on Linux (using VDSO)
+ * Many bug fixes and fulfilled feature requests
+
+Note: if you use the CPU-profiler with applications that fork without
+doing an exec right afterwards, please see the README. Recent testing
+has shown that profiles are unreliable in that case. The problem has
+existed since the first release of perftools. We expect to have a fix
+for perftools 1.2. For more details, see
+[http://code.google.com/p/google-perftools/issues/detail?id=105 issue 105].
+
+Everyone who uses perftools 1.0 is encouraged to upgrade to perftools
+1.1. If you see any problems with the new release, please file a bug
+report at http://code.google.com/p/google-perftools/issues/list.
+
+Enjoy!
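
The 1.3 entry above shows a one-line `malloc` wrapper built on
`tc_malloc`. A slightly fuller sketch of the same pattern, pairing the
allocation and deallocation wrappers (illustrative only; the counter
stands in for whatever instrumentation you want):

{{{
#include <stddef.h>
#include <google/tcmalloc.h>   // declares tc_malloc()/tc_free()

static size_t alloc_count = 0;  // crude instrumentation; not thread-safe

// Route every malloc() through tcmalloc, counting calls.  (Avoid stdio
// here: printf itself may allocate and re-enter this wrapper.)
extern "C" void* malloc(size_t size) {
  ++alloc_count;
  return tc_malloc(size);
}

// The matching wrapper: memory from tc_malloc() must go back to tc_free().
extern "C" void free(void* ptr) {
  tc_free(ptr);
}
}}}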
diff --git a/doc/heapprofile.html b/doc/heapprofile.html
index c857df1..709559d 100644
--- a/doc/heapprofile.html
+++ b/doc/heapprofile.html
@@ -67,11 +67,12 @@ for a given run of an executable:
In your code, bracket the code you want profiled in calls to
<code>HeapProfilerStart()</code> and <code>HeapProfilerStop()</code>.
(These functions are declared in <code>&lt;google/heap-profiler.h&gt;</code>.)
-<code>HeapProfilerStart()</code> will take
-the profile-filename-prefix as an argument. You can then use
-<code>HeapProfilerDump()</code> or
-<code>GetHeapProfile()</code> to examine the profile.
-In case it's useful, <code>IsHeapProfilerRunning()</code> will tell you
+<code>HeapProfilerStart()</code> will take the
+profile-filename-prefix as an argument.  Then, as often as
+you'd like before calling <code>HeapProfilerStop()</code>, you
+can use <code>HeapProfilerDump()</code> or
+<code>GetHeapProfile()</code> to examine the profile.  In case
+it's useful, <code>IsHeapProfilerRunning()</code> will tell you
whether you've already called HeapProfilerStart() or not.
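
The flow the revised passage describes -- start, dump as often as you
like, stop -- looks like this in code (a minimal sketch; the two
DoPhase*() functions are hypothetical workload stand-ins):

{{{
#include <google/heap-profiler.h>

void ProfiledRun() {
  HeapProfilerStart("/tmp/myprog");      // prefix for dumped profile files
  DoPhaseOne();
  HeapProfilerDump("end of phase one");  // reason string recorded in the dump
  DoPhaseTwo();
  HeapProfilerDump("end of phase two");
  HeapProfilerStop();                    // no further dumps after this
}
}}}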
diff --git a/packages/deb/control b/packages/deb/control
index 379a5b1..c6f0a83 100644
--- a/packages/deb/control
+++ b/packages/deb/control
@@ -1,6 +1,6 @@
Source: google-perftools
Priority: optional
-Maintainer: Google Inc. <opensource@google.com>
+Maintainer: Google Inc. <google-perftools@googlegroups.com>
Build-Depends: debhelper (>= 4.0.0), binutils
Standards-Version: 3.6.1
diff --git a/packages/rpm/rpm.spec b/packages/rpm/rpm.spec
index b18e89b..bbf448f 100644
--- a/packages/rpm/rpm.spec
+++ b/packages/rpm/rpm.spec
@@ -10,7 +10,7 @@ Group: Development/Libraries
URL: http://code.google.com/p/google-perftools/
License: BSD
Vendor: Google
-Packager: Google <opensource@google.com>
+Packager: Google <google-perftools@googlegroups.com>
Source: http://%{NAME}.googlecode.com/files/%{NAME}-%{VERSION}.tar.gz
Distribution: Redhat 7 and above.
Buildroot: %{_tmppath}/%{name}-root
diff --git a/src/base/dynamic_annotations.c b/src/base/dynamic_annotations.c
index 65c4158..cdefaa7 100644
--- a/src/base/dynamic_annotations.c
+++ b/src/base/dynamic_annotations.c
@@ -105,12 +105,7 @@ void AnnotateBenignRace(const char *file, int line,
void AnnotateBenignRaceSized(const char *file, int line,
const volatile void *mem,
long size,
- const char *description) {
- long i;
- for (i = 0; i < size; i++) {
- AnnotateBenignRace(file, line, (char*)(mem) + i, description);
- }
-}
+ const char *description) {}
void AnnotateMutexIsUsedAsCondVar(const char *file, int line,
const volatile void *mu){}
void AnnotateTraceMemory(const char *file, int line,
@@ -121,6 +116,7 @@ void AnnotateIgnoreReadsBegin(const char *file, int line){}
void AnnotateIgnoreReadsEnd(const char *file, int line){}
void AnnotateIgnoreWritesBegin(const char *file, int line){}
void AnnotateIgnoreWritesEnd(const char *file, int line){}
+void AnnotateEnableRaceDetection(const char *file, int line, int enable){}
void AnnotateNoOp(const char *file, int line,
const volatile void *arg){}
void AnnotateFlushState(const char *file, int line){}
diff --git a/src/base/dynamic_annotations.h b/src/base/dynamic_annotations.h
index 3980b24..dae1a14 100644
--- a/src/base/dynamic_annotations.h
+++ b/src/base/dynamic_annotations.h
@@ -246,6 +246,12 @@
ANNOTATE_IGNORE_READS_END();\
}while(0)\
+ /* Enable (enable!=0) or disable (enable==0) race detection for all threads.
+ This annotation could be useful if you want to skip expensive race analysis
+ during some period of program execution, e.g. during initialization. */
+ #define ANNOTATE_ENABLE_RACE_DETECTION(enable) \
+ AnnotateEnableRaceDetection(__FILE__, __LINE__, enable)
+
/* -------------------------------------------------------------
Annotations useful for debugging. */
@@ -358,6 +364,7 @@
#define ANNOTATE_IGNORE_WRITES_END() /* empty */
#define ANNOTATE_IGNORE_READS_AND_WRITES_BEGIN() /* empty */
#define ANNOTATE_IGNORE_READS_AND_WRITES_END() /* empty */
+ #define ANNOTATE_ENABLE_RACE_DETECTION(enable) /* empty */
#define ANNOTATE_NO_OP(arg) /* empty */
#define ANNOTATE_FLUSH_STATE() /* empty */
@@ -428,6 +435,7 @@ void AnnotateIgnoreReadsBegin(const char *file, int line);
void AnnotateIgnoreReadsEnd(const char *file, int line);
void AnnotateIgnoreWritesBegin(const char *file, int line);
void AnnotateIgnoreWritesEnd(const char *file, int line);
+void AnnotateEnableRaceDetection(const char *file, int line, int enable);
void AnnotateNoOp(const char *file, int line,
const volatile void *arg);
void AnnotateFlushState(const char *file, int line);
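
A sketch of the intended use of the new annotation: bracket a phase
whose races are known to be benign, such as start-up, so a race
detector skips it (the setup function here is hypothetical):

{{{
#include "base/dynamic_annotations.h"

void Initialize() {
  ANNOTATE_ENABLE_RACE_DETECTION(0);  // suspend race analysis in all threads
  RunRacyButBenignSetup();            // hypothetical expensive initialization
  ANNOTATE_ENABLE_RACE_DETECTION(1);  // resume normal race detection
}
}}}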
diff --git a/src/base/vdso_support.cc b/src/base/vdso_support.cc
index ddaca37..fce7c2c 100644
--- a/src/base/vdso_support.cc
+++ b/src/base/vdso_support.cc
@@ -42,8 +42,8 @@
#include <fcntl.h>
#include "base/atomicops.h" // for MemoryBarrier
-#include "base/logging.h"
#include "base/linux_syscall_support.h"
+#include "base/logging.h"
#include "base/dynamic_annotations.h"
#include "base/basictypes.h" // for COMPILE_ASSERT
diff --git a/src/central_freelist.cc b/src/central_freelist.cc
index 674ff9b..5b7dfbb 100644
--- a/src/central_freelist.cc
+++ b/src/central_freelist.cc
@@ -266,8 +266,7 @@ void CentralFreeList::Populate() {
Span* span;
{
SpinLockHolder h(Static::pageheap_lock());
- span = Static::pageheap()->New(npages);
- if (span) Static::pageheap()->RegisterSizeClass(span, size_class_);
+ span = Static::pageheap()->New(npages, size_class_, kPageSize);
}
if (span == NULL) {
MESSAGE("tcmalloc: allocation failed", npages << kPageShift);
@@ -275,12 +274,6 @@ void CentralFreeList::Populate() {
return;
}
ASSERT(span->length == npages);
- // Cache sizeclass info eagerly. Locking is not necessary.
- // (Instead of being eager, we could just replace any stale info
- // about this span, but that seems to be no better in practice.)
- for (int i = 0; i < npages; i++) {
- Static::pageheap()->CacheSizeClass(span->start + i, size_class_);
- }
// Split the block into pieces and add to the free-list
// TODO: coloring of objects to avoid cache conflicts?
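
The net effect in Populate(): one page-heap call now both allocates the
span and records its size class, where the old code called New() and
then RegisterSizeClass() plus a page-by-page CacheSizeClass() loop. A
condensed sketch of the new calling convention (using the in-tree
PageHeap, Static, and kPageSize declarations):

{{{
#include "base/spinlock.h"  // SpinLockHolder
#include "common.h"         // kPageSize
#include "page_heap.h"      // PageHeap::New(n, sc, align)
#include "static_vars.h"    // Static::pageheap(), Static::pageheap_lock()

// Allocate npages for a small-object size class; the size class is
// recorded (and cached per page) inside New() -> Carve().
tcmalloc::Span* AllocForSizeClass(Length npages, size_t size_class) {
  SpinLockHolder h(Static::pageheap_lock());
  return Static::pageheap()->New(npages, size_class, kPageSize);
}
}}}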
diff --git a/src/common.h b/src/common.h
index 92c582f..b0278eb 100644
--- a/src/common.h
+++ b/src/common.h
@@ -62,6 +62,7 @@ static const size_t kPageSize = 1 << kPageShift;
static const size_t kMaxSize = 8u * kPageSize;
static const size_t kAlignment = 8;
static const size_t kNumClasses = 61;
+static const size_t kLargeSizeClass = 0;
// Maximum length we allow a per-thread free-list to have before we
// move objects from it into the corresponding central free-list. We
diff --git a/src/google/profiler.h b/src/google/profiler.h
index 74b936f..a6883f4 100644
--- a/src/google/profiler.h
+++ b/src/google/profiler.h
@@ -108,13 +108,15 @@ struct ProfilerOptions {
void *filter_in_thread_arg;
};
-/* Start profiling and write profile info into fname.
+/* Start profiling and write profile info into fname, discarding any
+ * existing profiling data in that file.
*
* This is equivalent to calling ProfilerStartWithOptions(fname, NULL).
*/
PERFTOOLS_DLL_DECL int ProfilerStart(const char* fname);
-/* Start profiling and write profile into fname.
+/* Start profiling and write profile into fname, discarding any
+ * existing profiling data in that file.
*
* The profiler is configured using the options given by 'options'.
* Options which are not specified are given default values.
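
The clarified contract -- ProfilerStart() truncates fname rather than
appending -- matters when one filename is reused across runs. A minimal
usage sketch (RunWorkload() is a hypothetical stand-in):

{{{
#include <google/profiler.h>

void ProfileOnce() {
  ProfilerStart("/tmp/myprog.prof");  // discards earlier data in this file
  RunWorkload();
  ProfilerStop();                     // flush and close the profile
}
}}}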
diff --git a/src/heap-checker.cc b/src/heap-checker.cc
index 84e6cf3..2779c97 100644
--- a/src/heap-checker.cc
+++ b/src/heap-checker.cc
@@ -1377,9 +1377,9 @@ static SpinLock alignment_checker_lock(SpinLock::LINKER_INITIALIZED);
if (VLOG_IS_ON(15)) {
// log call stacks to help debug how come something is not a leak
HeapProfileTable::AllocInfo alloc;
- bool r = heap_profile->FindAllocDetails(ptr, &alloc);
- r = r; // suppress compiler warning in non-debug mode
- RAW_DCHECK(r, ""); // sanity
+ if (!heap_profile->FindAllocDetails(ptr, &alloc)) {
+ RAW_LOG(FATAL, "FindAllocDetails failed on ptr %p", ptr);
+ }
RAW_LOG(INFO, "New live %p object's alloc stack:", ptr);
for (int i = 0; i < alloc.stack_depth; ++i) {
RAW_LOG(INFO, " @ %p", alloc.call_stack[i]);
diff --git a/src/internal_logging.h b/src/internal_logging.h
index 0cb9ba2..731b2d9 100644
--- a/src/internal_logging.h
+++ b/src/internal_logging.h
@@ -119,7 +119,9 @@ do { \
#ifndef NDEBUG
#define ASSERT(cond) CHECK_CONDITION(cond)
#else
-#define ASSERT(cond) ((void) 0)
+#define ASSERT(cond) \
+ do { \
+ } while (0 && (cond))
#endif
// Print into buffer
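
The new NDEBUG form is deliberately not ((void) 0): because &&
short-circuits, the condition is never evaluated at runtime, but it is
still parsed and type-checked, and variables used only inside asserts
still count as referenced. A self-contained sketch of the difference
(ComputeChecksum() is a hypothetical helper):

{{{
// NDEBUG definition from internal_logging.h:
#define ASSERT(cond) do { } while (0 && (cond))

static int ComputeChecksum(const char* buf, int len) {
  int sum = 0;
  for (int i = 0; i < len; ++i) sum += buf[i];
  return sum;
}

void Example(const char* buf, int len) {
  int sum = ComputeChecksum(buf, len);
  ASSERT(sum == 0);  // never evaluated at runtime, but 'sum' is referenced,
                     // so no unused-variable warning, and a typo in the
                     // condition still fails to compile.
}
}}}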
diff --git a/src/page_heap.cc b/src/page_heap.cc
index 1e63cb9..7bfeea4 100644
--- a/src/page_heap.cc
+++ b/src/page_heap.cc
@@ -61,49 +61,64 @@ PageHeap::PageHeap()
}
}
-Span* PageHeap::New(Length n) {
+// Returns the minimum number of pages necessary to ensure that an
+// allocation of size n can be aligned to the given alignment.
+static Length AlignedAllocationSize(Length n, size_t alignment) {
+ ASSERT(alignment >= kPageSize);
+ return n + tcmalloc::pages(alignment - kPageSize);
+}
+
+Span* PageHeap::New(Length n, size_t sc, size_t align) {
ASSERT(Check());
ASSERT(n > 0);
+ if (align < kPageSize) {
+ align = kPageSize;
+ }
+
+ Length aligned_size = AlignedAllocationSize(n, align);
+
// Find first size >= n that has a non-empty list
- for (Length s = n; s < kMaxPages; s++) {
+ for (Length s = aligned_size; s < kMaxPages; s++) {
Span* ll = &free_[s].normal;
// If we're lucky, ll is non-empty, meaning it has a suitable span.
if (!DLL_IsEmpty(ll)) {
ASSERT(ll->next->location == Span::ON_NORMAL_FREELIST);
- return Carve(ll->next, n);
+ return Carve(ll->next, n, sc, align);
}
// Alternatively, maybe there's a usable returned span.
ll = &free_[s].returned;
if (!DLL_IsEmpty(ll)) {
ASSERT(ll->next->location == Span::ON_RETURNED_FREELIST);
- return Carve(ll->next, n);
+ return Carve(ll->next, n, sc, align);
}
// Still no luck, so keep looking in larger classes.
}
- Span* result = AllocLarge(n);
+ Span* result = AllocLarge(n, sc, align);
if (result != NULL) return result;
// Grow the heap and try again
- if (!GrowHeap(n)) {
+ if (!GrowHeap(aligned_size)) {
ASSERT(Check());
return NULL;
}
- return AllocLarge(n);
+ return AllocLarge(n, sc, align);
}
-Span* PageHeap::AllocLarge(Length n) {
- // find the best span (closest to n in size).
+Span* PageHeap::AllocLarge(Length n, size_t sc, size_t align) {
+ // Find the best span (closest to n in size).
// The following loops implement address-ordered best-fit.
Span *best = NULL;
+ Length aligned_size = AlignedAllocationSize(n, align);
+
// Search through normal list
for (Span* span = large_.normal.next;
span != &large_.normal;
span = span->next) {
- if (span->length >= n) {
+ if (span->length >= aligned_size) {
if ((best == NULL)
|| (span->length < best->length)
|| ((span->length == best->length) && (span->start < best->start))) {
@@ -117,7 +132,7 @@ Span* PageHeap::AllocLarge(Length n) {
for (Span* span = large_.returned.next;
span != &large_.returned;
span = span->next) {
- if (span->length >= n) {
+ if (span->length >= aligned_size) {
if ((best == NULL)
|| (span->length < best->length)
|| ((span->length == best->length) && (span->start < best->start))) {
@@ -127,19 +142,18 @@ Span* PageHeap::AllocLarge(Length n) {
}
}
- return best == NULL ? NULL : Carve(best, n);
+ return best == NULL ? NULL : Carve(best, n, sc, align);
}
Span* PageHeap::Split(Span* span, Length n) {
ASSERT(0 < n);
ASSERT(n < span->length);
- ASSERT(span->location == Span::IN_USE);
- ASSERT(span->sizeclass == 0);
+ ASSERT((span->location != Span::IN_USE) || span->sizeclass == 0);
Event(span, 'T', n);
const int extra = span->length - n;
Span* leftover = NewSpan(span->start + n, extra);
- ASSERT(leftover->location == Span::IN_USE);
+ leftover->location = span->location;
Event(leftover, 'U', extra);
RecordSpan(leftover);
pagemap_.set(span->start + n - 1, span); // Update map from pageid to span
@@ -148,25 +162,44 @@ Span* PageHeap::Split(Span* span, Length n) {
return leftover;
}
-Span* PageHeap::Carve(Span* span, Length n) {
+Span* PageHeap::Carve(Span* span, Length n, size_t sc, size_t align) {
ASSERT(n > 0);
ASSERT(span->location != Span::IN_USE);
- const int old_location = span->location;
+ ASSERT(align >= kPageSize);
+
+ Length align_pages = align >> kPageShift;
RemoveFromFreeList(span);
- span->location = Span::IN_USE;
- Event(span, 'A', n);
+
+ if (span->start & (align_pages - 1)) {
+ Length skip_for_alignment = align_pages - (span->start & (align_pages - 1));
+ Span* aligned = Split(span, skip_for_alignment);
+ PrependToFreeList(span); // Skip coalescing - no candidates possible
+ span = aligned;
+ }
const int extra = span->length - n;
ASSERT(extra >= 0);
if (extra > 0) {
- Span* leftover = NewSpan(span->start + n, extra);
- leftover->location = old_location;
- Event(leftover, 'S', extra);
- RecordSpan(leftover);
- PrependToFreeList(leftover); // Skip coalescing - no candidates possible
- span->length = n;
- pagemap_.set(span->start + n - 1, span);
+ Span* leftover = Split(span, n);
+ PrependToFreeList(leftover);
}
+
+ span->location = Span::IN_USE;
+ span->sizeclass = sc;
+ Event(span, 'A', n);
+
+ // Cache sizeclass info eagerly. Locking is not necessary.
+ // (Instead of being eager, we could just replace any stale info
+ // about this span, but that seems to be no better in practice.)
+ CacheSizeClass(span->start, sc);
+
+ if (sc != kLargeSizeClass) {
+ for (Length i = 1; i < n; i++) {
+ pagemap_.set(span->start + i, span);
+ CacheSizeClass(span->start + i, sc);
+ }
+ }
+
ASSERT(Check());
return span;
}
@@ -318,18 +351,6 @@ Length PageHeap::ReleaseAtLeastNPages(Length num_pages) {
return released_pages;
}
-void PageHeap::RegisterSizeClass(Span* span, size_t sc) {
- // Associate span object with all interior pages as well
- ASSERT(span->location == Span::IN_USE);
- ASSERT(GetDescriptor(span->start) == span);
- ASSERT(GetDescriptor(span->start+span->length-1) == span);
- Event(span, 'C', sc);
- span->sizeclass = sc;
- for (Length i = 1; i < span->length-1; i++) {
- pagemap_.set(span->start+i, span);
- }
-}
-
static double MB(uint64_t bytes) {
return bytes / 1048576.0;
}
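
The arithmetic in AlignedAllocationSize() guarantees that any span of
the returned length contains an aligned n-page run: the first
align-aligned page can start at most align/kPageSize - 1 pages in. A
standalone sketch of the same calculation, assuming the default 4 KiB
tcmalloc page (kPageShift = 12):

{{{
#include <cassert>
#include <cstddef>

typedef size_t Length;
static const size_t kPageShift = 12;              // assumption: 4 KiB pages
static const size_t kPageSize = 1 << kPageShift;

static Length Pages(size_t bytes) {               // round up to whole pages
  return (bytes + kPageSize - 1) >> kPageShift;
}

// Mirrors AlignedAllocationSize() above.
static Length AlignedAllocationSize(Length n, size_t align) {
  assert(align >= kPageSize);
  return n + Pages(align - kPageSize);
}

int main() {
  // 4 pages aligned to 64 KiB: the alignment spans 16 pages, so an
  // aligned start lies at most 15 pages in; request 4 + 15 = 19 pages.
  assert(AlignedAllocationSize(4, 64 * 1024) == 19);
  // Page-aligned requests need no slack.
  assert(AlignedAllocationSize(4, kPageSize) == 4);
  return 0;
}
}}}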
diff --git a/src/page_heap.h b/src/page_heap.h
index 74030d2..de36266 100644
--- a/src/page_heap.h
+++ b/src/page_heap.h
@@ -93,21 +93,49 @@ class PERFTOOLS_DLL_DECL PageHeap {
public:
PageHeap();
- // Allocate a run of "n" pages. Returns zero if out of memory.
- // Caller should not pass "n == 0" -- instead, n should have
- // been rounded up already.
- Span* New(Length n);
+ // Allocate a run of "n" pages. Returns NULL if out of memory.
+ // Caller should not pass "n == 0" -- instead, n should have been
+ // rounded up already. The span will be used for allocating objects
+ // with the specified sizeclass sc (sc must be zero for large
+ // objects). The first page of the span will be aligned to the value
+ // specified by align, which must be a power of two.
+ Span* New(Length n, size_t sc, size_t align);
// Delete the span "[p, p+n-1]".
// REQUIRES: span was returned by earlier call to New() and
// has not yet been deleted.
void Delete(Span* span);
- // Mark an allocated span as being used for small objects of the
- // specified size-class.
- // REQUIRES: span was returned by an earlier call to New()
- // and has not yet been deleted.
- void RegisterSizeClass(Span* span, size_t sc);
+ // Gets either the size class of addr, if it is a small object, or its span.
+ // Return:
+ // if addr is invalid:
+ // leave *out_sc and *out_span unchanged and return false;
+ // if addr is valid and has a small size class:
+ // *out_sc = the size class
+ //     *out_span = <undefined>
+ // return true
+ // if addr is valid and has a large size class:
+ // *out_sc = kLargeSizeClass
+ // *out_span = the span pointer
+ // return true
+ bool GetSizeClassOrSpan(void* addr, size_t* out_sc, Span** out_span) {
+ const PageID p = reinterpret_cast<uintptr_t>(addr) >> kPageShift;
+ size_t cl = GetSizeClassIfCached(p);
+ Span* span = NULL;
+
+ if (cl != kLargeSizeClass) {
+ ASSERT(cl == GetDescriptor(p)->sizeclass);
+ } else {
+ span = GetDescriptor(p);
+ if (!span) {
+ return false;
+ }
+ cl = span->sizeclass;
+ }
+ *out_span = span;
+ *out_sc = cl;
+ return true;
+ }
// Split an allocated span into two spans: one of length "n" pages
// followed by another span of length "span->length - n" pages.
@@ -115,14 +143,29 @@ class PERFTOOLS_DLL_DECL PageHeap {
// Returns a pointer to the second span.
//
// REQUIRES: "0 < n < span->length"
- // REQUIRES: span->location == IN_USE
- // REQUIRES: span->sizeclass == 0
+ // REQUIRES: either the span is free, or span->sizeclass == 0
Span* Split(Span* span, Length n);
// Return the descriptor for the specified page. Returns NULL if
// this PageID was not allocated previously.
inline Span* GetDescriptor(PageID p) const {
- return reinterpret_cast<Span*>(pagemap_.get(p));
+ Span* ret = reinterpret_cast<Span*>(pagemap_.get(p));
+#ifndef NDEBUG
+ if (ret != NULL && ret->location == Span::IN_USE) {
+ size_t cl = GetSizeClassIfCached(p);
+ // Three cases:
+ // - The object is not cached
+ // - The object is cached correctly
+ // - It is a large object and we're not looking at the first
+ // page. This happens in coalescing.
+ ASSERT(cl == kLargeSizeClass || cl == ret->sizeclass ||
+ (ret->start != p && ret->sizeclass == kLargeSizeClass));
+ // If the object is sampled, it must be kLargeSizeClass
+ ASSERT(ret->sizeclass == kLargeSizeClass || !ret->sample);
+ }
+#endif
+
+ return ret;
}
// Dump state to stderr
@@ -223,7 +266,7 @@ class PERFTOOLS_DLL_DECL PageHeap {
// length exactly "n" and mark it as non-free so it can be returned
// to the client. After all that, decrease free_pages_ by n and
// return span.
- Span* Carve(Span* span, Length n);
+ Span* Carve(Span* span, Length n, size_t sc, size_t align);
void RecordSpan(Span* span) {
pagemap_.set(span->start, span);
@@ -234,7 +277,7 @@ class PERFTOOLS_DLL_DECL PageHeap {
// Allocate a large span of length == n. If successful, returns a
// span of exactly the specified length. Else, returns NULL.
- Span* AllocLarge(Length n);
+ Span* AllocLarge(Length n, size_t sc, size_t align);
// Coalesce span with neighboring spans if possible, prepend to
// appropriate free list, and adjust stats.
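
Every caller of GetSizeClassOrSpan() follows the same shape: bail out
when it returns false, then branch on kLargeSizeClass. A condensed
sketch of that contract (the three Handle*() routines are hypothetical):

{{{
#include "common.h"     // kLargeSizeClass
#include "page_heap.h"  // PageHeap, Span

void Dispatch(tcmalloc::PageHeap* heap, void* ptr) {
  size_t cl;
  tcmalloc::Span* span;
  if (!heap->GetSizeClassOrSpan(ptr, &cl, &span)) {
    HandleForeignPointer(ptr);   // not allocated by tcmalloc
  } else if (cl != kLargeSizeClass) {
    HandleSmallObject(ptr, cl);  // note: span is undefined in this branch
  } else {
    HandleLargeSpan(span);       // span is valid for large objects
  }
}
}}}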
diff --git a/src/pprof b/src/pprof
index d70ee30..8aff380 100755
--- a/src/pprof
+++ b/src/pprof
@@ -106,6 +106,12 @@ my $FILTEREDPROFILE_PAGE = "/pprof/filteredprofile(?:\\?.*)?";
my $SYMBOL_PAGE = "/pprof/symbol"; # must support symbol lookup via POST
my $PROGRAM_NAME_PAGE = "/pprof/cmdline";
+# These are the web pages that can be named on the command line.
+# All the alternatives must begin with /.
+my $PROFILES = "($HEAP_PAGE|$PROFILE_PAGE|$PMUPROFILE_PAGE|" .
+ "$GROWTH_PAGE|$CONTENTION_PAGE|$WALL_PAGE|" .
+ "$FILTEREDPROFILE_PAGE)";
+
# default binary name
my $UNKNOWN_BINARY = "(unknown)";
@@ -718,10 +724,8 @@ sub RunWeb {
"firefox",
);
foreach my $b (@alt) {
- if (-f $b) {
- if (system($b, $fname) == 0) {
- return;
- }
+ if (system($b, $fname) == 0) {
+ return;
}
}
@@ -2704,32 +2708,44 @@ sub CheckSymbolPage {
sub IsProfileURL {
my $profile_name = shift;
- my ($host, $port, $prefix, $path) = ParseProfileURL($profile_name);
- return defined($host) and defined($port) and defined($path);
+ if (-f $profile_name) {
+ printf STDERR "Using local file $profile_name.\n";
+ return 0;
+ }
+ return 1;
}
sub ParseProfileURL {
my $profile_name = shift;
- if (defined($profile_name) &&
- $profile_name =~ m,^(http://|)([^/:]+):(\d+)(|\@\d+)(|/|(.*?)($PROFILE_PAGE|$PMUPROFILE_PAGE|$HEAP_PAGE|$GROWTH_PAGE|$CONTENTION_PAGE|$WALL_PAGE|$FILTEREDPROFILE_PAGE))$,o) {
- # $7 is $PROFILE_PAGE/$HEAP_PAGE/etc. $5 is *everything* after
- # the hostname, as long as that everything is the empty string,
- # a slash, or something ending in $PROFILE_PAGE/$HEAP_PAGE/etc.
- # So "$7 || $5" is $PROFILE_PAGE/etc if there, or else it's "/" or "".
- return ($2, $3, $6, $7 || $5);
+
+ if (!defined($profile_name) || $profile_name eq "") {
+ return ();
}
- return ();
+
+ # Split profile URL - matches all non-empty strings, so no test.
+ $profile_name =~ m,^(https?://)?([^/]+)(.*?)(/|$PROFILES)?$,;
+
+ my $proto = $1 || "http://";
+ my $hostport = $2;
+ my $prefix = $3;
+ my $profile = $4 || "/";
+
+ my $host = $hostport;
+ $host =~ s/:.*//;
+
+ my $baseurl = "$proto$hostport$prefix";
+ return ($host, $baseurl, $profile);
}
# We fetch symbols from the first profile argument.
sub SymbolPageURL {
- my ($host, $port, $prefix, $path) = ParseProfileURL($main::pfile_args[0]);
- return "http://$host:$port$prefix$SYMBOL_PAGE";
+ my ($host, $baseURL, $path) = ParseProfileURL($main::pfile_args[0]);
+ return "$baseURL$SYMBOL_PAGE";
}
sub FetchProgramName() {
- my ($host, $port, $prefix, $path) = ParseProfileURL($main::pfile_args[0]);
- my $url = "http://$host:$port$prefix$PROGRAM_NAME_PAGE";
+ my ($host, $baseURL, $path) = ParseProfileURL($main::pfile_args[0]);
+ my $url = "$baseURL$PROGRAM_NAME_PAGE";
my $command_line = "$URL_FETCHER '$url'";
open(CMDLINE, "$command_line |") or error($command_line);
my $cmdline = <CMDLINE>;
@@ -2880,10 +2896,10 @@ sub BaseName {
sub MakeProfileBaseName {
my ($binary_name, $profile_name) = @_;
- my ($host, $port, $prefix, $path) = ParseProfileURL($profile_name);
+ my ($host, $baseURL, $path) = ParseProfileURL($profile_name);
my $binary_shortname = BaseName($binary_name);
- return sprintf("%s.%s.%s-port%s",
- $binary_shortname, $main::op_time, $host, $port);
+ return sprintf("%s.%s.%s",
+ $binary_shortname, $main::op_time, $host);
}
sub FetchDynamicProfile {
@@ -2895,7 +2911,7 @@ sub FetchDynamicProfile {
if (!IsProfileURL($profile_name)) {
return $profile_name;
} else {
- my ($host, $port, $prefix, $path) = ParseProfileURL($profile_name);
+ my ($host, $baseURL, $path) = ParseProfileURL($profile_name);
if ($path eq "" || $path eq "/") {
# Missing type specifier defaults to cpu-profile
$path = $PROFILE_PAGE;
@@ -2903,33 +2919,26 @@ sub FetchDynamicProfile {
my $profile_file = MakeProfileBaseName($binary_name, $profile_name);
- my $url;
+ my $url = "$baseURL$path";
my $fetch_timeout = undef;
- if (($path =~ m/$PROFILE_PAGE/) || ($path =~ m/$PMUPROFILE_PAGE/)) {
- if ($path =~ m/$PROFILE_PAGE/) {
- $url = sprintf("http://$host:$port$prefix$path?seconds=%d",
- $main::opt_seconds);
+ if ($path =~ m/$PROFILE_PAGE|$PMUPROFILE_PAGE/) {
+ if ($path =~ m/[?]/) {
+ $url .= "&";
} else {
- if ($profile_name =~ m/[?]/) {
- $profile_name .= "&"
- } else {
- $profile_name .= "?"
- }
- $url = sprintf("http://$profile_name" . "seconds=%d",
- $main::opt_seconds);
+ $url .= "?";
}
+ $url .= sprintf("seconds=%d", $main::opt_seconds);
$fetch_timeout = $main::opt_seconds * 1.01 + 60;
} else {
# For non-CPU profiles, we add a type-extension to
# the target profile file name.
my $suffix = $path;
$suffix =~ s,/,.,g;
- $profile_file .= "$suffix";
- $url = "http://$host:$port$prefix$path";
+ $profile_file .= $suffix;
}
my $profile_dir = $ENV{"PPROF_TMPDIR"} || ($ENV{HOME} . "/pprof");
- if (!(-d $profile_dir)) {
+ if (! -d $profile_dir) {
mkdir($profile_dir)
|| die("Unable to create profile directory $profile_dir: $!\n");
}
@@ -2942,13 +2951,13 @@ sub FetchDynamicProfile {
my $fetcher = AddFetchTimeout($URL_FETCHER, $fetch_timeout);
my $cmd = "$fetcher '$url' > '$tmp_profile'";
- if (($path =~ m/$PROFILE_PAGE/) || ($path =~ m/$PMUPROFILE_PAGE/)){
+ if ($path =~ m/$PROFILE_PAGE|$PMUPROFILE_PAGE/){
print STDERR "Gathering CPU profile from $url for $main::opt_seconds seconds to\n ${real_profile}\n";
if ($encourage_patience) {
print STDERR "Be patient...\n";
}
} else {
- print STDERR "Fetching $path profile from $host:$port to\n ${real_profile}\n";
+ print STDERR "Fetching $path profile from $url to\n ${real_profile}\n";
}
(system($cmd) == 0) || error("Failed to get profile: $cmd: $!\n");
diff --git a/src/span.h b/src/span.h
index ab9a796..b3483ca 100644
--- a/src/span.h
+++ b/src/span.h
@@ -60,6 +60,10 @@ struct Span {
int value[64];
#endif
+ void* start_ptr() {
+ return reinterpret_cast<void*>(start << kPageShift);
+ }
+
// What freelist the span is on: IN_USE if on none, or normal or returned
enum { IN_USE, ON_NORMAL_FREELIST, ON_RETURNED_FREELIST };
};
diff --git a/src/stacktrace_win32-inl.h b/src/stacktrace_win32-inl.h
index 892cd7c..bbd4c43 100644
--- a/src/stacktrace_win32-inl.h
+++ b/src/stacktrace_win32-inl.h
@@ -49,6 +49,11 @@
// This code is inspired by a patch from David Vitek:
// http://code.google.com/p/google-perftools/issues/detail?id=83
+#ifndef BASE_STACKTRACE_WIN32_INL_H_
+#define BASE_STACKTRACE_WIN32_INL_H_
+// Note: this file is included into stacktrace.cc more than once.
+// Anything that should only be defined once should be here:
+
#include "config.h"
#include <windows.h> // for GetProcAddress and GetModuleHandle
#include <assert.h>
@@ -82,3 +87,5 @@ PERFTOOLS_DLL_DECL int GetStackFrames(void** /* pcs */,
assert(0 == "Not yet implemented");
return 0;
}
+
+#endif // BASE_STACKTRACE_WIN32_INL_H_
diff --git a/src/tcmalloc.cc b/src/tcmalloc.cc
index 122e18f..011fc91 100644
--- a/src/tcmalloc.cc
+++ b/src/tcmalloc.cc
@@ -798,22 +798,25 @@ static TCMallocGuard module_enter_exit_hook;
// Helpers for the exported routines below
//-------------------------------------------------------------------
-static inline bool CheckCachedSizeClass(void *ptr) {
- PageID p = reinterpret_cast<uintptr_t>(ptr) >> kPageShift;
- size_t cached_value = Static::pageheap()->GetSizeClassIfCached(p);
- return cached_value == 0 ||
- cached_value == Static::pageheap()->GetDescriptor(p)->sizeclass;
-}
-
static inline void* CheckedMallocResult(void *result) {
- ASSERT(result == NULL || CheckCachedSizeClass(result));
+ Span* fetched_span;
+ size_t cl;
+
+ if (result != NULL) {
+ ASSERT(Static::pageheap()->GetSizeClassOrSpan(result, &cl, &fetched_span));
+ }
+
return result;
}
static inline void* SpanToMallocResult(Span *span) {
- Static::pageheap()->CacheSizeClass(span->start, 0);
- return
- CheckedMallocResult(reinterpret_cast<void*>(span->start << kPageShift));
+ Span* fetched_span = NULL;
+ size_t cl = 0;
+ ASSERT(Static::pageheap()->GetSizeClassOrSpan(span->start_ptr(),
+ &cl, &fetched_span));
+ ASSERT(cl == kLargeSizeClass);
+ ASSERT(span == fetched_span);
+ return span->start_ptr();
}
static void* DoSampledAllocation(size_t size) {
@@ -824,7 +827,8 @@ static void* DoSampledAllocation(size_t size) {
SpinLockHolder h(Static::pageheap_lock());
// Allocate span
- Span *span = Static::pageheap()->New(tcmalloc::pages(size == 0 ? 1 : size));
+ Span *span = Static::pageheap()->New(tcmalloc::pages(size == 0 ? 1 : size),
+ kLargeSizeClass, kPageSize);
if (span == NULL) {
return NULL;
}
@@ -915,7 +919,7 @@ inline void* do_malloc_pages(ThreadCache* heap, size_t size) {
report_large = should_report_large(num_pages);
} else {
SpinLockHolder h(Static::pageheap_lock());
- Span* span = Static::pageheap()->New(num_pages);
+ Span* span = Static::pageheap()->New(num_pages, kLargeSizeClass, kPageSize);
result = (span == NULL ? NULL : SpanToMallocResult(span));
report_large = should_report_large(num_pages);
}
@@ -971,28 +975,22 @@ static inline ThreadCache* GetCacheIfPresent() {
inline void do_free_with_callback(void* ptr, void (*invalid_free_fn)(void*)) {
if (ptr == NULL) return;
ASSERT(Static::pageheap() != NULL); // Should not call free() before malloc()
- const PageID p = reinterpret_cast<uintptr_t>(ptr) >> kPageShift;
- Span* span = NULL;
- size_t cl = Static::pageheap()->GetSizeClassIfCached(p);
+ Span* span;
+ size_t cl;
- if (cl == 0) {
- span = Static::pageheap()->GetDescriptor(p);
- if (!span) {
- // span can be NULL because the pointer passed in is invalid
- // (not something returned by malloc or friends), or because the
- // pointer was allocated with some other allocator besides
- // tcmalloc. The latter can happen if tcmalloc is linked in via
- // a dynamic library, but is not listed last on the link line.
- // In that case, libraries after it on the link line will
- // allocate with libc malloc, but free with tcmalloc's free.
- (*invalid_free_fn)(ptr); // Decide how to handle the bad free request
- return;
- }
- cl = span->sizeclass;
- Static::pageheap()->CacheSizeClass(p, cl);
+ if (!Static::pageheap()->GetSizeClassOrSpan(ptr, &cl, &span)) {
+ // result can be false because the pointer passed in is invalid
+ // (not something returned by malloc or friends), or because the
+ // pointer was allocated with some other allocator besides
+ // tcmalloc. The latter can happen if tcmalloc is linked in via
+ // a dynamic library, but is not listed last on the link line.
+ // In that case, libraries after it on the link line will
+ // allocate with libc malloc, but free with tcmalloc's free.
+ (*invalid_free_fn)(ptr); // Decide how to handle the bad free request
+ return;
}
- if (cl != 0) {
- ASSERT(!Static::pageheap()->GetDescriptor(p)->sample);
+
+ if (cl != kLargeSizeClass) {
ThreadCache* heap = GetCacheIfPresent();
if (heap != NULL) {
heap->Deallocate(ptr, cl);
@@ -1003,8 +1001,7 @@ inline void do_free_with_callback(void* ptr, void (*invalid_free_fn)(void*)) {
}
} else {
SpinLockHolder h(Static::pageheap_lock());
- ASSERT(reinterpret_cast<uintptr_t>(ptr) % kPageSize == 0);
- ASSERT(span != NULL && span->start == p);
+ ASSERT(span != NULL && ptr == span->start_ptr());
if (span->sample) {
tcmalloc::DLL_Remove(span);
Static::stacktrace_allocator()->Delete(
@@ -1024,20 +1021,17 @@ inline size_t GetSizeWithCallback(void* ptr,
size_t (*invalid_getsize_fn)(void*)) {
if (ptr == NULL)
return 0;
- const PageID p = reinterpret_cast<uintptr_t>(ptr) >> kPageShift;
- size_t cl = Static::pageheap()->GetSizeClassIfCached(p);
- if (cl != 0) {
+
+ Span* span;
+ size_t cl;
+ if (!Static::pageheap()->GetSizeClassOrSpan(ptr, &cl, &span)) {
+ return (*invalid_getsize_fn)(ptr);
+ }
+
+ if (cl != kLargeSizeClass) {
return Static::sizemap()->ByteSizeForClass(cl);
} else {
- Span *span = Static::pageheap()->GetDescriptor(p);
- if (span == NULL) { // means we do not own this memory
- return (*invalid_getsize_fn)(ptr);
- } else if (span->sizeclass != 0) {
- Static::pageheap()->CacheSizeClass(p, span->sizeclass);
- return Static::sizemap()->ByteSizeForClass(span->sizeclass);
- } else {
- return span->length << kPageShift;
- }
+ return span->length << kPageShift;
}
}
@@ -1132,39 +1126,10 @@ void* do_memalign(size_t align, size_t size) {
// We will allocate directly from the page heap
SpinLockHolder h(Static::pageheap_lock());
- if (align <= kPageSize) {
- // Any page-level allocation will be fine
- // TODO: We could put the rest of this page in the appropriate
- // TODO: cache but it does not seem worth it.
- Span* span = Static::pageheap()->New(tcmalloc::pages(size));
- return span == NULL ? NULL : SpanToMallocResult(span);
- }
-
- // Allocate extra pages and carve off an aligned portion
- const Length alloc = tcmalloc::pages(size + align);
- Span* span = Static::pageheap()->New(alloc);
- if (span == NULL) return NULL;
-
- // Skip starting portion so that we end up aligned
- Length skip = 0;
- while ((((span->start+skip) << kPageShift) & (align - 1)) != 0) {
- skip++;
- }
- ASSERT(skip < alloc);
- if (skip > 0) {
- Span* rest = Static::pageheap()->Split(span, skip);
- Static::pageheap()->Delete(span);
- span = rest;
- }
-
- // Skip trailing portion that we do not need to return
- const Length needed = tcmalloc::pages(size);
- ASSERT(span->length >= needed);
- if (span->length > needed) {
- Span* trailer = Static::pageheap()->Split(span, needed);
- Static::pageheap()->Delete(trailer);
- }
- return SpanToMallocResult(span);
+ // Any page-level allocation will be fine
+ Span* span = Static::pageheap()->New(tcmalloc::pages(size),
+ kLargeSizeClass, align);
+ return span == NULL ? NULL : SpanToMallocResult(span);
}
// Helpers for use by exported routines below:
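
With alignment handled inside PageHeap::New(), the old
over-allocate/split/trim dance in do_memalign() collapses into a single
call. Callers are unaffected; for example:

{{{
#include <google/tcmalloc.h>

void* SixtyFourKAligned() {
  // A 1 MiB block aligned to 64 KiB; with this patch it is served by a
  // single PageHeap::New(pages, kLargeSizeClass, 64 * 1024) call.
  return tc_memalign(64 * 1024, 1 << 20);
}
}}}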
diff --git a/src/tests/page_heap_test.cc b/src/tests/page_heap_test.cc
index 9120b78..fd444da 100644
--- a/src/tests/page_heap_test.cc
+++ b/src/tests/page_heap_test.cc
@@ -26,7 +26,7 @@ static void TestPageHeap_Stats() {
CheckStats(ph, 0, 0, 0);
// Allocate a span 's1'
- tcmalloc::Span* s1 = ph->New(256);
+ tcmalloc::Span* s1 = ph->New(256, kLargeSizeClass, kPageSize);
CheckStats(ph, 256, 0, 0);
// Split span 's1' into 's1', 's2'. Delete 's2'
diff --git a/src/windows/addr2line-pdb.c b/src/windows/addr2line-pdb.c
index 97b614b..5c65a03 100644
--- a/src/windows/addr2line-pdb.c
+++ b/src/windows/addr2line-pdb.c
@@ -48,6 +48,12 @@
#define SEARCH_CAP (1024*1024)
#define WEBSYM "SRV*c:\\websymbols*http://msdl.microsoft.com/download/symbols"
+void usage() {
+ fprintf(stderr, "usage: "
+ "addr2line-pdb [-f|--functions] [-C|--demangle] [-e filename]\n");
+ fprintf(stderr, "(Then list the hex addresses on stdin, one per line)\n");
+}
+
int main(int argc, char *argv[]) {
DWORD error;
HANDLE process;
@@ -74,10 +80,11 @@ int main(int argc, char *argv[]) {
}
filename = argv[i+1];
i++; /* to skip over filename too */
+ } else if (strcmp(argv[i], "--help") == 0) {
+ usage();
+ exit(0);
} else {
- fprintf(stderr, "usage: "
- "addr2line-pdb [-f|--functions] [-C|--demangle] [-e filename]\n");
- fprintf(stderr, "(Then list the hex addresses on stdin, one per line)\n");
+ usage();
exit(1);
}
}
diff --git a/src/windows/nm-pdb.c b/src/windows/nm-pdb.c
index 726d345..9beb21d 100644
--- a/src/windows/nm-pdb.c
+++ b/src/windows/nm-pdb.c
@@ -180,6 +180,10 @@ static void ShowSymbolInfo(HANDLE process, ULONG64 module_base) {
#endif
}
+void usage() {
+ fprintf(stderr, "usage: nm-pdb [-C|--demangle] \n");
+}
+
int main(int argc, char *argv[]) {
DWORD error;
HANDLE process;
@@ -195,12 +199,15 @@ int main(int argc, char *argv[]) {
for (i = 1; i < argc; i++) {
if (strcmp(argv[i], "--demangle") == 0 || strcmp(argv[i], "-C") == 0) {
symopts |= SYMOPT_UNDNAME;
+ } else if (strcmp(argv[i], "--help") == 0) {
+ usage();
+ exit(0);
} else {
break;
}
}
if (i != argc - 1) {
- fprintf(stderr, "usage: nm-pdb [-C|--demangle] \n");
+ usage();
exit(1);
}
filename = argv[i];
diff --git a/src/windows/port.cc b/src/windows/port.cc
index bf3b106..9a9da80 100644
--- a/src/windows/port.cc
+++ b/src/windows/port.cc
@@ -100,10 +100,14 @@ bool CheckIfKernelSupportsTLS() {
// binary (it also doesn't run if the thread is terminated via
// TerminateThread, which if we're lucky this routine does).
-// This makes the linker create the TLS directory if it's not already
-// there (that is, even if __declspec(thead) is not used).
+// Force a reference to _tls_used to make the linker create the TLS directory
+// if it's not already there (that is, even if __declspec(thread) is not used).
+// Force a reference to p_thread_callback_tcmalloc and p_process_term_tcmalloc
+// to prevent whole program optimization from discarding the variables.
#ifdef _MSC_VER
#pragma comment(linker, "/INCLUDE:__tls_used")
+#pragma comment(linker, "/INCLUDE:_p_thread_callback_tcmalloc")
+#pragma comment(linker, "/INCLUDE:_p_process_term_tcmalloc")
#endif
// When destr_fn eventually runs, it's supposed to take as its
@@ -142,14 +146,18 @@ static void NTAPI on_tls_callback(HINSTANCE h, DWORD dwReason, PVOID pv) {
#ifdef _MSC_VER
+// extern "C" suppresses C++ name mangling so we know the symbol names
+// for the linker /INCLUDE:symbol pragmas above.
+extern "C" {
// This tells the linker to run these functions.
#pragma data_seg(push, old_seg)
#pragma data_seg(".CRT$XLB")
-static void (NTAPI *p_thread_callback)(HINSTANCE h, DWORD dwReason, PVOID pv)
- = on_tls_callback;
+void (NTAPI *p_thread_callback_tcmalloc)(
+ HINSTANCE h, DWORD dwReason, PVOID pv) = on_tls_callback;
#pragma data_seg(".CRT$XTU")
-static int (*p_process_term)(void) = on_process_term;
+int (*p_process_term_tcmalloc)(void) = on_process_term;
#pragma data_seg(pop, old_seg)
+} // extern "C"
#else // #ifdef _MSC_VER [probably msys/mingw]
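
The combination used above -- extern "C" function-pointer variables
placed in .CRT$XL*/.CRT$XT* sections plus /INCLUDE pragmas -- is the
general MSVC recipe for registering TLS and termination callbacks that
survive whole-program optimization. A stripped-down sketch of the same
pattern (assuming 32-bit MSVC, where C symbols gain a leading
underscore in the /INCLUDE name):

{{{
#include <windows.h>

static void NTAPI my_tls_callback(HINSTANCE h, DWORD reason, PVOID pv) {
  if (reason == DLL_THREAD_DETACH) {
    // per-thread cleanup goes here
  }
}

#ifdef _MSC_VER
#pragma comment(linker, "/INCLUDE:__tls_used")           // force TLS directory
#pragma comment(linker, "/INCLUDE:_my_tls_callback_ptr") // keep the pointer
extern "C" {
#pragma data_seg(push, old_seg)
#pragma data_seg(".CRT$XLB")   // slot scanned for TLS callbacks
void (NTAPI *my_tls_callback_ptr)(HINSTANCE, DWORD, PVOID) = my_tls_callback;
#pragma data_seg(pop, old_seg)
}
#endif
}}}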