328 Commits

Author SHA1 Message Date
Aliaksey Kandratsenka
1035d5c18f start building malloc_extension_c_test even with static linking
Comment in Makefile.am stating that it doesn't work with static
linking is not accurate anymore.
2014-12-21 19:52:34 -08:00
Aliaksey Kandratsenka
d570a6391c unbreak malloc_extension_c_test on clang
Looks like even force_malloc trick was not enough to force clang to
actually call malloc. I'm now calling tc_malloc directly to prevent
that smartness.
2014-12-21 19:33:25 -08:00
Aliaksey Kandratsenka
4ace8dbbe2 added subdir-objects automake options
This is suggested by automake itself regarding future-compat.
2014-12-21 18:49:47 -08:00
Aliaksey Kandratsenka
f72e37c3f9 fixed C++ comment warning in malloc_extension_c.h from C compiler 2014-12-21 18:27:03 -08:00
Aliaksey Kandratsenka
f94ff0cc09 made AtomicOps_x86CPUFeatureStruct hidden
So that access to has_sse2 is faster under -fPIC.
2014-12-20 21:20:43 -08:00
Aliaksey Kandratsenka
987a724c23 dropped atomicops workaround for irrelevant Opteron locking bug
It's not cheap at all when done in this way (i.e. without runtime
patching) and apparently useless.

It looks like Linux kernel never got this workaround at all. See
bugzilla ticket: https://bugzilla.kernel.org/show_bug.cgi?id=11305

And I see no traces of this workaround in glibc either.

On the other hand, opensolaris folks apparently still have it (or
something similar, based on comments on linux bugzilla) in their code:
32842aabdc/usr/src/uts/i86pc/os/mp_startup.c (L1136)

And affected CPUs (if any) are from year 2008 (that's 6 years now).

Plus even if somebody still uses those cpus (which is unlikely), they
won't have working kernel and glibc anyways.
2014-12-20 21:20:43 -08:00
Aliaksey Kandratsenka
7da5bd014d enabled aggressive decommit by default
TCMALLOC_AGGRESSIVE_DECOMMIT=f is one way to disable it and
SetNumericProperty is another.
2014-12-20 21:18:07 -08:00
Aliaksey Kandratsenka
51b0ad55b3 added basic unit test for singular malloc hooks 2014-12-07 17:46:04 -08:00
Aliaksey Kandratsenka
bce72dda07 inform compiler that tcmalloc allocation sampling is unlikely
Now compiler generates slightly better code which produces jump-less
code for common case of not sampling allocations.
2014-12-07 17:46:04 -08:00
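The branch hint described in bce72dda07 can be illustrated with a
minimal sketch. It assumes a GCC-compatible compiler that provides
__builtin_expect. SKETCH_UNLIKELY, SamplerSketch and RecordAllocation
are hypothetical names made up for this illustration; they are not the
identifiers used in tcmalloc.

```cpp
// Hypothetical macro: tells the compiler the condition is rarely true.
#if defined(__GNUC__)
#define SKETCH_UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define SKETCH_UNLIKELY(x) (x)
#endif

// Hypothetical, heavily simplified sampler state.
struct SamplerSketch {
  long bytes_until_sample;  // bytes left before the next sample is taken
  long period;              // value the counter is reset to after a sample
  int samples_taken;        // number of sampled allocations so far
};

// Returns true if this allocation was picked for sampling.
inline bool RecordAllocation(SamplerSketch* s, long size) {
  s->bytes_until_sample -= size;
  if (SKETCH_UNLIKELY(s->bytes_until_sample < 0)) {
    // Rare slow path: the hint lets the compiler move it out of line.
    s->bytes_until_sample = s->period;
    ++s->samples_taken;
    return true;
  }
  // Common case: no sample, ideally reached by straight fall-through.
  return false;
}
```

Whether the common case really ends up without a taken jump depends on
the compiler and optimization level; the hint only states intent.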
Aliaksey Kandratsenka
4f051fddcd eliminated CheckIfKernelSupportsTLS
We don't care about pre-2.6.0 kernels anymore. So we can assume that
if compile time check worked, then at runtime it'll work.
2014-12-07 17:46:04 -08:00
Aliaksey Kandratsenka
81291ac399 set elf visibility to hidden for malloc hooks
To speed up access to them under -fPIC.
2014-12-07 17:46:04 -08:00
Aliaksey Kandratsenka
105c004d0c introduced ATTRIBUTE_VISIBILITY_HIDDEN
So that we can disable elf symbol interposition for certain
perf-sensitive symbols.
2014-12-07 17:46:04 -08:00
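A minimal sketch of what a macro such as the one introduced in
105c004d0c might look like, assuming a GCC-compatible compiler. The
names SKETCH_VISIBILITY_HIDDEN, g_hook_count and IncrementHookCount
are hypothetical; the actual macro definition in gperftools may
differ.

```cpp
// Hypothetical stand-in for a hidden-visibility attribute macro.
#if defined(__GNUC__)
#define SKETCH_VISIBILITY_HIDDEN __attribute__((visibility("hidden")))
#else
#define SKETCH_VISIBILITY_HIDDEN
#endif

// A hidden symbol is not exported from the shared object, so it cannot
// be interposed by another library. Under -fPIC the compiler may then
// access it directly instead of going through the GOT or PLT.
SKETCH_VISIBILITY_HIDDEN int g_hook_count = 0;

SKETCH_VISIBILITY_HIDDEN int IncrementHookCount() {
  return ++g_hook_count;
}
```

The trade-off is that hidden symbols are no longer reachable from
outside the library, so this only suits internal, perf-sensitive
state.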
Aliaksey Kandratsenka
6a6c49e1f5 replaced separate singular malloc hooks with faster HookList
Specifically, we can now check in one place if hooks are set at all,
instead of two places. Which makes fast path shorter.
2014-12-07 17:46:04 -08:00
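The "check in one place" idea from 6a6c49e1f5 can be sketched roughly
as below. This is a simplified illustration, not the gperftools
HookList: HookListSketch, NewHook and InvokeNewHooks are hypothetical
names, removal is omitted, and callers of Add are assumed to serialize
mutations themselves (for example under a lock).

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical hook signature for this sketch.
using NewHook = void (*)(const void* ptr, std::size_t size);

class HookListSketch {
 public:
  enum { kMaxHooks = 7 };

  HookListSketch() : end_(0) {
    for (auto& h : hooks_) h.store(nullptr, std::memory_order_relaxed);
  }

  // Appends a hook. Assumes the caller serializes all mutations.
  bool Add(NewHook hook) {
    int end = end_.load(std::memory_order_relaxed);
    if (hook == nullptr || end == kMaxHooks) return false;
    hooks_[end].store(hook, std::memory_order_relaxed);
    end_.store(end + 1, std::memory_order_release);
    return true;
  }

  // The single emptiness check used by the fast path.
  bool empty() const {
    return end_.load(std::memory_order_acquire) == 0;
  }

  // Lockless read-only traversal over the registered hooks.
  void InvokeAll(const void* ptr, std::size_t size) const {
    int end = end_.load(std::memory_order_acquire);
    for (int i = 0; i < end; ++i) {
      NewHook h = hooks_[i].load(std::memory_order_relaxed);
      if (h != nullptr) h(ptr, size);
    }
  }

 private:
  std::atomic<int> end_;
  std::atomic<NewHook> hooks_[kMaxHooks];
};

// Fast path: one check covers every kind of hook kept in the list.
inline void InvokeNewHooks(const HookListSketch& hooks, const void* ptr,
                           std::size_t size) {
  if (!hooks.empty()) hooks.InvokeAll(ptr, size);
}
```

With the deprecated singular hook stored in the same list, the
allocation fast path only needs the single empty() test.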
Aliaksey Kandratsenka
ba0441785b removed extra barriers in malloc hooks mutation methods
Because those are already done under spinlock, and the read-only,
lockless Traverse is already tolerant of slight inconsistencies.
2014-12-07 17:46:04 -08:00
Aliaksey Kandratsenka
890f34c77e introduced support for deprecated singular hooks into HookList
So that we can later drop separate singular hooks.
2014-12-07 17:46:04 -08:00
Aliaksey Kandratsenka
81ed7dff11 returned date of 2.3rc in NEWS back 2014-12-07 13:33:40 -08:00
Aliaksey Kandratsenka
463a619408 bumped version to 2.3 2014-12-07 12:53:35 -08:00
Aliaksey Kandratsenka
76e8138e12 updated NEWS for gperftools 2.3 2014-12-07 12:46:49 -08:00
Raphael Moreira Zinsly
8eb4ed785a Added option to disable libunwind linking
This patch adds a configure option to enable or disable libunwind linking.
The patch also disables libunwind on ppc by default.
2014-11-27 12:51:33 -08:00
Aliaksey Kandratsenka
3b94031d21 compile libunwind unwinder only if __thread is supported
This fixed build on certain OSX that I have access to.
2014-11-27 12:30:36 -08:00
Aliaksey Kandratsenka
3ace468202 issue-658: correctly close socketpair fds when socketpair fails
This applies patch by glider.
2014-11-27 10:45:53 -08:00
Aliaksey Kandratsenka
e7d5e512b0 bumped version to 2.3rc 2014-11-02 20:13:33 -08:00
Aliaksey Kandratsenka
1d44d37851 updated NEWS for gperftools 2.3rc 2014-11-02 19:59:05 -08:00
Aliaksey Kandratsenka
1108d83cf4 implemented cpu-profiling mode that profiles threads separately
Default mode of operation of cpu profiler uses itimer and
SIGPROF. This timer is by definition per-process and no spec defines
which thread is going to receive SIGPROF. And it provides correct
profiles only if we assume that probability of picking threads will be
proportional to cpu time spent by threads.

It is easy to see that recent Linux (at least on common SMP hardware)
doesn't satisfy that assumption. Quite big skews of SIGPROF ticks
between threads are visible. E.g. I could see a division as skewed as
70%/20% instead of 50%/50% for a pair of cpu-hog threads. (And I do
see it become 50/50 with the new mode.)

Fortunately POSIX provides mechanism to track per-thread cpu time via
posix timers facility. And even more fortunately, Linux also provides
mechanism to deliver timer ticks to specific threads.

Interestingly, it looks like FreeBSD also has a very similar facility
and seems to suffer from the same skew. But due to differences in how
threads are identified, I haven't bothered to try to support this
mode on FreeBSD.

This commit implements a new profiling mode where every thread creates
a posix timer which tracks the thread's cpu time. Each thread also
sets up signal delivery to itself on overflows of that timer.

This new mode requires every thread to be registered in cpu
profiler. Existing ProfilerRegisterThread function is used for that.

Because registering threads requires application support (or suitable
LD_PRELOAD-able wrapper for thread creation API), new mode is off by
default. And it has to be manually activated by setting environment
variable CPUPROFILE_PER_THREAD_TIMERS.

The new mode also requires librt symbols to be available. We do not
link to librt because of its dependency on libpthread, which we avoid
due to the perf impact of bringing libpthread into otherwise
single-threaded programs. So librt has to be either already loaded by
the profiled program or LD_PRELOAD-ed.
2014-11-02 18:29:55 -08:00
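The per-thread CPU clock that 1108d83cf4 relies on can be read
directly, as in the minimal sketch below. It only shows the clock
(CLOCK_THREAD_CPUTIME_ID via clock_gettime); creating a posix timer on
that clock and routing its signal to the owning thread, as the commit
describes, is not shown. ThreadCpuNanos and BurnCpu are hypothetical
helper names, and on older glibc clock_gettime may need -lrt.

```cpp
#include <time.h>

// Returns the CPU time consumed so far by the calling thread, in
// nanoseconds, or -1 if the clock is unavailable.
long long ThreadCpuNanos() {
  timespec ts;
  if (clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts) != 0) return -1;
  return static_cast<long long>(ts.tv_sec) * 1000000000LL + ts.tv_nsec;
}

// Spends some CPU time in the calling thread so the clock advances.
unsigned long BurnCpu(unsigned long iterations) {
  volatile unsigned long acc = 0;
  for (unsigned long i = 0; i < iterations; ++i) acc += i;
  return acc;
}
```

Because the clock counts only the calling thread's CPU time, a timer
built on it fires in proportion to that thread's own work, which is
the property the process-wide itimer lacks.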
Aliaksey Kandratsenka
714bd93e42 drop workaround for too old redhat 7
Note that this is _not_ RHEL7 but original redhat 7 from early 2000s.
2014-11-02 18:29:55 -08:00
Aliaksey Kandratsenka
8de46e66fc don't add leaf function twice to profile under libunwind 2014-11-02 18:29:55 -08:00
Aliaksey Kandratsenka
2e5ee04889 pprof: indicate if using remote profile
Missing profile file is a common source of confusion. So a bit more
clarity is useful.
2014-11-02 18:29:55 -08:00
Aliaksey Kandratsenka
6efe96b41c issue-493: correctly detect __ARM_ARCH_6ZK__ for MemoryBarrier
Which should fix issue reported by user pedronavf
2014-11-02 18:29:38 -08:00
Aliaksey Kandratsenka
8e97626378 issue-655: use safe getenv for aggressive decommit mode flag
Because otherwise we risk deadlock due to too early use of getenv on
windows.
2014-11-02 11:28:30 -08:00
Aliaksey Kandratsenka
8c3dc52fcf issue-654: [pprof] handle split text segments
This applies patch by user simonb.

Quoting:

Relocation packing splits a single executable load segment into two.  Before:

  LOAD           0x000000 0x00000000 0x00000000 0x2034d28 0x2034d28 R E 0x1000
  LOAD           0x2035888 0x02036888 0x02036888 0x182d38 0x1a67d0 RW  0x1000

After:
  LOAD           0x000000 0x00000000 0x00000000 0x14648 0x14648 R E 0x1000
  LOAD           0x014648 0x0020c648 0x0020c648 0x1e286e0 0x1e286e0 R E 0x1000
  ...
  LOAD           0x1e3d888 0x02036888 0x02036888 0x182d38 0x1a67d0 RW  0x1000

The .text section is in the second LOAD, and this is not at
offset/address zero.  The result is that this library shows up in
/proc/self/maps as multiple executable entries, for example (note:
this trace is not from the library dissected above, but rather from an
earlier version of it):

  73b0c000-73b21000 r-xp 00000000 b3:19 786460 /data/.../libchrome.2160.0.so
  73b21000-73d12000 ---p 00000000 00:00 0
  73d12000-75a90000 r-xp 00014000 b3:19 786460 /data/.../libchrome.2160.0.so
  75a90000-75c0d000 rw-p 01d91000 b3:19 786460 /data/.../libchrome.2160.0.so

When parsing this, pprof needs to merge the two r-xp entries above
into a single entry, otherwise the addresses it prints are incorrect.

The following fix against 2.2.1 was sufficient to make pprof --text
print the correct output.  Untested with other pprof options.
2014-10-18 16:35:57 -07:00
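The merging step described in 8c3dc52fcf can be sketched as follows.
This is not the pprof patch itself (pprof is a Perl script); it is a
simplified C++ illustration that folds all executable mappings of the
same path into one range. TextRange and MergeExecutableMappings are
hypothetical names, paths containing whitespace are not handled, and
malformed hex fields would make std::stoull throw.

```cpp
#include <cstdint>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Merged executable range of one mapped file (hypothetical type).
struct TextRange {
  std::uint64_t start;
  std::uint64_t end;
  std::uint64_t offset;  // file offset of the lowest mapping
};

// Parses /proc/<pid>/maps-style lines and merges every executable
// mapping of the same path into a single [start, end) range.
std::map<std::string, TextRange> MergeExecutableMappings(
    const std::vector<std::string>& lines) {
  std::map<std::string, TextRange> merged;
  for (const std::string& line : lines) {
    std::istringstream in(line);
    std::string range, perms, offset_hex, dev, inode, path;
    // Anonymous mappings have no path field and are skipped here.
    if (!(in >> range >> perms >> offset_hex >> dev >> inode >> path)) {
      continue;
    }
    if (perms.size() < 3 || perms[2] != 'x') continue;
    std::string::size_type dash = range.find('-');
    if (dash == std::string::npos) continue;
    std::uint64_t start = std::stoull(range.substr(0, dash), nullptr, 16);
    std::uint64_t end = std::stoull(range.substr(dash + 1), nullptr, 16);
    std::uint64_t offset = std::stoull(offset_hex, nullptr, 16);
    auto it = merged.find(path);
    if (it == merged.end()) {
      merged[path] = TextRange{start, end, offset};
      continue;
    }
    if (start < it->second.start) {
      it->second.start = start;
      it->second.offset = offset;
    }
    if (end > it->second.end) it->second.end = end;
  }
  return merged;
}
```

For the four-line trace quoted above, the two r-xp entries collapse
into one range from 73b0c000 to 75a90000 with file offset 0.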
Ricardo M. Correia
44c61ce6c4 Fix parsing /proc/pid/maps dump in CPU profile data file
When trying to use pprof on my machine, the symbols of my program were
not being recognized.

It turned out that pprof, when calculating the offset of the text list
of mapped objects (the last section of the CPU profile data file), was
assuming that the slot size was always 4 bytes, even on 64-bit machines.

This led to ParseLibraries() reading a lot of garbage data at the
beginning of the map, and consequently the regex was failing to match on
the first line of the real (non-garbage) map.
2014-10-11 16:37:29 -07:00
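Why the slot size matters for 44c61ce6c4 can be shown with a small
sketch. It assumes a little-endian profile whose header starts with
the slot values 0 and 3; that layout is an assumption of this example,
and DetectSlotSize and SlotByteOffset are hypothetical helpers, not
pprof functions.

```cpp
#include <cstddef>
#include <cstdint>

// Reads a 32-bit little-endian word.
static std::uint32_t ReadLE32(const unsigned char* p) {
  return static_cast<std::uint32_t>(p[0]) |
         (static_cast<std::uint32_t>(p[1]) << 8) |
         (static_cast<std::uint32_t>(p[2]) << 16) |
         (static_cast<std::uint32_t>(p[3]) << 24);
}

// Returns 4 or 8 (slot size in bytes), or 0 if the header does not
// match the assumed layout (first slot 0, second slot 3).
int DetectSlotSize(const unsigned char* data, std::size_t len) {
  if (len < 8 || ReadLE32(data) != 0) return 0;
  std::uint32_t second = ReadLE32(data + 4);
  if (second == 3) return 4;  // 32-bit slots: words are 0, 3
  // 64-bit slots: the 32-bit words read 0, 0, 3, 0.
  if (second == 0 && len >= 16 && ReadLE32(data + 8) == 3 &&
      ReadLE32(data + 12) == 0) {
    return 8;
  }
  return 0;
}

// Byte offset of a given slot index; using 4 where 8 is correct lands
// in the middle of the profile data rather than at the text map.
std::size_t SlotByteOffset(std::size_t slot_index, int slot_size) {
  return slot_index * static_cast<std::size_t>(slot_size);
}
```

With 8-byte slots, the map that follows N slots starts at byte 8 * N;
assuming 4-byte slots puts the reader at 4 * N, which is the garbage
described above.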
Aliaksey Kandratsenka
2a28ef24dd Added remaining memory allocated info to 'Exiting' dump message
This applies patch by user yurivict.
2014-09-06 16:49:24 -07:00
Adam McNeeney
bbf346a856 Cope with new addr2line outputs for DWARF4
Copes with ? for line number (converts to 0).
Copes with (discriminator <num>) suffixes to file/linenum (removes).

Change-Id: I96207165e4852c71d3512157864f12d101cdf44a
2014-08-23 14:59:30 -07:00
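The two normalizations listed in bbf346a856 can be sketched like this.
The real change lives in the pprof Perl script; the C++ function below
is only an illustration, and NormalizeAddr2LineLocation is a
hypothetical name.

```cpp
#include <regex>
#include <string>

// Normalizes one "file:line" location as printed by addr2line:
//  - strips a trailing "(discriminator <num>)" suffix
//  - turns an unknown line number "?" into 0
std::string NormalizeAddr2LineLocation(const std::string& location) {
  static const std::regex kDiscriminator(
      R"(\s*\(discriminator \d+\)\s*$)");
  static const std::regex kUnknownLine(R"(:\?$)");
  std::string out = std::regex_replace(location, kDiscriminator, "");
  return std::regex_replace(out, kUnknownLine, ":0");
}
```

Stripping the suffix first means a location that has both an unknown
line and a discriminator is still reduced to "file:0".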
Aliaksey Kandratsenka
b08d760958 issue-641: Added --show_addresses option
This applies patch by user yurivict.
2014-08-23 14:47:04 -07:00
Aliaksey Kandratsenka
3c326d9f20 issue-644: fix possible out-of-bounds access in GetenvBeforeMain
As suggested by user Ivan L.
2014-08-19 08:30:07 -07:00
jiakai
f1ae3c446f Add an option to allow disabling stripping template argument in pprof 2014-08-01 22:14:16 -07:00
Aliaksey Kandratsenka
a12890df25 issue-635: allow whitespace in libraries paths
This applies change suggested by user mich...@sebesbefut.com
2014-07-26 14:12:42 -07:00
Aliaksey Kandratsenka
d5e36788d8 issue-636: fix prof/web command on Windows/MinGW
This applies patch sent by user chaishushan.
2014-07-26 14:04:26 -07:00
Michael Pasieka
4b788656bb added option to display stack traces in output for heap checker
Quoting from email:

I had the same question as William posted to stack overflow back on
Dec 9,2013: How to display symbols in stack trace of google-perftools
heap profiler (*).  I dug into the source and realized the
functionality was not there but could be added. I am hoping that
someone else will find this useful/helpful.

The patch I created will not attach so I am adding below.

Enjoy!

-- Michael

* http://stackoverflow.com/questions/20476918/how-to-display-symbols-in-stack-trace-of-google-perftools-heap-profiler
2014-07-13 18:15:20 -07:00
WenSheng He
3abb5cb819 issue-630: The env var should be "CPUPROFILE"
To enable cpu profile, the env var should be "CPUPROFILE", not "PROFILE"
actually.

Signed-off-by: Aliaksey Kandratsenka <alk@tut.by>
2014-07-06 18:51:27 -07:00
Aliaksey Kandratsenka
fd81ec2578 issue-631: fixed miscompilation of debugallocation without mmap
This applies patch sent by user iamxujian.

Clearly, when I updated debugallocation to fix issue-464 I broke the
no-mmap path by forgetting a closing brace.
2014-06-28 13:05:12 -07:00
Aliaksey Kandratsenka
2e90b6fd72 bumped version to 2.2.1 2014-06-21 15:52:34 -07:00
Aliaksey Kandratsenka
577b940cc0 updated NEWS for 2.2.1 2014-06-21 15:52:31 -07:00
Aliaksey Kandratsenka
2fe4b329ad applied chromium patch fixing some build issue on android
This applies patch from: https://codereview.chromium.org/284843002/ by
jungjik.lee@samsung.com
2014-06-21 12:12:04 -07:00
Aliaksey Kandratsenka
c009398e32 issue-628:package missing stacktrace_powerpc-{linux,darwin}-inl.h
These headers were missing in .tar.gz because they were not mentioned
anywhere in Makefile.am.
2014-06-15 12:58:29 -07:00
Adhemerval Zanella
81d99f21ed issue-626: Fix SetupAggressiveDecommit initialization
This patch fixes the SetupAggressiveDecommit initialization to run after
pageheap_ creation.  Current code is not enforcing it, since
InitStaticVars is being called outside the static_vars module.
2014-06-03 07:50:56 -05:00
Aliaksey Kandratsenka
846b775dfa bumped version to 2.2 2014-05-03 18:01:12 -07:00
Aliaksey Kandratsenka
cdf8e1e932 updated NEWS for 2.2 2014-05-03 17:59:42 -07:00
Aliaksey Kandratsenka
0807476f56 issue-620: windows dll patching: fixed delete of old stub code
After code for issue 359 was applied PreamblePatcher started using
its own code to manage memory of stub code fragments. It's not using
new[] anymore. And it automatically frees stub code memory on
Unpatch.

Clearly, author of that code forgot to remove that no-longer-needed
delete call. With that delete call we end up trying to free memory
that was never allocated with any of known allocators and crash.
2014-05-03 17:38:14 -07:00
Aliaksey Kandratsenka
facd7e83b3 bumped version to 2.1.90 2014-04-19 13:16:20 -07:00