gperftools

mirror of https://github.com/gperftools/gperftools synced 2025-02-01 04:01:34 +00:00

Author	SHA1	Message	Date
Milton Chiang	81d8d2a9e7	Add "ARMv8-A" to the supporting list of ARM architecture.	2015-05-23 12:01:48 -07:00
Aliaksey Kandratsenka	64d1a86cb8	include time.h for struct timespec on Visual Studio 2015 This patch was submitted by user wmamrak.	2015-05-09 15:38:12 -07:00
Aliaksey Kandratsenka	7013b21997	hook mi_force_{un,}lock on OSX instead of pthread_atfork This is patch by Anton Samokhvalov. Apparently it helps with locking around forking on OSX.	2015-05-09 14:56:58 -07:00
Angus Gratton	f25f8e0bf2	Clarify that only tcmalloc_minimal is supported on Windows.	2015-05-09 12:03:17 -07:00
Aliaksey Kandratsenka	772a686c45	issue-683: fix compile error in clang with -m32 and 64-bit off_t	2015-05-03 13:15:16 -07:00
Aliaksey Kandratsenka	0a3bafd645	fix typo in PrintAvailability code This is patch contributed by user ssubotin.	2015-04-11 10:35:53 -07:00
Matt Cross	6ce10a2a05	Add support for printing collapsed stacks for generating flame graphs.	2015-03-26 16:24:11 -04:00
Matt Cross	2c1a165fa5	Add support for reading debug symbols automatically on systems where shared libraries with debug symbols are installed at "/usr/lib/debug/<originalpath>.debug", such as RHEL and CentOS.	2015-03-26 12:14:56 -04:00
Jonathan Lambrechts	2e65495628	callgrind : handle inlined functions	2015-02-13 17:54:21 -08:00
Jonathan Lambrechts	90d7408d38	pprof : callgrind : fix unknown files	2015-02-13 17:54:14 -08:00
Aliaksey Kandratsenka	aa963a24ae	issue-672: fixed date of news entry of gperftools 2.4 release It is 2015 and not 2014. Spotted and reported by Armin Rigo.	2015-02-09 08:35:03 -08:00
Aliaksey Kandratsenka	c66aeabdba	fixed default value of HEAP_PROFILER_TIME_INTERVAL in .html doc	2015-01-10 14:35:54 -08:00
Aliaksey Kandratsenka	689e4a5bb4	bumped version to 2.4	2015-01-10 12:26:51 -08:00
Aliaksey Kandratsenka	3f5f1bba0c	bumped version to 2.4rc	2014-12-28 18:28:18 -08:00
Aliaksey Kandratsenka	c4dfdebc79	updated NEWS for gperftools 2.4rc	2014-12-28 18:28:15 -08:00
Aliaksey Kandratsenka	0096be5f6f	pprof: allow disabling auto-removal of "constant 2nd frame" "constant 2nd frame" feature is supposed to detect and work around incorrect cpu profile stack captures where parts of or whole cpu profiling signal handler frames are not skipped. I've seen programs where this feature incorrectly removes non-signal frames. Plus it actually hides bugs in stacktrace capturing which we want to be able to spot. There is now --no-auto-signal-frm option for disabling it.	2014-12-28 15:35:54 -08:00
Aliaksey Kandratsenka	4859d80205	cpuprofiler: drop correct number of signal handler frames We actually have 3 and not 2 of them.	2014-12-28 15:35:54 -08:00
Aliaksey Kandratsenka	812ab1ee7e	pprof: eliminate duplicate top frames if dropping signal frames In cpu profiles that had parts of signal handler we could have situation like that: * PC * signal handler frame * PC Specifically when capturing stacktraces via libunwind. For such stacktraces pprof used to draw self-cycle in functions confusing everybody. Given that we might have a number of such profiles in the wild it makes sense to treat that duplicate PC issue.	2014-12-28 15:35:54 -08:00
Aliaksey Kandratsenka	e6e78315e4	cpuprofiler: better explain deduplication of top stacktrace entry	2014-12-28 15:35:54 -08:00
Aliaksey Kandratsenka	24b8ec2846	cpuprofiler: disable capturing stacktrace from signal's ucontext This was reported to cause problems due to libunwind occasionally returning top level pc that is 1 smaller than real pc which causes problems.	2014-12-28 15:35:54 -08:00
Aliaksey Kandratsenka	83588de720	pprof: added support for dumping stacks in --text mode Which is very useful for diagnosing stack capturing and processing bugs.	2014-12-28 15:35:54 -08:00
Aliaksey Kandratsenka	2f29c9b062	pprof: made --show-addresses work	2014-12-28 15:35:54 -08:00
Raphael Moreira Zinsly	b8b027d09a	Make PPC64 use 64K of internal page size for tcmalloc by default This patch set the default tcmalloc internal page size to 64K when built on PPC.	2014-12-23 10:51:54 -08:00
Raphael Moreira Zinsly	3f55d874be	New configure flags to set the alignment and page size of tcmalloc Added two new configure flags, --with-tcmalloc-pagesize and --with-tcmalloc-alignment, in order to set the tcmalloc internal page size and tcmalloc allocation alignment without the need of a compiler directive and to make the choice of the page size independent of the allocation alignment.	2014-12-23 10:51:51 -08:00
Aliaksey Kandratsenka	1035d5c18f	start building malloc_extension_c_test even with static linking Comment in Makefile.am stating that it doesn't work with static linking is not accurate anymore.	2014-12-21 19:52:34 -08:00
Aliaksey Kandratsenka	d570a6391c	unbreak malloc_extension_c_test on clang Looks like even force_malloc trick was not enough to force clang to actually call malloc. I'm now calling tc_malloc directly to prevent that smartness.	2014-12-21 19:33:25 -08:00
Aliaksey Kandratsenka	4ace8dbbe2	added subdir-objects automake options This is suggested by automake itself regarding future-compat.	2014-12-21 18:49:47 -08:00
Aliaksey Kandratsenka	f72e37c3f9	fixed C++ comment warning in malloc_extension_c.h from C compiler	2014-12-21 18:27:03 -08:00
Aliaksey Kandratsenka	f94ff0cc09	made AtomicOps_x86CPUFeatureStruct hidden So that access to has_sse2 is faster under -fPIC.	2014-12-20 21:20:43 -08:00
Aliaksey Kandratsenka	987a724c23	dropped atomicops workaround for irrelevant Opteron locking bug It's not cheap at all when done in this way (i.e. without runtime patching) and apparently useless. It looks like Linux kernel never got this workaround at all. See bugzilla ticket: https://bugzilla.kernel.org/show_bug.cgi?id=11305 And I see no traces of this workaround in glibc either. On the other hand, opensolaris folks apparently still have it (or something similar, based on comments on linux bugzilla) in their code: `32842aabdc/usr/src/uts/i86pc/os/mp_startup.c (L1136)` And affected CPUs (if any) are from year 2008 (that's 6 years now). Plus even if somebody still uses those cpus (which is unlikely), they won't have working kernel and glibc anyways.	2014-12-20 21:20:43 -08:00
Aliaksey Kandratsenka	7da5bd014d	enabled aggressive decommit by default TCMALLOC_AGGRESSIVE_DECOMMIT=f is one way to disable it and SetNumericProperty is another.	2014-12-20 21:18:07 -08:00
Aliaksey Kandratsenka	51b0ad55b3	added basic unit test for singular malloc hooks	2014-12-07 17:46:04 -08:00
Aliaksey Kandratsenka	bce72dda07	inform compiler that tcmalloc allocation sampling is unlikely Now compiler generates slightly better code which produces jump-less code for common case of not sampling allocations.	2014-12-07 17:46:04 -08:00
Aliaksey Kandratsenka	4f051fddcd	eliminated CheckIfKernelSupportsTLS We don't care about pre-2.6.0 kernels anymore. So we can assume that if compile time check worked, then at runtime it'll work.	2014-12-07 17:46:04 -08:00
Aliaksey Kandratsenka	81291ac399	set elf visibility to hidden for malloc hooks To speed up access to them under -fPIC.	2014-12-07 17:46:04 -08:00
Aliaksey Kandratsenka	105c004d0c	introduced ATTRIBUTE_VISIBILITY_HIDDEN So that we can disable elf symbol interposition for certain perf-sensitive symbols.	2014-12-07 17:46:04 -08:00
Aliaksey Kandratsenka	6a6c49e1f5	replaced separate singular malloc hooks with faster HookList Specifically, we can now check in one place if hooks are set at all, instead of two places. Which makes fast path shorter.	2014-12-07 17:46:04 -08:00
Aliaksey Kandratsenka	ba0441785b	removed extra barriers in malloc hooks mutation methods Because those are already done under spinlock and read-only and lockless Traverse is already tolerant to slight inconsistencies.	2014-12-07 17:46:04 -08:00
Aliaksey Kandratsenka	890f34c77e	introduced support for deprecated singular hooks into HookList So that we can later drop separate singular hooks.	2014-12-07 17:46:04 -08:00
Aliaksey Kandratsenka	81ed7dff11	returned date of 2.3rc in NEWS back	2014-12-07 13:33:40 -08:00
Aliaksey Kandratsenka	463a619408	bumped version to 2.3	2014-12-07 12:53:35 -08:00
Aliaksey Kandratsenka	76e8138e12	updated NEWS for gperftools 2.3	2014-12-07 12:46:49 -08:00
Raphael Moreira Zinsly	8eb4ed785a	Added option to disable libunwind linking This patch adds a configure option to enable or disable libunwind linking. The patch also disables libunwind on ppc by default.	2014-11-27 12:51:33 -08:00
Aliaksey Kandratsenka	3b94031d21	compile libunwind unwinder only if __thread is supported This fixed build on certain OSX that I have access to.	2014-11-27 12:30:36 -08:00
Aliaksey Kandratsenka	3ace468202	issue-658: correctly close socketpair fds when socketpair fails This applies patch by glider.	2014-11-27 10:45:53 -08:00
Aliaksey Kandratsenka	e7d5e512b0	bumped version to 2.3rc	2014-11-02 20:13:33 -08:00
Aliaksey Kandratsenka	1d44d37851	updated NEWS for gperftools 2.3rc	2014-11-02 19:59:05 -08:00
Aliaksey Kandratsenka	1108d83cf4	implemented cpu-profiling mode that profiles threads separately Default mode of operation of cpu profiler uses itimer and SIGPROF. This timer is by definition per-process and no spec defines which thread is going to receive SIGPROF. And it provides correct profiles only if we assume that probability of picking threads will be proportional to cpu time spent by threads. It is easy to see, that recent Linux (at least on common SMP hardware) doesn't satisfy that assumption. Quite big skews of SIGPROF ticks between threads is visible. I.e. I could see as big as 70%/20% division instead of 50%/50% for pair of cpu-hog threads. (And I do see it become 50/50 with new mode) Fortunately POSIX provides mechanism to track per-thread cpu time via posix timers facility. And even more fortunately, Linux also provides mechanism to deliver timer ticks to specific threads. Interestingly, it looks like FreeBSD also has very similar facility and seems to suffer from same skew. But due to difference in a way how threads are identified, I haven't bothered to try to support this mode on FreeBSD. This commit implements new profiling mode where every thread creates posix timer which tracks thread's cpu time. Threads also set up signal delivery to itself on overflows of that timer. This new mode requires every thread to be registered in cpu profiler. Existing ProfilerRegisterThread function is used for that. Because registering threads requires application support (or suitable LD_PRELOAD-able wrapper for thread creation API), new mode is off by default. And it has to be manually activated by setting environment variable CPUPROFILE_PER_THREAD_TIMERS. New mode also requires librt symbols to be available. Which we do not link to due to librt's dependency on libpthread. Which we avoid due to perf impact of bringing in libpthread to otherwise single-threaded programs. So it has to be either already loaded by profiling program or LD_PRELOAD-ed.	2014-11-02 18:29:55 -08:00
Aliaksey Kandratsenka	714bd93e42	drop workaround for too old redhat 7 Note that this is _not_ RHEL7 but original redhat 7 from early 2000s.	2014-11-02 18:29:55 -08:00
Aliaksey Kandratsenka	8de46e66fc	don't add leaf function twice to profile under libunwind	2014-11-02 18:29:55 -08:00

502 Commits