Chrome has code to decommit (release back to the OS) every span that's
released. I don't want to make it the default, but some applications
may indeed want to enable this mode.
The code itself was taken by 2-way-merging code from the Chromium
fork. I tried to do it cleanly with merges, but the Chromium tree has
so many relevant commits (with frequent reverts) that it is near
impossible. The simpler 2-way emerge-files approach worked in the end.
I've removed Chromium's aggressive 'always decommit' behavior, which I
want to make optional later.
The majority of this work comes from the following commits (but there
are more, particularly against port.cc):
commit 9c92338c5f8770c440799d24387c3733fd6d826b
Author: jamesr@chromium.org <jamesr@chromium.org@0039d316-1c4b-4281-b951-d872f2087c98>
Date: Tue Oct 6 18:33:31 2009 +0000
Tracks the amount of committed vs uncommitted memory in tcmalloc's page heap's freelists
Keeps track of the number of reserved but not committed pages in the freelist and uses that to calculate a waste metric, which is the ratio of committed pages vs pages used by the application. This is exposed in the GetStats() call (which is used for about:tcmalloc) and through GetNumericalProperty() in Malloc
BUG=none
TEST=open about:tcmalloc and monitor 'WASTE' columns while using the browser
Review URL: http://codereview.chromium.org/251065
git-svn-id: svn://svn.chromium.org/chrome/trunk/src@28133 0039d316-1c4b-4281-b951-d872f2087c98
commit aef4f1be3eec2059a7c6e2c106050a5f3d6ccf12
Author: jar@chromium.org <jar@chromium.org@0039d316-1c4b-4281-b951-d872f2087c98>
Date: Mon Oct 5 17:58:51 2009 +0000
Revert further back to MBelshe's baseline forking TCMalloc
This changes to decommitting in all paths through the
page_heap delete method (which adds spans to the free lists).
r=mbelshe,jamesr
Review URL: http://codereview.chromium.org/255067
git-svn-id: svn://svn.chromium.org/chrome/trunk/src@28006 0039d316-1c4b-4281-b951-d872f2087c98
commit e94afbb913b95f512cb8745a2729c73f82b15ae7
Author: jar@chromium.org <jar@chromium.org@0039d316-1c4b-4281-b951-d872f2087c98>
Date: Thu Oct 1 00:25:41 2009 +0000
Rollback Scavenge implemetation and rely on existing functionality to free
This is a landing of a patch provided by antonm. See:
http://codereview.chromium.org/235022
Also included change to browser_about_handler.cc to fix build, and I set
TCMALLOC_RELEASE_RATE to 1.0 on line 40 of page_heap.cc (I think this
was an inadvertent rollback element).
r=antonm
Review URL: http://codereview.chromium.org/257009
git-svn-id: svn://svn.chromium.org/chrome/trunk/src@27692 0039d316-1c4b-4281-b951-d872f2087c98
commit c585892d2c42a47c95d06a684a6685156c545403
Author: mbelshe@google.com <mbelshe@google.com@0039d316-1c4b-4281-b951-d872f2087c98>
Date: Wed Sep 2 17:33:23 2009 +0000
Landing for Anton Muhin's tcmalloc patch:
http://codereview.chromium.org/180021/show
Restore decommitting in IncrementalScavenge and draft Scavenge method to
be invoked periodically
to reduce amount of committed pages.
BUG=none
TEST=none
Review URL: http://codereview.chromium.org/187008
git-svn-id: svn://svn.chromium.org/chrome/trunk/src@25188 0039d316-1c4b-4281-b951-d872f2087c98
commit 14239acc00731e94736ac62e80fc6b17c31ea131
Author: mbelshe@google.com <mbelshe@google.com@0039d316-1c4b-4281-b951-d872f2087c98>
Date: Wed Aug 12 02:17:14 2009 +0000
Major changes to the Chrome allocator.
Changes include:
* Fix tcmalloc to release memory. Implements the TCMalloc_SystemCommit()
mechanism so that tcmalloc can implement SystemRelease() and later
reuse that memory.
* Enable dynamic switching of allocators based on an environment variable.
Users can now switch between tcmalloc, jemalloc, the default windows
heap, and the windows low-fragmentation heap.
* Implements set_new_mode() across all allocators so that we can be sure
that out-of-memory conditions are handled safely.
BUG=18345
TEST=none; plan to get all unit tests running through these allocators.
Review URL: http://codereview.chromium.org/165275
git-svn-id: svn://svn.chromium.org/chrome/trunk/src@23140 0039d316-1c4b-4281-b951-d872f2087c98
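For context, the TCMalloc_SystemRelease()/TCMalloc_SystemCommit() pair
mentioned above boils down to telling the OS it may reclaim the
physical pages behind a span while keeping the address range reserved.
A minimal sketch of that idea on Linux follows (using madvise; the
actual port.cc code also covers Windows via VirtualFree/VirtualAlloc
and handles errors and page rounding, none of which is shown here):

#include <sys/mman.h>
#include <stddef.h>

// Illustrative sketch only, not the real port.cc implementation.
void TCMalloc_SystemRelease(void* start, size_t length) {
  // Tell the kernel the contents are no longer needed. The address
  // range stays reserved, so recommitting it later is safe.
  madvise(start, length, MADV_DONTNEED);
}

void TCMalloc_SystemCommit(void* start, size_t length) {
  // After MADV_DONTNEED, pages are re-faulted in zero-filled on the
  // next touch, so on Linux recommit can be a no-op.
  (void)start;
  (void)length;
}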
It was previously possible (although unlikely) for a damaged offset_
field to lead the FromRawPointer implementation into a different
MallocBlock. As is usual with any corruption, it's best to catch the
error at the earliest possible time.
It was possible that, if GetOwnership was passed a pointer to memory
not owned by tcmalloc, it would crash or incorrectly return "owned",
due to the indirection in FromRawPointer.
The new implementation prevents that, but introduces a different bug
instead: it incorrectly returns "not owned" for memalign chunks with
big alignment. But it can be argued that passing a pointer returned
from a different memalign implementation did not work previously
either.
Debug memalign creates a special header block that allows us to find
the real allocated block. The previous implementation of data copying
didn't take that into account and copied that "alignment header" into
the newly allocated block.
With atomic operations and system call support in place, this enables
Aarch64 support (under __aarch64__ defines) in the other files around
the google-perftools headers. After these changes, the
google-perftools test suite (make check) results are:
8 of 46 tests failed.
FAIL: sampling_test.sh
FAIL: heap-profiler_unittest.sh
FAIL: heap-checker_unittest.sh
FAIL: heap-checker-death_unittest.sh
FAIL: sampling_debug_test.sh
FAIL: heap-profiler_debug_unittest.sh
FAIL: heap-checker_debug_unittest.sh
FAIL: profiler_unittest.sh
While this indicates that there is still work to do, it is still
better than the result I get on ARMv7:
12 of 46 tests failed.
Aarch64 support for linux_syscall_support.h. Since Aarch64 is a brand
new architecture, none of the legacy system calls are necessarily
available. Thus some changes were necessary that affect other
architectures as well:
1) use getdents64 where available, else getdents (for ppc64)
2) the other legacy system calls pipe, waitpid and open are replaced
by pipe2, wait4 and openat where available.
3) use fstatat if stat is not available.
The Aarch64 system call interface follows the Aarch64 calling
convention (registers x0-x5 for the arguments, x8 for the system call
number; the return value comes back in x0), as sketched below. The
clone implementation is adapted from glibc.
v2: step back in getdents removal due to ppc64
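As an illustration of that convention, a raw system call on Aarch64
can be issued as below (a stand-alone sketch using gcc extended asm,
not the actual linux_syscall_support.h macros):

#include <sys/syscall.h>  /* __NR_getpid */
#include <stdio.h>

// Sketch: issue getpid() as a raw Aarch64 system call. Arguments
// would go in x0-x5, the syscall number goes in x8, and the result
// comes back in x0.
static long raw_getpid(void) {
  register long x8 __asm__("x8") = __NR_getpid;
  register long x0 __asm__("x0");
  __asm__ volatile("svc #0" : "=r"(x0) : "r"(x8) : "memory");
  return x0;
}

int main(void) {
  printf("pid: %ld\n", raw_getpid());
  return 0;
}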
Fix the "BUS ERROR" issue shown below. a0 holds the start address of
the memory block allocated by DebugAllocate in debugallocation.cc.
(gdb) info registers
zero at v0 v1 a0 a1 a2 a3
R0 00000000 10008700 772f62a0 00084d40 766dcfef 7fb5f420 00000000 004b4dd8
t0 t1 t2 t3 t4 t5 t6 t7
R8 7713c1a0 7712dbc0 ffffffff 777bc000 f0000000 00000001 00000000 00403d10
s0 s1 s2 s3 s4 s5 s6 s7
R16 7fb5ff1c 00401b9c 77050020 7fb5fb18 00000000 004cb008 004ca748 ffffffff
t8 t9 k0 k1 gp sp s8 ra
R24 0000002f 771adcd4 00000000 00000000 771f4140 7fb5f408 7fb5f430 771add6c
sr lo hi bad cause pc
00008713 0000e9fe 00000334 766dcff7 00800010 771adcfc
fsr fir
00000004 00000000
(gdb) disassemble
Dump of assembler code for function _ZNSs4_Rep10_M_disposeERKSaIcE:
0x771adcd4 <+0>: lui gp,0x4
0x771adcd8 <+4>: addiu gp,gp,25708
0x771adcdc <+8>: addu gp,gp,t9
0x771adce0 <+12>: lw v0,-28696(gp)
0x771adce4 <+16>: beq a0,v0,0x771add38 <_ZNSs4_Rep10_M_disposeERKSaIcE+100>
0x771adce8 <+20>: nop
0x771adcec <+24>: lw v0,-30356(gp)
0x771adcf0 <+28>: beqzl v0,0x771add1c <_ZNSs4_Rep10_M_disposeERKSaIcE+72>
0x771adcf4 <+32>: lw v0,8(a0)
0x771adcf8 <+36>: sync
=> 0x771adcfc <+40>: ll v0,8(a0)
0x771add00 <+44>: addiu at,v0,-1
0x771add04 <+48>: sc at,8(a0)
0x771add08 <+52>: beqz at,0x771adcfc <_ZNSs4_Rep10_M_disposeERKSaIcE+40>
0x771add0c <+56>: nop
0x771add10 <+60>: sync
0x771add14 <+64>: b 0x771add24 <_ZNSs4_Rep10_M_disposeERKSaIcE+80>
0x771add18 <+68>: nop
0x771add1c <+72>: addiu v1,v0,-1
0x771add20 <+76>: sw v1,8(a0)
0x771add24 <+80>: bgtz v0,0x771add38 <_ZNSs4_Rep10_M_disposeERKSaIcE+100>
0x771add28 <+84>: nop
0x771add2c <+88>: lw t9,-27072(gp)
0x771add30 <+92>: jr t9
0x771add34 <+96>: nop
0x771add38 <+100>: jr ra
0x771add3c <+104>: nop
End of assembler dump.
The ll instruction manual says:
Load Linked:
Loads the destination register with the contents of the word
that is at the memory location. This instruction implicitly performs
a SYNC operation; all loads and stores to shared memory fetched prior
to the ll must access memory before the ll, and loads and stores to
shared memory fetched subsequent to the ll must access memory after the ll.
Load Linked and Store Conditional can be used to atomically update
memory locations. *This instruction is not valid in the mips1 architectures.
The machine signals an address exception when the effective address is not
divisible by four.
That is exactly what happens above: the faulting ll accesses
a0 + 8 = 0x766dcff7, which is not divisible by four, because the block
returned by the debug allocator is not word-aligned.
Signed-off-by: Wang YanQing <udknight@gmail.com>
Signed-off-by: Aliaksey Kandratsenka <alk@tut.by>
[alk@tut.by: removed addition of unused #include]
Gcc 4.7 and later provides atomic builtins [1]. Use these instead of
adding yet another assembly port for Aarch64 (64-bit ARM). This patch
enables successfully building and running the atomicops unittest on
Aarch64.
This patch enables the gcc builtins only when no assembly
implementation is provided. But as a quick check, atomicops_unittest
and the rest of the test suite also pass with atomicops-internals-gcc
on ARMv7 and x86_64 if the ifdef in atomicops is adjusted to prefer
the generic implementation.
[1] http://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
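For a flavor of what such a port provides, a compare-and-swap written
against the __atomic interface looks like this (a minimal sketch of
the builtin usage, not the actual atomicops-internals-gcc code):

#include <stdint.h>

// Sketch: strong compare-and-swap via gcc's __atomic builtins.
// Returns the value observed at *ptr; the swap happened iff that
// equals old_value.
static inline int32_t CompareAndSwap(volatile int32_t* ptr,
                                     int32_t old_value,
                                     int32_t new_value) {
  int32_t expected = old_value;
  __atomic_compare_exchange_n(ptr, &expected, new_value,
                              false /* strong, not weak */,
                              __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
  return expected;
}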
Instead, the functions originally taken from Google's "base" code now
have the TCMalloc_ prefix, so that they don't conflict with other
Google libraries that have the same functions.
In issue-66 (and the README) it is pointed out that there are
sometimes issues grabbing a backtrace across the signal handler
boundary. This code attempts to fix that by grabbing the backtrace
from the signal's ucontext, which clearly does not include the signal
handler boundary.
We're using the "feature" of libunwind that, on some important
platforms, libunwind's context is the same as libc's ucontext_t, which
is given to us as part of the signal handler invocation.
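A sketch of that trick (it assumes a platform where unw_context_t and
ucontext_t are layout-compatible, which is exactly the assumption the
change relies on; the handler name is illustrative):

#define UNW_LOCAL_ONLY
#include <libunwind.h>
#include <signal.h>

// Sketch: unwind the stack of the interrupted code, not the handler's.
static void profiling_handler(int sig, siginfo_t* info, void* ucontext) {
  (void)sig; (void)info;
  unw_cursor_t cursor;
  // Reinterpret the signal's ucontext as a libunwind context so the
  // walk starts at the interrupted frame, skipping the signal handler
  // boundary entirely.
  unw_init_local(&cursor, (unw_context_t*)ucontext);
  while (unw_step(&cursor) > 0) {
    unw_word_t ip;
    unw_get_reg(&cursor, UNW_REG_IP, &ip);
    // record ip in the profile ...
  }
}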
This is because otherwise the destructor might be invoked well before
other places that might touch the malloc extension instance.
We're using placement new to initialize it and pass the pointer to
MallocExtension::Register, which ensures that its destructor is never
run.
Based on an idea suggested by Andrew C. Morrow.
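The idiom, in a minimal sketch (the class name is a placeholder; the
real code constructs the actual extension object and hands it to
MallocExtension::Register):

#include <new>  // placement new

class Extension { /* ... */ };

// Raw static storage: no constructor runs before first use and,
// crucially, no destructor is registered to run at exit, so the
// object built here stays valid even during late shutdown.
static union {
  char space[sizeof(Extension)];
  long long align;  // force suitable alignment
} extension_storage;

static Extension* CreateExtension() {
  return new (extension_storage.space) Extension();
}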
Somehow the C++ headers (like <string>) define pthread symbols without
us even asking. That breaks the old assumption that pthread symbols
are not available on Windows.
To fix that, we detect this condition in configure.ac and avoid
defining the Windows versions of the pthread symbols.
* Some variables defined as "char *" had to be changed to "const char *".
* For uclibc, glibc's "void malloc_stats(void)" should be "void malloc_stats(FILE *)"; it is commented out for now.
* For uclibc, __sbrk is declared with the "hidden" attribute, so we use the mmap allocator for uclibc.
The previous logic for detecting the main program's addresses assumed
that the main executable occupies the lowest addresses. With PIE
(active by default on Ubuntu) that doesn't work.
To deal with that, we attempt to find the main executable's mapping in
/proc/[pid]/maps. The old logic is preserved too, just in case.
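A sketch of the /proc-based detection (plain stdio here, and matching
the mapping's path against readlink("/proc/self/exe") is an
illustrative heuristic, not necessarily what the actual code does):

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <stdint.h>

// Sketch: find the first executable mapping whose path matches the
// main binary.
static uintptr_t FindMainExecutableStart(void) {
  char exe[4096];
  ssize_t n = readlink("/proc/self/exe", exe, sizeof(exe) - 1);
  if (n < 0) return 0;
  exe[n] = '\0';

  FILE* f = fopen("/proc/self/maps", "r");
  if (!f) return 0;
  uintptr_t start = 0;
  char line[4096];
  while (fgets(line, sizeof(line), f)) {
    unsigned long long lo, hi;
    char perms[5], path[4096];
    /* each line: start-end perms offset dev inode path */
    if (sscanf(line, "%llx-%llx %4s %*s %*s %*s %4095s",
               &lo, &hi, perms, path) == 4 &&
        strchr(perms, 'x') != NULL && strcmp(path, exe) == 0) {
      start = (uintptr_t)lo;
      break;
    }
  }
  fclose(f);
  return start;
}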
Previously a call to CheckAddressBits was made, but nothing was done
with its result.
I've also made sure that the actual size is used in the checks and in
bumping up TCMalloc_SystemTaken.
As suggested by Hannes Weisbach.
Call heap-profiler_unittest with the arguments 1 -2 (one iteration, 2
fork()ed children). Instead of running the test, the program crashes
with a std::bad_alloc exception. This is caused by unconditionally
passing the number-of-threads argument (0 or positive for threads,
negative for fork()s) to RunManyThreads(), thus allocating an array of
pthread_t of size -2. Depending on the sign of the thread-number
argument, either RunManyThreads() or fork() should be called, as in
the sketch below.
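A sketch of the intended dispatch (all names here are illustrative
stand-ins, not the unittest's real helpers):

#include <pthread.h>
#include <unistd.h>
#include <sys/wait.h>
#include <cstddef>

static void* TestMain(void*) { /* test body */ return NULL; }

static void RunManyThreads(void* (*fn)(void*), int n) {
  pthread_t* t = new pthread_t[n];  // n must be non-negative here
  for (int i = 0; i < n; i++) pthread_create(&t[i], NULL, fn, NULL);
  for (int i = 0; i < n; i++) pthread_join(t[i], NULL);
  delete[] t;
}

// The fix, sketched: dispatch on the sign instead of handing a
// negative count straight to RunManyThreads().
static void RunTests(int num_threads) {
  if (num_threads >= 0) {
    RunManyThreads(TestMain, num_threads);
  } else {
    for (int i = 0; i < -num_threads; i++) {
      if (fork() == 0) {  // child runs the test, then exits
        TestMain(NULL);
        _exit(0);
      }
    }
    while (wait(NULL) > 0) {}  // parent reaps the children
  }
}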
As proposed by Hannes Weisbach.
The argument will be garbled because of a misplaced brace, for example
(heap-checker_unittest.sh):
HEAP_CHECKER="${1:-$BINDIR}/heap-checker_unittest"
which should be:
HEAP_CHECKER="${1:-$BINDIR/heap-checker_unittest}"
This unit test is used to check the binaries heap-checker_unittest and
heap-checker_debug_unittest. With the typo, the executable
heap-checker_debug_unittest is never actually run.
When we fetch objects from a span for the thread cache, we build a
list that is in reverse order relative to the original list on the
span, and supply that list to the thread cache. This algorithm has
trouble with a newly created span: such a span has an
ascending-ordered object list, so the thread cache gets the reversed
list and the user gets objects in descending order. The following
example shows what happens with this algorithm.
new span: object list: 1 -> 2 -> 3 -> 4 -> 5 -> ...
fetch N items: N -> N-1 -> N-2 -> ... -> 2 -> 1 -> NULL
thread cache: N -> N-1 -> N-2 -> ... -> 2 -> 1 -> NULL
user's 1st malloc: N
user's 2nd malloc: N-1
...
user's Nth malloc: 1
In general, accessing memory in ascending order performs better than
descending order, so this patch fixes that situation.
I ran the program below to measure the performance effect.
#include <stdio.h>
#include <stdlib.h>

#define MALLOC_SIZE (512)
#define CACHE_SIZE (64)
#define TOUCH_SIZE (512 / CACHE_SIZE)

int main(int argc, char **argv) {
  int count = atoi(argv[1]);  /* e.g. 1000000 */
  char **array = malloc(sizeof(void *) * count);
  char *x;
  int i, j, k, repeat;

  /* populate the array with freshly allocated blocks */
  for (i = 0; i < 1; i++) {
    for (j = 0; j < count; j++) {
      x = malloc(MALLOC_SIZE);
      array[j] = x;
    }
  }

  /* touch one byte per cache line of every block, repeatedly */
  repeat = 10;
  for (i = 0; i < repeat; i++) {
    for (j = 0; j < count; j++) {
      x = array[j];
      for (k = 0; k < TOUCH_SIZE; k++) {
        *(x + (k * CACHE_SIZE)) = '1';
      }
    }
  }
  return 0;
}
LD_PRELOAD=libtcmalloc_minimal.so perf stat -r 10 ./a.out 1000000
**** Before ****
Performance counter stats for './a.out 1000000' (10 runs):
2.715161299 seconds time elapsed ( +- 0.07% )
**** After ****
Performance counter stats for './a.out 1000000' (10 runs):
2.259366428 seconds time elapsed ( +- 0.08% )
It is better to reduce the number of function calls where possible. If
we fetch as many objects as possible from one span in each call, the
number of function calls goes down, and this helps performance.
On the initialization step, tcmalloc double-checks SizeClass integrity
with all possible size values, 0 to kMaxSize. This causes tremendous
overhead for short-lived applications. For example, consider the
following command.
'find -exec grep something {} \;'
The actual work of each grep is really small, but the double-check
requires much more. To remove this overhead entirely we would have to
drop the double-check, but we cannot be sure of the integrity without
it, so an alternative is needed.
This patch doesn't remove the double-check; instead it skips
unnecessary checks based on the ClassIndex() implementation (see the
sketch after the timing results below). This removes much of the
overhead while giving the same coverage as the previous double-check.
The following is the result of this patch.
time LD_PRELOAD=libtcmalloc_minimal.so find ./ -exec grep "SOMETHING" {} \;
* Before
real 0m3.675s
user 0m1.000s
sys 0m0.640s
* This patch
real 0m2.833s
user 0m0.056s
sys 0m0.220s
* Remove double-check entirely
real 0m2.675s
user 0m0.072s
sys 0m0.184s
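The skipping idea, sketched (the constants and bucketing follow
tcmalloc's ClassIndex(), which maps sizes 8 bytes apart below
kMaxSmallSize and 128 bytes apart above it to the same class; the loop
itself is illustrative, not the patch's exact code):

// Every size in one ClassIndex() bucket maps to the same size class,
// so checking one representative per bucket gives the same coverage
// as checking every size 0..kMaxSize.
static const int kMaxSmallSize = 1024;
static const int kMaxSize = 256 * 1024;

static int ClassIndex(int s) {
  return (s <= kMaxSmallSize) ? (s + 7) >> 3
                              : (s + 127 + (120 << 7)) >> 7;
}

static void CheckAllSizeClasses() {
  for (int size = 0; size <= kMaxSize;
       size += (size < kMaxSmallSize) ? 8 : 128) {
    int idx = ClassIndex(size);  // same idx for the whole bucket
    (void)idx;  // the real code verifies the size-class tables here
  }
}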
That is, to prevent a possible deadlock when these locks are taken by
different threads in different order.
This particular problem was also reported as part of issue 66.
This applies a patch from Jean Lee.
I've reformatted it to match the surrounding code style and changed
the validation logic a bit. That is, we no longer range-check the
signal number, given that we're not sure what different platforms
support; instead we check the return value of signal() for SIG_ERR.
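The new validation, in a minimal sketch (the handler and surrounding
code are placeholders, not the actual patched profiler code):

#include <signal.h>
#include <stdio.h>

static void handler(int signo) { (void)signo; /* toggle profiling */ }

// Instead of range-checking signo against platform-specific limits,
// try to install the handler and let signal() report failure.
static int InstallHandler(int signo) {
  if (signal(signo, handler) == SIG_ERR) {
    perror("signal");
    return 0;  // invalid or unsupported signal on this platform
  }
  return 1;
}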