790 Commits

Author SHA1 Message Date
Aliaksey Kandratsenka
33f6781d64 issue-605: avoid compilation errors if pthread_key_t is pointer
Which seems to be the case on later cygwin
2014-02-16 19:22:02 -08:00
Aliaksey Kandratsenka
100f310088 unbreak make dist 2014-02-16 18:28:21 -08:00
Wang YanQing
a0ed9ace53 debugallocation: fix bus error on mipsel-linux platform when use_malloc_page_fence is enabled
Fix the "BUS ERROR" issue below:

a0 holds the start address of the memory block allocated by DebugAllocate in debugallocation.cc

(gdb) info registers
          zero       at       v0       v1       a0       a1       a2       a3
 R0   00000000 10008700 772f62a0 00084d40 766dcfef 7fb5f420 00000000 004b4dd8
            t0       t1       t2       t3       t4       t5       t6       t7
 R8   7713c1a0 7712dbc0 ffffffff 777bc000 f0000000 00000001 00000000 00403d10
            s0       s1       s2       s3       s4       s5       s6       s7
 R16  7fb5ff1c 00401b9c 77050020 7fb5fb18 00000000 004cb008 004ca748 ffffffff
            t8       t9       k0       k1       gp       sp       s8       ra
 R24  0000002f 771adcd4 00000000 00000000 771f4140 7fb5f408 7fb5f430 771add6c
            sr       lo       hi      bad    cause       pc
      00008713 0000e9fe 00000334 766dcff7 00800010 771adcfc
           fsr      fir
      00000004 00000000

(gdb) disassemble
Dump of assembler code for function _ZNSs4_Rep10_M_disposeERKSaIcE:
   0x771adcd4 <+0>:     lui     gp,0x4
   0x771adcd8 <+4>:     addiu   gp,gp,25708
   0x771adcdc <+8>:     addu    gp,gp,t9
   0x771adce0 <+12>:    lw      v0,-28696(gp)
   0x771adce4 <+16>:    beq     a0,v0,0x771add38 <_ZNSs4_Rep10_M_disposeERKSaIcE+100>
   0x771adce8 <+20>:    nop
   0x771adcec <+24>:    lw      v0,-30356(gp)
   0x771adcf0 <+28>:    beqzl   v0,0x771add1c <_ZNSs4_Rep10_M_disposeERKSaIcE+72>
   0x771adcf4 <+32>:    lw      v0,8(a0)
   0x771adcf8 <+36>:    sync
=> 0x771adcfc <+40>:    ll      v0,8(a0)
   0x771add00 <+44>:    addiu   at,v0,-1
   0x771add04 <+48>:    sc      at,8(a0)
   0x771add08 <+52>:    beqz    at,0x771adcfc <_ZNSs4_Rep10_M_disposeERKSaIcE+40>
   0x771add0c <+56>:    nop
   0x771add10 <+60>:    sync
   0x771add14 <+64>:    b       0x771add24 <_ZNSs4_Rep10_M_disposeERKSaIcE+80>
   0x771add18 <+68>:    nop
   0x771add1c <+72>:    addiu   v1,v0,-1
   0x771add20 <+76>:    sw      v1,8(a0)
   0x771add24 <+80>:    bgtz    v0,0x771add38 <_ZNSs4_Rep10_M_disposeERKSaIcE+100>
   0x771add28 <+84>:    nop
   0x771add2c <+88>:    lw      t9,-27072(gp)
   0x771add30 <+92>:    jr      t9
   0x771add34 <+96>:    nop
   0x771add38 <+100>:   jr      ra
   0x771add3c <+104>:   nop
End of assembler dump.

ll instruction manual:
Load Linked:
Loads the destination register with the contents of the word
that is at the memory location. This instruction implicitly performs
a SYNC operation; all loads and stores to shared memory fetched prior
to the ll must access memory before the ll, and loads and stores to
shared memory fetched subsequent to the ll must access memory after ll.
Load Linked and Store Conditional can be used to atomically update
memory locations. *This instruction is not valid in the mips1 architectures.
The machine signals an address exception when the effective address is not
divisible by four.

Signed-off-by: Wang YanQing <udknight@gmail.com>
Signed-off-by: Aliaksey Kandratsenka <alk@tut.by>
[alk@tut.by: removed addition of unused #include]
2014-02-16 12:51:51 -08:00
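The register dump in this commit is consistent with the quoted ll manual text: the faulting `ll v0,8(a0)` computes an effective address of a0 + 8 = 0x766dcfef + 8 = 0x766dcff7, which matches the `bad` register and is not divisible by four. A minimal sketch of that arithmetic follows. The helper names are hypothetical and are not taken from debugallocation.cc, and the log does not show the actual fix.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helpers for illustration only; not from debugallocation.cc.

// True when addr is a multiple of align (align must be a power of two).
constexpr bool IsAligned(std::uintptr_t addr, std::size_t align) {
  return (addr & (align - 1)) == 0;
}

// Rounds n up to the next multiple of align (align must be a power of two).
constexpr std::size_t RoundUp(std::size_t n, std::size_t align) {
  return (n + align - 1) & ~(align - 1);
}

// Values from the gdb session: a0 and the offset used by "ll v0,8(a0)".
constexpr std::uintptr_t kA0 = 0x766dcfef;
constexpr std::uintptr_t kEffectiveAddress = kA0 + 8;

static_assert(kEffectiveAddress == 0x766dcff7, "matches the 'bad' register");
static_assert(!IsAligned(kEffectiveAddress, 4), "ll raises an address exception here");
```

One plausible way for an allocator to avoid this is to round the requested size up with something like `RoundUp` before placing the block against the fence page. Whether the commit does exactly that is an assumption.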
Aliaksey Kandratsenka
38bfc7a1c2 removed irrelevant comment 2014-02-08 14:10:11 -08:00
Aliaksey Kandratsenka
d03c467a34 allow asking for gcc atomics on all platforms
I.e. by doing ./configure CPPFLAGS=-DTCMALLOC_PREFER_GCC_ATOMICS
2014-02-08 14:10:04 -08:00
Aliaksey Kandratsenka
6de1f38b68 chmod -x configure.ac
Because configure.ac is not really executable. And because it
interferes with tab completion of configure.
2014-02-08 14:09:44 -08:00
Riku Voipio
e8fe990fa0 implement atomics with gcc intrinsics
GCC 4.7 and later provide atomic builtins[1]. Use these instead of adding
yet another assembly port for Aarch64 (64-bit ARM). This patch enables
successfully building and running the atomicops unittest on Aarch64.

This patch enables using GCC builtins only when no assembly
implementation is provided. But as a quick check, atomicops_unittest
and the rest of the testsuite also pass with atomicops-internals-gcc on
ARMv7 and X86_64 if the ifdef in atomicops is adjusted to prefer
the generic implementation.

[1] http://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
2014-02-05 16:35:49 +02:00
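As a rough illustration of what building atomic operations on the GCC `__atomic` builtins can look like, here is a minimal sketch. The wrapper names and the chosen memory orders are assumptions for this example; the real atomicops-internals-gcc header may be organized differently.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical wrapper names; not copied from the gperftools sources.
typedef std::int32_t Atomic32;

// Atomically replaces *ptr with new_value if it currently equals old_value.
// Returns the value *ptr held before the operation.
inline Atomic32 CompareAndSwap(volatile Atomic32* ptr,
                               Atomic32 old_value,
                               Atomic32 new_value) {
  Atomic32 expected = old_value;
  __atomic_compare_exchange_n(ptr, &expected, new_value, /*weak=*/false,
                              __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
  // On failure the builtin writes the observed value into 'expected';
  // on success 'expected' still equals old_value, the previous contents.
  return expected;
}

// Load with acquire ordering.
inline Atomic32 AtomicLoad(const volatile Atomic32* ptr) {
  return __atomic_load_n(ptr, __ATOMIC_ACQUIRE);
}

// Store with release ordering.
inline void AtomicStore(volatile Atomic32* ptr, Atomic32 value) {
  __atomic_store_n(ptr, value, __ATOMIC_RELEASE);
}
```

Because the builtins are compiler-provided, the same source serves every architecture the compiler targets, which is the point the commit makes about avoiding another assembly port.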
Aliaksey Kandratsenka
fa4b1c401d issue-599: fixing FreeBSD issue with sbrk
Applied patch by yurivict.

The sbrk overriding code had incorrect assembly specifically for
FreeBSD.
2014-01-19 22:37:44 -08:00
Aliaksey Kandratsenka
71a239e559 check debug_malloc_implementation_space via COMPILE_ASSERT
Because we can and because compile-time is always better.
2014-01-19 12:30:53 -08:00
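For readers unfamiliar with compile-time assertions, the idea can be sketched as below. The macro, type, and buffer names are hypothetical stand-ins, not the project's COMPILE_ASSERT or its debug_malloc_implementation_space; the sketch only shows how a size check can fail the build instead of failing at run time.

```cpp
#include <cstddef>

// Hypothetical pre-C++11 style compile-time assertion: a false condition
// produces an array typedef with a negative size, which does not compile.
#define SKETCH_COMPILE_ASSERT(expr, name) typedef char name[(expr) ? 1 : -1]

// Hypothetical stand-in for an object that lives in a static buffer.
struct SketchImpl {
  void* a;
  void* b;
};

// Hypothetical raw storage that must be large enough for SketchImpl.
static char sketch_space[sizeof(SketchImpl)];

// Checked by the compiler; no code is generated for it.
SKETCH_COMPILE_ASSERT(sizeof(sketch_space) >= sizeof(SketchImpl),
                      sketch_space_is_big_enough);

// The C++11 spelling of the same check.
static_assert(sizeof(sketch_space) >= sizeof(SketchImpl),
              "sketch_space is too small");
```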
Aliaksey Kandratsenka
54568e32fc issue-565: don't pollute global namespace with thread lister API
Instead, those functions that were originally taken from Google's "base"
code now have the prefix TCMalloc_, so that they don't conflict with other
Google libraries that have the same functions.
2014-01-18 17:23:14 -08:00
Aliaksey Kandratsenka
64bc1baa1f issue-{66,547}: use signal's ucontext when unwinding backtrace
In issue-66 (and readme) it is pointed out that sometimes there are
some issues grabbing backtrace across signal handler boundary.

This code attempts to fix it by grabbing backtrace from signal's
ucontext which clearly does not include signal handler boundary.

We're using "feature" of libunwind that for some important platforms
libunwind's context is same as libc's ucontext_t which is given to us
as part of calling signal handler.
2014-01-18 17:21:00 -08:00
Aliaksey Kandratsenka
185bf3fcc3 issue-581: avoid destructing DebugMallocImplementation
Because otherwise destructor might be invoked well before other places
that might touch malloc extension instance.

We're using placement new to initialize it and pass pointer to
MallocExtension::Register. Which ensures that destructor for it is
never run.

Based on idea suggested by Andrew C. Morrow.
2014-01-18 14:14:22 -08:00
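The placement-new technique described in this commit can be sketched as follows. `Tracker`, `tracker_space`, and `GetTracker` are hypothetical names, and the sketch uses C++11 features (`alignas`, function-local statics) that the original code may not rely on. The point is that an object constructed into static storage with placement new has no destructor call registered for program exit.

```cpp
#include <cassert>
#include <new>

// Hypothetical type that counts how often its destructor runs.
struct Tracker {
  static int destroyed;
  int value;
  explicit Tracker(int v) : value(v) {}
  ~Tracker() { ++destroyed; }
};
int Tracker::destroyed = 0;

// Raw static storage; the compiler schedules no destructor for a char array.
alignas(Tracker) static char tracker_space[sizeof(Tracker)];

// Constructs the object once, in place, and never destroys it.
Tracker* GetTracker() {
  static Tracker* instance = new (tracker_space) Tracker(42);
  return instance;
}
```

A plain `static Tracker instance(42);` would instead be destroyed during static destruction, possibly before other code that still uses it has finished, which is the hazard the commit describes.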
Aliaksey Kandratsenka
48a0d131c1 issue-548: pass -fno-builtin to compiler for unittests
Because clang doesn't understand -fno-builtin-malloc and friends. And
otherwise new/delete pairs get optimized away causing our tests that
expect hooks to be called to fail.
2014-01-18 13:28:46 -08:00
Aliaksey Kandratsenka
e98371540d eliminated gcc warning on __thread configure snippet
gcc complained about lack of matching ' in code that force-fails
__thread detection on mingw
2014-01-11 16:28:15 -08:00
xiaoyur347
60b12171bc fix GCC version detect for platforms other than X86/X64
[alk@tut.by: commented why we're disabling __thread not just for x86]

Signed-off-by: Aliaksey Kandratsenka <alk@tut.by>
2014-01-11 16:24:59 -08:00
Aliaksey Kandratsenka
764d304222 don't re-define strtoq for VS2013
Which is part of previous change that wasn't correctly applied.
2014-01-05 12:49:23 -08:00
Aliaksey Kandratsenka
1fc768864d fix compilation under VS 2013
This is essentially a copy of corresponding chromium change from:
https://codereview.chromium.org/27017003
2014-01-05 12:43:59 -08:00
Aliaksey Kandratsenka
4c274b9e20 issue-592: handle recent mingw with C++11 threads
Somehow its C++ headers (like string) define pthread symbols without
us even asking for them. That breaks the old assumption that pthread symbols
are not available on Windows.

In order to fix that we detect this condition in configure.ac and
avoid defining windows versions of pthread symbols.
2014-01-04 18:28:36 -08:00
Aliaksey Kandratsenka
1458ee2239 issue-596: removed unused AtomicIncrement operation
There's no need for us to attempt to maintain Google's atomic ops code
in era of C++11.
2014-01-04 13:59:56 -08:00
Aliaksey Kandratsenka
6630b24e27 Removed unused AtomicPtr::CompareAndSwap 2014-01-04 13:59:49 -08:00
xiaoyur347
a15115271c add "-finstrument-functions" support for MIPS uclibc.
should configure with CXXFLAGS="-finstrument-functions"
2013-12-20 09:41:08 +08:00
xiaoyur347
7c4888515e add uclibc support
* some variables defined with "char *" should be modified to "const char*"
* For uclibc, glibc's "void malloc_stats(void)" would have to be "void malloc_stats(FILE *)", so it is commented out for now.
* For uclibc, __sbrk has the "hidden" attribute, so we use the mmap allocator for uclibc.
2013-12-20 09:02:49 +08:00
Aliaksey Kandratsenka
7bd193bca9 issue-586: detect main executable even if PIE is active
The previous logic for detecting main program addresses assumed that
the main executable is at the lowest addresses. With PIE (active by default on
Ubuntu) that doesn't work.

In order to deal with that, we're attempting to find main executable
mapping in /proc/[pid]/maps. And old logic is preserved too just in
case.
2013-12-14 17:58:33 -08:00
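To make the /proc/[pid]/maps approach concrete, here is a minimal sketch of parsing one line of that file into an address range and a path. The struct and function names are hypothetical, paths containing spaces are not handled, and this is not the parser the commit adds.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Hypothetical record for one line of /proc/[pid]/maps.
struct Mapping {
  unsigned long long start;
  unsigned long long end;
  char path[256];  // empty for anonymous mappings
};

// Parses a line of the form
//   "start-end perms offset dev inode [path]"
// Returns true when at least the address range and permissions were read.
bool ParseMapsLine(const char* line, Mapping* out) {
  char perms[5];
  out->path[0] = '\0';
  int fields = std::sscanf(line, "%llx-%llx %4s %*llx %*x:%*x %*u %255s",
                           &out->start, &out->end, perms, out->path);
  return fields >= 3;
}

// True when addr falls inside the half-open range [start, end).
bool Contains(const Mapping& m, unsigned long long addr) {
  return addr >= m.start && addr < m.end;
}
```

A caller could read /proc/self/maps line by line and compare each path with the executable's own path to find the main program's mappings, wherever a PIE binary happens to be loaded.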
Aliaksey Kandratsenka
f8a2163b51 Added AM_MAINTAINER_MODE to disable Makefile rebuild rules
Some people might want to check the unpacked result of make dist into
git. But because git doesn't preserve timestamps, it would cause those
automatic "auto-retool" rules to trigger, sometimes even causing build
breakage if the system's autotools version doesn't match the autotools version
used for make dist.

Easiest way around this problem is to simply disable those unnecessary
"maintainer" rebuild rules. Especially given that source is always
freely available via git and therefore there should be no reason to
regenerate any of autotools products in 'make dist'-produced sources.
2013-12-06 12:29:14 -08:00
Aliaksey Kandratsenka
925bbaea76 actually check result of CheckAddressBits
Previously the call to CheckAddressBits was made but nothing was done with
its result.

I've also made sure that the actual size is used in checks and in bumping
up TCMalloc_SystemTaken.
2013-11-16 16:00:31 -08:00
Aliaksey Kandratsenka
f216317a87 use AC_PROG_LIBTOOL to summon libtool
So that older autotools of rhel 5 can be used
2013-11-16 15:44:52 -08:00
Aliaksey Kandratsenka
d4f4c5a310 assert that ClassSize(0) is 0 instead of >=0
Because its return value, being size_t, cannot be negative
anyway. This fixes a clang warning
2013-11-16 14:00:19 -08:00
Aliaksey Kandratsenka
946203d60e assert key size in way that is clearer to gcc
Both new and old asserts are checking same condition, however new
assert helps gcc see that out of bounds access is not possible in
root_ array.
2013-11-16 13:35:59 -08:00
Aliaksey Kandratsenka
bf2d7bd3f8 fixed gcc warning
We've recently changed old_signal_handler to be an integer, so comparing
it with NULL is not a good idea.
2013-11-16 13:31:34 -08:00
Aliaksey Kandratsenka
dd5f979c5e fixed -Wreorder warning in HeapProfileTable constructor 2013-11-16 13:31:08 -08:00
Aliaksey Kandratsenka
e4ea98f147 issue-585: fixed use of TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES
In order to apply that, we're now doing explicit EnvToInt64 call as
part of initializing thread cache module.
2013-11-16 12:23:55 -08:00
Aliaksey Kandratsenka
e0102230ec issue-588: Fix profiler_unittest.cc fork()
As suggested by Hannes Weisbach.

Call heap-profiler_unittest with the arguments 1 -2 (one iteration, 2
fork()ed children).

Instead of running the test, the program crashes with a std::bad_alloc
exception.  This is caused by unconditionally passing the
number-of-threads-argument (0 or positive for threads, negative for
fork()s) in RunManyThreads(), thus allocating an array of pthread_t of
size -2.  Depending on the sign of the thread number argument either
RunManyThreads or fork() should be called.
2013-11-16 12:03:35 -08:00
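The sign convention described in this commit (zero or positive means threads, negative means fork()ed children) can be sketched as a small dispatch step. The names below are hypothetical and not taken from the unit test; the sketch only decides which mechanism to use and how many workers to start, so that a negative count is never used as an array size.

```cpp
// Hypothetical description of how the workers should be started.
enum class WorkerMode { kThreads, kForks };

struct WorkerPlan {
  WorkerMode mode;
  int count;  // always non-negative
};

// Interprets the command-line argument: n >= 0 asks for n threads,
// n < 0 asks for -n fork()ed children.
constexpr WorkerPlan PlanWorkers(int n) {
  return n < 0 ? WorkerPlan{WorkerMode::kForks, -n}
               : WorkerPlan{WorkerMode::kThreads, n};
}
```

With this split, an argument of -2 yields two forked children rather than an attempt to allocate an array of -2 thread handles, which is the allocation the commit identifies as the source of the std::bad_alloc.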
Aliaksey Kandratsenka
2bf83af656 issue-587: fix typos in unit test scripts
As proposed by Hannes Weisbach.

The argument will be garbled because of a misplaced brace, for example
(heap-checker_unittest.sh):

HEAP_CHECKER="${1:-$BINDIR}/heap-checker_unittest"
which should be:
HEAP_CHECKER="${1:-$BINDIR/heap-checker_unittest}"

This unit test is used to check the binaries heap-checker_unittest and
heap-checker_debug_unittest.  With the typo, the executable
heap-checker_debug_unittest is never actually run.
2013-11-16 11:35:32 -08:00
Aliaksey Kandratsenka
b3b1926978 issue-584: added license note to files without explicit license
As suggested at corresponding chromium issue discussion it's seemingly
sufficient to simply refer to project-wide LICENSE file.
2013-11-09 12:28:55 -08:00
Joonsoo Kim
7be35fb0d8 central_freelist: change fetch ordering
When we fetch objects from the span for the thread cache, we build a list
in reverse order relative to the original list on the span and supply this list
to the thread cache. This algorithm has trouble with a newly created span.
A newly created span has its object list in ascending order. Since the thread cache
gets the reversed list, the user gets objects in descending order.

The following example shows what occurs in this algorithm.

new span: object list: 1 -> 2 -> 3 -> 4 -> 5 -> ...
fetch N items: N -> N-1 -> N-2 -> ... -> 2 -> 1 -> NULL
thread cache: N -> N-1 -> N-2 -> ... -> 2 -> 1 -> NULL

user's 1st malloc: N
user's 2nd malloc: N-1
...
user's Nth malloc: 1

In general, accessing memory in ascending order is better than descending
order in terms of performance. So this patch fixes this situation.

I ran the program below to measure the performance effect.

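	#include <stdlib.h>

	#define MALLOC_SIZE (512)
	#define CACHE_SIZE (64)
	#define TOUCH_SIZE (512 / CACHE_SIZE)

	int main(int argc, char **argv)
	{
		/* count is assumed to come from the command line (./a.out 1000000) */
		long count = argc > 1 ? atol(argv[1]) : 1000000;
		long i, j, k, repeat;
		char *x;
		char **array;

		array = malloc(sizeof(void *) * count);

		/* allocate all objects once */
		for (i = 0; i < 1; i++) {
			for (j = 0; j < count; j++) {
				x = malloc(MALLOC_SIZE);
				array[j] = x;
			}
		}

		/* touch one byte per cache line of every object, in allocation order */
		repeat = 10;
		for (i = 0; i < repeat; i++) {
			for (j = 0; j < count; j++) {
				x = array[j];
				for (k = 0; k < TOUCH_SIZE; k++) {
					*(x + (k * CACHE_SIZE)) = '1';
				}
			}
		}
		return 0;
	}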

LD_PRELOAD=libtcmalloc_minimal.so perf stat -r 10 ./a.out 1000000

**** Before ****
 Performance counter stats for './a.out 1000000' (10 runs):

       2.715161299 seconds time elapsed                                          ( +-  0.07% )

**** After ****
 Performance counter stats for './a.out 1000000' (10 runs):

       2.259366428 seconds time elapsed                                          ( +-  0.08% )
2013-10-26 23:28:31 -07:00
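The ordering effect described in this commit can be reproduced with a small singly linked list. The sketch below is not the central_freelist code; `Object`, `FetchReversed`, and `FetchInOrder` are hypothetical names that contrast pushing each fetched object onto the front of the result (which reverses the order) with detaching a prefix of the list (which preserves it).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical free-list node; id stands in for the object's address order.
struct Object {
  int id;
  Object* next;
};

// Pops up to n objects off the span list and pushes each onto the front of
// the result, so the result comes out in reverse order.
Object* FetchReversed(Object** span_head, int n) {
  Object* result = nullptr;
  while (n-- > 0 && *span_head != nullptr) {
    Object* obj = *span_head;
    *span_head = obj->next;
    obj->next = result;
    result = obj;
  }
  return result;
}

// Detaches the first n objects as one sublist, keeping their order.
Object* FetchInOrder(Object** span_head, int n) {
  if (n <= 0 || *span_head == nullptr) return nullptr;
  Object* head = *span_head;
  Object* tail = head;
  while (--n > 0 && tail->next != nullptr) tail = tail->next;
  *span_head = tail->next;  // the span keeps whatever follows the prefix
  tail->next = nullptr;
  return head;
}
```

With a fresh span holding 1 -> 2 -> 3 -> 4 -> 5, fetching three objects gives 3 -> 2 -> 1 from the first function and 1 -> 2 -> 3 from the second, so successive allocations walk memory upwards in the second case.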
Joonsoo Kim
7315b45c28 central_freelist: fetch objects as much as possible during each trial
It is better to reduce function calls if possible. If we fetch
as many objects as possible from one span during each function call,
the number of function calls is reduced, which helps performance.
2013-10-26 23:28:31 -07:00
Joonsoo Kim
cc002ea193 skip unnecessary check during double-check SizeClass integrity
On the initialization step, tcmalloc double-checks SizeClass integrity with
all possible size values, 0 to kMaxSize. This causes tremendous overhead
for short-lived applications.

For example, consider the following command.
'find -exec grep something {} \;'

The actual work of each grep is really small, but the double-check requires
more work. To reduce this overhead, it would be best to remove the double-check
entirely. But we cannot be sure of the integrity without double-checking,
so an alternative is needed.

This patch doesn't remove the double-check; instead, it tries to skip unnecessary
checks based on the ClassIndex() implementation. This reduces much of the overhead and
the code has the same coverage as the previous double-check. The following is
the result of this patch.

time LD_PRELOAD=libtcmalloc_minimal.so find ./ -exec grep "SOMETHING" {} \;

* Before
real	0m3.675s
user	0m1.000s
sys	0m0.640s

* This patch
real	0m2.833s
user	0m0.056s
sys	0m0.220s

* Remove double-check entirely
real	0m2.675s
user	0m0.072s
sys	0m0.184s
2013-10-26 23:28:31 -07:00
Aliaksey Kandratsenka
3e9a33e8c7 issue-583: include pthread.h into static_var.cc
Because we're doing pthread_atfork.

Fix suggested by user named drussel.
2013-10-26 16:56:35 -07:00
Aliaksey Kandratsenka
db0d5730ee issue-579: ensure order between memory region and libunwind locks
I.e. to prevent a possible deadlock when these locks are taken by
different threads in a different order.

This particular problem was also reported as part of issue 66.
2013-10-12 16:13:51 -07:00
Aliaksey Kandratsenka
42ddc8d42c added emacs -*- mode lines for google coding style 2013-10-12 15:36:42 -07:00
Aliaksey Kandratsenka
799a22624c issue-575: do not use cycle count register on arm6
Apparently not all arm6 implementations implement it in this
particular way.

This applies patch by Ben Avison.
2013-09-28 19:32:20 -07:00
Petr Hosek
2a2d6596f8 Adds system-alloc_unittest Visual Studio project 2013-09-21 09:01:26 -07:00
Petr Hosek
83aed118e0 issue-567: Allows for overriding system allocator on Windows
[alk@tut.by: minor changes to make mingw build work]
Signed-off-by: Aliaksey Kandratsenka <alk@tut.by>
2013-09-21 09:00:29 -07:00
Petr Hosek
4ad16873a0 Exports SysAllocator class to avoid .dll build errors 2013-09-21 08:59:26 -07:00
Aliaksey Kandratsenka
326990b5c3 issue-557: added support for dumping heap profile via signal
This applies patch from Jean Lee.

I've reformatted it to match the surrounding code style and changed
the validation logic a bit. I.e. we're not checking the signal for range
anymore, given we're not sure what different platforms support, but
we're checking the return value of signal() for SIG_ERR instead.
2013-09-14 17:42:41 -07:00
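The validation strategy described here, relying on the return value of signal() rather than range-checking the signal number, can be sketched as below. The handler and function names are hypothetical, and the handler only sets a flag; the real patch triggers a heap profile dump, which is not shown.

```cpp
#include <cassert>
#include <csignal>

// Set by the handler; a real implementation would act on it elsewhere.
static volatile std::sig_atomic_t dump_requested = 0;

// Hypothetical handler: records that a dump was requested.
extern "C" void OnDumpSignal(int /*signo*/) { dump_requested = 1; }

// Installs the handler and reports failure through signal()'s return value
// instead of guessing which signal numbers the platform supports.
bool InstallDumpHandler(int signo) {
  return std::signal(signo, OnDumpSignal) != SIG_ERR;
}
```

Letting the platform reject unsupported signal numbers keeps the check portable, since the valid range differs between systems.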
Aliaksey Kandratsenka
cb65e49b83 issue-536: do not PrintStats if running under valgrind
When we detect running under valgrind we do not initialize our own
malloc. So trying to print malloc stats when asked via MALLOCSTATS
cannot work.

This applies the fix proposed by Philippe Waroquiers, in which we detect
running under valgrind prior to checking the MALLOCSTATS environment
variable and refuse to print stats if we detect valgrind.
2013-09-14 16:45:42 -07:00
Aliaksey Kandratsenka
6979583592 issue-564: added atomic ops support for mips{,64}
This merges patch contributed by Jovan Zelincevic.

And with that patch tcmalloc build with --enable-minimal (just malloc
replacement) appears to work (passes unit tests).
2013-09-09 07:59:25 -07:00
Aliaksey Kandratsenka
28dd85e282 implement pc from ucontext access for mips 2013-08-30 16:57:14 +03:00
Aliaksey Kandratsenka
819a2b051f issue-413: disable __thread usage on OSX
Because it was found that __thread variables access is compiled into
calls to tlv_get_addr which was found to call malloc. Because we
actually use thread-local storage from inside malloc it leads to stack
overflow. So we'll continue using pthreads API for that which is known
to work on OSX.
2013-08-29 19:41:25 +03:00
Aliaksey Kandratsenka
4380908093 lowered autoconf requirement
Autoconf 2.59 works. And most notably it will not affect our releases
which are all prepared with newer autoconf.
2013-08-29 19:40:51 +03:00