427 Commits

Author SHA1 Message Date
Aliaksey Kandratsenka
cdff090ebd Fix several harmless clang warnings 2016-02-20 22:07:26 -08:00
Aliaksey Kandratsenka
9095ed0840 implemented stacktrace capturing via libgcc's C++ ABI function
Particularly _Unwind_Backtrace, which seems to be a gcc extension.

This is what glibc's backtrace commonly uses.

Using _Unwind_Backtrace directly is better than glibc's backtrace, since
it doesn't call into dlopen. glibc, when built as a shared library, does
call dlopen, apparently to avoid a link-time dependency on libgcc_s.so.
2016-02-20 20:34:50 -08:00
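A minimal sketch of what capturing a stack trace through _Unwind_Backtrace can look like, assuming a gcc-compatible toolchain that ships <unwind.h>. The names CaptureStackTrace, TraceState and RecordFrame are hypothetical and are not taken from the commit; only _Unwind_Backtrace and _Unwind_GetIP are the unwinder's own entry points.

```cpp
#include <cassert>
#include <unwind.h>

namespace {

// Hypothetical state handed to the unwinder callback.
struct TraceState {
  void** frames;  // output buffer for instruction pointers
  int max_depth;  // capacity of the buffer
  int depth;      // number of frames recorded so far
};

// Called by the unwinder once per stack frame, innermost first.
_Unwind_Reason_Code RecordFrame(struct _Unwind_Context* ctx, void* arg) {
  TraceState* state = static_cast<TraceState*>(arg);
  if (state->depth >= state->max_depth) {
    return _URC_END_OF_STACK;  // buffer is full: ask the unwinder to stop
  }
  state->frames[state->depth++] = reinterpret_cast<void*>(_Unwind_GetIP(ctx));
  return _URC_NO_REASON;  // keep walking to the next frame
}

}  // namespace

// Stores up to max_depth instruction pointers and returns how many were stored.
int CaptureStackTrace(void** frames, int max_depth) {
  assert(frames != nullptr && max_depth >= 0);
  TraceState state = {frames, max_depth, 0};
  _Unwind_Backtrace(RecordFrame, &state);
  return state.depth;
}
```

The callback only writes into a caller-supplied buffer, so the sketch itself does not allocate or load shared libraries while walking the stack.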
Aliaksey Kandratsenka
728cbe1021 force profiler_unittest to do 'real' work
The 'XOR loop' in the profiler unittest wasn't 100% effective because it
allowed the compiler to avoid loading from and storing to memory.

After marking the result variable as volatile, we're now forcing the
compiler to read and write memory, slowing these loops down sufficiently,
and profiler_unittest is now passing more consistently.

Closes #628
2016-02-20 13:21:47 -08:00
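As an illustration of the volatile trick described in this message, here is a small sketch. XorLoop is a hypothetical stand-in, not the unittest's actual function.

```cpp
#include <cstdint>

// Hypothetical busy loop: because `result` is volatile, the compiler has to
// load it from memory and store it back on every iteration instead of
// keeping it in a register or folding the whole loop away.
std::uint32_t XorLoop(std::uint32_t iterations) {
  volatile std::uint32_t result = 0;
  for (std::uint32_t i = 0; i < iterations; ++i) {
    result = result ^ i;  // one volatile load and one volatile store
  }
  return result;
}
```

The written-out form `result = result ^ i` is used instead of `^=` only because some newer C++ standards warn about compound assignment on volatile objects.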
Aliaksey Kandratsenka
fff6b4fb88 Extend low-level allocator to support custom pages allocator 2016-02-06 19:14:23 -08:00
Aliaksey Kandratsenka
32d9926795 added malloc_bench_shared_full 2016-02-06 19:14:23 -08:00
Aliaksey Kandratsenka
00d8fa1ef8 always use real throw() on operators new/delete
Non-glibc C libraries have no __THROW, and the lack of throw() on the
operators gives us a warning.
2016-02-06 19:13:07 -08:00
Aliaksey Kandratsenka
08e034ad59 Detect working ifunc before enabling dynamic sized delete support
Particularly, on arm-linux and x86-64-debian-kfreebsd compilation fails
due to lack of support for ifunc. So it is necessary to test at
configure time whether ifunc is supported.
2016-02-06 16:21:53 -08:00
Aliaksey Kandratsenka
a788f354a0 include unistd.h for getpid in thread_lister.c
This fixes warning produced on arm-linux.
2016-02-06 16:21:53 -08:00
Bryan Chan
644a6bdbdb Add support for Linux s390x
This resolves gperftools/gperftools#761.
2016-02-04 20:15:26 -05:00
Bryan Chan
bab7753aad Fix typo in heap-checker-death_unittest.sh 2016-02-04 18:34:43 -05:00
Simon Que
17182e1d3c Fix include of malloc_hook_c.h in malloc_hook.h
malloc_hook.h includes malloc_hook_c.h as <gperftools/malloc_hook_c.h>.
This requires the compiler to have designated src/gperftools as a
standard include directory (-I), which may not always be the case.

Instead, include it as "malloc_hook_c.h", which will search in the same
directory first. This will always work, regardless of whether it was
designated a standard include directory.
2016-01-29 18:17:16 -08:00
Andrew Morrow
c69721b2b2 Add support for obtaining cache size of the current thread and softer idling 2016-01-26 19:44:16 -08:00
Brian Silverman
5ce42e535d Don't always arm the profiling timer.
It causes a noticeable performance hit and can sometimes confuse GDB.

Tested with CPUPROFILE_PER_THREAD_TIMERS=1.

Based on an old version by mnissler@google.com.
2016-01-26 18:31:42 -08:00
Duncan Sands
7f801ea091 Make sure the alias is not removed by link-time optimization when it can prove
that it isn't used by the program, as it might still be needed to override the
corresponding symbol in shared libraries (or inline assembler for that matter).
For example, suppose the program uses malloc and free but not calloc and is
statically linked against tcmalloc (built with -flto) and LTO is done.  Then
before this patch the calloc alias would be deleted by LTO due to not being
used, but the malloc/free aliases would be kept because they are used by the
program.  Suppose the program is dynamically linked with a shared library that
allocates memory using calloc and later frees it by calling free.  Then calloc
will use the libc memory allocator, because the calloc alias was deleted, but
free will call into tcmalloc, resulting in a crash.
2016-01-24 19:49:52 -08:00
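A rough sketch of the idea, assuming a gcc-compatible compiler on an ELF platform: the `used` attribute asks the compiler to emit the alias even when nothing in the program references it. The names my_calloc and my_calloc_impl are hypothetical, and the non-ELF fallback branch is this sketch's own assumption, not something the commit describes.

```cpp
#include <cstddef>
#include <cstdlib>

extern "C" {

// Hypothetical stand-in for the allocator's real implementation.
void* my_calloc_impl(std::size_t n, std::size_t size) {
  return std::calloc(n, size);
}

#if defined(__GNUC__) && defined(__ELF__)
// `alias` makes my_calloc a second name for my_calloc_impl.
// `used` keeps that second name in the output even if it looks unreferenced.
void* my_calloc(std::size_t n, std::size_t size)
    __attribute__((alias("my_calloc_impl"), used));
#else
// Fallback for toolchains without alias support: a plain forwarding wrapper.
void* my_calloc(std::size_t n, std::size_t size) {
  return my_calloc_impl(n, size);
}
#endif

}  // extern "C"
```

With the alias kept, a shared library that calls the aliased name resolves it to the same allocator that later services the matching free, which is the mismatch the message describes.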
Aliaksey Kandratsenka
6b3e6ef5e0 don't retain compatibility with old docdir behavior
Since it is not really needed. And since we don't care about too ancient
autoconfs.
2016-01-24 19:45:16 -08:00
Chris Mayo
ccffcbd9e9 support use of configure --docdir argument
Value of docdir was being overridden in Makefile.

Retain compatibility with old Autoconf versions that do not provide
docdir.
2015-12-27 18:55:05 +00:00
Aliaksey Kandratsenka
050f2d28be use alias attribute only for elf platforms
It was reported that clang on OSX doesn't support alias attribute. Most
likely because of executable format limitations.

New code limits use of alias to gcc-compatible compilers on elf
platforms (various gnu and *bsd systems). Elf format is known to support
aliases.
2015-12-12 18:27:56 -08:00
cyshi
07b0b21ddd fix compilation error in spinlock 2015-12-02 14:47:15 +08:00
gshirishfree
e14450366a Added better description for GetStats API 2015-11-23 11:34:13 -08:00
Aliaksey Kandratsenka
64892ae730 lower default transfer batch size down to 512
Some workloads get much slower with too large batch size.

This closes bug #678.

binary_trees benchmark benefits from larger batch size. And I found that
512 is not much slower than huge value that we had.
2015-11-21 19:22:49 -08:00
Aliaksey Kandratsenka
6fdfc5a7f4 implemented enabling sized-delete support at runtime
Under gcc 4.5 or greater we're using ifunc function attribute to resolve
sized delete operator to either plain delete implementation (default) or
to sized delete (if enabled via environment variable
TCMALLOC_ENABLE_SIZED_DELETE).
2015-11-21 19:03:03 -08:00
Aliaksey Kandratsenka
c2a79d063c use x86 pause in spin loop
This saves power and improves performance, particularly on SMT.
2015-11-21 18:17:24 -08:00
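For illustration, a spin loop that issues the x86 pause instruction might look like the sketch below. TinySpinLock and CpuRelax are hypothetical names, and using the `__builtin_ia32_pause` builtin of gcc and clang is an assumption of this sketch rather than the commit's actual code.

```cpp
#include <atomic>

// Busy-wait hint. On x86 with gcc or clang this emits `pause`;
// on other targets it compiles to nothing.
inline void CpuRelax() {
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
  __builtin_ia32_pause();
#endif
}

// Hypothetical minimal spinlock, only meant to show where the hint goes.
class TinySpinLock {
 public:
  void Lock() {
    while (flag_.test_and_set(std::memory_order_acquire)) {
      CpuRelax();  // let the sibling hardware thread make progress
    }
  }
  bool TryLock() { return !flag_.test_and_set(std::memory_order_acquire); }
  void Unlock() { flag_.clear(std::memory_order_release); }

 private:
  std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};
```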
Aliaksey Kandratsenka
0fb6dd8aa3 added binary_trees benchmark 2015-11-21 18:17:21 -08:00
Aliaksey Kandratsenka
a8852489e5 drop unsupported allocation sampling code in tcmalloc_minimal 2015-11-21 17:43:42 -08:00
Aliaksey Kandratsenka
a9db0ae516 implemented (disabled by default) sized delete support
gcc 5 and clang++-3.7 support sized deallocation from C++14. We are
taking advantage of that by defining sized versions of operator delete.

This is off by default so that existing programs that define their own
global operator delete without a sized variant are not broken by
tcmalloc's sized delete operator.

There is also a risk of breaking existing code that deletes objects using
the wrong class (i.e. base class) without having virtual destructors.
2015-11-21 17:43:42 -08:00
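The sketch below shows the shape of a C++14 sized operator delete next to the unsized one. It is backed by malloc/free purely for illustration; the counters g_sized_delete_calls and g_last_deleted_size are hypothetical and exist only to make the call observable.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical bookkeeping so that the sized path can be observed.
static std::size_t g_sized_delete_calls = 0;
static std::size_t g_last_deleted_size = 0;

void* operator new(std::size_t size) {
  if (void* p = std::malloc(size ? size : 1)) return p;
  throw std::bad_alloc();
}

// Unsized variant: the allocator has to look the size up itself.
void operator delete(void* p) noexcept { std::free(p); }

// Sized variant: the compiler passes the static size of the deleted object,
// so a real allocator could skip its own size lookup.
void operator delete(void* p, std::size_t size) noexcept {
  ++g_sized_delete_calls;
  g_last_deleted_size = size;
  std::free(p);
}
```

Whether a plain `delete ptr;` expression reaches the sized variant depends on the compiler and its flags. The size passed is the static type's size, which is why deleting a derived object through a base pointer without a virtual destructor would hand the allocator the wrong value.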
Aliaksey Kandratsenka
88686972b9 pass -fsized-deallocation to gcc 5
Otherwise it gives warning for declaration of sized delete operator.
2015-11-21 17:43:42 -08:00
Aliaksey Kandratsenka
0a18fab3af implemented sized free support via tc_free_sized 2015-11-21 17:43:42 -08:00
Aliaksey Kandratsenka
464688ab6d speedup free code path by dropping "fast path allowed check" 2015-11-21 17:43:42 -08:00
Aliaksey Kandratsenka
10f7e20716 added SizeMap::MaybeSizeClass
Because it allows us to first check for smaller sizes, which is most
likely.
2015-11-21 17:43:42 -08:00
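One way to picture "check for smaller sizes first" is the sketch below. LookupClassIndex, the two limits and the index arithmetic are illustrative assumptions, not the actual SizeMap::MaybeSizeClass code.

```cpp
#include <cassert>
#include <cstddef>

// Illustrative limits; the real thresholds may differ.
constexpr std::size_t kSmallLimit = 1024;            // handled by the first check
constexpr std::size_t kLargestClassed = 256 * 1024;  // beyond this: no size class

// Writes a class index for `size` and returns true, or returns false when
// the size is too large to have a size class at all.
bool LookupClassIndex(std::size_t size, std::size_t* index) {
  assert(index != nullptr);
  if (size <= kSmallLimit) {   // the most likely case is tested first
    *index = (size + 7) >> 3;  // 8-byte granularity
    return true;
  }
  if (size <= kLargestClassed) {
    // 128-byte granularity, offset so indices continue after the small range.
    *index = (size + 127 + (120 << 7)) >> 7;
    return true;
  }
  return false;
}
```

Small requests return after a single comparison, and only larger ones pay for the second range check.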
Aliaksey Kandratsenka
436e1dea43 slightly faster GetCacheIfPresent 2015-11-21 17:43:42 -08:00
Aliaksey Kandratsenka
04df911915 tell compiler that non-empty hooks are unlikely 2015-11-21 17:43:42 -08:00
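A sketch of the kind of branch hint this refers to, assuming gcc or clang's `__builtin_expect`. SKETCH_UNLIKELY, g_new_hook, CountingHook and AllocateWithHook are hypothetical names for illustration only.

```cpp
#include <cstddef>
#include <cstdlib>

#if defined(__GNUC__)
#define SKETCH_UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define SKETCH_UNLIKELY(x) (x)
#endif

typedef void (*NewHook)(const void* ptr, std::size_t size);

static NewHook g_new_hook = nullptr;   // hypothetical single hook slot
static std::size_t g_hooked_bytes = 0;

// Example hook that just adds up the requested sizes.
void CountingHook(const void* /*ptr*/, std::size_t size) {
  g_hooked_bytes += size;
}

void* AllocateWithHook(std::size_t size) {
  void* p = std::malloc(size);
  // Hooks are normally absent, so the compiler is told to lay the code out
  // for the "no hook" case and move the hook call off the hot path.
  if (SKETCH_UNLIKELY(g_new_hook != nullptr)) {
    g_new_hook(p, size);
  }
  return p;
}
```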
Aliaksey Kandratsenka
8cc75acd1f correctly test for -Wno-unused-result support
gcc is only giving warning for unknown -Wno-XXX flags so test never
fails on gcc even if -Wno-XXX is not supported. By using
-Wunused-result we're able to test if gcc actually supports it.

This fixes issue #703.
2015-11-21 17:43:42 -08:00
Aliaksey Kandratsenka
7753d8239b fixed clang warning about shifting negative values 2015-11-21 17:43:42 -08:00
Jens Rosenboom
ae09ebb383 Fix tmpdir usage in heap-profiler_unittest.sh
Using a single fixed directory would break when tests were being run in
parallel with "make -jN".

Also, the cleanup at the end of the test didn't work because it referred
to the wrong variable.
2015-11-21 17:37:49 -08:00
Aliaksey Kandratsenka
df34e71b57 use $0 when referring to pprof
This fixed debian bug #805536. Debian ships pprof under google-pprof
name so it is handy when google-pprof --help refers to itself correctly.
2015-11-21 17:24:30 -08:00
Adhemerval Zanella
7773ea64ee Alignment fix to static variables for system allocators
This patch changes the placement new for some system allocators to force
the static buffer to be aligned to pointer size.
2015-11-06 16:29:12 -02:00
Boris Sazonov
c46eb1f3d2 Fixed printf misuse in pprof - printed string was passed as format. Better use print instead 2015-10-17 22:50:46 -07:00
Boris Sazonov
9bbed8b1a8 Fixed assembler argument passing inside _syscall6 on MIPS - it was causing 'Expression too complex' compilation errors in spinlock 2015-10-17 22:50:46 -07:00
Aliaksey Kandratsenka
962aa53c55 added more fastpath microbenchmarks
This also makes them output nicer results, i.e. every benchmark is run 3
times and iteration duration is printed for every run.

While this is still very synthetic and unrepresentative of malloc
performance as a whole, it is exercising more situations in tcmalloc
fastpath. So it is a step forward.
2015-10-17 20:34:19 -07:00
Aliaksey Kandratsenka
347a830689 Ensure that PPROF_PATH is set for debugallocation_test
Which fixes issue #728.
2015-10-17 20:34:19 -07:00
Aliaksey Kandratsenka
a9059b7c30 prevent clang from inlining Mallocer in heap checker unittest
Looks like the existing "trick" to avoid inlining doesn't really prevent a
sufficiently smart compiler from inlining the Mallocer function. This
breaks tests, since the test relies on Mallocer having its own separate
stack frame.

Making the mallocer_addr variable volatile is seemingly enough to stop that.
2015-10-17 20:34:19 -07:00
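A minimal sketch of the volatile-pointer idea: the names Mallocer and mallocer_addr come from the message above, but their signatures and the CallMallocer wrapper are assumptions made for this illustration.

```cpp
#include <cstddef>
#include <cstdlib>

// Assumed shape of the helper that must keep its own stack frame.
static void* Mallocer(std::size_t size) { return std::malloc(size); }

// The pointer object itself is volatile, so every call has to re-read it
// from memory. The compiler cannot assume it still points at Mallocer and
// therefore has no known callee to inline at the call site.
static void* (*volatile mallocer_addr)(std::size_t) = &Mallocer;

void* CallMallocer(std::size_t size) {
  return mallocer_addr(size);  // indirect call through the volatile pointer
}
```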
Aliaksey Kandratsenka
6627f9217d drop cycleclock 2015-10-05 21:00:49 -07:00
Aliaksey Kandratsenka
f985abc296 amputate unportable and unused stuff from sysinfo
We still check the number of cpus in the system (in spinlock code), but the
old code was built under the assumption of "no calls to malloc", which is
not needed in tcmalloc. That caused it to be far more complicated than
necessary (parsing procfs files, ifdefs for different OSes and arch-es).

Also we don't need clock cycle frequency measurement.

So I've removed all the complexity of the old code and the NumCPUs function
and replaced it with GetSystemCPUsCount, which is a straightforward and
portable call to sysconf.

The cpu count function was renamed so that any further code that we might
port from Google that depends on the old semantics of NumCPUs will be
detected at compile time and has to be inspected for whether it really
needs those semantics.
2015-10-05 21:00:49 -07:00
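A sysconf-based CPU count can be as small as the sketch below. The function name is taken from the message above; the choice of `_SC_NPROCESSORS_ONLN` and the fallback to 1 are assumptions of this sketch, since the message only says the function is a call to sysconf.

```cpp
#include <unistd.h>

// Returns the number of CPUs currently online, or 1 if the system cannot
// report it. _SC_NPROCESSORS_ONLN is assumed to be available, as it is on
// Linux, the BSDs and macOS.
int GetSystemCPUsCount() {
  long n = sysconf(_SC_NPROCESSORS_ONLN);
  return n > 0 ? static_cast<int>(n) : 1;
}
```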
Aliaksey Kandratsenka
16408eb4d7 amputated wait_cycles accounting in spinlocks
This is not needed and pulls in CycleClock dependency that lowers
code portability.
2015-10-05 21:00:45 -07:00
Aliaksey Kandratsenka
fedceef40c drop cycleclock reference in ThreadCache 2015-10-05 20:58:41 -07:00
Aliaksey Kandratsenka
d7fdc3fc9d dropped unused and unsupported synchronization profiling facility
Spinlock usage of the cycle counter is due to tracking of the time spent
waiting for a lock. But this tracking is only useful if we actually have
synchronization profiling working, which we don't have. Thus I'm dropping
calls to this facility with an eye towards further removal of cycle clock
usage.
2015-10-05 20:56:28 -07:00
Aliaksey Kandratsenka
3a054d37c1 dropped unused SpinLockWait function 2015-10-05 20:56:28 -07:00
Aliaksey Kandratsenka
5b62d38329 avoid checking for dup. entries on empty backtrace
This might fix issue #721. But it is the right thing to do regardless,
since if depth is 0 we'd be reading random "garbage" on the stack.
2015-10-05 20:56:28 -07:00
Aliaksey Kandratsenka
7b9ded722e fixed compiler warning in memory_region_map.cc 2015-10-05 20:56:28 -07:00
Aliaksey Kandratsenka
4194e485cb Don't link libtcmalloc_minimal.so to libpthread.so
So that LD_PRELOAD-ing doesn't force loading libpthread.so which may
slow down some single-threaded apps.

tcmalloc already has maybe_threads facility that can detect if
libpthread.so is loaded (via weak symbols) and provide 'simulations' of
some pthread functions that tcmalloc needs.
2015-10-05 20:56:28 -07:00
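The weak-symbol detection mentioned here can be sketched roughly as follows, assuming a gcc-compatible compiler on an ELF platform. PthreadsAvailable is a hypothetical name, probing pthread_key_create is this sketch's choice, and the non-ELF branch simply assumes pthreads are present.

```cpp
#include <pthread.h>

#if defined(__GNUC__) && defined(__ELF__)
// Mark the reference as weak: if no loaded library defines the symbol,
// its address resolves to null instead of causing a link failure.
#pragma weak pthread_key_create
#endif

// True when a real pthread_key_create is present in the process.
bool PthreadsAvailable() {
#if defined(__GNUC__) && defined(__ELF__)
  return &pthread_key_create != nullptr;
#else
  return true;  // assumption for platforms without weak references
#endif
}
```

On C libraries that ship the pthread functions inside libc itself, this check is always true. The detection only matters where libpthread is a separate library.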