We had two nearly identical implementations. Thankfully, the C++ template
facility lets us produce two different runtime functions (for different
type widths) without duplicating source.
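A hedged sketch of the technique (RoundUpToPowerOfTwo is an illustrative
stand-in, not the actual function in question): one template source,
instantiated at two widths, yields two concrete runtime functions.
#include <cstdint>
// Single implementation, parameterized on the unsigned integer width.
template <typename UInt>
static UInt RoundUpToPowerOfTwo(UInt value) {
  UInt result = 1;
  while (result < value) result <<= 1;
  return result;
}
// Two concrete runtime functions, no duplicated source:
uint32_t RoundUp32(uint32_t v) { return RoundUpToPowerOfTwo(v); }
uint64_t RoundUp64(uint64_t v) { return RoundUpToPowerOfTwo(v); }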
Amend github issue #1414
It actually found a real (but arguably minor) issue with memory region
map locking.
As part of that, we're replacing PageHeap::DeleteAndUnlock, which had a
somewhat ambitious 'move' of SpinLockHolder, with the more
straightforward PageHeap::PrepareAndDelete. It doesn't look like we can
support the move approach with thread annotations.
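For illustration, a hedged sketch of the annotations limitation (the
attribute spellings are clang's; everything else is assumed): the
analysis can express "lock held on entry", but there is no attribute for
"ownership of the lock moves into this function".
#if defined(__clang__)
#define LOCKABLE __attribute__((lockable))
#define EXCLUSIVE_LOCKS_REQUIRED(x) __attribute__((exclusive_locks_required(x)))
#else
#define LOCKABLE
#define EXCLUSIVE_LOCKS_REQUIRED(x)
#endif
struct LOCKABLE SpinLock {};
SpinLock pageheap_lock;
// Annotatable: the caller must hold pageheap_lock for the whole call.
void PrepareAndDelete() EXCLUSIVE_LOCKS_REQUIRED(pageheap_lock);
// Not annotatable: no attribute describes a SpinLockHolder moved into
// the callee and released there, which is why DeleteAndUnlock had to go.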
As noted on github issue #880, the 'temporarily' variant saves us not
just the work of freeing the thread cache, but also of returning the
thread's share of the thread cache (max_size_) into the common pool. And
the latter caused trouble for the MongoDB folks who originally proposed
the 'temporarily' variant. They claim they don't use it anymore.
Thus, with no users and no clear benefit, it makes no sense for us to
keep this API. For API and ABI compatibility's sake we keep it, but it
is now identical to regular MarkThreadIdle.
Fixes issue #880
This is nearly impossible in practice, but still. We somehow missed the
invariant that DoSampledAllocation always returns the actual object; in
the condition where stacktrace_allocator failed to get us a StackTrace
object, we ended up returning the span instead of its object.
While there is still plenty of code that takes pageheap_lock outside of
the page_heap module for all kinds of reasons, at least the
bread-and-butter logic of allocating and deallocating larger chunks of
memory now handles page heap locking inside PageHeap itself. This gives
us flexibility.
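A minimal sketch of the new shape (names assumed, and std::mutex
standing in for the internal spinlock): callers no longer wrap the
common paths in their own SpinLockHolder.
#include <cstddef>
#include <mutex>
class PageHeapSketch {
 public:
  void* New(size_t pages) {
    std::lock_guard<std::mutex> h(lock_);  // locking now lives inside PageHeap
    return AllocLocked(pages);
  }
 private:
  void* AllocLocked(size_t pages) { return nullptr; }  // stub for the sketch
  std::mutex lock_;
};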
Update issue #1159
I.e. this covers the case of ARM systems that by default compile
tcmalloc for 8k logical pages (assuming 4k system pages), but can
actually run on systems with 64k pages.
Closes #1135
Original CL: https://chromiumcodereview.appspot.com/10391178
1. Enable the large object pointer offset check in release builds.
The following code will now cause a check error:
char* p = reinterpret_cast<char*>(malloc(kMaxSize + 1));
free(p + 1);
2. Remove the duplicated error reporting function "DieFromBadFreePointer";
we can use "InvalidGetAllocatedSize" instead.
Reviewed-on: https://chromium-review.googlesource.com/1184335
[alkondratenko@gmail.com: removed some unrelated formatting changes]
Signed-off-by: Aliaksey Kandratsenka <alkondratenko@gmail.com>
We previously relied on the wrong assumption that size classes larger
than the page size have addresses aligned on the page size. The new code
makes a proper check of the size class.
Also added is unit test coverage for this previously failing
condition. And we now also run the "assert-ful" unittests for the big
tcmalloc configuration too, not only for tcmalloc_minimal.
This fixes github issue #1254
This patch adds visibility into the overhead due to fragmentation for each size
class in the tcmalloc central free list, which is helpful when debugging
fragmentation issues.
Original CL:
- https://codereview.chromium.org/1410353005
Add generic.total_physical_bytes property to MallocExtension
The actual physical memory usage of tcmalloc could not previously be
obtained via GetNumericProperty. The new property accounts for
current_allocated_bytes, fragmentation and malloc metadata, and excludes
the unmapped memory regions. This helps the user understand how much
memory is actually being used for the allocations that were made.
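For example, the property can be read through the standard
MallocExtension API:
#include <gperftools/malloc_extension.h>
#include <cstdio>
int main() {
  size_t physical = 0;
  // GetNumericProperty returns false if the property name is unknown.
  if (MallocExtension::instance()->GetNumericProperty(
          "generic.total_physical_bytes", &physical)) {
    printf("tcmalloc physical bytes: %zu\n", physical);
  }
  return 0;
}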
Reviewed-on: https://chromium-review.googlesource.com/1130803
We aliased functions with different signatures, and gcc now correctly
warns about that. Originally gcc 5's same-code-merging feature caused us
to alias more than necessary, but I am not able to reproduce this
problem anymore. So we're now aliasing only compatible functions.
At first I tried to add some functions the way Chrome does in its
https://chromium.googlesource.com/chromium/src/+/master/base/allocator/allocator_shim_override_ucrt_symbols_win.h,
but it still failed. So I decided to remove all heap-related objects
from libucrt.lib to see what would happen. In the end I found that a lot
of functions in the CRT directly invoke _malloc_base instead of
malloc (and likewise for the other allocation functions), hence we need
to override them as well.
This should close issue #716.
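A hedged sketch of the extra overrides (the real patch forwards to the
tc_* entry points and covers more functions; plain malloc/free are used
here just to keep the sketch self-contained):
#include <cstddef>
#include <cstdlib>
extern "C" void* _malloc_base(size_t size) { return malloc(size); }
extern "C" void _free_base(void* ptr) { free(ptr); }
extern "C" void* _realloc_base(void* ptr, size_t size) { return realloc(ptr, size); }
extern "C" void* _calloc_base(size_t num, size_t size) { return calloc(num, size); }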
[alkondratenko@gmail.com: added reference to ticket]
Signed-off-by: Aliaksey Kandratsenka <alkondratenko@gmail.com>
A recent commit fixing int overflow for implausibly huge allocations
added a call to std::min. Notably, the first argument was the old size
divided by the unsigned long constant 4ul. And on GNU/Linux i386, size_t
is not long, so the division promoted the first argument to unsigned
long while the second argument was still size_t, i.e. just unsigned.
That caused the compilation to fail.
The fix is dropping the 'ul' suffix.
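An illustration of the failure mode (the exact expression is assumed,
not copied from the commit):
#include <algorithm>
#include <cstddef>
size_t shrink(size_t old_size) {
  // On GNU/Linux i386, size_t is unsigned int, so this fails to compile:
  // std::min cannot deduce one type from (unsigned long, unsigned int).
  //   return std::min(old_size / 4ul, old_size);
  return std::min(old_size / 4, old_size);  // fixed: both operands are size_t
}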
One of the recent commits started passing kMaxPages to printf without
using it. Thankfully, compilers gave us a warning. Apparently the
intention was to print the real value of kMaxPages, so this is what
we're doing now.
Previously, the central free list with index '0' was always unused,
since freelist index 'i' tracked spans of length 'i' and there are no
spans of length 0. This meant that there was no freelist for spans of
length 'kMaxPages'. In the default configuration, this corresponds to
1MB, which is a relatively common allocation size in a lot of
applications.
This changes the free list indexing so that index 'i' tracks spans of
length 'i + 1', meaning that free list index 0 is now used and
freelist[kMaxPages - 1] tracks allocations of kMaxPages size (1MB by
default).
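A hedged sketch of the new indexing (names assumed): slot i tracks spans
of length i + 1, so slot 0 is used and the last slot holds spans of
length kMaxPages.
#include <cassert>
#include <cstddef>
static const size_t kMaxPages = 128;
struct SpanList { /* doubly-linked list of free spans */ };
static SpanList free_lists[kMaxPages];
static SpanList* FreeListForLength(size_t n) {
  assert(n >= 1 && n <= kMaxPages);
  return &free_lists[n - 1];  // previously free_lists[n], leaving slot 0 dead
}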
This also fixes the stats output to indicate '>128' for the large spans
stats rather than the incorrect '>255', which must have referred to a
historical value of kMaxPages.
No new tests are added since this code is covered by existing tests.
We're taking advantage of the "natural" alignment of our size classes:
instead of the previous loop over size classes looking for a suitably
aligned size, we now directly compute the right size. See the
align_size_up function. And that gives us the ability to use our
existing malloc fast path to make memalign neat and fast in the most
common cases. I.e. memalign/aligned_alloc now only tail-calls, and thus
avoids the expensive prologue/epilogue and is almost as fast as regular
malloc.
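The core computation is the usual power-of-two round-up; a sketch of the
idea (the real align_size_up lives in tcmalloc's internals and carries
more checks):
#include <cstddef>
static inline size_t align_size_up(size_t size, size_t align) {
  // align is assumed to be a power of two.
  return (size + align - 1) & ~(align - 1);
}
// e.g. align_size_up(24, 16) == 32, which maps to a 16-byte-aligned
// size class, so the regular fast path can serve memalign(16, 24).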
Because somehow clang still builds "this function will not throw" code
even with noexcept, which breaks the performance of
tc_malloc/tc_new_nothrow. The difference from throw() seems to be just
which function is called when an unexpected exception happens.
So we work around this silliness by simply dropping any exception
specification when compiling tcmalloc.
The previous fast-path malloc implementation failed to arrange proper
oom handling for operator new. I.e. operator new is supposed to call the
new handler and throw an exception, which was not arranged in the
fast-path case.
The fixed code now passes a pointer to the oom function to
ThreadCache::FetchFromCentralCache, which will call it in the oom
condition. A test is added to verify the correct behavior.
I've also updated some fast-path-related comments for more accuracy.
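A hedged sketch of the shape of this (names and signatures assumed;
malloc stands in for the central-cache refill so the sketch is
self-contained):
#include <cstddef>
#include <cstdlib>
#include <new>
static void* malloc_oom(size_t) { return NULL; }  // malloc: just report failure
static void* new_oom(size_t size) {               // new: run handler or throw
  std::new_handler h = std::set_new_handler(0);
  std::set_new_handler(h);  // peek at the currently installed handler
  if (h == 0) throw std::bad_alloc();
  h();                      // may free memory, or throw on its own
  return malloc(size);      // retry once in this sketch
}
static void* allocate(size_t size, void* (*oom)(size_t)) {
  void* p = malloc(size);
  return p != 0 ? p : oom(size);
}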
Without aliasing, performance is likely to be at least partially
affected. There is still concern that aliasing between functions of
different signatures is not 100% safe. We now explicitly list the
architectures where aliasing is known to be safe.
- Add auto-detection of std::align_val_t presence to configure scripts. This
indicates that the compiler supports C++17 operator new/delete overloads
for overaligned types.
- Add auto-detection of -faligned-new compiler option that appeared in gcc 7.
The option allows the compiler to generate calls to the new operators. It is
needed for tests.
- Added overrides for the new operators. The overrides are enabled if
support for std::align_val_t has been detected. The implementation is
mostly based on the infrastructure used by memalign, which had to be
extended to support being used by C++ operators in addition to C
functions. In particular, the debug version of the library has to
distinguish memory allocated by memalign from that allocated by operator
new. The current implementation of the sized overaligned delete
operators does not make use of the supplied size argument except in the
debug allocator, because it is difficult to calculate the exact
allocation size that was used to allocate memory with alignment. This
can be done in the future. (See the usage sketch after this list.)
- Removed forward declaration of std::nothrow_t. This was not portable as
the standard library is not required to provide nothrow_t directly in
namespace std (it could use e.g. an inline namespace within std). The <new>
header needs to be included for std::align_val_t anyway.
- Fixed operator delete[] implementation in libc_override_redefine.h.
- Moved TC_ALIAS definition to the beginning of the file in tcmalloc.cc so that
the macro is defined before its first use in nallocx.
- Added tests to verify the added operators.
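For reference, a small usage example of what the new overloads enable
(requires C++17, or gcc 7's -faligned-new):
#include <new>
struct alignas(64) CacheLine { char bytes[64]; };
int main() {
  CacheLine* p = new CacheLine;  // routes to operator new(size_t, std::align_val_t)
  delete p;                      // routes to the matching aligned operator delete
  return 0;
}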
[alkondratenko@gmail.com: fixed a couple of minor warnings and some
whitespace changes]
[alkondratenko@gmail.com: removed addition of TC_ALIAS in debug allocator]
Signed-off-by: Aliaksey Kandratsenka <alkondratenko@gmail.com>
Apparently gcc supports __attribute__((aligned(N))) on functions only
since version 4.3. So let's test for it in the configure script and only
use it when possible. We now use the CACHELINE_ALIGNED_FN macro for
aligning functions.
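A sketch of the macro's shape (HAVE_FN_ALIGNED is a made-up stand-in for
whatever symbol the configure check actually defines):
#ifdef HAVE_FN_ALIGNED
#define CACHELINE_ALIGNED_FN __attribute__((aligned(64)))
#else
#define CACHELINE_ALIGNED_FN  // old gcc: no function alignment
#endif
CACHELINE_ALIGNED_FN void hot_path_function() { /* ... */ }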
This was caught by unit tests on CentOS 5. Apparently some early startup
code tries to do vprintf, which calls free(0). That used to crash,
because before the size class cache is initialized it reports a
hit (with size class 0) for the NULL pointer, so we'd miss the
NULL-pointer check in free and crash.
The fix is to check IsInited in the case when the thread cache is null,
and if initialization hasn't happened yet, escalate to
free_null_or_invalid.
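A rough sketch of the fix's shape (names are from the message above, but
the structure and stubs are assumed):
#include <cstddef>
static bool inited = false;                                 // stands in for IsInited()
static void free_null_or_invalid(void* ptr) { (void)ptr; }  // careful slow path
static void do_free_fast(void* ptr) { (void)ptr; }          // normal fast path
static void free_sketch(void* ptr) {
  if (!inited) {
    // Early free(NULL): the zero-filled size-class cache would otherwise
    // report a bogus hit with size class 0, skipping the NULL check.
    free_null_or_invalid(ptr);
    return;
  }
  do_free_fast(ptr);
}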
With the TCMALLOC_TRANSFER_NUM_OBJ environment variable we can change
the transfer batch size. And with that comes a slightly different number
of size classes, depending on the value of the transfer batch size.
We used to have a hardcoded number of size classes, so we couldn't
really support arbitrary batch size settings.
This commit adds support for a dynamic number of size classes (a runtime
value returned by Static::num_size_classes()).
This is a significant speedup of the malloc fast path. A large part
comes from avoiding the expensive function prologue/epilogue, which is
achieved by making sure that tc_{malloc,new,free} etc. are small
functions that do only tail calls. We keep only the critical path in
those functions and tail-call to slower "full" versions when we need to
deal with less common cases. This helps the compiler generate much
tidier code.
The fast-path readiness check is now different too. We used to have a
"min size for slow path" variable, which was set to a non-zero value
once we knew the thread cache was present and ready. We now use a
non-NULL thread-cache pointer as the readiness check.
There is a special ThreadCache::threadlocal_data_.fast_path_heap copy of
that pointer that can be temporarily nulled to disable the malloc fast
path. This is used to enable emergency malloc.
There is also a slight change to tracking the thread cache size. Instead
of tracking the total size of the free list, it now tracks the size
headroom. This allows for a slightly faster deallocation fast-path
check, where we check that the headroom stays above zero. This check is
a bit faster than comparing with max_size_.
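A hedged sketch of the headroom idea (member names assumed, not the
actual ThreadCache code):
#include <cstddef>
class ThreadCacheSketch {
 public:
  // Deallocation fast path: consume headroom and test the sign, which
  // is cheaper than comparing a running total against max_size_.
  bool AddToCacheFastPath(size_t object_size) {
    size_left_ -= (long)object_size;
    return size_left_ > 0;  // headroom gone -> take the slow path
  }
 private:
  long size_left_;  // max_size_ minus bytes currently cached
};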
The lower bits of the page index are still used as the index into the
hash table. Those lower bits are zeroed, or-ed with the size class and
placed into the hash table. So checking is just loading the value from
the hash table, xoring it with the higher bits of the address, and
checking whether the resulting value is lower than 128. Notably, size
class 0 is not considered "invalid" anymore.
Currently on Windows we depend on uninitialized tcmalloc variables to
detect freeing of a foreign malloc's chunks. This works somewhat by
chance, due to the 0-initialized size class cache acting as a cache with
no values. But this is about to change, so let's do explicit
initialization.
Apparently throw() on functions actually asks the compiler to generate
code to detect unexpected exceptions, which prevents tail call
optimization. So in order to re-enable this optimization, we simply
don't tell the compiler about throw() at all. C++11 noexcept would be
even better, but it is not universally available yet.
So we change to no exception specifications, which at least for gcc &
clang on Linux (and likely for all ELF platforms, if not all platforms)
really eliminates all overhead of exceptions.