RepoMirrors/musl - musl

Commit Graph

Author	SHA1	Message	Date
Rich Felker	e617b9eea9	move arm-specific translation units out of arch/arm/src, to src/*/arm this is possible with the new build system that allows src/*/$(ARCH)/* files which do not shadow a file in the parent directory, and yields a more logical organization. eventually it will be possible to remove arch/*/src from the build system.	2016-01-22 00:02:21 +00:00
Rich Felker	397f0a6a7d	overhaul arm atomics for new atomics framework switch to ll/sc model so that new atomic.h can provide optimized versions of all the atomic primitives without needing an ll/sc loop written in asm for each one. all isa levels which use ldrex/strex now use the inline ll/sc model even if the type of barrier to use is not known until runtime (v6). the cas model is only used for arm v5 and earlier, and it has been optimized to make the call via inline asm with custom constraints rather than as a C function call.	2016-01-21 23:30:30 +00:00
Rich Felker	aa0db4b5d0	overhaul aarch64 atomics for new atomics framework	2016-01-21 19:50:55 +00:00
Rich Felker	61b1e75f7d	overhaul sh atomics for new atomics framework, add j-core cas.l backend sh needs runtime-selected atomic backends since there are a number of supported models that use non-forwards-compatible (non-smp-compatible) atomic mechanisms. previously, the code paths for this were highly inefficient since they involved C function calls with multiple branches in the callee and heavy spills in the caller. the new code calls the runtime-selected asm fragment from inline asm with extremely minimal clobbers, rather than using a function call. for the sh4a case where the atomic mechanism is known and there is no forward-compatibility issue, the movli.l and movco.l instructions are provided as a_ll and a_sc, allowing the new shared atomic.h to generate efficient inline versions of all the basic atomic operations without needing a cas loop.	2016-01-21 19:43:04 +00:00
Rich Felker	1315596b51	refactor internal atomic.h rather than having each arch provide its own atomic.h, there is a new shared atomic.h in src/internal which pulls arch-specific definitions from arch/$(ARCH)/atomic_arch.h. the latter can be extremely minimal, defining only a_cas or new ll/sc type primitives which the shared atomic.h will use to construct everything else. this commit avoids making heavy changes to the individual archs' atomic implementations. definitions which are identical or near-identical to what the new shared atomic.h would produce have been removed, but otherwise the changes made are just hooking up the arch-specific files to the new infrastructure. major changes to take advantage of the new system will come in subsequent commits.	2016-01-21 19:08:54 +00:00
Rich Felker	b6363bb70a	fix build regression for arm pre-v7 from out-of-tree build patch commit `2f853dd6b9` failed to replicate the old makefile logic that caused arch/arm/src/arm/atomics.s to be built. since this was the only .s file under arch/*/src, rather than trying to reproduce the old logic, I'm just moving it up a level and adjusting the glob pattern in the makefile to catch it. eventually arch/*/src will probably be removed in favor of moving all these files to appropriate src/*/$(ARCH) locations.	2016-01-20 02:31:06 +00:00
Rich Felker	56764601af	fix dynamic linker path file selection for arm vs armhf the __SOFTFP__ macro which was wrongly being used does not reflect the ABI (arm vs armhf) but just the availability of floating point instructions/registers, so -mfloat-abi=softfp was wrongly being treated as armhf. __ARM_PCS_VFP is the correct predefined macro to check for the armhf EABI variant. this macro usage was corrected for the build process in commit `4918c2bb20` but reloc.h was apparently overlooked at the time.	2016-01-20 01:16:09 +00:00
Rich Felker	5e396fb996	adjust mips crt_arch entry point asm to avoid assembler bugs apparently the .gpword directive does not work reliably with local text labels; values produced were offset by 64k from the correct value, resulting in incorrect computation of the got pointer at runtime. instead, use an external label so that the assembler does not munge the relocation; the linker will then get it right. commit `6fef8cafbd` exposed this issue by removing the old, non-PIE-compatible handwritten crt1.s, which was not affected. presumably mips PIE executables (using Scrt1.o produced from crt_arch.h) were already affected at the time.	2015-12-29 13:01:29 -05:00
Rich Felker	71991a803c	adjust i386 max_align_t definition to work around some broken compilers at least gcc 4.7 claims c++11 support but does not accept the alignas keyword, causing breakage when stddef.h is included in c++11 mode. instead, prefer using __attribute__((__aligned__)) on any compiler with GNU extensions, and only use the alignas keyword as a fallback for other C++ compilers. C code should not be affected by this patch.	2015-12-29 12:46:15 -05:00
Rich Felker	0d58bf2d60	remove visibility suppression by SHARED macro in mips and x32 arch files commit `8a8fdf6398` was intended to remove all such usage, but these arch-specific files were overlooked, leading to inconsistent declarations and definitions.	2015-12-15 23:18:38 -05:00
Rich Felker	9439ebd766	fix dynamic loader library mapping for nommu systems on linux/nommu, non-writable private mappings of files may actually use memory shared with other processes or the fs cache. the old nommu loader code (used when mmap with MAP_FIXED fails) simply wrote over top of the original file mapping, possibly clobbering this shared memory. no such breakage was observed in practice, but it should have been possible. the new code starts by mapping anonymous writable memory on archs that might support nommu, then maps load segments over top of it, falling back to read if MAP_FIXED fails. we use an anonymous map rather than a writable file map to avoid reading more data from disk than needed. since pages cannot be loaded lazily on fault, in case of large data/bss, mapping the full file may read a lot of data that will subsequently be thrown away when processing additional LOAD segments. as a result, we cannot skip the first LOAD segment when operating in this mode. these changes affect only non-FDPIC nommu support.	2015-11-11 17:40:27 -05:00
Rich Felker	4e73d12117	explicitly assemble all arm asm sources as UAL these files are all accepted as legacy arm syntax when producing arm code, but legacy syntax cannot be used for producing thumb2 with access to the full ISA. even after switching to UAL, some asm source files contain instructions which are not valid in thumb mode, so these will need to be addressed separately.	2015-11-10 00:01:55 -05:00
Rich Felker	9f290a49bf	remove non-working pre-armv4t support from arm asm the idea of the three-instruction sequence being removed was to be able to return to thumb code when used on armv4t+ from a thumb caller, but also to be able to run on armv4 without the bx instruction available (in which case the low bit of lr would always be 0). however, without compiler support for generating such a sequence from C code, which does not exist and which there is unlikely to be interest in implementing, there is little point in having it in the asm, and it would likely be easier to add pre-armv4t support via enhanced linker handling of R_ARM_V4BX than at the compiler level. removing this code simplifies adding support for building libc in thumb2-only form (for cortex-m).	2015-11-09 22:36:38 -05:00
Rich Felker	4fcb48275a	generalize sh entry point asm not to assume call dests fit in 12 bits this assumption is borderline-unsafe to begin with, and fails badly with -ffunction-sections since the linker can move the callee arbitrarily far away when it lies in a different section.	2015-11-02 18:11:36 -05:00
Rich Felker	cb1bf2f321	properly access mcontext_t program counter in cancellation handler using the actual mcontext_t definition rather than an overlaid pointer array both improves correctness/readability and eliminates some ugly hacks for archs with 64-bit registers but 32-bit program counter. also fix UB due to comparison of pointers not in a common array object.	2015-11-02 12:41:49 -05:00
Rich Felker	92637bb0d8	prevent reordering of or1k and powerpc thread pointer loads other archs use asm for the thread pointer load, so making that asm volatile is sufficient to inform the compiler that it has a "side effect" (crashing or giving the wrong result if the thread pointer was not yet initialized) that prevents reordering. however, powerpc and or1k have dedicated general purpose registers for the thread pointer and did not need to use any asm to access it; instead, "local register variables with a specified register" were used. however, there is no specification for ordering constraints on this type of usage, and presumably use of the thread pointer could be reordered across its initialization. to impose an ordering, I have added empty volatile asm blocks that produce the "local register variable with a specified register" as an output constraint.	2015-10-15 12:08:51 -04:00
Rich Felker	74483c5955	mark arm thread-pointer-loading inline asm as volatile this builds on commits `a603a75a72` and `0ba35d69c0` to ensure that a compiler cannot conclude that it's valid to reorder the asm to a point before the thread pointer is set up, or to treat the inline function as if it were declared with attribute((const)). other archs already use volatile asm for thread pointer loading.	2015-10-15 12:04:48 -04:00
Rich Felker	11da520c7a	add comment documenting hard-coded opcode for reading mips thread pointer	2015-10-15 00:55:41 -04:00
Rich Felker	0ba35d69c0	remove attribute((const)) from arm __pthread_self inline function commit `a603a75a72` did this for the public pthread_self function but not the internal inline one.	2015-10-15 00:20:50 -04:00
Rich Felker	b61df2294f	fix signal return for sh/fdpic the restorer function pointer provided in the kernel sigaction structure is interpreted by the kernel as a raw code address, not a function descriptor. this commit moves the declarations of the __restore and __restore_rt symbols to ksigaction.h so that arch versions of the file can override them, and introduces a version for sh which declares them as objects rather than functions. an alternate solution would have been defining SA_RESTORER to 0 so that the functions are not used, but this both requires executable stack (since the sh kernel does not have a vdso page with permanent restorer functions) and crashes on qemu user-level emulation.	2015-09-23 18:33:49 +00:00
Rich Felker	e9e770dfd6	have sh/fdpic entry point set fdpic personality if needed the entry point code supports being loaded by a loader which is not fdpic-aware (in practice, either kernel with mmu or qemu without fdpic support). this mostly just works, but signal handling will wrongly use a function descriptor address as a code address if the personality is not adjusted to fdpic. ideally this code could be placed with sigaction so that it's not needed except if/when a signal handler is installed. however, personality is incorrectly maintained per-thread by the kernel, rather than per-process, so it's necessary to correct the personality before any threads are started. also, in order to skip the personality syscall when an fdpic-aware loader is used, we need to be able to detect how the program was loaded, and this information is only readily available at the entry point.	2015-09-22 20:51:59 +00:00
Rich Felker	eaf7ab6e24	add real fdpic loading of shared libraries previously, the normal ELF library loading code was used even for fdpic, so only the kernel-loaded dynamic linker and main app could benefit from separate placement of segments and shared text.	2015-09-22 19:12:48 +00:00
Rich Felker	7f9086df95	size-optimize sh/fdpic dynamic entry point the __fdpic_fixup code is not needed for ET_DYN executables, which instead use relocations, so we can omit it from the dynamic linker and static-pie entry point and save some code size.	2015-09-22 04:14:07 +00:00
Rich Felker	cab2b1f9d7	work around breakage in sh/fdpic __unmapself function the C implementation of __unmapself used for potentially-nommu sh assumed CRTJMP takes a function descriptor rather than a code address; however, the actual dynamic linker needs a code address, and so commit `7a9669e977` changed the definition of the macro in reloc.h. this commit puts the old macro back in a place where it only affects __unmapself. this is an ugly workaround and should be cleaned up at some point, but at least it's well isolated.	2015-09-22 04:10:42 +00:00
Rich Felker	7a9669e977	add general fdpic support in dynamic linker and arch support for sh at this point not all functionality is complete. the dynamic linker itself, and main app if it is also loaded by the kernel, take advantage of fdpic and do not need constant displacement between segments, but additional libraries loaded by the dynamic linker follow normal ELF semantics for mapping still. this fully works, but does not admit shared text on nommu. in terms of actual functional correctness, dlsym's results are presently incorrect for function symbols, RTLD_NEXT fails to identify the caller correctly, and dladdr fails almost entirely. with the dynamic linker entry point working, support for static pie is automatically included, but linking the main application as ET_DYN (pie) probably does not make sense for fdpic anyway. ET_EXEC is equally relocatable but more efficient at representing relocations.	2015-09-22 03:54:42 +00:00
Rich Felker	12b0b7d8ea	new dlstart stage-2 chaining for x86_64 and x32	2015-09-17 07:28:44 +00:00
Rich Felker	c16182680c	new dlstart stage-2 chaining for powerpc	2015-09-17 07:20:58 +00:00
Rich Felker	4761e63bc4	new dlstart stage-2 chaining for or1k	2015-09-17 07:20:51 +00:00
Rich Felker	cd7159e7be	new dlstart stage-2 chaining for mips	2015-09-17 07:20:43 +00:00
Rich Felker	57e2dce7e4	new dlstart stage-2 chaining for microblaze	2015-09-17 07:20:36 +00:00
Rich Felker	2907afb8db	introduce new symbol-lookup-free rcrt1/dlstart stage chaining previously, the call into stage 2 was made by looking up the symbol name "__dls2" (which was chosen short to be easy to look up) from the dynamic symbol table. this was no problem for the dynamic linker, since it always exports all its symbols. in the case of the static pie entry point, however, the dynamic symbol table does not contain the necessary symbol unless -rdynamic/-E was used when linking. this linking requirement is a major obstacle both to practical use of static-pie as a nommu binary format (since it greatly enlarges the file) and to upstream toolchain support for static-pie (adding -E to default linking specs is not reasonable). this patch replaces the runtime symbolic lookup with a link-time lookup via an inline asm fragment, which reloc.h is responsible for providing. in this initial commit, the asm is provided only for i386, and the old lookup code is left in place as a fallback for archs that have not yet transitioned. modifying crt_arch.h to pass the stage-2 function pointer as an argument was considered as an alternative, but such an approach would not be compatible with fdpic, where it's impossible to compute function pointers without already having performed relocations. it was also deemed desirable to keep crt_arch.h as simple/minimal as possible. in principle, archs with pc-relative or got-relative addressing of static variables could instead load the stage-2 function pointer from a static volatile object. that does not work for fdpic, and is not safe against reordering on mips-like archs that use got slots even for static functions, but it's valid on i386 and many others, and could provide a reasonable default implementation in the future.	2015-09-17 06:30:55 +00:00
Felix Janda	64b6684ddd	reindent powerpc's bits/termios.h to be consistent with other archs	2015-09-15 14:30:08 -04:00
Felix Janda	b291e7ca9b	fix namespace violations in aarch64/bits/termios.h in analogy with commit `a627eb3586`	2015-09-15 14:28:07 -04:00
Rich Felker	d4c82d05b8	add sh fdpic subarch variants with this commit it should be possible to produce a working static-linked fdpic libc and application binaries for sh. the changes in reloc.h are largely unused at this point since dynamic linking is not supported, but the CRTJMP macro is used one place outside of dynamic linking, in __unmapself.	2015-09-12 03:23:49 +00:00
Rich Felker	4ccc1a01e0	add fdpic version of entry point code for sh this version of the entry point is only suitable for static linking in ET_EXEC form. neither dynamic linking nor pie is supported yet. at some point in the future the fdpic and non-fdpic versions of this code may be unified but for now it's easiest to work with them separately.	2015-09-12 03:18:08 +00:00
Rich Felker	234c58467c	make sh clone asm fdpic-compatible clone calls back to a function pointer provided by the caller, which will actually be a pointer to a function descriptor on fdpic. the obvious solution is to have a separate version of clone for fdpic, but I have taken a simpler approach to go around the problem. instead of calling the pointed-to function from asm, a direct call is made to an internal C function which then calls the pointed-to function. this lets the C compiler generate the appropriate calling convention for an indirect call with no need for ABI-specific assembly.	2015-09-12 02:55:28 +00:00
Rich Felker	878887c50c	fix missing earlyclobber flag in i386 a_ctz_64 asm this error was only found by reading the code, but it seems to have been causing gcc to produce wrong code in malloc: the same register was used for the output and the high word of the input. in principle this could have caused an infinite loop searching for an available bin, but in practice most x86 models seem to implement the "undefined" result of the bsf instruction as "unchanged".	2015-09-09 07:18:28 +00:00
Timo Teräs	d8be1bc019	implement arm eabi mem* functions these functions are part of the ARM EABI, meaning compilers may generate references to them. known versions of gcc do not use them, but llvm does. they are not provided by libgcc, and the de facto standard seems to be that libc provides them.	2015-08-31 06:35:01 +00:00
Rich Felker	5a9c8c05a5	mitigate performance regression in libc-internal locks on x86_64 commit `3c43c0761e` fixed missing synchronization in the atomic store operation for i386 and x86_64, but opted to use mfence for the barrier on x86_64 where it's always available. however, in practice mfence is significantly slower than the barrier approach used on i386 (a nop-like lock orl operation). this commit changes x86_64 (and x32) to use the faster barrier.	2015-08-16 18:15:18 +00:00
Szabolcs Nagy	e5b086e1d5	aarch64: fix 64-bit syscall argument passing On 32bit systems long long arguments are passed in a special way to some syscalls; this accidentally got copied to the AArch64 port. The following interfaces were broken: fallocate, fanotify, ftruncate, posix_fadvise, posix_fallocate, pread, pwrite, readahead, sync_file_range, truncate.	2015-08-11 23:11:57 +00:00
Rich Felker	3c43c0761e	fix missing synchronization in atomic store on i386 and x86_64 despite being strongly ordered, the x86 memory model does not preclude reordering of loads across earlier stores. while a plain store suffices as a release barrier, we actually need a full barrier, since users of a_store subsequently load a waiter count to determine whether to issue a futex wait, and using a stale count will result in soft (fail-to-wake) deadlocks. these deadlocks were observed in malloc and possible with stdio locks and other libc-internal locking. on i386, an atomic operation on the caller's stack is used as the barrier rather than performing the store itself using xchg; this avoids the need to read the cache line on which the store is being performed. mfence is used on x86_64 where it's always available, and could be used on i386 with the appropriate cpu model checks if it's shown to perform better.	2015-07-28 18:40:18 +00:00
Roman Yeryomin	3975577922	socket.h: cleanup/reorder mips and powerpc bits/socket.h ....to be somewhat consistent and easily comparable with asm/socket.h Signed-off-by: Roman Yeryomin <roman@ubnt.com>	2015-07-21 19:14:58 -04:00
Roman Yeryomin	29ec7677a7	socket.h: fix SO_* for mips Signed-off-by: Roman Yeryomin <roman@ubnt.com>	2015-07-21 19:14:26 -04:00
Felix Fietkau	3fffa7a658	mips: fix mcontext_t register array field name glibc and uclibc use gregs instead of regs Signed-off-by: Felix Fietkau <nbd@openwrt.org>	2015-07-21 19:02:31 -04:00
Rich Felker	6ba5517a46	fix local-dynamic model TLS on mips and powerpc the TLS ABI spec for mips, powerpc, and some other (presently unsupported) RISC archs has the return value of __tls_get_addr offset by +0x8000 and the result of DTPOFF relocations offset by -0x8000. I had previously assumed this part of the ABI was actually just an implementation detail, since the adjustments cancel out. however, when the local dynamic model is used for accessing TLS that's known to be in the same DSO, either of the following may happen: 1. the -0x8000 offset may already be applied to the argument structure passed to __tls_get_addr at ld time, without any opportunity for runtime relocations. 2. __tls_get_addr may be used with a zero offset argument to obtain a base address for the module's TLS, to which the caller then applies immediate offsets for individual objects accessed using the local dynamic model. since the immediate offsets have the -0x8000 adjustment applied to them, the base address they use needs to include the +0x8000 offset. it would be possible, but more complex, to store the pointers in the dtv[] array with the +0x8000 offset pre-applied, to avoid the runtime cost of adding 0x8000 on each call to __tls_get_addr. this change could be made later if measurements show that it would help.	2015-06-25 22:22:00 +00:00
Rich Felker	10d0268ccf	switch to using trap number 31 for syscalls on sh nominally the low bits of the trap number on sh are the number of syscall arguments, but they have never been used by the kernel, and some code making syscalls does not even know the number of arguments and needs to pass an arbitrary high number anyway. sh3/sh4 traditionally used the trap range 16-31 for syscalls, but part of this range overlapped with hardware exceptions/interrupts on sh2 hardware, so an incompatible range 32-47 was chosen for sh2. using trap number 31 everywhere, since it's in the existing sh3/sh4 range and does not conflict with sh2 hardware, is a proposed unification of the kernel syscall convention that will allow binaries to be shared between sh2 and sh3/sh4. if this is not accepted into the kernel, we can refit the sh2 target with runtime selection mechanisms for the trap number, but doing so would be invasive and would entail non-trivial overhead.	2015-06-16 15:25:02 +00:00
Rich Felker	3366a99b17	switch sh port's __unmapself to generic version when running on sh2/nommu due to the way the interrupt and syscall trap mechanism works, userspace on sh2 must never set the stack pointer to an invalid value. thus, the approach used on most archs, where __unmapself executes with no stack for the interval between SYS_munmap and SYS_exit, is not viable on sh2. in order not to pessimize sh3/sh4, the sh asm version of __unmapself is not removed. instead it's renamed and redirected through code that calls either the generic (safe) __unmapself or the sh3/sh4 asm, depending on compile-time and run-time conditions.	2015-06-16 14:55:06 +00:00
Rich Felker	f9d84554ba	add support for sh2 interrupt-masking-based atomics to sh port the sh2 target is being considered an ISA subset of sh3/sh4, in the sense that binaries built for sh2 are intended to be usable on later cpu models/kernels with mmu support. so rather than hard-coding sh2-specific atomics, the runtime atomic selection mechanism that was already in place has been extended to add sh2 atomics. at this time, the sh2 atomics are not SMP-compatible; since the ISA lacks actual atomic operations, the new code instead masks interrupts for the duration of the atomic operation, producing an atomic result on single-core. this is only possible because the kernel/hardware does not impose protections against userspace doing so. additional changes will be needed to support future SMP systems. care has been taken to avoid producing significant additional code size in the case where it's known at compile-time that the target is not sh2 and does not need sh2-specific code.	2015-06-16 14:38:41 +00:00
Szabolcs Nagy	ee59c296d5	arm: add vdso support vdso will be available on arm in linux v4.2, the user-space code for it is in kernel commit 8512287a8165592466cb9cb347ba94892e9c56a5	2015-06-14 04:23:20 +00:00
Rich Felker	9f26ebded1	fix stack alignment code in mips crt_arch.h the instruction used to align the stack, "and $sp, $sp, -8", does not actually exist; it's expanded to 2 instructions using the 'at' (assembler temporary) register, and thus cannot be used in a branch delay slot. since alignment mod 16 commutes with subtracting 8, simply swapping these two operations fixes the problem. crt1.o was not affected because it's still being generated from a dedicated asm source file. dlstart.lo was not affected because the stack pointer it receives is already aligned by the kernel. but Scrt1.o was affected in cases where the dynamic linker gave it a misaligned stack pointer.	2015-05-24 23:03:47 -04:00
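
Commit 1315596b51 in the list above describes a shared atomic.h that builds every atomic operation from a minimal arch-provided primitive such as a_cas. The following is a rough sketch of that idea, not the code from the commit: the a_cas body here uses the GCC/Clang `__sync_val_compare_and_swap` builtin purely as a portable stand-in for an arch-specific definition, and the derived helpers (a_swap, a_fetch_add, a_inc, a_dec) are illustrative reconstructions whose exact names and signatures in musl are assumed rather than confirmed by the log.

```c
/* Stand-in for an arch's atomic_arch.h: only a_cas is provided.
 * ASSUMPTION: the compiler builtin replaces the real per-arch asm. */
static inline int a_cas(volatile int *p, int t, int s)
{
	/* Stores s if *p == t; always returns the value *p held before. */
	return __sync_val_compare_and_swap(p, t, s);
}

/* What a shared header could derive from a_cas alone (illustrative). */
static inline int a_swap(volatile int *p, int v)
{
	int old;
	do old = *p;
	while (a_cas(p, old, v) != old); /* retry if another writer got in */
	return old;
}

static inline int a_fetch_add(volatile int *p, int v)
{
	int old;
	/* Add in unsigned arithmetic so overflow wraps instead of being UB. */
	do old = *p;
	while (a_cas(p, old, (int)((unsigned)old + (unsigned)v)) != old);
	return old;
}

static inline void a_inc(volatile int *p) { a_fetch_add(p, 1); }
static inline void a_dec(volatile int *p) { a_fetch_add(p, -1); }
```

With this layering, an arch that can only offer compare-and-swap still gets the full set of operations, while an arch that offers ll/sc primitives could have the same helpers generated without the extra retry around a CAS, as the later arm, aarch64 and sh commits describe.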

424 Commits
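
Commit 6ba5517a46 in the list above explains why the +0x8000 bias on the return value of __tls_get_addr cannot be treated as an implementation detail on mips and powerpc. The arithmetic can be sketched as below. Everything here is hypothetical scaffolding for illustration: the names DTP_OFFSET, struct tls_index, dtv, tls_get_addr and ld_access, and the use of plain integers for addresses, are assumptions and do not reproduce musl's definitions.

```c
#include <stddef.h>
#include <stdint.h>

#define DTP_OFFSET 0x8000 /* bias from the commit message; macro name is illustrative */

/* Hypothetical argument layout: module index plus offset within its TLS. */
struct tls_index { size_t module; size_t offset; };

/* Hypothetical table of unbiased per-module TLS base addresses. */
static uintptr_t dtv[2];

/* Adds the +0x8000 bias on every call, as the ABI expects. */
static uintptr_t tls_get_addr(const struct tls_index *ti)
{
	return dtv[ti->module] + ti->offset + DTP_OFFSET;
}

/* Case 2 from the message: fetch the module base with a zero offset,
 * then apply an immediate that already carries the -0x8000 adjustment. */
static uintptr_t ld_access(size_t module, size_t true_offset)
{
	struct tls_index base_arg = { module, 0 };
	uintptr_t base = tls_get_addr(&base_arg);
	/* Unsigned wraparound makes the two biases cancel exactly. */
	return base + (true_offset - DTP_OFFSET);
}
```

If tls_get_addr omitted the bias, ld_access would land 0x8000 bytes below the intended object, which is the failure the commit fixes. Case 1 from the message, where the -0x8000 is baked into the argument structure at link time, resolves to the same address through the same function.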