Commit Graph

70 Commits

Author SHA1 Message Date
Szabolcs Nagy
06fbefd100 add a_clz_64 helper function
counts leading zero bits of a 64bit int, undefined on zero input.
(has nothing to do with atomics, added to atomic.h so target specific
helper functions are together.)

there is a logarithmic generic implementation and another in terms of
a 32bit a_clz_32 on targets where that's available.
2017-08-29 21:47:10 -04:00
rofl0r
1f53e7d00c fix crashes in x32 __tls_get_addr
x32 has another gratuitous difference to all other archs:
it passes an array of 64bit values to __tls_get_addr().
usually it is an array of size_t.
2017-01-13 10:47:08 +00:00
Rich Felker
150747b41e reduce impact of REG_* namespace pollution in x86[_64] signal.h
when _GNU_SOURCE is defined, which is always the case when compiling
c++ with gcc, these macros for the the indices in gregset_t are
exposed and likely to clash with applications. by using enum constants
rather than macros defined with integer literals, we can make the
clash slightly less likely to break software. the macros are still
defined in case anything checks for them with #ifdef, but they're
defined to expand to themselves so that non-file-scope (e.g.
namespaced) identifiers by the same names still work.

for the sake of avoiding mistakes, the changes were generated with sed
via the command:

sed -i -e 's/#define  *\(REG_[A-Z_0-9]\{1,\}\)  *\([0-9]\{1,\}\)'\
'/enum { \1 = \2 };\n#define \1 \1/' \
arch/i386/bits/signal.h arch/x86_64/bits/signal.h arch/x32/bits/signal.h
2017-01-04 17:08:19 -05:00
Szabolcs Nagy
62eaf40bf4 add pkey_{mprotect,alloc,free} syscalls from linux v4.9
see linux commit e8c24d3a23a469f1f40d4de24d872ca7023ced0a
and linux Documentation/x86/protection-keys.txt
2016-12-29 22:10:19 -05:00
Rich Felker
54991729fd work around gdb issues recognizing sigreturn trampoline on x86_64
gdb can only backtrace/unwind across signal handlers if it recognizes
the sa_restorer trampoline. for x86_64, gdb first attempts to
determine the symbol name for the function in which the program
counter resides and match it against "__restore_rt". if no name can be
found (e.g. in the case of a stripped binary), the exact instruction
sequence is matched instead.

when matching the function name, however, gdb's unwind code wrongly
considers the interval [sym,sym+size] rather than [sym,sym+size).
thus, if __restore_rt begins immediately after another function, gdb
wrongly identifies pc as lying within the previous adjacent function.
this patch adds a nop before __restore_rt to preclude that
possibility. it also removes the symbol name __restore and replaces it
with a macro since the stability of whether gdb identifies the
function as __restore_rt or __restore is not clear.

for the no-symbols case, the instruction sequence is changed to use
%rax rather than %eax to match what gdb expects.

based on patch by Szabolcs Nagy, with extended description and
corresponding x32 changes added.
2016-11-12 19:54:43 -05:00
Szabolcs Nagy
2ed811a38a fix preadv2 and pwritev2 syscall numbers on x32 for linux v4.8
the numbers were wrong in musl, but they were also wrong in the kernel
and got fixed in v4.8 commit 3ebfd81f7fb3e81a754e37283b7f38c62244641a
2016-10-20 01:27:07 -04:00
Rich Felker
ee3f0c5516 make brace placement in public header typedef'd structs consistent
commit befa5866ee performed this change
for struct definitions that did not also involve typedef, but omitted
the latter.
2016-07-03 16:19:28 -04:00
Rich Felker
befa5866ee make brace placement in public header struct definitions consistent
placing the opening brace on the same line as the struct keyword/tag
is the style I prefer and seems to be the prevailing practice in more
recent additions.

these changes were generated by the command:

find include/ arch/*/bits -name '*.h' \
-exec sed -i '/^struct [^;{]*$/{N;s/\n/ /;}' {} +

and subsequently checked by hand to ensure that the regex did not pick
up any false positives.
2016-07-03 15:02:25 -04:00
Szabolcs Nagy
76d7cfb7e6 use the generic ioctl.h for x86_64, x32 and aarch64
they were slightly different in musl, but should be the same:
the linux uapi and glibc headers are not different.
2016-07-03 12:49:24 -04:00
Szabolcs Nagy
78b1f3cb14 add preadv2 and pwritev2 syscall numbers for linux v4.6
the syscalls take an additional flag argument, they were added in commit
f17d8b35452cab31a70d224964cd583fb2845449 and a RWF_HIPRI priority hint
flag was added to linux/fs.h in 97be7ebe53915af504fb491fb99f064c7cf3cb09.

the syscall is not allocated for microblaze and sh yet.
2016-06-09 13:38:41 -04:00
Bobby Bingham
63e3a1661f deduplicate __NR_* and SYS_* syscall number definitions 2016-05-12 00:34:05 -05:00
Bobby Bingham
8ef6170b43 x32: eliminate __X32_SYSCALL_BIT constant 2016-05-12 00:32:45 -05:00
Bobby Bingham
622fe8b5cf x32: remove arch-specific syscall remapping
These system calls are already all remapped in an arch-agnostic manner in
src/internal/syscall.h
2016-05-12 00:30:51 -05:00
Rich Felker
5c3412d225 fix regression disabling use of pause instruction for x86 a_spin
commits e24984efd5 and
16b55298dc inadvertently disabled the
a_spin implementations for i386, x86_64, and x32 by defining a macro
named a_pause instead of a_spin. this should not have caused any
functional regression, but it inhibited cpu relaxation while spinning
for locks.

bug reported by George Kulakowski.
2016-03-29 21:27:28 -04:00
Szabolcs Nagy
84d4f5eee5 add copy_file_range syscall numbers from linux v4.5
it was introduced for offloading copying between regular files
in linux commit 29732938a6289a15e907da234d6692a2ead71855

(microblaze and sh does not yet have the syscall number.)
2016-03-19 11:30:49 -04:00
Szabolcs Nagy
e9f1c7981a deduplicate bits/mman.h
currently five targets use the same mman.h constants and the rest
share most constants too, so move them to sys/mman.h before the
bits/mman.h include where the differences can be corrected by
redefinition of the macros.

this fixes two minor bugs: POSIX_MADV_DONTNEED was wrong on most
targets (it should be the same as MADV_DONTNEED), and sh defined
the x86-only MAP_32BIT mmap flag.
2016-03-18 22:40:28 -04:00
Rich Felker
4dfac11538 deduplicate the bulk of the arch bits headers
all bits headers that were identical for a number of 'clean' archs are
moved to the new arch/generic tree. in addition, a few headers that
differed only cosmetically from the new generic version are removed.

additional deduplication may be possible in mman.h and in several
headers (limits.h, posix.h, stdint.h) that mostly depend on whether
the arch is 32- or 64-bit, but they are left alone for now because
greater gains are likely possible with more invasive changes to header
logic, which is beyond the scope of this commit.
2016-01-27 21:52:14 -05:00
Szabolcs Nagy
789ff6a9f8 add MCL_ONFAULT and MLOCK_ONFAULT mlockall and mlock2 flags
they lock faulted pages into memory (useful when a small part of a
large mapped file needs efficient access), new in linux v4.4, commit
b0f205c2a3082dd9081f9a94e50658c5fa906ff1

MLOCK_* is not in the POSIX reserved namespace for sys/mman.h
2016-01-26 18:31:05 -05:00
Szabolcs Nagy
51d5f139ca add mlock2 syscall number from linux v4.4
this is mlock with a flags argument, new in linux commit
a8ca5d0ecbdde5cc3d7accacbd69968b0c98764e

as usual microblaze and sh don't have allocated syscall number yet.
2016-01-26 18:30:50 -05:00
Szabolcs Nagy
09001a8f97 add new membarrier, userfaultfd and switch_endian syscalls
new in linux v4.3 added for aarch64, arm, i386, mips, or1k, powerpc,
x32 and x86_64.

membarrier is a system wide memory barrier, moves most of the
synchronization cost to one side, new in kernel commit
5b25b13ab08f616efd566347d809b4ece54570d1

userfaultfd is useful for qemu and is new in kernel commit
8d2afd96c20316d112e04d935d9e09150e988397

switch_endian is powerpc only for switching endianness, new in commit
529d235a0e190ded1d21ccc80a73e625ebcad09b
2016-01-26 18:28:20 -05:00
Rich Felker
66215afc2e move x32 sysinfo impl and syscall fixup code out of arch/x32/src
all such arch-specific translation units are being moved to
appropriate arch dirs under the main src tree.
2016-01-22 03:39:07 +00:00
Rich Felker
16b55298dc clean up x86_64 (and x32) atomics for new atomics framework
this commit mostly makes consistent things like spacing, function
ordering in atomic_arch.h, argument names, use of volatile, etc.
a_ctz_l was also removed from x86_64 since atomic.h provides it
automatically using a_ctz_64.
2016-01-22 00:53:09 +00:00
Rich Felker
1315596b51 refactor internal atomic.h
rather than having each arch provide its own atomic.h, there is a new
shared atomic.h in src/internal which pulls arch-specific definitions
from arc/$(ARCH)/atomic_arch.h. the latter can be extremely minimal,
defining only a_cas or new ll/sc type primitives which the shared
atomic.h will use to construct everything else.

this commit avoids making heavy changes to the individual archs'
atomic implementations. definitions which are identical or
near-identical to what the new shared atomic.h would produce have been
removed, but otherwise the changes made are just hooking up the
arch-specific files to the new infrastructure. major changes to take
advantage of the new system will come in subsequent commits.
2016-01-21 19:08:54 +00:00
Rich Felker
0d58bf2d60 remove visibility suppression by SHARED macro in mips and x32 arch files
commit 8a8fdf6398 was intended to remove
all such usage, but these arch-specific files were overlooked, leading
to inconsistent declarations and definitions.
2015-12-15 23:18:38 -05:00
Rich Felker
cb1bf2f321 properly access mcontext_t program counter in cancellation handler
using the actual mcontext_t definition rather than an overlaid pointer
array both improves correctness/readability and eliminates some ugly
hacks for archs with 64-bit registers bit 32-bit program counter.

also fix UB due to comparison of pointers not in a common array
object.
2015-11-02 12:41:49 -05:00
Rich Felker
12b0b7d8ea new dlstart stage-2 chaining for x86_64 and x32 2015-09-17 07:28:44 +00:00
Rich Felker
5a9c8c05a5 mitigate performance regression in libc-internal locks on x86_64
commit 3c43c0761e fixed missing
synchronization in the atomic store operation for i386 and x86_64, but
opted to use mfence for the barrier on x86_64 where it's always
available. however, in practice mfence is significantly slower than
the barrier approach used on i386 (a nop-like lock orl operation).
this commit changes x86_64 (and x32) to use the faster barrier.
2015-08-16 18:15:18 +00:00
Rich Felker
3c43c0761e fix missing synchronization in atomic store on i386 and x86_64
despite being strongly ordered, the x86 memory model does not preclude
reordering of loads across earlier stores. while a plain store
suffices as a release barrier, we actually need a full barrier, since
users of a_store subsequently load a waiter count to determine whether
to issue a futex wait, and using a stale count will result in soft
(fail-to-wake) deadlocks. these deadlocks were observed in malloc and
possible with stdio locks and other libc-internal locking.

on i386, an atomic operation on the caller's stack is used as the
barrier rather than performing the store itself using xchg; this
avoids the need to read the cache line on which the store is being
performed. mfence is used on x86_64 where it's always available, and
could be used on i386 with the appropriate cpu model checks if it's
shown to perform better.
2015-07-28 18:40:18 +00:00
Rich Felker
c648cefb27 fix inconsistency in a_and and a_or argument types on x86[_64]
conceptually, and on other archs, these functions take a pointer to
int, but in the i386, x86_64, and x32 versions of atomic.h, they took
a pointer to void instead.
2015-05-20 00:17:35 -04:00
Rich Felker
484194dbf4 fix stack protector crashes on x32 & powerpc due to misplaced TLS canary
i386, x86_64, x32, and powerpc all use TLS for stack protector canary
values in the default stack protector ABI, but the location only
matched the ABI on i386 and x86_64. on x32, the expected location for
the canary contained the tid, thus producing spurious mismatches
(resulting in process termination) upon fork. on powerpc, the expected
location contained the stdio_locks list head, so returning from a
function after calling flockfile produced spurious mismatches. in both
cases, the random canary was not present, and a predictable value was
used instead, making the stack protector hardening much less effective
than it should be.

in the current fix, the thread structure has been expanded to have
canary fields at all three possible locations, and archs that use a
non-default location must define a macro in pthread_arch.h to choose
which location is used. for most archs (which lack TLS canary ABI) the
choice does not matter.
2015-05-06 18:37:19 -04:00
Rich Felker
7fe273b2c1 fix broken cancellation on x32 due to incorrect saved-PC offset 2015-05-02 12:16:57 -04:00
Rich Felker
4f69594689 fix dangling pointers in x32 syscall timespec fixup code
the lifetime of compound literals is the block in which they appear.
the temporary struct __timespec_kernel objects created as compound
literals no longer existed at the time their addresses were passed to
the kernel.
2015-05-01 21:22:27 -04:00
Rich Felker
4bf10ebf66 fix breakage in x32 dynamic linker due to mismatching register size
the jmp instruction requires a 64-bit register, so cast the desired PC
address up to uint64_t, going through uintptr_t to ensure that it's
zero-extended rather than possibly sign-extended.
2015-04-20 18:17:48 -04:00
Rich Felker
cbc02ba23c consistently use hidden visibility for cancellable syscall internals
in a few places, non-hidden symbols were referenced from asm in ways
that assumed ld-time binding. while these is no semantic reason these
symbols need to be hidden, fixing the references without making them
hidden was going to be ugly, and hidden reduces some bloat anyway.

in the asm files, .global/.hidden directives have been moved to the
top to unclutter the actual code.
2015-04-14 11:18:59 -04:00
Rich Felker
f3ddd17380 dynamic linker bootstrap overhaul
this overhaul further reduces the amount of arch-specific code needed
by the dynamic linker and removes a number of assumptions, including:

- that symbolic function references inside libc are bound at link time
  via the linker option -Bsymbolic-functions.

- that libc functions used by the dynamic linker do not require
  access to data symbols.

- that static/internal function calls and data accesses can be made
  without performing any relocations, or that arch-specific startup
  code handled any such relocations needed.

removing these assumptions paves the way for allowing libc.so itself
to be built with stack protector (among other things), and is achieved
by a three-stage bootstrap process:

1. relative relocations are processed with a flat function.
2. symbolic relocations are processed with no external calls/data.
3. main program and dependency libs are processed with a
   fully-functional libc/ldso.

reduction in arch-specific code is achived through the following:

- crt_arch.h, used for generating crt1.o, now provides the entry point
  for the dynamic linker too.

- asm is no longer responsible for skipping the beginning of argv[]
  when ldso is invoked as a command.

- the functionality previously provided by __reloc_self for heavily
  GOT-dependent RISC archs is now the arch-agnostic stage-1.

- arch-specific relocation type codes are mapped directly as macros
  rather than via an inline translation function/switch statement.
2015-04-13 03:04:42 -04:00
Rich Felker
fd427c4eae move O_PATH definition back to arch bits
while it's the same for all presently supported archs, it differs at
least on sparc, and conceptually it's no less arch-specific than the
other O_* macros. O_SEARCH and O_EXEC are still defined in terms of
O_PATH in the main fcntl.h.
2015-04-01 19:31:06 -04:00
Rich Felker
d5a5045382 fix MINSIGSTKSZ values for archs with large signal contexts
the previous values (2k min and 8k default) were too small for some
archs. aarch64 reserves 4k in the signal context for future extensions
and requires about 4.5k total, and powerpc reportedly uses over 2k.
the new minimums are chosen to fit the saved context and also allow a
minimal signal handler to run.

since the default (SIGSTKSZ) has always been 6k larger than the
minimum, it is also increased to maintain the 6k usable by the signal
handler. this happens to be able to store one pathname buffer and
should be sufficient for calling any function in libc that doesn't
involve conversion between floating point and decimal representations.

x86 (both 32-bit and 64-bit variants) may also need a larger minimum
(around 2.5k) in the future to support avx-512, but the values on
these archs are left alone for now pending further analysis.

the value for PTHREAD_STACK_MIN is not increased to match MINSIGSTKSZ
at this time. this is so as not to preclude applications from using
extremely small thread stacks when they know they will not be handling
signals. unfortunately cancellation and multi-threaded set*id() use
signals as an implementation detail and therefore require a stack
large enough for a signal context, so applications which use extremely
small thread stacks may still need to avoid using these features.
2015-03-18 00:31:37 -04:00
Rich Felker
673cab5c56 align x32 pthread type sizes to be common with 32-bit archs
previously, commit e7b9887e8b aligned
the sizes with the glibc ABI. subsequent discussion during the merge
of the aarch64 port reached a conclusion that we should reject larger
arch-specific sizes, which have significant cost and no benefit, and
stick with the existing common 32-bit sizes for all 32-bit/ILP32 archs
and the x86_64 sizes for 64-bit archs.

one peculiarity of this change is that x32 pthread_attr_t is now
larger in musl than in the glibc x32 ABI, making it unsafe to call
pthread_attr_init from x32 code that was compiled against glibc. with
all the ABI issues of x32, it's not clear that ABI compatibility will
ever work, but if it's needed, pthread_attr_init and related functions
could be modified not to write to the last slot of the object.

this is not a regression versus previous releases, since on previous
releases the x32 pthread type sizes were all severely oversized
already (due to incorrectly using the x86_64 LP64 definitions).
moreover, x32 is still considered experimental and not ABI-stable.
2015-03-12 14:43:36 -04:00
Szabolcs Nagy
559de8f5f0 fix FLT_ROUNDS to reflect the current rounding mode
Implemented as a wrapper around fegetround introducing a new function
to the ABI: __flt_rounds. (fegetround cannot be used directly from float.h)
2015-03-07 12:05:28 -05:00
Trutz Behn
f5011c62c3 fix POLLWRNORM and POLLWRBAND on mips
these macros have the same distinct definition on blackfin, frv, m68k,
mips, sparc and xtensa kernels. POLLMSG and POLLRDHUP additionally
differ on sparc.
2015-03-04 12:09:37 -05:00
Rich Felker
e7b9887e8b fix x32 pthread type definitions
the previous definitions were copied from x86_64. not only did they
fail to match the ABI sizes; they also wrongly encoded an assumption
that long/pointer types are twice as large as int.
2015-03-04 11:33:26 -05:00
Rich Felker
56fbaa3bbe make all objects used with atomic operations volatile
the memory model we use internally for atomics permits plain loads of
values which may be subject to concurrent modification without
requiring that a special load function be used. since a compiler is
free to make transformations that alter the number of loads or the way
in which loads are performed, the compiler is theoretically free to
break this usage. the most obvious concern is with atomic cas
constructs: something of the form tmp=*p;a_cas(p,tmp,f(tmp)); could be
transformed to a_cas(p,*p,f(*p)); where the latter is intended to show
multiple loads of *p whose resulting values might fail to be equal;
this would break the atomicity of the whole operation. but even more
fundamental breakage is possible.

with the changes being made now, objects that may be modified by
atomics are modeled as volatile, and the atomic operations performed
on them by other threads are modeled as asynchronous stores by
hardware which happens to be acting on the request of another thread.
such modeling of course does not itself address memory synchronization
between cores/cpus, but that aspect was already handled. this all
seems less than ideal, but it's the best we can do without mandating a
C11 compiler and using the C11 model for atomics.

in the case of pthread_once_t, the ABI type of the underlying object
is not volatile-qualified. so we are assuming that accessing the
object through a volatile-qualified lvalue via casts yields volatile
access semantics. the language of the C standard is somewhat unclear
on this matter, but this is an assumption the linux kernel also makes,
and seems to be the correct interpretation of the standard.
2015-03-03 22:50:02 -05:00
Szabolcs Nagy
f54c28cba2 add syscall numbers for the new execveat syscall
this syscall allows fexecve to be implemented without /proc, it is new
in linux v3.19, added in commit 51f39a1f0cea1cacf8c787f652f26dfee9611874
(sh and microblaze do not have allocated syscall numbers yet)

added a x32 fix as well: the io_setup and io_submit syscalls are no
longer common with x86_64, so use the x32 specific numbers.
2015-02-09 23:00:56 +01:00
Felix Janda
4758f0565d fix typo in x86_64/x32 user_fpregs_struct
mxcs_mask should be mxcr_mask
2015-02-01 13:49:15 -05:00
Trutz Behn
2d67ae923d move MREMAP_MAYMOVE and MREMAP_FIXED out of bits
the definitions are generic for all kernel archs. exposure of these
macros now only occurs on the same feature test as for the function
accepting them, which is believed to be more correct.
2015-01-30 22:02:23 -05:00
Szabolcs Nagy
f90fafea3c add new syscall numbers for bpf and kexec_file_load
these syscalls are new in linux v3.18, bpf is present on all
supported archs except sh, kexec_file_load is only allocted for
x86_64 and x32 yet.

bpf was added in linux commit 99c55f7d47c0dc6fc64729f37bf435abf43f4c60

kexec_file_load syscall number was allocated in commit
f0895685c7fd8c938c91a9d8a6f7c11f22df58d2
2014-12-23 01:44:19 -05:00
Rich Felker
91f15e2d0d move wint_t definition to the shared part of alltypes.h.in 2014-12-21 02:43:35 -05:00
Rich Felker
867b1822f3 add explicit barrier operation to internal atomic.h API 2014-10-10 18:17:09 -04:00
Szabolcs Nagy
4ffc39c654 add new syscall numbers for seccomp, getrandom, memfd_create
these syscalls are new in linux v3.17 and present on all supported
archs except sh.

seccomp was added in commit 48dc92b9fc3926844257316e75ba11eb5c742b2c
it has operation, flags and pointer arguments (if flags==0 then it is
the same as prctl(PR_SET_SECCOMP,...)), the uapi header for flag
definitions is linux/seccomp.h

getrandom was added in commit c6e9d6f38894798696f23c8084ca7edbf16ee895
it provides an entropy source when open("/dev/urandom",..) would fail,
the uapi header for flags is linux/random.h

memfd_create was added in commit 9183df25fe7b194563db3fec6dc3202a5855839c
it allows anon mmap to have an fd, that can be shared, sealed and needs no
mount point, the uapi header for flags is linux/memfd.h
2014-10-08 10:25:04 -04:00
Rich Felker
b7cf71a190 add threads.h and needed per-arch types for mtx_t and cnd_t
based on patch by Jens Gustedt.

mtx_t and cnd_t are defined in such a way that they are formally
"compatible types" with pthread_mutex_t and pthread_cond_t,
respectively, when accessed from a different translation unit. this
makes it possible to implement the C11 functions using the pthread
functions (which will dereference them with the pthread types) without
having to use the same types, which would necessitate either namespace
violations (exposing pthread type names in threads.h) or incompatible
changes to the C++ name mangling ABI for the pthread types.

for the rest of the types, things are much simpler; using identical
types is possible without any namespace considerations.
2014-09-06 20:44:30 -04:00