maintainer's note: while musl does not use the linux kernel headers,
it does provide these three sys/* headers which do nothing but include
the corresponding linux/* headers, since the sys/* versions are the
ones documented for application use (and they arguably provide
interfaces that are not linux-specific but common to other unices).
these headers should probably not be provided by libc (rather by a
separate package), but as long as they are, use the bits header
framework as an aid to out-of-tree ports of musl for non-linux systems
that want to implement them in some other way.
maintainer's note: at some point, probably long before linux separated
the uapi headers, it was the case, or at least I believed it was the
case, that linux/types.h was unsafe to include from userspace. thus,
the inclusion guard macro _LINUX_TYPES_H was defined in sys/kd.h to
prevent linux/kd.h from including linux/types.h (which it spuriously
includes but does not use). as far as I can tell, whatever problem
this was meant to solve does not seem to have been present for a long
time, and the hack was not done correctly anyway, so removing it is
the right thing to do.
commit 32482f61da reduced the number of
int members before the dirent buf from 4 to 3, thereby misaligning it
mod sizeof(off_t), producing invalid accesses on any arch where
alignof(off_t)==sizeof(off_t).
rather than re-adding wasted padding, reorder the struct to meet the
requirement and add a comment and static assertion to prevent this
from getting broken again.
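
the reordering and the guard described above can be sketched as follows. this is a minimal illustration, not the actual struct: the name dir_sketch, the member names, and the buffer size are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* hypothetical stand-in for the directory stream struct; the member
 * names and the buffer size are assumptions for illustration. */
struct dir_sketch {
	off_t tell;      /* member with the strictest alignment goes first */
	int fd;
	int buf_pos;
	int buf_end;
	int lock[1];
	/* any change to this struct must preserve the property:
	 * offsetof(struct dir_sketch, buf) % sizeof(off_t) == 0 */
	char buf[2048];
};

/* compile-time guard: the build fails if a later edit misaligns buf */
_Static_assert(offsetof(struct dir_sketch, buf) % sizeof(off_t) == 0,
	"dirent buffer must stay aligned mod sizeof(off_t)");

/* run-time view of the same property, handy in a test harness */
size_t dir_buf_offset(void)
{
	size_t off = offsetof(struct dir_sketch, buf);
	assert(off % sizeof(off_t) == 0);
	return off;
}
```

placing the off_t member first means the four int members that follow it add a multiple of sizeof(off_t) on common ABIs, so no padding is needed.
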
sys/ptrace.h is target specific; use bits/ptrace.h to add
target-specific macro definitions.
these macros are kept in the generic sys/ptrace.h even though some
targets don't support them:
PTRACE_GETREGS
PTRACE_SETREGS
PTRACE_GETFPREGS
PTRACE_SETFPREGS
PTRACE_GETFPXREGS
PTRACE_SETFPXREGS
so no macro definition got removed in this patch on any target. only
s390x has a numerically conflicting macro definition (PTRACE_SINGLEBLOCK).
the PT_ aliases follow glibc headers. otherwise the definitions come
from linux uapi headers, except those that glibc skips and that either
have no real kernel support (s390x PTRACE_*_AREA), need special
type definitions (mips PTRACE_*_WATCH_*), or are only relevant for
linux 2.4 compatibility (PTRACE_OLDSETOPTIONS).
new in linux v3.1 commit 3544d72a0e10d0aa1c1bd59ed77a53a59cdc12f7
changed in linux v3.4 commit 5cdf389aee90109e2e3d88085dea4dd5508a3be7
A tracer receives this event in the waitpid status of a process
attached with PTRACE_SEIZE.
including uchar.h in c++ code is only well defined in c++11 onwards,
where the char16_t and char32_t type definitions must be hidden since
they are keywords. however, some c++ code compiled for older c++
standards includes uchar.h too and needs the typedefs; this fix makes
such code work.
previously, this operation succeeded, and the relocation results
worked for access from new threads created after dlopen, but produced
invalid accesses (and possibly clobbered other memory) from threads
that already existed.
the way the check is written, it still permits dlopen of libraries
containing initial-exec references to static TLS (TLS in the main
program or in a dynamic library loaded at startup).
tls_id is one-based, whereas [static_]tls_cnt is a count, so
comparison for checking that a given tls_id is dynamic rather than
static needs to use strict inequality.
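
the off-by-one can be made concrete with a small sketch. the function name tls_id_is_dynamic is hypothetical; only the comparison itself comes from the text above.

```c
#include <assert.h>
#include <stddef.h>

/* tls_id is one-based: ids 1..static_tls_cnt denote static TLS, and
 * the first dynamically loaded TLS module has id static_tls_cnt+1.
 * the caller is assumed to pass the id of a module that has TLS. */
int tls_id_is_dynamic(size_t tls_id, size_t static_tls_cnt)
{
	assert(tls_id >= 1);
	return tls_id > static_tls_cnt;   /* strict: id == cnt is static */
}
```

with a non-strict comparison, the last static module (tls_id equal to the count) would be misclassified as dynamic.
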
this flag is notoriously under-/mis-specified, and in the past it was
implemented as a nop, essentially considering the absence of a
loopback interface with 127.0.0.1 and ::1 addresses an unsupported
configuration. however, common real-world container environments omit
IPv6 support (even for the network-namespaced loopback interface), and
some kernels omit IPv6 support entirely. future systems on the other
hand might omit IPv4 entirely.
treat these as supported configurations and suppress results of the
unconfigured/unsupported address families when AI_ADDRCONFIG is
requested. use routability of the loopback address to make the
determination; unlike other implementations, we do not exclude
loopback from the "an address is configured" condition, since there is
no basis in the specification for such exclusion. obtaining a result
with AI_ADDRCONFIG does not imply routability of the result, and
applications must still be able to cope with unroutable results even
if they pass AI_ADDRCONFIG.
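
a minimal sketch of the loopback routability probe follows. the function name and the port number are assumptions, and unlike a real implementation this sketch treats every failure as "not configured" rather than distinguishing unexpected errors.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* returns 1 if the loopback address of the given family (AF_INET or
 * AF_INET6) is routable, 0 otherwise. connect() on a UDP socket only
 * selects a route; it does not send any packets. the port number is
 * an arbitrary placeholder. */
int loopback_is_routable(int family)
{
	struct sockaddr_in lo4;
	struct sockaddr_in6 lo6;
	const struct sockaddr *sa;
	socklen_t sl;

	assert(family == AF_INET || family == AF_INET6);
	if (family == AF_INET) {
		memset(&lo4, 0, sizeof lo4);
		lo4.sin_family = AF_INET;
		lo4.sin_port = htons(65535);
		lo4.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
		sa = (const struct sockaddr *)&lo4;
		sl = sizeof lo4;
	} else {
		memset(&lo6, 0, sizeof lo6);
		lo6.sin6_family = AF_INET6;
		lo6.sin6_port = htons(65535);
		lo6.sin6_addr = in6addr_loopback;
		sa = (const struct sockaddr *)&lo6;
		sl = sizeof lo6;
	}

	int s = socket(family, SOCK_DGRAM, IPPROTO_UDP);
	if (s < 0) return 0;   /* e.g. the family is not supported at all */
	int r = connect(s, sa, sl);
	close(s);
	return r == 0;
}
```

a caller honoring AI_ADDRCONFIG would drop results of any family for which this probe returns 0.
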
commit 0b80a7b040, which added non-stub
setvbuf, applied the UNGET pushback adjustment to the size of the
buffer passed in, but inadvertently omitted offsetting the start by
the same amount, thereby allowing unget to clobber up to 8 bytes
before the start of the buffer. this bug was introduced in the present
release cycle; no releases are affected.
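
the corrected adjustment can be sketched as below. struct stream_sketch and set_buffer_sketch are hypothetical simplifications, not the real FILE layout or the real setvbuf; the value 8 for UNGET is taken from the "up to 8 bytes" in the text above.

```c
#include <assert.h>
#include <stddef.h>

#define UNGET 8   /* pushback area reserved ahead of the buffer */

/* hypothetical, pared-down stream object; not the real FILE layout */
struct stream_sketch {
	unsigned char *buf;
	size_t buf_size;
};

/* install a caller-provided buffer. returns 0 on success, -1 if the
 * buffer is missing or too small to hold the pushback area. */
int set_buffer_sketch(struct stream_sketch *f, char *buf, size_t size)
{
	assert(f);
	if (!buf || size < UNGET) return -1;
	f->buf = (unsigned char *)buf + UNGET;  /* start moves forward... */
	f->buf_size = size - UNGET;             /* ...and the size shrinks */
	return 0;
}
```

the bug amounted to applying only the second assignment's adjustment, leaving the pushback area to extend below the caller's buffer.
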
to produce sorted results roughly corresponding to RFC 3484/6724,
__lookup_name computes routability and choice of source address via
dummy UDP connect operations (which do not produce any packets). since
at the logical level, the properties fed into the sort key are
computed on ipv6 addresses, the code was written to use the v4mapped
ipv6 form of ipv4 addresses and share a common code path for them all.
however, on kernels where ipv6 support has been completely omitted,
this causes ipv4 to appear just as unroutable as ipv6, thereby putting
unreachable ipv6 addresses before ipv4 addresses in the results.
instead, use only ipv4 sockets to compute routability for ipv4
addresses. some gratuitous conversion back and forth is left so that
the logic is not affected by these changes. it may be possible to
simplify the ipv4 case considerably, thereby reducing code size and
complexity.
since slack space at the beginning and/or end of writable load maps is
donated to malloc, the application could obtain valid pointers in
these ranges which dladdr would erroneously identify as part of the
shared object whose mapping they came from.
instead of checking the queried address against the mapping base and
length, check it against the load segments from the program headers,
and only match the dso if it lies within the bounds of one of them.
as a shortcut, if the address does match the range of the mapping but
not any of the load segments, we know it cannot match any other dso
and can immediately return failure.
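
the segment-based check and its shortcut might look like the sketch below. the structs are hypothetical summaries (a real implementation would walk the program headers), and the three result codes are names invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* hypothetical summary of one load segment, already relocated to
 * run-time addresses */
struct load_seg {
	uintptr_t start;
	size_t memsz;
};

/* hypothetical summary of a loaded object */
struct dso_sketch {
	uintptr_t map;       /* base of the whole mapping */
	size_t map_len;
	const struct load_seg *segs;
	size_t nsegs;
};

enum { ADDR_FAIL = -1, ADDR_NO_MATCH = 0, ADDR_MATCH = 1 };

/* ADDR_MATCH: addr lies in a load segment of this dso.
 * ADDR_NO_MATCH: addr is outside the mapping; try the next dso.
 * ADDR_FAIL: addr is in the mapping but in no load segment (e.g.
 * slack donated to malloc); no other dso can match either. */
int classify_addr(const struct dso_sketch *d, uintptr_t addr)
{
	assert(d);
	/* unsigned subtraction wraps, so addr below start never matches */
	for (size_t i = 0; i < d->nsegs; i++)
		if (addr - d->segs[i].start < d->segs[i].memsz)
			return ADDR_MATCH;
	if (addr - d->map < d->map_len)
		return ADDR_FAIL;
	return ADDR_NO_MATCH;
}
```
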
the early-exit condition for the symbol match loop on exact matches
caused dladdr to produce the first match for an exact match, but the
last match for an inexact match. in the interest of consistency,
require a strictly-closer match to replace an already-found one.
commit 8b8fb7f037 added logic to prevent
matching a symbol with no recorded size (closest-match) when there is
an intervening symbol whose size was recorded, but it only worked when
the intervening symbol was encountered later in the search.
instead of rejecting symbols where addr falls outside their recorded
size during the closest-match search, accept them to find the true
closest-match, then reject such a result only once the search has
finished.
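
the search order described above (strictly-closer replacement, size check deferred until the end) can be sketched as follows. struct sym_sketch is a hypothetical flattened symbol record, not the ELF symbol type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* hypothetical flattened symbol record; size 0 means "not recorded" */
struct sym_sketch {
	uintptr_t addr;
	size_t size;
	const char *name;
};

const struct sym_sketch *closest_sym(const struct sym_sketch *syms,
	size_t n, uintptr_t addr)
{
	const struct sym_sketch *best = 0;

	assert(syms || !n);
	for (size_t i = 0; i < n; i++) {
		if (syms[i].addr > addr) continue;
		/* require a strictly closer match, so the first of several
		 * equally close symbols wins */
		if (best && syms[i].addr <= best->addr) continue;
		best = &syms[i];
		if (syms[i].addr == addr) break;
	}
	/* only now apply the size check: if the closest symbol has a
	 * recorded size that does not cover addr, there is no match */
	if (best && best->size && addr - best->addr >= best->size)
		best = 0;
	return best;
}
```

because sized symbols are no longer skipped during the loop, the outcome does not depend on the order in which the symbols are visited.
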
based on patch by Axel Siebenborn, with fixes discussed on the mailing
list after submission and rebased around the UB fix in commit
e829695fcc.
avoid spurious symbol matches by dladdr beyond symbol size. for
symbols with a size recorded, only match if the queried address lies
within the address range determined by the symbol address and size.
for symbols with no size recorded, the old closest-match behavior is
kept, as long as there is no intervening symbol with a recorded size.
the case where no symbol is matched, but the address does lie within
the memory range of a shared object, is specified as success. fix the
return value and produce a valid (with null dli_sname and dli_saddr)
Dl_info structure.
maintainer's note: past sentiment was that, despite being imperfect
and unable to force clearing of all possible copies of sensitive data
(e.g. in registers, register spills, signal contexts left on the
stack, etc.) this function would be added if major implementations
agreed on it, which has happened -- several BSDs and glibc all include
it.
maintainer's note: this change is for conformance with RFC 5952,
4.2.2, which explicitly forbids use of :: to shorten a single 16-bit 0
field when producing the canonical text representation for an IPv6
address. fixes a test failure reported by Philip Homburg, who also
submitted a patch, but this fix is simpler and should produce smaller
code.
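
the rule itself can be illustrated independently of any particular inet_ntop implementation. the helper below is a hypothetical sketch that only decides which run of zero fields, if any, may be replaced by "::"; it is not how the fix described above is structured.

```c
#include <assert.h>
#include <stdint.h>

/* find the run of zero fields that "::" may replace. returns the run
 * length and stores its first index in *start, or returns 0 when no
 * run of at least two fields exists (RFC 5952, 4.2.2). among runs of
 * equal length, the first one is chosen. */
int zero_run_to_compress(const uint16_t field[8], int *start)
{
	int best = 0, best_len = 0;

	assert(field && start);
	for (int i = 0; i < 8; ) {
		if (field[i]) { i++; continue; }
		int j = i;
		while (j < 8 && !field[j]) j++;
		if (j - i > best_len) { best = i; best_len = j - i; }
		i = j;
	}
	if (best_len < 2) return 0;   /* a single 0 field stays as "0" */
	*start = best;
	return best_len;
}
```

for 2001:db8:0:1:1:1:1:1 the helper returns 0, so the address is printed with its lone 0 field intact.
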
if a final dot was included in the queried host name to anchor it to
the dns root/suppress search domains, and the result was not a CNAME,
the returned canonical name included the final dot. this was not
consistent with other implementations, confused some applications, and
does not seem desirable.
POSIX specifies returning a pointer to, or to a copy of, the input
nodename, when the canonical name is not available, but does not
attempt to specify what constitutes "not available". in the case of
search, we already have an implementation-defined "availability" of a
canonical name as the fully-qualified name resulting from search, so
defining it similarly in the no-search case seems reasonable in
addition to being consistent with other implementations.
as a bonus, fix the case where more than one trailing dot is included,
since otherwise the changes made here would wrongly cause lookups with
two trailing dots to succeed. previously this case resulted in
malformed dns queries and produced EAI_AGAIN after a timeout. now it
fails immediately with EAI_NONAME.
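
a minimal sketch of the trailing-dot handling, assuming a hypothetical helper named canon_without_final_dot and a 255-byte limit on the name; the real lookup code does considerably more than this.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <netdb.h>
#include <string.h>

/* copy name into canon (at least 256 bytes) without its final dot.
 * returns 0 on success, or EAI_NONAME for a name that is empty once
 * the dot is removed, ends in more than one dot, or is too long. */
int canon_without_final_dot(const char *name, char canon[256])
{
	assert(name && canon);
	size_t l = strlen(name);
	if (l && name[l-1] == '.') l--;          /* strip one final dot */
	if (!l || name[l-1] == '.' || l > 255)   /* second dot: reject */
		return EAI_NONAME;
	memcpy(canon, name, l);
	canon[l] = 0;
	return 0;
}
```

rejecting the name up front is what turns the former timeout into an immediate failure.
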
commit 587f5a53bc moved the definition
of SO_PEERSEC to bits/socket.h for archs where the SO_* macros differ
from their standard values, but failed to add copies of the generic
definition for powerpc and powerpc64.
writable load segments can have size-in-memory larger than their size
in the ELF file, representing bss or equivalent. the initial partial
page has to be zero-filled, and additional anonymous pages have to be
mapped such that accesses don't fail with SIGBUS.
map_library skips redundant MAP_FIXED mapping of the initial
(lowest-address) segment when processing LOAD segments since it was
already mapped when reserving the virtual address range, but in doing
so, inadvertently also skipped the code to fill/map bss. typical
executable and library files have two or more LOAD segments, and the
first one is text/rodata (non-writable) and thus has no bss, but it is
syntactically valid for an ELF program/library to put its writable
segment first, or to have only one segment (everything writable). the
binutils bfd-based linker has been observed to create such programs in
the presence of unusual sections or linker scripts.
fix by moving only the mmap_fixed operation under the conditional
rather than skipping the remainder of the loop body. add a check to
avoid bss processing in the case where the segment is not writable;
this should not happen, but if it does, the change would be a crashing
regression without this check.
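
the arithmetic behind the bss handling can be sketched without touching any real mappings. struct bss_plan and plan_bss are hypothetical names; the sketch only computes which range would be zero-filled and which would be mapped anonymously for one writable segment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct bss_plan {
	uintptr_t zero_start;  /* first byte past the file-backed data */
	size_t zero_len;       /* bytes to zero in the partial page */
	uintptr_t anon_start;  /* page-aligned start of the extra pages */
	size_t anon_len;       /* bytes of anonymous memory to map */
};

/* plan bss handling for one writable load segment. vaddr is the
 * run-time address of the segment; page must be a power of two. */
struct bss_plan plan_bss(uintptr_t vaddr, size_t filesz, size_t memsz,
	size_t page)
{
	struct bss_plan p = { 0, 0, 0, 0 };

	assert(page && !(page & (page - 1)));
	if (memsz <= filesz) return p;            /* no bss at all */
	uintptr_t brk = vaddr + filesz;
	uintptr_t pgbrk = (brk + page - 1) & -(uintptr_t)page;
	uintptr_t end = (vaddr + memsz + page - 1) & -(uintptr_t)page;
	p.zero_start = brk;
	p.zero_len = pgbrk - brk;
	p.anon_start = pgbrk;
	p.anon_len = end - pgbrk;
	return p;
}
```

the point of the fix is that this step must run for the first segment too, even though its file-backed mapping is not repeated.
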
the mlock2 syscall was added in linux v4.4 and glibc has an api for it.
the wrapper falls back to mlock when flags==0, so that case works
even on older kernels.
MLOCK_ONFAULT is moved under _GNU_SOURCE following glibc.
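
the fallback can be sketched as below. the name mlock2_sketch is chosen so the example cannot clash with a libc that already declares mlock2, and the ENOSYS branch for headers lacking SYS_mlock2 is an assumption of this sketch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* same contract as mlock2: 0 on success, -1 with errno set on error */
int mlock2_sketch(const void *addr, size_t len, unsigned flags)
{
	/* no flags: plain mlock does the same job on any kernel */
	if (flags == 0) return mlock(addr, len);
#ifdef SYS_mlock2
	return syscall(SYS_mlock2, addr, len, flags);
#else
	errno = ENOSYS;
	return -1;
#endif
}
```

a call with nonzero flags still needs a kernel that implements the syscall; only the flags==0 case is portable to older kernels.
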
the mode member of struct ipc_perm is specified by POSIX to have type
mode_t, which is uniformly defined as unsigned int. however, Linux
defines it with type __kernel_mode_t, and defines __kernel_mode_t as
unsigned short on some archs. since there is a subsequent padding
field, treating it as a 32-bit unsigned int works on little endian
archs, but the order is backwards on big endian archs with the
erroneous definition.
since multiple archs are affected, remedy the situation with fixup
code in the affected functions (shmctl, semctl, and msgctl) rather
than repeating the same shims in syscall_arch.h for every affected
arch.
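
the effect of the mismatch and the shape of the fixup can be shown with plain integer arithmetic. the function names are hypothetical, and the sketch models the big endian byte layout explicitly so it behaves the same on any host.

```c
#include <assert.h>
#include <stdint.h>

/* what a 32-bit read sees when the kernel stored a 16-bit mode
 * followed by 16 bits of zero padding, on a big endian arch */
uint32_t raw_mode_big_endian(uint16_t kernel_mode)
{
	unsigned char b[4] = { kernel_mode >> 8, kernel_mode & 0xff, 0, 0 };
	return (uint32_t)b[0] << 24 | (uint32_t)b[1] << 16
	     | (uint32_t)b[2] << 8 | b[3];
}

/* fixup after reading from the kernel: move the value down */
uint32_t mode_from_kernel(uint32_t raw)
{
	return raw >> 16;
}

/* fixup before writing to the kernel: move the value up to where
 * the kernel's 16-bit field lives */
uint32_t mode_to_kernel(uint32_t mode)
{
	assert(mode <= 0xffff);
	return mode << 16;
}
```

on little endian archs the 16-bit field already occupies the low half of the 32-bit word, which is why no fixup is needed there.
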
PR_{SET,GET}_SPECULATION_CTRL controls speculation related vulnerability
mitigations, new in commits
b617cfc858161140d69cc0b5cc211996b557a1c7
356e4bfff2c5489e016fdb925adbf12a1e3950ee
new and missing netlink attribute types for SCM_TIMESTAMPING_OPT_STATS;
the new ones were added in commits
7156d194a0772f733865267e7207e0b08f81b02b
be631892948060f44b1ceee3132be1266932071e
87ecc95d81d951b0984f2eb9c5c118cb68d0dce8
introduced to stat ipc objects without permission checks since the
info is available in /proc/sysvipc anyway, new in linux commits
23c8cec8cf679b10997a512abb1e86f0cedc42ba
a280d6dc77eb6002f269d58cd47c7c7e69b617b6
c21a6970ae727839a2f300cd8dd957de0d0238c3
to map at a fixed address without unmapping underlying mappings
(fails with EEXIST unlike MAP_FIXED), new in linux commits
4ed28639519c7bad5f518e70b3284c6e0763e650 and
a4ff8e8620d3f4f50ac4b41e8067b7d395056843.
add pkey_mprotect, pkey_alloc, pkey_free syscall numbers,
new in linux commits 3350eb2ea127978319ced883523d828046af4045
and 9499ec1b5e82321829e1c1510bcc37edc20b6f38
to get seccomp state for checkpoint restore.
added in linux commit 26500475ac1b499d8636ff281311d633909f5d20
the struct tag follows the glibc api, and ptrace_peeksiginfo_args
was changed accordingly.
added to uapi in commit 65aaf87b3aa2d049c6b9fd85221858a895df3393
used since commit a9a08845e9acbd224e4ee466f5c1275ed50054e8,
which renamed POLL* to EPOLL* in the kernel.
three ABIs are supported: the default with 68881 80-bit fpu format and
results returned in floating point registers, softfloat-only with the
same format, and coldfire fpu with IEEE single/double only. only the
first is tested at all, and only under qemu which has fpu emulation
bugs.
basic functionality smoke tests have been performed for the most
common arch-specific breakage via libc-test and qemu user-level
emulation. some sysvipc failures remain, but are shared with other big
endian archs and will be fixed separately.
since x86 and m68k are the only archs with 80-bit long double and each
has mandatory endianness, select the variant via endianness.
differences are minor: apparently just byte order and representation
of infinities. the m68k format is not well-documented anywhere I could
find, so if other differences are found they may require additional
changes later.
In TLS variant I the TLS is above TP (or above a fixed offset from TP)
but on some targets there is a reserved gap above TP before TLS starts.
This matters for the local-exec tls access model when the offsets of
TLS variables from the TP are hard coded by the linker into the
executable, so the libc must compute these offsets the same way as the
linker. The tls offset of the main module has to be
alignup(GAP_ABOVE_TP, main_tls_align).
If there is no TLS in the main module then the gap can be ignored
since musl does not use it and the tls access models of shared
libraries are not affected.
The previous setup only worked if (tls_align & -GAP_ABOVE_TP) == 0
(i.e. TLS did not require large alignment) because the gap was
treated as a fixed offset from TP. Now the TP points at the end
of the pthread struct (which is aligned) and there is a gap above
it (which may also need alignment).
The fix required changing TP_ADJ and __pthread_self on affected
targets (aarch64, arm and sh) and in the tlsdesc asm the offset to
access the dtv changed too.
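
The offset rule for the main module, alignup(GAP_ABOVE_TP, main_tls_align), can be sketched as follows. The function names are hypothetical, and the gap is passed as a parameter here only so the example can be exercised with different values; on a real target it is a fixed constant.

```c
#include <assert.h>
#include <stddef.h>

/* round x up to a multiple of align (a power of two) */
size_t alignup(size_t x, size_t align)
{
	assert(align && !(align & (align - 1)));
	return (x + align - 1) & -align;
}

/* offset of the main module's TLS above TP; gap stands in for
 * GAP_ABOVE_TP */
size_t main_tls_offset(size_t gap, size_t main_tls_align)
{
	return alignup(gap, main_tls_align);
}
```

With a 16-byte gap, an 8-byte-aligned TLS segment starts at offset 16, while a 64-byte-aligned one starts at offset 64, which a fixed offset from TP cannot express.
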