The TOC pointer is constant within a single dso, but needs to be saved
and restored around cross-dso calls. The PLT stub saves it to the
caller's stack frame, and the linker adds code to the caller to restore
it.
With a local call, as within a single dso or with static linking, this
doesn't happen and the TOC pointer is always in r2. Therefore,
setjmp/longjmp need to save/restore the TOC pointer from/to different
locations depending on whether the call to setjmp was a local or non-local
call.
It is always safe for longjmp to restore to both r2 and the caller's stack.
If the call to setjmp was local, and only r2 matters and the stack location
will be ignored, but is required by the ABI to be reserved for the TOC
pointer. If the call was non-local, then only the stack location matters,
and whatever is restored into r2 will be clobbered anyway when the caller
reloads r2 from the stack.
A little extra care is required for sigsetjmp, because it uses setjmp
internally. After the second return from this setjmp call, r2 will contain
the caller's TOC pointer instead of libc's TOC pointer. We need to save
and restore the correct libc pointer before we can tail call to
__sigsetjmp_tail.
neither current compilers nor linkers treat protected visibility the
way I expected, as having fixed source-level semantics rather than
being dependent on target-specific ABI details, and change seems
unlikely. while the use here does not actually depend on the specific
semantics, at least some versions of some linkers, especially lld,
refuse to allow linking to a libc.so where the symbols have protected
visibility. this cannot be detected at configure-time because linking
libc.so itself works fine, and because even if we could test linking
an application against libc.so successfully, we could not justifiably
assume that the same linker used to link libc.so would also be used
later to link applications.
disable the vis.h hack by default at the configure level, but add an
explicit "auto" option to request the old configure-time detection
rather than just removing it. this leaves it easy to evaluate whether
it actually resulted in significant size or performance benefits while
ensuring that out-of-the-box builds are not unlinkable for some users.
fortunately, preliminary evaluation suggests that at least x86_64,
arm, and aarch64 don't suffer at all from the change, and impact on
other archs is low. if low is not low enough, it should not be hard to
analyze where the significant PLT call ABI costs are present and
mitigate them without the hack.
since setlocale(cat, NULL) is required to return the setting for the
global locale, there is no standard mechanism to obtain the name of
the currently active thread-local locale set by uselocale. this makes
it impossible for application/library software to load appropriate
translations, etc. unless using the gettext implementation provided by
libc, which has privileged access to libc internals.
to fill this gap, glibc introduced the _NL_LOCALE_NAME macro which can
be used with nl_langinfo to obtain the name. GNU gettext/gnulib code
already use this functionality on glibc, and can easily be adapted to
make use of it on non-glibc systems if it's available; for other
systems they poke at locale implementation internals, which we want to
avoid. this patch provides a compatible interface to the one glibc
introduced.
The switch statement has no 'default:' case and the function ends
immediately following the switch, so the extra comparison did not
communicate any extra information to the compiler.
The flag 1<<7 is used in several places for different purposes that are
not always easy to distinguish. Mark those usages that correspond to the
flag that is used by the kernel for futexes.
previously, the pathname used to load the program was always used as
argv[0]. the default remains the same, but a new --argv0 option can be
used to provide a different value.
a null pointer for a library's deps list was ambiguous: it could
indicate either no dependencies or that the dependency list had not
yet been populated. inability to distinguish could lead to spurious
work when dlopen is called multiple times on a library with no deps,
and due to related bugs, could actually cause other libraries to
falsely appear as dependencies, translating into false positives for
dlsym.
avoid the problem by always initializing the deps pointer, pointing to
an empty list if there are no deps. rather than wasting memory and
introducing another failure path by allocating an empty list per
library, simply share a global dummy list.
further fixes will be needed for related bugs, and much of this code
may end up being replaced.
while the official elfv2 abi for "powerpc64le" sets power8 as the
baseline isa, we use it for both little and big endian powerpc64
targets and need to maintain compatibility with pre-power8 models. the
instructions for sqrt, fabs, and fma are in the baseline isa; support
for the rest is conditional via predefined isa-level macros.
patch by David Edelsohn.
these were introduced in z196 and not available in the baseline (z900)
ISA level. use __HTM__ as an alternate indicator for ISA level, since
gcc did not define __ARCH__ until 7.x.
patch by David Edelsohn.
in arm rtabi these __aeabi_* functions have special abi (they are
only allowed to clobber r0,r1,r2,r3,ip,lr,cpsr), so they cannot
be simple wrappers around normal string functions (which may
clobber other registers), the safest solution is to write them in
asm, a minimalistic implementation works because these are not
supposed to be emitted by compilers or used in general.
commit 97bd6b09db refactored the table
lookup into a function and introduced an error in index computation.
the error caused garbage to be read from the table if the given charmap
had a non-zero number of elided entries.
POSIX requires ctime_r return a null pointer on failure, which can
occur if the input time_t value is not representable in broken down
form.
based on patch by Alexander Monakov.
these functions return an error code, and are not explicitly
documented to set errno, but they are nonstandard and the historical
implementations do set errno as well, and some applications expect
this behavior. do likewise for compatibility.
patch by Rudolph Pereira.
ctime passes the result from localtime directly to asctime. But in case
of error, localtime returns 0. This causes an error (NULL pointer
dereference) in asctime.
based on patch by Omer Anson.
mremap seems to always fail on nommu, and on some non-Linux
implementations of the Linux syscall API, it at least fails to
increase allocation size, and may fail to move (i.e. defragment) the
existing mapping when shrinking it too. instead of failing realloc or
leaving an over-sized allocation that may waste a large amount of
memory, fallback to malloc-memcpy-free if mremap fails.
POSIX defines getdate error #5 as:
"An I/O error is encountered while reading the template file."
POSIX defines getdate error #7 as:
"There is no line in the template that matches the input."
This change correctly disambiguates between the two error conditions.
the check to prevent matching empty string wrongly blocked matching
of "/" due to checking emptiness after stripping leading slashes
rather than checking the full original argument string.
simplified from patch by Julien Ramseier.
when using the sh4a opcodes, the assembler tags the resulting object
file as requiring sh4a. the linker then refuses to (static) link it
with object files marked as requiring j2, since there is no isa level
that includes both sh4a and j2 instructions.
at one point, clang reportedly failed to support the asm register
constraints needed for inline syscalls. versions of clang that old
have much bigger problems that preclude using them to compile musl
libc.
mips64 requires 'struct stat' conversion due to incorrect 32-bit
fields where time_t should be in the kernel version of the structure.
syscall_arch.h already performed the correct translation for stat,
fstat, and lstat syscalls, but omitted special handling for fstatat.
The flags argument was missing, causing uninitalized data to be passed
to fchownat(2). The correct value of flags should match the fallback for
chown(3).
there was missing reverse-conversion logic for the case, handled
specially in the character set tables, where a byte represents a
unicode codepoint with the same value.
this patch adds code to handle the case, and refactors the two-level
10-bit table lookup for legacy character sets into a function to avoid
repeating it yet another time as part of the fix.
per POSIX, EINVAL is not a mandatory error, only an optional one. but
reporting unsupported flags allows an application to fallback
gracefully when a requested feature is not supported. this is not
helpful now, but it may be in the future if additional flags are
added.
had this checking been present before, applications would have been
able to check for the newly-added POSIX_SPAWN_SETSID feature (added in
commit bb439bb171) at runtime.
the bit is reserved anyway for ABI-compat reasons; this documents it
and makes it so we can have posix_spawnattr_setflags check for flag
validity without hard-coding an anonymous bit value.
This structure was missed when creating the s390x port.
This is based on the report and patch from William Pitcock, but with a
modified structure defintion to more closely match the kernel's
definition.
the code being removed was written to optimize for size assuming the
compiler cannot collapse code paths for different types with the same
underlying representation. modern compilers sometimes succeed in
making this optimization themselves, but either way it's a small size
difference and not worth the source-level complexity or the UB
involved in this hack.
some incorrect use of va_arg still remains, particularly use of void *
where the actual argument has a different pointer type. fixing this
requires some actual code additions, rather than just removing cruft,
so I'm leaving it to be done later as a separate commit.
commit 0a950dcf15 added checking that
the pathname a tty device was opened with actually matches the device,
which can fail to hold when a container inherits a tty from outside
the container. the error code added at the time was ENOENT; however,
discussions between affected applications and glibc developers
resulted in glibc adopting ENODEV as the error for this condition, and
this has now been documented in the man pages project as well. adopt
the same error code for consistency.
patch by Christian Brauner.
commit d6cb08bcac moved the code and
introduced an incorrect string offset for the new parsing, probably
due to a copy-and-paste error.
patch by Stefan Sedich.
in nearest rounding mode scalbn could introduce double rounding error
when an intermediate value and the final result were both in the
subnormal range e.g.
scalbn(0x1.7ffffffffffffp-1, -1073)
returned 0x1p-1073 instead of 0x1p-1074, because the intermediate
computation got rounded to 0x1.8p-1023.
with the fix an intermediate value can only be in the subnormal range
if the final result is 0 which is correct even after double rounding.
(there still can be two roundings so signals may be raised twice, but
that's only observable with trapping exceptions which is not supported.)
normally 32-bit archs use the mmap2 syscall and are limited to an
offset of 2^32 pages. however some 32-bit archs (mainly ILP32-on-64
ones like x32) have 64-bit syscall argument slots and thus can accept
the full range. don't artifically limit them.
analogous to commit 5bf7eba213, use of
AT_PHDR/PT_PHDR does not actually work to find the program base, and
the method with _DYNAMIC vs PT_DYNAMIC must be used as an alternative.
patch by Shiz, along with testing to confirm that this fixes unwinding
in static PIE.
due to testing buf[i].family==AF_INET before checking i==cnt, it was
possible to read past the end of the array, or past the valid part. in
practice, without active bounds/indeterminate-value checking by the
compiler, the worst that happened was failure to return early and
optimize out the sorting that's unneeded for v4-only results.
returning on i==cnt-1 rather than i==cnt would be an alternate fix,
but the approach this patch takes is more idiomatic and less
error-prone.
patch by Timo Teräs.
this should increase performance and reduce code size on aarch64.
the compiled code was checked against using __builtin_* instead
of inline asm with gcc-6.2.0.
lrint is two instructions.
c with inline asm is used because it is safer than a pure asm
implementation, this prevents ll{rint,round} to be an alias
of l{rint,round} (because the types don't match) and depends
on gcc style inline asm support.
ceil, floor, round, trunc can either raise inexact on finite
non-integer inputs or not raise any exceptions. the new
implementation does not raise exceptions while the generic
c code does.
on aarch64, the underflow exception is signaled before rounding
(ieee 754 allows both before and after rounding, but it must be
consistent), the generic fma c code signals it after rounding
so using single instruction fixes a slight conformance issue too.
With REG_NEWLINE, POSIX says:
"A <newline> in string shall not be matched by a period outside
a bracket expression or by any form of a non-matching list"