Commit Graph

2816 Commits

Author SHA1 Message Date
Rich Felker
4e8a356165 overhaul aio implementation for correctness
previously, aio operations were not tracked by file descriptor; each
operation was completely independent. this resulted in non-conforming
behavior for non-seekable/append-mode writes (which are required to be
ordered) and made it impossible to implement aio_cancel, which in turn
made closing file descriptors with outstanding aio operations unsafe.

the new implementation is significantly heavier (roughly twice the
size, and seems to be slightly slower) and presently aims mainly at
correctness, not performance.

most of the public interfaces have been moved into a single file,
aio.c, because there is little benefit to be had from splitting them.
whenever any aio functions are used, aio_cancel and the internal
queue lifetime management and fd-to-queue mapping code must be linked,
and these functions make up the bulk of the code size.

the close function's interaction with aio is implemented with weak
alias magic, to avoid pulling in heavy aio cancellation code in
programs that don't use aio, and the expensive cancellation path
(which includes signal blocking) is optimized out when there are no
active aio queues.
2015-02-13 01:10:11 -05:00
Rich Felker
594ffed82f fix bad character checking in wordexp
the character sequence '$((' was incorrectly interpreted as the
opening of arithmetic even within single-quoted contexts, thereby
suppressing the checks for bad characters after the closing quote.

presently bad character checking is only performed when the WRDE_NOCMD
flag is used; this patch only corrects checking in that case.
2015-02-11 01:37:01 -05:00
Josiah Worcester
700e08993c refactor passwd file access code
this allows getpwnam and getpwuid to share code with the _r versions
in preparation for alternate backend support.
2015-02-10 22:57:02 -05:00
Denys Vlasenko
74e334dcd1 x86_64/memset: avoid performing final store twice
The code does a potentially misaligned 8-byte store to fill the tail
of the buffer. Then it fills the initial part of the buffer,
which is a multiple of 8 bytes.
Therefore, if size is divisible by 8, we were storing the last word twice.

This patch decrements the byte count before dividing it by 8,
making one less store in the "size is divisible by 8" case,
and not changing anything in all other cases.
All at the cost of replacing one MOV insn with an LEA insn.

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2015-02-10 18:54:27 -05:00
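The store-count argument in 74e334dcd1 can be checked with a small C model. This is only a sketch: `model_memset` and its return value (the number of 8-byte stores issued) are hypothetical, it assumes n >= 8, and the real code is x86_64 assembly rather than C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the strategy described above; assumes n >= 8.
 * Returns the number of 8-byte stores performed. */
static size_t model_memset(unsigned char *dst, int c, size_t n)
{
    /* Replicate the fill byte into all 8 bytes of a word. */
    uint64_t word = 0x0101010101010101ULL * (unsigned char)c;
    size_t stores = 0;

    assert(n >= 8);

    /* Tail: one possibly misaligned 8-byte store ending at dst+n. */
    memcpy(dst + n - 8, &word, 8);
    stores++;

    /* Head: (n-1)/8 whole words from the start. Decrementing n before
     * dividing skips the word that the tail store already covered
     * when n is a multiple of 8. */
    for (size_t i = 0; i < (n - 1) / 8; i++) {
        memcpy(dst + 8 * i, &word, 8);
        stores++;
    }
    return stores;
}
```

For n = 16 the model issues 2 stores, where dividing the undecremented count would give 16/8 + 1 = 3. For n = 17 both variants issue 3.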
Denys Vlasenko
bf2071eda3 x86_64/memset: simple optimizations
"and $0xff,%esi" is a six-byte insn (81 e6 ff 00 00 00), can use
4-byte "movzbl %sil,%esi" (40 0f b6 f6) instead.

64-bit imul is slow, move it as far up as possible so that the result
(rax) has more time to be ready by the time we start using it
in mem stores.

There is no need to shuffle registers in preparation for "rep movs"
if we are not going to take that code path. Thus, the patch moves the
"jump if len < 16" instructions up, and changes the alternate code path
to use rdx and rdi instead of rcx and r8.

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2015-02-10 18:53:31 -05:00
Timo Teräs
6a5242e4cb make protocol table zero byte separated and add ipv6 protocols
2015-02-10 16:54:33 -05:00
Szabolcs Nagy
f54c28cba2 add syscall numbers for the new execveat syscall
this syscall allows fexecve to be implemented without /proc, it is new
in linux v3.19, added in commit 51f39a1f0cea1cacf8c787f652f26dfee9611874
(sh and microblaze do not have allocated syscall numbers yet)

added an x32 fix as well: the io_setup and io_submit syscalls are no
longer common with x86_64, so use the x32 specific numbers.
2015-02-09 23:00:56 +01:00
Szabolcs Nagy
70572dce07 add new socket options SO_INCOMING_CPU, SO_ATTACH_BPF, SO_DETACH_BPF
these socket options are new in linux v3.19, introduced in commit
2c8c56e15df3d4c2af3d656e44feb18789f75837 and commit
89aa075832b0da4402acebd698d0411dcc82d03e

with SO_INCOMING_CPU, the cpu on which a socket is managed inside the
kernel can be queried, so that polling of a large number of sockets
can be optimized accordingly.

SO_ATTACH_BPF lets eBPF programs (created by the bpf syscall) be
attached to sockets.
2015-02-09 22:22:13 +01:00
Szabolcs Nagy
339cc250f6 use the internal macro name FUTEX_PRIVATE in __wait
the name was recently added for the setxid/synccall rework,
so use the name now that we have it.
2015-02-09 22:11:52 +01:00
Szabolcs Nagy
f3f29795da add IEEE binary128 long double support to floatscan
just defining the necessary constants:

 LD_B1B_MAX is 2^113 - 1 in base 10^9
 KMAX is 2048 so the x array can hold up to 18432 decimal digits

(the worst case is converting 2^-16495 = 5^16495 * 10^-16495 to
binary, it requires the processing of int(log10(5)*16495)+1 = 11530
decimal digits after discarding the leading zeros, the conversion
requires some headroom in x, but KMAX is more than enough for that)

However this code is not optimal on archs with IEEE binary128
long double because the arithmetic is software emulated (on
all such platforms as far as i know), which means a big and slow
strtod.
2015-02-09 21:38:02 +01:00
Szabolcs Nagy
018f9df444 math: fix fmodl for IEEE binary128
This trivial copy-paste bug went unnoticed due to lack of testing.
No currently supported target archs are affected.
2015-02-09 01:16:35 +01:00
Szabolcs Nagy
04d522cba6 simplify armhf fesetenv
armhf fesetenv implementation did a useless read of the fpscr.
2015-02-08 19:13:09 +01:00
Szabolcs Nagy
5fc1487832 fix fesetenv(FE_DFL_ENV) on mips
mips fesetenv did not handle FE_DFL_ENV, now fcsr is cleared in that
case.
2015-02-08 18:56:52 +01:00
Szabolcs Nagy
3f92f92cb9 math: fix __fpclassifyl(-0.0) for IEEE binary128
The sign bit was not cleared before checking for 0 so -0.0
was misclassified as FP_SUBNORMAL instead of FP_ZERO.
2015-02-08 17:41:56 +01:00
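The classification bug fixed in 3f92f92cb9 is easiest to see on a narrower format. The following is a hedged sketch for IEEE binary64 `double`, not the binary128 `__fpclassifyl` source; `sketch_fpclassify` is a hypothetical name. The point is the order of operations: the sign bit is masked off before the value is tested for zero.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical binary64 analogue; not the musl __fpclassifyl source. */
static int sketch_fpclassify(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);

    /* Clear the sign bit first so that -0.0 has an all-zero magnitude. */
    uint64_t mag = bits & 0x7fffffffffffffffULL;
    int e = (int)(mag >> 52);                    /* biased exponent */
    uint64_t frac = mag & 0x000fffffffffffffULL; /* mantissa bits */

    if (e == 0)
        return mag ? FP_SUBNORMAL : FP_ZERO;
    if (e == 0x7ff)
        return frac ? FP_NAN : FP_INFINITE;
    return FP_NORMAL;
}
```

If the zero test used `bits` instead of `mag`, the exponent field of -0.0 would still be zero while the tested value would be nonzero, which is the FP_SUBNORMAL misclassification the commit describes.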
Szabolcs Nagy
6e76e1540f add parentheses in fma.c to clarify intent and silence warnings
2015-02-08 17:40:54 +01:00
Rich Felker
c63c98a606 make getaddrinfo support SOCK_RAW and other socket types
all socket types are accepted at this point, but that may be changed
at a later time if the behavior is not meaningful for other types. as
before, omitting type (a value of 0) gives both UDP and TCP results,
and SOCK_DGRAM or SOCK_STREAM restricts to UDP or TCP, respectively.
for other socket types, the service name argument is required to be a
null pointer, and the protocol number provided by the caller is used.
2015-02-07 14:01:34 -05:00
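A caller-side sketch of the behavior described in c63c98a606: a SOCK_RAW lookup passes a null service name and supplies the protocol number itself. The helper name `lookup_raw_icmp` is hypothetical, and the numeric loopback address is chosen only so that no resolver traffic is needed.

```c
#include <netdb.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical helper: returns 0 on success and stores the socket type
 * and protocol of the first result. */
static int lookup_raw_icmp(int *socktype, int *protocol)
{
    struct addrinfo hints = {0}, *res;

    hints.ai_family = AF_INET;
    hints.ai_socktype = SOCK_RAW;      /* neither SOCK_STREAM nor SOCK_DGRAM */
    hints.ai_protocol = IPPROTO_ICMP;  /* caller-provided protocol number */
    hints.ai_flags = AI_NUMERICHOST;   /* no name resolution */

    /* The service argument must be a null pointer for this socket type. */
    int r = getaddrinfo("127.0.0.1", NULL, &hints, &res);
    if (r) return r;

    *socktype = res->ai_socktype;
    *protocol = res->ai_protocol;
    freeaddrinfo(res);
    return 0;
}
```

On success the first result is expected to echo back SOCK_RAW and the protocol number that was passed in.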
Szabolcs Nagy
e63833cd43 remove cruft from x86_64 syscall.h
x86_64 syscall.h defined some musl internal syscall names and made
them public. These defines were already moved to src/internal/syscall.h
(except for SYS_fadvise which is added now) so the cruft in x86_64
syscall.h is not needed.
2015-02-07 11:55:00 -05:00
Rich Felker
61b1d10212 fix failure of fchmodat to report EOPNOTSUPP in the race path
in the case where a non-symlink file was replaced by a symlink during
the fchmodat operation with AT_SYMLINK_NOFOLLOW, mode change on the
new symlink target was successfully suppressed, but the error was not
reported. instead, fchmodat simply returned 0.
2015-02-05 23:34:27 -05:00
Rich Felker
2736eb6caa fix fd leak race (missing O_CLOEXEC) in fchmodat
2015-02-04 22:50:40 -05:00
Rich Felker
14a0117117 make execvp continue PATH search on EACCES rather than issuing an error
the specification for execvp itself is unclear as to whether
encountering a file that cannot be executed due to EACCES during the
PATH search is a mandatory error condition; however, XBD 8.3's
specification of the PATH environment variable clarifies that the
search continues until a file with "appropriate execution permissions"
is found.

since it seems undesirable/erroneous to report ENOENT rather than
EACCES when an early path element has a non-executable file and all
later path elements lack any file by the requested name, the new code
stores a flag indicating that EACCES was seen and sets errno back to
EACCES in this case.
2015-02-03 00:31:35 -05:00
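The PATH search policy from 14a0117117 can be sketched as follows. This is an illustration under stated assumptions, not the musl source: `search_path` takes a hypothetical `try_exec` callback in place of execve, and `fake_exec` is a stub that pretends /a/prog exists without execute permission and /c/prog is runnable. A real exec does not return on success; the stub returns 0 instead.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for execve, for demonstration only. */
static int fake_exec(const char *path)
{
    if (!strcmp(path, "/c/prog")) return 0;
    errno = strcmp(path, "/a/prog") ? ENOENT : EACCES;
    return -1;
}

/* Try file in each element of a colon-separated path list. */
static int search_path(const char *path, const char *file,
                       int (*try_exec)(const char *))
{
    int seen_eacces = 0;
    size_t flen = strlen(file);

    for (const char *p = path;;) {
        const char *z = strchr(p, ':');
        size_t len = z ? (size_t)(z - p) : strlen(p);
        char buf[4096];

        /* Overlong candidates are skipped in this sketch. */
        if (len + flen + 2 <= sizeof buf) {
            memcpy(buf, p, len);
            buf[len] = '/';
            /* An empty element means the current directory: no slash. */
            memcpy(buf + len + (len > 0), file, flen + 1);

            if (try_exec(buf) == 0) return 0;
            if (errno == EACCES) seen_eacces = 1;   /* remember, keep going */
            else if (errno != ENOENT && errno != ENOTDIR) return -1;
        }
        if (!z) break;
        p = z + 1;
    }
    /* Prefer EACCES over ENOENT if any candidate was not executable. */
    errno = seen_eacces ? EACCES : ENOENT;
    return -1;
}
```

With the stub, searching "/a:/b" fails with EACCES rather than ENOENT, while searching "/a:/c" continues past the non-executable file and succeeds.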
Rich Felker
3559f0b894 fix missing memory barrier in cancellation signal handler
in practice this was probably a non-issue, because the necessary
barrier almost certainly exists in kernel space -- implementing signal
delivery without such a barrier seems impossible -- but for the sake
of correctness, it should be done here too.

in principle, without a barrier, it is possible that the thread to be
cancelled does not see the store of its cancellation flag performed by
another thread. this affects both the case where the signal arrives
before entering the critical program counter range from __cp_begin to
__cp_end (in which case both the signal handler and the inline check
fail to see the value which was already stored) and the case where the
signal arrives during the critical range (in which case the signal
handler should be responsible for cancellation, but when it does not
see the cancellation flag, it assumes the signal is spurious and
refuses to act on it).

in the fix, the barrier is placed only in the signal handler, not in
the inline check at the beginning of the critical program counter
range. if the signal handler runs before the critical range is
entered, it will of course take no action, but its barrier will ensure
that the inline check subsequently sees the store. if on the other
hand the inline check runs first, it may miss seeing the store, but
the subsequent signal handler in the critical range will act upon the
cancellation request. this strategy avoids adding a memory barrier in
the common, non-cancellation code path.
2015-02-03 00:16:55 -05:00
Felix Janda
4758f0565d fix typo in x86_64/x32 user_fpregs_struct
mxcs_mask should be mxcr_mask
2015-02-01 13:49:15 -05:00
Trutz Behn
0b21a07c78 make fsync, fdatasync, and msync cancellation points
these are mandatory cancellation points per POSIX, so their omission
was a conformance bug.
2015-01-30 22:05:40 -05:00
Trutz Behn
2d67ae923d move MREMAP_MAYMOVE and MREMAP_FIXED out of bits
the definitions are generic for all kernel archs. exposure of these
macros now only occurs on the same feature test as for the function
accepting them, which is believed to be more correct.
2015-01-30 22:02:23 -05:00
Trutz Behn
c7b05bc817 fix missing comma in sh setjmp asm
this typo did not result in an erroneous setjmp with at least binutils
2.22 but fix it for clarity and compatibility with potentially stricter
sh assemblers.
2015-01-30 21:58:28 -05:00
Trutz Behn
02d8770dcf remove mips-only EINIT and EREMDEV errnos
the errno values are unused by the kernel and the macro definitions were
never exposed by glibc.
2015-01-30 21:58:11 -05:00
Rich Felker
b553dc4fe6 fix failure of configure to detect gcc due to message translations
based on patch by Vadim Ushakov. in general overriding LC_ALL rather
than specific categories (here, LC_MESSAGES) is undesirable, but
LC_ALL is easier and in this case there is nothing else that depends
on the locale in this invocation of the compiler.
2015-01-30 21:54:58 -05:00
Rich Felker
ecb608192a fix erroneous return of partial username matches by getspnam[_r]
when using /etc/shadow (rather than tcb) as its backend, getspnam_r
matched any username starting with the caller-provided string rather
than requiring an exact match. in practice this seems to have affected
only systems where one valid username is a prefix for another valid
username, and where the longer username appears first in the shadow
file.
2015-01-21 14:26:05 -05:00
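The difference between a prefix match and an exact match on a shadow-file line, as described in ecb608192a, can be sketched like this. `line_matches_user` is a hypothetical helper, not the function in the musl source; it assumes the usual colon-separated line layout with the username as the first field.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: nonzero if the first field of line is exactly name. */
static int line_matches_user(const char *line, const char *name)
{
    assert(line && name);
    size_t l = strlen(name);

    /* Comparing only the first l bytes would also accept longer names
     * that start with name; requiring ':' right after rules those out. */
    return strncmp(line, name, l) == 0 && line[l] == ':';
}
```

With this check, a lookup for "bob" no longer matches a line that begins with "bobby:".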
Rich Felker
63cac4e29a simplify part of getopt_long
as a result of commit e8e4e56a8c,
the later code path for setting optarg to a null pointer is no longer
necessary, and removing it eliminates an indention level and arguably
makes the code more readable.
2015-01-21 13:28:40 -05:00
Rich Felker
e8e4e56a8c always set optarg in getopt_long
the standard getopt does not touch optarg unless processing an option
with an argument. however, programs using the GNU getopt API, which we
attempt to provide in getopt_long, expect optarg to be a null pointer
after processing an option without an argument.

before argument permutation support was added, such programs typically
detected its absence and used their own replacement getopt_long,
masking the discrepancy in behavior.
2015-01-21 13:16:15 -05:00
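The expectation that e8e4e56a8c addresses can be sketched from the caller's side. The helper name, the option table, and the "stale" placeholder are all hypothetical; the sketch assumes a getopt_long that follows the GNU behavior the commit describes.

```c
#include <getopt.h>
#include <stddef.h>

/* Hypothetical check: parse a single flag that takes no argument and
 * report whether optarg was reset to a null pointer. */
static int optarg_is_null_after_flag(void)
{
    static char arg0[] = "prog", arg1[] = "-a";
    static char stale[] = "stale";
    static const struct option longopts[] = {
        { "all", no_argument, NULL, 'a' },
        { 0, 0, 0, 0 }
    };
    char *argv[] = { arg0, arg1, NULL };

    optind = 1;
    optarg = stale;   /* simulate a value left over from an earlier option */

    int c = getopt_long(2, argv, "a", longopts, NULL);
    return c == 'a' && optarg == NULL;
}
```

Programs written against the GNU API may test `optarg` after a flag like this one, so a stale non-null value would be misread as an argument.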
Rich Felker
78a8ef47c4 overhaul __synccall and fix AS-safety and other issues in set*id
multi-threaded set*id and setrlimit use the internal __synccall
function to work around the kernel's wrongful treatment of these
process properties as thread-local. the old implementation of
__synccall failed to be AS-safe, despite POSIX requiring setuid and
setgid to be AS-safe, and was not rigorous in assuring that all
threads were caught. in a worst case, threads late in the process of
exiting could retain permissions after setuid reported success, in
which case attacks to regain dropped permissions may have been
possible under the right conditions.

the new implementation of __synccall depends on the presence of
/proc/self/task and will fail if it can't be opened, but is able to
determine that it has caught all threads, and does not use any locks
except its own. it thereby achieves AS-safety simply by blocking
signals to preclude re-entry in the same thread.

with this commit, all known conformance and safety issues in set*id
functions should be fixed.
2015-01-15 23:17:38 -05:00
Rich Felker
7152a61a3a add FUTEX_PRIVATE macro to internal futex.h
2015-01-15 22:51:55 -05:00
Rich Felker
c0ed5a201b suppress EINTR in sem_wait and sem_timedwait
per POSIX, the EINTR condition is an optional error for these
functions, not a mandatory one. since old kernels (pre-2.6.22) failed
to honor SA_RESTART for the futex syscall, it's dangerous to trust
EINTR from the kernel. thankfully POSIX offers an easy way out.
2015-01-15 07:21:02 -05:00
Rich Felker
472e8b71f7 for multithreaded set*id/setrlimit, handle case where callback does not run
in the current version of __synccall, the callback is always run, so
failure to handle this case did not matter. however, the upcoming
overhaul of __synccall will have failure cases, in which case the
callback does not run and errno is already set. the changes being
committed now are in preparation for that.
2015-01-15 07:09:14 -05:00
Rich Felker
996d148bf1 release 1.1.6
2015-01-13 23:35:08 -05:00
Rich Felker
3f65494a4c increase syslog message limit from 256 to 1024
this addresses alpine linux issue #3692 and brings the syslog message
length limit in alignment with uclibc's implementation.
2015-01-13 12:04:38 -05:00
Rich Felker
84b5c5479e remove rlimit hacks from multi-threaded set*id() code
the code being removed was introduced to work around "partial failure"
of multi-threaded set*id() operations, where some threads would
succeed in changing their ids but an RLIMIT_NPROC setting would
prevent the rest from succeeding, leaving the process in an
inconsistent and dangerous state. however, the workaround code did not
handle important usage cases like swapping real and effective uids
then restoring their original values, and the wrongful kernel
enforcement of RLIMIT_NPROC at setuid time was removed in Linux 3.1,
making the workaround obsolete.

since the partial failure still is dangerous on old kernels, and could
in principle happen on post-fix kernels as well if set*id() syscalls
fail for another spurious reason such as resource-related failures,
new code is added to detect and forcibly kill the process if/when such
a situation arises. future documentation releases should be updated to
reflect that setting RLIMIT_NPROC to RLIM_INFINITY is necessary to
avoid this forced-kill on old kernels. ideally, at some point the
kernel will get proper multi-threaded set*id() syscalls capable of
performing their actions atomically, and all of the userspace code to
emulate them can be treated as a fallback for outdated kernels.
2015-01-12 18:16:32 -05:00
Rich Felker
9772eadba8 simplify ctermid
opening /dev/tty then using ttyname_r on it does not produce a
canonical terminal name; it simply yields "/dev/tty".

it would be possible to make ctermid determine the actual controlling
terminal device via field 7 of /proc/self/stat, but doing so would
introduce a buffer overflow into applications built with L_ctermid==9,
which glibc defines, adversely affecting the quality of ABI compat.
2015-01-12 00:59:49 -05:00
Rich Felker
699d4532f6 fix regression in getopt_long support for non-option arguments
commit b72cd07f17 added support for this feature in getopt, but it was
later broken in the case where getopt_long is used, as a side effect
of the changes made in commit 91184c4f16, which prevented the
underlying getopt call from seeing the leading '-' or '+' character in
optstring.

this commit changes the logic in the getopt_long core to check for a
leading colon, possibly after the leading '-' or '+', without
depending on the latter having been skipped by the caller. a minor
incorrectness in the return value for one error condition in
getopt_long is also fixed when opterr has been set to zero but
optstring has no leading ':'.
2015-01-11 16:32:47 -05:00
Rich Felker
c574321d75 check for connect failure in syslog log opening
based on patch by Dima Krasner, with minor improvements for code size.
connect can fail if there is no listening syslogd, in which case a
useless socket was kept open, preventing subsequent syslog calls from
attempting to connect again.
2015-01-09 00:09:54 -05:00
Szabolcs Nagy
11ac2a6e81 add new prctl command PR_SET_MM_MAP to sys/prctl.h
PR_SET_MM_MAP was introduced as a subcommand for PR_SET_MM in
linux v3.18 commit f606b77f1a9e362451aca8f81d8f36a3a112139e

the associated struct type is replicated in sys/prctl.h using
libc types.

example usage:

 struct prctl_mm_map *p;
 ...
 prctl(PR_SET_MM, PR_SET_MM_MAP, p, sizeof *p);

the kernel side supported struct size may be queried with
the PR_SET_MM_MAP_SIZE subcommand.
2014-12-23 01:46:22 -05:00
Szabolcs Nagy
f90fafea3c add new syscall numbers for bpf and kexec_file_load
these syscalls are new in linux v3.18, bpf is present on all
supported archs except sh, kexec_file_load is only allocated for
x86_64 and x32 so far.

bpf was added in linux commit 99c55f7d47c0dc6fc64729f37bf435abf43f4c60

kexec_file_load syscall number was allocated in commit
f0895685c7fd8c938c91a9d8a6f7c11f22df58d2
2014-12-23 01:44:19 -05:00
Rich Felker
91f15e2d0d move wint_t definition to the shared part of alltypes.h.in
2014-12-21 02:43:35 -05:00
Rich Felker
dac4fc49ae fix signedness of UINT32_MAX and UINT64_MAX at the preprocessor level
per the rules for hexadecimal integer constants, the previous
definitions were correctly treated as having unsigned type except
possibly when used in preprocessor conditionals, where all arithmetic
takes place as intmax_t or uintmax_t. the explicit 'u' suffix ensures
that they are treated as unsigned in all contexts.
2014-12-21 02:30:29 -05:00
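The preprocessor-level difference described in dac4fc49ae can be demonstrated with a signedness probe. This is a sketch under assumptions: the two macros below only mimic the shape of the definitions before and after the change (they are not copied from the headers), `LOOKS_SIGNED` is a hypothetical helper, and the results for ordinary C expressions assume a 32-bit `int`.

```c
/* Assumed shapes of the old and new definitions, for illustration. */
#define OLD_UINT32_MAX 0xffffffff
#define NEW_UINT32_MAX 0xffffffffu

/* x - x - 1 is -1 for a signed type and a huge value for an unsigned one. */
#define LOOKS_SIGNED(x) ((x) - (x) - 1 < 0)

/* In #if, arithmetic is done in intmax_t or uintmax_t. Without a suffix,
 * 0xffffffff fits in intmax_t and is therefore treated as signed. */
#if LOOKS_SIGNED(OLD_UINT32_MAX)
#define OLD_SIGNED_IN_CPP 1
#else
#define OLD_SIGNED_IN_CPP 0
#endif

#if LOOKS_SIGNED(NEW_UINT32_MAX)
#define NEW_SIGNED_IN_CPP 1
#else
#define NEW_SIGNED_IN_CPP 0
#endif

/* In ordinary expressions both forms are unsigned (32-bit int assumed). */
static int old_signed_in_c(void) { return LOOKS_SIGNED(OLD_UINT32_MAX); }
static int new_signed_in_c(void) { return LOOKS_SIGNED(NEW_UINT32_MAX); }
```

Only the unsuffixed form changes signedness between the two contexts, which is the inconsistency the explicit 'u' suffix removes.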
Rich Felker
814aae2009 overhaul forkpty function using new login_tty
based on discussion with and patches by Felix Janda. these changes
started as an effort to factor forkpty in terms of login_tty, which
returns an error and skips fd reassignment and closing if setting the
controlling terminal failed. the previous forkpty code was unable to
handle errors in the child, and did not attempt to; it just silently
ignored them. but this would have been unacceptable when switching to
using login_tty, since the child would start with the wrong stdin,
stdout, and stderr and thereby clobber the parent's files.

the new code uses the same technique as the posix_spawn implementation
to convey any possible error in the child to the parent so that the
parent can report failure to the caller. it is also safe against
thread cancellation and against signal delivery in the child prior to
the determination of success.
2014-12-21 02:10:51 -05:00
Rich Felker
1227e418ea block pthread cancellation in openpty function
being a nonstandard function, this isn't strictly necessary, but it's
inexpensive and avoids unpleasant surprises. eventually I would like
all functions in libc to be safe against cancellation, either ignoring
it or acting on it cleanly.
2014-12-20 23:38:25 -05:00
Rich Felker
3b26a32df4 don't write openpty results until success is determined
not only is this semantically more correct; it also reduces code size
slightly by eliminating the need for the compiler to assume the
possibility of aliasing.
2014-12-20 23:22:57 -05:00
Felix Janda
4b2cb37770 add login_tty function
2014-12-20 20:13:27 -05:00
Rich Felker
0217ed72f9 set optopt in getopt_long
this is undocumented but possibly expected behavior of GNU
getopt_long, and useful when error message printing has been
suppressed.
2014-12-20 19:49:19 -05:00
Rich Felker
91184c4f16 add error message printing to getopt_long and make related improvements
some related changes are also made to getopt, and the return value of
getopt_long in the case of missing arguments is fixed.
2014-12-20 19:44:37 -05:00