invalid format strings invoke undefined behavior, so this is not a
conformance issue, but it's nicer for scanf to report the error safely
instead of calling free on a potentially-uninitialized pointer or a
pointer to memory belonging to the caller.
rather than allocating a PATH_MAX-sized buffer when the caller does
not provide an output buffer, work first with a PATH_MAX-sized temp
buffer with automatic storage, and either copy it to the caller's
buffer or strdup it on success. this not only avoids massive memory
waste, but also avoids pulling in free (and thus the full malloc
implementation) unnecessarily in static programs.
this avoids failure if the file is not readable and avoids odd
behavior for device nodes, etc. on old kernels that lack O_PATH, the
old behavior (O_RDONLY) will naturally happen as the fallback.
commit 07827d1a82 seems to have
introduced this issue. sigqueue is called from the synccall core, at
which time, even implementation-internal signals are blocked. however,
pthread_sigmask removes the implementation-internal signals from the
old mask before returning, so that a process which began life with
them blocked will not be able to save a signal mask that has them
blocked, possibly causing them to become re-blocked later. however,
this was causing sigqueue to unblock the implementation-internal
signals during synccall, leading to deadlock.
the BSD and GNU versions of this structure differ, so exposing it in
the default _BSD_SOURCE profile is possibly problematic. both versions
could be simultaneously supported with anonymous unions if needed in
the future, but for now, just omitting it except under _GNU_SOURCE
should be safe.
I originally added this warning option based on a misunderstanding of
how it works. it does not warn whenever the destination of the cast
has stricter alignment; it only warns in cases where misaligned
dereference could lead to a fault. thus, it's essentially a no-op for
i386, which had me wrongly believing the code was clean for this
warning level. on other archs, numerous diagnostic messages are
produced, and all of them are false-positives, so it's better just not
to use it.
unlike the old C memcpy, this version handles word-at-a-time reads and
writes even for misaligned copies. it does not require that the cpu
support misaligned accesses; instead, it performs bit shifts to
realign the bytes for the destination.
essentially, this is the C version of the ARM assembly language
memcpy. the ideas are all the same, and it should perform well on any
arch with a decent number of general-purpose registers that has a
barrel shift operation. since the barrel shifter is an optional cpu
feature on microblaze, it may be desirable to provide an alternate asm
implementation on microblaze, but otherwise the C code provides a
competitive implementation for "generic risc-y" cpu archs that should
alleviate the urgent need for arch-specific memcpy asm.
while the incorporation of this requirement from C99 into C++11 was
likely an accident, some software expects it to be defined, and it
doesn't hurt. if the requirement is removed, then presumably
__bool_true_false_are_defined would just be in the implementation
namespace and thus defining it would still be legal.
this version of memset is optimized both for small and large values of
n, and makes no misaligned writes, so it is usable (and near-optimal)
on all archs. it is capable of filling up to 52 or 56 bytes without
entering a loop and with at most 7 branches, all of which can be fully
predicted if memset is called multiple times with the same size.
it also uses the attribute extension to inform the compiler that it is
violating the aliasing rules, unlike the previous code which simply
assumed it was safe to violate the aliasing rules since translation
unit boundaries hide the violations from the compiler. for non-GNUC
compilers, 100% portable fallback code in the form of a naive loop is
provided. I intend to eventually apply this approach to all of the
string/memory functions which are doing word-at-a-time accesses.
this will be needed for upcoming commits to the string/mem functions
to correct their unannounced use of aliasing violations for
word-at-a-time search, fill, and copy operations.
this is a nonstandard extension but will be required in the next
version of POSIX, and it's widely used/useful in shell scripts
utilizing the date utility.
this may need further revision in the future, since POSIX is rather
unclear on the requirements, and is designed around the assumption of
POSIX TZ specifiers which are not sufficiently powerful to represent
real-world timezones (this is why zoneinfo support was added).
the basic issue is that strftime gets the string and numeric offset
for the timezone from the extra fields in struct tm, which are
initialized when calling localtime/gmtime/etc. however, a conforming
application might have created its own struct tm without initializing
these fields, in which case using __tm_zone (a pointer) could crash.
other zoneinfo-based implementations simply check for a null pointer,
but otherwise can still crash of the field contains junk.
simply ignoring __tm_zone and using tzname[] would "work" but would
give incorrect results in time zones with more complex rules. I feel
like this would lower the quality of implementation.
instead, simply validate __tm_zone: unless it points to one of the
zone name strings managed by the timezone system, assume it's invalid.
this commit also fixes several other minor bugs with formatting:
tm_isdst being negative is required to suppress printing of the zone
formats, and %z was using the wrong format specifiers since the type
of val was changed, resulting in bogus output.
the empty TZ string was matching equal to the initial value of the
cached TZ name, thus causing do_tzset never to run and never to
initialize the time zone data.
1. an occurrence of ${ORIGIN} before $ORIGIN would be ignored due to
the strstr logic. (note that rpath contains multiple :-delimited paths
to be searched.)
2. data read by readlink was not null-terminated.
fallback to argv[0] as before. unlike argv[0], AT_EXECFN was a valid
(but possibly relative) pathname for the new program image at the time
the execve syscall was made.
as a special case, ignore AT_EXECFN if it begins with "/proc/", in
order not to give bogus (and possibly harmful) results when fexecve
was used.
previously, rpath was only honored for direct dependencies. in other
words, if A depends on B and B depends on C, only B's rpath (if any),
not A's rpath, was being searched for C. this limitation made
rpath-based deployment difficult in the presence of multiple levels of
library dependency.
at present, $ORIGIN processing in rpath is still unsupported.
at present, since POSIX requires %F to behave as %+4Y-%m-%d and ISO C
requires %F to behave as %Y-%m-%d, the default behavior for %Y has
been changed to match %+4Y. this seems to be the only way to conform
to the requirements of both standards, and it does not affect years
prior to the year 10000. depending on the outcome of interpretations
from the standards bodies, this may be adjusted at some point.
use a long long value so that even with offsets, values cannot
overflow. instead of using different format strings for different
numeric formats, simply use a per-format width and %0*lld for all of
them.
this width specifier is not for use with strftime field widths; that
will be a separate step in the caller.
make __strftime_fmt_1 return a string (possibly in the caller-provided
temp buffer) rather than writing into the output buffer. this approach
makes more sense when padding to a minimum field width might be
required, and it's also closer to what wcsftime wants.
one place where semicolon (non-portable) was still used in place of
separate -e options (copied over from an old version of this code),
and use of a literal slash in the bracket expression for the final
command, despite slash being used as the delimiter for the s command.
fesetround.c is a wrapper to do the arch independent argument
check (on archs where rounding mode is not stored in 2 bits
__fesetround still has to check its arguments)
on powerpc fe*except functions do not accept the extra invalid
flags of its fpscr register
the useless FENV_ACCESS pragma was removed from feupdateenv
the x87 exception summary (ES) and stack fault (SF) flags may be
spuriously cleared by feclearexcept using the fnclex instruction,
but these flags are not observable through libc hence maintaining
their state is not critical.
the sse and x87 rounding modes should be always the same,
the visible exception flags are the bitwise or of the two
fenv states (so it's enough to query the rounding mode or
raise exceptions on one fenv)
the historical (non-standardized) install command is really
inappropriate for installing binaries/libraries on a system that
utilizes memory-mapped executable files. rather than replacing an
existing file atomically, it overwrites the existing file. this can
cause running programs to see a partially-modified version of the
file, resulting in unpredictable behavior, or SIGBUS. a MAP_COPY mode
for mmap would get around this problem, but Linux lacks MAP_COPY.
the shell script added with this commit works around the problem by
writing temporary files and moving them into place. unlike the
historical install utility, it also support a -l option for installing
a symbolic link atomically, via the same method.
with these changes, the character set implemented as "big5" in musl is
a pure superset of cp950, the canonical "big5", and agrees with the
normative parts of Unicode. this means it has minor differences from
both hkscs and big5-2003:
- the range A2CC-A2CE maps to CJK ideographs rather than numerals,
contrary to changes made in big5-2003.
- C6CD maps to a CJK ideograph rather than its corresponding Kangxi
radical character, contrary to changes made in hkscs.
- F9FE maps to U+2593 rather than U+FFED.
of these differences, none but the last are visually distinct, and the
last is a character used purely for text-based graphics, not to convey
linguistic content.
should there be future demand for strict conformance to big5-2003 or
hkscs mappings, the present charset aliases can be replaced with
distinct variants.
reportedly there are other non-standard big5 extensions in common use
in Taiwan and perhaps elsewhere, which could also be added as layers
on top of the existing big5 support.
there may be additional characters which should be added to the hkscs
table: the whatwg standard for big5 defines what appears to be a
superset of hkscs.