mirror of git://git.musl-libc.org/musl (synced 2025-01-05 14:09:56 +00:00)
make nl_langinfo(CODESET) always return "ASCII" in byte-based C locale
commit 844212d94f, which did not make it
into any releases, changed nl_langinfo(CODESET) to always return
"UTF-8", even in the byte-based C locale. this was problematic because
application software was found to use the string match for "UTF-8" to
activate its own UTF-8 processing. this both undermines the byte-based
functionality of the C locale, and if mixed with calls to the
standard multibyte functions, which happened in practice, could result
in severe mis-handling of input.
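
for illustration, the application-side pattern described above might look roughly like the following. this is only a sketch: the helper name codeset_is_utf8 is hypothetical and not taken from any particular application.

```c
#include <langinfo.h>
#include <locale.h>
#include <string.h>

/* hypothetical helper: an application deciding whether to enable its
 * own UTF-8 processing by string-matching the reported codeset */
int codeset_is_utf8(void)
{
	return strcmp(nl_langinfo(CODESET), "UTF-8") == 0;
}
```

with the previous behavior, a check like this would succeed even in the byte-based C locale, so the application's UTF-8 code paths and the standard multibyte functions could disagree about how input bytes are interpreted.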
the motive for the previous change was that, to avoid widespread
compatibility problems, the string returned by nl_langinfo(CODESET)
needs to be accepted by iconv and by third-party character conversion
code. thus, the only remaining choice is "ASCII". this choice
accurately represents the intent that high bytes do not have
individual meaning in the C locale, but it does mean that iconv, when
passed nl_langinfo(CODESET) in the C locale, will produce errors in
cases where mbrtowc would have succeeded. for reference, glibc behaves
similarly in this regard, so I don't think it will be a problem.
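
the iconv consequence mentioned above can be sketched as follows. the helper iconv_rejects_byte is hypothetical and exists only for this illustration; it assumes the iconv implementation accepts "ASCII" and "UTF-8" as charset names, and it reports whether a single input byte is rejected as an invalid sequence.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/* hypothetical helper: returns 1 if iconv reports EILSEQ for the single
 * input byte, 0 otherwise, and -1 if the conversion cannot be opened */
int iconv_rejects_byte(const char *from_codeset, unsigned char byte)
{
	iconv_t cd = iconv_open("UTF-8", from_codeset);
	if (cd == (iconv_t)-1) return -1;

	char in[1] = { (char)byte };
	char out[8];
	char *inp = in, *outp = out;
	size_t inleft = 1, outleft = sizeof out;

	/* capture the result before iconv_close can disturb errno */
	size_t r = iconv(cd, &inp, &inleft, &outp, &outleft);
	int rejected = (r == (size_t)-1 && errno == EILSEQ);

	iconv_close(cd);
	return rejected;
}
```

passing the string returned by nl_langinfo(CODESET) in the C locale as from_codeset, a high byte such as 0x80 is expected to be rejected, even though mbrtowc in musl's byte-based C locale would accept it.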
parent fd2add5ba0
commit 2d51c4ad57
@@ -33,7 +33,7 @@ char *__nl_langinfo_l(nl_item item, locale_t loc)
 	int idx = item & 65535;
 	const char *str;
 
-	if (item == CODESET) return "UTF-8";
+	if (item == CODESET) return MB_CUR_MAX==1 ? "ASCII" : "UTF-8";
 
 	switch (cat) {
 	case LC_NUMERIC: