History log of /seL4-camkes-master/projects/musllibc/src/locale/langinfo.c
Revision Date Author Comments
# a946e811 10-Nov-2015 Rich Felker <dalias@aerifal.cx>

fix return value of nl_langinfo for invalid item arguments

it was wrongly returning a null pointer instead of an empty string.
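
as an illustration of why the distinction matters to callers, here is a minimal sketch of a defensive wrapper. the helper name langinfo_is_empty is hypothetical and not part of musl. it tolerates both the fixed behavior (empty string) and the old one (null pointer):

```c
#include <langinfo.h>
#include <stddef.h>

/* Hypothetical helper, not part of musl: reports whether the C library
 * has no string for `item`. It accepts both an empty string (the fixed
 * behavior) and a null pointer (the old, incorrect behavior). */
static int langinfo_is_empty(nl_item item)
{
    const char *s = nl_langinfo(item);
    return s == NULL || *s == '\0';
}
```

callers that pass the result straight to strlen or strcmp without such a check would have dereferenced a null pointer before this fix.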


# 2d51c4ad 01-Oct-2015 Rich Felker <dalias@aerifal.cx>

make nl_langinfo(CODESET) always return "ASCII" in byte-based C locale

commit 844212d94f582c4e3c5055e0a1524931e89ebe76, which did not make it
into any releases, changed nl_langinfo(CODESET) to always return
"UTF-8", even in the byte-based C locale. this was problematic because
application software was found to use the string match for "UTF-8" to
activate its own UTF-8 processing. this both undermines the byte-based
functionality of the C locale, and if mixed with calls to the
standard multibyte functions, which happened in practice, could result
in severe mis-handling of input.

the motive for the previous change was that, to avoid widespread
compatibility problems, the string returned by nl_langinfo(CODESET)
needs to be accepted by iconv and by third-party character conversion
code. thus, the only remaining choice is "ASCII". this choice
accurately represents the intent that high bytes do not have
individual meaning in the C locale, but it does mean that iconv, when
passed nl_langinfo(CODESET) in the C locale, will produce errors in
cases where mbrtowc would have succeeded. for reference, glibc behaves
similarly in this regard, so I don't think it will be a problem.
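
the application-side pattern described above can be sketched roughly as follows. the function names are hypothetical and stand in for whatever check a given program performs. this is not code from musl or from any specific application:

```c
#include <langinfo.h>
#include <string.h>

/* Hypothetical application-side check: treat the locale as UTF-8 only
 * when the codeset name matches "UTF-8" exactly. */
static int codeset_is_utf8(const char *codeset)
{
    return strcmp(codeset, "UTF-8") == 0;
}

/* Decide whether the program should enable its own UTF-8 processing,
 * based on the current locale's codeset name. */
static int locale_uses_utf8(void)
{
    return codeset_is_utf8(nl_langinfo(CODESET));
}
```

with "ASCII" returned in the byte-based C locale, a check of this kind stays off there, so the program's own UTF-8 handling no longer disagrees with the standard multibyte functions.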


# 58f6259d 09-Sep-2015 Rich Felker <dalias@aerifal.cx>

fix breakage in nl_langinfo from previous commit


# 844212d9 08-Sep-2015 Rich Felker <dalias@aerifal.cx>

make nl_langinfo(CODESET) always return "UTF-8"

this restores the original behavior prior to the addition of the
byte-based C locale and fixes what is effectively a regression in
musl's property of always providing working UTF-8 support.

commit 1507ebf837334e9e07cfab1ca1c2e88449069a80 introduced the codeset
name "UTF-8-CODE-UNITS" for the byte-based C locale to represent that
the semantic content is UTF-8 but that it is being processed as code
units (bytes) rather than whole multibyte characters. however, many
programs assume that the codeset name is usable with iconv and/or
comes from a set of standard/widely-used names known to the
application. such programs are likely to produce warnings or errors,
run with reduced functionality, or mangle character data when run
explicitly in the C locale.

the standard places basically no requirements on the string returned
by nl_langinfo(CODESET) and how it interacts with other interfaces, so
returning "UTF-8" is permissible. moreover, it seems like the right
thing to do, since the identity of the character encoding as "UTF-8"
is independent of whether it is being processed as bytes or characters
by the standard library functions.


# 1507ebf8 15-Jun-2015 Rich Felker <dalias@aerifal.cx>

byte-based C locale, phase 1: multibyte character handling functions

this patch makes the functions which work directly on multibyte
characters treat the high bytes as individual abstract code units
rather than as multibyte sequences when MB_CUR_MAX is 1. since
MB_CUR_MAX is presently defined as a constant 4, all of the new code
added is dead code, and optimizing compilers' code generation should
not be affected at all. a future commit will activate the new code.

as abstract code units, bytes 0x80 to 0xff are represented by wchar_t
values 0xdf80 to 0xdfff, at the end of the surrogates range. this
ensures that they will never be misinterpreted as Unicode characters,
and that all wctype functions return false for these "characters"
without needing locale-specific logic. a high range outside of Unicode
such as 0x7fffff80 to 0x7fffffff was also considered, but since C11's
char16_t also needs to be able to represent conversions of these
bytes, the surrogate range was the natural choice.
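
the mapping described above can be sketched as follows. the helper names are hypothetical and chosen for illustration. the actual macros or functions in musl may be named and written differently:

```c
#include <wchar.h>

/* Hypothetical helpers illustrating the byte <-> code unit mapping.
 * Only bytes 0x80..0xff are mapped this way; ASCII bytes keep their
 * own values. */

/* Map a high byte (0x80..0xff) to its abstract code unit in
 * 0xdf80..0xdfff, at the end of the surrogates range. */
static wchar_t byte_to_codeunit(unsigned char b)
{
    return (wchar_t)(0xdf00 | b);
}

/* Test whether a wide character is one of the abstract code units. */
static int is_codeunit(wchar_t wc)
{
    return wc >= 0xdf80 && wc <= 0xdfff;
}

/* Recover the original byte from an abstract code unit. */
static unsigned char codeunit_to_byte(wchar_t wc)
{
    return (unsigned char)(wc & 0xff);
}
```

since 0xdf80 to 0xdfff fits in 16 bits, the same values are representable in char16_t, which is the reason given above for preferring this range.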


# c5b8f193 26-Jul-2014 Rich Felker <dalias@aerifal.cx>

add support for LC_TIME and LC_MESSAGES translations

for LC_MESSAGES, translation of strerror and similar literal message
functions is supported. for messages in other places (particularly the
dynamic linker) that use format strings, translation is not yet
supported. in order to make it possible and safe, such messages will
need to be refactored to separate the textual content from the format.

for LC_TIME, the day and month names and strftime-style format strings
provided by nl_langinfo are supported for translation. however there
may be limitations, as some of the original C-locale nl_langinfo
strings are non-unique and thus perhaps unsuitable as keys.

overall, the locale support activated by this commit should not be
seen as complete and polished but as a basis for beginning to test
locale functionality and implement locales.


# 0206f596 26-Jul-2014 Rich Felker <dalias@aerifal.cx>

add missing yes/no strings to nl_langinfo

these were removed from the standard but are still offered as an extension
in langinfo.h, so nl_langinfo should support them.


# a19cd2b6 26-Jul-2014 Rich Felker <dalias@aerifal.cx>

fix nl_langinfo table for LC_TIME era-related items

due to a skipped slot and missing null terminator, the last few
strings were off by one or two slots from their item codes.
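
to show how this class of bug arises, here is a minimal sketch that assumes a table stored as consecutive null-terminated strings and indexed by walking past earlier entries. the table contents and the name table_lookup are hypothetical, and this is not the actual musl table:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table: each entry ends with "\0". An intentionally empty
 * entry still needs its own terminator so that later entries keep their
 * slot numbers. */
static const char demo_table[] =
    "first\0"
    "second\0"
    "\0"        /* slot 2: empty, but it still occupies a slot */
    "fourth";   /* slot 3: terminated by the literal's implicit null */

/* Return entry `idx` by skipping over the preceding entries, or an
 * empty string if `idx` is past the end of the table. */
static const char *table_lookup(const char *tab, size_t tablen, int idx)
{
    const char *end = tab + tablen;
    for (; idx > 0 && tab < end; idx--)
        tab += strlen(tab) + 1;
    return tab < end ? tab : "";
}
```

in a layout like this, dropping the terminator for the empty slot would make "fourth" appear at slot 2 instead of slot 3, which is the kind of shift the commit describes.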


# 4c48501e 02-Jul-2014 Rich Felker <dalias@aerifal.cx>

properly pass current locale to *_l functions when used internally

this change presently has no functional effect since the callees do not
yet use their locale argument for anything.


# 1ae4bc42 28-Jul-2013 Rich Felker <dalias@aerifal.cx>

fix semantically incorrect use of LC_GLOBAL_LOCALE

LC_GLOBAL_LOCALE refers to the global locale, controlled by setlocale,
not the thread-local locale in effect which these functions should be
using. neither LC_GLOBAL_LOCALE nor 0 as an argument to the *_l
functions has behavior defined by the standard, but 0 is a more
logical choice for requesting the callee to look up the current locale.
in the future I may move the current locale lookup to the caller (the
non-_l-suffixed wrapper).

at this point, all of the locale logic is dummied out, so no harm was
done, but this change should at least avoid misleading usage.
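
the difference between the global locale and the locale in effect for the calling thread can be observed from application code with the POSIX uselocale interface. the sketch below is an illustration only, with a hypothetical helper name. it does not show how musl looks up the current locale internally:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <locale.h>

/* Hypothetical helper: returns nonzero if the calling thread is using
 * the global locale (the one controlled by setlocale) rather than a
 * thread-local locale installed with uselocale. Calling uselocale with
 * a zero argument queries the current setting without changing it. */
static int thread_uses_global_locale(void)
{
    return uselocale((locale_t)0) == LC_GLOBAL_LOCALE;
}
```

a thread that has never called uselocale reports the global locale here. after it installs a locale object created with newlocale, the helper returns zero for that thread even though setlocale still controls the global locale.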


# 87be54a1 24-Jul-2013 Rich Felker <dalias@aerifal.cx>

rework langinfo code for ABI compat and for use by time code


# 5600088d 03-Apr-2011 Rich Felker <dalias@aerifal.cx>

fix nl_langinfo to actually use the existing, correct internal version


# 0b44a031 11-Feb-2011 Rich Felker <dalias@aerifal.cx>

initial check-in, version 0.5.0