History log of /seL4-camkes-master/projects/musllibc/src/multibyte/internal.h
Revision Date Author Comments
# 6cec7bc5 21-Jun-2016 Rich Felker <dalias@aerifal.cx>

remove comments on copyright status from UTF-8 implementation files

despite clarifications made to the COPYRIGHT file in commit
f0a61399330bae42beeb27d6ecd05570b3382a60, there continues to be
confusion about whether the permissions granted actually apply to all
files. I am the sole author of these files and clearly intend, and
have always intended, for the grant of permission to apply to them.


# fe7582f4 24-Jul-2015 Rich Felker <dalias@aerifal.cx>

fix undefined left-shift of negative values in utf-8 state table
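
As an illustration of the class of bug named in this title (a sketch only, not the actual state-table macro): left-shifting a negative signed value is undefined behavior in C, while negating in unsigned arithmetic wraps modulo 2^N and can then be shifted safely. The function name `shifted_neg` is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: computes the bit pattern of (-a) << shift
   without ever shifting a negative signed value. */
static uint32_t shifted_neg(uint32_t a, unsigned shift)
{
    /* 0u - a wraps around in unsigned arithmetic, which is well
       defined; writing (-(int)a) << shift would be undefined
       whenever a is nonzero. */
    uint32_t neg = (uint32_t)(0u - a);
    return (uint32_t)(neg << shift);
}
```

The result is truncated to 32 bits, so the high bits shifted out are simply discarded.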


# 1507ebf8 15-Jun-2015 Rich Felker <dalias@aerifal.cx>

byte-based C locale, phase 1: multibyte character handling functions

this patch makes the functions which work directly on multibyte
characters treat the high bytes as individual abstract code units
rather than as multibyte sequences when MB_CUR_MAX is 1. since
MB_CUR_MAX is presently defined as a constant 4, all of the new code
added is dead code, and optimizing compilers' code generation should
not be affected at all. a future commit will activate the new code.

as abstract code units, bytes 0x80 to 0xff are represented by wchar_t
values 0xdf80 to 0xdfff, at the end of the surrogates range. this
ensures that they will never be misinterpreted as Unicode characters,
and that all wctype functions return false for these "characters"
without needing locale-specific logic. a high range outside of Unicode
such as 0x7fffff80 to 0x7fffffff was also considered, but since C11's
char16_t also needs to be able to represent conversions of these
bytes, the surrogate range was the natural choice.
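
The mapping described above can be sketched as follows. This is a minimal illustration, assuming only the ranges given in the message; the helper names `byte_to_codeunit`, `is_codeunit` and `codeunit_to_byte` are hypothetical and are not the identifiers used in the patch.

```c
#include <wchar.h>

/* Hypothetical helper: map a byte to its abstract code unit.
   ASCII bytes map to themselves; 0x80..0xff map to 0xdf80..0xdfff. */
static wchar_t byte_to_codeunit(unsigned char b)
{
    return b < 0x80 ? (wchar_t)b : (wchar_t)(0xdf00 + b);
}

/* Hypothetical helper: true only for values in 0xdf80..0xdfff.
   The unsigned subtraction wraps for smaller values, so a single
   comparison covers both ends of the range. */
static int is_codeunit(wchar_t wc)
{
    return (unsigned long)wc - 0xdf80ul < 0x80ul;
}

/* Hypothetical helper: recover the original high byte. */
static unsigned char codeunit_to_byte(wchar_t wc)
{
    return (unsigned char)(wc & 0xff);
}
```

Because every mapped value lies inside the surrogate range, it fits in 16 bits, which is the property the message cites for char16_t.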


# 9d836f44 22-Apr-2015 Rich Felker <dalias@aerifal.cx>

remove libc.h dependency from otherwise-independent multibyte code


# 57174444 11-Dec-2013 Szabolcs Nagy <nsz@port70.net>

include cleanups: remove unused headers and add feature test macros


# 8f06ab0e 08-Apr-2013 Rich Felker <dalias@aerifal.cx>

fix out-of-bounds access in UTF-8 decoding

SA and SB are used as the lowest and highest valid starter bytes, but
the value of SB was one past the last valid starter. this caused
access past the end of the state table when the illegal byte '\xf5'
was encountered in a starter position. the error did not show up in
full-character decoding tests, since the bogus state read from just
past the table was unlikely to admit any continuation bytes as valid,
but would have shown up had we tested feeding '\xf5' to the
byte-at-a-time decoding in mbrtowc: it would cause the function to
wrongly return -2 rather than -1.
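
A minimal sketch of the bounds check at issue, assuming SA and SB are the inclusive bounds 0xc2 and 0xf4 that follow from the UTF-8 definition. The table `seq_len` and the function `starter_len` are hypothetical stand-ins that store only sequence lengths; they are not the real state table.

```c
#define SA 0xc2u  /* lowest valid starter byte */
#define SB 0xf4u  /* highest valid starter byte (inclusive) */

/* Hypothetical stand-in for the state table: one entry per starter
   byte in SA..SB, holding the total sequence length. */
static const unsigned char seq_len[SB - SA + 1] = {
    /* 0xc2..0xdf: 30 two-byte starters */
    2,2,2,2,2,2,2,2,2,2,
    2,2,2,2,2,2,2,2,2,2,
    2,2,2,2,2,2,2,2,2,2,
    /* 0xe0..0xef: 16 three-byte starters */
    3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,
    /* 0xf0..0xf4: 5 four-byte starters */
    4,4,4,4,4
};

/* Returns the sequence length for a starter byte, or -1 if the byte
   cannot start a multibyte sequence. */
static int starter_len(unsigned char c)
{
    /* One unsigned comparison rejects both c < SA (the subtraction
       wraps to a huge value) and c > SB. If SB were one past the
       last starter, 0xf5 would pass this test and index past the
       end of the table. */
    if (c - SA > SB - SA) return -1;
    return seq_len[c - SA];
}
```

With SB inclusive, the table size `SB - SA + 1` and the range check stay in agreement, so 0xf5 is rejected before any table access.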

I may eventually go back and remove all references to SA and SB,
replacing them with the values; this would make the code more
transparent, I think. the original motivation for using macros was to
allow misguided users of the code to redefine them for the purpose of
enlarging the set of accepted sequences past the end of Unicode...


# bae2e52b 23-Feb-2012 Rich Felker <dalias@aerifal.cx>

cleanup and work around visibility bug in gcc 3 that affects x86_64

in gcc 3, the visibility attribute must be placed on both the
declaration and on the definition. if it's omitted from the
definition, the compiler fails to emit the ".hidden" directive in the
assembly, and the linker will either generate textrels (if supported,
such as on i386) or refuse to link (on targets where certain types of
textrels are forbidden or impossible without further assumptions about
memory layout, such as on x86_64).

this patch also unifies the decision about when to use visibility into
libc.h and makes the visibility in the utf-8 state machine tables
based on libc.h rather than a duplicate test.
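
The workaround described above can be sketched like this. The macro name `HIDDEN` and the table `demo_table` are hypothetical; the sketch shows only the pattern of repeating the attribute on the definition, not the actual contents of libc.h.

```c
/* Hypothetical macro: expands to the visibility attribute on
   GCC-compatible compilers and to nothing elsewhere. */
#if defined(__GNUC__)
#define HIDDEN __attribute__((__visibility__("hidden")))
#else
#define HIDDEN
#endif

/* Declaration, as it would appear in a shared internal header. */
extern HIDDEN const unsigned demo_table[4];

/* Definition: the attribute is repeated here so that older
   compilers still emit the ".hidden" directive for the symbol. */
HIDDEN const unsigned demo_table[4] = { 1, 2, 3, 4 };
```

Keeping the macro in one header means the decision about whether visibility is supported is made in a single place.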


# 015d33c5 26-Feb-2011 Rich Felker <dalias@aerifal.cx>

cleanup utf-8 multibyte code, use visibility if possible

this code was written independently of musl, with support for the
backwards, nonstandard "31-bit unicode" some libraries/apps might
want. unfortunately the extra code (inside #ifdef) makes the source
harder to read and makes code that should be simple look complex, so
i'm removing it. anyone who wants to use the old code can find it in
the history or from elsewhere.

also, change the visibility of the __fsmu8 state machine table to
hidden, if supported. this should improve performance slightly in
shared-library builds.


# f9d880d2 13-Feb-2011 Rich Felker <dalias@aerifal.cx>

cleanup multibyte stuff to remove ugly casts, sanitize the ptr align casts


# 0b44a031 11-Feb-2011 Rich Felker <dalias@aerifal.cx>

initial check-in, version 0.5.0