History log of /seL4-test-master/projects/musllibc/src/thread/pthread_create.c
Revision Date Author Comments
# 31fb174d 07-Nov-2016 Rich Felker <dalias@aerifal.cx>

add limited pthread_setattr_default_np API to set stack size defaults

based on patch by Timo Teräs:

While generally this is a bad API, it is the only existing API to
affect c++ (std::thread) and c11 (thrd_create) thread stack size.
This patch allows applications only to increase stack and guard
page sizes.


# ea7891a6 08-Nov-2016 Rich Felker <dalias@aerifal.cx>

fix pthread_create regression from stack/guard size simplification

commit 33ce920857405d4f4b342c85b74588a15e2702e5 broke pthread_create
in the case where a null attribute pointer is passed; rather than
using the default sizes, sizes of 0 (plus the remainder of one page
after TLS/TCB use) were used.


# 33ce9208 07-Nov-2016 Rich Felker <dalias@aerifal.cx>

simplify pthread_attr_t stack/guard size representation

previously, the pthread_attr_t object was always initialized all-zero,
and stack/guard size were represented as differences versus their
defaults. this required lots of confusing offset arithmetic everywhere
they were used. instead, have pthread_attr_init fill in the default
values, and work with absolute sizes everywhere.


# 384d103d 27-Jun-2016 Rich Felker <dalias@aerifal.cx>

fix failure to obtain EOWNERDEAD status for process-shared robust mutexes

Linux's documentation (robust-futex-ABI.txt) claims that, when a
process dies with a futex on the robust list, bit 30 (0x40000000) is
set to indicate the status. however, what actually happens is that
bits 0-30 are replaced with the value 0x40000000, i.e. bits 0-29
(containing the old owner tid) are cleared at the same time bit 30 is
set.

our userspace-side code for robust mutexes was written based on that
documentation, assuming that kernel would never produce a futex value
of 0x40000000, since the low (owner) bits would always be non-zero.
commit d338b506e39b1e2c68366b12be90704c635602ce introduced this
assumption explicitly while fixing another bug in how non-recoverable
status for robust mutexes was tracked. presumably the tests conducted
at that time only checked non-process-shared robust mutexes, which are
handled in pthread_exit (which implemented the documented kernel
protocol, not the actual one) rather than by the kernel.

change pthread_exit robust list processing to match the kernel
behavior, clearing bits 0-29 while setting bit 30, and use the value
0x7fffffff instead of 0x40000000 to encode non-recoverable status. the
choice of value here is arbitrary; any value with at least one of bits
0-29 set should work just as well.
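
Illustration: a minimal sketch of the bit layout described above, using the
Linux futex constants; the helper name is hypothetical, not the actual musl
code.

    #include <stdint.h>

    #define FUTEX_WAITERS    0x80000000u  /* bit 31: waiters present */
    #define FUTEX_OWNER_DIED 0x40000000u  /* bit 30: previous owner died */
    #define FUTEX_TID_MASK   0x3fffffffu  /* bits 0-29: owner tid */

    /* Hypothetical helper mirroring the kernel behavior described above:
     * bits 0-29 (the owner tid) are cleared at the same time bit 30 is
     * set, while the waiters bit is preserved. */
    static uint32_t mark_owner_died(uint32_t futex_val)
    {
        return (futex_val & FUTEX_WAITERS) | FUTEX_OWNER_DIED;
    }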


# 75eceb3a 17-Jun-2015 Rich Felker <dalias@aerifal.cx>

ignore ENOSYS error from mprotect in pthread_create and dynamic linker

this error simply indicated a system without memory protection (NOMMU)
and should not cause failure in the caller.


# 1b0cdc87 16-Jun-2015 Rich Felker <dalias@aerifal.cx>

refactor stdio open file list handling, move it out of global libc struct

functions which open in-memory FILE stream variants all shared a tail
with __fdopen, adding the FILE structure to stdio's open file list.
replacing this common tail with a function call reduces code size and
duplication of logic. the list is also partially encapsulated now.

function signatures were chosen to facilitate tail call optimization
and reduce the need for additional accessor functions.

with these changes, static linked programs that do not use stdio no
longer have an open file list at all.


# 68630b55 15-May-2015 Rich Felker <dalias@aerifal.cx>

eliminate costly tricks to avoid TLS access for current locale state

the code being removed used atomics to track whether any threads might
be using a locale other than the current global locale, and whether
any threads might have abstract 8-bit (non-UTF-8) LC_CTYPE active, a
feature which was never committed (still pending). the motivations
were to support early execution prior to setup of the thread pointer,
to partially support systems (ancient kernels) where thread pointer
setup is not possible, and to avoid high performance cost on archs
where accessing the thread pointer may be very slow.

since commit 19a1fe670acb3ab9ead0fe31859ca7d4fe40dd54, the thread
pointer is always available, so these hacks are no longer needed.
removing them greatly simplifies the affected code.


# 484194db 06-May-2015 Rich Felker <dalias@aerifal.cx>

fix stack protector crashes on x32 & powerpc due to misplaced TLS canary

i386, x86_64, x32, and powerpc all use TLS for stack protector canary
values in the default stack protector ABI, but the location only
matched the ABI on i386 and x86_64. on x32, the expected location for
the canary contained the tid, thus producing spurious mismatches
(resulting in process termination) upon fork. on powerpc, the expected
location contained the stdio_locks list head, so returning from a
function after calling flockfile produced spurious mismatches. in both
cases, the random canary was not present, and a predictable value was
used instead, making the stack protector hardening much less effective
than it should be.

in the current fix, the thread structure has been expanded to have
canary fields at all three possible locations, and archs that use a
non-default location must define a macro in pthread_arch.h to choose
which location is used. for most archs (which lack TLS canary ABI) the
choice does not matter.


# 01d42747 18-Apr-2015 Rich Felker <dalias@aerifal.cx>

make dlerror state and message thread-local and dynamically-allocated

this fixes truncation of error messages containing long pathnames or
symbol names.

the dlerror state was previously required by POSIX to be global. the
resolution of bug 97 relaxed the requirements to allow thread-safe
implementations of dlerror with thread-local state and message buffer.


# 19a1fe67 13-Apr-2015 Rich Felker <dalias@aerifal.cx>

remove remnants of support for running in no-thread-pointer mode

since 1.1.0, musl has nominally required a thread pointer to be setup.
most of the remaining code that was checking for its availability was
doing so for the sake of being usable by the dynamic linker. as of
commit 71f099cb7db821c51d8f39dfac622c61e54d794c, this is no longer
necessary; the thread pointer is now valid before any libc code
(outside of dynamic linker bootstrap functions) runs.

this commit essentially concludes "phase 3" of the "transition path
for removing lazy init of thread pointer" project that began during
the 1.1.0 release cycle.


# a2d30533 10-Apr-2015 Rich Felker <dalias@aerifal.cx>

apply vmlock wait to __unmapself in pthread_exit


# f08ab9e6 10-Apr-2015 Rich Felker <dalias@aerifal.cx>

redesign and simplify vmlock system

this global lock allows certain unlock-type primitives to exclude
mmap/munmap operations which could change the identity of virtual
addresses while references to them still exist.

the original design mistakenly assumed mmap/munmap would conversely
need to exclude the same operations which exclude mmap/munmap, so the
vmlock was implemented as a sort of 'symmetric recursive rwlock'. this
turned out to be unnecessary.

commit 25d12fc0fc51f1fae0f85b4649a6463eb805aa8f already shortened the
interval during which mmap/munmap held their side of the lock, but
left the inappropriate lock design and some inefficiency.

the new design uses a separate function, __vm_wait, which does not
hold any lock itself and only waits for lock users which were already
present when it was called to release the lock. this is sufficient
because of the way operations that need to be excluded are sequenced:
the "unlock-type" operations using the vmlock need only block
mmap/munmap operations that are precipitated by (and thus sequenced
after) the atomic-unlock they perform while holding the vmlock.

this allows for a spectacular lack of synchronization in the __vm_wait
function itself.


# 4e98cce1 09-Apr-2015 Rich Felker <dalias@aerifal.cx>

optimize out setting up robust list with kernel when not needed

as a result of commit 12e1e324683a1d381b7f15dd36c99b37dd44d940, kernel
processing of the robust list is only needed for process-shared
mutexes. previously the first attempt to lock any owner-tracked mutex
resulted in robust list initialization and a set_robust_list syscall.
this is no longer necessary, and since the kernel's record of the
robust list must now be cleared at thread exit time for detached
threads, optimizing it out is more worthwhile than before too.


# 12e1e324 09-Apr-2015 Rich Felker <dalias@aerifal.cx>

process robust list in pthread_exit to fix detached thread use-after-unmap

the robust list head lies in the thread structure, which is unmapped
before exit for detached threads. this leaves the kernel unable to
process the exiting thread's robust list, and with a dangling pointer
which may happen to point to new unrelated data at the time the kernel
processes it.

userspace processing of the robust list was already needed for
non-pshared robust mutexes in order to perform private futex wakes
rather than the shared ones the kernel would do, but it was
conditional on linking pthread_mutexattr_setrobust and did not bother
processing the pshared mutexes in the list, which requires additional
logic for the robust list pending slot in case pthread_exit is
interrupted by asynchronous process termination.

the new robust list processing code is linked unconditionally (inlined
in pthread_exit), handles both private and shared mutexes, and also
removes the kernel's reference to the robust list before unmapping and
exit if the exiting thread is detached.


# 36d8e972 16-Feb-2015 Rich Felker <dalias@aerifal.cx>

make pthread_exit responsible for disabling cancellation

this requirement is tucked away in XSH 2.9.5 Thread Cancellation under
the heading Thread Cancellation Cleanup Handlers.


# 78a8ef47 15-Jan-2015 Rich Felker <dalias@aerifal.cx>

overhaul __synccall and fix AS-safety and other issues in set*id

multi-threaded set*id and setrlimit use the internal __synccall
function to work around the kernel's wrongful treatment of these
process properties as thread-local. the old implementation of
__synccall failed to be AS-safe, despite POSIX requiring setuid and
setgid to be AS-safe, and was not rigorous in assuring that all
threads were caught. in a worst case, threads late in the process of
exiting could retain permissions after setuid reported success, in
which case attacks to regain dropped permissions may have been
possible under the right conditions.

the new implementation of __synccall depends on the presence of
/proc/self/task and will fail if it can't be opened, but is able to
determine that it has caught all threads, and does not use any locks
except its own. it thereby achieves AS-safety simply by blocking
signals to preclude re-entry in the same thread.

with this commit, all known conformance and safety issues in set*id
functions should be fixed.


# 23614b0f 07-Sep-2014 Rich Felker <dalias@aerifal.cx>

add C11 thread creation and related thread functions

based on patch by Jens Gustedt.

the main difficulty here is handling the difference between start
function signatures and thread return types for C11 threads versus
POSIX threads. pointers to void are assumed to be able to represent
faithfully all values of int. the function pointer for the thread
start function is cast to an incorrect type for passing through
pthread_create, but is cast back to its correct type before calling so
that the behavior of the call is well-defined.

changes to the existing threads implementation were kept minimal to
reduce the risk of regressions, and duplication of code that carries
implementation-specific assumptions was avoided for ease and safety of
future maintenance.
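
Illustration: a self-contained way to layer an int-returning C11-style start
function over pthread_create. This sketch packages the function and argument
in a heap allocation rather than using the function-pointer cast described
above, so it shows the int-through-void-pointer return conversion but not
musl's internal mechanism; all names are hypothetical.

    #include <pthread.h>
    #include <stdint.h>
    #include <stdlib.h>

    typedef int (*thrd_start_t)(void *);

    struct c11_start { thrd_start_t func; void *arg; };

    /* Trampoline: call the int-returning start function and return its
     * result through the void * thread return value, relying on the
     * assumption noted above that a pointer to void can faithfully
     * represent all values of int. */
    static void *c11_trampoline(void *p)
    {
        struct c11_start s = *(struct c11_start *)p;
        free(p);
        return (void *)(intptr_t)s.func(s.arg);
    }

    int my_thrd_create(pthread_t *t, thrd_start_t func, void *arg)
    {
        struct c11_start *s = malloc(sizeof *s);
        if (!s) return -1;
        s->func = func;
        s->arg = arg;
        return pthread_create(t, 0, c11_trampoline, s) ? -1 : 0;
    }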


# df7d0dfb 31-Aug-2014 Jens Gustedt <Jens.Gustedt@inria.fr>

use weak symbols for the POSIX functions that will be used by C threads

The intent of this is to avoid name space pollution of the C threads
implementation.

This has two sides to it. First we have to provide symbols that wouldn't
pollute the name space for the C threads implementation. Second we have
to clean up some internal uses of POSIX functions such that they don't
implicitly drag in such symbols.
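
Illustration: the general shape of this arrangement, assuming the usual
gcc/clang alias-attribute idiom; the function name here is purely
illustrative.

    /* Common weak-alias macro (gcc/clang extension). */
    #define weak_alias(old, new) \
        extern __typeof(old) new __attribute__((__weak__, __alias__(#old)))

    /* The implementation lives under an internal, reserved name... */
    int __pthread_example_detach(void *t) { (void)t; return 0; }

    /* ...and the public name is only a weak alias to it, so internal
     * callers and the C11 wrappers can reference the internal name without
     * dragging the POSIX symbol into the application's namespace. */
    weak_alias(__pthread_example_detach, pthread_example_detach);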


# 5345c9b8 23-Aug-2014 Rich Felker <dalias@aerifal.cx>

fix false ownership of stdio FILEs due to tid reuse

this is analogous to commit fffc5cda10e0c5c910b40f7be0d4fa4e15bb3f48,
which fixed the corresponding issue for mutexes.

the robust list can't be used here because the locks do not share a
common layout with mutexes. at some point it may make sense to simply
incorporate a mutex object into the FILE structure and use it, but
that would be a much more invasive change, and it doesn't mesh well
with the current design that uses a simpler code path for internal
locking and pulls in the recursive-mutex-like code when the flockfile
API is used explicitly.


# a6293285 22-Aug-2014 Rich Felker <dalias@aerifal.cx>

fix use of uninitialized memory with application-provided thread stacks

the subsequent code in pthread_create and the code which copies TLS
initialization images to the new thread's TLS space assume that the
memory provided to them is zero-initialized, which is true when it's
obtained by pthread_create using mmap. however, when the caller
provides a stack using pthread_attr_setstack, pthread_create cannot
make any assumptions about the contents. simply zero-filling the
relevant memory in this case is the simplest and safest fix.
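
Illustration: a minimal sketch of the fix as described; sizes and names are
placeholders. Before building the TCB/TLS area inside a caller-provided
stack, the region that later code assumes is zero-initialized gets cleared
explicitly.

    #include <string.h>

    /* attr_stack/attr_stacksize come from pthread_attr_setstack; reserved
     * is the portion at the top of the provided stack that will hold
     * TCB/TLS/TSD. mmap-obtained stacks are already zero-filled. */
    static void prepare_provided_stack(void *attr_stack,
                                       size_t attr_stacksize,
                                       size_t reserved)
    {
        unsigned char *top = (unsigned char *)attr_stack + attr_stacksize;
        memset(top - reserved, 0, reserved);
    }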


# b092f1c5 16-Aug-2014 Rich Felker <dalias@aerifal.cx>

enable private futex for process-local robust mutexes

the kernel always uses non-private wake when walking the robust list
when a thread or process exits, so it's not able to wake waiters
listening with the private futex flag. this problem is solved by doing
the equivalent in userspace as the last step of pthread_exit.

care is taken to remove mutexes from the robust list before unlocking
them so that the kernel will not attempt to access them again,
possibly after another thread locks them. this removal code can treat
the list as singly-linked, since no further code which would add or
remove items is able to run at this point. moreover, the pending
pointer is not needed since the mutexes being unlocked are all
process-local; in the case of asynchronous process termination, they
all cease to exist.

since a process-local robust mutex cannot come into existence without
a call to pthread_mutexattr_setrobust in the same process, the code
for userspace robust list processing is put in that source file, and
a weak alias to a dummy function is used to avoid pulling in this
bloat as part of pthread_exit in static-linked programs.


# a6adb2bc 16-Jul-2014 Rich Felker <dalias@aerifal.cx>

work around constant folding bug 61144 in gcc 4.9.0 and 4.9.1

previously we detected this bug in configure and issued advice for a
workaround, but this turned out not to work. since then gcc 4.9.0 has
appeared in several distributions, and now 4.9.1 has been released
without a fix despite this being a wrong code generation bug which is
supposed to be a release-blocker, per gcc policy.

since the scope of the bug seems to affect only data objects (rather
than functions) whose definitions are overridable, and there are only
a very small number of these in musl, I am just changing them from
const to volatile for the time being. simply removing the const would
be sufficient to make gcc 4.9.1 work (the non-const case was
inadvertently fixed as part of another change in gcc), and this would
also be sufficient with 4.9.0 if we forced -O0 on the affected files
or on the whole build. however it's cleaner to just remove all the
broken compiler detection and use volatile, which will ensure that
they are never constant-folded. the quality of a non-broken compiler's
output should not be affected except for the fact that these objects
are no longer const and thus possibly add a few bytes to data/bss.

this change can be reconsidered and possibly reverted at some point in
the future when the broken gcc versions are no longer relevant.


# 83dc6eb0 05-Jul-2014 Rich Felker <dalias@aerifal.cx>

eliminate use of cached pid from thread structure

the main motivation for this change is to remove the assumption that
the tid of the main thread is also the pid of the process. (the value
returned by the set_tid_address syscall was used to fill both fields
despite it semantically being the tid.) this is historically and
presently true on linux and unlikely to change, but it conceivably
could be false on other systems that otherwise reproduce the linux
syscall api/abi.

only a few parts of the code were actually still using the cached pid.
in a couple places (aio and synccall) it was a minor optimization to
avoid a syscall. caching could be reintroduced, but lazily as part of
the public getpid function rather than at program startup, if it's
deemed important for performance later. in other places (cancellation
and pthread_kill) the pid was completely unnecessary; the tkill
syscall can be used instead of tgkill. this is actually a rather
subtle issue, since tgkill is supposedly a solution to race conditions
that can affect use of tkill. however, as documented in the commit
message for commit 7779dbd2663269b465951189b4f43e70839bc073, tgkill
does not actually solve this race; it just limits it to happening
within one process rather than between processes. we use a lock that
avoids the race in pthread_kill, and the use in the cancellation
signal handler is self-targeted and thus not subject to tid reuse
races, so both are safe regardless of which syscall (tgkill or tkill)
is used.


# 0bc03091 02-Jul-2014 Rich Felker <dalias@aerifal.cx>

add locale framework

this commit adds non-stub implementations of setlocale, duplocale,
newlocale, and uselocale, along with the data structures and minimal
code needed for representing the active locale on a per-thread basis
and optimizing the common case where thread-local locale settings are
not in use.

at this point, the data structures only contain what is necessary to
represent LC_CTYPE (a single flag) and LC_MESSAGES (a name for use in
finding message translation files). representation for the other
categories will be added later; the expectation is that a single
pointer will suffice for each.

for LC_CTYPE, the strings "C" and "POSIX" are treated as special; any
other string is accepted and treated as "C.UTF-8". for other
categories, any string is accepted after being truncated to a maximum
supported length (currently 15 bytes). for LC_MESSAGES, the name is
kept regardless of whether libc itself can use such a message
translation locale, since applications using catgets or gettext should
be able to use message locales libc is not aware of. for other
categories, names which are not successfully loaded as locales (which,
at present, means all names) are treated as aliases for "C". setlocale
never fails.

locale settings are not yet used anywhere, so this commit should have
no visible effects except for the contents of the string returned by
setlocale.


# ac31bf27 10-Jun-2014 Rich Felker <dalias@aerifal.cx>

simplify errno implementation

the motivation for the errno_ptr field in the thread structure, which
this commit removes, was to allow the main thread's errno to keep its
address when lazy thread pointer initialization was used. &errno was
evaluated prior to setting up the thread pointer and stored in
errno_ptr for the main thread; subsequently created threads would have
errno_ptr pointing to their own errno_val in the thread structure.

since lazy initialization was removed, there is no need for this extra
level of indirection; __errno_location can simply return the address
of the thread's errno_val directly. this does cause &errno to change,
but the change happens before entry to application code, and thus is
not observable.


# df15168c 10-Jun-2014 Rich Felker <dalias@aerifal.cx>

replace all remaining internal uses of pthread_self with __pthread_self

prior to version 1.1.0, the difference between pthread_self (the
public function) and __pthread_self (the internal macro or inline
function) was that the former would lazily initialize the thread
pointer if it was not already initialized, whereas the latter would
crash in this case. since lazy initialization is no longer supported,
use of pthread_self no longer makes sense; it simply generates larger,
slower code.


# 689e0e6b 24-Mar-2014 Rich Felker <dalias@aerifal.cx>

fix pointer type mismatch and misplacement of const


# dab441ae 24-Mar-2014 Rich Felker <dalias@aerifal.cx>

always initialize thread pointer at program start

this is the first step in an overhaul aimed at greatly simplifying and
optimizing everything dealing with thread-local state.

previously, the thread pointer was initialized lazily on first access,
or at program startup if stack protector was in use, or at certain
random places where inconsistent state could be reached if it were not
initialized early. while believed to be fully correct, the logic was
fragile and non-obvious.

in the first phase of the thread pointer overhaul, support is retained
(and in some cases improved) for systems/situations where loading the
thread pointer fails, e.g. old kernels.

some notes on specific changes:

- the confusing use of libc.main_thread as an indicator that the
thread pointer is initialized is eliminated in favor of an explicit
has_thread_pointer predicate.

- sigaction no longer needs to ensure that the thread pointer is
initialized before installing a signal handler (this was needed to
prevent a situation where the signal handler caused the thread
pointer to be initialized and the subsequent sigreturn cleared it
again) but it still needs to ensure that implementation-internal
thread-related signals are not blocked.

- pthread tsd initialization for the main thread is deferred in a new
manner to minimize bloat in the static-linked __init_tp code.

- pthread_setcancelstate no longer needs special handling for the
situation before the thread pointer is initialized. it simply fails
on systems that cannot support a thread pointer, which are
non-conforming anyway.

- pthread_cleanup_push/pop now check for missing thread pointer and
nop themselves out in this case, so stdio no longer needs to avoid
the cancellable path when the thread pointer is not available.

a number of cases remain where certain interfaces may crash if the
system does not support a thread pointer. at this point, these should
be limited to pthread interfaces, and the number of such cases should
be fewer than before.


# 271c2119 16-Sep-2013 Rich Felker <dalias@aerifal.cx>

omit CLONE_PARENT flag to clone in pthread_create

CLONE_PARENT is not necessary (CLONE_THREAD provides all the useful
parts of it) and Linux treats CLONE_PARENT as an error in certain
situations, without noticing that it would be a no-op due to
CLONE_THREAD. this error case prevents, for example, use of a
multi-threaded init process and certain usages with containers.


# f68a3468 16-Sep-2013 Rich Felker <dalias@aerifal.cx>

use symbolic names for clone flags in pthread_create


# b20760c0 14-Sep-2013 Szabolcs Nagy <nsz@port70.net>

support configurable page size on mips, powerpc and microblaze

PAGE_SIZE was hardcoded to 4096, which is historically what most
systems use, but on several archs it is a kernel config parameter,
user space can only know it at execution time from the aux vector.

PAGE_SIZE and PAGESIZE are not defined on archs where page size is
a runtime parameter; applications should use sysconf(_SC_PAGE_SIZE)
to query it. Internally, libc code defines PAGE_SIZE to libc.page_size,
which is set to aux[AT_PAGESZ] in __init_libc and early in __dynlink
as well. (Note that libc.page_size can be accessed without the GOT,
i.e. before relocations are done.)

Some fpathconf settings are hardcoded to 4096; these should actually be
queried from the filesystem using statfs.
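
Illustration: what the portable query described above looks like from
application code. sysconf is standard; getauxval is a glibc/musl extension;
the printout is just an example.

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/auxv.h>   /* getauxval, AT_PAGESZ */

    int main(void)
    {
        long ps = sysconf(_SC_PAGE_SIZE);         /* what applications should use */
        unsigned long aux = getauxval(AT_PAGESZ); /* the kernel-provided value */
        printf("sysconf: %ld, auxv: %lu\n", ps, aux);
        return 0;
    }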


# 2c074b0d 26-Apr-2013 Rich Felker <dalias@aerifal.cx>

transition to using functions for internal signal blocking/restoring

there are several reasons for this change. one is getting rid of the
repetition of the syscall signature all over the place. another is
sharing the constant masks without costly GOT accesses in PIC.

the main motivation, however, is accurately representing whether we
want to block signals that might be handled by the application, or all
signals.


# d674f858 26-Apr-2013 Rich Felker <dalias@aerifal.cx>

prevent code from running under a thread id which already gave ESRCH


# 082fb4e9 26-Apr-2013 Rich Felker <dalias@aerifal.cx>

fix clobbering of signal mask when creating thread with sched attributes

this was simply a case of saving the state in the wrong place.


# d0ba0983 26-Apr-2013 Rich Felker <dalias@aerifal.cx>

make last thread's pthread_exit give exit(0) a consistent state

the previous few commits ended up leaving the thread count and signal
mask wrong for atexit handlers and stdio cleanup.


# c3a6839c 26-Apr-2013 Rich Felker <dalias@aerifal.cx>

use atomic decrement rather than cas in pthread_exit thread count

now that blocking signals prevents any application code from running
while the last thread is exiting, the cas logic is no longer needed to
prevent decrementing below zero.


# 6e531f99 26-Apr-2013 Rich Felker <dalias@aerifal.cx>

add comments on some of the pthread_exit logic


# 23f21c30 26-Apr-2013 Rich Felker <dalias@aerifal.cx>

always block signals in pthread_exit before decrementing thread count

the thread count (1+libc.threads_minus_1) must always be greater than
or equal to the number of threads which could have application code
running, even in an async-signal-safe sense. there is at least one
dangerous race condition if this invariant fails to hold: dlopen could
allocate too little TLS for existing threads, and a signal handler
running in the exiting thread could claim the allocated TLS for itself
(via __tls_get_addr), leaving too little for the other threads it was
allocated for and thereby causing out-of-bounds access.

there may be other situations where it's dangerous for the thread
count to be too low, particularly in the case where only one thread
should be left, in which case locking may be omitted. however, all
such code paths seem to arise from undefined behavior, since
async-signal-unsafe functions are not permitted to be called from a
signal handler that interrupts pthread_exit (which is itself
async-signal-unsafe).

this change may also simplify logic in __synccall and improve the
chances of making __synccall async-signal-safe.


# ced64995 05-Apr-2013 Rich Felker <dalias@aerifal.cx>

fix type error in pthread_create, introduced with pthread_getattr_np


# 14a835b3 31-Mar-2013 Rich Felker <dalias@aerifal.cx>

implement pthread_getattr_np

this function is mainly (purely?) for obtaining stack address
information, but we also provide the detach state since it's easy to
do anyway.


# ccc7b4c3 26-Mar-2013 Rich Felker <dalias@aerifal.cx>

remove __SYSCALL_SSLEN arch macro in favor of using public _NSIG

the issue at hand is that many syscalls require as an argument the
kernel-ABI size of sigset_t, intended to allow the kernel to switch to
a larger sigset_t in the future. previously, each arch was defining
this size in syscall_arch.h, which was redundant with the definition
of _NSIG in bits/signal.h. as it's used in some not-quite-portable
application code as well, _NSIG is much more likely to be recognized
and understood immediately by someone reading the code, and it's also
shorter and less cluttered.

note that _NSIG is actually 65/129, not 64/128, but the division takes
care of throwing away the off-by-one part.
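
Illustration: a hedged sketch of the convention described; the call site is
illustrative only. The kernel expects the size of its sigset in bytes, which
is _NSIG/8, and the off-by-one in _NSIG (65 rather than 64) disappears in the
integer division (65/8 == 8).

    #define _GNU_SOURCE
    #include <signal.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    /* Block all signals via the raw syscall; the last argument is the
     * kernel-ABI sigset size in bytes, expressed as _NSIG/8. */
    static void block_all_signals(sigset_t *old)
    {
        sigset_t all;
        sigfillset(&all);
        syscall(SYS_rt_sigprocmask, SIG_BLOCK, &all, old, _NSIG/8);
    }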


# 72768ea9 01-Feb-2013 Rich Felker <dalias@aerifal.cx>

fix stale locks left behind when pthread_create fails

this bug seems to have been around a long time.


# 077549e0 01-Feb-2013 Rich Felker <dalias@aerifal.cx>

if pthread_create fails, it must not attempt mmap if there is no mapping

this bug was introduced when support for application-provided stacks
was originally added.


# d5142642 01-Feb-2013 Rich Felker <dalias@aerifal.cx>

pthread stack treatment overhaul for application-provided stacks, etc.

the main goal of these changes is to address the case where an
application provides a stack of size N, but TLS has size M that's a
significant portion of the size N (or even larger than N), thus giving
the application less stack space than it expected or no stack at all!

the new strategy pthread_create now uses is to only put TLS on the
application-provided stack if TLS is smaller than 1/8 of the stack
size or 2k, whichever is smaller. this ensures that the application
always has "close enough" to what it requested, and the threshold is
chosen heuristically to make sure "sane" amounts of TLS still end up
in the application-provided stack.

if TLS does not fit the above criteria, pthread_create uses mmap to
obtain space for TLS, but still uses the application-provided stack
for actual call frame stack. this is to avoid wasting memory, and for
the sake of supporting ugly hacks like garbage collection based on
assumptions that the implementation will use the provided stack range.

in order for the above heuristics to ever succeed, the amount of TLS
space wasted on POSIX TSD (pthread_key_create based) needed to be
reduced. otherwise, these changes would preclude any use of
pthread_create without mmap, which would have serious memory usage and
performance costs for applications trying to create huge numbers of
threads using pre-allocated stack space. the new value of
PTHREAD_KEYS_MAX is the minimum allowed by POSIX, 128. this should
still be plenty more than real-world applications need, especially now
that C11/gcc-style TLS is now supported in musl, and most apps and
libraries choose to use that instead of POSIX TSD when available.

at the same time, PTHREAD_STACK_MIN has been decreased. it was
originally set to PAGE_SIZE back when there was no support for TLS or
application-provided stacks, and requests smaller than a whole page
did not make sense. now, there are two good reasons to support
requests smaller than a page: (1) applications could provide
pre-allocated stacks smaller than a page, and (2) with smaller stack
sizes, stack+TLS+TSD can all fit in one page, making it possible for
applications which need huge numbers of threads with minimal stack
needs to allocate exactly one page per thread. the new value of
PTHREAD_STACK_MIN, 2k, is aligned with the minimum size for
sigaltstack.
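
Illustration: a sketch of the placement decision as described above; names
and exact arithmetic are illustrative, not the literal musl code.

    #include <stdbool.h>
    #include <stddef.h>

    /* TLS goes on the application-provided stack only if it consumes at
     * most 1/8 of the stack or 2k, whichever is smaller; otherwise a
     * separate mmap is used for TLS/TCB and the provided stack is kept
     * for call frames. */
    static bool tls_fits_on_provided_stack(size_t stack_size, size_t tls_size)
    {
        size_t limit = stack_size / 8;
        if (limit > 2048) limit = 2048;
        return tls_size <= limit;
    }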


# 1e21e78b 11-Nov-2012 Rich Felker <dalias@aerifal.cx>

add support for thread scheduling (POSIX TPS option)

linux's sched_* syscalls actually implement the TPS (thread
scheduling) functionality, not the PS (process scheduling)
functionality which the sched_* functions are supposed to have.
omitting support for the PS option (and having the sched_* interfaces
fail with ENOSYS rather than omitting them, since some broken software
assumes they exist) seems to be the only conforming way to do this on
linux.


# efd4d87a 08-Nov-2012 Rich Felker <dalias@aerifal.cx>

clean up sloppy nested inclusion from pthread_impl.h

this mirrors the stdio_impl.h cleanup. one header which is not
strictly needed, errno.h, is left in pthread_impl.h, because since
pthread functions return their error codes rather than using errno,
nearly every single pthread function needs the errno constants.

in a few places, rather than bringing in string.h to use memset, the
memset was replaced by direct assignment. this seems to generate much
better code anyway, and makes many functions which were previously
non-leaf functions into leaf functions (possibly eliminating a great
deal of bloat on some platforms where non-leaf functions require ugly
prologue and/or epilogue).


# 9ec4283b 15-Oct-2012 Rich Felker <dalias@aerifal.cx>

add support for TLS variant I, presently needed for arm and mips

despite documentation that makes it sound a lot different, the only
ABI-constraint difference between TLS variants II and I seems to be
that variant II stores the initial TLS segment immediately below the
thread pointer (i.e. the thread pointer points to the end of it) and
variant I stores the initial TLS segment above the thread pointer,
requiring the thread descriptor to be stored below. the actual value
stored in the thread pointer register also tends to have per-arch
random offsets applied to it for silly micro-optimization purposes.

with these changes applied, TLS should be basically working on all
supported archs except microblaze. I'm still working on getting the
necessary information and a working toolchain that can build TLS
binaries for microblaze, but in theory, static-linked programs with
TLS and dynamic-linked programs where only the main executable uses
TLS should already work on microblaze.

alignment constraints have not yet been heavily tested, so it's
possible that this code does not always align TLS segments correctly
on archs that need TLS variant I.


# 42c36f95 14-Oct-2012 Rich Felker <dalias@aerifal.cx>

fix overlap of thread stacks with thread tls segments


# 0a96a37f 07-Oct-2012 Rich Felker <dalias@aerifal.cx>

clean up and refactor program initialization

the code in __libc_start_main is now responsible for parsing auxv,
rather than duplicating the parsing all over the place. this should
shave off a few cycles and some code size. __init_libc is left as an
external-linkage function despite the fact that it could be static, to
prevent it from being inlined and permanently wasting stack space when
main is called.

a few other minor changes are included, like eliminating per-thread
ssp canaries (they were likely broken when combined with certain
dlopen usages, and completely unnecessary) and some other unnecessary
checks. since this code gets linked into every program, it should be
as small and simple as possible.


# dcd60371 05-Oct-2012 Rich Felker <dalias@aerifal.cx>

support for TLS in dynamic-loaded (dlopen) modules

unlike other implementations, this one reserves memory for new TLS in
all pre-existing threads at dlopen-time, and dlopen will fail with no
resources consumed and no new libraries loaded if memory is not
available. memory is not immediately distributed to running threads;
that would be too complex and too costly. instead, assurances are made
that threads needing the new TLS can obtain it in an async-signal-safe
way from a buffer belonging to the dynamic linker/new module (via
atomic fetch-and-add based allocator).

I've re-appropriated the lock that was previously used for __synccall
(synchronizing set*id() syscalls between threads) as a general
pthread_create lock. it's a "backwards" rwlock where the "read"
operation is safe atomic modification of the live thread count, which
multiple threads can perform at the same time, and the "write"
operation is making sure the count does not increase during an
operation that depends on it remaining bounded (__synccall or dlopen).
in static-linked programs that don't use __synccall, this lock is a
no-op and has no cost.


# 8431d797 04-Oct-2012 Rich Felker <dalias@aerifal.cx>

TLS (GNU/C11 thread-local storage) support for static-linked programs

the design for TLS in dynamic-linked programs is mostly complete too,
but I have not yet implemented it. cost is nonzero but still low for
programs which do not use TLS and/or do not use threads (a few hundred
bytes of new code, plus dependency on memcpy). i believe it can be
made smaller at some point by merging __init_tls and __init_security
into __libc_start_main and avoiding duplicate auxv-parsing code.

at the same time, I've also slightly changed the logic pthread_create
uses to allocate guard pages to ensure that guard pages are not
counted towards commit charge.


# 0c05bd3a 06-Sep-2012 Rich Felker <dalias@aerifal.cx>

further use of _Noreturn, for non-plain-C functions

note that POSIX does not specify these functions as _Noreturn, because
POSIX is aligned with C99, not the new C11 standard. when POSIX is
eventually updated to C11, it will almost surely give these functions
the _Noreturn attribute. for now, the actual _Noreturn keyword is not
used anyway when compiling with a c99 compiler, which is what POSIX
requires; the GCC __attribute__ is used instead if it's available,
however.

in a few places, I've added infinite for loops at the end of _Noreturn
functions to silence compiler warnings. presumably
__builtin_unreachable could achieve the same thing, but it would only
work on newer GCCs and would not be portable. the loops should have
near-zero code size cost anyway.

like the previous _Noreturn commit, this one is based on patches
contributed by philomath.
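
Illustration: one common way to express this compatibility shim
(illustrative; not necessarily musl's exact definition).

    /* Use the C11 keyword when available, the GCC attribute otherwise,
     * and fall back to nothing on other C99 compilers. */
    #if __STDC_VERSION__ >= 201112L
    /* _Noreturn is a keyword; nothing to do. */
    #elif defined(__GNUC__)
    #define _Noreturn __attribute__((__noreturn__))
    #else
    #define _Noreturn
    #endif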


# 400c5e5c 06-Sep-2012 Rich Felker <dalias@aerifal.cx>

use restrict everywhere it's required by c99 and/or posix 2008

to deal with the fact that the public headers may be used with pre-c99
compilers, __restrict is used in place of restrict, and defined
appropriately for any supported compiler. we also avoid the form
[restrict] since older versions of gcc rejected it due to a bug in the
original c99 standard, and instead use the form *restrict.
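
Illustration: the conditional definition described might look like this
(a sketch, not necessarily the exact header text).

    /* Public headers spell the qualifier __restrict; map it to the real
     * keyword on C99 compilers, and make it vanish on older non-GNU ones
     * (GCC accepts __restrict as an extension even in pre-C99 modes). */
    #if __STDC_VERSION__ >= 199901L
    #define __restrict restrict
    #elif !defined(__GNUC__)
    #define __restrict
    #endif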


# 2f437040 09-Aug-2012 Rich Felker <dalias@aerifal.cx>

fix (hopefully) all hard-coded 8's for kernel sigset_t size

some minor changes to how hard-coded sets for thread-related purposes
are handled were also needed, since the old object sizes were not
necessarily sufficient. things have gotten a bit ugly in this area,
and i think a cleanup is in order at some point, but for now the goal
is just to get the code working on all supported archs including mips,
which was badly broken by linux rejecting syscalls with the wrong
sigset_t size.


# bbbe87e3 12-Jul-2012 Rich Felker <dalias@aerifal.cx>

fix several locks that weren't updated right for new futex-based __lock

these could have caused memory corruption due to invalid accesses to
the next field. all should be fixed now; I found the errors with fgrep
-r '__lock(&', which is bogus since the argument should be an array.


# 92f8396b 11-Jul-2012 Rich Felker <dalias@aerifal.cx>

fix potential race condition in detached threads

after the thread unmaps its own stack/thread structure, the kernel,
performing child tid clear and futex wake, could clobber a new mapping
made at the same location as the just-removed thread's tid field.
disable kernel clearing of child tid to prevent this.
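
Illustration: the underlying mechanism, sketched with the raw syscall (a
sketch of the idea, not the exact code path). Passing a null pointer to
set_tid_address tells the kernel not to write to, or futex-wake, any
location when the thread exits.

    #define _GNU_SOURCE
    #include <sys/syscall.h>
    #include <unistd.h>

    /* Before a detached thread unmaps its own stack, the kernel's
     * clear_child_tid pointer is reset so that the later exit does not
     * touch memory that may have been reused by a new mapping. */
    static void disable_child_tid_clearing(void)
    {
        syscall(SYS_set_tid_address, (void *)0);
    }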


# 819006a8 09-Jun-2012 Rich Felker <dalias@aerifal.cx>

add pthread_attr_setstack interface (and get)

i originally omitted these (optional, per POSIX) interfaces because i
considered them backwards implementation details. however, someone
later brought to my attention a fairly legitimate use case: allocating
thread stacks in memory that's setup for sharing and/or fast transfer
between CPU and GPU so that the thread can move data to a GPU directly
from automatic-storage buffers without having to go through additional
buffer copies.

perhaps there are other situations in which these interfaces are
useful too.


# 1e597a3e 02-Jun-2012 Rich Felker <dalias@aerifal.cx>

remove no-longer-needed unblocking of signals in pthread_create

this action is now performed in pthread_self initialization; it must
be performed there in case the first call to pthread_create is from a
signal handler, in which case the old signal mask could be restored on
return from the signal.


# cfd892fd 23-May-2012 Rich Felker <dalias@aerifal.cx>

simplify cancellation push/pop slightly

no need to pass unnecessary extra arguments on to the core code in
pthread_create.c. this just wastes cycles and code bloat.


# 7e4d7946 04-May-2012 Rich Felker <dalias@aerifal.cx>

make pthread stacks non-executable

this change is necessary or pthread_create will always fail on
security-hardened kernels. i considered first trying to make the stack
executable and simply retrying without execute permissions when the
first try fails, but (1) this would incur a serious performance
penalty on hardened systems, and (2) having the stack be executable is
just a bad idea from a security standpoint.

if there is real-world "GNU C" code that uses nested functions with
threads, and it can't be fixed, we'll have to consider other ways of
solving the problem, but for now this seems like the best fix.


# 58aa5f45 03-May-2012 Rich Felker <dalias@aerifal.cx>

overhaul SSP support to use a real canary

pthread structure has been adjusted to match the glibc/GCC abi for
where the canary is stored on i386 and x86_64. it will need variants
for other archs to provide the added security of the canary's entropy,
but even without that it still works as well as the old "minimal" ssp
support. eventually such changes will be made anyway, since they are
also needed for GCC/C11 thread-local storage support (not yet
implemented).

care is taken not to attempt initializing the thread pointer unless
the program actually uses SSP (by reference to __stack_chk_fail).


# e3234d01 28-Feb-2012 Rich Felker <dalias@aerifal.cx>

fix pthread_cleanup_pop(1) crash in non-thread-capable, static-linked programs


# 2230218c 09-Feb-2012 Rich Felker <dalias@aerifal.cx>

small fix for new pthread cleanup stuff

even if pthread_create/exit code is not linked, run flag needs to be
checked and cleanup function potentially run on pop. thus, move the
code to the module that's always linked when pthread_cleanup_push/pop
is used.


# afc35d5e 09-Feb-2012 Rich Felker <dalias@aerifal.cx>

replace bad cancellation cleanup abi with a sane one

the old abi was intended to duplicate glibc's abi at the expense of
being ugly and slow, but it turns out glibc was not even using that abi
except on non-gcc-compatible compilers (which it doesn't even support)
and was instead using an exceptions-in-c/unwind-based approach whose
abi we could not duplicate anyway without nasty dwarf2/unwind
integration.

the new abi is copied from a very old glibc abi, which seems to still
be supported/present in current glibc. it avoids all unwinding,
whether by sjlj or exceptions, and merely maintains a linked list of
cleanup functions to be called from the context of pthread_exit. i've
made some care to ensure that longjmp out of a cleanup function should
work, even though it is not required to.

this change breaks abi compatibility with programs which were using
pthread cancellation, which is unfortunate, but that's why i'm making
the change now rather than later. considering that most pthread
features have not been usable until recently anyway, i don't see it as
a major issue at this point.


# 3f39c9b3 26-Sep-2011 Rich Felker <dalias@aerifal.cx>

fix incorrect allocation failure check in pthread_create

mmap returns MAP_FAILED not 0 because some idiot thought the ability
to mmap the null pointer page would be a good idea...


# 3f72cdac 18-Sep-2011 Rich Felker <dalias@aerifal.cx>

overhaul clone syscall wrapping

several things are changed. first, i have removed the old __uniclone
function signature and replaced it with the "standard" linux
__clone/clone signature. this was necessary to expose clone to
applications anyway, and it makes it easier to port __clone to new
archs, since it's now testable independently of pthread_create.

secondly, i have removed all references to the ugly ldt descriptor
structure (i386 only) from the c code and pthread structure. in places
where it is needed, it is now created on the stack just when it's
needed, in assembly code. thus, the i386 __clone function takes the
desired thread pointer as its argument, rather than an ldt descriptor
pointer, just like on all other sane archs. this should not affect
applications since there is really no way an application can use clone
with threads/tls in a way that doesn't horribly conflict with and
clobber the underlying implementation's use. applications are expected
to use clone only for creating actual processes, possibly with new
namespace features and whatnot.


# 407d9330 12-Aug-2011 Rich Felker <dalias@aerifal.cx>

pthread and synccall cleanup, new __synccall_wait op

fix up clone signature to match the actual behavior. the new
__synccall_wait function allows a __synccall callback to wait for other
threads to continue without returning, so that it can resume action
after the caller finishes. this interface could be made significantly
more general/powerful with minimal effort, but i'll wait to do that
until it's actually useful for something.


# 5f37fc13 03-Aug-2011 Rich Felker <dalias@aerifal.cx>

further debloat cancellation handlers

cleanup push and pop are also no-ops if pthread_exit is not reachable.
this can make a big difference for library code which needs to protect
itself against cancellation, but which is unlikely to actually be used
in programs with threads/cancellation.


# 56385dd5 03-Aug-2011 Rich Felker <dalias@aerifal.cx>

missed detail in cancellation bloat fix


# 730bee72 03-Aug-2011 Rich Felker <dalias@aerifal.cx>

fix static linking dependency bloat with cancellation

previously, pthread_cleanup_push/pop were pulling in all of
pthread_create due to dependency on the __pthread_unwind_next
function. this was not needed, as cancellation cleanup handlers can
never be called unless pthread_exit or pthread_cancel is reachable.


# dba68bf9 30-Jul-2011 Rich Felker <dalias@aerifal.cx>

add proper futex-based locking for stdio

previously, stdio used spinlocks, which would be unacceptable if we
ever add support for thread priorities, and which yielded
pathologically bad performance if an application attempted to use
flockfile on a key file as a major/primary locking mechanism.

i had held off on making this change for fear that it would hurt
performance in the non-threaded case, but actually support for
recursive locking had already inflicted that cost. by having the
internal locking functions store a flag indicating whether they need
to perform unlocking, rather than using the actual recursive lock
counter, i was able to combine the conditionals at unlock time,
eliminating any additional cost, and also avoid a nasty corner case
where a huge number of calls to ftrylockfile could cause deadlock
later at the point of internal locking.

this commit also fixes some issues with usage of pthread_self
conflicting with __attribute__((const)) which resulted in crashes with
some compiler versions/optimizations, mainly in flockfile prior to
pthread_create.


# acb04806 29-Jul-2011 Rich Felker <dalias@aerifal.cx>

new attempt at making set*id() safe and robust

changing credentials in a multi-threaded program is extremely
difficult on linux because it requires synchronizing the change
between all threads, which have their own thread-local credentials on
the kernel side. this is further complicated by the fact that changing
the real uid can fail due to exceeding RLIMIT_NPROC, making it
possible that the syscall will succeed in some threads but fail in
others.

the old __rsyscall approach being replaced was robust in that it would
report failure if any one thread failed, but in this case, the program
would be left in an inconsistent state where individual threads might
have different uid. (this was not as bad as glibc, which would
sometimes even fail to report the failure entirely!)

the new approach being committed refuses to change real user id when
it cannot temporarily set the rlimit to infinity. this is completely
POSIX conformant since POSIX does not require an implementation to
allow real-user-id changes for non-privileged processes whatsoever.
still, setting the real uid can fail due to memory allocation in the
kernel, but this can only happen if there is not already a cached
object for the target user. thus, we forcibly serialize the syscalls
attempts, and fail the entire operation on the first failure. this
*should* lead to an all-or-nothing success/failure result, but it's
still fragile and highly dependent on kernel developers not breaking
things worse than they're already broken.

ideally linux will eventually add a CLONE_USERCRED flag that would
give POSIX conformant credential changes without any hacks from
userspace, and all of this code would become redundant and could be
removed ~10 years down the line when everyone has abandoned the old
broken kernels. i'm not holding my breath...


# 7779dbd2 13-Jun-2011 Rich Felker <dalias@aerifal.cx>

fix race condition in pthread_kill

if thread id was reused by the kernel between the time pthread_kill
read it from the userspace pthread_t object and the time of the tgkill
syscall, a signal could be sent to the wrong thread. the tgkill
syscall was supposed to prevent this race (versus the old tkill
syscall) but it can't; it can only help in the case where the tid is
reused in a different process, but not when the tid is reused in the
same process.

the only solution i can see is an extra lock to prevent threads from
exiting while another thread is trying to pthread_kill them. it should
be very very cheap in the non-contended case.


# f58c8a0f 13-Jun-2011 Rich Felker <dalias@aerifal.cx>

run dtors before taking the exit-lock in pthread exit

previously a long-running dtor could cause pthread_detach to block.


# 6232b96f 13-Jun-2011 Rich Felker <dalias@aerifal.cx>

minor locking optimizations


# 11e4b925 07-May-2011 Rich Felker <dalias@aerifal.cx>

optimize out useless default-attribute object in pthread_create


# 4c4e22d7 07-May-2011 Rich Felker <dalias@aerifal.cx>

optimize compound-literal sigset_t's not to contain useless hurd bits


# 99b8a25e 07-May-2011 Rich Felker <dalias@aerifal.cx>

overhaul implementation-internal signal protections

the new approach relies on the fact that the only ways to create
sigset_t objects without invoking UB are to use the sig*set()
functions, or from the masks returned by sigprocmask, sigaction, etc.
or in the ucontext_t argument to a signal handler. thus, as long as
sigfillset and sigaddset avoid adding the "protected" signals, there
is no way the application will ever obtain a sigset_t including these
bits, and thus no need to add the overhead of checking/clearing them
when sigprocmask or sigaction is called.

note that the old code actually *failed* to remove the bits from
sa_mask when sigaction was called.

the new implementations are also significantly smaller, simpler, and
faster due to ignoring the useless "GNU HURD signals" 65-1024, which
are not used and, if there's any sanity in the world, never will be
used.


# a6054e3c 19-Apr-2011 Rich Felker <dalias@aerifal.cx>

move some more code out of pthread_create.c

this also de-uglifies the dummy function aliasing a bit.


# 2afed79f 17-Apr-2011 Rich Felker <dalias@aerifal.cx>

pthread_exit is not supposed to affect cancellability

if the exit was caused by cancellation, __cancel has already set these
flags anyway.


# 1ebde9c3 17-Apr-2011 Rich Felker <dalias@aerifal.cx>

fix pthread_exit from cancellation handler

cancellation frames were not correctly popped, so this usage would not
only loop, but also reuse discarded and invalid parts of the stack.


# 9080cc15 17-Apr-2011 Rich Felker <dalias@aerifal.cx>

clean up handling of thread/nothread mode, locking


# feee9890 17-Apr-2011 Rich Felker <dalias@aerifal.cx>

overhaul pthread cancellation

this patch improves the correctness, simplicity, and size of
cancellation-related code. modulo any small errors, it should now be
completely conformant, safe, and resource-leak free.

the notion of entering and exiting cancellation-point context has been
completely eliminated and replaced with alternative syscall assembly
code for cancellable syscalls. the assembly is responsible for setting
up execution context information (stack pointer and address of the
syscall instruction) which the cancellation signal handler can use to
determine whether the interrupted code was in a cancellable state.

these changes eliminate race conditions in the previous generation of
cancellation handling code (whereby a cancellation request received
just prior to the syscall would not be processed, leaving the syscall
to block, potentially indefinitely), and remedy an issue where
non-cancellable syscalls made from signal handlers became cancellable
if the signal handler interrupted a cancellation point.

x86_64 asm is untested and may need a second try to get it right.


# 016a5dc1 13-Apr-2011 Rich Felker <dalias@aerifal.cx>

use a separate signal from SIGCANCEL for SIGEV_THREAD timers

otherwise we cannot support an application's desire to use
asynchronous cancellation within the callback function. this change
also slightly debloats pthread_create.c.


# 9beb6330 13-Apr-2011 Rich Felker <dalias@aerifal.cx>

simplify cancellation point handling

we take advantage of the fact that unless self->cancelpt is 1,
cancellation cannot happen. so just increment it by 2 to temporarily
block cancellation. this drops pthread_create.o well under 1k.


# c2cd25bf 06-Apr-2011 Rich Felker <dalias@aerifal.cx>

consistency: change all remaining syscalls to use SYS_ rather than __NR_ prefix


# b2486a89 06-Apr-2011 Rich Felker <dalias@aerifal.cx>

move rsyscall out of pthread_create module

this is something of a tradeoff, as now set*id() functions, rather
than pthread_create, are what pull in the code overhead for dealing
with linux's refusal to implement proper POSIX thread-vs-process
semantics. my motivations are:

1. it's cleaner this way, especially cleaner to optimize out the
rsyscall locking overhead from pthread_create when it's not needed.
2. it's expected that only a tiny number of core system programs will
ever use set*id() functions, whereas many programs may want to use
threads, and making thread overhead tiny is an incentive for "light"
programs to try threads.


# 74950b33 06-Apr-2011 Rich Felker <dalias@aerifal.cx>

pthread exit stuff: don't bother setting errno when we won't check it.


# 622804ec 06-Apr-2011 Rich Felker <dalias@aerifal.cx>

fix rsyscall handler: must not clobber errno from signal context


# 729cb49f 05-Apr-2011 Rich Felker <dalias@aerifal.cx>

new framework to inhibit thread cancellation when needed

with these small changes, libc functions which need to call functions
which are cancellation points, but which themselves must not be
cancellation points, can use the CANCELPT_INHIBIT and CANCELPT_RESUME
macros to temporarily inhibit all cancellation.


# 7fd39952 03-Apr-2011 Rich Felker <dalias@aerifal.cx>

pthread_create need not set errno


# 66def4e7 03-Apr-2011 Rich Felker <dalias@aerifal.cx>

block all signals during rsyscall

otherwise a signal handler could see an inconsistent and nonconformant
program state where different threads have different uids/gids.


# 1ad049b7 03-Apr-2011 Rich Felker <dalias@aerifal.cx>

fix race condition in rsyscall handler

the problem: there is a (single-instruction) race condition window
between a thread flagging itself dead and decrementing itself from the
thread count. if it receives the rsyscall signal at this exact moment,
the rsyscall caller will never succeed in signalling enough flags to
succeed, and will deadlock forever. in previous versions of musl, the
about-to-terminate thread masked all signals prior to decrementing
the thread count, but this cost a whole syscall just to account for
extremely rare races.

the solution is a huge hack: rather than blocking in the signal
handler if the thread is dead, modify the signal mask of the saved
context and return in order to prevent further signal handling by the
dead thread. this allows the dead thread to continue decrementing the
thread count (if it had not yet done so) and exiting, even while the
live part of the program blocks for rsyscall.


# c9b2d801 02-Apr-2011 Rich Felker <dalias@aerifal.cx>

don't trust siginfo in rsyscall handler

for some inexplicable reason, linux allows the sender of realtime
signals to spoof its identity. permission checks for sending signals
should limit the impact to same-user processes, but just to be safe,
we avoid trusting the siginfo structure and instead simply examine the
program state to see if we're in the middle of a legitimate rsyscall.


# f01d3518 02-Apr-2011 Rich Felker <dalias@aerifal.cx>

simplify calling of timer signal handler


# fd80cfa0 03-Apr-2011 Rich Felker <dalias@aerifal.cx>

omit pthread tsd dtor code if tsd is not used


# 4ae5e811 01-Apr-2011 Rich Felker <dalias@aerifal.cx>

simplify setting result on thread cancellation


# 3df3d4f5 01-Apr-2011 Rich Felker <dalias@aerifal.cx>

fix misspelled PTHREAD_CANCELED constant


# bf619d82 28-Mar-2011 Rich Felker <dalias@aerifal.cx>

major improvements to cancellation handling

- there is no longer any risk of spoofing cancellation requests, since
the cancel flag is set in pthread_cancel rather than in the signal
handler.

- cancellation signal is no longer unblocked when running the
cancellation handlers. instead, pthread_create will cause any new
threads created from a cancellation handler to unblock their own
cancellation signal.

- various tweaks in preparation for POSIX timer support.


# ea343364 25-Mar-2011 Rich Felker <dalias@aerifal.cx>

match glibc/lsb cancellation abi on i386

glibc made the ridiculous choice to use pass-by-register calling
convention for these functions, which is impossible to duplicate
directly on non-gcc compilers. instead, we use ugly asm to wrap and
convert the calling convention. presumably this works with every
compiler anyone could potentially want to use.


# b470030f 24-Mar-2011 Rich Felker <dalias@aerifal.cx>

overhaul cancellation to fix resource leaks and dangerous behavior with signals

this commit addresses two issues:

1. a race condition, whereby a cancellation request occurring after a
syscall returned from kernelspace but before the subsequent
CANCELPT_END would cause cancellable resource-allocating syscalls
(like open) to leak resources.

2. signal handlers invoked while the thread was blocked at a
cancellation point behaved as if asynchronous cancellation mode were in
effect, resulting in potentially dangerous state corruption if a
cancellation request occurs.

the glibc/nptl implementation of threads shares both of these issues.

with this commit, both are fixed. however, cancellation points
encountered in a signal handler will not be acted upon if the signal
was received while the thread was already at a cancellation point.
they will of course be acted upon after the signal handler returns, so
in real-world usage where signal handlers quickly return, it should
not be a problem. it's possible to solve this problem too by having
sigaction() wrap all signal handlers with a function that uses a
pthread_cleanup handler to catch cancellation, patch up the saved
context, and return into the cancellable function that will catch and
act upon the cancellation. however that would be a lot of complexity
for minimal if any benefit...


# aa398f56 19-Mar-2011 Rich Felker <dalias@aerifal.cx>

global cleanup to use the new syscall interface


# 685e40bb 19-Mar-2011 Rich Felker <dalias@aerifal.cx>

syscall overhaul part two - unify public and internal syscall interface

with this patch, the syscallN() functions are no longer needed; a
variadic syscall() macro allows syscalls with anywhere from 0 to 6
arguments to be made with a single macro name. also, manually casting
each non-integer argument with (long) is no longer necessary; the
casts are hidden in the macros.

some source files which depended on being able to define the old macro
SYSCALL_RETURNS_ERRNO have been modified to directly use __syscall()
instead of syscall(). references to SYSCALL_SIGSET_SIZE and SYSCALL_LL
have also been changed.

x86_64 has not been tested, and may need a follow-up commit to fix any
minor bugs/oversights.
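
Illustration: the argument-counting dispatch this describes follows a
standard preprocessor pattern; the sketch below is simplified, not the
literal header.

    /* Count the arguments (0-6) and paste the count onto a base name, so
     * __syscall(SYS_x, a, b) expands to a call of __syscall2(SYS_x, a, b),
     * where the __syscallN variants are arch-specific inline wrappers. */
    #define __SYSCALL_NARGS_X(a,b,c,d,e,f,g,h,n,...) n
    #define __SYSCALL_NARGS(...) \
        __SYSCALL_NARGS_X(__VA_ARGS__,7,6,5,4,3,2,1,0,)
    #define __SYSCALL_CONCAT_X(a,b) a##b
    #define __SYSCALL_CONCAT(a,b) __SYSCALL_CONCAT_X(a,b)
    #define __SYSCALL_DISP(base,...) \
        __SYSCALL_CONCAT(base,__SYSCALL_NARGS(__VA_ARGS__))(__VA_ARGS__)

    #define __syscall(...) __SYSCALL_DISP(__syscall,__VA_ARGS__)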


# d00ff295 19-Mar-2011 Rich Felker <dalias@aerifal.cx>

overhaul syscall interface

this commit shuffles around the location of syscall definitions so
that we can make a syscall() library function with both SYS_* and
__NR_* style syscall names available to user applications, provides
the syscall() library function, and optimizes the code that performs
the actual inline syscalls in the library itself.

previously on i386 when built as PIC (shared library), syscalls were
incurring bus lock (lock prefix) overhead at entry and exit, due to
the way the ebx register was being loaded (xchg instruction with a
memory operand). now the xchg takes place between two registers.

further cleanup to arch/$(ARCH)/syscall.h is planned.


# 29fae657 16-Mar-2011 Rich Felker <dalias@aerifal.cx>

cut out a syscall on thread creation in the case where guard size is 0


# 5eb0d33e 12-Mar-2011 Rich Felker <dalias@aerifal.cx>

implement flockfile api, rework stdio locking


# 5fcebcde 10-Mar-2011 Rich Felker <dalias@aerifal.cx>

optimize pthread termination in the non-detached case

we can avoid blocking signals by simply using a flag to mark that the
thread has exited and prevent it from getting counted in the rsyscall
signal-pingpong. this restores the original pthread create/join
throughput from before the sigprocmask call was added.


# 52213f73 10-Mar-2011 Rich Felker <dalias@aerifal.cx>

security fix: check that cancel/rsyscall signal was sent by the process itself


# 98e02144 19-Feb-2011 Rich Felker <dalias@aerifal.cx>

use rt_sigprocmask, not legacy sigprocmask, syscall in pthread exit code


# 19eb13b9 19-Feb-2011 Rich Felker <dalias@aerifal.cx>

race condition fix: block all signals before decrementing thread count

the existence of a (kernelspace) thread must never have observable
effects after the thread count is decremented. if signals are not
blocked, it could end up handling the signal for rsyscall and
contributing towards the count of threads which have changed ids,
causing a thread to be missed. this could lead to one thread retaining
unwanted privilege level.

this change may also address other subtle race conditions in
application code that uses signals.


# fb11b6b8 19-Feb-2011 Rich Felker <dalias@aerifal.cx>

make pthread_exit run dtors for last thread, wait to decrement thread count


# e8827563 17-Feb-2011 Rich Felker <dalias@aerifal.cx>

reorganize pthread data structures and move the definitions to alltypes.h

this allows sys/types.h to provide the pthread types, as required by
POSIX. this design also facilitates forcing ABI-compatible sizes in
the arch-specific alltypes.h, while eliminating the need for
developers changing the internals of the pthread types to poke around
with arch-specific headers they may not be able to test.


# 0b2006c8 15-Feb-2011 Rich Felker <dalias@aerifal.cx>

begin unifying clone/thread management interface in preparation for porting


# 59666802 15-Feb-2011 Rich Felker <dalias@aerifal.cx>

make pthread_create return EAGAIN on resource failure, as required by POSIX


# 1a9a2ff7 13-Feb-2011 Rich Felker <dalias@aerifal.cx>

reorganize thread exit code, make pthread_exit call cancellation handlers (pt2)


# 0b44a031 11-Feb-2011 Rich Felker <dalias@aerifal.cx>

initial check-in, version 0.5.0