Cross Reference: /seL4-test-master/projects/musllibc/src/thread/synccall.c

History log of /seL4-test-master/projects/musllibc/src/thread/synccall.c
Revision	Date	Author	Comments
# 56fbaa3b	03-Mar-2015	Rich Felker <dalias@aerifal.cx>	make all objects used with atomic operations volatile the memory model we use internally for atomics permits plain loads of values which may be subject to concurrent modification without requiring that a special load function be used. since a compiler is free to make transformations that alter the number of loads or the way in which loads are performed, the compiler is theoretically free to break this usage. the most obvious concern is with atomic cas constructs: something of the form tmp=p;a_cas(p,tmp,f(tmp)); could be transformed to a_cas(p,p,f(p)); where the latter is intended to show multiple loads of p whose resulting values might fail to be equal; this would break the atomicity of the whole operation. but even more fundamental breakage is possible. with the changes being made now, objects that may be modified by atomics are modeled as volatile, and the atomic operations performed on them by other threads are modeled as asynchronous stores by hardware which happens to be acting on the request of another thread. such modeling of course does not itself address memory synchronization between cores/cpus, but that aspect was already handled. this all seems less than ideal, but it's the best we can do without mandating a C11 compiler and using the C11 model for atomics. in the case of pthread_once_t, the ABI type of the underlying object is not volatile-qualified. so we are assuming that accessing the object through a volatile-qualified lvalue via casts yields volatile access semantics. the language of the C standard is somewhat unclear on this matter, but this is an assumption the linux kernel also makes, and seems to be the correct interpretation of the standard.
# 78a8ef47	15-Jan-2015	Rich Felker <dalias@aerifal.cx>	overhaul __synccall and fix AS-safety and other issues in setid multi-threaded setid and setrlimit use the internal __synccall function to work around the kernel's wrongful treatment of these process properties as thread-local. the old implementation of __synccall failed to be AS-safe, despite POSIX requiring setuid and setgid to be AS-safe, and was not rigorous in assuring that all threads were caught. in a worst case, threads late in the process of exiting could retain permissions after setuid reported success, in which case attacks to regain dropped permissions may have been possible under the right conditions. the new implementation of __synccall depends on the presence of /proc/self/task and will fail if it can't be opened, but is able to determine that it has caught all threads, and does not use any locks except its own. it thereby achieves AS-safety simply by blocking signals to preclude re-entry in the same thread. with this commit, all known conformance and safety issues in set*id functions should be fixed.
# 83dc6eb0	05-Jul-2014	Rich Felker <dalias@aerifal.cx>	eliminate use of cached pid from thread structure the main motivation for this change is to remove the assumption that the tid of the main thread is also the pid of the process. (the value returned by the set_tid_address syscall was used to fill both fields despite it semantically being the tid.) this is historically and presently true on linux and unlikely to change, but it conceivably could be false on other systems that otherwise reproduce the linux syscall api/abi. only a few parts of the code were actually still using the cached pid. in a couple places (aio and synccall) it was a minor optimization to avoid a syscall. caching could be reintroduced, but lazily as part of the public getpid function rather than at program startup, if it's deemed important for performance later. in other places (cancellation and pthread_kill) the pid was completely unnecessary; the tkill syscall can be used instead of tgkill. this is actually a rather subtle issue, since tgkill is supposedly a solution to race conditions that can affect use of tkill. however, as documented in the commit message for commit 7779dbd2663269b465951189b4f43e70839bc073, tgkill does not actually solve this race; it just limits it to happening within one process rather than between processes. we use a lock that avoids the race in pthread_kill, and the use in the cancellation signal handler is self-targeted and thus not subject to tid reuse races, so both are safe regardless of which syscall (tgkill or tkill) is used.
# 57174444	11-Dec-2013	Szabolcs Nagy <nsz@port70.net>	include cleanups: remove unused headers and add feature test macros
# b4b9743c	02-Sep-2013	Rich Felker <dalias@aerifal.cx>	fix mips-specific bug in synccall (too little space for signal mask) switch to the new __block_all_sigs/__restore_sigs internal API to clean up the code too.
# 3c0501d2	02-Sep-2013	Rich Felker <dalias@aerifal.cx>	in synccall, ignore the signal before any threads' signal handlers return this protects against deadlock from spurious signals (e.g. sent by another process) arriving after the controlling thread releases the other threads from the sync operation.
# a731e410	02-Sep-2013	Rich Felker <dalias@aerifal.cx>	fix invalid pointer in synccall (multithread setuid, etc.) the head pointer was not being reset between calls to synccall, so any use of this interface more than once would build the linked list incorrectly, keeping the (now invalid) list nodes from the previous call.
# 47d2bf51	26-Apr-2013	Rich Felker <dalias@aerifal.cx>	synccall signal handler need not handle dead threads anymore they have already blocked signals before decrementing the thread count, so the code being removed is unreachable in the case where the thread is no longer counted.
# ccc7b4c3	26-Mar-2013	Rich Felker <dalias@aerifal.cx>	remove __SYSCALL_SSLEN arch macro in favor of using public _NSIG the issue at hand is that many syscalls require as an argument the kernel-ABI size of sigset_t, intended to allow the kernel to switch to a larger sigset_t in the future. previously, each arch was defining this size in syscall_arch.h, which was redundant with the definition of _NSIG in bits/signal.h. as it's used in some not-quite-portable application code as well, _NSIG is much more likely to be recognized and understood immediately by someone reading the code, and it's also shorter and less cluttered. note that _NSIG is actually 65/129, not 64/128, but the division takes care of throwing away the off-by-one part.
# efd4d87a	08-Nov-2012	Rich Felker <dalias@aerifal.cx>	clean up sloppy nested inclusion from pthread_impl.h this mirrors the stdio_impl.h cleanup. one header which is not strictly needed, errno.h, is left in pthread_impl.h, because since pthread functions return their error codes rather than using errno, nearly every single pthread function needs the errno constants. in a few places, rather than bringing in string.h to use memset, the memset was replaced by direct assignment. this seems to generate much better code anyway, and makes many functions which were previously non-leaf functions into leaf functions (possibly eliminating a great deal of bloat on some platforms where non-leaf functions require ugly prologue and/or epilogue).
# dcd60371	05-Oct-2012	Rich Felker <dalias@aerifal.cx>	support for TLS in dynamic-loaded (dlopen) modules unlike other implementations, this one reserves memory for new TLS in all pre-existing threads at dlopen-time, and dlopen will fail with no resources consumed and no new libraries loaded if memory is not available. memory is not immediately distributed to running threads; that would be too complex and too costly. instead, assurances are made that threads needing the new TLS can obtain it in an async-signal-safe way from a buffer belonging to the dynamic linker/new module (via atomic fetch-and-add based allocator). I've re-appropriated the lock that was previously used for __synccall (synchronizing set*id() syscalls between threads) as a general pthread_create lock. it's a "backwards" rwlock where the "read" operation is safe atomic modification of the live thread count, which multiple threads can perform at the same time, and the "write" operation is making sure the count does not increase during an operation that depends on it remaining bounded (__synccall or dlopen). in static-linked programs that don't use __synccall, this lock is a no-op and has no cost.
# 2f437040	09-Aug-2012	Rich Felker <dalias@aerifal.cx>	fix (hopefully) all hard-coded 8's for kernel sigset_t size some minor changes to how hard-coded sets for thread-related purposes are handled were also needed, since the old object sizes were not necessarily sufficient. things have gotten a bit ugly in this area, and i think a cleanup is in order at some point, but for now the goal is just to get the code working on all supported archs including mips, which was badly broken by linux rejecting syscalls with the wrong sigset_t size.
# 0c29adfe	22-May-2012	Rich Felker <dalias@aerifal.cx>	remove everything related to forkall i made a best attempt, but the intended semantics of this function are fundamentally contradictory. there is no consistent way to handle ownership of locks when forking a multi-threaded process. the code could have worked by accident for programs that only used normal mutexes and nothing else (since they don't actually store or care about their owner), but that's about it. broken-by-design interfaces that aren't even in glibc (only solaris) don't belong in musl.
# 407d9330	12-Aug-2011	Rich Felker <dalias@aerifal.cx>	pthread and synccall cleanup, new __synccall_wait op fix up clone signature to match the actual behavior. the new __syncall_wait function allows a __synccall callback to wait for other threads to continue without returning, so that it can resume action after the caller finishes. this interface could be made significantly more general/powerful with minimal effort, but i'll wait to do that until it's actually useful for something.
# 7dd60b80	29-Jul-2011	Rich Felker <dalias@aerifal.cx>	fix bug in synccall with no threads: lock was taken but never released
# acb04806	29-Jul-2011	Rich Felker <dalias@aerifal.cx>	new attempt at making setid() safe and robust changing credentials in a multi-threaded program is extremely difficult on linux because it requires synchronizing the change between all threads, which have their own thread-local credentials on the kernel side. this is further complicated by the fact that changing the real uid can fail due to exceeding RLIMIT_NPROC, making it possible that the syscall will succeed in some threads but fail in others. the old __rsyscall approach being replaced was robust in that it would report failure if any one thread failed, but in this case, the program would be left in an inconsistent state where individual threads might have different uid. (this was not as bad as glibc, which would sometimes even fail to report the failure entirely!) the new approach being committed refuses to change real user id when it cannot temporarily set the rlimit to infinity. this is completely POSIX conformant since POSIX does not require an implementation to allow real-user-id changes for non-privileged processes whatsoever. still, setting the real uid can fail due to memory allocation in the kernel, but this can only happen if there is not already a cached object for the target user. thus, we forcibly serialize the syscalls attempts, and fail the entire operation on the first failure. this should* lead to an all-or-nothing success/failure result, but it's still fragile and highly dependent on kernel developers not breaking things worse than they're already broken. ideally linux will eventually add a CLONE_USERCRED flag that would give POSIX conformant credential changes without any hacks from userspace, and all of this code would become redundant and could be removed ~10 years down the line when everyone has abandoned the old broken kernels. i'm not holding my breath...