History log of /freebsd-9.3-release/lib/libkse/thread/thr_kern.c
Revision Date Author Comments
# 267654 19-Jun-2014 gjb

Copy stable/9 to releng/9.3 as part of the 9.3-RELEASE cycle.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation

# 225736 22-Sep-2011 kensmith

Copy head to stable/9 as part of 9.0-RELEASE release cycle.

Approved by: re (implicit)


# 174335 06-Dec-2007 deischen

Set the tcb (thread control block) in the child process after a fork.
This protects against a race with an upcall in the parent during the
fork which can clobber the parent's tcb before the vm space is copied
in the child. The child then gets a corrupted tcb that is either null
or that points to another thread that doesn't exist in the child (after
a fork, only the fork()ing thread exists in the child).

Reported by: Arno J. Klaassen (arno at heho / snv / jussieu / fr)


# 174111 30-Nov-2007 deischen

Initialize the current thread and signal locks so that sigaction()
will work after a fork().

WARNS=3'ify.


# 173967 27-Nov-2007 jasone

Add _pthread_mutex_init_calloc_cb() to libthr and libkse, so that malloc(3)
(part of libc) can use pthreads mutexes without causing infinite recursion
during initialization.


# 172491 09-Oct-2007 obrien

Repo copy libpthreads to libkse.
This introduces the WITHOUT_LIBKSE knob,
and changes WITHOUT_LIBPTHREADS to mean with neither threading libs.
Approved by: re(kensmith)


# 167244 05-Mar-2007 brian

Oops, fix a typo in the last commit :-/


# 167241 05-Mar-2007 brian

In the NOTYET code path when a process forks, the remaining
child thread goes back to system scope rather than process
scope. This allows an ensuing exec() to actually work.

This change was made a year ago here, but I "forgot" to
commit it :(

Approved by: deischen
MFC after: 3 weeks


# 165334 18-Dec-2006 peadar

Clean bound and non-bound pthread structures consistently before
they become candidates for reuse. Without this fix, some of the
state from a thread structure's previous incarnation could interfere
with its new one. Specifically, a non-bound thread started as
"suspended" (see pthread_attr_setcreatesuspend_np()) might not get
scheduled at all when resumed, as the "active" flag would be set
spuriously.

Reviewed by: deischen@, davidxu@
MFC after: 1 week


# 155745 15-Feb-2006 deischen

Don't forget to initialize a tailq before using it.

MFC candidate
Noticed by: luoqi


# 153989 03-Jan-2006 brian

For the ``#ifdef NOTYET'' code that allows calling non-async-safe
functions in the child after a fork() from a threaded process,
use __sys_setprocmask() rather than setprocmask() to keep our
signal handling sane. Without this fix, signals are essentially
ignored in said child and things such as protection violations
result in an endless busy loop.

Reviewed by: deischen


# 150499 23-Sep-2005 brian

Modify the code path of the ifdef NOTYET part of _kse_single_thread():

o Don't reinitialise the atfork() handler list in the child. We
are meant to call the child handler, and on subsequent fork()s
should call all three functions as normal.
o Don't reinitialise the thread specific keyed data in the
child after a fork. Applications may require this for context.
o Reinitialise curthread->tlflags after removing ourselves from
(and reinitialising) the various internal thread lists.
o Reinitialise __malloc_lock in the child after fork() (to balance
our explicitly taking the lock prior to the fork()).

With these changes, it is possible to enable the NOTYET code in
thr_kern.c to allow the use of non-async-safe functions after
fork()ing from a threaded program.

Reviewed by: Daniel Eischen <deischen@freebsd.org>
[_malloc_lock reinitialisation has since been moved to avoid polluting the
!NOTYET code]


# 149617 30-Aug-2005 deischen

Allocate a thread's tcb last so it is easier to handle failures to
malloc() siginfo.

PR: 85468


# 149579 29-Aug-2005 deischen

Handle failure to malloc() part of the thread structure.

PR: 83457


# 139023 18-Dec-2004 deischen

Use a generic way to back threads out of wait queues when handling
signals instead of having more intricate knowledge of thread state
within signal handling.

Simplify signal code because of above (by David Xu).

Use macros for libpthread usage of pthread_cleanup_push() and
pthread_cleanup_pop(). This removes some instances of malloc()
and free() from the semaphore and pthread_once() implementations.

When single threaded and forking(), make sure that the current
thread's signal mask is inherited by the forked thread.

Use private mutexes for libc and libpthread. Signals are
deferred while threads hold private mutexes. This fix also
breaks www/linuxpluginwrapper; a patch that fixes it is at
http://people.freebsd.org/~deischen/kse/linuxpluginwrapper.diff

Fix race condition in condition variables where handling a
signal (pthread_kill() or kill()) may not see a wakeup
(pthread_cond_signal() or pthread_cond_broadcast()).

In collaboration with: davidxu


# 136846 23-Oct-2004 davidxu

1. Move thread list flags into new separate member, and atomically
put DEAD thread on GC list, this closes a race between pthread_join
and thr_cleanup.
2. Introduce a mutex to protect tcb initialization. The TLS allocation and
deallocation code in rtld seems to have no lock protection, or it is broken;
under stress testing, memory is corrupted.

Reviewed by: deischen
patch partly provided by: deischen


# 136286 08-Oct-2004 davidxu

if system scope thread didn't set timeout, don't call clock_gettime syscall
before and after sleeping.

Reviewed by: deischen


# 135714 24-Sep-2004 ssouhlal

Make sure we don't call _thr_start_sig_daemon() when SYSTEM_SCOPE_ONLY is defined. This makes libpthread usable on powerpc.

Approved by: grehan (mentor), deischen


# 133756 15-Aug-2004 dfr

Add TLS support for i386 and amd64.


# 133563 12-Aug-2004 deischen

As long as we have a knob to force system scope threads, why not have
a knob to force process scope threads. If the environment variable
LIBPTHREAD_PROCESS_SCOPE is set, force all threads to be process
scope threads regardless of how the application creates them. If
LIBPTHREAD_SYSTEM_SCOPE is set (forcing system scope threads), it
overrides LIBPTHREAD_PROCESS_SCOPE.

$ # To force system scope threads
$ LIBPTHREAD_SYSTEM_SCOPE=anything threaded_app
$ # To force process scope threads
$ LIBPTHREAD_PROCESS_SCOPE=anything threaded_app
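
The precedence described above can be sketched in C as follows. This is
only an illustration of the stated rule (presence of the variable matters,
and LIBPTHREAD_SYSTEM_SCOPE overrides LIBPTHREAD_PROCESS_SCOPE); the enum
and function names are hypothetical and are not taken from thr_kern.c.

```c
#include <stdlib.h>

/* Hypothetical names for illustration; not the library's own identifiers. */
enum forced_scope {
	SCOPE_APP_CHOICE,	/* neither variable set: honor the application */
	SCOPE_FORCE_PROCESS,
	SCOPE_FORCE_SYSTEM
};

/*
 * Assumes only the presence of a variable matters, not its value
 * (the log uses "=anything").  LIBPTHREAD_SYSTEM_SCOPE wins when
 * both variables are set.
 */
enum forced_scope
forced_scope_from_env(void)
{
	if (getenv("LIBPTHREAD_SYSTEM_SCOPE") != NULL)
		return (SCOPE_FORCE_SYSTEM);
	if (getenv("LIBPTHREAD_PROCESS_SCOPE") != NULL)
		return (SCOPE_FORCE_PROCESS);
	return (SCOPE_APP_CHOICE);
}
```

Checking the system scope variable first is what gives it priority; the
process scope variable is consulted only when the first is absent.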


# 133344 08-Aug-2004 davidxu

Check debugger suspending flag for system scope thread.

Reviewed by: deischen


# 133269 07-Aug-2004 deischen

Add a way to force 1:1 mode for libpthread. To do this, define
LIBPTHREAD_SYSTEM_SCOPE in the environment.

You can still force libpthread to be built in strictly 1:1 by
adding -DSYSTEM_SCOPE_ONLY to CFLAGS. This is kept for archs
that don't yet support M:N mode.

Requested by: rwatson
Reviewed by: davidxu


# 133047 03-Aug-2004 davidxu

s/TMDF_DONOTRUNUSER/TMDF_SUSPEND/g

Discussed with: deischen


# 132120 13-Jul-2004 davidxu

Add code to support thread debugging.
1. Add global variable _libkse_debug; the debugger uses the variable to
identify libpthread. When the variable is written to non-zero by the debugger,
libpthread will take some special action at context switch time: it will check
the TMDF_DONOTRUNUSER flag, and if a thread has the flag set by the debugger,
it won't be scheduled. When a thread leaves a KSE critical region, the thread
checks the flag, and if it was set, the thread relinquishes the CPU.

2. Add pq_first_debug to select a thread which is allowed to run by the
debugger.

3. Some names prefixed with _thr are renamed to _thread prefix.


# 128041 08-Apr-2004 deischen

After forking and initializing the library to single-threaded
mode (where the forked thread is the one and only thread and
is marked as system scope), set the system scope flag before
initializing the signal mask. This prevents trying to use
internal locks that haven't yet been initialized.

Reported by: Dan Nelson <dnelson at allantgroup.com>
Reviewed by: davidxu


# 123668 19-Dec-2003 davidxu

Replace a comment with a more accurate one; the memory heap is now protected
by the new fork() wrapper.


# 123312 09-Dec-2003 davidxu

Rename _thr_enter_cancellation_point to _thr_cancel_enter and rename
_thr_leave_cancellation_point to _thr_cancel_leave. Add a parameter
to _thr_cancel_leave to indicate whether the cancellation point should be
checked. This gives us an option to not check the cancellation point if
a syscall successfully returns, to avoid any leaks. Currently creat(),
open() and fcntl(F_DUPFD) do not check the cancellation point
after they have successfully returned.

Replace some members in structure kse with bit flags to save some
memory.

Conditionally compile THR_ASSERT to nothing if _PTHREAD_INVARIANTS is
not defined.

Inline some small functions in thr_cancel.c.

Use __predict_false in thr_kern.c for some executed only once code.

Reviewed by: deischen


# 123048 29-Nov-2003 davidxu

1.Macro optimizing KSE_LOCK_ACQUIRE and THR_LOCK_ACQUIRE to use static fall
through branch predict as suggested in INTEL IA32 optimization guide.

2.Allocate siginfo array separately to avoid pthread being allocated at a
2K boundary, which hits the L1 address alias problem and causes context
switches to slow down.

3.Simplify context switch code by removing redundant code, code size is
reduced, so it is expected to run faster.

Reviewed by: deischen
Approved by: re (scottl)


# 122338 08-Nov-2003 davidxu

If a thread in critical region got a synchronous signal, according current
signal handling mode, there is no chance to handle the signal, something
must be wrong in the library, just call kse_thr_interrupt to dump its core.
I have the code for a long time, but forgot to commit it.


# 122075 04-Nov-2003 deischen

Add an implementation for pthread_atfork().

Aside from the POSIX requirements for pthread_atfork(), when
fork()ing, take the malloc lock to keep malloc state consistent
in the child.

Reviewed by: davidxu
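
The commit above takes the malloc lock around fork() so the child inherits
consistent state. The same pattern can be sketched at application level with
pthread_atfork(): take a lock in the prepare handler and release it in both
the parent and child handlers. The lock, the counters, and the function
names below are hypothetical stand-ins, not the library's internals.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Application lock standing in for the malloc lock named in the log. */
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static int prepare_calls, parent_calls, child_calls;

static void
on_prepare(void)
{
	/* Runs in the parent before fork(): hold the lock across the fork. */
	pthread_mutex_lock(&state_lock);
	prepare_calls++;
}

static void
on_parent(void)
{
	/* Runs in the parent after fork(): release the lock. */
	parent_calls++;
	pthread_mutex_unlock(&state_lock);
}

static void
on_child(void)
{
	/* Runs in the child after fork(): release the inherited lock. */
	child_calls++;
	pthread_mutex_unlock(&state_lock);
}

int
install_fork_handlers(void)
{
	return (pthread_atfork(on_prepare, on_parent, on_child));
}

/*
 * Fork once.  The child exits with its own child_calls count; the
 * parent returns that exit status, or -1 on failure.
 */
int
fork_and_collect(void)
{
	pid_t pid;
	int status;

	pid = fork();
	if (pid < 0)
		return (-1);
	if (pid == 0)
		_exit(child_calls);
	if (waitpid(pid, &status, 0) < 0)
		return (-1);
	return (WIFEXITED(status) ? WEXITSTATUS(status) : -1);
}
```

After install_fork_handlers() succeeds, one call to fork_and_collect() runs
the prepare and parent handlers once each in the parent, and the child
handler once in the child.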


# 120896 07-Oct-2003 davidxu

Complete cancellation support for M:N threads, check cancelling flag when
thread state is changed from RUNNING to WAIT state and do some cancellation
operations for every cancellable state.

Reviewed by: deischen


# 120567 29-Sep-2003 davidxu

When concurrency level is reduced and a kse is exiting, make sure no other
threads are still referencing the kse by migrating them to initial kse.

Reviewed by: deischen


# 120554 28-Sep-2003 davidxu

Remove unused variable.


# 120370 23-Sep-2003 davidxu

Free thread name memory if there is any.


# 120263 19-Sep-2003 marcel

Make KSE_STACKSIZE machine dependent by moving it from thr_kern.c to
pthread_md.h. This commit only moves the definition; it does not
change it for any of the platforms. This more easily allows 64-bit
architectures (in particular) to pick a slightly larger stack size.


# 120109 16-Sep-2003 davidxu

Fix a typo. Also turn on PTHREAD_SCOPE_SYSTEM after fork().


# 120074 14-Sep-2003 davidxu

1. Allocating and freeing lock related resource in _thr_alloc and _thr_free
to avoid potential memory leak, also fix a bug in pthread_create, contention
scope should be inherited when PTHREAD_INHERIT_SCHED is set, and also check
right field for PTHREAD_INHERIT_SCHED, scheduling inherit flag is in sched_inherit.
2. Execute hooks registered by atexit() on thread stack but not on scheduler
stack.
3. Simplify some code in _kse_single_thread by calling xxx_destroy functions.

Reviewed by: deischen


# 119736 04-Sep-2003 davidxu

Add code to support barrier synchronous object and implement
pthread_mutex_timedlock().

Reviewed by: deischen


# 119732 04-Sep-2003 davidxu

Allow hooks registered by atexit() to run with current thread pointer set,
without this change, my atexit test dumps core.


# 119704 02-Sep-2003 davidxu

This is a force commit for revision 1.90 to explain further:
Removes a surplus kse_wakeup_multi call when there is no thread that can run.
Also reduces the time window in which an IDLE kse wakes and sleeps again
because it cannot get the scheduler lock after wakeup. The change is small and
not perfect; further refining it is possible but may not be worth doing, as it
is unknown whether we can gain a performance benefit by refining it.

Prodded by: scottl


# 119700 02-Sep-2003 davidxu

Move kse_wakeup_multi call to just before KSE_SCHED_UNLOCK.

Tested on: SMP


# 119577 30-Aug-2003 deischen

Allow the concurrency level to be reduced.

Reviewed by: davidxu


# 119063 18-Aug-2003 davidxu

Treat initial thread as scope system thread when KSE mode is not activated
yet, so we can protect some locking code from being interrupted by signal
handling. When KSE mode is turned on, reset the thread flag to scope process
except we are running in 1:1 mode which we needn't turn it off.
Also remove some unused member variables in structure kse.

Tested by: deischen


# 118985 16-Aug-2003 davidxu

Keep initial kse and kse group just like we keep initial thread,
Don't free them, so some code can still reference them.

Reviewed by: deischen


# 118850 12-Aug-2003 davidxu

Always set tcb for bound thread, and switch tcb for M:N thread at correct
time.


# 118817 12-Aug-2003 davidxu

Correctly set current tcb. This fixes some IA64/KSE problems.

Reviewed by: deischen, julian


# 118747 10-Aug-2003 davidxu

Initialize rtld lock just before turning on thread mode and
uninitialize rtld lock after thread mode shutdown.


# 118676 08-Aug-2003 davidxu

o Add code to GC freed KSEs and KSE groups
o Fix a bug in kse_free_unlocked(), kcb_dtor shouldn't be called because
the KSE is cached and will be reused in _kse_alloc().

Reviewed by: deischen


# 118516 05-Aug-2003 deischen

Don't call kse_set_curthread() when scheduling a new bound
thread. It should only be called by the current kse and
never by a KSE on behalf of another.

Submitted by: davidxu


# 118510 05-Aug-2003 deischen

Rethink the MD interfaces for libpthread to account for
archs that can (or are required to) have per-thread registers.

Tested on i386, amd64; marcel is testing on ia64 and will
have some follow-up commits.

Reviewed by: davidxu


# 117907 23-Jul-2003 deischen

Move idle kse wakeup to outside of regions where locks are held.
This eliminates ping-ponging of locks, where the idle KSE wakes
up only to find the lock it needs is being held. This gives
little or no gain to M:N mode but greatly speeds up 1:1 mode.

Reviewed & Tested by: davidxu


# 117715 18-Jul-2003 deischen

Cleanup thread accounting. Don't reset a threads timeslice
when it blocks; it only gets reset when it yields.

Properly set a thread's default stack guardsize.

Reviewed by: davidxu


# 117706 17-Jul-2003 davidxu

o Eliminate upcall for PTHREAD_SYSTEM_SCOPE thread, now it
is system bound thread and when it is blocked, no upcall is generated.

o Add ability to libkse to allow it run in pure 1:1 threading mode,
defining SYSTEM_SCOPE_ONLY in Makefile can turn on this option.

o Eliminate code for installing dummy signal handler for sigwait call.

o Add hash table to find thread.

Reviewed by: deischen


# 117345 08-Jul-2003 davidxu

Restore signal mask correctly after fork().


# 117344 08-Jul-2003 davidxu

Save and restore thread's error code around signal handling.

Reviewed by: deischen


# 117193 03-Jul-2003 davidxu

Check if thread is in critical region, only testing check_pending
is not enough.


# 117066 30-Jun-2003 davidxu

Because there are only _SIG_MAXSIG elements in thread siginfo array,
use [signal number - 1] as subscript to access the array.


# 117063 30-Jun-2003 davidxu

Remove surplus unlocking code I accidentally checked in. This won't be
triggered until LDT entry is exhausted.


# 116977 28-Jun-2003 davidxu

o Use a daemon thread to monitor signal events in kernel, if pending
signals were changed in kernel, it will retrieve the pending set and
try to find a thread to dispatch the signal. The dispatching process
can be rolled back if the signal is no longer in kernel.

o Create two functions _thr_signal_init() and _thr_signal_deinit(),
all signal action settings are retrieved from kernel when threading
mode is turned on, after a fork(), child process will reset them to
user settings by calling _thr_signal_deinit(). When threading mode
is not turned on, all signal operations are passed directly to the kernel.

o When a thread generates a synchronous signal and its context is returned
from the completed list, the UTS will retrieve the signal from its mailbox and
try to deliver the signal to the thread.

o Context signal mask is now only used when delivering signals, thread's
current signal mask is always the one in pthread structure.

o Remove have_signals field in pthread structure, replace it with
psf_valid in pthread_signal_frame. when psf_valid is true, in context
switch time, thread will backout itself from some mutex/condition
internal queues, then begin to process signals. when a thread is not
at blocked state and running, check_pending indicates there are signals
for the thread, after preempted and then resumed time, UTS will try to
deliver signals to the thread.

o At signal delivering time, not only pending signals in thread will be
scanned, process's pending signals will be scanned too.

o Change sigwait code a bit, remove field sigwait in pthread_wait_data,
replace it with oldsigmask in pthread structure, when a thread calls
sigwait(), its current signal mask is backuped to oldsigmask, and waitset
is copied to its signal mask and when the thread gets a signal in the
waitset range, its current signal mask is restored from oldsigmask,
these are done in atomic fashion.

o Two additional POSIX APIs are implemented, sigwaitinfo() and sigtimedwait().

o Signal code locking is better than before; there are fewer race conditions.

o Temporarily disable most of the code in _kse_single_thread as it is not safe
after fork().


# 116771 23-Jun-2003 marcel

Untangle the inter-dependency of kse types and ksd types/functions
by moving the definition of struct ksd to pthread_md.h and removing
the inclusion of ksd.h from thr_private.h (which has the definition
of struct kse and kse_critical_t). This allows ksd.h to have inline
functions that use struct kse and kse_critical_t and generally
yields a cleaner implementation at the cost of not having all ksd
related types/definitions in one header.

Implement the ksd functionality on ia64 by using inline functions
and permanently remove ksd.c from the ia64 specific makefile.

This change does not clean up the i386 specific version of ksd.h.

NOTE: The ksd code on ia64 abuses the tp register in the same way
as it is abused in libthr in that it is incompatible with the
runtime specification. This will be addressed when support for TLS
hits the tree.


# 116719 23-Jun-2003 marcel

Change the definition of _ksd_curkse, _ksd_curthread and
_ksd_readandclear_tmbx to be function-like. That way we
can define them as inline functions or create prototypes
for them.

This change allows the ksd interface on ia64 to be fully
inlined.


# 116060 08-Jun-2003 deischen

Insert threads at the end of the free thread list so that
the chance of getting the same thread id when allocating a
new thread is reduced. This won't work if the application
creates a new thread for every time a thread exits, but
we're still within the allowances of POSIX.
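
The effect of inserting at the tail can be sketched with the <sys/queue.h>
TAILQ macros: released thread structures go to the end of the free list and
reuse takes from the front, so the most recently freed structure (and its
id) is the last to come back. The struct and function names are
hypothetical; the real free list in the library carries more state.

```c
#include <stddef.h>
#include <sys/queue.h>

struct example_thread {
	int id;
	TAILQ_ENTRY(example_thread) link;
};

TAILQ_HEAD(free_list, example_thread);

void
free_list_init(struct free_list *fl)
{
	TAILQ_INIT(fl);		/* a tailq must be initialized before use */
}

/* Put a finished thread structure at the END of the free list. */
void
thread_release(struct free_list *fl, struct example_thread *t)
{
	TAILQ_INSERT_TAIL(fl, t, link);
}

/* Take the OLDEST free structure from the front, or NULL if none. */
struct example_thread *
thread_reuse(struct free_list *fl)
{
	struct example_thread *t;

	t = TAILQ_FIRST(fl);
	if (t != NULL)
		TAILQ_REMOVE(fl, t, link);
	return (t);
}
```

As the commit notes, this only helps when more than one structure is on the
free list; with exactly one free entry the same id is handed out again.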


# 115798 04-Jun-2003 davidxu

KMF_DONE is now in /sys/sys/kse.h, no longer need to define it here.


# 115278 24-May-2003 deischen

Change low-level locking a bit so that we can tell if
a lock is being waited on.

Fix races in join and cancellation.

When trying to wait on a CV and the library is not yet
threaded, make it threaded so that waiting actually works.

When trying to nanosleep() and we're not threaded, just
call the system call nanosleep instead of adding the thread
to the wait queue.

Clean up adding/removing new threads to the "all threads queue",
assigning them unique ids, and tracking how many active threads
there are. Do it all when the thread is added to the scheduling
queue instead of making pthread_create() know how to do it.

Fix a race where a thread could be marked for signal delivery
but it could be exited before we actually add the signal to it.

Other minor cleanups and bug fixes.

Submitted by: davidxu
Approved by: re@ (blanket for libpthread)


# 115173 19-May-2003 deischen

Eek, staticize a couple of functions that shouldn't
be external (initialize()!).

Remove cancellation points from _pthread_cond_wait and
_pthread_cond_timedwait (single underscore versions are
libc private functions). Point the weak reference(!) for
these functions to the versions with cancellation points.

Approved by: re@(blanket till 5/19)
Pointed out by: kan (cancellation point bug)


# 115080 16-May-2003 deischen

Add a method of yielding the current thread with the scheduler
lock held (_thr_sched_switch_unlocked()) and use this to avoid
dropping the scheduler lock and having the scheduler retake the
same lock again.

Add a better way of detecting if a low-level lock is in use.

When switching out a thread due to blocking in the UTS, don't
switch to the KSE's scheduler stack only to switch back to
another thread. If possible switch to the new thread directly
from the old thread and avoid the overhead of the extra
context switch.

Check for pending signals on a thread when entering the scheduler
and add them to the threads signal frame. This includes some
other minor signal fixes.

Most of this was a joint effort between davidxu and myself.

Reviewed by: davidxu
Approved by: re@ (blanket for libpthread)


# 114688 05-May-2003 davidxu

call dump_queues() only when DEBUG_THREAD_KERN is defined, save some
cpu cycles.


# 114664 04-May-2003 deischen

Fix suspend and resume.

Submitted (in part) by: Kazuaki Oda <kaakun@highway.ne.jp>


# 114295 30-Apr-2003 deischen

Move the mailbox to the beginning of the thread and align the
thread so that the context (SSE FPU state) is also aligned.


# 114267 29-Apr-2003 davidxu

Call kse_wakeup_multi() after removing the current thread from RUNQ to avoid
doing unnecessary idle kse wakeup.


# 114266 29-Apr-2003 davidxu

Call kse_wakeup_multi() to wakeup idle KSEs when there are threads ready
to run.


# 114187 28-Apr-2003 deischen

o Don't add a scope system thread's KSE to the list of available
KSEs when its thread exits; allow the GC handler to do that.

o Make spinlock/spinlock critical regions.

The following were submitted by davidxu

o Allow thr_switch() to take a null mailbox argument.

o Better protect cancellation checks.

o Don't set KSE specific data when creating new KSEs; rely on the
first upcall of the KSE to set it.

o Add the ability to set the maximum concurrency level and do this
automatically. We should have a way to enable/disable this with
some sort of tunable because some applications may not want this
to be the default.

o Hold the scheduling lock across thread switch calls.

o If scheduling of a thread fails, make sure to remove it from the list
of active threads.

o Better protect accesses to a joining thread when the target thread is
exited and detached.

o Remove some macro definitions that are now provided by <sys/kse.h>.

o Don't leave the library in threaded mode if creation of the initial
KSE fails.

o Wakeup idle KSEs when there are threads ready to run.

o Maintain the number of threads active in the priority queue.


# 113942 23-Apr-2003 deischen

Protect thread errno from being changed while operating
on behalf of the KSE.

Add a kse_reinit function to reinitialize a reused KSE.

Submitted by: davidxu


# 113881 22-Apr-2003 deischen

Set the quantum for scope system threads to 0 (no quantum).


# 113870 22-Apr-2003 deischen

Add a couple asserts to pthread_cond_foo to ensure the (low-level)
lock level is 0. Thus far, the threads implementation doesn't use
mutexes or condition variables so the lock level should be 0.

Save the return value when trying to schedule a new thread and
use this to return an error from pthread_create().

Change the max sleep time for an idle KSE to 1 minute from 2 minutes.

Maintain a count of the number of KSEs within a KSEG.

With these changes scope system threads seem to work, but heavy
use of them crash the kernel (supposedly VM bugs).


# 113786 21-Apr-2003 deischen

Add an i386-specific hack to always set %gs. There still seems
to be instances where the kernel doesn't properly save and/or
restore it.

Use noupcall and nocompleted flags in the KSE mailbox. These
require kernel changes to work which will be committed sometime
later. Things still work without the changes.

Remove the general kse entry function and use two different
functions -- one for scope system threads and one for scope
process threads. The scope system function is not yet enabled
and we use the same function for all threads at the moment.

Keep a copy of the KSE stack for the case that a KSE runs
a scope system thread and uses the same stack as the thread
(no upcalls are generated, so a separate stack isn't needed).
This isn't enabled yet.

Use a separate field for the KSE waiting flag. It isn't
correct to use the mailbox flags field.

The following fixes were provided by David Xu:

o Initialize condition variable locks with thread versions
of the low-level locking functions instead of the kse versions.

o Enable threading before creating the first thread instead
of after.

o Don't enter critical regions when trying to malloc/free
or call functions that malloc/free.

o Take the scheduling lock when inheriting thread attributes.

o Check the attribute's stack pointer instead of the
attributes stack size for null when allocating a
thread's stack.

o Add a kseg reinit function so we don't have to destroy and
then recreate the same lock.

o Check the return value of kse_create() and return an
appropriate error if it fails.

o Don't forget to destroy a thread's locks when freeing it.

o Examine the correct flags word for checking to see if
a thread is in a synchronization queue.

Things should now work on an SMP kernel.


# 113662 18-Apr-2003 deischen

Remove duplicate $FreeBSD$ id.


# 113661 18-Apr-2003 deischen

Sorry folks; I accidentally committed a patch from what I was working
on a couple of days ago. This should be the most recent changes.

Noticed by: davidxu


# 113658 18-Apr-2003 deischen

Revamp libpthread so that it has a chance of working in an SMP
environment. This includes support for multiple KSEs and KSEGs.

The ability to create more than 1 KSE via pthread_setconcurrency()
is in the works as well as support for PTHREAD_SCOPE_SYSTEM threads.
Those should come shortly.

There are still some known issues which davidxu and I are working
on, but it'll make it easier for us by committing what we have.

This library now passes all of the ACE tests that libc_r passes
with the exception of one. It also seems to work OK with KDE
including konqueror, kwrite, etc. I haven't been able to get
mozilla to run due to lack of java plugin, so I'd be interested
to see how it works with that.

Reviewed by: davidxu


# 111542 26-Feb-2003 davidxu

Fix compiling error.


# 111035 17-Feb-2003 mini

Deliver signals posted via an upcall to the appropriate thread.


# 107202 24-Nov-2002 mini

Schedule spinlocked threads by moving them through the work queue, instead
of the wait queue.

Approved by: re (blanket)
Stolen from: davidxu


# 107201 24-Nov-2002 mini

Get the wall clock time from the KSE mailbox, rather than doing another
syscall.


# 107102 20-Nov-2002 davidxu

Fix idle timeout bug, use correct current time of day.


# 106786 11-Nov-2002 mini

Schedule an idle context to block until timeouts expire without blocking
further upcalls.


# 106191 30-Oct-2002 mini

Use KSE to schedule threads.


# 103419 16-Sep-2002 mini

Make libpthread KSE aware.

Reviewed by: deischen, julian
Approved by: -arch


# 103388 16-Sep-2002 mini

Make the changes needed for libpthread to compile in its new home.
The new libpthread will provide POSIX threading support using KSE.
These files were previously repo-copied from src/lib/libc_r.

Reviewed by: deischen
Approved by: -arch


# 102546 28-Aug-2002 archie

When poll(2)'ing for readability or writability of a file descriptor
on behalf of a thread, we should check the POLLERR, POLLHUP, and
POLLNVAL flags as well to wake up the thread in these cases.

Suggested by: deischen
MFC after: 3 days
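
The wake-up rule in the commit above can be sketched as a small predicate
over the revents value returned by poll(2): a waiting thread is woken when
any requested event is reported, or when POLLERR, POLLHUP or POLLNVAL is
set. The function name is hypothetical.

```c
#include <poll.h>

/*
 * Hypothetical helper: should a thread waiting for `wanted` events
 * (e.g. POLLIN or POLLOUT) be woken, given poll(2)'s `revents`?
 */
int
fd_should_wake(short wanted, short revents)
{
	const short always = POLLERR | POLLHUP | POLLNVAL;

	return ((revents & (wanted | always)) != 0);
}
```

Testing only the requested bits would leave a thread asleep on a descriptor
that has hung up or become invalid.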


# 102411 25-Aug-2002 charnier

Replace various spelling with FALLTHROUGH which is lint()able


# 90431 09-Feb-2002 deischen

This has been sitting in my local tree long enough. Remove the use
of an alternate signal stack for handling signals. Let the kernel
send signals on the stack of the current thread and teach the threads
signal handler how to deliver signals to the current thread if it
needs to. Also, always store a threads context as a jmp_buf. Eventually
this will change to be a ucontext_t or mcontext_t.

Other small nits. Use struct pthread * instead of pthread_t in internal
library routines. The threads code wants struct pthread *, and pthread_t
doesn't necessarily have to be the same.

Reviewed by: jasone


# 84610 07-Oct-2001 deischen

Limit maximum poll interval to 60 seconds. This prevents an overflow
from occurring when converting from a timeval/timespec to a timeout in
milliseconds.

Submitted by: dwmalone
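
A sketch of the clamping idea, assuming the timeout arrives as a struct
timespec and is passed to poll(2) as an int number of milliseconds. The
seconds field is compared against the 60 second limit before any
multiplication, so a very large value cannot overflow the conversion. The
function and macro names are hypothetical.

```c
#include <time.h>

#define MAX_POLL_MS (60 * 1000)	/* 60 seconds, per the commit message */

/* Convert a relative timeout to milliseconds, capped at MAX_POLL_MS. */
int
timespec_to_poll_ms(const struct timespec *ts)
{
	if (ts->tv_sec < 0)
		return (0);
	if (ts->tv_sec >= 60)
		return (MAX_POLL_MS);	/* clamp before multiplying */
	return ((int)(ts->tv_sec * 1000 + ts->tv_nsec / 1000000));
}
```

A caller that needs to wait longer than the cap simply polls again after the
clamped interval expires.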


# 76280 04-May-2001 deischen

Move the check for a pending signals to after the thread has been
placed in any scheduling queue(s). The process of dispatching
signals to a thread can change its state which will attempt to add
or remove the thread from any scheduling queue to which it belongs.
This can break some assertions if the thread isn't in the queue(s)
implied by its state.

When dispatching a pending signal to a thread, be sure to
remove the signal from the threads set of pending signals.

PR: 27035
Tested by: brian
MFC in: 1 week


# 71581 24-Jan-2001 deischen

Add weak definitions for wrapped system calls. In general:

_foo - wrapped system call
foo - weak definition to _foo

and for cancellation points:

_foo - wrapped system call
__foo - enter cancellation point, call _foo(), leave
cancellation point
foo - weak definition to __foo

Change use of global _thread_run to call a function to get the
currently running thread.

Make all pthread_foo functions weak definitions to _pthread_foo,
where _pthread_foo is the implementation. This allows an application
to provide its own pthread functions.

Provide slightly different versions of pthread_mutex_lock and
pthread_mutex_init so that we can tell the difference between
a libc mutex and an application mutex. Threads holding mutexes
internal to libc should never be allowed to exit, call signal
handlers, or cancel.

Approved by: -arch
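
The _foo / __foo / foo naming scheme for cancellation points can be sketched
with the GCC/Clang weak alias attribute. This assumes an ELF toolchain; the
FreeBSD sources use their own macro for weak references, which is not shown
here. The function bodies and the cancel_depth counter are placeholders for
illustration only.

```c
/* Sketch for GCC/Clang on ELF; foo is the placeholder name from the log. */
int _foo(int x);
int __foo(int x);

static int cancel_depth;	/* stand-in for cancellation bookkeeping */

/* _foo: the wrapped system call (a trivial placeholder here). */
int
_foo(int x)
{
	return (x + 1);
}

/* __foo: enter cancellation point, call _foo(), leave cancellation point. */
int
__foo(int x)
{
	int ret;

	cancel_depth++;		/* "enter cancellation point" */
	ret = _foo(x);
	cancel_depth--;		/* "leave cancellation point" */
	return (ret);
}

/* foo: weak definition to __foo; an application may supply its own foo. */
int foo(int x) __attribute__((weak, alias("__foo")));
```

Because foo is weak, a strong definition of foo in the application takes
precedence at link time, while library code can keep calling _foo or __foo
directly.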


# 70231 20-Dec-2000 deischen

Enable check for pending signals after calling a signal handler.
Restoration of a threads signal mask after invocation of a signal
handler may allow pending signals to become deliverable.

PR: 23647


# 68835 16-Nov-2000 deischen

Delete 4 lines of misleading/incorrect comments.


# 68726 14-Nov-2000 deischen

When entering the scheduler from the signal handler, tell
the kernel to (re)use the alternate signal stack. In this
case, we don't return normally from the signal handler,
so the kernel still thinks we are using the signal stack.
This fixes a nasty bug where the signal handler can start
fiddling with the stack of a thread while the handler is
actually running on the same stack.

MFC candidate


# 68615 11-Nov-2000 deischen

Correct the logic for checking the emptiness of the waiting queue.
This fixes a potential problem where the file descriptors would not
be polled causing waiting threads to stay waiting. Doh!

MFC candidate.


# 68516 09-Nov-2000 deischen

Don't needlessly poll file descriptors when there are no
file descriptors needing to be polled (Doh!). Reported
by Dan Nelson <dnelson@emsphone.com>.

Don't install and start the scheduling timer until the
first thread is created. This prevents the overhead of
having a periodic scheduling signal in a single threaded
program. Reported by Dan Nelson <dnelson@emsphone.com>.

Allow builtin longjmps out of application installed
signal handlers without the need to perform any post-handler
cleanup:

o Change signal handling to save the thread's interrupted
context on the stack. The thread's current context is
now always stored in the same place (in the pthread).
If and when a signal handler returns, the interrupted
context is copied back to the storage area in the pthread.

o Before invoking a signal handler for a thread,
back the thread out of any internal waiting queues
(mutex, CV, join, etc) to which it belongs.

Rework uthread_info.c a bit to make it easier to change
the format of a thread dump.

Use an alternate signal stack for the thread library's
signal handler. This allows us to fiddle with the main
thread's stack without fear of it being in use.

Reviewed by: jasone


# 67097 13-Oct-2000 deischen

Implement zero system call thread switching. Performance of
thread switches should be on par with that under scheduler
activations.

o Timing is achieved through the use of a fixed interval
timer (ITIMER_PROF) to count scheduling ticks instead
of retrieving the time-of-day upon every thread switch
and calculating elapsed real time.

o Polling for I/O readiness is performed once for each
scheduling tick instead of every thread switch.

o The non-signal saving/restoring versions of setjmp/longjmp
are used to save and restore thread contexts. This may
allow the removal of _THREAD_SAFE macros from setjmp()
and longjmp() - needs more investigation.
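
A rough sketch of tick-based timing, under the assumption that a
periodic signal handler does nothing but bump a counter. All demo_
names are hypothetical; installing the handler with setitimer(2) and
ITIMER_PROF is left out so the fragment stays free of timing effects.

```c
#include <signal.h>

/* Scheduling ticks counted so far; only the handler writes it. */
volatile sig_atomic_t demo_ticks;

/* Stand-in for the interval timer's signal handler. */
void
demo_tick_handler(int sig)
{
	(void)sig;
	demo_ticks++;
}

/*
 * A time slice has expired once enough ticks have passed since it
 * started.  Unsigned subtraction keeps this correct across wrap-around.
 */
int
demo_slice_expired(unsigned int start, unsigned int now, unsigned int slice)
{
	return ((now - start) >= slice);
}

/* Elapsed time is derived from the tick count, not from the clock. */
unsigned long
demo_elapsed_usec(unsigned int ticks, unsigned long usec_per_tick)
{
	return ((unsigned long)ticks * usec_per_tick);
}
```

The point of the design is that a thread switch reads a counter
instead of making a time-of-day system call.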

Change signal handling so that signals are handled in the
context of the thread that is receiving the signal. When
signals are dispatched to a thread, a special signal handling
frame is created on top of the target thread's stack. The
frame contains the thread's saved state information and a new
context in which the thread can run. The application's signal
handler is invoked through a wrapper routine that knows how
to restore the thread's saved state and unwind to previous
frames.

Fix interruption of threads due to signals. Some states
were being improperly interrupted while other states were
not being interrupted. This should fix several PRs.

Signal handlers, which are invoked as a result of a process
signal (not by pthread_kill()), are now called with the
code (or siginfo_t if SA_SIGINFO was set in sa_flags) and
sigcontext_t as received from the process signal handler.

Modify the search for a thread to which a signal is delivered.
The search algorithm is now:

o First thread found in sigwait() with signal in wait mask.
o First thread found sigsuspend()'d on the signal.
o Current thread if signal is unmasked.
o First thread found with signal unmasked.
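
The search order above can be sketched roughly as follows. The
structure, the state names and the one-bit-per-signal masks are
simplifications invented for this example (signals below 32 only);
they are not the library's data structures.

```c
#include <stddef.h>

enum demo_state { DEMO_RUNNING, DEMO_SIGWAIT, DEMO_SIGSUSPEND, DEMO_OTHER };

struct demo_thread {
	enum demo_state	state;
	unsigned int	waitmask;	/* signals awaited in sigwait() */
	unsigned int	sigmask;	/* signals currently blocked */
};

#define	DEMO_SIGBIT(sig)	(1u << (sig))

/* Return the thread that should receive sig, or NULL if none qualifies. */
struct demo_thread *
demo_pick_target(struct demo_thread *t, size_t n, struct demo_thread *curr,
    int sig)
{
	unsigned int bit = DEMO_SIGBIT(sig);
	size_t i;

	/* 1. First thread in sigwait() with the signal in its wait mask. */
	for (i = 0; i < n; i++)
		if (t[i].state == DEMO_SIGWAIT && (t[i].waitmask & bit) != 0)
			return (&t[i]);
	/* 2. First thread sigsuspend()'d with the signal unblocked. */
	for (i = 0; i < n; i++)
		if (t[i].state == DEMO_SIGSUSPEND && (t[i].sigmask & bit) == 0)
			return (&t[i]);
	/* 3. The current thread, if it has the signal unmasked. */
	if (curr != NULL && (curr->sigmask & bit) == 0)
		return (curr);
	/* 4. First thread found with the signal unmasked. */
	for (i = 0; i < n; i++)
		if ((t[i].sigmask & bit) == 0)
			return (&t[i]);
	return (NULL);
}
```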

Collapse machine dependent support into macros defined in
pthread_private.h. These should probably eventually be moved
into separate MD files.

Change the range of settable priorities to be compliant with
POSIX (0-31). The threads library uses higher priorities
internally for real-time threads (not yet implemented) and
threads executing signal handlers. Real-time threads and
threads running signal handlers add 64 and 32, respectively,
to a thread's base priority.
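
As a small worked example of the arithmetic, the sketch below derives
an internal priority from a POSIX base priority. The function and
macro names are hypothetical, and whether both offsets can apply to
the same thread at once is an assumption of this example.

```c
#define	DEMO_PRIO_MIN		0
#define	DEMO_PRIO_MAX		31	/* POSIX-settable range is 0-31 */
#define	DEMO_SIGHANDLER_OFFSET	32	/* thread is running a signal handler */
#define	DEMO_RT_OFFSET		64	/* real-time thread */

/* Return the internal priority, or -1 if base is outside 0-31. */
int
demo_active_priority(int base, int is_realtime, int in_sighandler)
{
	int prio = base;

	if (base < DEMO_PRIO_MIN || base > DEMO_PRIO_MAX)
		return (-1);
	if (is_realtime)
		prio += DEMO_RT_OFFSET;
	if (in_sighandler)
		prio += DEMO_SIGHANDLER_OFFSET;
	return (prio);
}
```

With these offsets the three bands never overlap: 0-31 for ordinary
threads, 32-63 while in a signal handler, 64 and up for real-time.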

Some other small changes and cleanups.

PR: 17757 18559 21943
Reviewed by: jasone


# 64346 07-Aug-2000 jlemon

Add wrapper for kevent() syscall

Noted as missing by: nicolas.leonard@animaths.com


# 58094 15-Mar-2000 deischen

Fix pthread_suspend_np/pthread_resume_np. For the record, suspending a
thread waiting on an event (I/O, condvar, etc) will, when resumed using
pthread_resume_np, return with EINTR. For example, suspending and resuming
a thread blocked on read() will not requeue the thread for the read, but
will return -1 with errno = EINTR. If the suspended thread is in a critical
region, the thread is suspended as soon as it leaves the critical region.

Fix a bogon in pthread_kill() where a signal was being delivered twice
to threads waiting in sigwait().

Reported by (suspend/resume bug): jdp
Reviewed by: jasone


# 56310 20-Jan-2000 jasone

Do signal deferral for pthread_kill() as it was done in the old days.

Submitted by: deischen


# 56277 19-Jan-2000 jasone

Implement continuations to correctly handle [sig|_]longjmp() inside of a
signal handler. Explicitly check for jumps to anywhere other than the
current stack, since such jumps are undefined according to POSIX.

While we're at it, convert thread cancellation to use continuations, since
it's cleaner than the original cancellation code.

Avoid delivering a signal to a thread twice. This was a pre-existing bug,
but was likely unexposed until these other changes were made.

Defer signals generated by pthread_kill() so that they can be delivered on
the appropriate stack. deischen claims that this is unnecessary, which is
likely true, but without this change, pthread_kill() can cause undefined
priority queue states and/or PANICs in [sig|_]longjmp(), so I'm leaving
this in for now. To compile this code out and exercise the bug, define
the _NO_UNDISPATCH cpp macro. Defining _PTHREADS_INVARIANTS as well will
cause earlier crashes.

PR: kern/14685
Collaboration with: deischen


# 54708 16-Dec-1999 deischen

Fix problems with cancellation while in critical regions.

o Cancellation flags were not getting properly set/cleared.
o Loops waiting for internal locks were not being exited
correctly by a cancelled thread.
o Minor spelling (cancelation -> cancellation) and formatting
corrections (missing tab).

Found by: tg
Reviewed by: jasone


# 54707 16-Dec-1999 deischen

Fixes for signal handling:

o Don't call signal handlers with the signal handler access lock
held.
o Remove pending signals before calling signal handlers. If
pending signals were not removed prior to handling them,
invocation of the handler could cause the handler to be
called more than once for the same signal. Found by: JB
o When SIGCHLD arrives, wake up all threads in PS_WAIT_WAIT
(wait4).

PR: bin/15328
Reviewed by: jasone


# 53812 28-Nov-1999 alfred

add pthread_cancel, obtained from OpenBSD.

eischen (Daniel Eischen) added wrappers to protect against canceled
threads orphaning internal resources.

the cancelability code is still a bit fuzzy but works for test
programs of my own, OpenBSD's and some examples from ORA's books.

add readdir_r to both libc and libc_r

add some 'const' attributes to function parameters

Reviewed by: eischen, jasone


# 51794 29-Sep-1999 marcel

sigset_t change (part 5 of 5)
-----------------------------

Most of the userland changes are in libc. For both the alpha
and the i386 setjmp has been changed to accommodate the
new sigset_t. Internally, libc is mostly rewritten to use the
new syscalls. The exception is in compat-43/sigcompat.c

The POSIX thread library has also been rewritten to use the
new sigset_t, except that it currently only handles NSIG
signals instead of the maximum _SIG_MAXSIG. This should not
be a problem because current applications don't use any
signals higher than NSIG.

There are version bumps for the following libraries:
libdialog
libreadline
libc
libc_r
libedit
libftpio
libss

These libraries either a) have one of the modified structures
visible in the interface, or b) use sigset_t internally and
may cause breakage if new binaries are used against libraries
that don't have the sigset_t change. This is not an immediate
issue, but will be as soon as applications start using the
new range to its fullest.

NOTE: libncurses already had a version bump and has not been
given one now.

NOTE: doscmd is a real casualty and has been disconnected for
the moment. Reconnection will eventually happen after
doscmd has been fixed. I'm aware that being the last one
to touch it, I'm automatically promoted to being maintainer.
According to good taste this means that I will receive a
badge which either will be glued or mechanically stapled,
drilled or otherwise violently forced onto me :-)

NOTE: pcvt/vttest cannot be compiled with -traditional. The
change causes sys/types to be included along the way which
contains the const and volatile modifiers. I don't consider
this a solution, but more a workaround.


# 50476 27-Aug-1999 peter

$Id$ -> $FreeBSD$


# 50057 19-Aug-1999 alfred

Handle under/overflow of time values in a more robust manner;
there may be an overflow that needs to be adjusted more than once.
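
A minimal sketch of what adjusting more than once can look like,
assuming the time value is a struct timespec. The function name is
hypothetical and this is not the committed code; the point is only
that a loop, unlike a single if, copes with a nanosecond field that
is off by several seconds.

```c
#include <time.h>

#define	DEMO_NSEC_PER_SEC	1000000000L

/* Bring tv_nsec back into [0, 1e9), carrying into tv_sec as needed. */
void
demo_timespec_normalize(struct timespec *ts)
{
	/* Overflow: may need more than one carry. */
	while (ts->tv_nsec >= DEMO_NSEC_PER_SEC) {
		ts->tv_sec++;
		ts->tv_nsec -= DEMO_NSEC_PER_SEC;
	}
	/* Underflow: may need more than one borrow. */
	while (ts->tv_nsec < 0) {
		ts->tv_sec--;
		ts->tv_nsec += DEMO_NSEC_PER_SEC;
	}
}
```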

Pointed out by: Fabian Thylmann <fthylmann@stats.net>

Reviewed by: eivind, jb


# 49661 12-Aug-1999 deischen

Add check for runnable threads before polling file descriptors.

Submitted by: tegge


# 48046 20-Jun-1999 jb

In the words of the author:

o The polling mechanism for I/O readiness was changed from
select() to poll(). In addition, a wrapped version of poll()
is now provided.

o The wrapped select routine now converts each fd_set to a
poll array so that the thread scheduler doesn't have to
perform a bitwise search for selected fds each time file
descriptors are polled for I/O readiness.

o The thread scheduler was modified to use a new queue (_workq)
for threads that need work. Threads waiting for I/O readiness
and spinblocks are added to the work queue in addition to the
waiting queue. This reduces the time spent forming/searching
the array of file descriptors being polled.

o The waiting queue (_waitingq) is now maintained in order of
thread wakeup time. This allows the thread scheduler to
find the nearest wakeup time by looking at the first thread
in the queue instead of searching the entire queue.

o Removed file descriptor locking for select/poll routines. An
application should not rely on the threads library for providing
this locking; if necessary, the application should use mutexes
to protect selecting/polling of file descriptors.

o Retrieve and use the kernel clock rate/resolution at startup
instead of hardcoding the clock resolution to 10 msec (tested
with kernel running at 1000 HZ).

o All queues have been changed to use queue.h macros. These
include the queues of all threads, dead threads, and threads
waiting for file descriptor locks.

o Added reinitialization of the GC mutex and condition variable
after a fork. Also prevented reallocation of the ready queue
after a fork.

o Prevented the wrapped close routine from closing the thread
kernel pipes.

o Initialized file descriptor table for stdio entries at thread
init.

o Provided additional flags to indicate to what queues threads
belong.

o Moved TAILQ initialization for statically allocated mutex and
condition variables to after the spinlock.

o Added dispatching of signals to pthread_kill. Removing the
dispatching of signals from thread activation broke sigsuspend
when pthread_kill was used to send a signal to a thread.

o Temporarily set the state of a thread to PS_SUSPENDED when it
is first created and placed in the list of threads so that it
will not be accidentally scheduled before becoming a member
of one of the scheduling queues.

o Change the signal handler to queue signals to the thread kernel
pipe if the scheduling queues are protected. When scheduling
queues are unprotected, signals are then dequeued and handled.

o Ensured that all installed signal handlers block the scheduling
signal and that the scheduling signal handler blocks all
other signals. This ensures that the signal handler is only
interruptible for and by non-scheduling signals. An atomic
lock is used to decide which instance of the signal handler
will handle pending signals.

o Removed _lock_thread_list and _unlock_thread_list as they are
no longer used to protect the thread list.

o Added missing RCS IDs to modified files.

o Added checks for appropriate queue membership and activity when
adding, removing, and searching the scheduling queues. These
checks add very little overhead and are enabled when compiled
with _PTHREADS_INVARIANTS defined. Suggested and implemented
by Tor Egge with some modification by me.

o Close a race condition in uthread_close. (Tor Egge)

o Protect the scheduling queues while modifying them in
pthread_cond_signal and _thread_fd_unlock. (Tor Egge)

o Ensure that when a thread gets a mutex, the mutex is on that
thread's list of owned mutexes. (Tor Egge)

o Set the kernel-in-scheduler flag in _thread_kern_sched_state
and _thread_kern_sched_state_unlock to prevent a scheduling
signal from calling the scheduler again. (Tor Egge)

o Don't use TAILQ_FOREACH macro while searching the waiting
queue for threads in a sigwait state, because a change of
state destroys the TAILQ link. It is actually safe to do
so, though, because once a sigwaiting thread is found, the
loop ends and the function returns. (Tor Egge)

o When dispatching signals to threads, make the thread inherit
the signal deferral flag of the currently running thread.
(Tor Egge)
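
To illustrate the fd_set to poll array conversion mentioned in the
list above, here is a minimal sketch. The function name and the
choice to map read sets to POLLIN and write sets to POLLOUT are
assumptions made for this example; exception sets are left out.

```c
#include <stddef.h>
#include <poll.h>
#include <sys/select.h>

/*
 * Build a pollfd array from select()-style sets.  Either set may be
 * NULL.  Returns the number of entries written, or -1 if out[] is
 * too small.
 */
int
demo_fdsets_to_pollfds(int nfds, fd_set *rd, fd_set *wr,
    struct pollfd *out, int outlen)
{
	int fd, n = 0;

	for (fd = 0; fd < nfds; fd++) {
		short events = 0;

		if (rd != NULL && FD_ISSET(fd, rd))
			events |= POLLIN;
		if (wr != NULL && FD_ISSET(fd, wr))
			events |= POLLOUT;
		if (events == 0)
			continue;	/* fd not selected in any set */
		if (n == outlen)
			return (-1);
		out[n].fd = fd;
		out[n].events = events;
		out[n].revents = 0;
		n++;
	}
	return (n);
}
```

Doing the bit scan once, at the time of the select() call, means the
scheduler can hand the resulting array to poll() as is on each pass.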

Submitted by: Daniel Eischen <eischen@vigrid.com> and
Tor Egge <Tor.Egge@fast.no>


# 46680 08-May-1999 jasone

Back out patch for cond_timedwait() bug from -current, since other changes
have made the patch obsolete, as pointed out by Daniel Eischen
<eischen@vigrid.com>.

PR: bin/8872


# 46628 07-May-1999 jasone

Apply patch included in bin/8872. This fixes a bug that occurs when
pthread_cond_timedwait() times out.

PR: bin/8872
Submitted by: Jason Evans <jasone@canonware.com>
Reviewed by: David Schwartz <davids@webmaster.com>


# 44963 23-Mar-1999 jb

[ The author's description... ]

o Runnable threads are now maintained in priority queues. The
implementation requires two things:

1.) The priority queues must be protected during insertion
and removal of threads. Since the kernel scheduler
must modify the priority queues, a spinlock for
protection cannot be used. The functions
_thread_kern_sched_defer() and _thread_kern_sched_undefer()
were added to {un}defer kernel scheduler activation.

2.) A thread (active) priority change can be performed only
when the thread is removed from the priority queue. The
implementation uses a thread's active priority when
inserting it into the queue.

A by-product is that thread switches are much faster. A
separate queue is used for waiting and/or blocked threads,
and it is searched at most 2 times in the kernel scheduler
when there are active threads. It should be possible to
reduce this to once by combining polling of threads waiting
on I/O with the loop that looks for timed out threads and
the minimum timeout value.

o Functions to defer kernel scheduler activation were added. These
are _thread_kern_sched_defer() and _thread_kern_sched_undefer()
and may be called recursively. These routines do not block the
scheduling signal, but latch its occurrence. The signal handler
will not call the kernel scheduler when the running thread has
deferred scheduling, but it will be called when the running thread
undefers scheduling.

o Added support for _POSIX_THREAD_PRIORITY_SCHEDULING. All the
POSIX routines required by this should now be implemented.
One note, SCHED_OTHER, SCHED_FIFO, and SCHED_RR are required
to be defined by including pthread.h. These defines are currently
in sched.h. I modified pthread.h to include sched.h but don't
know if this is the proper thing to do.

o Added support for priority protection and inheritance mutexes.
This allows definition of _POSIX_THREAD_PRIO_PROTECT and
_POSIX_THREAD_PRIO_INHERIT.

o Added additional error checks required by POSIX for mutexes and
condition variables.

o Provided a wrapper for sigpending which is marked as a hidden
syscall.

o Added a non-portable function as a debugging aid to allow an
application to monitor thread context switches. An application
can install a routine that gets called every time a thread
(explicitly created by the application) gets context switched.
The routine gets passed the pthread IDs of the threads that are
being switched in and out.
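
The defer/undefer behaviour described in the list above (recursive
calls, signal latched rather than blocked) can be sketched roughly
like this. Every demo_ name is hypothetical, and the scheduler is
reduced to a counter so the control flow is visible.

```c
#include <signal.h>

volatile sig_atomic_t demo_defer_count;		/* nesting depth of defers */
volatile sig_atomic_t demo_sched_pending;	/* latched scheduling signal */
int demo_sched_calls;				/* times the scheduler ran */

/* Stand-in for entering the kernel scheduler. */
void
demo_scheduler(void)
{
	demo_sched_calls++;
}

/* May be called recursively; each call needs a matching undefer. */
void
demo_sched_defer(void)
{
	demo_defer_count++;
}

/* The outermost undefer runs the scheduler if a signal was latched. */
void
demo_sched_undefer(void)
{
	if (demo_defer_count > 0 && --demo_defer_count == 0 &&
	    demo_sched_pending) {
		demo_sched_pending = 0;
		demo_scheduler();
	}
}

/* Stand-in for the scheduling signal handler. */
void
demo_sched_signal(void)
{
	if (demo_defer_count > 0)
		demo_sched_pending = 1;	/* latch it, do not schedule now */
	else
		demo_scheduler();
}
```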

Submitted by: Dan Eischen <eischen@vigrid.com>

Changes by me:

o Added a PS_SPINBLOCK state to deal with the priority inversion
problem most often (I think) seen by threads calling malloc/free/realloc.

o Dispatch signals to the running thread directly rather than at a
context switch to avoid the situation where the switch never occurs.


# 41164 15-Nov-1998 jb

Close a window between unlocking a spinlock and changing the thread state.


# 40127 09-Oct-1998 dt

Fix some bugs in pthread scheduler:
make pthread_yield() more reliable,
threads are always (I hope) preempted at least every 0.1 sec, as intended.

PR: bin/7744
Submitted by: "Richard Seaman, Jr." <dick@tar.com>


# 39807 30-Sep-1998 jb

Move the cleanup code that frees memory allocated for a dead thread from
the thread kernel into a garbage collector thread which is started when
the first thread is created (other than the initial thread). This
removes the window of opportunity where a context switch will cause a
thread that has locked the malloc spinlock, to enter the thread kernel,
find there is a dead thread and try to free memory, therefore trying
to lock the malloc spinlock against itself.

The garbage collector thread acts just like any other thread, so
instead of having a spinlock to control accesses to the dead thread
list, it uses a mutex and a condition variable so that it can happily
wait to be signalled when a thread exits.


# 38925 07-Sep-1998 alex

Removed unused variables.


# 35564 30-Apr-1998 jb

Fix the incremental priority increment.

PR: bin/6467 Marino Ladavac <lada@pc8811.gud.siemens.at>


# 35509 29-Apr-1998 jb

Change signal model to match POSIX (i.e. one set of signal handlers
for the process, not a separate set for each thread). By default, the
process now only has signal handlers installed for SIGVTALRM, SIGINFO
and SIGCHLD. The thread kernel signal handler is installed for other
signals on demand. This means that SIG_IGN and SIG_DFL processing is now
left to the kernel, not the thread kernel.

Change the signal dispatch to no longer use a signal thread, and
call the signal handler using the stack of the thread that has the
signal pending.

Change the atomic lock method to use test-and-set asm code with
a yield if blocked. This introduces separate locks for each type
of object instead of blocking signals to prevent a context
switch. It was this blocking of signals that caused the performance
degradation that people have noted.
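
A minimal sketch of a test-and-set lock that yields when blocked. The
commit used assembly for the test-and-set; this example swaps in C11
atomic_flag and sched_yield() instead, and the demo_ names are
hypothetical.

```c
#include <sched.h>
#include <stdatomic.h>

typedef struct {
	atomic_flag locked;	/* set while the lock is held */
} demo_spinlock_t;

#define	DEMO_SPINLOCK_INITIALIZER	{ ATOMIC_FLAG_INIT }

/* Acquire: test-and-set, yielding the processor while the lock is held. */
void
demo_spinlock(demo_spinlock_t *l)
{
	while (atomic_flag_test_and_set_explicit(&l->locked,
	    memory_order_acquire))
		sched_yield();
}

/* Try once; returns 1 if the lock was acquired, 0 if already held. */
int
demo_spintrylock(demo_spinlock_t *l)
{
	return (!atomic_flag_test_and_set_explicit(&l->locked,
	    memory_order_acquire));
}

/* Release the lock. */
void
demo_spinunlock(demo_spinlock_t *l)
{
	atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

Giving each object its own such lock avoids the process-wide cost of
masking and unmasking signals around every critical section.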

This is a *big* change!


# 35246 17-Apr-1998 jb

When in PS_SIGWAIT state, still call signal handlers and set errno
to EINTR.


# 35130 11-Apr-1998 jb

Change the FILE locking to be by FILE, not by the underlying fd as
it was. Add a FILE_WAIT state and queue threads waiting for a FILE
lock. Start using the sys/queue.h macros instead of the way that MIT
pthreads did it.

Add a thread name to the private thread structure and a non-POSIX
function to set this. This helps (me at least) when sending a SIGINFO
to a threaded process to get a /tmp/uthread.dump to see what the
<expletive deleted> threads are doing this time. It is nice to be
able to recognise (yes, I spell that with an 's' too) which threads
are which.


# 34362 09-Mar-1998 jb

Add FreeBSD/Alpha code to initialise a jmpbuf for a created thread.
Change a bunch of __alpha references to __alpha__.


# 33292 12-Feb-1998 julian

Fixes from Jeremy Allison and Terry Lambert for pthreads:

specifically:
uthread_accept.c: Fix for inherited socket not getting correct entry in
pthread flags.
uthread_create.c: Fix to allow pthread_t pointer return to be null if
caller doesn't care about return.
uthread_fd.c: Fix for return codes to be placed into correct errno.
uthread_init.c: Changes to make gcc-2.8 thread aware for exception stack
frames (WARNING: This is #ifdef'ed out by default and is
different from the Cygnus egcs fix).
uthread_ioctl.c: Fix for blocking/non-blocking ioctl.
uthread_kern.c: Signal handling fixes (only one case left to fix,
that of an externally sent SIGSEGV and friends -
a fairly unusual case).
uthread_write.c: Fix for lock of fd - ask for write lock, not read/write.
uthread_writev.c: Fix for lock of fd - ask for write lock, not read/write.

Pthreads now works well enough to run the LDAP and ACAPD(with the gcc 2.8 fix)
sample implementations.


# 24520 01-Apr-1997 jb

Fix indentations. Sigh.


# 22315 05-Feb-1997 julian

Submitted by: John Birrell
uthreads update from the author.


# 18415 20-Sep-1996 nate

Remove now un-necessary FreeBSD specific code since our timespec
structure now has the correct member names.

Pointed out by: Peter Wemm


# 17706 20-Aug-1996 julian

Submitted by: John Birrell <cimaxp1!jb@werple.net.au>

Here are the diffs for libc_r to get it one step closer to P1003.1c
These make most of the thread/mutex/condvar structures opaque to the
user. There are three functions which have been renamed with _np
suffixes because they are extensions to P1003.1c (I did them for JAVA,
which needs to suspend/resume threads and also start threads suspended).

I've created a new header (pthread_np.h) for the non-POSIX stuff.

The egrep tags stuff in /usr/src/lib/libc_r/Makefile that I uncommented
doesn't work. I think it's best to delete it. I don't think libc_r needs
tags anyway, 'cause most of the source is in libc which does have tags.

also:

Here's the first batch of man pages for the thread functions.
The diff to /usr/src/lib/libc_r/Makefile removes some stuff that was
inherited from /usr/src/lib/libc/Makefile that should only be done with
libc.

also:

I should have sent this diff with the pthread(3) man page.
It allows people to type

make -DWANT_LIBC_R world

to get libc_r built with the rest of the world. I put this in the
pthread(3) man page. The default is still not to build libc_r.


also:
The diff attached adds a pthread(3) man page to /usr/src/share/man/man3.
The idea is that without libc_r installed, this man page will give people
enough info to know that they have to build libc_r.


# 13546 21-Jan-1996 julian

Reviewed by: julian
Submitted by: John Birrell

One version of the pthreads library;
another will follow with different actions in some cases.
Not QUITE complete.