#
285830 |
|
23-Jul-2015 |
gjb |
- Copy stable/10@285827 to releng/10.2 in preparation for 10.2-RC1 builds.
- Update newvers.sh to reflect RC1.
- Update __FreeBSD_version to reflect 10.2.
- Update default pkg(8) configuration to use the quarterly branch.[1]
Discussed with: re, portmgr [1]
Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation |
#
260817 |
|
17-Jan-2014 |
avg |
MFC r258622: dtrace sdt: remove the ugly sname parameter of SDT_PROBE_DEFINE
|
#
256281 |
|
10-Oct-2013 |
gjb |
Copy head (r256279) to stable/10 as part of the 10.0-RELEASE cycle.
Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation
|
#
246923 |
|
17-Feb-2013 |
pjd |
Update the comment: we do show the backtrace of misbehaving thread.
|
#
240424 |
|
12-Sep-2012 |
attilio |
Improve check coverage about idle threads.
Idle threads are not allowed to acquire any lock but spinlocks. Deny any attempt to do so by panicking at the locking operation when INVARIANTS is on. Then, remove the check on blocking on a turnstile. The check in sleepqueues is left in place because idle threads are not allowed to use tsleep() either, which could still happen.
Reviewed by: bde, jhb, kib
MFC after: 1 week
|
#
239585 |
|
22-Aug-2012 |
jhb |
Mark the idle threads as non-sleepable and also assert that an idle thread never blocks on a turnstile.
|
#
235459 |
|
15-May-2012 |
rstone |
Implement the DTrace sched provider. This implementation aims to be compatible with the sched provider implemented by Solaris and its open-source derivatives. Full documentation of the sched provider can be found on Oracle's DTrace wiki pages.
Note that for compatibility with scripts originally written for Solaris, several probes are defined that will never fire. These probes are defined to fire when Solaris-specific features perform certain actions. As these features are not present in FreeBSD, the probes can never fire.
Also, I have added two probes that are not defined in Solaris, lend-pri and load-change. These probes have been added to make it possible to collect schedgraph data with DTrace.
Finally, a few probes are defined in Solaris to take a cpuinfo_t * argument. As it was not immediately clear to me how to translate that to FreeBSD, currently those probes are passed NULL in place of a cpuinfo_t *.
Sponsored by: Sandvine Incorporated
MFC after: 2 weeks
|
#
234303 |
|
15-Apr-2012 |
davide |
Fix a typo.
Approved by: gnn (mentor)
MFC after: 2 days
|
#
234280 |
|
14-Apr-2012 |
marius |
Fix !DDB build after r234190.
|
#
234190 |
|
12-Apr-2012 |
jhb |
- Extend the KDB interface to add a per-debugger callback to print a backtrace for an arbitrary thread (rather than the calling thread). A kdb_backtrace_thread() wrapper function uses the configured debugger if possible, otherwise it falls back to using stack(9) if that is available.
- Replace a direct call to db_trace_thread() in propagate_priority() with a call to kdb_backtrace_thread() instead.
MFC after: 1 week
|
#
227309 |
|
07-Nov-2011 |
ed |
Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.
The SYSCTL_NODE macro defines a list that stores all child-elements of that node. If there's no SYSCTL_DECL macro anywhere else, there's no reason why it shouldn't be static.
|
#
218272 |
|
04-Feb-2011 |
jhb |
Always assert that the turnstile chain lock is held in turnstile_wait() and remove a duplicate hash lookup.
MFC after: 1 week
|
#
201879 |
|
09-Jan-2010 |
attilio |
Introduce the new kernel thread called "deadlock resolver". While the name is pretentious, a good explanation of its targets is reported in this 17-month-old presentation e-mail: http://lists.freebsd.org/pipermail/freebsd-arch/2008-August/008452.html
To implement it, the sq_type field in sleepqueues is now mandatory rather than compiled only with the INVARIANTS option. Additionally, a new sleepqueue function, sleepq_type(), is added, returning the type of the sleepqueue linked to a wchan. Three new sysctls are added to configure the thread:
debug.deadlkres.slptime_threshold
debug.deadlkres.blktime_threshold
debug.deadlkres.sleepfreq
The first two represent the thresholds for sleep and block time that, when exceeded, lead to a deadlock match, while sleepfreq represents the number of seconds between two consecutive runs of the thread. To enable the deadlock resolver thread, recompile your kernel with the option DEADLKRES.
Reviewed by: jeff
Tested by: pho, Giovanni Trematerra
Sponsored by: Nokia Incorporated, Sandvine Incorporated
MFC after: 2 weeks
|
#
200761 |
|
20-Dec-2009 |
ed |
Fix indentation.
|
#
183054 |
|
15-Sep-2008 |
sam |
Make ddb command registration dynamic so modules can extend the command set (only so long as the module is present):
o add db_command_register and db_command_unregister to add and remove commands, respectively
o replace linker sets with SYSINIT's (and SYSUINIT's) that register commands
o expose 3 list heads: db_cmd_table, db_show_table, and db_show_all_table for registering top-level commands, show operands, and show all operands, respectively
While here also:
o sort command lists
o add DB_ALIAS, DB_SHOW_ALIAS, and DB_SHOW_ALL_ALIAS to add aliases for existing commands
o add "show all trace" as an alias for "show alltrace"
o add "show all locks" as an alias for "show alllocks"
Submitted by: Guillaume Ballet <gballet@gmail.com> (original version)
Reviewed by: jhb
MFC after: 1 month
|
#
182879 |
|
08-Sep-2008 |
jhb |
- Reduce scope of #ifdef's in uma_zcreate() call in init_turnstile0().
- Set UMA_ZONE_NOFREE so that the per-turnstile spin locks are type stable to avoid a race where one thread might dereference a lock in a free'd turnstile that was previously used by another thread.
Theorized by: tegge (2)
MFC after: 1 week
|
#
178272 |
|
17-Apr-2008 |
jeff |
- Make SCHED_STATS more generic by adding a wrapper to create the variables and sysctl nodes.
- In reset walk the children of kern_sched_stats and reset the counters via the oid_arg1 pointer. This allows us to add arbitrary counters to the tree and still reset them properly.
- Define a set of switch types to be passed with flags to mi_switch(). These types are named SWT_*. These types correspond to SCHED_STATS counters and are automatically handled in this way.
- Make the new SWT_ types more specific than the older switch stats. There are now stats for idle switches, remote idle wakeups, remote preemption ithreads idling, etc.
- Add switch statistics for ULE's pickcpu algorithm. These stats include how much migration there is, how often affinity was successful, how often threads were migrated to the local cpu on wakeup, etc.
Sponsored by: Nokia
|
#
176078 |
|
07-Feb-2008 |
jeff |
- Add THREAD_LOCKPTR_ASSERT() to assert that the thread's lock points at the provided lock or &blocked_lock. The thread may be temporarily assigned to the blocked_lock by the scheduler so a direct comparison cannot always be made.
- Use THREAD_LOCKPTR_ASSERT() in the primary consumers of the scheduling interfaces. The schedulers themselves still use more explicit asserts.
Sponsored by: Nokia
|
#
176017 |
|
06-Feb-2008 |
jeff |
Adaptive spinning in write path with readers and writer starvation avoidance.
- Move recursion checking into rwlock inlines to free a bit for use with adaptive spinners.
- Clear the RW_LOCK_WRITE_SPINNERS flag whenever the lock state changes causing write spinners to restart their loop.
- Write spinners are limited by a count while readers hold the lock as there is no way to know for certain whether readers are running still.
- In the read path block if there are write waiters or spinners to avoid starving writers. Use a new per-thread count, td_rw_rlocks, to skip starvation avoidance if it might cause a deadlock.
- Remove or change invalid assertions in turnstiles.
Reviewed by: attilio (developed parts of the patch as well)
Sponsored by: Nokia
|
#
173600 |
|
14-Nov-2007 |
julian |
Generally we are interested in which thread did something as opposed to which process. Since threads by default have the name of the process unless overwritten with more useful information, just print the thread name instead.
|
#
170640 |
|
12-Jun-2007 |
jeff |
- Include opt_sched.h for SCHED_STATS.
|
#
170295 |
|
04-Jun-2007 |
jeff |
Commit 3/14 of sched_lock decomposition.
- Add a per-turnstile spinlock to solve potential priority propagation deadlocks that are possible with thread_lock().
- The turnstile lock order is defined as the exact opposite of the lock order used with the sleep locks they represent. This allows us to walk in reverse order in priority_propagate and this is the only place we wish to multiply acquire turnstile locks.
- Use the turnstile_chain lock to protect assigning mutexes to turnstiles.
- Change the turnstile interface to pass back turnstile pointers to the consumers. This allows us to reduce some locking and makes it easier to cancel turnstile assignment while the turnstile chain lock is held.
Tested by: kris, current@
Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
|
#
169666 |
|
18-May-2007 |
jeff |
- Convert turnstiles and sleepqueues to use UMA. This provides a modest speedup and will be more useful after each gains a spinlock in the impending thread_lock() commit.
- Move initialization and asserts into init/fini routines. fini routines are only needed in the INVARIANTS case for now.
Submitted by: Attilio Rao <attilio@FreeBSD.org>
Tested by: kris, jeff
|
#
166188 |
|
23-Jan-2007 |
jeff |
- Remove setrunqueue and replace it with direct calls to sched_add(). setrunqueue() was mostly empty. The few asserts and thread state setting were moved to the individual schedulers. sched_add() was chosen to displace it for naming consistency reasons.
- Remove adjustrunqueue, it was 4 lines of code that was ifdef'd to be different on all three schedulers where it was only called in one place each.
- Remove the long ifdef'd out remrunqueue code.
- Remove the now redundant ts_state. Inspect the thread state directly.
- Don't set TSF_* flags from kern_switch.c, we were only doing this to support a feature in one scheduler.
- Change sched_choose() to return a thread rather than a td_sched. Also, rely on the schedulers to return the idlethread. This simplifies the logic in choosethread(). Aside from the run queue links kern_switch.c mostly does not care about the contents of td_sched.
Discussed with: julian
- Move the idle thread loop into the per scheduler area. ULE wants to do something different from the other schedulers.
Suggested by: jhb
Tested on: x86/amd64 sched_{4BSD, ULE, CORE}.
|
#
166073 |
|
17-Jan-2007 |
delphij |
Use FOREACH_PROC_IN_SYSTEM instead of using its unrolled form.
|
#
165946 |
|
11-Jan-2007 |
jhb |
Wrap propagate_priority() in a critical section to prevent unwanted preemptions when adjusting the priority of a thread that is on a run queue. This was only observed when FULL_PREEMPTION was enabled.
Reported by: kris
Diagnosed by: ups
MFC after: 1 week
|
#
161337 |
|
15-Aug-2006 |
jhb |
Add a new 'show sleepchain' ddb command similar to 'show lockchain' except that it operates on lockmgr and sx locks. This can be useful for tracking down vnode deadlocks in VFS for example. Note that this command is a bit more fragile than 'show lockchain' as we have to poke around at the wait channel of a thread to see if it points to either a struct lock or a condition variable inside of a struct sx. If td_wchan points to something unmapped, then this command will terminate early due to a fault, but no harm will be done.
|
#
161324 |
|
15-Aug-2006 |
jhb |
Rename 'show lockchain' to 'show locktree' and 'show threadchain' to 'show lockchain'. The churn is because I'm about to add a new 'show sleepchain' similar to 'show lockchain' for sleep locks (lockmgr and sx) and 'show threadchain' was a bit ambiguous as both commands show a chain of thread dependencies, 'lockchain' is for non-sleepable locks (mtx and rw) and 'sleepchain' is for sleepable locks.
|
#
160313 |
|
12-Jul-2006 |
jhb |
Honor db_pager_quit in 'show threadchain', 'show allchains', and 'show lockchain'. This is especially helpful for the first 2 as a threadchain could get stuck in an infinite loop during a mutex deadlock.
|
#
158031 |
|
25-Apr-2006 |
jhb |
Add some new commands to hopefully make it easier to diagnose lock-related problems in ddb:
- "show threadchain [thread]" will start with the specified thread (or the current kdb thread by default) and show its state. If it is blocked on a lock, it will find the owner of the lock and show its state, etc.
- "show allchains" will find all of the threads that are blocked on a lock (but do not have any threads blocked on a lock they hold) and show the resulting thread chain.
- "show lockchain <lock>" takes a pointer to a lock_object (such as a mutex or rwlock). If there is a turnstile for that lock, then it will display all the threads blocked on the lock. In addition, for each thread blocked on the lock, it will display any contested locks they hold, and recurse on those locks to show any threads blocked on those locks, etc.
|
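The thread-to-owner walk that "show threadchain" describes can be sketched in user space. This is only an illustrative model, not the ddb code: the ch_thread and ch_lock types, the ch_walk() function, and the max_steps bound are all hypothetical names introduced here. The bound is an assumption standing in for a way out of the endless loop that a deadlocked chain would otherwise cause.

```c
#include <assert.h>
#include <stddef.h>

struct ch_thread;

/* Hypothetical stand-in for a lock: only the owner matters here. */
struct ch_lock {
    struct ch_thread *owner;        /* NULL if the lock is unowned */
};

/* Hypothetical stand-in for a thread: only what it is blocked on. */
struct ch_thread {
    struct ch_lock *blocked_on;     /* NULL if the thread is not blocked */
};

/*
 * Follow thread -> lock -> owner -> lock -> ... starting at td.
 * Returns the number of threads visited, or -1 if the walk did not
 * terminate within max_steps threads (for example because the chain
 * is a cycle, i.e. a deadlock).
 */
static int ch_walk(const struct ch_thread *td, int max_steps)
{
    int visited = 0;

    assert(max_steps > 0);
    while (td != NULL) {
        if (visited == max_steps)
            return -1;              /* give up: too long or cyclic */
        visited++;
        if (td->blocked_on == NULL)
            break;                  /* end of the chain: a runnable thread */
        td = td->blocked_on->owner; /* step to the owner of the lock */
    }
    return visited;
}
```

With three threads where the first is blocked on a lock held by the second, and the second on a lock held by the third, ch_walk() visits three threads; if the third then blocks on a lock held by the first, the walk hits the bound and reports -1.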
#
157952 |
|
21-Apr-2006 |
jhb |
Print td_name instead of p_comm if td_name is non-empty for 'show turnstile' and 'show sleepq'.
|
#
157844 |
|
18-Apr-2006 |
jhb |
- Bring back turnstile_empty() which can check to see if an individual queue on a turnstile is empty.
- Add a turnstile_disown() function that allows a thread to give up ownership of a turnstile w/o waking up any waiters.
|
#
157275 |
|
29-Mar-2006 |
jhb |
Always explicitly panic in propagate_priority() if we try to propagate a lock's priority to a sleeping thread. When we panic, dump a stack trace of the thread that is asleep if DDB is compiled into the kernel just before calling panic(). This is much more informative and useful for debugging than the current behavior of getting a page fault and not having an easy way of determining which thread caused the original problem.
MFC after: 1 week
|
#
154937 |
|
27-Jan-2006 |
jhb |
- Add support for having both a shared and exclusive queue of threads in each turnstile. Also, allow for the owner thread pointer of a turnstile to be NULL. This is needed for the upcoming reader/writer lock implementation.
- Add a new ddb command 'show turnstile' that will look up the turnstile associated with the given lock argument and display useful information like the list of threads blocked on each queue, etc. If there isn't an active turnstile for a lock at the specified address, then the function will see if there is an active turnstile at the specified address and display info about it if so.
- Adjust the mutex code to handle the turnstile API changes.
Tested on: i386 (all), alpha, amd64, sparc64 (1 and 3)
|
#
154482 |
|
17-Jan-2006 |
jhb |
Initialize thread0.td_contested in init_turnstiles() rather than mutex_init() as it is used by the turnstile code and is not mutex-specific.
|
#
154480 |
|
17-Jan-2006 |
jhb |
Garbage collect turnstile_empty() since it is unused.
|
#
150727 |
|
29-Sep-2005 |
jhb |
Trim a couple of unneeded includes.
|
#
141616 |
|
10-Feb-2005 |
phk |
Make a bunch of malloc types static.
Found by: src/tools/tools/kernxref
|
#
139453 |
|
30-Dec-2004 |
jhb |
Rework the interface between priority propagation (lending) and the schedulers a bit to ensure more correct handling of priorities and fewer priority inversions:
- Add two functions to the sched(9) API to handle priority lending: sched_lend_prio() and sched_unlend_prio(). The turnstile code uses these functions to ask the scheduler to lend a thread a set priority and to tell the scheduler when it thinks it is ok for a thread to stop borrowing priority. The unlend case is slightly complex in that the turnstile code tells the scheduler what the minimum priority of the thread needs to be to satisfy the requirements of any other threads blocked on locks owned by the thread in question. The scheduler then decides whether the thread can go back to normal mode (if its normal priority is high enough to satisfy the pending lock requests) or if it should continue to use the priority specified to the sched_unlend_prio() call. This involves adding a new per-thread flag TDF_BORROWING that replaces the ULE-only kse flag for priority elevation.
- Schedulers now refuse to lower the priority of a thread that is currently borrowing another thread's priority.
- If a scheduler changes the priority of a thread that is currently sitting on a turnstile, it will call a new function turnstile_adjust() to inform the turnstile code of the change. This function resorts the thread on the priority list of the turnstile if needed, and if the thread ends up at the head of the list (due to having the highest priority) and its priority was raised, then it will propagate that new priority to the owner of the lock it is blocked on.
Some additional fixes specific to the 4BSD scheduler include:
- Common code for updating the priority of a thread when the user priority of its associated kse group has been consolidated in a new static function resetpriority_thread(). One change to this function is that it will now only adjust the priority of a thread if it already has a time sharing priority, thus preserving any boosts from a tsleep() until the thread returns to userland. Also, resetpriority() no longer calls maybe_resched() on each thread in the group. Instead, the code calling resetpriority() is responsible for calling resetpriority_thread() on any threads that need to be updated.
- schedcpu() now uses resetpriority_thread() instead of just calling sched_prio() directly after it updates a kse group's user priority.
- sched_clock() now uses resetpriority_thread() rather than writing directly to td_priority.
- sched_nice() now updates all the priorities of the threads after the group priority has been adjusted.
Discussed with: bde
Reviewed by: ups, jeffr
Tested on: 4bsd, ule
Tested on: i386, alpha, sparc64
|
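The lend/unlend decision described in r139453 can be illustrated with a small user-space model. It is a sketch under stated assumptions rather than the scheduler code: the pl_thread type, the PL_BORROWING flag (a stand-in for TDF_BORROWING), and the pl_* functions are hypothetical, and the model assumes the kernel convention that a numerically lower value means a more urgent priority.

```c
#include <assert.h>

#define PL_BORROWING 0x1    /* hypothetical stand-in for TDF_BORROWING */

/* Hypothetical thread model; lower numeric value = more urgent. */
struct pl_thread {
    int base_pri;           /* the thread's own (normal) priority */
    int pri;                /* the priority it currently runs at */
    int flags;
};

/* Lend the thread a priority on behalf of a blocked waiter. */
static void pl_lend_prio(struct pl_thread *td, int prio)
{
    td->flags |= PL_BORROWING;
    td->pri = prio;
}

/*
 * Stop borrowing if possible. min_needed is the most urgent priority
 * still required by threads blocked on locks that td owns.
 */
static void pl_unlend_prio(struct pl_thread *td, int min_needed)
{
    assert(td != 0);
    if (td->base_pri <= min_needed) {
        /* The normal priority already satisfies every waiter. */
        td->flags &= ~PL_BORROWING;
        td->pri = td->base_pri;
    } else {
        /* Keep borrowing, but only as much as is still needed. */
        pl_lend_prio(td, min_needed);
    }
}

/* Ordinary priority change: never make a borrowing thread less urgent. */
static void pl_set_prio(struct pl_thread *td, int prio)
{
    td->base_pri = prio;
    if ((td->flags & PL_BORROWING) && td->pri <= prio)
        return;
    td->pri = prio;
}
```

In this model a thread with normal priority 120 that is lent 80 keeps running at 80 even if its normal priority is changed to 130; unlending with a remaining requirement of 100 drops it to 100 while still borrowing, and unlending with a requirement of 200 returns it to its normal 130.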
#
136445 |
|
12-Oct-2004 |
jhb |
Refine the turnstile and sleep queue interfaces just a bit:
- Add a new _lock() call to each API that locks the associated chain lock for a lock_object pointer or wait channel. The _lookup() functions now require that the chain lock be locked via _lock() when they are called.
- Change sleepq_add(), turnstile_wait() and turnstile_claim() to lookup the associated queue structure internally via _lookup() rather than accepting a pointer from the caller. For turnstiles, this means that the actual lookup of the turnstile in the hash table is only done when the thread actually blocks rather than being done on each loop iteration in _mtx_lock_sleep(). For sleep queues, this means that sleepq_lookup() is no longer used outside of the sleep queue code except to implement an assertion in cv_destroy().
- Change sleepq_broadcast() and sleepq_signal() to require that the chain lock is already held. For condition variables, this lets the cv_broadcast() and cv_signal() functions lock the sleep queue chain lock while testing the waiters count. This means that the waiters count internal to condition variables is no longer protected by the interlock mutex and cv_broadcast() and cv_signal() now no longer require that the interlock be held when they are called. This lets consumers of condition variables drop the lock before waking other threads which can result in fewer context switches.
MFC after: 1 month
|
#
136150 |
|
05-Oct-2004 |
jhb |
Add a critical section in turnstile_unpend() from before dropping the turnstile chain lock until after making all the awakened threads runnable. First, this fixes a priority inversion race. Second, this attempts to finish waking up all of the threads waiting on a turnstile before doing a preemption.
Reviewed by: Stephan Uphoff (who found the priority inversion race)
|
#
134586 |
|
01-Sep-2004 |
julian |
Give setrunqueue() and sched_add() more of a clue as to where they are coming from and what is expected from them.
MFC after: 2 days
|
#
132646 |
|
25-Jul-2004 |
rwatson |
Revert modification of subr_turnstile.c accidentally included in the last commit; this assertion was provided by jhb for local debugging and not intended for broader consumption.
|
#
132645 |
|
25-Jul-2004 |
rwatson |
In uipc_connect(), assert that the passed thread is curthread, and pass td into unp_connect() instead of reading curthread.
|
#
131473 |
|
02-Jul-2004 |
jhb |
- Change mi_switch() and sched_switch() to accept an optional thread to switch to. If a non-NULL thread pointer is passed in, then the CPU will switch to that thread directly rather than calling choosethread() to pick a thread to switch to.
- Make sched_switch() aware of idle threads and know to do TD_SET_CAN_RUN() instead of sticking them on the run queue rather than requiring all callers of mi_switch() to know to do this if they can be called from an idlethread.
- Move constants for arguments to mi_switch() and thread_single() out of the middle of the function prototypes and up above into their own section.
|
#
131263 |
|
29-Jun-2004 |
jhb |
Oops, this didn't make it into my submit before I committed: Defer creation of the sysctl tree for the turnstile profiling stats until a SI_SUB_LOCK sysinit. Doing it in init_turnstiles() is too early as it is called before mi_startup().
|
#
131259 |
|
29-Jun-2004 |
jhb |
Add two new kernel options to allow rudimentary profiling of the internal hash tables used in the sleep queue and turnstile code. Each option adds a sysctl tree under debug containing the maximum depth of any bucket in the hash table as well as a separate node for each bucket (or chain) containing the current depth and maximum depth for that bucket.
|
#
127951 |
|
06-Apr-2004 |
jhb |
Rename turnstile_wakeup() to turnstile_broadcast() to make the naming more consistent with other APIs. sleepq and cv's use signal/broadcast, and msleep uses wakeup_one/wakeup. Prior to this turnstiles were using a signal/wakeup mixture.
|
#
126884 |
|
12-Mar-2004 |
jhb |
Fixup a comment.
|
#
126324 |
|
27-Feb-2004 |
jhb |
Add an implementation of a generic sleep queue abstraction that is used to queue threads sleeping on a wait channel similar to how turnstiles are used to queue threads waiting for a lock. This subsystem will be used as the backend for sleep/wakeup and condition variables initially. Eventually it will also be used to replace the ithread-specific iwait thread inhibitor.
Sleep queues are also not locked by sched_lock, so this splits sched_lock up a bit further increasing concurrency within the scheduler. Sleep queues also natively support timeouts on sleeps and interruptible sleeps allowing for the reduction of a lot of duplicated code between the sleep/wakeup and condition variable implementations. For more details on the sleep queue implementation, check the comments in sys/sleepqueue.h and kern/subr_sleepqueue.c.
|
#
126317 |
|
27-Feb-2004 |
jhb |
Clarify and tweak some comments.
|
#
124944 |
|
25-Jan-2004 |
jeff |
- Add a flags parameter to mi_switch. The value of flags may be SW_VOL or SW_INVOL. Assert that one of these is set in mi_switch() and properly adjust the rusage statistics. This is to simplify the large number of users of this interface which were previously all required to adjust the proper counter prior to calling mi_switch(). This also facilitates more switch and locking optimizations.
- Change all callers of mi_switch() to pass the appropriate parameter and remove direct references to the process statistics.
|
#
123364 |
|
09-Dec-2003 |
jhb |
Adjust an assertion for the TDF_TSNOBLOCK race handling in turnstile_unpend(). A racing thread that does not have TDI_LOCK set may either be running on another CPU or it may be sitting on a run queue if it was preempted during the very small window in turnstile_wait() between unlocking the turnstile chain lock and locking sched_lock.
|
#
123363 |
|
09-Dec-2003 |
jhb |
Assert that we never give a thread a NULL turnstile when waking it up.
|
#
123362 |
|
09-Dec-2003 |
jhb |
Revert the previous race fix and replace it with a more general fix. The case of a turnstile having no threads is just one instance of the more general case where the thread we are examining has been partially awakened already in that it has been removed from the turnstile's blocked list but still has TDI_LOCK set. We detect that case by checking to see if the thread has already had a turnstile reassigned to it.
|
#
122590 |
|
12-Nov-2003 |
jhb |
- Close a race where a thread on another CPU could release a contested lock and empty its turnstile while the blocking threads still pointed to the turnstile. If the thread on the first CPU blocked on a lock owned by one of the threads blocked on the turnstile just woken up, then the first CPU could try to manipulate a bogus thread queue in the turnstile during priority propagation.
- Update locking notes for ts_owner and always clear ts_owner, not just under INVARIANTS.
Tested by: sam (1)
|
#
122561 |
|
12-Nov-2003 |
jhb |
Fix a typo in a comment.
Submitted by: das
|
#
122514 |
|
11-Nov-2003 |
jhb |
Add an implementation of turnstiles and change the sleep mutex code to use turnstiles to implement blocking instead of implementing a thread queue directly. These turnstiles are somewhat similar to those used in Solaris 7 as described in Solaris Internals but are also different.
Turnstiles do not come out of a fixed-sized pool. Rather, each thread is assigned a turnstile when it is created that it frees when it is destroyed. When a thread blocks on a lock, it donates its turnstile to that lock to serve as queue of blocked threads. The queue associated with a given lock is found by a lookup in a simple hash table. The turnstile itself is protected by a lock associated with its entry in the hash table. This means that sched_lock is no longer needed to contest on a mutex. Instead, sched_lock is only used when manipulating run queues or thread priorities. Turnstiles also implement priority propagation inherently.
Currently turnstiles only support mutexes. Eventually, however, turnstiles may grow two queues to support a non-sleepable reader/writer lock implementation. For more details, see the comments in sys/turnstile.h and kern/subr_turnstile.c.
The two primary advantages from the turnstile code include: 1) the size of struct mutex shrinks by four pointers as it no longer stores the thread queue linkages directly, and 2) less contention on sched_lock in SMP systems including the ability for multiple CPUs to contend on different locks simultaneously (not that this last detail is necessarily that much of a big win). Note that 1) means that this commit is a kernel ABI breaker, so don't mix old modules with a new kernel and vice versa.
Tested on: i386 SMP, sparc64 SMP, alpha SMP
|
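The donation scheme described in r122514 (each thread owns a turnstile, the first blocker's turnstile becomes the lock's queue, and the lock's turnstile is found through a hash table) can be modelled in a few lines of user-space C. This is a toy sketch, not the code in kern/subr_turnstile.c: every toy_* name, the table size, and the hash function are assumptions made for illustration, and all locking (the chain lock, the scheduler lock) is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_TABLESIZE 128               /* assumed bucket count */

struct toy_turnstile {
    const void *ts_lockobj;             /* lock this turnstile serves */
    struct toy_turnstile *ts_next;      /* hash chain link */
    struct toy_turnstile *ts_free;      /* spares donated by later waiters */
    int ts_waiters;                     /* number of blocked threads */
};

struct toy_thread {
    struct toy_turnstile *td_turnstile; /* owned while not blocked */
};

static struct toy_turnstile *toy_chains[TOY_TABLESIZE];

static size_t toy_hash(const void *lock)
{
    return ((uintptr_t)lock >> 4) % TOY_TABLESIZE;
}

/* Find the turnstile serving lock, or NULL if nobody is blocked on it. */
static struct toy_turnstile *toy_lookup(const void *lock)
{
    struct toy_turnstile *ts;

    for (ts = toy_chains[toy_hash(lock)]; ts != NULL; ts = ts->ts_next)
        if (ts->ts_lockobj == lock)
            return ts;
    return NULL;
}

/* Block td on lock: td gives up its turnstile to the lock. */
static void toy_wait(struct toy_thread *td, const void *lock)
{
    struct toy_turnstile *ts = toy_lookup(lock);
    struct toy_turnstile *mine = td->td_turnstile;

    td->td_turnstile = NULL;
    if (ts == NULL) {
        /* First waiter: the donated turnstile becomes the queue. */
        size_t h = toy_hash(lock);
        mine->ts_lockobj = lock;
        mine->ts_waiters = 0;
        mine->ts_free = NULL;
        mine->ts_next = toy_chains[h];
        toy_chains[h] = mine;
        ts = mine;
    } else {
        /* Later waiter: park the donated turnstile on the free list. */
        mine->ts_free = ts->ts_free;
        ts->ts_free = mine;
    }
    ts->ts_waiters++;
}

/* Wake td from lock: every awakened thread leaves with a turnstile. */
static void toy_wakeup(struct toy_thread *td, const void *lock)
{
    struct toy_turnstile *ts = toy_lookup(lock);

    assert(ts != NULL);                 /* caller must have waiters */
    ts->ts_waiters--;
    if (ts->ts_free != NULL) {
        /* Take a spare; it need not be the one td donated. */
        td->td_turnstile = ts->ts_free;
        ts->ts_free = td->td_turnstile->ts_free;
        td->td_turnstile->ts_free = NULL;
    } else {
        /* Last waiter: unhook the queue turnstile and take it. */
        struct toy_turnstile **pp = &toy_chains[toy_hash(lock)];
        while (*pp != ts)
            pp = &(*pp)->ts_next;
        *pp = ts->ts_next;
        ts->ts_lockobj = NULL;
        ts->ts_next = NULL;
        td->td_turnstile = ts;
    }
}
```

The point of the design is that no fixed-size pool is needed: there are always exactly as many turnstiles as threads, and a lock with no waiters occupies no turnstile at all, so toy_lookup() returns NULL for it.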
#
118272 |
|
31-Jul-2003 |
jhb |
If a spin lock is held for too long and WITNESS is enabled, then call witness_display_spinlock() to see if we can find out where the current owner of the spin lock last acquired the lock.
|
#
118227 |
|
30-Jul-2003 |
jhb |
When complaining about a sleeping thread owning a mutex, display the thread's pid to make debugging easier for people who don't want to have to use the intended tool for these panics (witness).
Indirectly prodded by: kris
|
#
117168 |
|
02-Jul-2003 |
jhb |
- Add comments about the maintenance of the per-thread list of contested locks held by each thread.
- Fix a bug in the original BSD/OS code where a contested lock was not properly handed off from the old thread to the new thread when a contested lock with more than one blocked thread was transferred from one thread to another.
- Don't use an atomic operation to write the MTX_CONTESTED value to mtx_lock in the aforementioned special case. The memory barriers and exclusion provided by sched_lock are sufficient.
Spotted by: alc (2)
|
#
116182 |
|
11-Jun-2003 |
obrien |
Use __FBSDID().
|
#
115568 |
|
31-May-2003 |
phk |
Add "" around mutex name to make message less confusing.
|
#
113632 |
|
17-Apr-2003 |
jhb |
Use TD_IS_RUNNING() instead of thread_running() in the adaptive mutex code.
|
#
113339 |
|
10-Apr-2003 |
julian |
Move the _oncpu entry from the KSE to the thread. The entry in the KSE still exists but its purpose will change a bit when we add the ability to lock a KSE to a cpu.
|
#
112513 |
|
23-Mar-2003 |
tjr |
Remove unused mtx_lock_giant(), mtx_unlock_giant(), related globals and sysctls.
|
#
112367 |
|
18-Mar-2003 |
phk |
Including <sys/stdint.h> is (almost?) universally only to be able to use %j in printfs, so put a nested include in <sys/systm.h> where the printf prototype lives and save everybody else the trouble.
|
#
112108 |
|
11-Mar-2003 |
jhb |
Axe the useless MTX_SLEEPABLE flag. mutexes are not sleepable locks. Nothing used this flag and WITNESS would have panic'd during mtx_init() if anything had.
|
#
111885 |
|
04-Mar-2003 |
jhb |
Remove safety belt: it is now ok to do a mtx_trylock() on a mutex you already own. The mtx_trylock() will fail however. Enhance the comment at the top of the try lock function to explain this.
Requested by: jlemon and his evil netisr locking
|
#
111880 |
|
04-Mar-2003 |
jhb |
Miscellaneous cleanups to _mtx_lock_sleep():
- Declare some local variables at the top of the function instead of in a nested block.
- Use mtx_owned() instead of masking off bits from mtx_lock manually.
- Read the value of mtx_lock into 'v' as a separate line rather than inside an if statement for clarity. This code is hairy enough as it is.
|
#
111879 |
|
04-Mar-2003 |
jhb |
Properly assert that mtx_trylock() is not called on a mutex we already owned. Previously the KASSERT would only trigger if we successfully acquired a lock that we already held. However, _obtain_lock() fails to acquire locks that we already hold, so the KASSERT was never checked in the case it was supposed to fail.
|
#
111508 |
|
25-Feb-2003 |
mtm |
Unbreak mutex profiling (at least for me).
o Always check for null when dereferencing the filename component.
o Implement a try-and-backoff method for allocating memory to dump stats to avoid a spin-lock -> sleep-lock mutex lock order panic with WITNESS.
Approved by: des, markm (mentor)
Not objected: jhb
|
#
109654 |
|
21-Jan-2003 |
des |
There's absolutely no need for a struct-within-a-struct, so move the counters out of the inner struct and remove it.
|
#
105919 |
|
25-Oct-2002 |
phk |
Disable the kernacc() check in mtx_validate() until such time that kernacc does not require Giant.
This means that we may miss panics on a class of mutex programming bugs, but only if running with a Chernobyl setting of debug-flags.
Spotted by: Pete Carah <pete@ns.altadena.net>
|
#
105782 |
|
23-Oct-2002 |
des |
Whitespace cleanup.
|
#
105719 |
|
22-Oct-2002 |
robert |
Change the `mutex_prof' structure to use three variables contained in an anonymous structure as counters, instead of an array with preprocessor-defined names for indices. Remove the associated XXX- comment.
|
#
105644 |
|
21-Oct-2002 |
des |
Reduce the overhead of the mutex statistics gathering code, try to produce shorter lines in the report, and clean up some minor style issues.
|
#
104964 |
|
12-Oct-2002 |
jeff |
- Create a new scheduler api that is defined in sys/sched.h
- Begin moving scheduler specific functionality into sched_4bsd.c
- Replace direct manipulation of scheduler data with hooks provided by the new api.
- Remove KSE specific state modifications and single runq assumptions from kern_switch.c
Reviewed by: -arch
|
#
104387 |
|
02-Oct-2002 |
jhb |
Rename the mutex thread and process states to use a more generic 'LOCK' name instead. (e.g., SLOCK instead of SMTX, TD_ON_LOCK() instead of TD_ON_MUTEX()) Eventually a turnstile abstraction will be added that will be shared with mutexes and other types of locks. SLOCK/TDI_LOCK will be used internally by the turnstile code and will not be specific to mutexes. Making the change now ensures that turnstiles can be dropped in at a later date without affecting the ABI of userland applications.
|
#
104161 |
|
29-Sep-2002 |
julian |
uh, commit all of the patch
|
#
104160 |
|
29-Sep-2002 |
julian |
commit the version I actually tested..
Submitted by: davidxu
|
#
104157 |
|
29-Sep-2002 |
julian |
Implement basic KSE loaning. This stops a thread that is blocked in BOUND mode from stopping another thread from completing a syscall, and this allows it to release its resources etc. Probably more related commits to follow (at least one I know of)
Initial concept by: julian, dillon Submitted by: davidxu
|
#
103216 |
|
11-Sep-2002 |
julian |
Completely redo thread states.
Reviewed by: davidxu@freebsd.org
|
#
102907 |
|
03-Sep-2002 |
jhb |
Add some KASSERT()'s to ensure that we don't perform spin mutex ops on sleep mutexes and vice versa. WITNESS normally should catch this but not everyone uses WITNESS so this is a fallback to catch nasty but easy to do bugs.
|
#
102450 |
|
26-Aug-2002 |
iedowse |
Add a new KTR type KTR_CONTENTION, and use it in the mutex code to log the start and end of periods during which mtx_lock() is waiting to acquire a sleep mutex. The log message includes the file and line of both the waiter and the holder.
Reviewed by: jhb, jake
|
#
100754 |
|
27-Jul-2002 |
jhb |
Disable optimization of spinlocks on UP kernels w/o debugging for now since it breaks mtx_owned() on spin mutexes when used outside of mtx_assert(). Unfortunately we currently use it in the i386 MD code and in the sio(4) driver.
Reported by: bde
|
#
99324 |
|
03-Jul-2002 |
des |
Add mtx_ prefixes to the fields used for mutex profiling, and fix a bug where the profiling code would report the release point instead of the acquisition point.
Requested by: bde
|
#
99072 |
|
29-Jun-2002 |
julian |
Part 1 of KSE-III
The ability to schedule multiple threads per process (on one cpu) by making ALL system calls optionally asynchronous. To come: ia64 and power-pc patches, patches for gdb, test program (in tools)
Reviewed by: Almost everyone who counts (at various times, peter, jhb, matt, alfred, mini, bernd, and a cast of thousands)
NOTE: this is still Beta code, and contains lots of debugging stuff. expect slight instability in signals..
|
#
97839 |
|
04-Jun-2002 |
jhb |
Replace thread_runnable() with thread_running() as the latter is more accurate.
Suggested by: julian
|
#
97837 |
|
04-Jun-2002 |
jhb |
Optimize the adaptive mutex spin a bit. Use a simple while loop with simple reads (and on IA32, a "pause" instruction for each iteration of the loop) to spin until either the mutex owner field changes, or the lock owner stops executing.
Suggested by: tanimura Tested on: i386
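The spin loop described in this entry can be sketched with C11 atomics. This is a userland approximation, not the kernel code: `struct thread`, `struct mutex`, `cpu_relax()` and `adaptive_spin()` are hypothetical names, and the `running` flag stands in for whatever the kernel uses to tell that the owner is on a CPU.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct thread { atomic_bool running; };             /* is it on a CPU? */
struct mutex  { _Atomic(struct thread *) owner; };  /* NULL when free  */

enum spin_result { SPIN_OWNER_CHANGED, SPIN_OWNER_OFF_CPU };

/* Emit "pause" on x86 to be polite to the sibling hyperthread;
 * a no-op elsewhere. */
static inline void cpu_relax(void)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__("pause");
#endif
}

/* Spin with plain reads until the owner field changes or the owner
 * stops executing. The caller then either retries the acquire
 * (owner changed) or blocks (owner is off CPU). */
enum spin_result adaptive_spin(struct mutex *m, struct thread *owner)
{
    assert(owner != NULL);
    while (atomic_load(&m->owner) == owner) {
        if (!atomic_load(&owner->running))
            return SPIN_OWNER_OFF_CPU;
        cpu_relax();
    }
    return SPIN_OWNER_CHANGED;
}
```

Spinning only pays off while the owner is actually running, which is why the loop gives up as soon as the owner leaves the CPU.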
|
#
97836 |
|
04-Jun-2002 |
jhb |
Add a private thread_runnable() macro to make the code more readable and make the KSE diff easier to maintain.
|
#
97156 |
|
23-May-2002 |
des |
Make the counters uintmax_ts, and use %ju rather than %llu.
|
#
97139 |
|
22-May-2002 |
jhb |
Rename pause() to ia32_pause() so it doesn't conflict with the pause() function defined in <unistd.h>. I didn't #ifdef _KERNEL it because the mutex implementation in libpthread will probably need this.
|
#
97113 |
|
22-May-2002 |
jhb |
Rename cpu_pause() to pause(). Originally I was going to make this an MI API with empty cpu_pause() functions on other arch's, but this functionality is definitely unique to IA-32, so I decided to leave it as i386-only and wrap it in #ifdef's. I should have dropped the cpu_ prefix when I made that decision.
Requested by: bde
|
#
97086 |
|
21-May-2002 |
jhb |
Add appropriate IA32 "pause" instructions to improve performance on Pentium 4's and newer IA32 processors. The "pause" instruction has been verified by Intel to be a NOP on all currently existing IA32 processors prior to the Pentium 4.
|
#
97084 |
|
21-May-2002 |
jhb |
Fix an old cut 'n' paste bug inherited from BSD/OS: don't increment 'i' twice once we are in the long wait stage of spinning on a spin mutex.
|
#
97082 |
|
21-May-2002 |
jhb |
Whitespace fixup, properly indent the body of an else clause.
|
#
97081 |
|
21-May-2002 |
jhb |
Add code to make default mutexes adaptive if the ADAPTIVE_MUTEXES kernel option is used (not on by default).
- In the case of trying to lock a mutex, if the MTX_CONTESTED flag is set, then we can safely read the thread pointer from the mtx_lock member while holding sched_lock. We then examine the thread to see if it is currently executing on another CPU. If it is, then we keep looping instead of blocking.
- In the case of trying to unlock a mutex, it is now possible for a mutex to have MTX_CONTESTED set in mtx_lock but to not have any threads actually blocked on it, so we need to handle that case. In that case, we just release the lock as if MTX_CONTESTED was not set and return.
- We do not adaptively spin on Giant as Giant is held for long times and it slows SMP systems down to a crawl (it was taking several minutes, like 5-10 or so for my test alpha and sparc64 SMP boxes to boot up when they adaptively spun on Giant).
- We only compile in the code to do this for SMP kernels, it doesn't make sense for UP kernels.
Tested on: i386, alpha, sparc64
|
#
97079 |
|
21-May-2002 |
jhb |
Optimize spin mutexes for UP kernels without debugging to just enter and exit critical sections. We only contest on a spin mutex on an SMP kernel running on an SMP machine.
|
#
93813 |
|
04-Apr-2002 |
jhb |
Change mtx_init() to now take an extra argument. The third argument is the generic lock type for use with witness. If this argument is NULL then the lock name is used as the lock type. Add a macro for a lock type name for network driver locks.
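The defaulting rule for the new argument can be illustrated with a small userland sketch. Everything here is hypothetical (`struct sketch_lock`, `sketch_lock_init()`, `SKETCH_NETWORK_LOCK`); it only mirrors the behaviour the message describes, namely that a NULL type falls back to the lock name, so that several differently named locks can share one type.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical shared type name for network driver locks. */
#define SKETCH_NETWORK_LOCK "network driver"

struct sketch_lock {
    const char *name;   /* per-lock name, e.g. a device name       */
    const char *type;   /* generic type a lock-order checker keys on */
    int opts;
};

/* Four-argument init in the spirit of the new mtx_init(): the third
 * argument is the generic type; NULL means "use the name". */
void sketch_lock_init(struct sketch_lock *l, const char *name,
    const char *type, int opts)
{
    assert(l != NULL && name != NULL);
    l->name = name;
    l->type = (type != NULL) ? type : name;
    l->opts = opts;
}
```

With this shape, two driver instances initialised as `sketch_lock_init(&a, "fxp0", SKETCH_NETWORK_LOCK, 0)` and `sketch_lock_init(&b, "fxp1", SKETCH_NETWORK_LOCK, 0)` keep distinct names while sharing a single type.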
|
#
93705 |
|
02-Apr-2002 |
des |
Revert to open hashing. It makes the code simpler, and works fairly well even when the number of records approaches the size of the hash table. Besides, the previous implementation (using linear probing) was broken :)
Also, use the newly introduced MTX_SYSINIT.
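For readers unfamiliar with the term, the sketch below reads "open hashing" as separate chaining: each bucket heads a linked list, so lookups keep working when the record count exceeds the bucket count. The table sizes, the hash function and the names `record_lookup()`, `pool` and `buckets` are all assumptions made for illustration; records are keyed by file name and line number as in the profiling code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HASH_SIZE   8     /* number of buckets (illustrative)    */
#define MAX_RECORDS 16    /* size of the static record pool      */

struct record {
    const char *file;
    int line;
    unsigned long count;
    struct record *next;  /* chain of records sharing a bucket   */
};

static struct record pool[MAX_RECORDS];
static int pool_used;
static struct record *buckets[HASH_SIZE];

static unsigned hash(const char *file, int line)
{
    unsigned h = (unsigned)line;
    for (const char *p = file; *p != '\0'; p++)
        h = h * 31 + (unsigned char)*p;
    return h % HASH_SIZE;
}

/* Find the record for (file, line), creating it on first use.
 * Returns NULL only when the record pool is exhausted. */
struct record *record_lookup(const char *file, int line)
{
    unsigned h = hash(file, line);

    for (struct record *r = buckets[h]; r != NULL; r = r->next)
        if (r->line == line && strcmp(r->file, file) == 0)
            return r;
    if (pool_used == MAX_RECORDS)
        return NULL;                  /* table full: reject */
    struct record *r = &pool[pool_used++];
    r->file = file;
    r->line = line;
    r->count = 0;
    r->next = buckets[h];             /* push onto the bucket chain */
    buckets[h] = r;
    return r;
}
```

Unlike linear probing, a collision here never displaces another key's slot, which is one reason the chained version is simpler to get right.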
|
#
93702 |
|
02-Apr-2002 |
jhb |
- Move the MI mutexes sched_lock and Giant from being declared in the various machdep.c's to being declared in kern_mutex.c.
- Add a new function mutex_init() used to perform early initialization needed for mutexes such as setting up thread0's contested lock list and initializing MI mutexes. Change the various MD startup routines to call this function instead of duplicating all the code themselves.
Tested on: alpha, i386
|
#
93692 |
|
02-Apr-2002 |
jhb |
Spelling police.
|
#
93672 |
|
02-Apr-2002 |
arr |
- Add MTX_SYSINIT and SX_SYSINIT as macro glue for allowing sx and mtx locks to be able to set up a SYSINIT call. This helps in places where a lock is needed to protect some data, but the data is not truly associated with a subsystem that can properly initialize its lock. The macros use the mtx_sysinit() and sx_sysinit() functions, respectively, as the handler argument to SYSINIT().
Reviewed by: alfred, jhb, smp@
|
#
93667 |
|
02-Apr-2002 |
des |
Instead of get_cyclecount(9), use nanotime(9) to record acquisition and release times. Measurements are made and stored in nanoseconds but presented in microseconds, which should be sufficient for the locks for which we actually want this (those that are held long and / or often). Also, rename some variables and structure members to unit-agnostic names.
|
#
93609 |
|
02-Apr-2002 |
des |
Mutex profiling code, conditional on the MUTEX_PROFILING option. Adds the following sysctl variables:
debug.mutex.prof.enable        enable / disable profiling
debug.mutex.prof.acquisitions  number of mutex acquisitions recorded
debug.mutex.prof.records       number of acquisition points recorded
debug.mutex.prof.maxrecords    max number of acquisition points
debug.mutex.prof.rejected      number of rejections (due to full table)
debug.mutex.prof.hashsize      hash size
debug.mutex.prof.collisions    number of hash collisions
debug.mutex.prof.stats         profiling statistics
The code records four numbers for each acquisition point (identified by source file name and line number): longest time held, total time held, number of non-recursive acquisitions, average time held. The measurements are in clock cycles (as returned by get_cyclecount(9)); this may cause measurements on some SMP systems to be unreliable. This can probably be worked around by replacing get_cyclecount(9) by some incarnation of nanotime(9).
This work was derived from initial patches by eivind.
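The four per-acquisition-point numbers can be sketched as follows. The struct and function names (`prof_record`, `prof_release()`, `prof_average()`) are hypothetical; timestamps are treated as opaque tick counts, and the average is derived on demand from the total and the count rather than stored, which is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

struct prof_record {
    uintmax_t max_held;      /* longest time held               */
    uintmax_t total_held;    /* total time held                 */
    uintmax_t acquisitions;  /* non-recursive acquisitions seen */
};

/* Account for one hold interval when the lock is released. */
void prof_release(struct prof_record *r, uintmax_t acquired_at,
    uintmax_t released_at)
{
    assert(released_at >= acquired_at);
    uintmax_t held = released_at - acquired_at;

    if (held > r->max_held)
        r->max_held = held;
    r->total_held += held;
    r->acquisitions++;
}

/* Average time held; 0 when nothing has been recorded yet. */
uintmax_t prof_average(const struct prof_record *r)
{
    return r->acquisitions != 0 ? r->total_held / r->acquisitions : 0;
}
```

For example, hold intervals of 30 and 10 ticks give a maximum of 30, a total of 40, a count of 2 and an average of 20.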
|
#
93273 |
|
27-Mar-2002 |
jeff |
Add a new mtx_init option "MTX_DUPOK" which allows duplicate acquires of locks with this flag. Remove the dup_list and dup_ok code from subr_witness. Now we just check for the flag instead of doing string compares.
Also, switch the process lock, process group lock, and uma per cpu locks over to this interface. The original mechanism did not work well for uma because per cpu lock names are unique to each zone.
Approved by: jhb
|
#
92723 |
|
19-Mar-2002 |
alfred |
Remove __P.
|
#
90997 |
|
20-Feb-2002 |
peter |
Tidy up some unused variables
|
#
90864 |
|
18-Feb-2002 |
dillon |
Add kern_giant_ucred to instrument Giant around ucred related operations such as getgid(), setgid(), etc...
|
#
90538 |
|
11-Feb-2002 |
julian |
In a threaded world, different priorities become properties of different entities. Make it so.
Reviewed by: jhb@freebsd.org (john baldwin)
|
#
90418 |
|
09-Feb-2002 |
jhb |
Use the mtx_owner() macro in one spot in _mtx_lock_sleep() to make the code easier to read.
|
#
89392 |
|
15-Jan-2002 |
jhb |
Bump the limits for determining if we've held a spinlock too long as they seem to be too short for the 500 MHz DS20 I'm testing on. The rather arbitrary numbers are rather bogus anyway. We should probably have variables for these limits that are calibrated in the MD startup code somehow.
|
#
88900 |
|
05-Jan-2002 |
jhb |
Change the preemption code for software interrupt thread schedules and mutex releases to not require flags for the cases when preemption is not allowed:
The purpose of the MTX_NOSWITCH and SWI_NOSWITCH flags is to prevent switching to a higher priority thread on mutex release and swi schedule, respectively when that switch is not safe. Now that the critical section API maintains a per-thread nesting count, the kernel can easily check whether or not it should switch without relying on flags from the programmer. This fixes a few bugs in that all current callers of swi_sched() used SWI_NOSWITCH, when in fact, only the ones called from fast interrupt handlers and the swi_sched of softclock needed this flag. Note that to ensure that swi_sched()'s in clock and fast interrupt handlers do not switch, these handlers have to be explicitly wrapped in critical_enter/exit pairs. Presently, just wrapping the handlers is sufficient, but in the future with the fully preemptive kernel, the interrupt must be EOI'd before critical_exit() is called. (critical_exit() can switch due to a deferred preemption in a fully preemptive kernel.)
I've tested the changes to the interrupt code on i386 and alpha. I have not tested ia64, but the interrupt code is almost identical to the alpha code, so I expect it will work fine. PowerPC and ARM do not yet have interrupt code in the tree so they shouldn't be broken. Sparc64 is broken, but that's been ok'd by jake and tmm who will be fixing the interrupt code for sparc64 shortly.
Reviewed by: peter Tested on: i386, alpha
|
#
88088 |
|
18-Dec-2001 |
jhb |
Modify the critical section API as follows:
- The MD functions critical_enter/exit are renamed to start with a cpu_ prefix.
- MI wrapper functions critical_enter/exit maintain a per-thread nesting count and a per-thread critical section saved state set when entering a critical section while at nesting level 0 and restored when exiting to nesting level 0. This moves the saved state out of spin mutexes so that interlocking spin mutexes works properly.
- Most low-level MD code that used critical_enter/exit now use cpu_critical_enter/exit. MI code such as device drivers and spin mutexes use the MI wrappers. Note that since the MI wrappers store the state in the current thread, they do not have any return values or arguments.
- mtx_intr_enable() is replaced with a constant CRITICAL_FORK which is assigned to curthread->td_savecrit during fork_exit().
Tested on: i386, alpha
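The nesting behaviour of the MI wrappers can be modelled in a few lines of userland C. This is a simulation under stated assumptions: a boolean stands in for the CPU interrupt state, `critical_t` is reduced to that boolean, and the nesting counter field is called `td_critnest` here as an assumed name (only `td_savecrit` appears in the message above).

```c
#include <assert.h>
#include <stdbool.h>

typedef bool critical_t;          /* saved state: were interrupts on? */

static bool intr_enabled = true;  /* simulated CPU interrupt flag     */

/* "MD" layer: disable interrupts and hand back the previous state. */
static critical_t cpu_critical_enter(void)
{
    critical_t saved = intr_enabled;
    intr_enabled = false;
    return saved;
}

static void cpu_critical_exit(critical_t saved)
{
    intr_enabled = saved;
}

struct thread {
    int        td_critnest;       /* nesting count (assumed name)     */
    critical_t td_savecrit;       /* state saved at nesting level 0   */
};

static struct thread thread0;
static struct thread *curthread = &thread0;

/* "MI" wrappers: no arguments or return values, because the state
 * lives in the current thread. Only the outermost pair touches the
 * MD layer. */
void critical_enter(void)
{
    if (curthread->td_critnest == 0)
        curthread->td_savecrit = cpu_critical_enter();
    curthread->td_critnest++;
}

void critical_exit(void)
{
    assert(curthread->td_critnest > 0);
    if (--curthread->td_critnest == 0)
        cpu_critical_exit(curthread->td_savecrit);
}
```

Because the saved state belongs to the thread rather than to any one lock, nested sections can be exited in any interleaving without restoring a stale state.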
|
#
86411 |
|
15-Nov-2001 |
jhb |
Remove definition of witness and comment stating that this file implements witness. Witness moved off to subr_witness.c a while ago.
|
#
85564 |
|
26-Oct-2001 |
dillon |
Add mtx_lock_giant() and mtx_unlock_giant() wrappers for sysctl management of Giant during the Giant unwinding phase, and start work on instrumenting Giant for the file and proc mutexes.
These wrappers allow developers to turn on and off Giant around various subsystems. DEVELOPERS SHOULD NEVER TURN OFF GIANT AROUND A SUBSYSTEM JUST BECAUSE THE SYSCTL EXISTS! General developers should only consider turning on Giant for a subsystem whose default is off (to help track down bugs). Only developers working on particular subsystems who know what they are doing should consider turning off Giant.
These wrappers will greatly improve our ability to unwind Giant and test the kernel on a (mostly) subsystem by subsystem basis. They allow Giant unwinding developers (GUDs) to emplace appropriate subsystem and structural mutexes in the main tree and then request that the larger community test the work by turning off Giant around the subsystem(s), without the larger community having to mess around with patches. These wrappers also allow GUDs to boot into a (more likely to be) working system in the midst of their unwinding work and to test that work under more controlled circumstances.
There is a master sysctl, kern.giant.all, which defaults to 0 (off). If turned on it overrides *ALL* other kern.giant sysctls and forces Giant to be turned on for all wrapped subsystems. If turned off then Giant around individual subsystems is controlled by various other kern.giant.XXX sysctls.
Code which overlaps multiple subsystems must have all related subsystem Giant sysctls turned off in order to run without Giant.
|
#
85205 |
|
20-Oct-2001 |
jhb |
The mtx_init() and sx_init() functions bzero'd locks before handing them off to witness_init(), making the check for double-initializing a lock by testing the LO_INITIALIZED flag moot. Work around this by checking the LO_INITIALIZED flag ourselves before we bzero the lock structure.
|
#
83947 |
|
26-Sep-2001 |
jhb |
Remove superfluous parens after de-macroizing.
|
#
83841 |
|
22-Sep-2001 |
jhb |
Since we no longer inline any debugging code in the mutex operations, move all the debugging code into the function versions of the mutex operations in kern_mutex.c. This reduced the __mtx_* macros to simply wrappers of the _{get,rel}_lock_* macros, so the __mtx_* macros were also abolished in favor of just calling the _{get,rel}_lock_* macros. The tangled hairy mass of macros calling macros is at least a bit more sane now.
|
#
83679 |
|
19-Sep-2001 |
jhb |
Fix a bug in propagate priority: the kse group pointer wasn't being updated in the loop so the new thread always seemed to have the same priority as the original thread and no actual priorities were changed.
|
#
83366 |
|
12-Sep-2001 |
julian |
KSE Milestone 2
Note: ALL MODULES MUST BE RECOMPILED
Make the kernel aware that there are smaller units of scheduling than the process (but only allow one thread per process at this time). This is functionally equivalent to the previous -current except that there is a thread associated with each process.
Sorry john! (your next MFC will be a doosie!)
Reviewed by: peter@freebsd.org, dillon@freebsd.org
X-MFC after: ha ha ha ha
|
#
82304 |
|
24-Aug-2001 |
bmilekic |
Force a commit on kern_mutex.c to explain reason for last commit but while I'm at it also add a comment in mtx_validate() explaining the purpose of the last change.
Basically, this fixes booting kernels compiled with MUTEX_DEBUG. What used to happen is before we setidt from init386() [still using BTX idt], we called mtx_init() on several mutex locks, notably Giant and some others. This is a problem for MUTEX_DEBUG because it enables mtx_validate() which calls kernacc(), some of which in turn requires Giant. Fix by calling kernacc() from mtx_validate() only if (!cold).
|
#
82302 |
|
24-Aug-2001 |
bmilekic |
*** empty log message ***
|
#
80748 |
|
31-Jul-2001 |
jhb |
If we have already panic'd then don't bother enforcing mutex asserts as things are pretty much shot already and all panic'ing does is hurt our chances of getting a dump.
Inspired by: sheldonh
|
#
78766 |
|
25-Jun-2001 |
jhb |
Count the context switch when blocking on a mutex as a voluntary context switch. Count the context switch when preempting the current thread to let a higher priority thread blocked on a mutex we just released run as an involuntary context switch.
Reported by: bde
|
#
76272 |
|
04-May-2001 |
jhb |
- Move state about lock objects out of struct lock_object and into a new struct lock_instance that is stored in the per-process and per-CPU lock lists. Previously, the lock lists just kept a pointer to each lock held. That pointer is now replaced by a lock instance which contains a pointer to the lock object, the file and line of the last acquisition of a lock, and various flags about a lock including its recursion count.
- If we sleep while holding a sleepable lock, then mark that lock instance as having slept and ignore any lock order violations that occur while acquiring Giant when we wake up with slept locks. This is ok because of Giant's special nature.
- Allow witness to differentiate between shared and exclusive locks and unlocks of a lock. Witness will now detect the case when a lock is acquired first in one mode and then in another. Mutexes are always locked and unlocked exclusively. Witness will also now detect the case where a process attempts to unlock a shared lock while holding an exclusive lock and vice versa.
- Fix a bug in the lock list implementation where we used the wrong constant to detect the case where a lock list entry was full.
|
#
76166 |
|
01-May-2001 |
markm |
Undo part of the tangle of having sys/lock.h and sys/mutex.h included in other "system" header files.
Also help the deprecation of lockmgr.h by making it a sub-include of sys/lock.h and removing sys/lockmgr.h from kernel .c files.
Sort sys/*.h includes where possible in affected files.
OK'ed by: bde (with reservations)
|
#
75568 |
|
17-Apr-2001 |
jhb |
Exit and re-enter the critical section while spinning for a spinlock so that interrupts can come in while we are waiting for a lock.
|
#
75468 |
|
13-Apr-2001 |
markm |
Handle a rare but fatal race invoked sometimes when SIGSTOP is invoked.
|
#
74912 |
|
28-Mar-2001 |
jhb |
Rework the witness code to work with sx locks as well as mutexes.
- Introduce lock classes and lock objects. Each lock class specifies a name and set of flags (or properties) shared by all locks of a given type. Currently there are three lock classes: spin mutexes, sleep mutexes, and sx locks. A lock object specifies properties of an additional lock along with a lock name and all of the extra stuff needed to make witness work with a given lock. This abstract lock stuff is defined in sys/lock.h. The lockmgr constants, types, and prototypes have been moved to sys/lockmgr.h. For temporary backwards compatibility, sys/lock.h includes sys/lockmgr.h.
- Replace proc->p_spinlocks with a per-CPU list, PCPU(spinlocks), of spin locks held. By making this per-cpu, we do not have to jump through magic hoops to deal with sched_lock changing ownership during context switches.
- Replace proc->p_heldmtx, formerly a list of held sleep mutexes, with proc->p_sleeplocks, which is a list of held sleep locks including sleep mutexes and sx locks.
- Add helper macros for logging lock events via the KTR_LOCK KTR logging level so that the log messages are consistent.
- Add some new flags that can be passed to mtx_init():
  - MTX_NOWITNESS - specifies that this lock should be ignored by witness. This is used for the mutex that blocks a sx lock for example.
  - MTX_QUIET - this is not new, but you can pass this to mtx_init() now and no events will be logged for this lock, so that one doesn't have to change all the individual mtx_lock/unlock() operations.
- All lock objects maintain an initialized flag. Use this flag to export a mtx_initialized() macro that can be safely called from drivers. Also, we no longer walk the all_mtx list if MUTEX_DEBUG is defined as witness performs the corresponding checks using the initialized flag.
- The lock order reversal messages have been improved to output slightly more accurate file and line numbers.
|
#
74900 |
|
28-Mar-2001 |
jhb |
- Switch from using save/disable/restore_intr to using critical_enter/exit and change the u_int mtx_saveintr member of struct mtx to a critical_t mtx_savecrit.
- On the alpha we no longer need a custom _get_spin_lock() macro to avoid an extra PAL call, so remove it.
- Partially fix using mutexes with WITNESS in modules. Change all the _mtx_{un,}lock_{spin,}_flags() macros to accept explicit file and line parameters and rename them to use a prefix of two underscores. Inside of kern_mutex.c, generate wrapper functions for _mtx_{un,}lock_{spin,}_flags() (only using a prefix of one underscore) that are called from modules. The macros mtx_{un,}lock_{spin,}_flags() are mapped to the __mtx_* macros inside of the kernel to inline the usual case of mutex operations and map to the internal _mtx_* functions in the module case so that modules will use WITNESS and KTR logging if the kernel is compiled with support for it.
|
#
74016 |
|
09-Mar-2001 |
jhb |
Fix mtx_legal2block. The only time that it is bad to block on a mutex is if we hold a spin mutex, since we can trivially get into deadlocks if we start switching out of processes that hold spinlocks. Checking to see if interrupts were disabled was a sort of cheap way of doing this since most of the time interrupts were only disabled when holding a spin lock. At least on the i386. To fix this properly, use a per-process counter p_spinlocks that counts the number of spin locks currently held, and instead of checking to see if interrupts are disabled in the witness code, check to see if we hold any spin locks. Since child processes always start up with the sched lock magically held in fork_exit(), we initialize p_spinlocks to 1 for child processes. Note that proc0 doesn't go through fork_exit(), so it starts with no spin locks held.
Consulting from: cp
|
#
73912 |
|
07-Mar-2001 |
jhb |
- Add an extra check in priority_propagation() for UP systems to ensure we don't end up back at ourselves which would indicate deadlock.
- Add the proc lock to the witness dup_list as we may hold more than one process lock at a time.
- Don't assert a mutex is owned in _mtx_unlock_sleep() as that is too late. We do the checks in the macros instead.
|
#
73238 |
|
28-Feb-2001 |
julian |
Shuffle netgraph mutexes a bit and hold a reference on a node from the function that is calling the destructor.
|
#
73205 |
|
28-Feb-2001 |
jake |
Sigh. Try to get priorities sorted out. Don't bother trying to update native priority, it is difficult to get right and likely to end up horribly wrong. Use an honestly wrong fixed value that seems to work; PUSER for user threads, and the interrupt priority for ithreads. Set it once when the process is created and forget about it.
Suggested by: bde Pointy hat: me
|
#
73114 |
|
26-Feb-2001 |
jake |
Initialize native priority to PRI_MAX. It was usually 0 which made a process's priority go through the roof when it released a (contested) mutex. Only set the native priority in mtx_lock if hasn't already been set.
Reviewed by: jhb
|
#
73033 |
|
25-Feb-2001 |
jake |
Remove brackets around variables in a function that used to be a macro.
|
#
73003 |
|
25-Feb-2001 |
julian |
Move netgraph spinlock order entries out of the #ifdef SMP section. They need to be there for UP too.
|
#
72996 |
|
24-Feb-2001 |
jhb |
Grrr, s/INVARIANTS_SUPPORT/INVARIANT_SUPPORT/.
|
#
72994 |
|
24-Feb-2001 |
jhb |
- Axe RETIP() as it was very i386 specific and unwieldy. Instead, use the passed in filename and line number in the KTR tracepoint message.
- Even though it is #if 0'd code, change the code to detect that a process is an interrupt thread to check p->p_ithd against NULL rather than checking nonexistent process flags from BSD/OS.
- Use '%p' to print pointers in KTR log messages instead of assuming sizeof(int) == sizeof(void *).
- Don't set p_mtxname to NULL when releasing a mutex. It doesn't hurt to leave it set (we don't clear w_mesg for example) and at least at one time in the past, there used to be race conditions in the kernel that would result in setting this to NULL causing the kernel to dereference NULL.
- Make the _mtx_assert() function be compiled in if INVARIANTS_SUPPORT is defined rather than if INVARIANTS is defined so that a KLD compiled with INVARIANTS that uses mtx_assert() can be used with a kernel that just has INVARIANT_SUPPORT compiled in.
|
#
72979 |
|
24-Feb-2001 |
julian |
Add knowledge of the netgraph spinlocks into the Witness code. Well, at least I think that's how it's done.
|
#
72836 |
|
22-Feb-2001 |
jhb |
- Use the NOCPU constant.
- Move the ithread spin locks before sched lock and clk in preparation for future commits to the ithread code.
|
#
72393 |
|
12-Feb-2001 |
bmilekic |
Change all instances of `CURPROC' and `CURTHD' to `curproc,' in order to stay consistent.
Requested by: bde
|
#
72376 |
|
12-Feb-2001 |
jake |
Implement a unified run queue and adjust priority levels accordingly.
- All processes go into the same array of queues, with different scheduling classes using different portions of the array. This allows user processes to have their priorities propagated up into interrupt thread range if need be.
- I chose 64 run queues as an arbitrary number that is greater than 32. We used to have 4 separate arrays of 32 queues each, so this may not be optimal. The new run queue code was written with this in mind; changing the number of run queues only requires changing constants in runq.h and adjusting the priority levels.
- The new run queue code takes the run queue as a parameter. This is intended to be used to create per-cpu run queues. Implement wrappers for compatibility with the old interface which pass in the global run queue structure.
- Group the priority level, user priority, native priority (before propagation) and the scheduling class into a struct priority.
- Change any hard coded priority levels that I found to use symbolic constants (TTIPRI and TTOPRI).
- Remove the curpriority global variable and use that of curproc. This was used to detect when a process' priority had lowered and it should yield. We now effectively yield on every interrupt.
- Activate propagate_priority(). It should now have the desired effect without needing to also propagate the scheduling class.
- Temporarily comment out the call to vm_page_zero_idle() in the idle loop. It interfered with propagate_priority() because the idle process needed to do a non-blocking acquire of Giant and then other processes would try to propagate their priority onto it. The idle process should not do anything except idle. vm_page_zero_idle() will return in the form of an idle priority kernel thread which is woken up at appropriate times by the vm system.
- Update struct kinfo_proc to the new priority interface. Deliberately change its size by adjusting the spare fields. It remained the same size, but the layout has changed, so userland processes that use it would parse the data incorrectly. The size constraint should really be changed to an arbitrary version number. Also add a debug.sizeof sysctl node for struct kinfo_proc.
|
#
72344 |
|
11-Feb-2001 |
bmilekic |
- Place back STR string declarations for lock/unlock strings used for KTR_LOCK tracing in order to avoid duplication.
- Insert some tracepoints back into the mutex acq/rel code, thus ensuring that we can trace all lock acq/rel's again.
- All CURPROC != NULL checks are MPASS()es (under MUTEX_DEBUG) because they signify a serious mutex corruption.
- Change up some KASSERT()s to MPASS()es, and vice-versa, depending on the type of problem we're debugging (INVARIANTS is used here to check that the API is being used properly whereas MUTEX_DEBUG is used to ensure that something general isn't happening that will have bad impact on mutex locks).
Reminded by: jhb, jake, asmodai
|
#
72256 |
|
09-Feb-2001 |
jhb |
Unify the two sleep lock order lists to enforce the process lock -> uidinfo lock locking order.
|
#
72224 |
|
09-Feb-2001 |
jhb |
- Change the 'witness_list' ddb command to 'show mutexes'. Note that this will only display sleep mutexes held by the current process.
- Clean up some nits in the witness_display() function and add a ddb command 'show witness' that dumps the hierarchy and order lists to the console.
- Use queue(3) macros where appropriate.
- Resort the spin lock order list so that "com" is before "sched_lock". Also, add appropriate #ifdef's around SMP and i386-specific mutexes.
- Add two new mutexes used to protect the ithread lists and tables to the order list.
Requested by: bde (1)
|
#
72200 |
|
09-Feb-2001 |
bmilekic |
Change and clean the mutex lock interface.
mtx_enter(lock, type) becomes:
mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks) mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)
similarly, for releasing a lock, we now have:
mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN. We change the caller interface for the two different types of locks because the semantics are entirely different for each case, and this makes it explicitly clear and, at the same time, it rids us of the extra `type' argument.
The enter->lock and exit->unlock change has been made with the idea that we're "locking data" and not "entering locked code" in mind.
Further, remove all additional "flags" previously passed to the lock acquire/release routines with the exception of two:
MTX_QUIET and MTX_NOSWITCH
The functionality of these flags is preserved and they can be passed to the lock/unlock routines by calling the corresponding wrappers:
mtx_{lock, unlock}_flags(lock, flag(s)) and mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN locks, respectively.
Re-inline some lock acq/rel code; in the sleep lock case, we only inline the _obtain_lock()s in order to ensure that the inlined code fits into a cache line. In the spin lock case, we inline recursion and actually only perform a function call if we need to spin. This change has been made with the idea that we generally tend to avoid spin locks and that also the spin locks that we do have and are heavily used (i.e. sched_lock) do recurse, and therefore in an effort to reduce function call overhead for some architectures (such as alpha), we inline recursion for this case.
Create a new malloc type for the witness code and retire from using the M_DEV type. The new type is called M_WITNESS and is only declared if WITNESS is enabled.
Begin cleaning up some machdep/mutex.h code - specifically updated the "optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently need those.
Finally, caught up to the interface changes in all sys code.
Contributors: jake, jhb, jasone (in no particular order)
|
#
71709 |
|
27-Jan-2001 |
jhb |
Add a new ddb command 'witness_list' that lists the mutexes held by curproc.
Requested by: peter
|
#
71576 |
|
24-Jan-2001 |
jasone |
Convert all simplelocks to mutexes and remove the simplelock implementations.
|
#
71560 |
|
24-Jan-2001 |
jhb |
- Don't use a union and fun tricks to shave one extra pointer off of struct mtx right now as it makes debugging harder. When we are in optimizing mode, we can revisit this.
- Fix the KTR trace messages to use %p rather than 0x%p to avoid duplicate 0x's in KTR output.
- During witness_fixup, release Giant so that witness doesn't get confused. Also, grab all_mtx while walking the list of mutexes.
- Remove w_sleep and w_recurse. Instead, perform checks on mutexes using the mutex's mtx_flags field.
- Allow debug.witness_ddb and debug.witness_skipspin to be set from the loader.
- Add Giant to the front of existing order_list entries to help ensure Giant is always first.
- Add an order entry for the various proc locks. Note that this only helps keep proc in order mostly as the allproc and proctree mutexes are only obtained during a lockmgr operation on the specified mutex.
|
#
71360 |
|
22-Jan-2001 |
jasone |
Print correct file name and line number in mtx_assert().
Noticed by: jake
|
#
71352 |
|
21-Jan-2001 |
jasone |
Move most of sys/mutex.h into kern/kern_mutex.c, thereby making the mutex inline functions non-inlined. Hide parts of the mutex implementation that should not be exposed.
Make sure that WITNESS code is not executed during boot until the mutexes are fully initialized by SI_SUB_MUTEX (the original motivation for this commit).
Submitted by: peter
|
#
71328 |
|
21-Jan-2001 |
jasone |
Make the order of the static initializer for all_mtx match the order of fields in struct mtx.
Found by: jake
|
#
71320 |
|
21-Jan-2001 |
jasone |
Remove MUTEX_DECLARE() and MTX_COLD. Instead, postpone full mutex initialization until after malloc() is safe to call, then iterate through all mutexes and complete their initialization.
This change is necessary in order to avoid some circular bootstrapping dependencies.
|
#
71287 |
|
20-Jan-2001 |
jake |
- Make npx_intr INTR_MPSAFE and move acquiring Giant into the function itself.
- Remove a hack to allow acquiring Giant from the npx asm trap vector.
|
#
71228 |
|
19-Jan-2001 |
bmilekic |
Implement MTX_RECURSE flag for mtx_init(). All calls to mtx_init() for mutexes that recurse must now include the MTX_RECURSE bit in the flag argument variable. This change is in preparation for an upcoming (further) mutex API cleanup. The witness code will call panic() if a lock is found to recurse but the MTX_RECURSE bit was not set during the lock's initialization.
The old MTX_RECURSE "state" bit (in mtx_lock) has been renamed to MTX_RECURSED, which is more appropriate given its meaning.
The following locks have been made "recursive," thus far: eventhandler, Giant, callout, sched_lock, possibly some others declared in the architecture-specific code, all of the network card driver locks in pci/, as well as some other locks in dev/ stuff that I've found to be recursive.
Reviewed by: jhb
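The recursion rule described in this entry can be illustrated with a minimal userspace sketch. It is not the kernel implementation: `struct demo_mtx`, `DEMO_RECURSE`, `demo_init()`, `demo_lock()`, and `demo_unlock()` are hypothetical names, the code is single-threaded with no atomics, and it returns an error code where the commit says the witness code would call panic().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag standing in for MTX_RECURSE: owner may re-acquire. */
#define DEMO_RECURSE	0x1

/* Hypothetical lock; not the kernel's struct mtx. */
struct demo_mtx {
	int flags;		/* options fixed at initialization */
	const void *owner;	/* NULL when the lock is free */
	unsigned recurse;	/* nesting depth beyond the first acquire */
};

static void
demo_init(struct demo_mtx *m, int flags)
{
	m->flags = flags;
	m->owner = NULL;
	m->recurse = 0;
}

/*
 * Returns 0 on success, -1 when the owner re-acquires a lock that was
 * not initialized with DEMO_RECURSE, and -2 when another owner holds it
 * (a real mutex would spin or block instead).
 */
static int
demo_lock(struct demo_mtx *m, const void *self)
{
	assert(m != NULL && self != NULL);
	if (m->owner == self) {
		if ((m->flags & DEMO_RECURSE) == 0)
			return (-1);
		m->recurse++;
		return (0);
	}
	if (m->owner != NULL)
		return (-2);
	m->owner = self;
	return (0);
}

/* Undo one level of nesting, releasing the lock at the outermost level. */
static void
demo_unlock(struct demo_mtx *m, const void *self)
{
	assert(m->owner == self);
	if (m->recurse > 0)
		m->recurse--;
	else
		m->owner = NULL;
}
```

Declaring recursion at initialization time makes an accidental re-acquire detectable at the point of the second lock call rather than at some later, harder to diagnose point.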
|
#
70861 |
|
10-Jan-2001 |
jake |
Use PCPU_GET, PCPU_PTR and PCPU_SET to access all per-cpu variables other than curproc.
|
#
69998 |
|
13-Dec-2000 |
jhb |
- Add a new flag MTX_QUIET that can be passed to the various mtx_* functions. If this flag is set, then no KTR log messages are issued. This is useful for blocking excessive logging, such as with the internal mutex used by the witness code.
- Use MTX_QUIET on all of the mtx_enter/exit operations on the internal mutex used by the witness code.
- If we are in a panic, don't do witness checks in witness_enter(), witness_exit(), and witness_try_enter(), just return.
|
#
69881 |
|
12-Dec-2000 |
jake |
- Add code to detect if a system call returns with locks other than Giant held and panic if so (conditional on witness).
- Change witness_list to return the number of locks held so this is easier.
- Add kern/syscalls.c to the kernel build if witness is defined so that the panic message can contain the name of the offending system call.
- Add assertions that Giant and sched_lock are not held when returning from a system call, which were missing for alpha and ia64.
|
#
69879 |
|
12-Dec-2000 |
jhb |
Oops, the witness mutex is a spin lock, so use MTX_SPIN in the call to mtx_init(). Since the witness code ignores its internal mutex, this doesn't result in any functional change.
|
#
69781 |
|
08-Dec-2000 |
dwmalone |
Convert more malloc+bzero to malloc+M_ZERO.
Submitted by: josh@zipperup.org
Submitted by: Robert Drehmel <robd@gmx.net>
|
#
69429 |
|
01-Dec-2000 |
jhb |
Split the WITNESS and MUTEX_DEBUG options apart so that WITNESS does not depend on MUTEX_DEBUG. The MUTEX_DEBUG option turns on extra assertions and checks to verify that mutexes themselves are implemented properly. The WITNESS option uses extra checks and diagnostics to verify that other code is using mutexes properly.
|
#
69376 |
|
30-Nov-2000 |
jhb |
Fix up priority propagation:
- Use a better test for determining when a process is running.
- Convert some checks to assertions.
- Remove unnecessary tests.
- Save the priority before acquiring a mutex rather than in msleep(9).
|
#
69369 |
|
29-Nov-2000 |
jhb |
Set p_mtxname when blocking on a mutex and clear it when waking up.
|
#
69363 |
|
29-Nov-2000 |
jhb |
Use an atomic operation with an appropriate memory barrier when releasing a contested sleep mutex in the case that at least two processes are blocked on the contested mutex.
|
#
69362 |
|
29-Nov-2000 |
jhb |
The sched_lock mutex goes after the sio mutex in the locking order since a software interrupt can be scheduled in the sio interrupt handler while the sio mutex is held.
|
#
69361 |
|
29-Nov-2000 |
jhb |
Save the line number and filename of the last mtx_enter operation for spin locks. We already do this for sleep locks.
|
#
69215 |
|
26-Nov-2000 |
alfred |
Move the #define of _KERN_MUTEX_C_ so that it's before any system headers are included. System headers can include sys/mutex.h and then certain macros do not get defined.
Reviewed by: jake
|
#
69208 |
|
26-Nov-2000 |
jake |
Add uidinfo hash and uidinfo struct to the witness order list.
|
#
68889 |
|
19-Nov-2000 |
jake |
- Protect the callout wheel with a separate spin mutex, callout_lock.
- Use the mutex in hardclock to ensure no races between it and softclock.
- Make softclock be INTR_MPSAFE and provide a flag, CALLOUT_MPSAFE, which specifies that a callout handler does not need Giant. There is still no way to set this flag when registering a callout.
Reviewed by: -smp@, jlemon
|
#
68862 |
|
17-Nov-2000 |
jake |
- Split the run queue and sleep queue linkage, so that a process may block on a mutex while on the sleep queue without corrupting it.
- Move dropping of Giant to after the acquire of sched_lock.
Tested by: John Hay <jhay@icomtek.csir.co.za> jhb
|
#
68808 |
|
16-Nov-2000 |
jhb |
Don't release and acquire Giant in mi_switch(). Instead, release and acquire Giant as needed in functions that call mi_switch(). The releases need to be done outside of the sched_lock to avoid potential deadlocks from trying to acquire Giant while interrupts are disabled.
Submitted by: witness
|
#
68790 |
|
15-Nov-2000 |
jhb |
Include the right headers to get the DDB #define and the db_active variable.
|
#
68786 |
|
15-Nov-2000 |
jhb |
Declare the 'witness_spin_check' properly as a per-CPU variable in the non-SMP case.
|
#
68785 |
|
15-Nov-2000 |
jhb |
Don't perform witness checks in witness_enter() during a panic.
|
#
68582 |
|
10-Nov-2000 |
jhb |
Minor whitespace nit in a comment.
|
#
67676 |
|
27-Oct-2000 |
jhb |
- Use MUTEX_DECLARE() and MTX_COLD for the WITNESS code's internal mutex so it can function before malloc(9) is up and running.
- Add two new options WITNESS_DDB and WITNESS_SKIPSPIN. If WITNESS_SKIPSPIN is enabled, then spin mutexes are ignored by the WITNESS code. If WITNESS_DDB is turned on and DDB is compiled into the kernel, then the kernel will drop into DDB when either a lock hierarchy violation occurs or mutexes are held when going to sleep.
- Add some new sysctls: debug.witness_ddb is a read-write sysctl that corresponds to WITNESS_DDB. The kernel option merely changes the default value to on at boot. debug.witness_skipspin is a read-only sysctl that one can use to determine if the kernel was compiled with WITNESS_SKIPSPIN.
- Wipe out the BSD/OS-specific lock order lists. We get to build our own lists now as we add mutexes to the kernel.
|
#
67548 |
|
25-Oct-2000 |
jhb |
Quiet some warnings.
|
#
67404 |
|
20-Oct-2000 |
jhb |
Propagate the 'const'ness of mutex descriptions to the witness code to quiet warnings.
|
#
67401 |
|
20-Oct-2000 |
jhb |
Actually enable the witness code if the WITNESS kernel option is enabled.
|
#
67396 |
|
20-Oct-2000 |
jhb |
Doh. Fix a 64-bit-ism by using uintptr_t for a temporary lock variable instead of int.
|
#
67352 |
|
20-Oct-2000 |
jhb |
- Make the mutex code almost completely machine independent. This greatly reduces the maintenance load for the mutex code. The only MD portions of the mutex code are in machine/mutex.h now, which include the assembly macros for handling mutexes as well as optionally overriding the mutex micro-operations. For example, we use optimized micro-ops on the x86 platform #ifndef I386_CPU.
- Change the behavior of the SMP_DEBUG kernel option. In the new code, mtx_assert() only depends on INVARIANTS, allowing other kernel developers to have working mutex assertions without having to include all of the mutex debugging code. The SMP_DEBUG kernel option has been renamed to MUTEX_DEBUG and now just controls extra mutex debugging code.
- Abolish the ugly mtx_f hack. Instead, we dynamically allocate separate mtx_debug structures on the fly in mtx_init, except for mutexes that are initialized very early in the boot process. These mutexes are declared using a special MUTEX_DECLARE() macro, and use a new flag MTX_COLD when calling mtx_init. This is still somewhat hackish, but it is less evil than the mtx_f filler struct, and the mtx struct is now the same size with and without mutex debugging code.
- Add some micro-micro-operation macros for doing the actual atomic operations on the mutex mtx_lock field to make it easier for other archs to override/optimize mutex ops if needed. These new tiny ops also clean up the code in some places by replacing long atomic operation function calls that spanned 2-3 lines with a short 1-line macro call.
- Don't call mi_switch() from mtx_enter_hard() when we block while trying to obtain a sleep mutex. Calling mi_switch() would bogusly release Giant before switching to the next process. Instead, inline most of the code from mi_switch() in the mtx_enter_hard() function. Note that when we finally kill Giant we can back this out and go back to calling mi_switch().
|
#
65856 |
|
14-Sep-2000 |
jhb |
Remove the mtx_t, witness_t, and witness_blessed_t types. Instead, just use struct mtx, struct witness, and struct witness_blessed.
Requested by: bde
|
#
65651 |
|
09-Sep-2000 |
jasone |
Style cleanups. No functional changes.
|
#
65650 |
|
09-Sep-2000 |
jasone |
Add file and line arguments to WITNESS_ENTER() and WITNESS_EXIT(), since __FILE__ and __LINE__ don't get expanded usefully in inline functions.
Add const to all witness*() arguments that are filenames.
|
#
65624 |
|
08-Sep-2000 |
jasone |
Rename mtx_enter(), mtx_try_enter(), and mtx_exit() and wrap them with cpp macros that expand to pass filename and line number information. This is necessary since we're using inline functions instead of macros now.
Add const to the filename pointers passed throughout the mtx and witness code.
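The wrapping technique described in this entry can be sketched as follows. Inside an inline function, __FILE__ and __LINE__ expand to the location of the function's definition, so the call site has to be captured by a macro and passed in as arguments. The names `struct demo_lock`, `_demo_enter()`, and `demo_enter()` are hypothetical and chosen only for illustration; they are not the kernel's identifiers.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical lock type for illustration; not the kernel's struct mtx. */
struct demo_lock {
	const char *name;
	const char *last_file;	/* file of the most recent acquire */
	int last_line;		/* line of the most recent acquire */
	int held;
};

/*
 * The worker takes the call site explicitly. Using __FILE__/__LINE__
 * here would always report this definition, not the caller.
 */
static inline void
_demo_enter(struct demo_lock *l, const char *file, int line)
{
	assert(l != NULL && file != NULL);
	l->held = 1;
	l->last_file = file;
	l->last_line = line;
}

/* The macro expands at the call site, so the caller's location is recorded. */
#define demo_enter(l)	_demo_enter((l), __FILE__, __LINE__)
```

Callers write `demo_enter(&lock)` as before; the `const char *` file argument matches the const-correctness change mentioned above, since __FILE__ expands to a string literal.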
|
#
65557 |
|
07-Sep-2000 |
jasone |
Major update to the way synchronization is done in the kernel. Highlights include:
* Mutual exclusion is used instead of spl*(). See mutex(9). (Note: The alpha port is still in transition and currently uses both.)
* Per-CPU idle processes.
* Interrupts are run in their own separate kernel threads and can be preempted (i386 only).
Partially contributed by: BSDi (BSD/OS)
Submissions by (at least): cp, dfr, dillon, grog, jake, jhb, sheldonh
|