History log of /freebsd-current/sys/sys/mutex.h
Revision Date Author Comments
# 1148518a 28-Nov-2023 Emmanuel Vadot <manu@FreeBSD.org>

Revert "sys/mutex.h: Include sys/lock.h instead of sys/_lock.h"

This reverts commit 2a35f3cdf63d1f9b1ea5ab0174adabb631757210.

Doesn't appear to be needed anymore, and if it is at some point I'll
fix the driver.


# 7b4b05d8 28-Nov-2023 Emmanuel Vadot <manu@FreeBSD.org>

Revert "sys/mutex.h: Reorder includes"

This reverts commit 50335b1ae4e48712f831e85ddfa7b00da0af382c.


# 50335b1a 24-Nov-2023 Emmanuel Vadot <manu@FreeBSD.org>

sys/mutex.h: Reorder includes

Fixes: 2a35f3cdf63d ("sys/mutex.h: Include sys/lock.h instead of sys/_lock.h")


# 2a35f3cd 27-Oct-2022 Emmanuel Vadot <manu@FreeBSD.org>

sys/mutex.h: Include sys/lock.h instead of sys/_lock.h

It uses the LA_ defines when INVARIANTS is set.
This unbreaks dpaa2 on FDT-only kernels (like ALLWINNER or ROCKCHIP), as
the driver only includes sys/lock.h via header pollution on ACPI kernels.

Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D37145
Reviewed by: kib, mjg


# 2ff63af9 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: one-line .h pattern

Remove /^\s*\*+\s*\$FreeBSD\$.*$\n/


# bc2ccf0e 10-Dec-2022 Mateusz Guzik <mjg@FreeBSD.org>

mtx: retire PARTIAL_PICKUP_GIANT

It does not appear to have ever been used.

Sponsored by: Rubicon Communications, LLC ("Netgate")


# 840680e6 27-Oct-2021 Gleb Smirnoff <glebius@FreeBSD.org>

Wrap mutex(9), rwlock(9) and sx(9) macros into __extension__ ({})
instead of do {} while (0).

This makes them real void expressions, and they can be used anywhere
where a void function call can be used, for example in a conditional
operator.

Reviewed by: kib, mjg
Differential revision: https://reviews.freebsd.org/D32696
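
The difference between the two wrapping styles can be sketched with a toy
lock. This is only an illustration: `toy_lock`, `TOY_LOCK_STMT`,
`TOY_LOCK_EXPR` and `lock_either` are hypothetical names, not the mutex(9)
macros, and statement expressions are a GNU C extension (GCC and Clang).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a lock; not the real struct mtx. */
struct toy_lock {
	int held;
};

/* Old style: a statement, so it cannot appear inside an expression. */
#define TOY_LOCK_STMT(l) do {						\
	(l)->held = 1;							\
} while (0)

/* New style: a GNU statement expression whose type is void. */
#define TOY_LOCK_EXPR(l) __extension__ ({				\
	(l)->held = 1;							\
	(void)0;							\
})

/* Lock one of two locks, chosen by a condition, in a single expression. */
static void
lock_either(int use_first, struct toy_lock *a, struct toy_lock *b)
{
	assert(a != NULL && b != NULL);
	/* This line would not compile with TOY_LOCK_STMT. */
	use_first ? TOY_LOCK_EXPR(a) : TOY_LOCK_EXPR(b);
}
```

Calling `lock_either(1, &a, &b)` marks only `a` as held; the `do {} while (0)`
form would need an explicit `if`/`else` at the call site instead.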


# 6a467cc5 23-May-2021 Mateusz Guzik <mjg@FreeBSD.org>

lockprof: pass lock type as an argument instead of reading the spin flag


# bd66a075 04-Aug-2020 Mateusz Guzik <mjg@FreeBSD.org>

mtx: add mtx_wait_unlocked


# 6e8c1ccb 06-Dec-2018 Mateusz Guzik <mjg@FreeBSD.org>

Annotate Giant drop/pickup macros with __predict_false

They are used in important places of the kernel, with the lock not being held
the majority of the time.

Sponsored by: The FreeBSD Foundation
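
A rough sketch of what such an annotation looks like. `__predict_false` is
FreeBSD's wrapper around `__builtin_expect`; the fallback definition, the
`toy_giant` type and `toy_drop_giant()` are assumptions made for this
example and are not the real DROP_GIANT() macro.

```c
#include <assert.h>
#include <stddef.h>

/* FreeBSD's <sys/cdefs.h> provides this; define it elsewhere for the demo. */
#ifndef __predict_false
#define __predict_false(exp)	__builtin_expect((exp), 0)
#endif

/* Hypothetical stand-in for Giant's state as seen by one thread. */
struct toy_giant {
	int owned;	/* nonzero if the caller holds the toy Giant */
	int drops;	/* how many times it has been dropped */
};

/* Drop the toy Giant if held; the held case is hinted as unlikely. */
static int
toy_drop_giant(struct toy_giant *g)
{
	assert(g != NULL);
	if (__predict_false(g->owned)) {
		g->owned = 0;
		g->drops++;
		return (1);
	}
	return (0);
}
```

The hint changes only code layout and branch prediction defaults, never the
result: the function behaves the same whether or not the lock is held.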


# f4b36404 02-Jul-2018 Matt Macy <mmacy@FreeBSD.org>

inline atomics and allow tied modules to inline locks

- inline atomics in modules on i386 and amd64 (they were always
inline on other arches)
- allow modules to opt in to inlining locks by specifying
MODULE_TIED=1 in the makefile

Reviewed by: kib
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D16079


# d2576988 18-Feb-2018 Mateusz Guzik <mjg@FreeBSD.org>

mtx: add mtx_spin_wait_unlocked

The primitive can be used to wait for the lock to be released. Intended
usage is for locks in structures which are about to be freed.

The benefit is the avoided interrupt enable/disable trip + atomic op to
grab the lock and shorter wait if the lock is held (since there is no
worry someone will contend on the lock, re-reads can be more aggressive).

Briefly discussed with: kib
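
The idea of waiting for a release without taking the lock can be sketched
with C11 atomics. This is a user-space illustration under assumptions, not
the kernel routine: `toy_spin`, `TOY_UNOWNED` and `toy_spin_wait_unlocked()`
are hypothetical, and the returned poll count exists only for demonstration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_UNOWNED	((uintptr_t)0)	/* assumed "nobody holds it" value */

struct toy_spin {
	_Atomic uintptr_t word;	/* owner cookie, or TOY_UNOWNED */
};

/*
 * Wait until the lock is observed unowned, without ever taking it.
 * Only loads are issued, so there is no atomic read-modify-write.
 * Returns the number of polls, purely for illustration.
 */
static unsigned long
toy_spin_wait_unlocked(struct toy_spin *s)
{
	unsigned long polls = 1;

	assert(s != NULL);
	while (atomic_load_explicit(&s->word, memory_order_acquire) !=
	    TOY_UNOWNED)
		polls++;
	return (polls);
}
```

This only makes sense when no new thread can start contending for the lock,
which matches the stated use on structures that are about to be freed.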


# c4e20cad 27-Nov-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

sys/sys: further adoption of SPDX licensing ID tags.

Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
supersede or replace the license texts.


# 11183f42 25-Nov-2017 Mateusz Guzik <mjg@FreeBSD.org>

Convert in-kernel thread_lock_flags calls to thread_lock when debug is disabled

The flags argument is not used in this case.


# b584eb2e 22-Nov-2017 Mateusz Guzik <mjg@FreeBSD.org>

locks: pass the found lock value to unlock slow path

This avoids an explicit read later.

While here whack the cheaply obtainable 'tid' argument.


# 013c0b49 22-Nov-2017 Mateusz Guzik <mjg@FreeBSD.org>

locks: remove the file + line argument from internal primitives when not used

The pair is of use only in debug or LOCKPROF kernels, but was passed (zeroed)
for many locks even in production kernels.

While here whack the tid argument from wlock hard and xlock hard.

There is no kbi change of any sort - "external" primitives still accept the
pair.


# be49509e 21-Oct-2017 Mateusz Guzik <mjg@FreeBSD.org>

mtx: implement thread lock fastpath

MFC after: 1 week


# 0d74fe26 19-Oct-2017 Mateusz Guzik <mjg@FreeBSD.org>

mtx: clean up locking spin mutexes

1) shorten the fast path by pushing the lockstat probe to the slow path
2) test for kernel panic only after it turns out we will have to spin,
in particular test only after we know we are not recursing

MFC after: 1 week


# 30a33cef 13-Oct-2017 Mateusz Guzik <mjg@FreeBSD.org>

mtx: change MTX_UNOWNED from 4 to 0

The value is spread all over the kernel and zeroing a register is
cheaper/shorter than setting it up to an arbitrary value.

Reduces amd64 GENERIC-NODEBUG .text size by 0.4%.

MFC after: 1 week


# 2f1ddb89 26-Sep-2017 Mateusz Guzik <mjg@FreeBSD.org>

mtx: drop the tid argument from _mtx_lock_sleep

tid must be equal to curthread and the target routine was already reading
it anyway, which is not a problem. Not passing it as a parameter allows for
a little bit shorter code in callers.

MFC after: 1 week


# a2101806 28-Feb-2017 Mateusz Guzik <mjg@FreeBSD.org>

locks: ensure proper barriers are used with atomic ops when necessary

Unclear how, but the locking routine for mutexes was using the *release*
barrier instead of acquire. This must have been either a copy-pasto or bad
completion.

Going through other uses of atomics shows no barriers in:
- upgrade routines (addressed in this patch)
- sections protected with turnstile locks - this should be fine as necessary
barriers are in the worst case provided by turnstile unlock

I would like to thank Mark Millard and andreast@ for reporting the problem and
testing previous patches before the issue got identified.

ps.
.-'---`-.
,' `.
| \
| \
\ _ \
,\ _ ,'-,/-)\
( * \ \,' ,' ,'-)
`._,) -',-')
\/ ''/
) / /
/ ,'-'

Hardware provided by: IBM LTC
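
The acquire/release pairing described in this commit can be illustrated with
C11 atomics. This is a sketch under assumptions, not the kernel code:
`toy_mtx`, `toy_trylock()` and `toy_unlock()` are hypothetical, and the
kernel uses its own atomic(9) primitives rather than `<stdatomic.h>`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

struct toy_mtx {
	_Atomic uintptr_t word;	/* 0 when unowned, else the owner's id */
};

/* Acquire ordering: later accesses cannot move before the lock is taken. */
static bool
toy_trylock(struct toy_mtx *m, uintptr_t tid)
{
	uintptr_t expected = 0;

	assert(tid != 0);
	return (atomic_compare_exchange_strong_explicit(&m->word, &expected,
	    tid, memory_order_acquire, memory_order_relaxed));
}

/* Release ordering: earlier accesses cannot move past the unlock. */
static void
toy_unlock(struct toy_mtx *m)
{
	atomic_store_explicit(&m->word, 0, memory_order_release);
}
```

Using release ordering in `toy_trylock()` would be the kind of mistake the
commit describes: the lock would still be taken atomically, but accesses
inside the critical section could be reordered ahead of it on weakly ordered
hardware.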


# 13d2ef0f 20-Feb-2017 Mateusz Guzik <mjg@FreeBSD.org>

mtx: fix spin mutexes interaction with failed fcmpset

While doing so move recursion support down to the fallback routine.


# a24c8eb8 17-Feb-2017 Mateusz Guzik <mjg@FreeBSD.org>

mtx: plug the 'opts' argument when not used


# 09f1319a 17-Feb-2017 Mateusz Guzik <mjg@FreeBSD.org>

mtx: restrict r313875 to kernels without LOCK_PROFILING


# 08da2677 05-Feb-2017 Mateusz Guzik <mjg@FreeBSD.org>

mtx: move lockstat handling out of inline primitives

Lockstat requires checking if it is enabled and if so, calling a 6 argument
function. Further, determining whether to call it on unlock requires
pre-reading the lock value.

This is problematic in at least 3 ways:
- more branches in the hot path than necessary
- additional cacheline ping pong under contention
- bigger code

Instead, check first if lockstat handling is necessary and if so, just fall
back to regular locking routines. For this purpose a new macro is introduced
(LOCKSTAT_PROFILE_ENABLED).

LOCK_PROFILING uninlines all primitives. Fold in the current inline lock
variant into the _mtx_lock_flags to retain the support. With this change
the inline variants are not used when LOCK_PROFILING is defined and thus
can ignore its existence.

This results in:
    text     data      bss       dec      hex  filename
22259667  1303208  4994976  28557851  1b3c21b  kernel.orig
21797315  1303208  4994976  28095499  1acb40b  kernel.patched

i.e. about a 2% reduction in text size.

A remaining action is to remove spurious arguments for internal kernel
consumers.
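
The "check once, then fall back" shape can be sketched as follows. All names
here (`toy_lockstat_enabled`, `toy_lock_slow()`, `toy_lock()`) are
hypothetical, the lock is reduced to a try-lock so the example stays
single-threaded, and the real LOCKSTAT_PROFILE_ENABLED macro is not
reproduced.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

struct toy_mtx {
	_Atomic uintptr_t word;	/* 0 when unowned, else the owner's id */
};

static bool toy_lockstat_enabled;	/* assumed global profiling switch */
static unsigned toy_slow_calls;		/* counts out-of-line calls */

/* Out-of-line routine: the only place that would fire probes. */
static bool
toy_lock_slow(struct toy_mtx *m, uintptr_t tid)
{
	uintptr_t expected = 0;

	toy_slow_calls++;
	/* A real slow path would spin or block here. */
	return (atomic_compare_exchange_strong_explicit(&m->word, &expected,
	    tid, memory_order_acquire, memory_order_relaxed));
}

/* Inline fast path: one flag test, one atomic op, no probe arguments. */
static inline bool
toy_lock(struct toy_mtx *m, uintptr_t tid)
{
	uintptr_t expected = 0;

	assert(tid != 0);
	if (toy_lockstat_enabled ||
	    !atomic_compare_exchange_strong_explicit(&m->word, &expected,
	    tid, memory_order_acquire, memory_order_relaxed))
		return (toy_lock_slow(m, tid));
	return (true);
}
```

With profiling off and the lock free, `toy_lock()` never leaves the inline
path; turning the flag on routes every acquisition through the out-of-line
routine.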


# 90836c32 04-Feb-2017 Mateusz Guzik <mjg@FreeBSD.org>

mtx: switch to fcmpset

The found value is passed to locking routines in order to reduce cacheline
accesses.

mtx_unlock grows an explicit check for regular unlock. On ll/sc architectures
the routine can fail even if the lock could have been handled by the inline
primitive.

Discussed with: jhb
Tested by: pho (previous version)
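
The benefit of an fcmpset-style operation, which hands back the value it
found on failure, can be sketched with the C11 compare-exchange, which
behaves similarly. `toy_mtx` and `toy_obtain()` are hypothetical names. The
strong variant is used so the example is deterministic; the weak variant,
which may fail spuriously, is the closer analogue of the ll/sc behaviour
mentioned above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

struct toy_mtx {
	_Atomic uintptr_t word;	/* 0 when unowned, else the owner's id */
};

/*
 * Try to take the lock. On failure *found holds the value that was seen,
 * so a slow path can use it without reading the lock word again.
 */
static bool
toy_obtain(struct toy_mtx *m, uintptr_t tid, uintptr_t *found)
{
	assert(tid != 0);
	*found = 0;	/* we expect the lock to be unowned */
	return (atomic_compare_exchange_strong_explicit(&m->word, found, tid,
	    memory_order_acquire, memory_order_relaxed));
}
```

After a failed attempt the caller already knows who owns the lock, which is
the "found value" being passed along to the locking routines.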


# 2604eb9e 03-Jan-2017 Mateusz Guzik <mjg@FreeBSD.org>

mtx: reduce lock accesses

Instead of spuriously re-reading the lock value, read it once.

This change also has a side effect of fixing a performance bug:
on failed _mtx_obtain_lock, it was possible that re-read would find
the lock is unowned, but in this case the primitive would make a trip
through turnstile code.

This is diff reduction to a variant which uses atomic_fcmpset.

Discussed with: jhb (previous version)
Tested by: pho (previous version)


# 90b581f2 22-Jul-2016 Konstantin Belousov <kib@FreeBSD.org>

Implement mtx_trylock_spin(9).

Discussed with: bde
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D7192


# fc4f686d 01-Jun-2016 Mateusz Guzik <mjg@FreeBSD.org>

Microoptimize locking primitives by avoiding unnecessary atomic ops.

Inline versions of primitives do an atomic op and, if it fails, fall back to
the actual primitives, which immediately retry the atomic op.

The obvious optimisation is to check if the lock is free and only then proceed
to do an atomic op.

Reviewed by: jhb, vangyzen
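
A minimal sketch of the "look before you leap" optimisation, using C11
atomics. `toy_mtx` and `toy_trylock_checked()` are hypothetical, and the
`atomic_ops` counter is an assumption added only to make the saved
operation visible.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

struct toy_mtx {
	_Atomic uintptr_t word;	/* 0 when unowned, else the owner's id */
};

/* Try to lock, counting how many atomic read-modify-write ops were issued. */
static bool
toy_trylock_checked(struct toy_mtx *m, uintptr_t tid, unsigned *atomic_ops)
{
	uintptr_t expected = 0;

	assert(tid != 0);
	/* Plain read first: skip the atomic op when the lock is visibly held. */
	if (atomic_load_explicit(&m->word, memory_order_relaxed) != 0)
		return (false);
	(*atomic_ops)++;
	return (atomic_compare_exchange_strong_explicit(&m->word, &expected,
	    tid, memory_order_acquire, memory_order_relaxed));
}
```

A second attempt on a held lock returns early without touching the counter,
so the cache line is read but never requested for exclusive ownership.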


# 32cd0147 19-Jul-2015 Mark Johnston <markj@FreeBSD.org>

Implement the lockstat provider using SDT(9) instead of the custom provider
in lockstat.ko. This means that lockstat probes now have typed arguments and
will utilize SDT probe hot-patching support when it arrives.

Reviewed by: gnn
Differential Revision: https://reviews.freebsd.org/D2993


# fd07ddcf 13-Dec-2014 Dmitry Chagin <dchagin@FreeBSD.org>

Add _NEW flag to mtx(9), sx(9), rmlock(9) and rwlock(9).
A _NEW flag can be passed to _init_flags() to skip the check for double-init.

Differential Revision: https://reviews.freebsd.org/D1208
Reviewed by: jhb, wblock
MFC after: 1 Month


# a2496f6e 02-May-2014 Robert Watson <rwatson@FreeBSD.org>

Garbage collect mtxpool_lockbuilder, the mutex pool historically used
for lockmgr and sx interlocks, but unused since optimised versions of
those sleep locks were introduced. This will save a (quite) small
amount of memory in all kernel configurations. The sleep mutex pool is
retained as it is used for 'struct bio' and several other consumers.

Discussed with: jhb
MFC after: 3 days


# 3a6cdc4e 28-Jan-2014 John-Mark Gurney <jmg@FreeBSD.org>

fix spelling of lock_initialized.. jhb approved..

MFC after: 1 week


# 54366c0b 25-Nov-2013 Attilio Rao <attilio@FreeBSD.org>

- For kernels compiled only with KDTRACE_HOOKS and not any lock debugging
option, unbreak the lock tracing release semantic by embedding
calls to LOCKSTAT_PROFILE_RELEASE_LOCK() directly in the inlined
version of the releasing functions for mutex, rwlock and sxlock.
Failing to do so skips the lockstat_probe_func invocation for
unlocking.
- As part of the LOCKSTAT support is inlined in mutex operation, for
kernels compiled without lock debugging options, potentially every
consumer must be compiled including opt_kdtrace.h.
Fix this by moving KDTRACE_HOOKS into opt_global.h and remove the
dependency on opt_kdtrace.h for all files, as now only KDTRACE_FRAMES
is linked there and it is only used as a compile-time stub [0].

[0] immediately shows some new bug as DTRACE-derived support for debug
in sfxge is broken and it was never really tested. As it was not
including correctly opt_kdtrace.h before it was never enabled so it
was kept broken for a while. Fix this by using a protection stub,
leaving sfxge driver authors the responsibility for fixing it
appropriately [1].

Sponsored by: EMC / Isilon storage division
Discussed with: rstone
[0] Reported by: rstone
[1] Discussed with: philip


# f112d4f8 01-Jun-2013 John Baldwin <jhb@FreeBSD.org>

Remove an unused macro we originally got from BSD/OS.


# 24e48c6d 03-Mar-2013 Davide Italiano <davide@FreeBSD.org>

MFcalloutng:
Introduce sbt variants of msleep(), msleep_spin(), pause(), tsleep() in
the KPI, allowing the timeout to be specified in 'sbintime_t' rather than ticks.

Sponsored by: Google Summer of Code 2012, iXsystems inc.
Tested by: flo, marius, ian, markj, Fabian Keil


# e10acbc4 02-Nov-2012 Attilio Rao <attilio@FreeBSD.org>

Tweak comment to make it clearer why it will fail.

Submitted by: jimharris


# 7f44c618 31-Oct-2012 Attilio Rao <attilio@FreeBSD.org>

Give mtx(9) the ability to crunch different types of structures, with the
only constraint that they have a lock cookie named mtx_lock.
This name then becomes reserved for structs that want to use the
mtx(9) KPI, and other locking primitives cannot reuse it for their
members.

Namely, such structs are the current struct mtx and the new
struct mtx_padalign. The new structure defines an object which has
the same layout as struct mtx but will be allocated in
areas aligned to the cache line size and will be as big as a cache line.

This is supposed to give higher performance for highly contended mutexes,
both spin and sleep (because of the adaptive spinning), where the cache
line contention results in too much traffic on the system bus.

The struct mtx_padalign can be used in a completely transparent way
with the mtx(9) KPI.

At the moment, a possibility to MFC the patch should be carefully
evaluated because this patch breaks the low level KPI
(not its representation though).

Discussed with: jhb
Reviewed by: jeff, andre
Reviewed by: mdf (earlier version)
Tested by: jimharris
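
The two ideas in this commit, macros keyed on a member named `mtx_lock` and
a cache-line-sized variant of the structure, can be sketched in portable
C11. Everything here is a simplified assumption: the `toy_*` types carry a
`name` pointer instead of a `struct lock_object`, `TOY_CACHE_LINE` is fixed
at 64 rather than taken from the kernel's CACHE_LINE_SIZE, and the kernel's
own definition may use a different alignment mechanism.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_CACHE_LINE	64	/* assumed cache line size */

struct toy_mtx {
	const char *name;
	volatile uintptr_t mtx_lock;	/* the lock cookie */
};

/* Same leading layout, but aligned to and padded out to one cache line. */
struct toy_mtx_padalign {
	alignas(TOY_CACHE_LINE) const char *name;
	volatile uintptr_t mtx_lock;
};

/* The macros only require a member called mtx_lock, not a specific type. */
#define TOY_OWNER(m)		((m)->mtx_lock)
#define TOY_SET_OWNER(m, tid)	((m)->mtx_lock = (uintptr_t)(tid))

static_assert(sizeof(struct toy_mtx_padalign) == TOY_CACHE_LINE,
    "padded mutex should fill exactly one cache line");
static_assert(offsetof(struct toy_mtx, mtx_lock) ==
    offsetof(struct toy_mtx_padalign, mtx_lock),
    "lock cookie should sit at the same offset in both layouts");
```

Because the macros are written against the member name, the same
`TOY_SET_OWNER()` call compiles for either structure, which is what lets the
padded variant be used transparently.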


# 35370593 11-Dec-2011 Andriy Gapon <avg@FreeBSD.org>

panic: add a switch and infrastructure for stopping other CPUs in SMP case

The historical behavior of letting other CPUs merrily go on remains the
default for the time being. The new behavior can be switched on via the
kern.stop_scheduler_on_panic tunable and sysctl.

Stopping of the CPUs has (at least) the following benefits:
- more of the system state at panic time is preserved intact
- threads and interrupts do not interfere with dumping of the system
state

Only one thread runs uninterrupted after panic if stop_scheduler_on_panic
is set. That thread might call code that is also used in normal context
and that code might use locks to prevent concurrent execution of certain
parts. Those locks might be held by the stopped threads and would never
be released. To work around this issue, it was decided that instead of
explicit checks for panic context, we would rather put those checks
inside the locking primitives.

This change has substantial portions written and re-written by attilio
and kib at various times. Other changes are heavily based on the ideas
and patches submitted by jhb and mdf. bde has provided many insights
into the details and history of the current code.

The new behavior may cause problems for systems that use a USB keyboard
for interfacing with system console. This is because of some unusual
locking patterns in the ukbd code which have to be used because on one
hand ukbd is below syscons, but on the other hand it has to interface
with other usb code that uses regular mutexes/Giant for its concurrency
protection. Dumping to USB-connected disks may also be affected.

PR: amd64/139614 (at least)
In cooperation with: attilio, jhb, kib, mdf
Discussed with: arch@, bde
Tested by: Eugene Grosbein <eugen@grosbein.net>,
gnn,
Steven Hartland <killing@multiplay.co.uk>,
glebius,
Andrew Boyer <aboyer@averesystems.com>
(various versions of the patch)
MFC after: 3 months (or never)


# ccdf2333 20-Nov-2011 Attilio Rao <attilio@FreeBSD.org>

Introduce macro stubs in the mutex implementation that will be always
defined and will allow consumers, willing to provide options, file and
line to locking requests, to not worry about options redefining the
interfaces.
This is typically useful when there is the need to build another
locking interface on top of the mutex one.

The introduced functions that consumers can use are:
- mtx_lock_flags_
- mtx_unlock_flags_
- mtx_lock_spin_flags_
- mtx_unlock_spin_flags_
- mtx_assert_
- thread_lock_flags_

Spare notes:
- Likely we can get rid of all the 'INVARIANTS' specification in the
ppbus code by using the same macro as done in this patch (but this is
left to the ppbus maintainer)
- all the other locking interfaces may require a similar cleanup, where
the most notable case is sx which will allow a further cleanup of
vm_map locking facilities
- The patch should be fully compatible with older branches, thus an MFC
is planned (in fact it uses only underlying mechanisms that are already
present).

Comments review by: eadler, Ben Kaduk
Discussed with: kib, jhb
MFC after: 1 month
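
The purpose of always-defined stubs can be sketched like this. The names
`toy_lock_flags_`, `toy_lock_impl`, `toy_layer_lock` and the
`TOY_LOCK_DEBUG` switch are hypothetical; the real macros and the kernel
options that select between variants are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

struct toy_mtx {
	int held;
	const char *file;	/* last locker, recorded only when debugging */
	int line;
};

#ifdef TOY_LOCK_DEBUG
static void
toy_lock_impl(struct toy_mtx *m, int opts, const char *file, int line)
{
	assert(m != NULL);
	(void)opts;
	m->held = 1;
	m->file = file;
	m->line = line;
}
#define toy_lock_flags_(m, opts, file, line)				\
	toy_lock_impl((m), (opts), (file), (line))
#else
static void
toy_lock_impl(struct toy_mtx *m, int opts)
{
	assert(m != NULL);
	(void)opts;
	m->held = 1;
}
/* file and line are accepted but dropped in this configuration. */
#define toy_lock_flags_(m, opts, file, line)				\
	toy_lock_impl((m), (opts))
#endif

/* A layered interface forwards its caller's location unconditionally. */
#define toy_layer_lock(m)	toy_lock_flags_((m), 0, __FILE__, __LINE__)
```

The layered macro never needs its own `#ifdef`: it always passes a file and
line, and the stub decides whether they are used.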


# d576deed 16-Nov-2011 Pawel Jakub Dawidek <pjd@FreeBSD.org>

Constify arguments for locking KPIs where possible.

This enables locking consumers to pass their own structures around as const and
be able to assert locks embedded into those structures.

Reviewed by: ed, kib, jhb


# 4fb70884 13-Feb-2011 Alan Cox <alc@FreeBSD.org>

Retire mp_fixme(). It's no longer used.


# 961135ea 09-Nov-2010 John Baldwin <jhb@FreeBSD.org>

- Remove <machine/mutex.h>. Most of the headers were empty, and the
contents of the ones that were not empty were stale and unused.
- Now that <machine/mutex.h> no longer exists, there is no need to allow it
to override various helper macros in <sys/mutex.h>.
- Rename various helper macros for low-level operations on mutexes to live
in the _mtx_* or __mtx_* namespaces. While here, change the names to more
closely match the real API functions they are backing.
- Drop support for including <sys/mutex.h> in assembly source files.

Suggested by: bde (1, 2)


# a7d5f7eb 19-Oct-2010 Jamie Gritton <jamie@FreeBSD.org>

A new jail(8) with a configuration file, to replace the work currently done
by /etc/rc.d/jail.


# 0ee8ad67 29-Sep-2010 John Baldwin <jhb@FreeBSD.org>

Account for unlocking a spin mutex in the lock profiling code in the !SMP
case.

Submitted by: truckman
MFC after: 3 days


# 20a0556c 21-Jun-2009 Roman Divacky <rdivacky@FreeBSD.org>

In non-debugging mode make this define (void)0 instead of nothing. This
helps to catch bugs like the below with clang.

    if (cond);      <--- note the trailing ;
        something();

Approved by: ed (mentor)
Discussed on: current@
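
One plausible reading of this change, which the commit does not spell out:
with an empty expansion, a legitimate `if (cond) MACRO(x);` preprocesses to
`if (cond) ;`, which looks exactly like the bug shown above, so an
empty-body warning could not tell the two apart. The sketch below uses
hypothetical names (`TOY_ASSERT_SANE`, `TOY_DEBUG`, `toy_take()`) to show
the non-debugging expansion.

```c
#include <assert.h>

#ifdef TOY_DEBUG
#define TOY_ASSERT_SANE(b)	assert(*(b) < 1000)
#else
/* Expands to a real (if useless) expression rather than to nothing. */
#define TOY_ASSERT_SANE(b)	(void)0
#endif

/* Take one unit from a budget; returns 1 on success, 0 if it is empty. */
static int
toy_take(int *budget)
{
	if (*budget == 0)
		return (0);
	if (*budget > 100)
		TOY_ASSERT_SANE(budget);	/* "(void)0;", not a bare ";" */
	(*budget)--;
	return (1);
}
```

With the `(void)0` form, the `if` above has a non-empty body after
preprocessing, so only a genuinely stray semicolon would leave one empty.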


# a5aedd68 26-May-2009 Stacey Son <sson@FreeBSD.org>

Add the OpenSolaris dtrace lockstat provider. The lockstat provider
adds probes for mutexes, reader/writer and shared/exclusive locks to
gather contention statistics and other locking information for
dtrace scripts, the lockstat(1M) command and other potential
consumers.

Reviewed by: attilio jhb jb
Approved by: gnn (mentor)


# d7f03759 19-Oct-2008 Ulf Lilleengen <lulf@FreeBSD.org>

- Import the HEAD csup code which is the basis for the cvsmode work.


# 90356491 15-May-2008 Attilio Rao <attilio@FreeBSD.org>

- Embed the recursion counter for any locking primitive directly in the
lock_object, using a unified field called lo_data.
- Replace lo_type usage with the w_name usage and at init time pass the
lock "type" directly to witness_init() from the parent lock init
function. Handle delayed initialization before
witness_initialize() is called through the witness_pendhelp structure.
- Axe out LO_ENROLLPEND as it is not really needed. The case where a
mutex with delayed init wants to be destroyed can't happen because
witness_destroy() checks for witness_cold and panics in that case.
- In enroll(), if we cannot allocate a new object from the freelist,
notify that to userspace through a printf().
- Modify the depart function in order to return nothing as in the current
CVS version it always returns true and adjust callers accordingly.
- Fix the witness_addgraph() argument name prototype.
- Remove useless code from itismychild().

This commit leads to a smaller struct lock_object and so smaller locks,
in particular on amd64 where 2 uintptr_t (16 bytes per-primitive) are
saved.

Reviewed by: jhb


# d716b994 19-Nov-2007 Attilio Rao <attilio@FreeBSD.org>

Unify assertion flags for all the main primitives using the LA_* underlying
family of macros. This will allow unified flags to be used for assertions
with the generic locking primitive class.


# 0bf686c1 06-Aug-2007 Robert Watson <rwatson@FreeBSD.org>

Remove the now-unused NET_{LOCK,UNLOCK,ASSERT}_GIANT() macros, which
previously conditionally acquired Giant based on debug.mpsafenet. As that
has now been removed, they are no longer required. Removing them
significantly simplifies error-handling in the socket layer, eliminating
quite a bit of unwinding of locking in error cases.

While here clean up the now unneeded opt_net.h, which previously was used
for the NET_WITH_GIANT kernel option. Clean up some related gotos for
consistency.

Reviewed by: bz, csjp
Tested by: kris
Approved by: re (kensmith)


# c6b28997 28-Jul-2007 Robert Watson <rwatson@FreeBSD.org>

Replace references to NET_CALLOUT_MPSAFE with CALLOUT_MPSAFE, and remove
definition of NET_CALLOUT_MPSAFE, which is no longer required now that
debug.mpsafenet has been removed.

The once over: bz
Approved by: re (kensmith)


# 33d2bb9c 27-Jul-2007 Robert Watson <rwatson@FreeBSD.org>

First in a series of changes to remove the now-unused Giant compatibility
framework for non-MPSAFE network protocols:

- Remove debug_mpsafenet variable, sysctl, and tunable.
- Remove NET_NEEDS_GIANT() and the associated SYSINITs used by it to force
debug.mpsafenet=0 if non-MPSAFE protocols are compiled into the kernel.
- Remove logic to automatically flag interrupt handlers as non-MPSAFE if
debug.mpsafenet is set for an INTR_TYPE_NET handler.
- Remove logic to automatically flag netisr handlers as non-MPSAFE if
debug.mpsafenet is set.
- Remove references in a few subsystems, including NFS and Cronyx drivers,
which keyed off debug_mpsafenet to determine various aspects of their own
locking behavior.
- Convert NET_LOCK_GIANT(), NET_UNLOCK_GIANT(), and NET_ASSERT_GIANT into
no-op's, as their entire behavior was determined by the value in
debug_mpsafenet.
- Alias NET_CALLOUT_MPSAFE to CALLOUT_MPSAFE.

Many remaining references to NET_.*_GIANT() and NET_CALLOUT_MPSAFE are still
present in subsystems, and will be removed in followup commits.

Reviewed by: bz, jhb
Approved by: re (kensmith)


# 6ea38de8 18-Jul-2007 Jeff Roberson <jeff@FreeBSD.org>

- Remove the global definition of sched_lock in mutex.h to break
new code and third party modules which try to depend on it.
- Initialize sched_lock in sched_4bsd.c.
- Declare sched_lock in sparc64 pmap.c and assert that we're compiling
with SCHED_4BSD to prevent accidental crashes from running ULE. This
is the sole remaining file outside of the scheduler that uses the
global sched_lock.

Approved by: re


# 710eacdc 05-Jun-2007 Jeff Roberson <jeff@FreeBSD.org>

- Placing the 'volatile' on the right side of the * in the td_lock
declaration removes the need for __DEVOLATILE().

Pointed out by: tegge
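
The effect of where `volatile` sits can be shown in a few lines. The types
below are hypothetical stand-ins, not the kernel's `struct thread`, and
`toy_thread_lock_ptr()` is an illustrative helper.

```c
#include <assert.h>
#include <stddef.h>

struct toy_mtx {
	int dummy;
};

struct toy_thread {
	/* volatile after the '*': the pointer itself is volatile. */
	struct toy_mtx *volatile td_lock;
};

/* Read the current lock pointer of a thread. */
static struct toy_mtx *
toy_thread_lock_ptr(struct toy_thread *td)
{
	assert(td != NULL);
	/* No cast needed: reading a volatile pointer yields a plain pointer. */
	return (td->td_lock);
}
```

Had the field been declared `volatile struct toy_mtx *td_lock`, the pointee
would be volatile instead, and returning it as a plain `struct toy_mtx *`
would require casting the qualifier away.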


# 7b20fb19 04-Jun-2007 Jeff Roberson <jeff@FreeBSD.org>

Commit 1/14 of sched_lock decomposition.
- Move all scheduler locking into the schedulers utilizing a technique
similar to solaris's container locking.
- A per-process spinlock is now used to protect the queue of threads,
thread count, suspension count, p_sflags, and other process
related scheduling fields.
- The new thread lock is actually a pointer to a spinlock for the
container that the thread is currently owned by. The container may
be a turnstile, sleepqueue, or run queue.
- thread_lock() is now used to protect access to thread related scheduling
fields. thread_unlock() unlocks the lock and thread_set_lock()
implements the transition from one lock to another.
- A new "blocked_lock" is used in cases where it is not safe to hold the
actual thread's lock yet we must prevent access to the thread.
- sched_throw() and sched_fork_exit() are introduced to allow the
schedulers to fix-up locking at these points.
- Add some minor infrastructure for optionally exporting scheduler
statistics that were invaluable in solving performance problems with
this patch. Generally these statistics allow you to differentiate
between different causes of context switches.

Tested by: kris, current@
Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)


# 4796ce49 11-Apr-2007 John Baldwin <jhb@FreeBSD.org>

Group the loop to acquire/release Giant with the WITNESS_SAVE/RESTORE under
a single conditional. The two operations are linked, but since the link
is not very direct, Coverity can't see it. Humans might miss the
link as well. So, this isn't fixing any actual bugs, just improving
readability.

CID: 1787 (likely others as well)
Found by: Coverity Prevent (tm)


# 70fe8436 03-Apr-2007 Kip Macy <kmacy@FreeBSD.org>

move lock_profile calls out of the macros and into kern_mutex.c
add check for mtx_recurse == 0 when releasing sleep lock


# 028923e5 30-Mar-2007 John Baldwin <jhb@FreeBSD.org>

- Use PARTIAL_PICKUP_GIANT() to implement PICKUP_GIANT().
- Move UGAR() macro up to the comment that describes it.
- Fix a couple of typos.


# aa89d8cd 21-Mar-2007 John Baldwin <jhb@FreeBSD.org>

Rename the 'mtx_object', 'rw_object', and 'sx_object' members of mutexes,
rwlocks, and sx locks to 'lock_object'.


# e7573e7a 09-Mar-2007 John Baldwin <jhb@FreeBSD.org>

Allow threads to atomically release rw and sx locks while waiting for an
event. Locking primitives that support this (mtx, rw, and sx) now each
include their own foo_sleep() routine.
- Rename msleep() to _sleep() and change its 'struct mtx' object to a
'struct lock_object' pointer. _sleep() uses the recently added
lc_unlock() and lc_lock() function pointers for the lock class of the
specified lock to release the lock while the thread is suspended.
- Add wrappers around _sleep() for mutexes (mtx_sleep()), rw locks
(rw_sleep()), and sx locks (sx_sleep()). msleep() still exists and
is now identical to mtx_sleep(), but it is deprecated.
- Rename SLEEPQ_MSLEEP to SLEEPQ_SLEEP.
- Rewrite much of sleep.9 to not be msleep(9) centric.
- Flesh out the 'RETURN VALUES' section in sleep.9 and add an 'ERRORS'
section.
- Add __nonnull(1) to _sleep() and msleep_spin() so that the compiler will
warn if you try to pass a NULL wait channel. The functions already have
a KASSERT to that effect.


# 224a2f31 07-Mar-2007 John Baldwin <jhb@FreeBSD.org>

Wrap a few lines at 80 cols.


# 3ad48efa 26-Feb-2007 Pawel Jakub Dawidek <pjd@FreeBSD.org>

Replace spaces with tabs in some places.


# fe68a916 26-Feb-2007 Kip Macy <kmacy@FreeBSD.org>

general LOCK_PROFILING cleanup

- only collect timestamps when a lock is contested - this reduces the overhead
of collecting profiles from 20x to 5x

- remove unused function from subr_lock.c

- generalize cnt_hold and cnt_lock statistics to be kept for all locks

- NOTE: rwlock profiling generates invalid statistics (and most likely always has)
someone familiar with that should review


# 297507d0 21-Dec-2006 Pawel Jakub Dawidek <pjd@FreeBSD.org>

Document MTX_NOPROFILE flag.


# 1364a812 15-Dec-2006 Kip Macy <kmacy@FreeBSD.org>

- Fix some gcc warnings in lock_profile.h
- add cnt_hold cnt_lock support for spin mutexes
- make sure contested is initialized to zero to only bump contested when appropriate
- move initialization function to kern_mutex.c to avoid cyclic dependency between
mutex.h and lock_profile.h


# 7c0435b9 10-Nov-2006 Kip Macy <kmacy@FreeBSD.org>

MUTEX_PROFILING has been generalized to LOCK_PROFILING. We now profile
wait (time waited to acquire) and hold times for *all* kernel locks. If
the architecture has a system synchronized TSC, the profiling code will
use that - thereby minimizing profiling overhead. Large chunks of profiling
code have been moved out of line, the overhead measured on the T1 for when
it is compiled in but not enabled is < 1%.

Approved by: scottl (standing in for mentor rwatson)
Reviewed by: des and jhb


# 186abbd7 27-Jul-2006 John Baldwin <jhb@FreeBSD.org>

Write a magic value into mtx_lock when destroying a mutex that will force
all other mtx_lock() operations to block. Previously, when the mutex was
destroyed, it would still have a valid value in mtx_lock(): either the
unowned cookie, which would allow a subsequent mtx_lock() to succeed, or a
pointer to the thread who destroyed the mutex if the mutex was locked when
it was destroyed.

MFC after: 3 days
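
A sketch of why a dedicated "destroyed" cookie helps, using C11 atomics. The
names and values are assumptions: `TOY_DESTROYED` is an arbitrary marker,
and the toy uses a try-lock that fails where the real mtx_lock() would
block.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_UNOWNED	((uintptr_t)0)
#define TOY_DESTROYED	UINTPTR_MAX	/* arbitrary marker, never a valid owner */

struct toy_mtx {
	_Atomic uintptr_t word;	/* TOY_UNOWNED, TOY_DESTROYED or an owner id */
};

/* Succeeds only if the lock word holds the unowned cookie. */
static bool
toy_trylock(struct toy_mtx *m, uintptr_t tid)
{
	uintptr_t expected = TOY_UNOWNED;

	assert(tid != TOY_UNOWNED && tid != TOY_DESTROYED);
	return (atomic_compare_exchange_strong_explicit(&m->word, &expected,
	    tid, memory_order_acquire, memory_order_relaxed));
}

/* Poison the lock word so no later acquisition can succeed. */
static void
toy_destroy(struct toy_mtx *m)
{
	atomic_store_explicit(&m->word, TOY_DESTROYED, memory_order_release);
}
```

Leaving `TOY_UNOWNED` in the word after destruction would let a stale caller
acquire a dead lock; the marker makes that use-after-destroy visible
instead.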


# 49b94bfc 03-Jun-2006 John Baldwin <jhb@FreeBSD.org>

Bah, fix fat finger in last. Invert the ~ on MTX_FLAGMASK as it's
non-intuitive for the ~ to be built into the mask. All the users now
explicitly ~ the mask. In addition, add MTX_UNOWNED to the mask even
though it technically isn't a flag. This should unbreak mtx_owner().

Quickly spotted by: kris
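
The mask convention described here can be sketched as follows. The bit
values are illustrative assumptions (the unowned cookie is shown as a low
bit, consistent with the later commit that changed MTX_UNOWNED from 4 to
0), and `toy_owner()` is a hypothetical stand-in for mtx_owner().

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative low-bit values; owner pointers are assumed 8-byte aligned. */
#define TOY_RECURSED	((uintptr_t)0x1)
#define TOY_CONTESTED	((uintptr_t)0x2)
#define TOY_UNOWNED	((uintptr_t)0x4)

/* The mask names the flag bits themselves; users apply ~ explicitly. */
#define TOY_FLAGMASK	(TOY_RECURSED | TOY_CONTESTED | TOY_UNOWNED)

static_assert(TOY_FLAGMASK == 0x7, "mask should cover the three low bits");

/* Strip the flag bits from a lock word, leaving the owner (0 if none). */
static uintptr_t
toy_owner(uintptr_t word)
{
	return (word & ~TOY_FLAGMASK);
}
```

Including the unowned cookie in the mask is what makes `toy_owner()` return
0 for an unowned lock rather than a bogus non-null "owner".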


# 83a81bcb 17-Jan-2006 John Baldwin <jhb@FreeBSD.org>

Add a new file (kern/subr_lock.c) for holding code related to struct
lock_obj objects:
- Add new lock_init() and lock_destroy() functions to setup and teardown
lock_object objects including KTR logging and registering with WITNESS.
- Move all the handling of LO_INITIALIZED out of witness and the various
lock init functions into lock_init() and lock_destroy().
- Remove the constants for static indices into the lock_classes[] array
and change the code outside of subr_lock.c to use LOCK_CLASS to compare
against a known lock class.
- Move the 'show lock' ddb function and lock_classes[] array out of
kern_mutex.c over to subr_lock.c.


# 9d61a2e6 02-Aug-2005 John Baldwin <jhb@FreeBSD.org>

Include a SYSUNINIT() to destroy the mutex in MTX_SYSINIT. This makes
MTX_SYSINIT mutexes play well with modules that can be unloaded.

Reported by: sam
MFC after: 3 days


# 122eceef 15-Jul-2005 John Baldwin <jhb@FreeBSD.org>

Convert the atomic_ptr() operations over to operating on uintptr_t
variables rather than void * variables. This makes it easier and simpler
to get asm constraints and volatile keywords correct.

MFC after: 3 days
Tested on: i386, alpha, sparc64
Compiled on: ia64, powerpc, amd64
Kernel toolchain busted on: arm


# 849bfaf9 23-Jun-2005 John Baldwin <jhb@FreeBSD.org>

Adjust some comments to be a bit more correct.

Approved by: re (scottl)


# 951407ab 22-Apr-2005 Jeff Roberson <jeff@FreeBSD.org>

- Define LOP_DUPOK in lock.h so that we may pass it to individual
witness calls rather than as a flag on the lock object.
- Define MTX_DUPOK in terms of LOP_DUPOK in mutex.h.

Sponsored by: Isilon Systems, Inc.


# c6a37e84 04-Apr-2005 John Baldwin <jhb@FreeBSD.org>

Divorce critical sections from spinlocks. Critical sections as denoted by
critical_enter() and critical_exit() are now solely a mechanism for
deferring kernel preemptions. They no longer have any effect on
interrupts. This means that standalone critical sections are now very
cheap as they are simply unlocked integer increments and decrements for the
common case.

Spin mutexes now use a separate KPI implemented in MD code: spinlock_enter()
and spinlock_exit(). This KPI is responsible for providing whatever MD
guarantees are needed to ensure that a thread holding a spin lock won't
be preempted by any other code that will try to lock the same lock. For
now all archs continue to block interrupts in a "spinlock section" as they
did formerly in all critical sections. Note that I've also taken this
opportunity to push a few things into MD code rather than MI. For example,
critical_fork_exit() no longer exists. Instead, MD code ensures that new
threads have the correct state when they are created. Also, we no longer
try to fixup the idlethreads for APs in MI code. Instead, each arch sets
the initial curthread and adjusts the state of the idle thread it borrows
in order to perform the initial context switch.

This change is largely a big NOP, but the cleaner separation it provides
will allow for more efficient alternative locking schemes in other parts
of the kernel (bare critical sections rather than per-CPU spin mutexes
for per-CPU data for example).

Reviewed by: grehan, cognet, arch@, others
Tested on: i386, alpha, sparc64, powerpc, arm, possibly more


# d111bbbc 01-Mar-2005 Gleb Smirnoff <glebius@FreeBSD.org>

Add macro NET_CALLOUT_MPSAFE, which should be used when initializing
network related callouts.

Reviewed by: rwatson


# 5fec8ab2 28-Feb-2005 John Baldwin <jhb@FreeBSD.org>

Define the _mtx_assert() function prototype as well as the MA_* constants
if either INVARIANTS or INVARIANT_SUPPORT is defined so that kernel modules
that want to use mtx_assert() only need to define INVARIANTS.

MFC after: 1 week


# 33fb8a38 05-Jan-2005 John Baldwin <jhb@FreeBSD.org>

Rework the optimization for spinlocks on UP to be slightly less drastic and
turn it back on. Specifically, the actual changes are now less intrusive
in that the _get_spin_lock() and _rel_spin_lock() macros now have their
contents changed for UP vs SMP kernels which centralizes the changes.
Also, UP kernels do not use _mtx_lock_spin() and no longer include it. The
UP versions of the spin lock functions do not use any atomic operations,
but simple compares and stores which allow mtx_owned() to still work for
spin locks while removing the overhead of atomic operations.

Tested on: i386, alpha
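
Note: the UP behavior described above can be modeled in user space as follows. This is a minimal sketch, not the kernel macros: struct mtx_model, up_lock_spin(), up_unlock_spin() and model_owned() are hypothetical names, and the interrupt blocking that a real spin lock also performs is left out.

```c
#include <stdint.h>

/* Hypothetical user-space model; these names are not from sys/mutex.h. */
#define MODEL_UNOWNED ((uintptr_t)0)

struct mtx_model {
	uintptr_t owner;	/* thread id of the holder, or MODEL_UNOWNED */
	unsigned recurse;	/* recursion depth beyond the first acquire */
};

/* UP acquire: a plain compare and store, with no atomic operations. */
static void
up_lock_spin(struct mtx_model *m, uintptr_t tid)
{
	if (m->owner == tid)
		m->recurse++;
	else
		m->owner = tid;
}

/* UP release: undo one level of recursion or clear the owner. */
static void
up_unlock_spin(struct mtx_model *m)
{
	if (m->recurse > 0)
		m->recurse--;
	else
		m->owner = MODEL_UNOWNED;
}

/* Ownership test keeps working because the owner is still recorded. */
static int
model_owned(const struct mtx_model *m, uintptr_t tid)
{
	return (m->owner == tid);
}
```

Because the owner field is still written on every acquire, an ownership query stays meaningful even though nothing is atomic.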


# bdcfcf5b 04-Aug-2004 John Baldwin <jhb@FreeBSD.org>

Cache the value of curthread in the _get_sleep_lock() and _get_spin_lock()
macros and pass the value to the associated _mtx_*() functions to avoid
more curthread dereferences in the function implementations. This provided
a very modest perf improvement in some benchmarks.

Suggested by: rwatson
Tested by: scottl


# 4dd70299 04-Aug-2004 John Baldwin <jhb@FreeBSD.org>

Whitespace fix.


# 7372ef9a 20-Jun-2004 Robert Watson <rwatson@FreeBSD.org>

Include an annotation of NET_{LOCK,UNLOCK}_GIANT() noting that these
calls do not have the same recursion semantics as DROP_GIANT and
PICKUP_GIANT.


# 7101d752 28-Mar-2004 Robert Watson <rwatson@FreeBSD.org>

Invert the logic of NET_LOCK_GIANT(), and remove the one reference to it.
Previously, Giant would be grabbed at entry to the IP local delivery code
when debug.mpsafenet was set to true, as that implied Giant wouldn't be
grabbed in the driver path. Now, we will use this primitive to
conditionally grab Giant in the event the entire network stack isn't
running MPSAFE (debug.mpsafenet == 0).


# 6200a93f 01-Mar-2004 Robert Watson <rwatson@FreeBSD.org>

Rename NET_PICKUP_GIANT() to NET_LOCK_GIANT(), and NET_DROP_GIANT()
to NET_UNLOCK_GIANT(). While they are used in similar ways, the
semantics are quite different -- NET_LOCK_GIANT() and NET_UNLOCK_GIANT()
directly wrap mutex lock and unlock operations, whereas drop/pickup
special case the handling of Giant recursion. Add a comment saying
as much.

Add NET_ASSERT_GIANT(), which conditionally asserts Giant based
on the value of debug_mpsafenet.


# d3be1471 05-Nov-2003 Sam Leffler <sam@FreeBSD.org>

o make debug_mpsafenet globally visible
o move it from subr_bus.c to netisr.c where it more properly belongs
o add NET_PICKUP_GIANT and NET_DROP_GIANT macros that will be used to
grab Giant as needed when MPSAFE operation is enabled

Supported by: FreeBSD Foundation


# c16ec48b 16-Oct-2003 Jeff Roberson <jeff@FreeBSD.org>

- mtx_ownedby() was unpopular and is no longer needed. Remove it.


# a34419fe 12-Oct-2003 Jeff Roberson <jeff@FreeBSD.org>

- Implement a mtx_ownedby() macro which can be used to determine if a
particular thread owns a mutex. This cannot be done without races
unless the thread is curthread.


# 0fd7279e 19-Sep-2003 Sam Leffler <sam@FreeBSD.org>

revert rev 1.64; this is not needed with rev 1.49 of lock.h
as LOCK_DEBUG is implied by MUTEX_PROFILING which stops inline expansion
of the mutex operations

Supported by: FreeBSD Foundation


# afc77db4 19-Sep-2003 John Baldwin <jhb@FreeBSD.org>

Don't inline mutex operations if MUTEX_PROFILING is enabled.

Reported by: sam


# 64e6fa28 16-Jul-2003 Don Lewis <truckman@FreeBSD.org>

Nuke the declaration of a function which was not implemented.


# 857d9c60 12-Jul-2003 Don Lewis <truckman@FreeBSD.org>

Extend the mutex pool implementation to permit the creation and use of
multiple mutex pools with different options and sizes. Mutex pools can
be created with either the default sleep mutexes or with spin mutexes.
A dynamically created mutex pool can now be destroyed if it is no longer
needed.

Create two pools by default, one that matches the existing pool that
uses the MTX_NOWITNESS option that should be used for building higher
level locks, and a new pool with witness checking enabled.

Modify the users of the existing mutex pool to use the appropriate pool
in the new implementation.

Reviewed by: jhb


# 8c33536c 17-May-2003 Scott Long <scottl@FreeBSD.org>

Add the MUTEX_NOINLINE option that explicitly de-inlines the mutex
operations.

Submitted by: jhb


# f949f795 23-Mar-2003 Tim J. Robbins <tjr@FreeBSD.org>

Remove unused mtx_lock_giant(), mtx_unlock_giant(), related globals
and sysctls.


# 75d468ee 11-Mar-2003 John Baldwin <jhb@FreeBSD.org>

Axe the useless MTX_SLEEPABLE flag. mutexes are not sleepable locks.
Nothing used this flag and WITNESS would have panic'd during mtx_init()
if anything had.


# 25ba12df 28-Dec-2002 Poul-Henning Kamp <phk@FreeBSD.org>

Allow lint-like tools to override DROP_GIANT and friends.

Apply parens around macro arguments.
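
Note: a minimal sketch of why parenthesizing macro arguments matters. The macros below are hypothetical and are not the ones touched by this commit.

```c
/* Hypothetical macros for illustration only; not taken from sys/mutex.h. */
#define DOUBLE_UNSAFE(x)	(x * 2)		/* argument not parenthesized */
#define DOUBLE_SAFE(x)		((x) * 2)	/* argument parenthesized */

/* With the argument 1 + 2, the unsafe form expands to (1 + 2 * 2). */
static int
unsafe_result(void)
{
	return (DOUBLE_UNSAFE(1 + 2));
}

/* The safe form expands to ((1 + 2) * 2). */
static int
safe_result(void)
{
	return (DOUBLE_SAFE(1 + 2));
}
```

Operator precedence inside the expansion is what changes the result, so the unsafe form silently computes a different value.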


# ce39e722 27-Jul-2002 John Baldwin <jhb@FreeBSD.org>

Disable optimization of spinlocks on UP kernels w/o debugging for now
since it breaks mtx_owned() on spin mutexes when used outside of
mtx_assert(). Unfortunately we currently use it in the i386 MD code
and in the sio(4) driver.

Reported by: bde


# e8fdcfb5 21-May-2002 John Baldwin <jhb@FreeBSD.org>

Optimize spin mutexes for UP kernels without debugging to just enter and
exit critical sections. We only contest on a spin mutex on an SMP kernel
running on an SMP machine.


# 0c88508a 04-Apr-2002 John Baldwin <jhb@FreeBSD.org>

Change mtx_init() to now take an extra argument. The third argument is
the generic lock type for use with witness. If this argument is NULL then
the lock name is used as the lock type. Add a macro for a lock type name
for network driver locks.
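
Note: the fallback rule for the new argument can be sketched in user space as below. effective_lock_type() is a hypothetical helper written for this illustration, not a kernel function. The kernel-side call is shown only in a comment, with a made-up softc field and lock name.

```c
#include <stddef.h>

/*
 * Kernel-side call shape after this change (not compiled here; sc_mtx and
 * the lock name are made up for the example):
 *
 *	mtx_init(&sc->sc_mtx, "foo softc", NULL, MTX_DEF);
 */

/* Hypothetical model of the rule: a NULL type falls back to the name. */
static const char *
effective_lock_type(const char *name, const char *type)
{
	return (type != NULL ? type : name);
}
```

Passing a shared type string lets many differently named locks be treated as one class for lock order checking, which is the stated purpose of the new argument.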


# c53c013b 02-Apr-2002 John Baldwin <jhb@FreeBSD.org>

- Move the MI mutexes sched_lock and Giant from being declared in the
various machdep.c's to being declared in kern_mutex.c.
- Add a new function mutex_init() used to perform early initialization
needed for mutexes such as setting up thread0's contested lock list
and initializing MI mutexes. Change the various MD startup routines
to call this function instead of duplicating all the code themselves.

Tested on: alpha, i386


# 93d3325f 02-Apr-2002 Dag-Erling Smørgrav <des@FreeBSD.org>

Oops, forgot to commit the definition of the mtx_name() macro.


# c27b5699 02-Apr-2002 Andrew R. Reiter <arr@FreeBSD.org>

- Add MTX_SYSINIT and SX_SYSINIT as macro glue for allowing sx and mtx
locks to be able to setup a SYSINIT call. This helps in places where
a lock is needed to protect some data, but the data is not truly
associated with a subsystem that can properly initialize its lock.
The macros use the mtx_sysinit() and sx_sysinit() functions,
respectively, as the handler argument to SYSINIT().

Reviewed by: alfred, jhb, smp@


# f22a4b62 27-Mar-2002 Jeff Roberson <jeff@FreeBSD.org>

Add a new mtx_init option "MTX_DUPOK" which allows duplicate acquires of locks
with this flag. Remove the dup_list and dup_ok code from subr_witness. Now
we just check for the flag instead of doing string compares.

Also, switch the process lock, process group lock, and uma per cpu locks over
to this interface. The original mechanism did not work well for uma because
per cpu lock names are unique to each zone.

Approved by: jhb


# 2e5cf899 15-Mar-2002 John Baldwin <jhb@FreeBSD.org>

Fix a stupid whitespace style bogon from way back in the declarations of
sched_lock and Giant.


# 735da6de 18-Feb-2002 Matthew Dillon <dillon@FreeBSD.org>

Add kern_giant_ucred to instrument Giant around ucred related operations
such as getgid(), setgid(), etc...


# c86b6ff5 05-Jan-2002 John Baldwin <jhb@FreeBSD.org>

Change the preemption code for software interrupt thread schedules and
mutex releases to not require flags for the cases when preemption is
not allowed:

The purpose of the MTX_NOSWITCH and SWI_NOSWITCH flags is to prevent
switching to a higher priority thread on mutex release and swi schedule,
respectively when that switch is not safe. Now that the critical section
API maintains a per-thread nesting count, the kernel can easily check
whether or not it should switch without relying on flags from the
programmer. This fixes a few bugs in that all current callers of
swi_sched() used SWI_NOSWITCH, when in fact, only the ones called from
fast interrupt handlers and the swi_sched of softclock needed this flag.
Note that to ensure that swi_sched()'s in clock and fast interrupt
handlers do not switch, these handlers have to be explicitly wrapped
in critical_enter/exit pairs. Presently, just wrapping the handlers is
sufficient, but in the future with the fully preemptive kernel, the
interrupt must be EOI'd before critical_exit() is called. (critical_exit()
can switch due to a deferred preemption in a fully preemptive kernel.)

I've tested the changes to the interrupt code on i386 and alpha. I have
not tested ia64, but the interrupt code is almost identical to the alpha
code, so I expect it will work fine. PowerPC and ARM do not yet have
interrupt code in the tree so they shouldn't be broken. Sparc64 is
broken, but that's been ok'd by jake and tmm who will be fixing the
interrupt code for sparc64 shortly.

Reviewed by: peter
Tested on: i386, alpha


# 7e1f6dfe 17-Dec-2001 John Baldwin <jhb@FreeBSD.org>

Modify the critical section API as follows:
- The MD functions critical_enter/exit are renamed to start with a cpu_
prefix.
- MI wrapper functions critical_enter/exit maintain a per-thread nesting
count and a per-thread critical section saved state set when entering
a critical section while at nesting level 0 and restored when exiting
to nesting level 0. This moves the saved state out of spin mutexes so
that interlocking spin mutexes works properly.
- Most low-level MD code that used critical_enter/exit now uses
cpu_critical_enter/exit. MI code such as device drivers and spin
mutexes use the MI wrappers. Note that since the MI wrappers store
the state in the current thread, they do not have any return values or
arguments.
- mtx_intr_enable() is replaced with a constant CRITICAL_FORK which is
assigned to curthread->td_savecrit during fork_exit().

Tested on: i386, alpha
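
Note: the nesting scheme described above can be modeled in user space as follows. This is a sketch under stated assumptions: every name is hypothetical, cpu_state stands in for whatever MD state the cpu_ functions save, and the thread is passed explicitly because there is no curthread here (the real MI wrappers take no arguments).

```c
/* Hypothetical user-space model; names are illustrative, not kernel code. */
struct thread_model {
	int nesting;		/* critical section nesting level */
	int saved_state;	/* MD state saved at the outermost enter */
};

static int cpu_state = 1;	/* stand-in for MD state, 1 meaning enabled */

/* Stand-in for the MD enter: return the old state, then disable. */
static int
model_cpu_critical_enter(void)
{
	int old = cpu_state;

	cpu_state = 0;
	return (old);
}

/* Stand-in for the MD exit: restore the state saved earlier. */
static void
model_cpu_critical_exit(int saved)
{
	cpu_state = saved;
}

/* MI wrapper: only the transition from level 0 saves MD state. */
static void
model_critical_enter(struct thread_model *td)
{
	if (td->nesting == 0)
		td->saved_state = model_cpu_critical_enter();
	td->nesting++;
}

/* MI wrapper: only the transition back to level 0 restores MD state. */
static void
model_critical_exit(struct thread_model *td)
{
	td->nesting--;
	if (td->nesting == 0)
		model_cpu_critical_exit(td->saved_state);
}
```

Keeping the saved state in the thread rather than in each lock is what lets nested sections unwind in any order.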


# 0bbc8826 11-Dec-2001 John Baldwin <jhb@FreeBSD.org>

Overhaul the per-CPU support a bit:

- The MI portions of struct globaldata have been consolidated into a MI
struct pcpu. The MD per-CPU data are specified via a macro defined in
machine/pcpu.h. A macro was chosen over a struct mdpcpu so that the
interface would be cleaner (PCPU_GET(my_md_field) vs.
PCPU_GET(md.md_my_md_field)).
- All references to globaldata are changed to pcpu instead. In a UP kernel,
this data was stored as global variables which is where the original name
came from. In an SMP world this data is per-CPU and ideally private to each
CPU outside of the context of debuggers. This also included combining
machine/globaldata.h and machine/globals.h into machine/pcpu.h.
- The pointer to the thread using the FPU on i386 was renamed from
npxthread to fpcurthread to be identical with other architectures.
- Make the show pcpu ddb command MI with a MD callout to display MD
fields.
- The globaldata_register() function was renamed to pcpu_init() and now
init's MI fields of a struct pcpu in addition to registering it with
the internal array and list.
- A pcpu_destroy() function was added to remove a struct pcpu from the
internal array and list.

Tested on: alpha, i386
Reviewed by: peter, jake


# f2860039 13-Nov-2001 Matthew Dillon <dillon@FreeBSD.org>

Create a mutex pool API for short term leaf mutexes.
Replace the manual mutex pool in kern_lock.c (lockmgr locks) with the new API.
Replace the mutexes embedded in sxlocks with the new API.


# d23f5958 26-Oct-2001 Matthew Dillon <dillon@FreeBSD.org>

Add mtx_lock_giant() and mtx_unlock_giant() wrappers for sysctl management
of Giant during the Giant unwinding phase, and start work on instrumenting
Giant for the file and proc mutexes.

These wrappers allow developers to turn on and off Giant around various
subsystems. DEVELOPERS SHOULD NEVER TURN OFF GIANT AROUND A SUBSYSTEM JUST
BECAUSE THE SYSCTL EXISTS! General developers should only consider
turning on Giant for a subsystem whose default is off (to help track down
bugs). Only developers working on particular subsystems who know what
they are doing should consider turning off Giant.

These wrappers will greatly improve our ability to unwind Giant and test
the kernel on a (mostly) subsystem by subsystem basis. They allow Giant
unwinding developers (GUDs) to emplace appropriate subsystem and structural
mutexes in the main tree and then request that the larger community test
the work by turning off Giant around the subsystem(s), without the larger
community having to mess around with patches. These wrappers also allow
GUDs to boot into a (more likely to be) working system in the midst of
their unwinding work and to test that work under more controlled
circumstances.

There is a master sysctl, kern.giant.all, which defaults to 0 (off). If
turned on it overrides *ALL* other kern.giant sysctls and forces Giant to
be turned on for all wrapped subsystems. If turned off then Giant around
individual subsystems is controlled by various other kern.giant.XXX sysctls.

Code which overlaps multiple subsystems must have all related subsystem Giant
sysctls turned off in order to run without Giant.


# fb63feef 19-Oct-2001 John Baldwin <jhb@FreeBSD.org>

- Move the definition of LOCK_DEBUG back to sys/lock.h from sys/_lock.h.
- Change LOCK_DEBUG so that it is always on if KTR is compiled in
regardless of the state of KTR_COMPILE. This means that we no longer
need to include sys/ktr.h before sys/lock.h to ensure a valid setting
for LOCK_DEBUG.
- Change the use of LOCK_DEBUG so that it is now always defined and its
value is used instead of merely its definition. That is, instead of
#ifdef LOCK_DEBUG, code should now use #if LOCK_DEBUG > 0.
- Use this last change to #error out in sys/mutex.h if sys/lock.h isn't
included before sys/mutex.h to ensure that the proper versions of the
mutex operations are used.
- As a result of (2) sys/mutex.h no longer includes sys/ktr.h in the
KERNEL case.

Requested by: bde (1)
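
Note: a minimal sketch of the difference between testing a macro's definition and testing its value. The LOCK_DEBUG value of 0 is set by hand for the example, and the SEEN_BY_* names are hypothetical.

```c
/* Hand-set for illustration: the macro is defined, but as 0. */
#define LOCK_DEBUG 0

/* An #ifdef test only asks whether the macro exists. */
#ifdef LOCK_DEBUG
#define SEEN_BY_IFDEF 1
#else
#define SEEN_BY_IFDEF 0
#endif

/* A value test treats a macro defined as 0 as "off". */
#if LOCK_DEBUG > 0
#define SEEN_BY_VALUE 1
#else
#define SEEN_BY_VALUE 0
#endif

static int
ifdef_says_on(void)
{
	return (SEEN_BY_IFDEF);
}

static int
value_says_on(void)
{
	return (SEEN_BY_VALUE);
}
```

Once the macro is always defined, only the value test can tell the enabled and disabled configurations apart.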


# 6b12d30f 25-Sep-2001 John Baldwin <jhb@FreeBSD.org>

Include sys/ktr.h before sys/_lock.h to ensure LOCK_DEBUG is set to its
proper value.


# dde96c99 22-Sep-2001 John Baldwin <jhb@FreeBSD.org>

Since we no longer inline any debugging code in the mutex operations, move
all the debugging code into the function versions of the mutex operations
in kern_mutex.c. This reduced the __mtx_* macros to simply wrappers of
the _{get,rel}_lock_* macros, so the __mtx_* macros were also abolished in
favor of just calling the _{get,rel}_lock_* macros. The tangled hairy mass
of macros calling macros is at least a bit more sane now.


# fd1135c7 21-Sep-2001 John Baldwin <jhb@FreeBSD.org>

Use __FILE__ and __LINE__ explicitly since we know we will be using them
when calling _mtx_assert() to prevent mtx_assert() from requiring
sys/lock.h as well as sys/mutex.h.


# 52d4106b 17-Sep-2001 John Baldwin <jhb@FreeBSD.org>

Use NULL instead of __FILE__ in the !LOCK_DEBUG case in the locking code
since the filenames are only used in the LOCK_DEBUG case and are just bloat
in the !LOCK_DEBUG case.


# 58dac15e 17-Sep-2001 John Baldwin <jhb@FreeBSD.org>

Don't inline mutexes in the LOCK_DEBUG case.


# b40ce416 12-Sep-2001 Julian Elischer <julian@FreeBSD.org>

KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to the previous -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doozy!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after: ha ha ha ha


# 36565f03 30-Aug-2001 Matthew Dillon <dillon@FreeBSD.org>

Get rid of most of the GIANT_XXX assertion defines. Nobody is going to use
them, including me.


# 55be7f3e 30-Aug-2001 John Baldwin <jhb@FreeBSD.org>

Add a UGAR() macro to simplify the diff's for the Giant pushdown.


# 6d03d577 04-Jul-2001 Matthew Dillon <dillon@FreeBSD.org>

Reorg vm_page.c into vm_page.c, vm_pageq.c, and vm_contig.c (for contigmalloc).
Also removed some spl's and added some VM mutexes, but they are not actually
used yet, so this commit does not really make any operational changes
to the system.

vm_page.c relates to vm_page_t manipulation, including high level deactivation,
activation, etc... vm_pageq.c relates to finding free pages and acquiring
exclusive access to a page queue (exclusivity part not yet implemented).
And the world still builds... :-)


# 7b9673fa 04-Jul-2001 Matthew Dillon <dillon@FreeBSD.org>

cleanup: GIANT macros, rename DEPRECIATE to DEPRECATE
Move p_giant_optional to proc zero'd section
Remove (old) XXX zfree comment in pipe code


# 0cddd8f0 04-Jul-2001 Matthew Dillon <dillon@FreeBSD.org>

With Alfred's permission, remove vm_mtx in favor of a fine-grained approach
(this commit is just the first stage). Also add various GIANT_ macros to
formalize the removal of Giant, making it easy to test in a more piecemeal
fashion. These macros will allow us to test fine-grained locks to a degree
before removing Giant, and also after, and to remove Giant in a piecemeal
fashion via sysctl's on those subsystems which the authors believe can
operate without Giant.


# 2d96f0b1 04-May-2001 John Baldwin <jhb@FreeBSD.org>

- Move state about lock objects out of struct lock_object and into a new
struct lock_instance that is stored in the per-process and per-CPU lock
lists. Previously, the lock lists just kept a pointer to each lock held.
That pointer is now replaced by a lock instance which contains a pointer
to the lock object, the file and line of the last acquisition of a lock,
and various flags about a lock including its recursion count.
- If we sleep while holding a sleepable lock, then mark that lock instance
as having slept and ignore any lock order violations that occur while
acquiring Giant when we wake up with slept locks. This is ok because of
Giant's special nature.
- Allow witness to differentiate between shared and exclusive locks and
unlocks of a lock. Witness will now detect the case when a lock is
acquired first in one mode and then in another. Mutexes are always
locked and unlocked exclusively. Witness will also now detect the case
where a process attempts to unlock a shared lock while holding an
exclusive lock and vice versa.
- Fix a bug in the lock list implementation where we used the wrong
constant to detect the case where a lock list entry was full.


# fb919e4d 01-May-2001 Mark Murray <markm@FreeBSD.org>

Undo part of the tangle of having sys/lock.h and sys/mutex.h included in
other "system" header files.

Also help the deprecation of lockmgr.h by making it a sub-include of
sys/lock.h and removing sys/lockmgr.h from kernel .c files.

Sort sys/*.h includes where possible in affected files.

OK'ed by: bde (with reservations)


# 19284646 28-Mar-2001 John Baldwin <jhb@FreeBSD.org>

Rework the witness code to work with sx locks as well as mutexes.
- Introduce lock classes and lock objects. Each lock class specifies a
name and set of flags (or properties) shared by all locks of a given
type. Currently there are three lock classes: spin mutexes, sleep
mutexes, and sx locks. A lock object specifies properties of an
additional lock along with a lock name and all of the extra stuff needed
to make witness work with a given lock. This abstract lock stuff is
defined in sys/lock.h. The lockmgr constants, types, and prototypes have
been moved to sys/lockmgr.h. For temporary backwards compatibility,
sys/lock.h includes sys/lockmgr.h.
- Replace proc->p_spinlocks with a per-CPU list, PCPU(spinlocks), of spin
locks held. By making this per-cpu, we do not have to jump through
magic hoops to deal with sched_lock changing ownership during context
switches.
- Replace proc->p_heldmtx, formerly a list of held sleep mutexes, with
proc->p_sleeplocks, which is a list of held sleep locks including sleep
mutexes and sx locks.
- Add helper macros for logging lock events via the KTR_LOCK KTR logging
level so that the log messages are consistent.
- Add some new flags that can be passed to mtx_init():
- MTX_NOWITNESS - specifies that this lock should be ignored by witness.
This is used for the mutex that blocks a sx lock for example.
- MTX_QUIET - this is not new, but you can pass this to mtx_init() now
and no events will be logged for this lock, so that one doesn't have
to change all the individual mtx_lock/unlock() operations.
- All lock objects maintain an initialized flag. Use this flag to export
a mtx_initialized() macro that can be safely called from drivers. Also,
we no longer walk the all_mtx list if MUTEX_DEBUG is defined as witness
performs the corresponding checks using the initialized flag.
- The lock order reversal messages have been improved to output slightly
more accurate file and line numbers.


# 6283b7d0 27-Mar-2001 John Baldwin <jhb@FreeBSD.org>

- Switch from using save/disable/restore_intr to using critical_enter/exit
and change the u_int mtx_saveintr member of struct mtx to a critical_t
mtx_savecrit.
- On the alpha we no longer need a custom _get_spin_lock() macro to avoid
an extra PAL call, so remove it.
- Partially fix using mutexes with WITNESS in modules. Change all the
_mtx_{un,}lock_{spin,}_flags() macros to accept explicit file and line
parameters and rename them to use a prefix of two underscores. Inside
of kern_mutex.c, generate wrapper functions for
_mtx_{un,}lock_{spin,}_flags() (only using a prefix of one underscore)
that are called from modules. The macros mtx_{un,}lock_{spin,}_flags()
are mapped to the __mtx_* macros inside of the kernel to inline the
usual case of mutex operations and map to the internal _mtx_* functions
in the module case so that modules will use WITNESS and KTR logging if
the kernel is compiled with support for it.


# c4a21abf 06-Mar-2001 John Baldwin <jhb@FreeBSD.org>

- Include <sys/systm.h> for KASSERT().
- Move the _mtx_assert() prototype up to the top of the file with the rest
of the function prototypes.
- Define all the mtx_foo() macros in terms of mtx_foo_flags().
- Add a KASSERT() to check for invalid options in mtx_lock_flags().
- Move the mtx_assert() to ensure a mutex is owned before releasing it
in front of WITNESS_EXIT() in all the mtx_unlock_* macros.
- Change the MPASS* macros to be on #ifdef INVARIANTS, not just #ifdef
MUTEX_DEBUG since most of them check to see that the mutex functions are
called properly. Define MPASS4() in terms of KASSERT() to do this.
- Define MPASS{,[23]} in terms of MPASS4() to simplify things and avoid
code duplication.


# 219042c9 02-Mar-2001 Bosko Milekic <bmilekic@FreeBSD.org>

Fix INVARIANT_SUPPORT-only builds (without INVARIANTS). The required
`infrastructure' built with INVARIANT_SUPPORT for kern_mutex.c essentially
involves _mtx_assert(), which makes use of constants that were defined
under #ifdef INVARIANTS here.


# 27863426 11-Feb-2001 Bosko Milekic <bmilekic@FreeBSD.org>

Change all instances of `CURPROC' and `CURTHD' to `curproc,' in order
to stay consistent.

Requested by: bde


# 5746a1d8 10-Feb-2001 Bosko Milekic <bmilekic@FreeBSD.org>

- Place back STR string declarations for lock/unlock strings used for KTR_LOCK
tracing in order to avoid duplication.
- Insert some tracepoints back into the mutex acq/rel code, thus ensuring
that we can trace all lock acq/rel's again.
- All CURPROC != NULL checks are MPASS()es (under MUTEX_DEBUG) because they
signify a serious mutex corruption.
- Change up some KASSERT()s to MPASS()es, and vice-versa, depending on the
type of problem we're debugging (INVARIANTS is used here to check that
the API is being used properly whereas MUTEX_DEBUG is used to ensure that
something general isn't happening that will have bad impact on mutex
locks).

Reminded by: jhb, jake, asmodai


# 9ed346ba 08-Feb-2001 Bosko Milekic <bmilekic@FreeBSD.org>

Change and clean the mutex lock interface.

mtx_enter(lock, type) becomes:

mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)

similarly, for releasing a lock, we now have:

mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN.
We change the caller interface for the two different types of locks
because the semantics are entirely different for each case, and this
makes it explicitly clear and, at the same time, it rids us of the
extra `type' argument.

The enter->lock and exit->unlock change has been made with the idea
that we're "locking data" and not "entering locked code" in mind.

Further, remove all additional "flags" previously passed to the
lock acquire/release routines with the exception of two:

MTX_QUIET and MTX_NOSWITCH

The functionality of these flags is preserved and they can be passed
to the lock/unlock routines by calling the corresponding wrappers:

mtx_{lock, unlock}_flags(lock, flag(s)) and
mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN
locks, respectively.

Re-inline some lock acq/rel code; in the sleep lock case, we only
inline the _obtain_lock()s in order to ensure that the inlined code
fits into a cache line. In the spin lock case, we inline recursion and
actually only perform a function call if we need to spin. This change
has been made with the idea that we generally tend to avoid spin locks
and that also the spin locks that we do have and are heavily used
(i.e. sched_lock) do recurse, and therefore in an effort to reduce
function call overhead for some architectures (such as alpha), we
inline recursion for this case.

Create a new malloc type for the witness code and retire from using
the M_DEV type. The new type is called M_WITNESS and is only declared
if WITNESS is enabled.

Begin cleaning up some machdep/mutex.h code - specifically updated the
"optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN
and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently
need those.

Finally, caught up to the interface changes in all sys code.

Contributors: jake, jhb, jasone (in no particular order)


# 8484de75 24-Jan-2001 John Baldwin <jhb@FreeBSD.org>

- Don't use a union and fun tricks to shave one extra pointer off of struct
mtx right now as it makes debugging harder. When we are in optimizing
mode, we can revisit this.
- Fix the KTR trace messages to use %p rather than 0x%p to avoid duplicate
0x's in KTR output.
- During witness_fixup, release Giant so that witness doesn't get confused.
Also, grab all_mtx while walking the list of mutexes.
- Remove w_sleep and w_recurse. Instead, perform checks on mutexes using
the mutex's mtx_flags field.
- Allow debug.witness_ddb and debug.witness_skipspin to be set from the
loader.
- Add Giant to the front of existing order_list entries to help ensure
Giant is always first.
- Add an order entry for the various proc locks. Note that this only
helps keep proc in order mostly as the allproc and proctree mutexes are
only obtained during a lockmgr operation on the specified mutex.


# 56771ca7 21-Jan-2001 Jason Evans <jasone@FreeBSD.org>

Print correct file name and line number in mtx_assert().

Noticed by: jake


# 0cde2e34 21-Jan-2001 Jason Evans <jasone@FreeBSD.org>

Move most of sys/mutex.h into kern/kern_mutex.c, thereby making the mutex
inline functions non-inlined. Hide parts of the mutex implementation that
should not be exposed.

Make sure that WITNESS code is not executed during boot until the mutexes
are fully initialized by SI_SUB_MUTEX (the original motivation for this
commit).

Submitted by: peter


# d1c1b841 21-Jan-2001 Jason Evans <jasone@FreeBSD.org>

Remove MUTEX_DECLARE() and MTX_COLD. Instead, postpone full mutex
initialization until after malloc() is safe to call, then iterate through
all mutexes and complete their initialization.

This change is necessary in order to avoid some circular bootstrapping
dependencies.


# 87dce368 19-Jan-2001 Jake Burkholder <jake@FreeBSD.org>

Simplify the i386 asm MTX_{ENTER,EXIT} macros to just call the
appropriate function, rather than doing a horse-and-buggy
acquire. They now take the mutex type as an arg and can be
used with sleep as well as spin mutexes.


# 08812b39 18-Jan-2001 Bosko Milekic <bmilekic@FreeBSD.org>

Implement MTX_RECURSE flag for mtx_init().
All calls to mtx_init() for mutexes that recurse must now include
the MTX_RECURSE bit in the flag argument variable. This change is in
preparation for an upcoming (further) mutex API cleanup.
The witness code will call panic() if a lock is found to recurse but
the MTX_RECURSE bit was not set during the lock's initialization.

The old MTX_RECURSE "state" bit (in mtx_lock) has been renamed to
MTX_RECURSED, which is more appropriate given its meaning.

The following locks have been made "recursive," thus far:
eventhandler, Giant, callout, sched_lock, possibly some others declared
in the architecture-specific code, all of the network card driver locks
in pci/, as well as some other locks in dev/ stuff that I've found to
be recursive.

Reviewed by: jhb


# 562e4ffe 13-Dec-2000 John Baldwin <jhb@FreeBSD.org>

- Add a new flag MTX_QUIET that can be passed to the various mtx_*
functions. If this flag is set, then no KTR log messages are issued.
This is useful for blocking excessive logging, such as with the internal
mutex used by the witness code.
- Use MTX_QUIET on all of the mtx_enter/exit operations on the internal
mutex used by the witness code.
- If we are in a panic, don't do witness checks in witness_enter(),
witness_exit(), and witness_try_enter(), just return.


# 61d68d6c 11-Dec-2000 John Baldwin <jhb@FreeBSD.org>

Since _mtx_enter() and friends are static inline functions now instead of
macros, the mutex KTR log entries don't actually have the useful filename
and line numbers in the KTR_EXTEND case, so remove a comment claiming this
and go back to one set of KTR strings.


# 92cf772d 11-Dec-2000 Jake Burkholder <jake@FreeBSD.org>

- Add code to detect if a system call returns with locks other than Giant
held and panic if so (conditional on witness).
- Change witness_list to return the number of locks held so this is easier.
- Add kern/syscalls.c to the kernel build if witness is defined so that the
panic message can contain the name of the offending system call.
- Add assertions that Giant and sched_lock are not held when returning from
a system call, which were missing for alpha and ia64.


# bdc49e9b 08-Dec-2000 John Baldwin <jhb@FreeBSD.org>

Remove a comment that referred to the obsolete mtxf struct.


# ce7d8211 08-Dec-2000 Jake Burkholder <jake@FreeBSD.org>

Whitespace. Make the indentation for MPASS and MPASS2 consistent and
slightly more sane. Make the arguments to the nop MPASS2 match those
of the functional one. Change 4 spaces to a tab. Don't indent a
label so it's easier to see.


# fc2befbb 08-Dec-2000 Jake Burkholder <jake@FreeBSD.org>

Add macros MPASS3 and MPASS4, which take the file and line number
as parameters. Use them in the mutex inlines so that the file and
line numbers are those of the caller instead of always in this file.
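
Note: a minimal sketch of the technique, forwarding the caller's file and line through an inline function so that a failed check reports the call site. All names here (check_at, helper, HELPER, last_file, last_line) are hypothetical and are not the MPASS macros themselves.

```c
#include <stddef.h>

/* Hypothetical record of the most recent failed check. */
static const char *last_file = NULL;
static int last_line = 0;

/* Checker that takes the reporting location as explicit parameters. */
static void
check_at(int ok, const char *file, int line)
{
	if (!ok) {
		last_file = file;
		last_line = line;
	}
}

/* Inline helper that forwards its caller's location unchanged. */
static inline void
helper(int ok, const char *file, int line)
{
	check_at(ok, file, line);
}

/* The macro captures __FILE__ and __LINE__ at the place it is used. */
#define HELPER(ok)	helper((ok), __FILE__, __LINE__)

/* Returns the line number on which the failing HELPER() call is written. */
static int
failing_call_line(void)
{
	int here = __LINE__ + 1;
	HELPER(0);
	return (here);
}
```

Had the check used __LINE__ inside helper() itself, every failure would point at the helper rather than at its caller.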


# 6936206e 30-Nov-2000 John Baldwin <jhb@FreeBSD.org>

Split the WITNESS and MUTEX_DEBUG options apart so that WITNESS does not
depend on MUTEX_DEBUG. The MUTEX_DEBUG option turns on extra assertions
and checks to verify that mutexes themselves are implemented properly.
The WITNESS option uses extra checks and diagnostics to verify that other
code is using mutexes properly.


# f377b2b1 22-Nov-2000 John Baldwin <jhb@FreeBSD.org>

Fix the KTR tracepoints for mtx_enter/exit/try_enter to properly order the
parameters for the KTR_EXTEND case.


# 9cce2a0c 15-Nov-2000 John Baldwin <jhb@FreeBSD.org>

- Add a new macro DROP_GIANT_NOSWITCH() that is similar to DROP_GIANT()
except that it uses the MTX_NOSWITCH flag while it releases Giant via
mtx_exit().
- Add a mtx_recursed() primitive. This primitive should only be used on
a mutex owned by the current process. It will return non-zero if the
mutex is recursively owned, or zero otherwise.
- Add two new flags MA_RECURSED and MA_NOTRECURSED that can be used in
conjunction with MA_OWNED to control the assertion checked by mtx_assert().
- Fix some of the KTR tracepoint strings to use %p when displaying the lock
field of a mutex, which is a uintptr_t.


# d9de7cc7 06-Nov-2000 John Baldwin <jhb@FreeBSD.org>

Remove an unneeded #include <machine/bus.h> that snuck in accidentally with
the MI mutexes.

Submitted indirectly by: bde


# bfbc104f 31-Oct-2000 John Baldwin <jhb@FreeBSD.org>

Use do { ... } while (0) to wrap the body of mtx_assert().

Reported by: rwatson
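
A small sketch of why the wrapper matters, using a hypothetical MY_ASSERT
macro rather than the real mtx_assert() body: a multi-statement macro
wrapped in do { ... } while (0) behaves as a single statement, so it can
sit in an unbraced if/else and take a trailing semicolon.

```c
#include <assert.h>

/* Hypothetical counters used only to observe the macro's behavior. */
static int checks_run;
static int failures;

/* Two statements wrapped so the macro acts as one statement. */
#define	MY_ASSERT(cond) do {		\
	checks_run++;			\
	if (!(cond))			\
		failures++;		\
} while (0)

int
check_if_enabled(int enabled, int value)
{
	/*
	 * With a bare { ... } body the semicolon after the macro call
	 * would end the if statement and orphan the else below.
	 */
	if (enabled)
		MY_ASSERT(value > 0);
	else
		return (-1);
	return (failures);
}
```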


# eb661345 23-Oct-2000 Matt Jacob <mjacob@FreeBSD.org>

Move bogus proc reference stuff into <machine/globals.h>. No include
file includes <sys/proc.h> any more, but there is still this wonky
reference in globals.h, which causes warnings on i386.

CURTHD is now defined in <machine/globals.h> as well. The correct thing
to do is provide a platform function for this.


# 4ae338d0 23-Oct-2000 Matt Jacob <mjacob@FreeBSD.org>

Put back inclusion of proc.h so that alpha kernels (at the very least)
will compile again. I can't quite see where this was a recursive inclusion.
We probably need to do something to fix the alpha, but let's not break it
in the interim; it's broken enough.


# 04c94f3c 23-Oct-2000 Poul-Henning Kamp <phk@FreeBSD.org>

Generate LOTS of warnings to remind the SMPng crew to fix the curproc
UP/SMP issue.


# 90b32bf8 23-Oct-2000 Poul-Henning Kamp <phk@FreeBSD.org>

Do not recursively include <sys/proc.h>


# 36412d79 20-Oct-2000 John Baldwin <jhb@FreeBSD.org>

- Make the mutex code almost completely machine independent. This greatly
reduces the maintenance load for the mutex code. The only MD portions
of the mutex code are in machine/mutex.h now, which include the assembly
macros for handling mutexes as well as optionally overriding the mutex
micro-operations. For example, we use optimized micro-ops on the x86
platform #ifndef I386_CPU.
- Change the behavior of the SMP_DEBUG kernel option. In the new code,
mtx_assert() only depends on INVARIANTS, allowing other kernel developers
to have working mutex assertions without having to include all of the
mutex debugging code. The SMP_DEBUG kernel option has been renamed to
MUTEX_DEBUG and now just controls extra mutex debugging code.
- Abolish the ugly mtx_f hack. Instead, we dynamically allocate
separate mtx_debug structures on the fly in mtx_init, except for mutexes
that are initiated very early in the boot process. These mutexes
are declared using a special MUTEX_DECLARE() macro, and use a new
flag MTX_COLD when calling mtx_init. This is still somewhat hackish,
but it is less evil than the mtx_f filler struct, and the mtx struct is
now the same size with and without mutex debugging code.
- Add some micro-micro-operation macros for doing the actual atomic
operations on the mutex mtx_lock field to make it easier for other archs
to override/optimize mutex ops if needed. These new tiny ops also clean
up the code in some places by replacing long atomic operation function
calls that spanned 2-3 lines with a short 1-line macro call.
- Don't call mi_switch() from mtx_enter_hard() when we block while trying
to obtain a sleep mutex. Calling mi_switch() would bogusly release
Giant before switching to the next process. Instead, inline most of the
code from mi_switch() in the mtx_enter_hard() function. Note that when
we finally kill Giant we can back this out and go back to calling
mi_switch().