History log of /freebsd-11-stable/sys/sys/mutex.h
Revision Date Author Comments
# 331722 29-Mar-2018 eadler

Revert r330897:

This was intended to be a non-functional change. It wasn't. The commit
message was thus wrong. In addition it broke arm, and merged crypto
related code.

Revert with prejudice.

This revert skips files touched in r316370 since that commit was since
MFCed. This revert also skips files that require $FreeBSD$ property
changes.

Thank you to those who helped me get out of this mess including but not
limited to gonzo, kevans, rgrimes.

Requested by: gjb (re)


# 330897 14-Mar-2018 eadler

Partial merge of the SPDX changes

These changes are incomplete but are making it difficult
to determine what other changes can/should be merged.

No objections from: pfg


# 327413 31-Dec-2017 mjg

MFC r320561,r323236,r324041,r324314,r324609,r324613,r324778,r324780,r324787,
r324803,r324836,r325469,r325706,r325917,r325918,r325919,r325920,r325921,
r325922,r325925,r325963,r326106,r326107,r326110,r326111,r326112,r326194,
r326195,r326196,r326197,r326198,r326199,r326200,r326237:

rwlock: perform the typically false td_rw_rlocks check later

Check if the lock is available first instead.

=============

Sprinkle __read_frequently on a few obvious places.

Note that some of the annotated variables should probably change their
types to something smaller, preferably bit-sized.

=============

mtx: drop the tid argument from _mtx_lock_sleep

tid must be equal to curthread and the target routine was already reading
it anyway, which is not a problem. Not passing it as a parameter allows for
a little bit shorter code in callers.

=============

locks: partially tidy up waiting on readers

Spin first instead of instantly re-reading, and don't re-read after
spinning is finished - the state is already known.

Note the code is subject to significant changes later.

=============

locks: take the number of readers into account when waiting

Previous code would always spin once before checking the lock. But a lock
with e.g. 6 readers is not going to become free in the duration of one spin
even if they start draining immediately.

Conservatively perform one spin for each reader.

Note that the total number of allowed spins is still extremely small and is
subject to change later.
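
A minimal sketch of the idea; RW_READERS() and cpu_spinwait() follow the
kernel's existing conventions, while the helper itself is a hypothetical
stand-in for the logic in kern/kern_rwlock.c:

	/* Spin roughly once per observed reader before re-checking. */
	static void
	spin_for_readers(uintptr_t v)
	{
		int n;

		for (n = RW_READERS(v); n > 0; n--)
			cpu_spinwait();
	}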

=============

mtx: change MTX_UNOWNED from 4 to 0

The value is spread all over the kernel and zeroing a register is
cheaper/shorter than setting it up to an arbitrary value.

Reduces amd64 GENERIC-NODEBUG .text size by 0.4%.
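
For illustration (not part of the committed diff), the win is in the
immediate operand needed to materialize the cookie, e.g. on amd64:

	/* Storing the unowned value on release: */
	atomic_store_rel_ptr(&m->mtx_lock, MTX_UNOWNED);
	/*
	 * MTX_UNOWNED == 4:  movl $4, %eax     (5-byte encoding)
	 * MTX_UNOWNED == 0:  xorl %eax, %eax   (2-byte zeroing idiom)
	 */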

=============

mtx: fix up owner_mtx after r324609

Now that MTX_UNOWNED is 0, the test was always false.

=============

mtx: clean up locking spin mutexes

1) shorten the fast path by pushing the lockstat probe to the slow path
2) test for kernel panic only after it turns out we will have to spin,
in particular test only after we know we are not recursing

=============

mtx: stop testing SCHEDULER_STOPPED in kabi funcs for spin mutexes

There is nothing panic-breaking to do in the unlock case, and the lock
case will fall back to the slow path, which already does the check.

=============

rwlock: reduce lockstat branches in the slowpath

=============

mtx: fix up UP build after r324778

=============

mtx: implement thread lock fastpath

=============

rwlock: fix up compilation without KDTRACE_HOOKS after r324787

=============

rwlock: use fcmpset for setting RW_LOCK_WRITE_SPINNER

=============

sx: avoid branches in the slow path if lockstat is disabled

=============

rwlock: avoid branches in the slow path if lockstat is disabled

=============

locks: pull up PMC_SOFT_CALLs out of slow path loops

=============

mtx: unlock before traversing threads to wake up

This shortens the lock hold time while not affecting correctness.
All the woken up threads end up competing and can lose the race against
a completely unrelated thread taking the lock anyway.

=============

rwlock: unlock before traversing threads to wake up

While here perform a minor cleanup of the unlock path.

=============

sx: perform a minor cleanup of the unlock slowpath

No functional changes.

=============

mtx: add missing parts of the diff in r325920

Fixes build breakage.

=============

locks: fix compilation issues without SMP or KDTRACE_HOOKS

=============

locks: remove the file + line argument from internal primitives when not used

The pair is of use only in debug or LOCKPROF kernels, but was passed (zeroed)
for many locks even in production kernels.

While here whack the tid argument from wlock hard and xlock hard.

There is no kbi change of any sort - "external" primitives still accept the
pair.
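
A sketch of the resulting pattern (the macro names mirror sys/lock.h,
but treat the exact spellings here as an assumption): the file/line
pair only exists as arguments when a debug option wants it.

	#if LOCK_DEBUG > 0
	#define	LOCK_FILE_LINE_ARG_DEF	, const char *file, int line
	#define	LOCK_FILE_LINE_ARG	, file, line
	#else
	#define	LOCK_FILE_LINE_ARG_DEF
	#define	LOCK_FILE_LINE_ARG
	#endif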

=============

locks: pass the found lock value to unlock slow path

This avoids an explicit read later.

While here whack the cheaply obtainable 'tid' argument.

=============

rwlock: don't check for curthread's read lock count in the fast path

=============

rwlock: unbreak WITNESS builds after r326110

=============

sx: unbreak debug after r326107

An assertion was modified to use the found value, but it was not updated to
handle a race where blocked threads appear after entering the function.

Move the assertion down to the area protected with sleepq lock where the
lock is read anyway. This does not affect coverage of the assertion and
is consistent with what rw locks are doing.

=============

rwlock: stop re-reading the owner when going to sleep

=============

locks: retry turnstile/sleepq loops on failed cmpset

In order to go to sleep threads set waiter flags, but that can spuriously
fail e.g. when a new reader arrives. Instead of unlocking everything and
looping back, re-evaluate the new state while still holding the lock necessary
to go to sleep.
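
A heavily simplified sketch of the new shape; can_sleep() is a
hypothetical stand-in for the real flag/owner checks in kern/kern_sx.c:

	static void
	set_waiters_or_bail(struct sx *sx)
	{
		uintptr_t x;

		sleepq_lock(&sx->lock_object);
		x = SX_READ_VALUE(sx);
		for (;;) {
			if (!can_sleep(x)) {
				/* State changed; retry the lock path. */
				sleepq_release(&sx->lock_object);
				return;
			}
			/* A failed fcmpset refreshes x, so loop again
			 * instead of dropping the sleepq lock. */
			if (atomic_fcmpset_ptr(&sx->sx_lock, &x,
			    x | SX_LOCK_EXCLUSIVE_WAITERS))
				break;
		}
		/* Waiter flag set, sleepq lock held: safe to sleep. */
	}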

=============

sx: change sunlock to wake waiters up if it locked sleepq

sleepq is only locked if curthread is the last reader. By the time
the lock gets acquired, new readers could have arrived. The previous code
would unlock and loop back, which results in spurious relocking of the sleepq.

This is a step towards xadd-based unlock routine.

=============

rwlock: add __rw_try_{r,w}lock_int

=============

rwlock: fix up compilation of the previous change

committed the wrong version of the patch

=============

Convert in-kernel thread_lock_flags calls to thread_lock when debug is disabled

The flags argument is not used in this case.

=============

Add the missing lockstat check for thread lock.

=============

rw: fix runlock_hard when new readers show up

When the waiters/writer spinner flags are set, no new readers can show up
unless they already have a different rw lock read-locked. The change in
r326195 failed to take that into account - in the presence of new readers it
would spin until they all drain, which could lead to trouble if e.g. they go
off CPU and cannot get scheduled again because of this thread.


# 315394 16-Mar-2017 mjg

MFC r313855,r313865,r313875,r313877,r313878,r313901,r313908,r313928,r313944,r314185,r314476,r314187

locks: let primitives for modules unlock without always going to the slow path

It is only needed if LOCK_PROFILING is enabled. It has to always check
whether the lock is about to be released, which requires an avoidable
read if the option is not specified.

==

sx: fix compilation on UP kernels after r313855

sx primitives use inlines as opposed to macros. Change the tested condition
to LOCK_DEBUG, which covers the case but is slightly overzealous.

==

locks: clean up trylock primitives

In particular this reduces accesses of the lock itself.

==

mtx: plug the 'opts' argument when not used

==

sx: fix mips build after r313855

The namespace in this file really needs cleaning up. In the meantime
let inline primitives be defined as long as LOCK_DEBUG is not enabled.

Reported by:	kib

==

mtx: get rid of file/line args from slow paths if they are unused

This denotes changes which went in by accident in r313877.

On most production kernels both said parameters are zeroed and nothing
reads them in either __mtx_lock_sleep or __mtx_unlock_sleep. Thus this
change stops passing them for internal consumers where this is the case.

Kernel modules use _flags variants, which are not affected kbi-wise.

==

mtx: restrict r313875 to kernels without LOCK_PROFILING

==

mtx: microoptimize lockstat handling in __mtx_lock_sleep

This saves a function call and multiple branches after the lock is acquired.


# 315378 16-Mar-2017 mjg

MFC r313275,r313280,r313282,r313335:

mtx: move lockstat handling out of inline primitives

Lockstat requires checking if it is enabled and if so, calling a 6 argument
function. Further, determining whether to call it on unlock requires
pre-reading the lock value.

This is problematic in at least 3 ways:
- more branches in the hot path than necessary
- additional cacheline ping pong under contention
- bigger code

Instead, check first if lockstat handling is necessary and if so, just fall
back to regular locking routines. For this purpose a new macro is introduced
(LOCKSTAT_PROFILE_ENABLED).
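
A simplified sketch of the resulting fast-path shape, modelled on the
__mtx_lock macro in this header (the probe name and exact layout are an
assumption, not the committed diff):

	#define	__mtx_lock(mp, tid, opts, file, line) do {		\
		uintptr_t _tid = (uintptr_t)(tid);			\
									\
		if (LOCKSTAT_PROFILE_ENABLED(adaptive__acquire) ||	\
		    !_mtx_obtain_lock((mp), _tid))			\
			_mtx_lock_sleep((mp), _tid, (opts), (file),	\
			    (line));					\
	} while (0)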

LOCK_PROFILING uninlines all primitives. Fold the current inline lock
variant into _mtx_lock_flags to retain the support. With this change
the inline variants are not used when LOCK_PROFILING is defined and thus
can ignore its existence.

This results in:
text data bss dec hex filename
22259667 1303208 4994976 28557851 1b3c21b kernel.orig
21797315 1303208 4994976 28095499 1acb40b kernel.patched

i.e. about 2% reduction in text size.

A remaining action is to remove spurious arguments for internal kernel
consumers.

==

sx: move lockstat handling out of inline primitives

See r313275 for details.

==

rwlock: move lockstat handling out of inline primitives

See r313275 for details.

One difference here is that recursion handling was removed from the fallback
routine. As it is it was never supposed to see a recursed lock in the first
place. Future changes will move it out of inline variants, but right now
there is no easy way to test if the lock is recursed without reading
additional words.

==

locks: fix recursion support after recent changes

When a relevant lockstat probe is enabled the fallback primitive is called with
a constant signifying a free lock. This works fine for typical cases but breaks
with recursion, since it checks if the passed value is that of the executing
thread.

Read the value if necessary.


# 315377 16-Mar-2017 mjg

MFC r313269,r313270,r313271,r313272,r313274,r313278,r313279,r313996,r314474

mtx: switch to fcmpset

The found value is passed to locking routines in order to reduce cacheline
accesses.

mtx_unlock grows an explicit check for regular unlock. On ll/sc architectures
the routine can fail even if the lock could have been handled by the inline
primitive.
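
A userland C11 sketch of the semantic difference; C11 compare-exchange
already behaves like fcmpset in that a failed attempt writes the value
it found into the 'expected' slot, so the retry loop never needs a
separate re-read:

	#include <stdatomic.h>
	#include <stdint.h>

	static _Atomic uintptr_t lock_word;	/* 0 == unowned */

	static void
	lock_sketch(uintptr_t tid)
	{
		uintptr_t v = 0;

		while (!atomic_compare_exchange_weak(&lock_word, &v, tid)) {
			/* v now holds the current lock word; a real slow
			 * path would inspect it and spin or sleep here. */
			v = 0;
		}
	}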

==

rwlock: switch to fcmpset

==

sx: switch to fcmpset

==

sx: uninline slock/sunlock

Shared locking routines explicitly read the value and test it. If the
change attempt fails, they fall back to a regular function which would
retry in a loop.

The problem is that with many concurrent readers the risk of failure is pretty
high and even the value returned by fcmpset is very likely going to be stale
by the time the loop in the fallback routine is reached.

Uninline said primitives. It gives a throughput increase when doing concurrent
slocks/sunlocks with 80 hardware threads from ~50 mln/s to ~56 mln/s.

Interestingly, rwlock primitives are already not inlined.

==

sx: add witness support missed in r313272

==

mtx: fix up _mtx_obtain_lock_fetch usage in thread lock

Since _mtx_obtain_lock_fetch no longer sets the argument to MTX_UNOWNED,
callers have to do it on their own.

==

mtx: fixup r313278, the assignment was supposed to go inside the loop

==

mtx: fix spin mutexes interaction with failed fcmpset

While doing so move recursion support down to the fallback routine.

==

locks: ensure proper barriers are used with atomic ops when necessary

Unclear how, but the locking routine for mutexes was using the *release*
barrier instead of acquire. This must have been either a copy-pasto or bad
completion.

Going through other uses of atomics shows no barriers in:
- upgrade routines (addressed in this patch)
- sections protected with turnstile locks - this should be fine as necessary
barriers are in the worst case provided by turnstile unlock

I would like to thank Mark Millard and andreast@ for reporting the problem and
testing previous patches before the issue got identified.
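
In atomic(9) terms the invariant being restored is acquire on lock and
release on unlock; a minimal illustrative sketch, not the committed diff:

	static void
	barrier_sketch(struct mtx *m, uintptr_t tid)
	{
		uintptr_t v = MTX_UNOWNED;

		/* Acquire: the critical section must not float above. */
		while (!atomic_fcmpset_acq_ptr(&m->mtx_lock, &v, tid))
			v = MTX_UNOWNED;

		/* Release: prior stores must be visible before the lock
		 * reads as free again. */
		atomic_store_rel_ptr(&m->mtx_lock, MTX_UNOWNED);
	}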


# 315341 16-Mar-2017 mjg

MFC r311172,r311194,r311226,r312389,r312390:

mtx: reduce lock accesses

Instead of spuriously re-reading the lock value, read it once.

This change also has a side effect of fixing a performance bug:
on failed _mtx_obtain_lock, it was possible that the re-read would find
the lock unowned, yet the primitive would still make a trip
through the turnstile code.

This is diff reduction to a variant which uses atomic_fcmpset.

==

Reduce lock accesses in thread lock similarly to r311172

==

mtx: plug open-coded mtx_lock access missed in r311172

==

rwlock: reduce lock accesses similarly to r311172

==

sx: reduce lock accesses similarly to r311172


# 303549 30-Jul-2016 kib

MFC r303211:
Implement mtx_trylock_spin(9).

Approved by: re (gjb)
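
A usage sketch for the new interface as documented in mtx_trylock_spin(9);
the mutex and function names are hypothetical:

	#include <sys/param.h>
	#include <sys/lock.h>
	#include <sys/mutex.h>

	static struct mtx example_spin_mtx;	/* initialized elsewhere with
						   mtx_init(..., MTX_SPIN) */

	static int
	example_try(void)
	{
		if (mtx_trylock_spin(&example_spin_mtx) == 0)
			return (0);	/* held by another CPU; do not spin */
		/* ... critical section ... */
		mtx_unlock_spin(&example_spin_mtx);
		return (1);
	}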