History log of /freebsd-9.3-release/sys/kern/kern_rwlock.c
Revision Date Author Comments
# 267654 19-Jun-2014 gjb

Copy stable/9 to releng/9.3 as part of the 9.3-RELEASE cycle.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation

# 262192 18-Feb-2014 jhb

MFC 261517,261520:
Convert the license on files where I am the sole copyright holder to
2 clause BSD licenses.


# 255862 24-Sep-2013 jhb

MFC 240424,244582:
Improve check coverage for idle threads.

Idle threads are not allowed to acquire any lock but spinlocks.
Deny any attempt to do so by panicking at the locking operation
when INVARIANTS is on. Then, remove the check on blocking on a
turnstile.
The check in the sleepqueue code is kept because idle threads are
not allowed to use tsleep() either, which could still happen.

On entering KDB backends, the thread hijacked to run in
interrupt context can still be the idle thread. At that point,
without the panic condition, it can still happen that the idle
thread will try to acquire some locks to carry on some operations.

Skip the idlethread check on block/sleep lock operations when KDB is
active.
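
A minimal sketch of the check described above, assuming it takes the
form of a KASSERT in the shared-lock acquisition path; kdb_active and
TD_IS_IDLETHREAD() are existing kernel symbols, but the message text
here is illustrative:

        /*
         * Idle threads may not take sleepable locks such as rwlocks;
         * skip the check while a KDB backend is active, since KDB may
         * have hijacked an idle thread to run in interrupt context.
         */
        KASSERT(kdb_active != 0 || !TD_IS_IDLETHREAD(curthread),
            ("rw_rlock() by idle thread %p on rwlock %s",
            curthread, rw->lock_object.lo_name));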


# 252172 24-Jun-2013 jhb

MFC 251323:
- Handle the recursed/not recursed flags with RA_RLOCKED in rw_assert().
- Tweak a panic message.
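
A hedged illustration of the RA_RLOCKED handling noted above; foo_lock
is a hypothetical lock, the flag combinations follow rw_assert(9):

        /* Assert a shared hold that has (or has not) recursed. */
        rw_assert(&foo_lock, RA_RLOCKED | RA_RECURSED);
        rw_assert(&foo_lock, RA_RLOCKED | RA_NOTRECURSED);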


# 250581 12-May-2013 hiren

MFC: r240475

Remove all the checks on curthread != NULL with the exception of some MD
trap checks (eg. printtrap()).

Generally this check is not needed anymore, as there is no legitimate
case where curthread could be NULL after the pcpu 0 area has been
properly initialized.

Reviewed by: attilio
Approved by: sbruno (mentor)


# 236238 29-May-2012 fabient

MFC r233628, r234598, r235229, r235831, r226986.

Add software PMC support.

New kernel events can be added at various locations for sampling or
counting. This will, for example, allow easy system profiling on any
processor with familiar tools like pmcstat(8).

Simultaneous use of software and hardware PMCs is possible, for example
watching lock acquisition failures and page faults while sampling on
instructions.

Sponsored by: NETASQ


# 235404 13-May-2012 avg

MFC r228424,228448: panic: add a switch and infrastructure for stopping
other CPUs in SMP case


# 226255 11-Oct-2011 attilio

Adaptive spinning for locking primitives in read mode has some tuning
SYSCTLs which are inappropriate for daily use of the machine (mostly
useful only to a developer who wants to run benchmarks on it).
Remove them before the release, as we do not want to ship with them in.

Now that the SYSCTLs are gone, rather than using static storage for some
constants, use real numeric constants in order to avoid possible compiler
pessimization and the risk of sharing storage (and thus a cache line)
among CPUs when doing adaptive spinning together.

Please note that this patch is not an MFC, but an 'edge case' committed
directly to STABLE_9.

Approved by: re (kib)


# 225736 22-Sep-2011 kensmith

Copy head to stable/9 as part of 9.0-RELEASE release cycle.

Approved by: re (implicit)


# 205626 24-Mar-2010 bz

Print the pointer to the lock with the panic message. The previous
panic: rw lock not unlocked
was not really helpful for debugging. Now one can at least call
show lock <ptr>
from ddb to learn more about the lock.

MFC after: 3 days


# 197643 30-Sep-2009 attilio

When releasing a read/shared lock we need to use a write memory barrier
in order to avoid, on architectures which don't have strongly ordered
writes, CPU instruction reordering.

Diagnosed by: fabio
Reviewed by: jhb
Tested by: Giovanni Trematerra
<giovanni dot trematerra at gmail dot com>
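
A sketch of the kind of change involved, assuming the read-unlock fast
path drops one reader with a release-semantics compare-and-set from
atomic(9); x stands for the lock word value just read, and the exact
expression in the tree may differ:

        /*
         * Release semantics keep loads and stores from the critical
         * section from being reordered past the unlock on weakly
         * ordered architectures.
         */
        if (atomic_cmpset_rel_ptr(&rw->rw_lock, x, x - RW_ONE_READER))
                break;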


# 196334 17-Aug-2009 attilio

* Change the scope of the ASSERT_ATOMIC_LOAD() from a generic check to
a pointer-fetching specific operation check. Consequently, rename the
operation ASSERT_ATOMIC_LOAD_PTR().
* Fix the implementation of ASSERT_ATOMIC_LOAD_PTR() by directly checking
alignment on the word boundary, for all the given specific
architectures. That's a bit too strict for some common cases, but it
assures safety.
* Add a comment explaining the scope of the macro.
* Add a new stub in the lockmgr-specific implementation.

Tested by: marcel (initial version), marius
Reviewed by: rwatson, jhb (comment specific review)
Approved by: re (kib)
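
A hedged sketch of what the renamed macro checks; the in-tree
definition may differ in detail:

        #define ASSERT_ATOMIC_LOAD_PTR(var, msg)                        \
                KASSERT(sizeof(var) == sizeof(uintptr_t) &&             \
                    ((uintptr_t)&(var) & (sizeof(uintptr_t) - 1)) == 0, \
                    msg)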


# 196226 14-Aug-2009 bz

Add a new macro to test that a variable could be loaded atomically.
Check that the given variable is at most uintptr_t in size and that
it is aligned.

Note: ASSERT_ATOMIC_LOAD() uses ALIGN() to check for adequate
alignment -- however, the function of ALIGN() is to guarantee
alignment, and therefore may lead to stronger alignment
enforcement than necessary for types that are smaller than
sizeof(uintptr_t).

Add checks to mtx, rw and sx locks init functions to detect possible
breakage. This was used during debugging of the problem fixed with
r196118 where a pointer was on an un-aligned address in the dpcpu area.

In collaboration with: rwatson
Reviewed by: rwatson
Approved by: re (kib)


# 193307 02-Jun-2009 attilio

Handle lock recursion differently by always checking against LO_RECURSABLE
instead of the lock's own flag itself.

Tested by: pho


# 193037 29-May-2009 jhb

Remove extra cpu_spinwait() invocations. This should really only be used
in tight spin loops, not in these edge cases where we restart a much
larger loop only a few times.

Reviewed by: attilio


# 193035 29-May-2009 jhb

Tweak a few comments on adaptive spinning.


# 192853 26-May-2009 sson

Add the OpenSolaris dtrace lockstat provider. The lockstat provider
adds probes for mutexes, reader/writer and shared/exclusive locks to
gather contention statistics and other locking information for
dtrace scripts, the lockstat(1M) command and other potential
consumers.

Reviewed by: attilio jhb jb
Approved by: gnn (mentor)


# 189846 15-Mar-2009 jeff

- Wrap lock profiling state variables in #ifdef LOCK_PROFILING blocks.


# 189074 26-Feb-2009 ed

Remove even more unneeded variable assignments.

kern_time.c:
- Unused variable `p'.

kern_thr.c:
- Variable `error' is always caught immediately, so no reason to
initialize it. There is no way that error != 0 at the end of
create_thread().

kern_sig.c:
- Unused variable `code'.

kern_synch.c:
- `rval' is always assigned in all different cases.

kern_rwlock.c:
- `v' is always overwritten with RW_UNLOCKED further on.

kern_malloc.c:
- `size' is always initialized with the proper value before being used.

kern_exit.c:
- `error' is always caught and returned immediately. abort2() never
returns a non-zero value.

kern_exec.c:
- `len' is always assigned inside the if-statement right below it.

tty_info.c:
- `td' is always overwritten by FOREACH_THREAD_IN_PROC().

Found by: LLVM's scan-build


# 185778 08-Dec-2008 kmacy

add RW_SYSINIT_FLAGS macro and rw_sysinit_flags initialization function
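
Presumably used like RW_SYSINIT with an extra flags argument; a
hypothetical example (name, lock, and description are made up):

        static struct rwlock foo_lock;
        RW_SYSINIT_FLAGS(foo_lock_init, &foo_lock, "foo lock", RW_RECURSE);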


# 182914 10-Sep-2008 jhb

Teach WITNESS about the interlocks used with lockmgr. This removes a bunch
of spurious witness warnings since lockmgr grew witness support. Before
this, every time you passed an interlock to a lockmgr lock WITNESS treated
it as a LOR.

Reviewed by: attilio


# 182909 10-Sep-2008 jhb

Various whitespace fixes.


# 179334 26-May-2008 attilio

Improve a comment which, in the current CVS stock, doesn't completely
explain the logic of the code chunk.


# 177912 04-Apr-2008 jeff

- Add sysctls at debug.rwlock to control the behavior of the speculative
spinning when readers hold a lock. This spinning is speculative because,
unlike the write case, we can not test whether the owners are running.
- Add speculative read spinning for readers who are blocked by pending
writers while a read lock is still held. This allows the thread to
spin until the write lock succeeds after which it may spin until the
writer has released the lock. This prevents excessive context switches
when readers and writers both hold the lock for brief periods.

Sponsored by: Nokia


# 177843 01-Apr-2008 attilio

Add rw_try_rlock() and rw_try_wlock() to rwlocks.
These functions try the specified operation (rlocking and wlocking) and
return true if the operation completes, false otherwise.

The KPI is enriched by this commit, so __FreeBSD_version bumping and
manpage updating will happen soon.

Requested by: jeff, kris
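
A small usage sketch with a hypothetical foo_lock:

        if (rw_try_wlock(&foo_lock)) {
                /* ... work that needs the exclusive lock ... */
                rw_wunlock(&foo_lock);
        } else {
                /* Lock was busy; continue without blocking. */
        }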


# 176076 07-Feb-2008 jeff

- In rw_wunlock_hard prefer to wake up writers if there are both readers
and writers available. Doing otherwise can cause deadlocks as no
read locks can proceed while there are write waiters.

Sponsored by: Nokia


# 176017 05-Feb-2008 jeff

Adaptive spinning in write path with readers and writer starvation avoidance.
- Move recursion checking into rwlock inlines to free a bit for use with
adaptive spinners.
- Clear the RW_LOCK_WRITE_SPINNERS flag whenever the lock state changes
causing write spinners to restart their loop.
- Write spinners are limited by a count while readers hold the lock as
there is no way to know for certain whether readers are still running.
- In the read path block if there are write waiters or spinners to avoid
starving writers. Use a new per-thread count, td_rw_rlocks, to skip
starvation avoidance if it might cause a deadlock.
- Remove or change invalid assertions in turnstiles.

Reviewed by: attilio (developed parts of the patch as well)
Sponsored by: Nokia


# 175411 17-Jan-2008 jhb

Remove a conditional that is always true.

MFC after: 2 weeks


# 174629 15-Dec-2007 jeff

- Re-implement lock profiling in such a way that it no longer breaks
the ABI when enabled. There is no longer an embedded lock_profile_object
in each lock. Instead a list of lock_profile_objects is kept per-thread
for each lock it may own. The cnt_hold statistic is now always 0 to
facilitate this.
- Support shared locking by tracking individual lock instances and
statistics in the per-thread per-instance lock_profile_object.
- Make the lock profiling hash table a per-cpu singly linked list with a
per-cpu static lock_prof allocator. This removes the need for an array
of spinlocks and reduces cache contention between cores.
- Use a separate hash for spinlocks and other locks so that only a
critical_enter() is required and not a spinlock_enter() to modify the
per-cpu tables.
- Count time spent spinning in the lock statistics.
- Remove the LOCK_PROFILE_SHARED option as it is always supported now.
- Specifically drop and release the scheduler locks in both schedulers
since we track owners now.

In collaboration with: Kip Macy
Sponsored by: Nokia


# 173960 26-Nov-2007 attilio

Simplify the adaptive spinning algorithm in rwlock and mutex:
currently, before spinning, the turnstile spinlock is acquired and the
waiters flag is set.
This is not strictly necessary, so just spin before acquiring the
spinlock and setting the flags.
This will simplify a lot of other functions too, as now we have the
waiters flag set only if there are actually waiters.
This should make the wakeup/sleep couplet faster under intensive mutex
workload.
This also fixes a bug in rw_try_upgrade() in the adaptive case, where
turnstile_lookup() would recurse on the ts_lock lock, which would never
be really released [1].

[1] Reported by: jeff with Nokia help
Tested by: pho, kris (earlier, bugged version of rwlock part)
Discussed with: jhb [2], jeff
MFC after: 1 week

[2] John probably had a similar patch for mutexes on 6.x and/or 7.x.


# 173733 18-Nov-2007 attilio

Expand the lock class with the "virtual" function lc_assert, which offers
a unified way for all the lock primitives to express lock assertions.
Currently, lockmgrs and rmlocks don't have assertions, so just panic in
that case.
This will be a base for more callout improvements.

Ok'ed by: jhb, jeff


# 173617 14-Nov-2007 attilio

Remove a bogus KASSERT which would prevent rwlocks from being acquired
recursively in exclusive mode with debugging kernels.

Submitted by: kmacy
Approved by: jeff


# 173600 14-Nov-2007 julian

Generally we are interested in what thread did something as
opposed to what process. Since threads by default have the name of the
process unless overwritten with more useful information, just print the
thread name instead.


# 171516 20-Jul-2007 attilio

Fix some problems with lock profiling in rw locks:
- Adjust the lock_profiling stubs' semantics in the hard functions in order
to be more accurate and trustworthy
- As for sx locks, disable shared paths for lock_profiling. Actually,
lock_profiling has a subtle race which makes results coming from shared
paths not completely trustworthy. A macro stub (LOCK_PROFILING_SHARED) can
actually be used for re-enabling these paths, but it is currently intended
for development use only.
- style(9) fixes

Approved by: jeff, kmacy, jhb[1]
Approved by: re

[1] Had initial reservations not shared by others, conceded
in the end.


# 171052 26-Jun-2007 attilio

Introduce a new rwlocks initialization function: rw_init_flags.
This is very similar to sx_init_flags: it initializes the rwlock using
special flags passed as third argument (RW_DUPOK, RW_NOPROFILE,
RW_NOWITNESS, RW_QUIET, RW_RECURSE).
Among these, the most important new feature is probably that rwlocks
can now be acquired recursively (for both shared and exclusive paths).

Because of the recursion counter, the ABI is changed.

Tested by: Timothy Redaelli <drizzt@gufi.org>
Reviewed by: jhb
Approved by: jeff (mentor)
Approved by: re
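
A hypothetical example of the new initialization function; foo_lock and
the flag choice are made up:

        static struct rwlock foo_lock;

        /*
         * A recursable rwlock; RW_DUPOK silences WITNESS warnings
         * about acquiring two locks of the same type.
         */
        rw_init_flags(&foo_lock, "foo lock", RW_RECURSE | RW_DUPOK);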


# 170295 04-Jun-2007 jeff

Commit 3/14 of sched_lock decomposition.
- Add a per-turnstile spinlock to solve potential priority propagation
deadlocks that are possible with thread_lock().
- The turnstile lock order is defined as the exact opposite of the
lock order used with the sleep locks they represent. This allows us
to walk in reverse order in priority_propagate and this is the only
place we wish to multiply acquire turnstile locks.
- Use the turnstile_chain lock to protect assigning mutexes to turnstiles.
- Change the turnstile interface to pass back turnstile pointers to the
consumers. This allows us to reduce some locking and makes it easier
to cancel turnstile assignment while the turnstile chain lock is held.

Tested by: kris, current@
Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)


# 169675 18-May-2007 jhb

Move lock_profile_object_{init,destroy}() into lock_{init,destroy}().


# 169394 08-May-2007 jhb

Add destroyed cookie values for sx locks and rwlocks as well as extra
KASSERTs so that any lock operations on a destroyed lock will panic or
hang.


# 168073 30-Mar-2007 jhb

- Drop memory barriers in rw_try_upgrade(). We don't need an 'acq' memory
barrier here as the earlier rw_rlock() already contained one.
- Comment fix.


# 167801 22-Mar-2007 jhb

- Simplify the #ifdef's for adaptive mutexes and rwlocks by conditionally
defining a macro earlier in the file.
- Add NO_ADAPTIVE_RWLOCKS option to disable adaptive spinning for rwlocks.
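
The conditional presumably mirrors the mutex code, along these lines:

        #if defined(SMP) && !defined(NO_ADAPTIVE_RWLOCKS)
        #define ADAPTIVE_RWLOCKS
        #endif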


# 167787 21-Mar-2007 jhb

Rename the 'mtx_object', 'rw_object', and 'sx_object' members of mutexes,
rwlocks, and sx locks to 'lock_object'.


# 167504 13-Mar-2007 jhb

Print readers count as unsigned in ddb 'show lock'.

Submitted by: attilio


# 167492 12-Mar-2007 jhb

Fix a typo.


# 167368 09-Mar-2007 jhb

Add two new function pointers 'lc_lock' and 'lc_unlock' to lock classes.
These functions are intended to be used to drop a lock and then reacquire
it when doing an sleep such as msleep(9). Both functions accept a
'struct lock_object *' as their first parameter. The 'lc_unlock' function
returns an integer that is then passed as the second parameter to the
subsequent 'lc_lock' function. This can be used to communicate state.
For example, sx locks and rwlocks use this to indicate if the lock was
share/read locked vs exclusive/write locked.

Currently, spin mutexes and lockmgr locks do not provide working lc_lock
and lc_unlock functions.
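
A sketch of the intended calling pattern inside a sleep primitive;
LOCK_CLASS() is the existing accessor, while the function name and
surrounding code are made up:

        static void
        sleep_with_lock(struct lock_object *lock)
        {
                struct lock_class *class = LOCK_CLASS(lock);
                int how;

                /* Drop the lock, remembering how it was held. */
                how = class->lc_unlock(lock);
                /* ... sleep here, e.g. on a wait channel ... */
                /* Reacquire it the same way (shared vs. exclusive). */
                class->lc_lock(lock, how);
        }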


# 167365 09-Mar-2007 jhb

Use C99-style struct member initialization for lock classes.
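
For rwlocks this presumably looks something like the following; the
member values shown are illustrative:

        struct lock_class lock_class_rw = {
                .lc_name = "rw",
                .lc_flags = LC_SLEEPLOCK | LC_RECURSABLE | LC_UPGRADABLE,
                .lc_ddb_show = db_show_rwlock,
        };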


# 167307 07-Mar-2007 jhb

Fix some nits in lock profiling for rwlocks:
- Properly note when a read lock is released.
- Always note when we contest on a read lock.
- Only note success of obtaining read locks for the first reader to match
the behavior of sx(9).

Reviewed by: kmacy


# 167054 27-Feb-2007 kmacy

Further improvements to LOCK_PROFILING:
- Fix missing initialization in kern_rwlock.c causing bogus times to be collected
- Move updates to the lock hash to after the lock is released for spin mutexes,
sleep mutexes, and sx locks
- Add new kernel build option LOCK_PROFILE_FAST - only update lock profiling
statistics when an acquisition is contended. This reduces the overhead of
LOCK_PROFILING to a 20%-25% increase in system time, which on
"make -j8 kernel-toolchain" on a dual Woodcrest is unmeasurable in terms
of wall-clock time. Contrast this with enabling lock profiling without
LOCK_PROFILE_FAST, where I see a 5x-6x slowdown in wall-clock time.


# 167024 26-Feb-2007 rwatson

Add rw_wowned() interface to rwlock(9), allowing a kernel thread to
determine if it holds an exclusive rwlock reference or not. This is
non-ideal, but recursion scenarios in the network stack currently
require it.

Approved by: jhb
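
A sketch of the recursion-avoidance pattern this enables; foo_lock is
hypothetical:

        /* Only take the lock if we do not already own it exclusively. */
        if (!rw_wowned(&foo_lock))
                rw_wlock(&foo_lock);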


# 167012 26-Feb-2007 kmacy

general LOCK_PROFILING cleanup

- only collect timestamps when a lock is contested - this reduces the overhead
of collecting profiles from 20x to 5x

- remove unused function from subr_lock.c

- generalize cnt_hold and cnt_lock statistics to be kept for all locks

- NOTE: rwlock profiling generates invalid statistics (and most likely
always has); someone familiar with that should review it


# 164246 13-Nov-2006 kmacy

track lock class name in a way that doesn't break WITNESS


# 164159 11-Nov-2006 kmacy

MUTEX_PROFILING has been generalized to LOCK_PROFILING. We now profile
wait (time waited to acquire) and hold times for *all* kernel locks. If
the architecture has a system synchronized TSC, the profiling code will
use that - thereby minimizing profiling overhead. Large chunks of profiling
code have been moved out of line; the overhead measured on the T1, when
it is compiled in but not enabled, is < 1%.

Approved by: scottl (standing in for mentor rwatson)
Reviewed by: des and jhb


# 160771 27-Jul-2006 jhb

Adjust td_locks for non-spin mutexes, rwlocks, and sx locks so that it is
a count of all non-spin locks, not just lockmgr locks. This can give us a
much cheaper way to see if we have any locks held (such as when returning
to userland via userret()) without requiring WITNESS.

MFC after: 1 week


# 157882 19-Apr-2006 jhb

Implement rw_try_upgrade() and rw_downgrade(). rw_try_upgrade() makes a
single attempt at upgrading a read lock to a write lock, and rw_downgrade()
converts curthread's write lock into a read lock.
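
A plausible usage pattern, with a hypothetical foo_lock:

        rw_rlock(&foo_lock);
        /* ... discover that an update is needed ... */
        if (rw_try_upgrade(&foo_lock)) {
                /* Now exclusive; modify, then go back to shared. */
                rw_downgrade(&foo_lock);
        } else {
                /*
                 * Upgrade failed: drop and re-take as a writer,
                 * re-validating any state afterwards.
                 */
                rw_runlock(&foo_lock);
                rw_wlock(&foo_lock);
        }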


# 157851 18-Apr-2006 wkoszek

'owner' is not used without SMP. Fix kernel build for such kernel
configurations.

Approved by: jhb


# 157846 18-Apr-2006 jhb

Adaptively spin before blocking on the turnstile if an rwlock is write
locked. In general the adaptive spinning is similar to the same code
for mutexes with some extra trickiness in rw_wunlock_hard(). Specifically,
even though both wait bits might be set and we might have a turnstile with
at least one waiting thread, there might not be any threads blocked on the
queue we are not waking up (they might all be spinning), and we should
only preserve the waiting flag for the queue we aren't waking up if there
are in fact threads blocked on that queue. Secondly, there might not be
any threads blocked on the queue we have chosen to waken threads from
(there might only be threads blocked on the other queue and the threads
for this queue are all spinning) in which case we disown the turnstile
instead of doing a broadcast and unpend.
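
In outline the write-owner spin resembles the mutex one; a rough sketch,
with the owner extraction simplified (owner is assumed to hold the
writer thread, and rw_wowner() is described in the next entry below):

        /*
         * Spin while the writer is running on another CPU; block on
         * the turnstile once it is not.
         */
        while (rw_wowner(rw) == owner && TD_IS_RUNNING(owner))
                cpu_spinwait();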


# 157826 17-Apr-2006 jhb

- Add a rw_wowner() macro that just returns the owner of a write lock and
use it in places that only care about the write owner instead of
rw_owner() as a baby step towards limited read-lock owner.
- Tidy the code that sets the WAITER flag bits to not duplicate a test
around the atomic operation and the KTR trace in both of the lock
functions.


# 155162 01-Feb-2006 scottl

Fix another compile problem. If I find any more, this file is going in the
Attic until it is properly fixed.


# 155061 30-Jan-2006 scottl

Regroup order of operations to better reflect what was probably intended.

Submitted by: Peter Jeremy


# 155012 29-Jan-2006 scottl

Take a stab at making this compile when WITNESS is not defined. gcc can't
figure out the order of operations at line 519, and neither can I, but this
is my best guess. Also correct a number of typos and syntax errors.


# 154973 29-Jan-2006 mlaier

Unbreak on archs where %d doesn't print uintptr_t arithmetic.


# 154941 27-Jan-2006 jhb

Add a basic reader/writer lock implementation to the kernel. This
implementation is by no means perfect as far as some of the algorithms
that it uses and the fact that it is missing some functionality (try
locks and upgrades/downgrades are not there yet), however it does seem
to work in my local testing. There is more detail in the comments in the
code, but the short version follows.

A reader/writer lock is very much like a regular mutex: it cannot be held
across a voluntary sleep; it can be acquired in an interrupt thread; if
the lock is held by a writer then the priority of any threads that block
on the lock will be lent to the owner; the simple case lock operations all
are done in a single atomic op. It also shares some similarities
with sx locks: it supports reader/writer semantics (multiple readers,
but single writers); readers are allowed to recurse, but writers are not.

We can extend this implementation further by either improving algorithms
or adding new functionality, but this should at least give us a base to
work with now.

Reviewed by: arch (in theory)
Tested on: i386 (4 cpu box with a kernel module that used 4 threads
that randomly chose between read locks and write locks
that ran w/o panicking for over a day solid. It usually
panic'd within a few seconds when there were bugs during
testing. :) The kernel module source is available on
request.)
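
A minimal usage sketch of the KPI introduced here; foo_lock and the
data it protects are hypothetical:

        #include <sys/param.h>
        #include <sys/lock.h>
        #include <sys/rwlock.h>

        static struct rwlock foo_lock;

        void
        foo_init(void)
        {
                rw_init(&foo_lock, "foo lock");
        }

        void
        foo_read(void)
        {
                rw_rlock(&foo_lock);    /* shared: many readers at once */
                /* ... read the protected data ... */
                rw_runlock(&foo_lock);
        }

        void
        foo_write(void)
        {
                rw_wlock(&foo_lock);    /* exclusive: a single writer */
                /* ... modify the protected data ... */
                rw_wunlock(&foo_lock);
        }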